@internet -- A Walk Through the MIME Fields

Last time, we discovered that not all MIME types include people in whiteface makeup who annoy tourists for a living. In fact, we learned that Multipurpose Internet Mail Extensions (MIME) types allow us to include non-ASCII content in Internet Mail messages, despite the fact that Internet Mail is, by default, currently limited to transmitting only the 128 US ASCII characters (characters 0 through 127) in order to maintain downward compatibility with older Mail Transfer Agents (MTAs).

Let's walk through an example MIME message between users on two systems of very different vintages. Although the body content of this example is all ASCII, MIME encapsulation employs the same technique with binary-encoded content; we just wouldn't be able to read it:

    From: Fred Flintstone <fred@granitequarry.com>
    To: George Jetson <george@spacelysprockets.com>
    Date: 29 Jan 96 02:51:55 PST
    Subject: Sample message
    Message-Id: qk1jef7o5jv@granitequarry.com
    MIME-Version: 1.0
    Content-type: multipart/mixed; boundary="Xyz123abC"

    This is the preamble. Its content will be ignored, but
    it is a handy place for mail composers to include an
    explanatory note to non-MIME-compliant readers.
    --Xyz123abC

    This is implicitly typed plain ASCII text.
    It does NOT end with a double linebreak.
    --Xyz123abC
    Content-Type: text/richtext

    This is Rich Text Format text.
    It does end with a double linebreak, so it could
    alternatively be encoded binary data.

    --Xyz123abC--
    This is the epilogue. It will also be ignored.

In the example above, the first five headers (From, To, Date, Subject and Message-Id) are all required by RFC 822 (a quasi-hypertext version of which is available at http://www.cis.ohio-state.edu/htbin/rfc/rfc822.html).
They identify the sender of the message, its intended recipient, the date and time it was sent, what it concerns (note that this field may be left blank) and the message's world-unique identifier (which is generated by the sender's MTA and refers to this version of this message--as opposed to later versions of the same message, which will receive their own unique identifiers). There could be another two dozen headers included in this message, covering everything from an alternate return address to the route the message has taken to encryption techniques and key identifiers, but these are the only ones that RFC 822 requires.

The first of the MIME-specific headers is the MIME-Version identifier. The only value currently legal for this header is 1.0, although there will eventually be other legal values.

The second MIME header is the Content-type identifier (also now known as the Internet Media or IMEDIA type). RFC 1590, authored by Jon Postel (postel@isi.edu), the Internet Assigned Numbers Authority (IANA), defines the registered values for this field as "application", "audio", "image", "message", "multipart", "text" and "video". (There can also be private values, which are defined by the X-token convention I described in last issue's column.) Each of these types can have one or more subtypes.

Our example message has a Content-type of "multipart" with a subtype of "mixed". This tells the receiving mailreader that there is more than one MIME encapsulation in the message body and that more than one encapsulation scheme has been used. The multipart IMEDIA type takes a required "boundary=" parameter, which is separated from the IMEDIA type definition by a semicolon and a whitespace character. The "boundary=" parameter defines the encapsulation boundary that separates the parts of the included content.
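
To make the Content-type anatomy concrete, here is a minimal sketch in Python, using the standard library's email package. The choice of Python is mine, purely for illustration; the header value is copied from the sample message, and the variable names are made up. It pulls the type, the subtype and the "boundary=" parameter out of the header, and checks that the double quotes around the boundary are optional.

```python
from email.message import Message

# Hypothetical stand-in for the Content-type header of the sample message.
msg = Message()
msg["Content-type"] = 'multipart/mixed; boundary="Xyz123abC"'

main_type = msg.get_content_maintype()   # the IMEDIA type
sub_type = msg.get_content_subtype()     # its subtype
boundary = msg.get_boundary()            # value of boundary=, quotes removed

# The same header written without the optional double quotes.
bare = Message()
bare["Content-type"] = "multipart/mixed; boundary=Xyz123abC"

print(main_type, sub_type, boundary)
print(bare.get_boundary() == boundary)
```

Header names are matched without regard to case, so "Content-type" and "Content-Type" are treated as the same header here.
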
The actual boundary string will be randomly generated by the sender's mail agent and, although the MIME RFCs (1341 and 1521 through 1524) don't require that it be enclosed in double quotes, most modern MIME mail agents will do so in order to ensure that the contents of the boundary string are clearly defined.

The end of the entire set of RFC 822 and MIME headers and the beginning of the body content is marked by two consecutive carriage return/linefeed pairs, which show up as a blank line (this is true regardless of the presence of MIME headers).

In our example, the message body begins with a preamble. Although a preamble is not required and although a MIME-compliant mail reader will ignore it, many MIME-capable mail agents will use it to insert an explanatory message which will be displayed only if the recipient's mail agent doesn't support MIME. The preamble is followed by a single carriage return/linefeed pair, two hyphen characters and the first iteration of the encapsulation boundary delimiter string. (There must never be any white space between the two hyphens and the boundary delimiter.) The first boundary delimiter is followed by two carriage return/linefeed pairs and then the first part of the encapsulated content.

As it happens, the first part of the content is just an ASCII string. If it were not part of a multipart/mixed MIME message, it wouldn't require a boundary delimiter at all. Since, in our example, it is part of a multipart/mixed MIME message, a boundary delimiter must both precede and follow it. Since this first encapsulation is simple ASCII text, it doesn't require two carriage return/linefeed pairs between it and the boundary delimiter which follows it. This is the only case in which the requirement for two carriage return/linefeed pairs can be waived.

There is a Content-Type: text/richtext header immediately below the boundary delimiter which precedes the second encapsulation in our example.
Note that the two carriage return/linefeed pairs follow the Content-Type header, rather than the boundary delimiter. This part of our example content is, indeed, a short Rich Text Format (RTF) file which also adheres to the rule of two carriage return/linefeed pairs between the end of an encapsulation and the boundary delimiter which follows it.

Since this is the last of the content inclusions, its trailing boundary delimiter is both preceded and followed by two hyphen characters. The two hyphens which follow the final boundary delimiter signify the end of all encapsulated content for the entire message. There must never be any white space between the end of the boundary delimiter and the two trailing hyphens.
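
The delimiter rules described so far can be sketched as a few lines of Python. This is a hand-rolled illustration under simplifying assumptions, not how any particular mail agent does it: split_multipart is a hypothetical helper, the body uses bare "\n" line endings instead of carriage return/linefeed pairs, and the part text is abbreviated from the sample message.

```python
def split_multipart(body, boundary):
    """Split a multipart body into (preamble, parts, epilogue)."""
    delimiter = "--" + boundary      # two hyphens, then the boundary string
    terminator = delimiter + "--"    # the final delimiter adds two more hyphens
    preamble, parts, epilogue = [], [], []
    current = preamble               # everything before the first delimiter
    for line in body.splitlines():
        stripped = line.rstrip()     # ignore trailing white space on the line
        if current is epilogue:
            epilogue.append(line)    # after the terminator, all is epilogue
        elif stripped == terminator:
            current = epilogue
        elif stripped == delimiter:
            parts.append([])         # each delimiter opens a new part
            current = parts[-1]
        else:
            current.append(line)
    return "\n".join(preamble), ["\n".join(p) for p in parts], "\n".join(epilogue)


body = "\n".join([
    "This is the preamble.",
    "--Xyz123abC",
    "",
    "This is implicitly typed plain ASCII text.",
    "--Xyz123abC",
    "Content-Type: text/richtext",
    "",
    "This is the second part.",
    "",
    "--Xyz123abC--",
    "This is the epilogue.",
])

preamble, parts, epilogue = split_multipart(body, "Xyz123abC")
print(len(parts))
print(parts[1].splitlines()[0])
```

Each part comes back with its own headers (if any) still attached, so the second part begins with its Content-Type line.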

Two carriage return/linefeed pairs aren't required after the final boundary delimiter. The epilogue which ends our example is strictly optional, and a MIME-capable receiving agent won't display it.
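
To see how a MIME-capable agent might treat the whole message, here is a rough sketch that feeds an abbreviated copy of the sample to the parser in Python's standard email package. The language is my choice for illustration, the preamble and part text are shortened, and bare "\n" line endings stand in for carriage return/linefeed pairs. The parser keeps the preamble and epilogue apart from the two encapsulated parts, so a mail reader built on it can simply decline to show them.

```python
from email import message_from_string

# Abbreviated copy of the sample message; "\n" stands in for CR/LF.
raw = "\n".join([
    "From: Fred Flintstone <fred@granitequarry.com>",
    "To: George Jetson <george@spacelysprockets.com>",
    "Subject: Sample message",
    "MIME-Version: 1.0",
    'Content-type: multipart/mixed; boundary="Xyz123abC"',
    "",
    "This is the preamble.",
    "--Xyz123abC",
    "",
    "This is implicitly typed plain ASCII text.",
    "--Xyz123abC",
    "Content-Type: text/richtext",
    "",
    "This is the second part.",
    "",
    "--Xyz123abC--",
    "This is the epilogue.",
])

msg = message_from_string(raw)

# One entry per encapsulated part; an untyped part defaults to text/plain.
types = [part.get_content_type() for part in msg.get_payload()]
print(types)

# The preamble and epilogue are stored separately from the parts.
print(repr(msg.preamble))
print(repr(msg.epilogue))
```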