@internet -- Internet Media Types Are a Real MIME Field

Perhaps you've had occasion to use Eudora©, or one of a myriad of other Internet Mail clients, to send a binary attachment, such as a spreadsheet or zipfile, to one of your geographically-distant correspondents. Through the magic of Multipurpose Internet Mail Extensions (MIME), you attached the file to the message on your end and encoded it into a form that could be understood by all of the different Mail Transfer Agents (MTAs) between you and the recipient, and your correspondent was able to transparently download and decode it into its original form on his or her end. Easy, wasn't it?
It may surprise you to discover that this transparency is actually fairly new, and even now it is evolving at the same frantic rate as all the other Internet technologies. In fact, it wasn't until the early 1990s that it became clear there was a widespread and immediate need for Internet Mail, which can reliably use only the 128 characters of 7-bit US-ASCII, to be able to accommodate binary and formatted-text attachments. Although discussions about so-called "multimedia mail" had been held as early as August, 1980 (in RFC 767), and substantive contributions were made in RFCs 934 (January, 1985) and 1049 (March, 1988), an experimental standard for encoding data in formats other than 7-bit ASCII, so that it could pass through MTAs limited to 7-bit data (i.e., the first 128 ASCII characters), was not developed until April, 1990 (in RFC 1154), and it wasn't until June of 1992 that a standard emerged.
That standard, first specified in Nathaniel Borenstein and Ned Freed's RFC 1341, built on the standard Internet Mail message-header architecture specified in Internet Standard 11 (currently defined in RFC 822, a quasi-hypertext version of which is available at http://www.cis.ohio-state.edu/htbin/rfc/rfc822.html) to add the capability of including multi-part textual and non-textual body content in mail messages.
RFC 1341 was itself updated in September, 1993 by Borenstein and Freed's RFC 1521 and its companion RFCs 1522 through 1524 (point your browser at http://www.cis.ohio-state.edu/htbin/rfc/rfc1521.html and follow that document's internal links from RFC to RFC), but most of the changes were relatively minor.
Standard 11, RFC 822, specifies the form and allowable content of the headers (whose names are case-insensitive) which Internet Mail must or may use to do such things as specify where the message originated, where it is going, the time it was sent and so on. MIME adds five new items to that list of headers: MIME-Version, Content-Type, Content-Transfer-Encoding, Content-ID and Content-Description. Mail clients (and servers) which don't understand these headers simply ignore them, without affecting the content of the message.
The "MIME-Version" header specifies the MIME version to which the message conforms. At the moment, the only allowable value for it is 1.0.
The "Content-Type" header describes what the March, 1994 RFC 1590, authored by Jon Postel (postel@isi.edu), who is the Internet Assigned Numbers Authority (IANA), now calls the Internet Media (IMEDIA) Type. Registered values for this field are "application", "audio", "image", "message", "multipart", "text" and "video", and each type can have one or more subtypes, written after a slash (as in "image/gif"). Currently-registered subtypes include "octet-stream" or "PostScript" (for the "application" type), "basic" (audio), "gif" or "jpeg" (image), "rfc822", "partial" or "external-body" (message), "mixed", "alternative", "parallel" or "digest" (multipart), "plain", "html" or "richtext" (text) and "mpeg" (video).
To be supported Internet-wide, an IMEDIA type or subtype must be registered with IANA (the form for this is appended to RFC 1590). However, any two agents (servers or clients) can use unregistered types and subtypes by prefacing them with what's called an "X-token". An X-token is simply the letter X and a dash ("X-") prepended to an IMEDIA type or subtype.
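
To make the type/subtype split and the X-token rule a little more concrete, here is a minimal Python sketch built on the standard library's email package. The sample headers, the "x-example" subtype and the function name describe_media_type are all invented for illustration; none of them come from the RFCs.

```python
from email import message_from_string

# Hypothetical headers, invented for illustration only.
SAMPLE = (
    "MIME-Version: 1.0\n"
    "Content-Type: application/x-example\n"
    "Content-Description: An invented attachment\n"
    "\n"
    "(body omitted)\n"
)


def describe_media_type(raw_message):
    """Return (type, subtype, is_private) for a message's Content-Type."""
    msg = message_from_string(raw_message)
    # The email package lowercases the value and falls back to
    # text/plain when no Content-Type header is present.
    maintype = msg.get_content_maintype()
    subtype = msg.get_content_subtype()
    # An X-token is any type or subtype that begins with "x-".
    is_private = maintype.startswith("x-") or subtype.startswith("x-")
    return maintype, subtype, is_private


print(describe_media_type(SAMPLE))  # ('application', 'x-example', True)
```

A real agent would go on to look the type and subtype up in its own configuration; this sketch only reports whether the pair is a private, unregistered one.
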
Since the HyperText Transfer Protocol (HTTP), on which the World Wide Web is based, uses this same X-token convention, it is extremely useful. For example, a server could declare a private subtype corresponding to Microsoft Word 6.0 files by including a header reading "Content-Type: application/x-msword6" (the server itself would have to have this subtype listed in its configuration file). As long as the receiving agent (browser) knows what "x-msword6" means, and has Word 6.0 configured as a "helper application", then when the browser receives a Word 6.0 file in response to a request, it will automatically invoke Word and open the file in it.
The "Content-Transfer-Encoding" header names the encoding mechanism, if any, by which the content of the message body has been transformed into 7-bit ASCII with lines no longer than 1,000 characters, so that it can pass successfully through MTAs which cannot handle larger character sets or longer lines. The default value (which will be assumed if the header is missing) is "7bit". Legal alternative values are "quoted-printable" (the content is mostly printable ASCII text, any other octet is written as an equals sign followed by two hexadecimal digits, and encoded lines are no longer than 76 characters), "base64" (a dense translation of binary data which converts each three octets of data into a string of four ASCII characters), "8bit" or "binary" (neither of which encodes anything: both declare that the body is being sent as-is and may contain octets outside the 7-bit range, the distinction being that "8bit" data still conforms to the line-length limitations of "7bit" MTAs and "binary" data does not) and "x-token" (any alternative encoding scheme that the server and client both understand). If the value of the "Content-Type" header is "multipart" or "message", only the "7bit", "8bit" or "binary" values are legal, in order to prevent encapsulated data from being needlessly re-encoded at each successive layer of encapsulation.
If the value of the "Content-Type" header is "message/external-body", a "Content-ID" header is mandatory.
The "Content-ID" value takes the same form as does the "Message-ID" header specified by RFC 822 (a unique ASCII identifier string joined by an "@" sign to the fully-qualified domain name of the machine on which it was generated) and, like the "Message-ID", it must be world-unique. When the "Content-Type" is "message/partial", an identifier of the same kind, carried in the "id" parameter of the "Content-Type" header, is used to match up the fragmented parts of the original message, and the accompanying "number" parameter lets them be reassembled in the proper order.
The "Content-Description" header is simply a descriptive string, such as "Picture of the Space Shuttle orbiting over Asia".
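
If you'd like to watch the "base64" and "quoted-printable" transfer encodings at work, the following Python sketch may help. It relies only on the standard base64 and quopri modules; the function name encode_body and the sample data are hypothetical, and a real mail client would of course do a good deal more than this.

```python
import base64
import quopri


def encode_body(data, encoding):
    """Encode raw octets under the named Content-Transfer-Encoding."""
    if encoding == "base64":
        # encodebytes() wraps its output into lines of at most
        # 76 characters, each ending with a newline.
        return base64.encodebytes(data)
    if encoding == "quoted-printable":
        # Non-ASCII octets become "=" followed by two hex digits.
        return quopri.encodestring(data)
    if encoding in ("7bit", "8bit", "binary"):
        # These values perform no transformation at all.
        return data
    raise ValueError("unsupported encoding: " + encoding)


# Three octets in, four ASCII characters out.
print(base64.b64encode(b"MIM"))                       # b'TUlN'
# The single non-ASCII octet 0xE9 is escaped; the rest is untouched.
print(encode_body(b"caf\xe9", "quoted-printable"))    # b'caf=E9'
```

Note how quoted-printable leaves ordinary text readable, which is why it suits mostly-textual content, while base64 is the better choice for genuinely binary data.
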

Next time we'll deconstruct an example MIME message to see how data encapsulation is performed.