Keith Moore
MIME is widely supported, but many tools do a poor job of generating it. They produce messages that are excessively long, that cannot be read on some platforms, or that, for no good reason, are not easily read with non-MIME mail readers. This has made some communities hostile to the introduction of MIME and delayed MIME's acceptance. This memo consists of advice to MIME implementors to generate ``Recipient-friendly MIME'' - that is, MIME which will not have an unnecessarily adverse impact on recipients.
NOTE: Most of what is written here applies only to typed-in text, not to attachments. The author's philosophy is that typed-in text should (by default) be readable by the maximum number of recipients, even if that involves making slight changes to the text (auto-wrapping of lines, deleting trailing spaces, substituting `` and '' for left and right double quote). Attachments, on the other hand, should be conveyed byte-for-byte intact to the recipient.
Quoted-printable should only be used when it's necessary to convey some characters that cannot be represented in 7-bit ASCII. Some mail senders automatically encode everything in quoted-printable (or even base64) whether it needs it or not. This is annoying to recipients who don't have MIME mail readers.
NOTE: Some people feel that enough mailers are 8-bit transparent, that quoted-printable should not be used at all by default. However, there are still some MTAs out there that will mishandle unencoded 8-bit text.
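As a rough illustration of "encode only when necessary", the following Python sketch picks a content-transfer-encoding for typed-in text. The function name choose_transfer_encoding is hypothetical, and the sketch assumes the text has already been wrapped to short lines (the line-length limits of 7bit are not checked here).

```python
def choose_transfer_encoding(text: str) -> str:
    """Pick the lightest content-transfer-encoding for typed-in text.

    Assumption: lines are already short enough for 7bit transport.
    """
    if text.isascii():
        # Pure ASCII needs no encoding at all, so recipients without
        # MIME mail readers see exactly what the author typed.
        return "7bit"
    # Some characters cannot be represented in 7-bit ASCII.
    # Quoted-printable keeps the ASCII portions readable while
    # protecting the rest from MTAs that mishandle 8-bit text.
    return "quoted-printable"
```

Base64 is deliberately never returned for typed-in text, since it is unreadable without a MIME mail reader.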
The format=flowed option (RFC 2646) is an extension to text/plain that allows the sending mail user agent to represent unbroken, wrappable text differently from text which is intended to be represented as-is (without wrapping). It is also designed to be readable on legacy mail readers that don't support format=flowed.
One advantage of format=flowed is that "wrappable" text can be wrapped to suit the width of the recipient's display or output medium - whether it's a big screen or a little PDA. Another advantage of format=flowed is that it works better with quotations, especially when those quotations must be wrapped.
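To make the re-wrapping idea concrete, here is a minimal Python sketch of how a reader might join format=flowed lines and refill them to its own display width. The names parse_flowed and render_flowed are hypothetical, and the sketch simplifies RFC 2646: it treats any run of leading ">" characters as the quote depth and does not handle every edge case in the specification.

```python
import textwrap

def parse_flowed(body: str):
    """Join format=flowed lines into (quote_depth, text) paragraphs."""
    paragraphs = []
    open_depth, parts = None, []  # paragraph currently being assembled

    def flush():
        nonlocal open_depth, parts
        if open_depth is not None:
            paragraphs.append((open_depth, "".join(parts)))
        open_depth, parts = None, []

    for line in body.splitlines():
        content = line.lstrip(">")
        depth = len(line) - len(content)   # number of leading ">" marks
        if content.startswith(" "):
            content = content[1:]          # undo space-stuffing
        is_signature = content == "-- "    # signature separator never flows
        if open_depth is not None and (depth != open_depth or is_signature):
            flush()
        open_depth = depth
        parts.append(content)
        if is_signature or not content.endswith(" "):
            flush()                        # a fixed line ends the paragraph
    flush()
    return paragraphs

def render_flowed(body: str, width: int = 72) -> str:
    """Refill each paragraph to the recipient's width, keeping quote marks."""
    out = []
    for depth, text in parse_flowed(body):
        prefix = ">" * depth + (" " if depth else "")
        if text == "-- ":
            out.append(prefix + text)
            continue
        filled = textwrap.fill(text, width=width,
                               initial_indent=prefix, subsequent_indent=prefix)
        out.append(filled or prefix.rstrip())
    return "\n".join(out)
```

A narrow display would call render_flowed(body, width=30), while a wide one might pass 100; the quoted paragraphs are rewrapped with their ">" prefix intact in both cases.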
In general, text attachments should NOT use format=flowed unless it is known that they are encoded in that format.
If someone accidentally types a space at the end of a line, and it gets encoded in quoted-printable, the result looks like this:
There is a space at the end of this line=20
Here is the following line.
or maybe even like this:
There is a space at the end of this line=
=
Here is the following line.
This is ugly. It either adds extra blank lines or extra =20 thingys which (again) are annoying to recipients who don't have MIME mail readers.
If you delete trailing spaces before applying quoted-printable encoding, short lines without = characters in them will look just the same to MIME recipients and non-MIME recipients.
Note that the "delete trailing spaces" advice applies only to text which the sender types in. It does NOT apply to SP CR LF sequences inserted into format=flowed text for the purpose of making such messages readable with legacy mail readers.
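A small Python sketch of this advice, using the standard quopri module. The helper names are hypothetical, and the sketch assumes it is given typed-in format=fixed text only; it must not be run over format=flowed text, where the trailing space is meaningful.

```python
import quopri

def strip_trailing_spaces(text: str) -> str:
    """Remove spaces and tabs the author left at the ends of lines."""
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))

def encode_typed_text(text: str, charset: str = "iso-8859-1") -> bytes:
    """Strip trailing whitespace first, then apply quoted-printable."""
    cleaned = strip_trailing_spaces(text)
    return quopri.encodestring(cleaned.encode(charset))
```

With the trailing space removed beforehand, a short ASCII line passes through the encoder unchanged, so MIME and non-MIME recipients see the same text.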
The wording in the spec is a bit muddy about this, but text/plain (format=fixed) is supposed to consist of preformatted text. Line breaks are explicit and indicated by CR LF.
Some existing products don't insert line breaks except between paragraphs. Recipients of messages from such products may see long lines that, depending on the recipient's user agent, either (a) are wrapped in strange places, (b) are truncated at the right margin, or (c) extend way past the right side of the window such that horizontal scrolling is required to read the text.
There are basically two kinds of user interfaces for typed-in text:
1. The mail composer automatically word-wraps long lines that are typed in.
In this case, the mail composer should label the text as format=flowed and emit lines of no more than 75 columns or so (regardless of the width of the compose window) before the message is sent. Long lines or "paragraphs" that are intended to be re-wrappable should have all but the last line terminated with SP CR LF. This should be done before any content-transfer-encoding is applied. Ideally the author should be able to turn off auto-wrap for the cases where he/she needs more control over the output.
In this case the mail composer should display line-endings for "wrapped" lines differently than line-endings for lines which the user explicitly terminated, for ease in editing.
2. The mail composer does not do auto-wrap; the author must explicitly supply line terminators (typically by pressing RETURN or ENTER)
In this case, the mail composer should label this text as format=fixed and take care not to display long lines of text being composed as if they were auto-wrapped. Some displays automatically wrap long lines to the beginning of the following line (though probably not at a word boundary) even though the text isn't really going to be wrapped. This can be confusing.
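For the first kind of composer (auto-wrap), the wrapping step might look like the following Python sketch. The function name to_flowed is hypothetical, the default of 75 columns simply follows the "75 columns or so" figure above, and the sketch handles only unquoted typed-in text; it would run before any content-transfer-encoding is applied.

```python
import textwrap

def to_flowed(paragraphs, width: int = 75) -> str:
    """Wrap each paragraph and mark soft line breaks with SP CR LF.

    The result is a body to be labelled text/plain; format=flowed.
    """
    lines = []
    for para in paragraphs:
        # Reserve two columns: one for the trailing space that marks a
        # soft break, one for a possible space-stuffing character.
        wrapped = textwrap.wrap(para, width=width - 2,
                                break_long_words=False,
                                break_on_hyphens=False) or [""]
        for i, line in enumerate(wrapped):
            # Space-stuff lines that would otherwise be misread as
            # quoted text, indented text, or an mbox "From " line.
            if line.startswith((" ", ">", "From ")):
                line = " " + line
            is_last = i == len(wrapped) - 1
            # All but the last line of a paragraph end in a space.
            lines.append(line if is_last else line + " ")
    return "\r\n".join(lines) + "\r\n"
```

Lines the author terminated explicitly would be passed in as separate one-line paragraphs, so they never receive the trailing space.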
NOTE: The line length limit in quoted-printable encoding is for the purpose of allowing transport through old mail systems that stored mail in fixed-length 80 character records. It applies only to the encoded form of the text (after quoted-printable encoding has been applied). This does not affect what the recipient sees (unless the recipient lacks MIME capability).
To someone without a MIME mail reader,
|----------------| (pretend this is 78 columns wide)
This is an =
extremely long =
line that must =
be wrapped.
looks much better than
|----------------| (pretend this is 78 columns wide)
This is an extrem=
ely long line tha=
t must be wrapped.
(This mostly applies to attachments, rather than typed-in text, since for typed-in text you should normally wrap long lines before encoding.)
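One way to get the friendlier of the two layouts is to place each soft line break just after a space whenever one is available. The Python sketch below is an illustration only: qp_wrap is a hypothetical name, it encodes a single logical line (no CR or LF inside it), and it is not a complete quoted-printable encoder.

```python
def qp_wrap(text: str, charset: str = "iso-8859-1", limit: int = 76):
    """Quoted-printable encode one logical line, breaking after spaces."""
    tokens = []
    for byte in text.encode(charset):
        # "=" itself, control characters (except TAB) and bytes above
        # 126 must be written as =XX.
        if byte == 0x3D or byte > 126 or (byte < 32 and byte != 0x09):
            tokens.append("=%02X" % byte)
        else:
            tokens.append(chr(byte))
    # Whitespace at the very end of the text must be encoded.
    if tokens and tokens[-1] in (" ", "\t"):
        tokens[-1] = "=%02X" % ord(tokens[-1])

    lines, current, length, last_space = [], [], 0, -1
    for tok in tokens:
        # Keep one column free for the "=" that marks a soft break.
        while current and length + len(tok) > limit - 1:
            # Prefer to cut just after the last space on this line;
            # fall back to a mid-word cut only if there is none.
            cut = last_space + 1 if last_space >= 0 else len(current)
            lines.append("".join(current[:cut]) + "=")
            current = current[cut:]
            length = sum(len(t) for t in current)
            last_space = -1
        current.append(tok)
        length += len(tok)
        if tok == " ":
            last_space = len(current) - 1
    lines.append("".join(current))
    return lines
```

Calling qp_wrap("This is an extremely long line that must be wrapped.", limit=16) reproduces the first layout shown above, with every break falling between words.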
Some mail composers send HTML body parts even when the author has typed in only plain text with no markup - no bold or italic, no changes in character sizes, no sub or superscripts, and no ``links''. This is silly, especially if the HTML is difficult to read as plain text.
This also implies that HTML should not be the default for composed messages.
If the author has explicitly requested some amount of markup (boldface/italic, typeface, etc.), HTML might be a reasonable choice to represent such markup. But the HTML should be generated such that it can easily be read by a reader that doesn't support HTML, or for that matter, MIME. (Note that the MIME specification says that, in the absence of specific support for text/html, a mail reader should display text/html as if it were text/plain.)
Some guidelines:
...especially if the message being replied to already contains "> " or some other quote character from previous replies.
Okay, so the "> " convention is a pretty crude mechanism to indicate that you're quoting something from a previous message, and it gets pretty ugly after several layers of replies. It's understandable that people would like to replace this with some nestable HTML construct. But if there's anything worse than lots of "> "s at the beginning of each line, it's a mixture of "> "s (or ">"s) and HTML constructs.
Multipart/alternative is useful when you want to send a message in multiple formats and have the recipient's mail reader use the ``best'' format that it understands. But it takes up a lot of extra space. It's especially annoying to mailing lists, and many usenet sites will actually refuse to accept such messages.
A single recipient-friendly HTML body part might be better than a multipart/alternative body part consisting of text/plain and text/html components.
Some mail senders use proprietary character sets even when all of the characters needed to display the message are in US-ASCII or ISO-8859-1. For example, it's silly to use a proprietary character set to transmit characters such as apostrophe and double quotes. ASCII ' and `` '' work well enough.
The content-disposition header field should be used to indicate that a body part is an attachment, and to suggest a filename by which it might be stored. e.g.
Content-disposition: attachment; filename="foo.txt"
Note that the "name=" parameter is only defined for the application/octet-stream content-type. Its use with other content-types is mostly harmless, but it is not universally recognized either.
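With Python's standard email package, a sender could produce such a header roughly as follows. The function name build_message_with_attachment and the choice of application/octet-stream are assumptions made for the example.

```python
from email.message import EmailMessage

def build_message_with_attachment(body_text: str, payload: bytes,
                                  filename: str) -> EmailMessage:
    """Build a message whose second part is marked as an attachment."""
    msg = EmailMessage()
    msg.set_content(body_text)  # the typed-in text stays text/plain
    # Passing filename= makes the library emit
    # Content-Disposition: attachment with a filename parameter,
    # and the payload bytes are carried intact.
    msg.add_attachment(payload, maintype="application",
                       subtype="octet-stream", filename=filename)
    return msg
```

The suggested filename travels in the content-disposition field rather than in a "name=" parameter on the content-type.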
This applies to things like "MS-TNEF" attachments as well as to things like vCard objects. Since most recipients have no use for such things, they waste disk space and bandwidth. They are also a nuisance to recipients who lack MIME-capable mail readers.
Worse, many mail readers handle unrecognized attachments by asking if the recipient wants to save the attachment to a file. So having this cruft sent out in every message isn't merely wasteful, it's downright offensive to the recipient, and it doesn't reflect well on the sender.
If you want to make your vCard accessible to recipients without annoying them, put it on a web server and include its URL in your signature.
RFC 822 requires quotes around names when they contain special characters. Since ``.'' is a special character, the name "Bryan K. Moore" needs to be in double quotes. Unless the name contains a special character, the double quotes should not be included.
Even worse is including single quotes (usually within double quotes). These serve no purpose at all.
| From: Keith Moore <moore@cs.utk.edu> | works just fine. |
| From: "Keith Moore" <moore@cs.utk.edu> | is unnecessary. |
| From: "'Keith Moore'" <moore@cs.utk.edu> | is annoying. |
If an address isn't accompanied by a name, don't generate one. In particular, don't copy the address to the phrase that precedes the address.
| To: joebob@random.host | is sufficient. |
| To: joebob <joebob@random.host> | is silly and redundant. |
| To: "'joebob@random.host'" <joebob@random.host> | is obscene. |
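The quoting rules above can be left to a library. The Python sketch below uses email.utils.formataddr, which adds double quotes only when the name contains a special character; the wrapper format_mailbox is a hypothetical helper that also drops stray single quotes and names that merely repeat the address.

```python
from email.utils import formataddr

def format_mailbox(name, address: str) -> str:
    """Format one mailbox without needless quotes or redundant names."""
    # Drop surrounding quote characters that serve no purpose.
    # (Assumption: real names do not begin or end with a quote mark.)
    cleaned = (name or "").strip().strip("'\"").strip()
    # A "name" that only repeats the address or its local part adds nothing.
    if cleaned in (address, address.split("@")[0]):
        cleaned = ""
    # formataddr quotes the name only if it contains RFC 822 specials.
    return formataddr((cleaned, address))
```

For example, format_mailbox("Bryan K. Moore", "moore@cs.utk.edu") keeps its double quotes because of the ".", while format_mailbox("Keith Moore", "moore@cs.utk.edu") gets none.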
"Q" or "B" encoded-words (RFC 2047 and its predecessors) should only be used to encode non-ASCII characters that appear in text portions of message or body part headers - say in the phrase that precedes an address, in a Subject field, or in a comment.
Encoded-words should never be used for machine-readable portions of the headers (e.g. within an address). Addresses consist of ASCII characters only, both for compatibility with RFC 822, and so that anybody in the world can type them in.
Message and body part headers consist entirely of ASCII characters. Unencoded non-ASCII characters in headers are nonstandard, undefined, and will cause some mail parsers to barf.
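A minimal Python sketch of this rule for the text portions of a header: ASCII text is left alone and only non-ASCII text becomes an encoded-word. The helper names are hypothetical, and the utf-8 default is an assumption of the example rather than something this memo prescribes. Addresses should never be passed through encode_phrase.

```python
from email.header import Header, decode_header

def encode_phrase(text: str, charset: str = "utf-8") -> str:
    """Return header-safe text, using encoded-words only when needed."""
    if text.isascii():
        # Nothing to protect, so leave it readable for everyone.
        return text
    # Non-ASCII text becomes one or more "Q" or "B" encoded-words.
    return Header(text, charset).encode()

def decode_phrase(value: str) -> str:
    """Inverse of encode_phrase, for checking the round trip."""
    return "".join(
        part.decode(cs or "ascii") if isinstance(part, bytes) else part
        for part, cs in decode_header(value)
    )
```

The encoded result contains only ASCII characters, so the header as a whole stays within what mail parsers expect.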
In particular, if a body part consists only of ASCII characters, label it as "US-ASCII" rather than "iso-8859-1" or something else.
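One simple way to arrive at an accurate label is to try candidate character sets from narrowest to widest and use the first that can represent the text. In this Python sketch the function name pick_charset is hypothetical, and the candidate list, in particular the utf-8 fallback, is an assumption of the example.

```python
def pick_charset(text: str,
                 candidates=("us-ascii", "iso-8859-1", "utf-8")) -> str:
    """Return the first (narrowest) charset able to represent the text."""
    for charset in candidates:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            continue  # this charset cannot carry the text; try a wider one
        return charset
    raise ValueError("no candidate charset can represent the text")
```

A body part consisting only of ASCII characters is therefore labelled "us-ascii", and a wider label appears only when the text actually needs it.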
Thanks to Steinar Bang, Nathaniel Borenstein, Earl Hood, and Dan Wing for their comments.