Whither RFCs in HTML?
Every few months someone insists that RFCs should not be required to
be in ASCII. Frequently someone suggests that HTML should be
an acceptable alternative.
In a nutshell, the problem is that HTML isn't portable. This may
seem like an outrageous statement, but consider:
- Early versions of HTML were severely limited in their ability
to format text. Various hacks were often needed even to represent
common text constructs, like flush-right text or a hanging indent.
- Support for newer HTML features such as style sheets, and for
internationalization, varies considerably, particularly if you
consider the browsers in actual use on platforms other than Windows.
(Saying "you have to use MSIE to view this RFC" won't fly.)
- Lots of tools can generate HTML, but fewer tools can generate
HTML that conforms to a particular style (either in appearance or
in use of particular HTML markup for certain constructs). For
instance, HTML tables are commonly used to format things that aren't
semantically tables.
- Many HTML viewers have security holes, especially in their support
for scripting languages or objects. Expecting IETF participants
to view HTML RFCs and Internet-Drafts may therefore expose them to
security risks unless some screening is done.
- Currently all RFC formats allow the entire RFC to be stored in
a single file, and this is extremely advantageous. HTML doesn't
lend itself to representing complex documents (particularly those
containing images) as flat files.
On every previous occasion that people have examined this issue,
the rough consensus seemed to be that HTML was not worth the trouble.
However, this has never been formally taken up by the IETF. It may be
that, with the right set of technical constraints, HTML could be
made acceptable. If that were the case, then the IETF would benefit from
being able to exchange documents in a format that was both richer
than ASCII and also revisable.
Someone who wants to convince the IETF that it should accept HTML RFCs
and Internet-Drafts should therefore write an Internet-Draft that
specifies:
- What version of HTML should be permissible, and which features
of that version (if any) should be disallowed,
- What kind of verification should be done at submission time
and prior to publication (these could be different), which existing
tools could be used for this, and the acceptance criteria that
should be applied,
- What kind of transformation (if any) should be done at submission
time in order to allow variations from the permissible HTML version
or subset to be accepted, and what tools could be used to do this,
- How the files should be packaged when more than one file is needed
(as with documents that include images), e.g. in MHTML,
- A style guide to ensure uniformity of both appearance and
semantic markup, and
- Security considerations.
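To make the verification item concrete, here is a minimal sketch of what a submission-time check might look like, using only Python's standard html.parser module. The allowed element and attribute sets (ALLOWED_TAGS, ALLOWED_ATTRS) and the names SubsetChecker and check_html are hypothetical, chosen purely for illustration; an actual proposal would have to define its own subset and acceptance criteria. Because scripting elements and event-handler attributes are simply absent from the allowed sets, the same pass also gives a crude form of the security screening mentioned earlier.

```python
from html.parser import HTMLParser

# Hypothetical subset, for illustration only. A real proposal would
# specify its own permitted elements and attributes.
ALLOWED_TAGS = {
    "html", "head", "title", "body", "h1", "h2", "h3",
    "p", "pre", "ul", "ol", "li", "a", "em", "strong", "code",
}
ALLOWED_ATTRS = {"a": {"href", "name"}}


class SubsetChecker(HTMLParser):
    """Records every element or attribute outside the allowed subset."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling us.
        line, _column = self.getpos()
        if tag not in ALLOWED_TAGS:
            self.problems.append(f"line {line}: disallowed element <{tag}>")
            return
        for name, _value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, set()):
                self.problems.append(
                    f"line {line}: disallowed attribute {name!r} on <{tag}>"
                )


def check_html(text):
    """Return a list of problem descriptions; an empty list means accepted."""
    checker = SubsetChecker()
    checker.feed(text)
    checker.close()
    return checker.problems


if __name__ == "__main__":
    sample = "<p onclick='x()'>hi</p>\n<table><tr><td>x</td></tr></table>"
    for problem in check_html(sample):
        print(problem)
```

A check like this only addresses which markup is used, not whether it is used with the intended semantics; the style guide and any transformation step would still need their own tooling.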
The proposal should also describe how to generate acceptable HTML
(version, subset, and style) using common document editing/production
tools that are available on a wide range of platforms. A survey of
commonly-used HTML viewers on various platforms, describing which
of them can present and/or print all features of the recommended
version/subset of HTML (including the means by which multiple-file
HTML documents are packaged) with adequate fidelity, would go a long
way toward convincing the skeptics.
This is entirely my opinion. None of the above has the blessing
of the RFC Editor, the IAB, the IESG, or the IETF secretariat.
A mailing list to discuss RFCs in HTML has existed for many
years. It can be found at
http://klutz.cs.utk.edu/listinfo/rfcs-in-html.
Another draft document that summarizes the issues about choosing
a new RFC format is
http://www.cs.columbia.edu/~hgs/tmp/rfc.txt.
I don't entirely agree with all of its assessments of current
formats, but the discussion of the issues is excellent.
Last modified: 13 Jul 2003