Newsgroups: alt.gopher,comp.infosystems.wais,comp.text.sgml,comp.mail.multi-media,comp.sys.next.programmer
From: connolly@convex.com (Dan Connolly)
Subject: MIME for global hypertext
Message-ID: <1992Jun7.042358.29367@news.eng.convex.com>
Sender: usenet@news.eng.convex.com (news access account)
Organization: Engineering, CONVEX Computer Corp., Richardson, Tx., USA
Date: 07 Jun 1992 04:23:58 (19920607042358)
Corp. The opinions expressed are those of the user and
not necessarily those of CONVEX.
Lines: 170
X-Mozilla-Status: 0000
Content-Length: 6997

The WAIS, gopher, and world-wide-web projects are all client/server
information retrieval systems. All three deliver plain text information
quite well, and they each have evolving mechanisms for delivering
other forms of information.

The MIME RFC defines a system for processing multi-part, multimedia
messages on the internet. I would like to see these systems, along
with USENET news and internet mail, interoperate with MIME as the substrate.

The clients for these systems go something like this:
	0 user invokes client (and chooses a starting point)
	1 client displays user's request
	2 user reads page, chooses a reference to more info
	3 user informs client of choice
		(e.g. "show me item #1," or "search for googoo")
	4 go to step 1

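The loop above can be sketched in a few lines of Python. This is only
an illustration: the in-memory "pages" table, the browse() function and
the scripted list of choices are all hypothetical stand-ins for a real
server and a real user.

```python
def browse(pages, start, choices):
    """Walk the client loop over an in-memory table of pages.

    pages   maps a page name to (text, list of referenced page names)
    start   is the starting point chosen in step 0
    choices is a scripted list of item numbers standing in for the user
    Returns the names of the pages displayed, in order.
    """
    displayed = []
    current = start
    picks = iter(choices)
    while True:
        text, refs = pages[current]      # step 1: client displays the request
        displayed.append(current)
        pick = next(picks, None)         # steps 2-3: user chooses a reference
        if pick is None or not 0 <= pick < len(refs):
            break                        # nothing (valid) chosen: stop
        current = refs[pick]             # step 4: go to step 1
    return displayed


# Hypothetical data: one menu with two text files at the leaves.
PAGES = {
    "menu": ("1. intro  2. details", ["intro", "details"]),
    "intro": ("Plain text introduction.", []),
    "details": ("Plain text details.", []),
}
```

For instance, browse(PAGES, "menu", [1]) displays the menu and then
the "details" page. Note that the leaf pages carry an empty reference
list, which is exactly the limitation discussed next.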
These systems often consist of a hierarchy of menus with text files at
the leaf nodes. The system allows the user to interactively navigate
the menus and browse leaf nodes. But 1) the format of the menus is
particular to the system (USENET newsgroups/articles, unix
directories/files, WAIS source/database/document). And 2) once a user
is at a leaf node, the system can no longer interactively follow
references.

The novel aspect of hypertext is that the distinction between the
menu pages and the text pages disappears. In the world-wide-web,
text documents have machine-readable links inside them, and all
menus are represented as hypertext documents.

The WWW format works well, but it would benefit from use of MIME's
features.

For a common hypertext document format, I propose we define a
subtype of the MIME multipart message: X-HYPERTEXT. The first
part of a multipart/X-HYPERTEXT message is the content of
the document, and the remaining parts are multimedia attachments
and links to other documents.

The content part contains references (by Content-ID) to the
attachments and links. The client software allows the user
to interactively choose references to display/follow.

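To make the proposed layout concrete, here is a rough Python sketch
using the standard email.message module. The helper names
(external_body, hypertext, find_part) and the Content-ID value in the
usage note are made up for illustration, and putting the Content-ID on
the outer external-body part is an assumption of this sketch, not
something the proposal pins down.

```python
from email.message import Message


def external_body(content_id, content_type, **params):
    """Build a message/external-body part that points at remote data."""
    inner = Message()                      # headers describing the remote data
    inner["Content-Type"] = content_type
    outer = Message()
    outer["Content-Type"] = "message/external-body"
    outer["Content-ID"] = content_id       # assumption: ID lives on the outer part
    for key, value in params.items():
        # access_type="ANON-FTP" becomes the parameter access-type="ANON-FTP"
        outer.set_param(key.replace("_", "-"), value)
    outer.set_payload([inner])
    return outer


def hypertext(content, links):
    """Assemble a multipart/x-hypertext message: content first, links after."""
    root = Message()
    root["MIME-Version"] = "1.0"
    root["Content-Type"] = "multipart/x-hypertext"
    body = Message()
    body["Content-Type"] = "text/plain"
    body.set_payload(content)
    root.attach(body)                      # first part: the document content
    for link in links:
        root.attach(link)                  # remaining parts: attachments/links
    return root


def find_part(root, content_id):
    """Return the attachment or link part with the given Content-ID, if any."""
    for part in root.get_payload()[1:]:
        if part["Content-ID"] == content_id:
            return part
    return None
```

A client would call something like find_part(doc, "<ps1@example>")
when the user picks that reference in the content part, then look at
the part's content-type and parameters to decide how to display or
fetch it.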
The remaining parts may be attached image/audio/video using
MIME's various types and transfer encodings (text attachments
would work too) or they may be references to information
accessible elsewhere using MIME's message/external-body type.
The parameters to the external-body content-type provide the
same information as WWW's Universal Document Identifier.
(MIME only defines ANON-FTP, FTP, TFTP, LOCAL-FILE, AFS and
MAIL-SERVER. The remaining access-types (WAIS, gopher, etc) would
be experimental (X-WAIS, X-GOPHER) until standardized.)

The emerging standard for structured, platform-independent text
is SGML. The WWW project defines an SGML document type with
traditional elements (title, heading, paragraph, list) and
new hypertext elements (anchor). Soon it will have multimedia
elements (image, audio).

The current design places external document references (to files,
WWW servers, WAIS documents, gophers, etc.) inside the SGML as
attributes. There are lexical incompatibilities, and the design
is under strain. I suggest that we implement references
as SGML entities that identify message/external-body parts
by content-id.

Representing document content in SGML allows the same information
to be accessed using different user interface paradigms (e.g. dumb
terminals vs. curses style vs. x windows point-and-click).

Short of full SGML parsing, we could adopt the MIME text/richtext
format, with the addition of a <REF ID="xxx">...</REF> tag.
In fact, any representation that allows the user to interactively indicate
one of the attached body parts by content-id will do. For example,
plain text with one-line descriptions would do. The Andrew ez
data stream would also work, but only Andrew sites could parse it.

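As a rough idea of how little parsing the <REF> approach needs, the
Python sketch below pulls the references out of a richtext body with a
regular expression. The function names and the sample IDs are
hypothetical, and a regular expression is only a stand-in for a real
richtext parser (it ignores nesting and other richtext tags).

```python
import re

# Matches <REF ID="xxx">label</REF>; assumes REF tags are not nested.
REF = re.compile(r'<REF ID="([^"]+)">(.*?)</REF>', re.DOTALL)


def list_references(richtext):
    """Return (content-id, label) pairs in the order they appear."""
    return [(m.group(1), m.group(2)) for m in REF.finditer(richtext)]


def as_menu(richtext):
    """Render the references as numbered one-line descriptions."""
    refs = list_references(richtext)
    return ["%d. %s" % (n, label) for n, (_, label) in enumerate(refs, 1)]
```

A dumb-terminal client could print the lines from as_menu() under the
text and let the user type a number; a point-and-click client could
instead highlight the labels in place. Either way the chosen content-id
selects one of the attached body parts.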
This brings up the issue of format negotiation. No one format is
optimal for all information. Clients are likely to be able to process
information in several formats, and servers are likely to be able
to provide different representations.

The various formats can be enclosed in a MIME multipart/alternative
message. And rather than including the data for all formats in
the message, the data could be in message/external-body parts. The
client chooses the type of data it likes and retrieves the corresponding
external-body. This (modified) example from the MIME RFC may help explain:

MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=42

--42
Content-Type: message/external-body;
	name="BodyFormats.ps";
	site="thumper.bellcore.com";
	access-type=ANON-FTP;
	directory="pub";
	mode="image"

Content-type: application/postscript

--42
Content-Type: message/external-body;
	name="/u/nsb/writing/rfcs/RFC-XXXX.ez";
	site="thumper.bellcore.com";
	access-type=AFS

Content-type: application/x-ez

--42
Content-Type: message/external-body;
	name="BodyFormats.txt";
	site="thumper.bellcore.com";
	access-type=ANON-FTP;
	directory="pub"

Content-type: text/plain

--42--

The client can choose between postscript, ez, and plain text, and
retrieve the corresponding message body.

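The client side of this choice might look like the Python sketch below.
It skips MIME parsing altogether and starts from the three alternatives
already reduced to simple records; the Alternative class, the choose()
function and the idea of a client-ordered preference list are
assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Alternative:
    """One message/external-body part of a multipart/alternative message."""
    content_type: str   # type of the data the part points at
    access_type: str    # how to retrieve it
    name: str           # file name given in the external-body parameters


# The three alternatives from the example message above.
ALTERNATIVES = [
    Alternative("application/postscript", "ANON-FTP", "BodyFormats.ps"),
    Alternative("application/x-ez", "AFS", "/u/nsb/writing/rfcs/RFC-XXXX.ez"),
    Alternative("text/plain", "ANON-FTP", "BodyFormats.txt"),
]


def choose(alternatives, preferred_types, usable_access=None):
    """Pick the alternative whose type the client likes best.

    preferred_types is ordered from most to least preferred.
    usable_access, if given, limits the access types the client can use.
    Returns None when nothing acceptable is offered.
    """
    for wanted in preferred_types:
        for alt in alternatives:
            if alt.content_type != wanted:
                continue
            if usable_access is None or alt.access_type in usable_access:
                return alt
    return None
```

A client on a non-Andrew site without AFS would pass something like
usable_access={"ANON-FTP"} and end up with the postscript or plain text
version, without ever transferring the formats it cannot use.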
The question then becomes: how do these systems interoperate?
By making information available as multipart/X-HYPERTEXT MIME
messages.

The WWW client interfaces to the other systems by defining
"addressing schemes" and implementing the various protocols
and translating the data into HTML. Gopher has a similar
typing scheme -- one character is reserved to indicate
the access type and the data type. WAIS clients have yet
another method of resolving types, though they only support
one protocol. The NewsGrazer application has its own
encapsulation mechanism. This is becoming a mess.

In the short term, global hypertext viewers will have to support
the access-type and content-type of each system with which they
interoperate (so we have X-WAIS, X-HTTP, X-GOPHER, X-NNTP, as well as

Some of the access types will become standard, and some will die out.
But all the data types should be encapsulated in MIME messages. Any
data that has machine-readable pointers to other data should be made
into a multipart/X-HYPERTEXT message. For example, a WAIS question
should have attachments for each of the result documents (the content
part can stay application/x-wais-question, or it could be converted to
a text type, or both), at least in the case where those documents are
available by some standard access method. [I wrote a perl script that
will change an HTML document into a MIME message with attachments.]

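One way a viewer could cope with access types coming and going is a
small dispatch table keyed by access-type, as in the Python sketch
below. Nothing here talks to a network: the handlers only describe what
they would do. The registry, the handler names, and the "selector"
parameter for X-GOPHER are all hypothetical.

```python
HANDLERS = {}


def access_type(name):
    """Decorator that registers a handler for one access-type."""
    def register(func):
        HANDLERS[name.upper()] = func
        return func
    return register


@access_type("ANON-FTP")
def anon_ftp(params):
    # Standard access type: describe an anonymous FTP retrieval.
    path = params["directory"] + "/" + params["name"]
    return "anonymous ftp to " + params["site"] + ": get " + path


@access_type("X-GOPHER")
def x_gopher(params):
    # Experimental access type; "selector" is an invented parameter name.
    return "gopher to " + params["site"] + ": send " + params["selector"]


def plan_retrieval(params):
    """Look up the handler for the part's access-type and run it."""
    kind = params["access-type"].upper()
    handler = HANDLERS.get(kind)
    if handler is None:
        raise LookupError("unsupported access-type: " + kind)
    return handler(params)
```

When an experimental type is standardized, only its registration name
changes; when one dies out, its handler can simply be dropped.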
Leaf documents, i.e. documents with no external links, can stay in
single part types. For example, plain text files become MIME messages
by simply adding a blank line at the beginning (to separate the headers
(none) from the body).

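The blank-line trick can be checked with Python's standard email
parser, as in the short sketch below. The function name and the sample
text are made up; the point is only that a message with no headers is
treated as text/plain by default.

```python
from email import message_from_string


def plain_text_as_message(text):
    """Turn a plain text file into a message by prepending a blank line."""
    return message_from_string("\n" + text)


# Sample leaf document; even a first line that looks like a header
# ends up in the body, because the blank line ends the (empty) headers.
LEAF = plain_text_as_message("Subject: not really a header\nJust text.\n")
```

Here LEAF has no header fields at all, its content type defaults to
text/plain, and the whole original file is the body.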
|
|
Under this model, a mail message can point to a news article
|
|
which references a WAIS document which contains several drawings
|
|
and pointers to several more available by FTP, and a user could
|
|
just point-and-click between them. The only need for
|
|
protocols like gopher and HTTP is to encapsulate data that's not
|
|
already MIME compliant.
|
|
|
|
This is clearly a pipe dream, but it's the kind of thing we can work
towards today.

Dan