<HTML>
|
|
<HEAD>
|
|
<TITLE>Document Management for Web Specs</TITLE>
|
|
</HEAD>
|
|
<BODY>
|
|
<P>
|
|
<!-- context info -->
|
|
<A HREF="../../"><IMG alt="WWW" src="http://www.w3.org/hypertext/WWW/Icons/WWW/WWWlogo48.gif"></A>
|
|
<A href="../"> <IMG src="../../Icons/WWW/html_48x48.gif" ALT="MarkUp"></A>
|
|
| <A HREF="./">SGML</A>
|
|
<H1>
|
|
Document Management for Web Specs
|
|
</H1>
|
|
<P>
|
|
Aaargh! Maintaining specs is a Royal Pain! We need to automate this!
|
|
<P>
|
|
See also:
|
|
<UL>
|
|
<LI>
|
|
<A HREF="http://lists.w3.org/Archives/Public/spec-prod/">spec-prod@w3.org
|
|
Mail Archives</A>
|
|
<LI>
|
|
<A NAME="xmlspec" HREF="../../XML/#xml-spec">W3C XML Specification DTD
|
|
(“XMLspec”)</A>
|
|
<LI>
|
|
<A HREF="http://dri.cornell.edu/pub/davis/html-parser.html">Jim Davis's HTML
|
|
parser</A>, with RFC generator
|
|
<LI>
|
|
<A HREF="../../People/Connolly/drafts/html-design">HTML design notebook</A>,
|
|
with list of implementations
|
|
<LI>
|
|
<A HREF="../../implementations">HTML parser implementations </A>(needs beefing
|
|
up)
|
|
</UL>
|
|
<H2>
|
|
Requirements
|
|
</H2>
|
|
<UL>
|
|
<LI>
|
|
Single Source Format (for each spec, if not a common format for all specs)
|
|
<LI>
|
|
PDF Output
|
|
<LI>
|
|
EPSF Figures (reduced to SVG/PNG for online)
|
|
<LI>
|
|
Tables
|
|
<LI>
|
|
HTML output
|
|
<LI>
|
|
Plain Text output according to IETF formatting guidelines
|
|
<LI>
|
|
Automatic TOC generation, section numbering
|
|
</UL>
|
|
<H2>
|
|
Goals
|
|
</H2>
|
|
<UL>
|
|
<LI>
|
|
Open standard source format
|
|
<LI>
|
|
HTTP-based document management (PUT, version control, ...)
|
|
<LI>
|
|
Direct Manipulation Rich-Text Editing, à la FrameMaker, MS Word
|
|
<LI>
|
|
Direct Manipulation HyperLink Editing, à la Nexus (NaviPress is close)
|
|
<LI>
|
|
Other Automated navigation structures: index, cross references, glossary,
|
|
references
|
|
<LI>
|
|
FrameMaker interoperability
|
|
<LI>
|
|
MS Word Interoperability
|
|
<LI>
|
|
Emacs/vi interoperability (for low-bandwidth situations)
|
|
<LI>
|
|
LaTeX interoperability (nice typesetting)
|
|
<LI>
|
|
Version control with change logs
|
|
<LI>
|
|
Meaningful diffs
|
|
</UL>
|
|
<H2>
|
|
Wishes
|
|
</H2>
|
|
<UL>
|
|
<LI>
|
|
Annotation support (for writer's comments, group comments, public comments)
|
|
<LI>
|
|
"structured sed" -- API for document manipulation (e.g. for TOC generation,
|
|
glossary, etc.)
|
|
<LI>
|
|
<A NAME=automated-bibliography>BibTeX/refer-like database</A>. Here's how
|
|
it works:
|
|
<UL>
|
|
<LI>
|
|
The schema for the database is modeled after bibtex/refer: class (thesis,
|
|
techreport, etc.), title, author, date, abstract ...
|
|
<LI>
|
|
Each record can also be marked "surrogate," meaning the authoritative source
|
|
is somewhere else: the IETF abstracts file, a W3C tech report, etc.
|
|
<LI>
|
|
To refer to an entry in a document, we use some stylized markup. For example,
in the head: &lt;link rel=bibliography href="/Bibliography?"&gt;. Then, at
the point of the reference, some sort of transclusion link markup: &lt;a
rel="embed"
href="/Bibliography?id=draft-ietf-http-v10-4;fields=title,author,date,status,abstract"&gt;
... &lt;/a&gt; (we might need to use RANGE or AS/AE to avoid nested A elements)
|
|
<LI>
|
|
The server checks documents at PUT time. When it sees such a reference, it
|
|
consults the bibliography database (which might involve updating the bib
|
|
DB from the ietf drafts index) and fills in the appropriate fields.
|
|
<LI>
|
|
Voilà! We never need to manually cite documents again. Not only are the citations
reusable (in specs, overviews, etc.), but the database itself can be a valuable
browsing/searching resource.
|
|
</UL>
|
|
<LI>
|
|
Sharable elements/entities (boilerplate, cross-references)
|
|
<LI>
|
|
Author-chunks distinct from reader-chunks
|
|
<LI>
|
|
Equation support
|
|
<LI>
|
|
PowerPoint interoperability (goes beyond the scope of specs into presentations)
|
|
</UL>
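<P>
The PUT-time bibliography expansion wished for above could be sketched roughly
as follows. This is only an illustration under stated assumptions: the
<CODE>BIB_DB</CODE> dictionary stands in for the real database, its record
values are made-up placeholders, and <CODE>parse_query</CODE> and
<CODE>expand_citations</CODE> are hypothetical names, not part of any existing
server.

```python
import re
from html import escape
from urllib.parse import urlsplit

# Hypothetical stand-in for the bibliography database.
# The field values are placeholders, not real bibliographic data.
BIB_DB = {
    "draft-ietf-http-v10-4": {
        "title": "Example Title",
        "author": "A. Author",
        "date": "1995",
    },
}

# Matches <a rel="embed" href="...">...</a> as written in the notes above.
EMBED_RE = re.compile(
    r'(<a\s+rel="embed"\s+href="(?P<href>[^"]*)"\s*>)(?P<body>.*?)(</a>)',
    re.IGNORECASE | re.DOTALL,
)


def parse_query(href):
    """Split the ';'-separated query string into a dict."""
    params = {}
    for part in urlsplit(href).query.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            params[key] = value
    return params


def expand_citations(document, db=BIB_DB):
    """Fill each embed link with the requested fields of its record."""

    def fill(match):
        params = parse_query(match.group("href"))
        record = db.get(params.get("id", ""))
        if record is None:
            return match.group(0)  # unknown id: leave the reference untouched
        fields = params.get("fields", "title").split(",")
        # Fields missing from the record (status, abstract here) are skipped.
        text = ", ".join(escape(record[f]) for f in fields if f in record)
        return match.group(1) + text + match.group(4)

    return EMBED_RE.sub(fill, document)
```

<P>
A server would call <CODE>expand_citations</CODE> on the body of each PUT
request before storing it. A regular expression is used here only to keep the
sketch short; a real implementation would more likely work on parsed markup.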
|
|
<H2>
|
|
Possible Solutions
|
|
</H2>
|
|
<DL>
|
|
<DT>
|
|
<A href="../../Tools/Multiformat">Multiformat tools</A>
|
|
<DD>
|
|
This was used for the PNG spec
|
|
<DT>
|
|
FrameMaker, WebMaker, ??? print-to-text tool
|
|
<DD>
|
|
This is what Roy Fielding (and a lot of other folks) use.
|
|
<UL>
|
|
<LI>
|
|
+ Direct-Manipulation editing
|
|
<LI>
|
|
+ WYSIWYG postscript output
|
|
<LI>
|
|
+ Automatic TOC, cross references, index, section numbering
|
|
<LI>
|
|
+ Automatic HTML output with TOC, chunking, navigation (prev/next/up)
|
|
<LI>
|
|
- generating plain-text is a bear
|
|
<LI>
|
|
- generating HTML requires a baroque toolset that we don't have a license
|
|
to.
|
|
<LI>
|
|
- need a Frame license
|
|
<LI>
|
|
- no way to edit over a telnet connection
|
|
<LI>
|
|
- no way to edit over an HTTP connection (must have local access to files)
|
|
</UL>
|
|
<DT>
|
|
HTML+, dsr's tools
|
|
<DD>
|
|
Dave Raggett edits the HTML with a text editor (mostly BBEdit on a Mac).
|
|
He's got some little tools written in C to produce plain text.
|
|
<UL>
|
|
<LI>
|
|
+ Automatic TOC, headers/footers in text
|
|
<LI>
|
|
+ renders HTML math in plain text output
|
|
<LI>
|
|
+ HTML is easy to edit over a telnet connection
|
|
<LI>
|
|
+ can work with Navipress to edit HTML via HTTP
|
|
<LI>
|
|
- no postscript output tools
|
|
<LI>
|
|
- text generation tool often requires manual post-processing
|
|
<LI>
|
|
- author chunks must be the same as reader chunks
|
|
<LI>
|
|
- manual maintenance of prev/next links in HTML
|
|
<LI>
|
|
- manual maintenance of cross-references, index, etc.
|
|
<LI>
|
|
- little structure in the source format (e.g. no explicit "Abstract" structure)
|
|
</UL>
|
|
<DT>
|
|
Snafu DTD, gf tools, Texi2HTML, COST, Joe English
|
|
<DD>
|
|
This is what I ended up using for HTML 2.0
|
|
<UL>
|
|
<LI>
|
|
+ structured source format
|
|
<LI>
|
|
+ open standard source format
|
|
<LI>
|
|
+ Automated postscript output in IETF format with TOC, headers, footers,
|
|
cross-references, References, section numbers, glossary
|
|
<LI>
|
|
+ Automated plaintext output with same features
|
|
<LI>
|
|
+ Automated HTML output with glossary, TOC, navigation links
|
|
<LI>
|
|
+ LaTeX output
|
|
<LI>
|
|
+ TeXinfo output
|
|
<LI>
|
|
+ RTF output
|
|
<LI>
|
|
+ meaningful diffs (cuz I edited the source with a text editor)
|
|
<LI>
|
|
+ low-bandwidth access (emacs over telnet works fine)
|
|
<LI>
|
|
- Tools require outside support (Joe English)
|
|
<LI>
|
|
- Tools are baroque
|
|
<LI>
|
|
- no WYSIWYG tools (perhaps SoftQuad Author/Editor?)
|
|
</UL>
|
|
<DT>
|
|
LinuxDoc
|
|
<DD>
|
|
<UL>
|
|
<LI>
|
|
- print-to-text tool isn't IETF happy
|
|
<LI>
|
|
- no direct-manipulation editing tools
|
|
</UL>
|
|
<DT>
|
|
LaTeX, latex2html, IETF print-to-text tools
|
|
<DD>
|
|
<DT>
|
|
MS Word, rtf2html, ??? print-to-text tool
|
|
<DD>
|
|
</DL>
|
|
<H2>
|
|
Ideal Solution
|
|
</H2>
|
|
<DL>
|
|
<DT>
|
|
Source format: HTML dialect
|
|
<DD>
|
|
use a strict HTML dialect with: tables, class=abstract, possibly math.
|
|
<DT>
|
|
Document Manipulation API: java interface
|
|
<DD>
|
|
formerly:
|
|
<BLOCKQUOTE>
|
|
There are lots of web libraries for python. We could eventually specify the
|
|
interfaces in ILU and use them from lots of languages (C, C++, java, scheme,
|
|
CommonLisp, Modula-3), but we'd prototype and develop using python.
|
|
<P>
|
|
I've already written little tools to do things like relativize links and
|
|
such. Rather than doing TOC generation, section numbering, etc. during
translation, we'd do it in-place in the source, but automatically.
|
|
</BLOCKQUOTE>
|
|
<P>
|
|
changed my mind, since java has at least the potential to address the
|
|
installation bugs. Plus, it looks like we can write for the java VM in scheme
|
|
(see kawa)
|
|
<DT>
|
|
Chunking support: python scripts
|
|
<DD>
|
|
This would handle chunking many HTML documents into one for printing, and
|
|
many-to-many chunking for author/reader convenience.
|
|
<DT>
|
|
PostScript Output: python implementation of Mosaic print tool
|
|
<DD>
|
|
This code is already written. Guido translated the postscript printing code
|
|
from Mosaic into python. We could adapt things like headers/footers for our
|
|
needs. This eliminates the need for a TeX installation.
|
|
<DT>
|
|
Postscript Output: libwww TeX module?
|
|
<DD>
|
|
use HTTeXGen module in libwww to generate TeX. It doesn't currently support
|
|
all the features we need, but it could work. It would rely on a many-to-one
|
|
html-to-html filter
|
|
<DT>
|
|
Postscript Output: html2lout?
|
|
<DD>
|
|
lout is kinda like TeX, but it was written after the dawn of postscript,
|
|
so there's less redundancy between lout and PS than between TeX and PS. The
|
|
syntax of lout is also cleaner. Lout has table, equation, etc. packages.
|
|
A clean html2lout filter should be much more reliable and hands-free than
|
|
anything based on TeX.
|
|
<DT>
|
|
Plain-Text output: custom python app?
|
|
<DD>
|
|
there is already python code to do simple html to text formatting, but handling
|
|
multiple documents, tables etc. needs to be added, as well as IETF style
|
|
<DT>
|
|
Plain-Text output: libwww module?
|
|
<DD>
|
|
same feature enhancements would be needed.
|
|
</DL>
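<P>
The idea of doing section numbering and TOC collection in place in the source,
rather than during translation, could look something like the Python sketch
below. It is a minimal illustration, not one of the existing tools: the
function name <CODE>number_sections</CODE> is hypothetical, it assumes headings
are plain H2 to H4 elements without attributes, and it would also strip a
leading number that is genuinely part of a heading title.

```python
import re

# An H2-H4 heading, with an optional section number left by a previous run.
HEADING_RE = re.compile(
    r"<H([2-4])>\s*(?:[\d.]+\s+)?(.*?)\s*</H\1>",
    re.IGNORECASE | re.DOTALL,
)


def number_sections(html):
    """Return (renumbered source, table-of-contents entries)."""
    counters = [0, 0, 0]  # one counter each for H2, H3, H4
    toc = []

    def renumber(match):
        level = int(match.group(1))
        title = match.group(2)
        depth = level - 2
        counters[depth] += 1
        # Starting a new section resets the counters of deeper levels.
        for i in range(depth + 1, len(counters)):
            counters[i] = 0
        number = ".".join(str(n) for n in counters[: depth + 1])
        toc.append((level, number, title))
        return f"<H{level}>{number} {title}</H{level}>"

    return HEADING_RE.sub(renumber, html), toc
```

<P>
Because any old number is stripped before the new one is written, the function
can be rerun on its own output after sections are added or moved, which is the
point of editing the source in place.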
|
|
<H2>
|
|
Wish list
|
|
</H2>
|
|
<DL>
|
|
<DT>
|
|
Direct manipulation grammar editor
|
|
<DD>
|
|
for SGML DTDs, RFC822 grammars in HTTP specs, etc.
|
|
</DL>
|
|
<H2>
|
|
References
|
|
</H2>
|
|
<DL>
|
|
<DT>
|
|
<A href="http://www.inf.tu-dresden.de/~jw6/doc/sdc/index.html">SDC</A>
|
|
<DD>
|
|
structured document conversion.
|
|
<A href="http://www.comp.vuw.ac.nz/Technical/SGML/">in use at vuw.ac.nz</A>.
|
|
Gotta check it out...
|
|
<DT>
|
|
<A href="http://www.chiark.greenend.org.uk/~ijackson/debiandoc-sgml-markup/">Debiandoc-SGML
|
|
markup manual</A>
|
|
<DD>
|
|
4 February 1997 Ian Jackson ijackson@gnu.ai.mit.edu.
|
|
<P>
|
|
source in <A href="ftp://ftp.debian.org/debian/unstable/source/text">debian
|
|
archive under text</A>
|
|
<DT>
|
|
<A href="http://nathan.gmd.de/persons/thomas.gordon.html">Dr. Thomas F.
|
|
Gordon</A>
|
|
<DD>
|
|
GMD FIT - German National Research Center for Information Technology<BR>
|
|
Research Division Artificial Intelligence<BR>
|
|
53754 Sankt Augustin, Germany<BR>
|
|
email: thomas.gordon@gmd.de; phone: (+49 2241) 14-2665
|
|
<DT>
|
|
<A href="http://liinwww.ira.uka.de/bibliography/index.html">The Collection
|
|
of Computer Science Bibliographies</A>
|
|
<DD>
|
|
Copyright © 1995-1996 Alf-Christian Achilles
|
|
<P>
|
|
Great for the reference section!
|
|
<DT>
|
|
<A href="http://www.jclark.com/sp/spam.htm">spam</A>
|
|
<DD>
|
|
<A href="http://www.sil.org/sgml/archEngine.html">example of using spam to
|
|
munge HTML</A>
|
|
<DT>
|
|
<A href="http://fatman.mathematik.tu-muenchen.de/~schwarz/sgml-tools/">SGML
|
|
tools</A>
|
|
<DD>
|
|
as used in <A href="http://sunsite.unc.edu/LDP/">The Linux Documentation
|
|
Project</A>
|
|
<DT>
|
|
<A HREF="http://search.yahoo.com/bin/search?p=Postscript">Postscript</A>
|
|
<DD>
|
|
<DT>
|
|
<A HREF="http://search.yahoo.com/bin/search?p=python">Python</A>
|
|
<DD>
|
|
<DT>
|
|
<A HREF="http://search.yahoo.com/bin/search?p=SGML">SGML</A>
|
|
<DD>
|
|
<DT>
|
|
<A HREF="http://search.yahoo.com/bin/search?p=LinuxDocSGML">LinuxDocSGML</A>
|
|
<DD>
|
|
<DT>
|
|
<A href="../Relavent#lout">lout</A>
|
|
<DD>
|
|
<A href="http://www.ptc.spbu.ru/mail-archives/lout/0095.html">Re: Lout to
|
|
HTML Jin S. Choi (jsc@atype.com) Wed, 13 Nov 1996 19:34:47 -0500 </A>. Nifty
|
|
thread about LOUT, SGML, DSSSL, HTML, etc. I agree!
|
|
<DT>
|
|
Joe English
|
|
<DD>
|
|
<DT>
|
|
<A HREF="http://www.ccil.org/~esr/home.html">Eric Raymond</A>
|
|
<DD>
|
|
Linux, computational linguistics, www-html
|
|
&lt;199512221800.NAA09004@locke.ccil.org&gt;
|
|
<DT>
|
|
<A href="ftp://ftp.ietf.org/ietf/1id-guidelines.txt">IETF draft guidelines</A>
|
|
</DL>
|
|
<P>
|
|
<P>
|
|
@@ I know from first-hand experience that producing multi-purpose technical
|
|
specifications (e.g. IETF plain text, online hypertext, and postscript) is
|
|
tricky and tedious. I try to keep track of tools that might provide solutions
|
|
to this problem.
|
|
<DL>
|
|
<DT>
|
|
<A href="http://www.jclark.com/sp.html">SP</A>
|
|
<DD>
|
|
a new C++ based SGML parser by James Clark, the author of SGMLS
|
|
<DT>
|
|
<A href="ftp://ftp.ifi.uio.no/pub/SGML/Demo/dtd-fragments-0.2.tar.gz">DTD
|
|
Fragments</A>
|
|
<DD>
|
|
<BLOCKQUOTE>
|
|
Another SGMLS/Perl formatter, DTD Fragments. It's not DTD specific and does
|
|
output to HTML, ASCII and TROFF, it does require a DTD to generic element
|
|
mapping in Perl for any specific DTD and comes with DocBook and Linuxdoc
|
|
mappings. The next version will have RTF output, Snafu DTD mapping and better
|
|
support for applying different styles to the output.
|
|
<ADDRESS>
|
|
<A href="mailto:ken@bitsko.slc.ut.us">Ken MacLeod</A>
|
|
</ADDRESS>
|
|
</BLOCKQUOTE>
|
|
<DT>
|
|
<A HREF="http://www.uottawa.ca/~dmeggins/">SGMLSpm</A>
|
|
<DD>
|
|
Another perl5/nsgmls toolset. Includes some support for DocBook-&gt;LaTeX,
|
|
HTML conversion, though that part of the code looks like a one-time shot,
|
|
not a complete implementation.
|
|
<DD>
|
|
<DT>
|
|
<A HREF="http://www.oac.uci.edu/indiv/ehood/dtd2html.doc.html" >DTD2HTML</A>
|
|
<DD>
|
|
An SGML DTD documentation/navigation tool by
|
|
<A href="http://www.oac.uci.edu/indiv/ehood/">Earl Hood</A><BR>
|
|
This tool translates an SGML DTD into HTML, providing hypertext navigation
|
|
of the document structure. Handy for learning SGML.
|
|
<DT>
|
|
<A name="psgml" HREF="http://www.lysator.liu.se/projects/about_psgml">PSGML</A>
|
|
<DD>
|
|
A GNU Emacs mode for SGML files
|
|
<DT>
|
|
<A HREF="http://www.sq.com/hm-ftp.html" >Getting HotMetaL by FTP</A>
|
|
<DT>
|
|
<A HREF="http://www.sq.com/panor-pr.html" >SoftQuad Inc. Panorama Press
|
|
Release</A>
|
|
<DT>
|
|
<A HREF="http://www.informatik.tu-muenchen.de/~schwarz/linuxdoc-sgml/">Linux
|
|
Doc/SGML</A>
|
|
<DD>
|
|
These guys have taken a very practical approach to SGML for technical
|
|
documentation. They started with SGMLs from James Clark and the QWERTZ DTD,
|
|
which mirrors LaTeX structure. Then they added down-translators for groff,
|
|
HTML, and others. Looks promising.
|
|
<P>
|
|
Hmmm... on closer examination, this is something of a hack. They hacked the
|
|
DTD, hacked the down-translators, etc. I like the idea of using a LaTeX-like
|
|
DTD, but I think I'll wait till this matures a little more. Also:
|
|
<A HREF="ftp://sunsite.unc.edu/pub/Linux/utils/text/">distribution archive</A>.
|
|
<DT>
|
|
<A HREF="ftp://ftp.th-darmstadt.de/pub/text/sgml/misc/" >GF: General SGML
|
|
Formatter</A>
|
|
<DD>
|
|
another SGMLs based SGML to HTML converter supporting a few sophisticated
|
|
DTDs
|
|
<DT>
|
|
<A HREF="http://web.nexor.co.uk/mak/doc/html/sgml-lib/html-sgml.html" >Setting
|
|
up PSGML and sgmls for HTML</A>
|
|
<DT>
|
|
<A HREF="ftp://ftp.jclark.com/pub/sp" >Remote file ftp.jclark.com/pub/sp</A>
|
|
<DT>
|
|
<A HREF="http://www.art.com/cost/" >CoST</A>
|
|
<DD>
|
|
Copenhagen SGMLs Tool -- SGMLs meets Tcl<BR>
|
|
maintained by Joe English
|
|
</DL>
|
|
<P>
|
|
<HR>
|
|
<ADDRESS>
|
|
<A HREF="../People/Connolly/">Dan Connolly</A><BR>
|
|
created 1995/12/05<BR>
|
|
last update by $Author: connolly $ on $Date: 1999/11/23 20:35:13 $
|
|
</ADDRESS>
|
|
</BODY></HTML>
|