Configurable User Documentation; or, How I Came to Write a Language with a Future Conditional

Mark A. Verber
The Ohio State University

Elizabeth D. Zwicky
SRI International

Introduction

In 1984 we decided that the Ohio State University Computer and Information Science department (OSU-CIS) needed an introduction to the department computing facilities. We wanted a document that pulled together all the information that a new graduate student or faculty member might need: site policies, electronic mail addresses, a list of facilities provided, and basic introductions to the various operating systems the department supported. Creating such a document was particularly important since there were no good introductions to our two primary platforms: a DECsystem-20 running TOPS-20, and a VAX running BSD UNIX.

Much of this important information was getting passed student to student in the form of oral traditions. This is fine, except that information was not spread around evenly (you needed to know one of the system gurus or you were missing important information), and once oral traditions were started it was almost impossible to kill them, even when the information became incorrect or irrelevant.

Being firm believers in not working any harder than we needed to, we looked around for a suitable existing document that we could take and adapt to our local requirements. We found a document which had been written for the Computer Science department at Carnegie-Mellon. Years went by, and various people at OSU-CIS edited the document, and edited the document, and edited the document. It grew to be three times its original size, changed text formatter, developed its own layout, grew an annotated bibliography, added its own font, and otherwise consumed time and resources.

And then the two of us who had maintained it at OSU-CIS left, to go to other jobs. Lo and behold, the new sites we were at had no equivalent user documentation. But the old OSU-CIS Facilities Guide didn't quite work for them. On the other hand, there was not a chance that either of us was going to throw it out and start over, having invested several years in it already. Therefore, in a spirit of enlightened self-interest, we set out to build it into a user-configurable book -- something that we could use at both our sites, take to new sites, and let other people use as well.

The Problem

We found the available documentation inadequate for our users. There are three main types of documentation that you can get without writing it yourself: manual pages, other documentation from the manufacturer, and books.

Manual pages are not useful for truly naïve users. They don't have the right sort of information (you can't find out what your e-mail address is, or what editor you should use); the information they do have is often in words that new users don't understand; and they do have all sorts of irrelevant information that acts as noise.

Next is the manufacturer's documentation. Its quality varies; unfortunately, the range is from ``almost good enough'' to ``you're not certain whether to laugh or to cry''. [The manufacturer that shipped an installation manual and a third-generation photocopy of the Berkeley 4.2 docs -- complete with font tables for Berkeley's Versatec printer -- falls into the latter category.] Even the best quality documentation from manufacturers tends to fall short of what the average user needs.

Most often the manufacturer provides a very simplistic ``getting started'' guide, and then a stack of documentation which includes all the man pages in printed form and stacks of detailed manuals describing major software systems such as troff, program development tools, etc. The result is that accumulating enough manufacturer-supplied documentation to introduce a new user to a system generally produces a foot-high stack -- the ``getting started'' guide is never enough. The new user's response to the foot-high stack is to put it in a corner and whimper, never looking at the documentation because it is too overwhelming. Furthermore, the site that uses only and exactly the software that its hardware manufacturer supplies has yet to be found.

Because of these well-understood problems with vendor documentation, you can go to your neighborhood bookstore and buy an introduction to UNIX. These days, you can even buy an introduction to Berkeley and Berkeley-derived versions of UNIX. Unfortunately, the quality of these introductions varies as much as that of the manufacturer-supplied documentation. Even the best of the books will tell you how to use only and exactly what most hardware manufacturers supply. For instance, it will almost certainly tell you how to use vi and troff, which is all very well if that's what your users are supposed to prefer. If your standard is emacs and LaTeX, or FrameMaker, these books will merely confuse your users. By the time you have put together the documentation that lists all the ways your site is different from what the book says, not only have you done as much work as it would have taken to write the documentation from scratch, but you are also back to the foot-high stack of documentation.

Goals

We wanted a document that would meet the following goals:

Our Solution

Many of these goals were met by the OSU-CIS Facilities Guide; it had most of the information in it, the writing style was accessible, there was a good index, it was fanatically structured, and it had a good bibliography. In most cases, it already had macros in place to avoid requiring multiple changes of the same information, and it even had some macros set to allow chapters or sections to be used as stand-alone documents. Unfortunately, it failed pretty badly on portability. What portability it had was the result of adapting to changes in the department's configuration over the years; there had never been any hesitation about coding assumptions about being a university in the middle of Ohio, for instance.

The changes that had already been made were in the form of LaTeX macros. After the year in which the staff offices moved 4 times (and some staff members moved 12), all the names and addresses had been split out into a configuration file, for instance. We kept those, and added to them where appropriate.

Simple string replacement was not enough to fix all the changeable parts of the document, however, and we had to add two further methods of customizing it. The first was taking text that was clearly unsalvageable -- so changeable that it was going to have to be rewritten for every site -- and isolating it in individual files. Examples of what we put into separate files include site policies for accounts and user behavior; descriptions of printing devices; dialup information; and tables showing supported programming languages. All of these files go in a ``Local'' subdirectory, and examples are provided with the document. Pulling the files into the main document is done with normal LaTeX input primitives.

For other parts of the document, we needed a solution intermediate between string replacement and file insertion; in fact, we needed something like conditional compilation, where you could have the appearance of text depend on the value of variables. This could be very simple, like omitting the VMS chapter for sites that don't have VMS, or considerably more complex. Unfortunately, tools for text production don't provide features like conditional compilation, except to a limited extent. TeX conditionals really are not well suited for trying to comment out entire blocks of code; it's possible, but it isn't pretty.

Tools that are intended for programs do conditionals well, but break other things. For instance, cpp leaves blank lines where all its directives are, which is lovely for preserving line numbering, but really upsets life when blank lines change the semantics of the language -- as they do in TeX and troff. Furthermore, it's highly inconvenient to have #defined variables interpreted wherever they occur in the text. Capitalization may suffice to distinguish between a preprocessor directive and a language element in C, but in English you don't get free choice of how to capitalize things, and you end up making outrageously ugly #defines to avoid having your text diced. (Instead of using ``#ifdef UNIX'', you must use something like ``#ifdef UNIXP''. Or ``#ifdef unix'' -- after all, ``unix'' is not legal in text unless it happens to be the name of a UNIX command -- but that really upsets people who are accustomed to C.)
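Both complaints can be seen in miniature in the following Python sketch. It is not cpp and does not evaluate the conditional; it only imitates, under simplifying assumptions, the two habits described above: every directive line turns into an empty line, and every defined name is replaced wherever it occurs as a word. The function name naive_macro_pass and the sample text are invented for illustration.

```python
import re

def naive_macro_pass(lines, defined):
    """Imitate two cpp habits on prose (a simplification, not cpp itself):
    directive lines become blank lines, and each defined name is replaced
    wherever it appears as a whole word."""
    out = []
    for line in lines:
        if line.startswith("#"):
            out.append("")  # the directive leaves an empty line behind
            continue
        for name, value in defined.items():
            line = re.sub(r"\b%s\b" % re.escape(name), value, line)
        out.append(line)
    return out

text = [
    "#ifdef UNIX",
    "UNIX is the primary system here.",
    "#endif",
]
# A bare "#define UNIX" gives the name an empty replacement.
result = naive_macro_pass(text, {"UNIX": ""})
for line in result:
    print(repr(line))
```

The sentence loses its subject, and the two empty lines that remain would each be read as a paragraph break by TeX.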

Furthermore, there are some structures in English that are not well handled by ``if'' and ``if not''. Using cpp to try to control a sentence that is supposed to end up saying something like ``There are three operating systems you can make your home on: UNIX, TOPS-20, and VMS'' when you don't know how many operating systems will be involved is very, very nasty. When I tried it, I ended up with a screen full of intricately nested ifs, and a bad headache.

Our solution to this was to build a text pre-processor, which we call tpp. tpp carefully gets rid of lots of the features of cpp, and introduces a few new ones instead. The future conditionals mentioned in the title are among them; in sentences like the one above, you can create a case statement that fixes the ``are three'' to ``is one'' or ``are many'' depending on how many things are going to be true when you get to the end of the sentence.

TPP

Tpp is currently a very small language, containing a whole 7 directives. It is implemented as a Perl program. The 7 tpp directives are define, undef, if, ifndef, conj, number and bynumber. Tpp currently considers any line beginning ``%#'' to be information for it; it pays no attention to other lines, and either emits them unchanged or doesn't emit them at all. ``%#'' was chosen for LaTeX's benefit, to allow a tpp document to be processed by LaTeX without first having been run through tpp. Future versions of tpp will allow you to choose an attention sequence.

In the current version of tpp, variables are either set or unset; they don't take values. By default, they are unset; it is perfectly legal to unset a variable that is already unset, set one that's already set, or reference one that was never explicitly set or unset. if and ifndef control emission of text, and have matching else and endif. The following code produces ``Perl is fun.'' as its only output:

%#define fun
%#if fun
Perl is fun.
%#else
Perl is not fun.
%#endif
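The behavior just described can be approximated in a few lines of Python. This is a minimal sketch under stated assumptions, not the Perl implementation of tpp: the function name tpp_conditionals is invented, nesting is assumed to work the way it does in cpp, and define and undef are assumed to take effect only in text that is being emitted.

```python
def tpp_conditionals(lines, prefix="%#"):
    """Handle define/undef/if/ifndef/else/endif; pass other lines through.
    Nesting rules are an assumption modeled on cpp."""
    defined = set()
    stack = []  # one bool per open if/ifndef: is its current branch live?
    out = []
    for line in lines:
        emitting = all(stack)  # live only if every enclosing branch is live
        if not line.startswith(prefix):
            if emitting:
                out.append(line)
            continue
        words = line[len(prefix):].split()
        if not words:
            continue
        directive, args = words[0], words[1:]
        if directive == "if":
            stack.append(args[0] in defined)
        elif directive == "ifndef":
            stack.append(args[0] not in defined)
        elif directive == "else":
            stack[-1] = not stack[-1]
        elif directive == "endif":
            stack.pop()
        elif directive == "define" and emitting:
            defined.add(args[0])
        elif directive == "undef" and emitting:
            defined.discard(args[0])
    return out

example = [
    "%#define fun",
    "%#if fun",
    "Perl is fun.",
    "%#else",
    "Perl is not fun.",
    "%#endif",
]
print(tpp_conditionals(example))
```

Directive lines are dropped outright rather than replaced by blank lines, which is the property that matters for TeX and troff input.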

conj is the conjunction statement, used to output lists in English; it takes a conjunction as an argument, and then cases by variable.

%#define mares
%#define does
%#define lambs
It's true;
%#conj and
%#case mares
mares eat oats
%#case does
does eat oats
%#case lambs
little lambs eat ivy
%#endconj
.

produces ``It's true; mares eat oats, does eat oats, and little lambs eat ivy.'' If ``mares'' and ``does'' are unset, the sentence reads ``It's true; little lambs eat ivy.'' This may seem unimpressive, until you try to produce this effect with only cpp directives.
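The joining rule behind conj can be sketched in Python as follows. The three-item and one-item results match the outputs quoted above; the two-item form (no comma before the conjunction) is an assumption, since the text does not show it, and the function name conj_join is hypothetical.

```python
def conj_join(conjunction, cases, defined):
    """Join the text of every case whose variable is set into an English list.
    cases is a list of (variable, text) pairs in document order."""
    items = [text for variable, text in cases if variable in defined]
    if len(items) <= 1:
        return "".join(items)  # zero or one item: no conjunction needed
    if len(items) == 2:
        # Assumption: two items are joined without a comma.
        return "%s %s %s" % (items[0], conjunction, items[1])
    return "%s, %s %s" % (", ".join(items[:-1]), conjunction, items[-1])

cases = [
    ("mares", "mares eat oats"),
    ("does", "does eat oats"),
    ("lambs", "little lambs eat ivy"),
]
print(conj_join("and", cases, {"mares", "does", "lambs"}))
print(conj_join("and", cases, {"lambs"}))
```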

number takes ``last'' or ``next'' as an argument, and returns the number of cases that were true in the most recent conj, or are going to be true in the next one. This allows you to say

There are
%#number next
main weapons of the Spanish Inquisition:
%#conj and
%#case fear
fear
%#case surprise
surprise
%#case devotion
a fanatical devotion to the pope
%#endconj
.

and not have the usual problem with getting the number right.
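The ``next'' argument is what makes this a future conditional: the count has to be known before the conj it describes has been read. One way to get that effect, shown here as a Python sketch rather than as a description of how tpp actually does it, is to make two passes over the input. The function name resolve_numbers is invented, and emitting the count as digits is an assumption; the text does not say whether tpp spells numbers out.

```python
def resolve_numbers(lines, defined, prefix="%#"):
    """Replace each 'number last' or 'number next' line with a count.
    Pass 1 counts the true cases of every conj; pass 2 substitutes."""
    counts, current = [], None
    for line in lines:
        if line.startswith(prefix + "conj"):
            current = 0
        elif line.startswith(prefix + "case") and current is not None:
            if line.split()[1] in defined:
                current += 1
        elif line.startswith(prefix + "endconj"):
            counts.append(current)
            current = None

    out, finished = [], 0  # finished = conj blocks already closed
    for line in lines:
        if line.startswith(prefix + "number"):
            which = line.split()[1]
            index = finished if which == "next" else finished - 1
            if not 0 <= index < len(counts):
                raise ValueError("no %s conj for: %s" % (which, line))
            out.append(str(counts[index]))  # assumption: digits, not words
            continue
        if line.startswith(prefix + "endconj"):
            finished += 1
        out.append(line)
    return out

weapons = [
    "There are",
    "%#number next",
    "main weapons of the Spanish Inquisition:",
    "%#conj and",
    "%#case fear",
    "fear",
    "%#case surprise",
    "surprise",
    "%#case devotion",
    "a fanatical devotion to the pope",
    "%#endconj",
    ".",
]
print(resolve_numbers(weapons, {"fear", "surprise"})[1])
```

The conj lines themselves are left untouched here; a full processor would expand them in the same second pass.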

bynumber also takes ``last'' or ``next'' as an argument, but it then cases on the result. It allows ranges, and also the use of the keyword ``many'' to mean ``a bigger number than I've got a case for'', allowing

There
%#bynumber next
%#case 0
are no weapons
%#case 1-3
are a few main weapons
%#case 4-10
are several main weapons
%#case many
are lots and lots of main weapons
%#endnumber
of the Spanish Inquisition.
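Case selection for bynumber can be sketched in Python like this. It is an illustration under assumptions rather than tpp's own logic: the function name pick_case is hypothetical, ``many'' is taken to apply only when the count exceeds every explicit case, and a count that matches nothing is assumed to select no text at all.

```python
def pick_case(count, cases):
    """Choose the text for a count from (label, text) pairs.
    Labels are a single number ("0"), a range ("1-3"), or "many"."""
    many_text, highest = None, -1
    for label, text in cases:
        if label == "many":
            many_text = text
            continue
        low, _, high = label.partition("-")
        low, high = int(low), int(high or low)  # "0" means the range 0-0
        if low <= count <= high:
            return text
        highest = max(highest, high)
    # "many" covers only counts bigger than any explicit case.
    return many_text if count > highest else None

weapon_cases = [
    ("0", "are no weapons"),
    ("1-3", "are a few main weapons"),
    ("4-10", "are several main weapons"),
    ("many", "are lots and lots of main weapons"),
]
for n in (0, 3, 4, 11):
    print(n, pick_case(n, weapon_cases))
```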

Guide Contents

Our first goal was for the Facilities Guide to be non-threatening. One part of being non-threatening was keeping the Facilities Guide down to a reasonable size without a novice user needing other reference material. Unfortunately, that's only so achievable; the Facilities Guide configured for the Ohio State Physics department produces 180 pages of text. This is enough to be scary to a new user. We try to ease these concerns with our introduction, which tells new users that they don't have to learn everything that is contained in the Facilities Guide. The first chapter is a roadmap, which explains what each of the chapters is about and tells a new (or experienced) user which chapters should be helpful.

We wrote the second chapter of the Facilities Guide for the computer neophyte: someone who has no idea what a text editor is, much less why using electronic mail is useful. Having this introduction makes the rest of the Guide more useful, since we can assume a basic level of understanding. The first chapter advises experienced users to skip or skim this chapter, since they presumably do not want to be told what a text formatter does and why you would want to use one.

The middle chapters of the guide address specific operating systems (currently these are UNIX, VMS, TOPS-20, and the Macintosh OS, although an MS-DOS chapter is in progress). Each of the operating system chapters follows the same basic outline (as does the introductory chapter), although the details change from operating system to operating system. The hope is that the information comes in the order that people need it. Roughly, the structure is:

  1. How to get in.
  2. How to change your password.
  3. How to get out.
  4. How to give commands.
  5. How to use electronic mail (including the information about what your mail address is).
  6. The file system; how it is arranged, and how you deal with files.
  7. Text editing.
  8. Formatting systems.
  9. Printers.
  10. Programming languages and tools.
  11. Other topics.
  12. Games.

In some cases, not all the useful information will fit into this relatively rigid structure. The UNIX chapter, for instance, has long since grown into multiple chapters, with a separate chapter to deal with window systems. Where possible, these additional chapters echo the same structure (for instance, the description of each window system starts by telling you how to start it up, how to shut it down, and how to get help).

In order to keep the operating system chapters relatively short and non-repetitive, information about significant tools that occur on multiple operating systems is pulled out into separate chapters. For instance, a separate text formatting chapter is provided with the information about LaTeX. The operating system chapters provide the operating system dependent information (how to start the executable, how to print a dvi file, where to find sample documents) and then refer to the specific chapter.

We have created an extensive index, a detailed table of contents, and a large annotated bibliography. The layout of the Facilities Guide is designed to make specific sections easy to find by using a lot of white space and horizontal rules.

The use of tpp and various TeX primitives permits us to intersperse site-specific information within the body of a very generalized section. Examples given in the text use host names and the command line prompts that are set locally. Rather than talking in general about printer support, we can say what printer is preferred for high quality output, and which printer is the fastest. In the section discussing remote access we can tell users what phone numbers to use and what they need to type to gain access to our machine via a dialup.

New Problems We Have Created

The version of the Facilities Guide I am looking at now takes up nearly three megabytes of disk; that's built for my configuration as far as LaTeX source. That does not include either LaTeX or Perl, which work together to create the manual. Along with the actual text, the manual distribution contains not only tpp, but also an indexing program and, believe it or not, a PostScript font -- the font is only actually required for the Macintosh chapter.

All in all, it's a lot of trouble to go to for a manual, especially when you consider that you may still end up creating sections for programs or operating systems that we don't run. On the other hand, it's a lot less trouble than writing your own manual from scratch, and we will happily merge other people's sections in, with credit. (The acknowledgments section is now well into its second page.)

New Solutions We Have Created

As it turns out, there are programs besides LaTeX and troff that care deeply about empty lines and may have almost anything in them -- sendmail, for instance. While the more esoteric text processing features of tpp are out of place in a sendmail.cf, the ability to do cpp-style preprocessing to create multiple sendmail.cfs out of a single master can be extremely handy at a large site, and in fact OSU-CIS, which is not presently using the Facilities Guide, is using tpp for that purpose.

Availability

The Facilities Guide and tpp are available as http://www.verber.com/mark/fac-guide.tar.Z and http://www.verber.com/mark/tpp.shar respectively.

About the Authors

Mark Verber was a system programmer for the Physics Department at The Ohio State University. He discovered UNIX in 1978 as a high school student and has been working for OSU since 1980. Reach him via U.S. Mail at The Ohio State University; Physics Department; 174 W. 18th Avenue; Columbus, Ohio 43210. Reach him electronically at .  Mark now works for WebTV.

Elizabeth Zwicky was a system administrator for the Information, Telecommunication and Automation Division at SRI International, where she is known for writing peculiar programs in languages beginning with the letter ``P''. Reach her via U.S. Mail at SRI International; 333 Ravenswood Avenue; Menlo Park, CA 94025. Reach her electronically at zwicky@erg.sri.com. Elizabeth now works for SGI.