Choosing a documentation solution

Jul 29, 2012

For a while I have been using Microsoft Word to author user guides and, to be fair, this has been a good solution up until now. Since beginning work on Rotorz Tile System there have been a number of product updates, each of which has required the user documentation to be updated. Despite using style presets, the process of maintaining consistent formatting and updating illustrations is extremely time-consuming. I also feel that there is a need for a web-based version of the documentation to make it easier for people to find their way around.

So… ultimately I required a solution that takes a number of document fragments and composes them in a variety of ways to provide at least two deliverables (PDF plus web-based documentation). After some careful thought I also decided that, to save valuable time, I would like to avoid directly formatting the documentation that I write; this should be handled automatically by the documentation solution.

I considered a number of input formats in which to compose my documentation:

HTML5 (http://www.html5rocks.com) – Utilize the semantic elements and attributes that are offered by the HTML5 specification to author documentation. Each section of the documentation would be written into separate HTML5 files which can then be composed into a PDF file and structured web pages using a document map (perhaps written in XML).

Whilst there are a number of WYSIWYG (What-You-See-Is-What-You-Get) editors, in my experience such editors tend to add a lot of junk to the markup. I would be inclined to input documentation by entering markup directly and then create a custom WYSIWYG editor using something like http://aloha-editor.org. A batch processor would automatically add syntax highlighting to code listings plus expand short tags. The output HTML files (plus the document map) would be glued together using some simple PHP scripts.

Markdown (http://daringfireball.net/projects/markdown) – Avoid using XML-like elements and attributes and focus entirely on authoring content. Markdown processors generate HTML (and, with additional tooling, PDF) output from files written using plain text conventions that were popularized in the early days of plain text e-mail.

One of the beauties of this approach is that no specialized editors are required to author content (though Notepad++ and BBEdit offer syntax highlighting for Markdown). However, with simplicity comes the cost of losing the ability to express certain semantics. A batch processor would be required to automate the entire conversion process and to perform syntax highlighting on code listings. A document map would also be required so that the PHP scripts can glue the output together.

LaTeX (http://www.latex-project.org) – A very popular format for authoring documentation (with fantastic support for mathematical equations). Authored content can be converted into both PDF and HTML, with the content outlined automatically.

There are a number of specialized WYSIWYM (What-You-See-Is-What-You-Mean) editors which make the process of authoring content easy. Whilst there are ways to use LaTeX to write modular documentation, it is a lot easier to work within a single document. Although content is structured, there are again fewer semantic offerings out-of-the-box; however, there are extensions like SALT (http://salt.semanticauthoring.org).

DocBook (http://www.docbook.org) – A popular and extensible XML-based format for authoring documentation with extensive semantic offerings. Documentation can be authored modularly into reusable parts using XInclude (http://www.sagehill.net/docbookxsl/ModularDoc.html).

There are a number of WYSIWYM editors for DocBook content; however, from my findings these tend to be quite complex due to the rich nature of the DocBook schema. I personally found it easier to author documentation using a good quality XML editor (like those found in Visual Studio and Oxygen).

The provided DocBook stylesheets can be used to generate a number of deliverables including HTML and PDF. As an added bonus, syntax highlighting in program listings is performed automatically for all of the popular programming languages.

DITA (http://dita.xml.org) – Another popular and extensible XML-based format for authoring technical documentation (user guides, tutorials, etc.) using a highly modular topic-based approach. There are several types of topic including concept, task, reference and the generic catch-all topic. Custom specializations can also be defined when more granularity is required.

DITA topics can be composed in a variety of ways (as needed) by creating one or more DITAMAP files. A map identifies which content to compose along with the way in which the content is to be structured. There are two types of map, one that is better for article-like content or simple user guides, and another that is designed for books.

The great thing about DITA is that its schema is a lot simpler than the likes of DocBook (meaning less of a learning curve) whilst offering a fairly decent level of semantics.

The official way to compile documentation is using the DITA-OT (http://dita-ot.sourceforge.net) toolkit, which uses the Saxon XSLT processor. The toolkit is primarily written using XSLT 1.0 stylesheets, though custom stylesheets can be written using XSLT 2.0! The generated deliverables can be highly customized, though the process of styling PDF output using XSL-FO is more involved.

XML Mind (http://www.xmlmind.com) also provide an open source tool called Ditac (http://www.xmlmind.com/ditac) that converts DITA into delivery formats. Their solution seems far easier to customize; plus it generates better quality XHTML output (using the proper namespace). Generated documentation is also more attractive out-of-the-box. XML Mind also develop a fantastic WYSIWYM editor that is ideal for DITA editing. Having evaluated a number of DITA editors, theirs certainly seems to be the easiest to use (to me at least).

As a bonus, Ditac also generates an XML outline for the composited documentation!

As you have probably guessed I went with DITA, but this was far from a simple choice. This decision was made having spent several days researching, experimenting with different options and pondering. One of the things that I love about DITA is the ability to write using modular topics because this does lead to a better quality of documentation. When writing a topic it is important to ensure that it can be fully understood when read independently of any other topics. This makes it very easy to reuse the same topics in multiple documents.
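To illustrate how the same topics can be reused across documents, here is a minimal sketch in Python that builds two DITA maps referencing an overlapping set of topic files. The file names (installation.dita and so on) and the build_map helper are hypothetical, and the DOCTYPE declaration that a real DITAMAP file would carry is omitted for brevity.

```python
import xml.etree.ElementTree as ET


def build_map(title, topics):
    """Build a minimal DITA map element.

    `topics` is a list of (href, children) pairs, where `children`
    is a nested list of the same shape. All hrefs are hypothetical.
    """
    root = ET.Element("map")
    ET.SubElement(root, "title").text = title

    def add(parent, entries):
        # Each entry becomes a <topicref>; nesting defines the structure.
        for href, children in entries:
            ref = ET.SubElement(parent, "topicref", href=href)
            add(ref, children)

    add(root, topics)
    return root


# Two deliverables composed from the same topic files.
quick_start = build_map("Quick Start", [
    ("installation.dita", []),
    ("first_tile_map.dita", []),
])
user_guide = build_map("User Guide", [
    ("installation.dita", []),
    ("first_tile_map.dita", [("brushes.dita", [])]),
])

print(ET.tostring(user_guide, encoding="unicode"))
```

Each topic file is written once; only the maps differ between the two guides.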

For example, the same documentation can be reused for HTML output and PDF output. But more importantly the same documentation can be reused to generate specialized outputs. For example, you might want to create a “Quick Start” guide and a full-blown “User Guide” plus a “Technical Guide” that are each suited towards different needs. Content can also be filtered by audience (or any other attribute for that matter) so the exposure of certain paragraphs/sections/etc can be fully controlled.
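The idea of filtering by audience can be sketched in a few lines of Python. This is only an illustration of the concept and not how DITA-OT or Ditac implement it (in practice the conditions are normally declared in a DITAVAL file). The sample topic and the filter_by_audience helper are made up for this example.

```python
import xml.etree.ElementTree as ET

# A hypothetical topic where one paragraph targets programmers only.
TOPIC = """<topic id="painting">
  <title>Painting tiles</title>
  <body>
    <p>Select a brush and paint onto the tile system.</p>
    <p audience="programmer">Tiles can also be painted from a script.</p>
  </body>
</topic>"""


def filter_by_audience(root, excluded):
    """Remove every element whose audience attribute lists an excluded value."""
    for parent in list(root.iter()):
        for child in list(parent):
            # The audience attribute holds space-separated values.
            values = child.get("audience", "").split()
            if any(value in excluded for value in values):
                parent.remove(child)
    return root


topic = filter_by_audience(ET.fromstring(TOPIC), {"programmer"})
print(ET.tostring(topic, encoding="unicode"))
```

Running the same filter with an empty set leaves the topic untouched, so one source file can serve both an end-user guide and a technical guide.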

IBM utilize DITA heavily for their documentation and have made considerable contributions towards the DITA project. For anybody interested in using DITA I would strongly recommend reading through some of the IBM articles: http://www.ibm.com/developerworks/xml/library/x-dita1. I was pleasantly surprised by the number of companies using DITA for their production user guides! The user guide for my camera was authored using DITA!

I have decided to use the XML Mind DITA Converter for my user documentation because I prefer the quality of its implementation. It also seems to be a lot easier to customize, although PDF customization is still quite involved. Until I have the time to customize the PDF output properly I have decided to take a different approach. Instead I will output a single XHTML file using Ditac and then convert that output into a PDF file.

wkhtmltopdf (http://www.daimi.au.dk/~jakobt/#wkhtmltopdf) is a truly fantastic project which can be useful in so many ways. This is an open source command line tool that converts HTML files into PDF files. It is possible to specify custom cover pages, a TOC, headers and footers, and the tool supports some of the paged media features of CSS. The output of this tool is based on the WebKit layout engine and is superior to that of any modern web browser.
Prince (http://www.princexml.com) is a commercial option and is perhaps the best choice when funding is not a problem. Prince supports full CSS3 paged media capabilities.

There are some quirks to wkhtmltopdf, in that none of the latest distributions works perfectly:

The latest Windows release outputs a blank page for the TOC; however, there is a workaround to this issue. The tool has an option to output the XSL stylesheet that is used to render the TOC. If this XSL file is specified explicitly when invoking the command line, the TOC is rendered properly!
The latest release for OS X intermittently outputs all content as an image (instead of selectable text). This is not an issue with the Windows release.
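The TOC workaround described above can be scripted. Below is a rough sketch in Python that first dumps the default TOC stylesheet and then passes it back explicitly when converting the single XHTML file. The --dump-default-toc-xsl and --xsl-style-sheet options are the ones wkhtmltopdf documents for this purpose; the file names and helper functions are my own assumptions, and I have not verified this against every release.

```python
import subprocess


def dump_toc_xsl_command():
    """Command that prints the default TOC stylesheet to standard output."""
    return ["wkhtmltopdf", "--dump-default-toc-xsl"]


def pdf_command(html_path, pdf_path, toc_xsl_path):
    """Command that renders a TOC using an explicit stylesheet, then the page."""
    return [
        "wkhtmltopdf",
        "toc", "--xsl-style-sheet", toc_xsl_path,
        html_path,
        pdf_path,
    ]


def build_pdf(html_path, pdf_path, toc_xsl_path="toc.xsl"):
    """Run both steps; requires wkhtmltopdf to be installed and on the PATH."""
    with open(toc_xsl_path, "w", encoding="utf-8") as xsl:
        subprocess.run(dump_toc_xsl_command(), stdout=xsl, check=True)
    subprocess.run(pdf_command(html_path, pdf_path, toc_xsl_path), check=True)


# Show the conversion command without executing it (file names are hypothetical).
print(" ".join(pdf_command("user-guide.html", "user-guide.pdf", "toc.xsl")))
```

Calling build_pdf("user-guide.html", "user-guide.pdf") would perform the actual conversion on a machine where wkhtmltopdf is available.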

I hope that my research will be of use to others. Please feel free to comment and ask questions about my experiences and I will do my best to answer!