Converting LaTeX to XHTML, etc.

One of the most useful aspects of LaTeX as an authoring tool is that it (mostly) separates content and structure from visual appearance. The author is free to focus on the former and let a particular LaTeX style package (or several) take care of the layout, font choices, etc. For example, while in MS Word one might indicate a section heading simply by typing a line by itself and choosing a bold font, LaTeX expects you to use the \section{} and \subsection{} commands to divide up the content and signal that special formatting is required.

This is a sensible separation of duties, as most technical authors have no training in typography, and there’s no reason for them to have to grapple with styling decisions if the goal is to produce a self-consistent and professional-looking document.

A side benefit of the way LaTeX explicitly imposes structure is that LaTeX documents lend themselves to automated conversion to other structured document formats, such as XML and XHTML (which is actually a flavor of XML designed for web display). XHTML documents in particular are the basis for one popular e-book format known as EPUB. So, in principle at least, authoring a document in LaTeX should facilitate the creation not only of print versions but also of electronic versions suitable for use on, say, an iPad.
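
To make the idea concrete, here is a minimal sketch in Python of how explicit structure maps onto XHTML. It is a toy, not a real converter: it handles only \section{}, \subsection{} and plain paragraphs, and the function name latex_to_xhtml and the choice of h1/h2 as target tags are my own assumptions for illustration.

```python
import re
from xml.sax.saxutils import escape

# Assumed mapping from LaTeX sectioning commands to XHTML heading tags.
HEADING_TAGS = {"section": "h1", "subsection": "h2"}

# Matches a block that consists solely of \section{...} or \subsection{...}.
HEADING_RE = re.compile(r"\\(section|subsection)\{([^{}]*)\}")


def latex_to_xhtml(source: str) -> str:
    """Convert a tiny LaTeX subset (headings and plain paragraphs) to XHTML."""
    parts = []
    # Blank lines separate blocks, just as they separate paragraphs in LaTeX.
    for block in re.split(r"\n\s*\n", source.strip()):
        block = " ".join(block.split())  # collapse internal whitespace
        match = HEADING_RE.fullmatch(block)
        if match:
            tag = HEADING_TAGS[match.group(1)]
            parts.append(f"<{tag}>{escape(match.group(2))}</{tag}>")
        elif block:
            parts.append(f"<p>{escape(block)}</p>")
    return "\n".join(parts)


SAMPLE = r"""
\section{Introduction}

LaTeX separates structure from appearance.

\subsection{Why it matters}

Explicit structure survives conversion.
"""

print(latex_to_xhtml(SAMPLE))
```

The point is only that the converter never has to guess what is a heading: the \section{} command says so outright, which is exactly the information a bold line in a word processor fails to carry.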

Such conversion is now fairly routine for general text documents authored in LaTeX. The process becomes much messier when mathematical equations and symbols are involved, largely because most e-readers and browsers have not yet embraced the standards for displaying high-quality math equations. Most e-books that contain math currently fall back on rasterized images of equations, a poor substitute for specialized mathematical markup like MathML (again, a flavor of XML).
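
To show the difference between the two approaches, the following Python sketch builds both representations of the expression x squared: MathML markup on one hand, and an image reference on the other. The helper names mathml_power and image_fallback, and the file name eq1.png, are hypothetical; only the MathML element names (math, msup, mi, mn) and namespace come from the MathML standard.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def mathml_power(base: str, exponent: str) -> str:
    """Return MathML markup for base raised to exponent, e.g. x^2."""
    # Serialize MathML as the default namespace rather than with a prefix.
    ET.register_namespace("", MATHML_NS)
    math = ET.Element(f"{{{MATHML_NS}}}math")
    msup = ET.SubElement(math, f"{{{MATHML_NS}}}msup")
    ET.SubElement(msup, f"{{{MATHML_NS}}}mi").text = base      # identifier
    ET.SubElement(msup, f"{{{MATHML_NS}}}mn").text = exponent  # number
    return ET.tostring(math, encoding="unicode")


def image_fallback(src: str, alt: str) -> str:
    """Return an img element pointing at a rasterized equation."""
    img = ET.Element("img", {"src": src, "alt": alt})
    return ET.tostring(img, encoding="unicode")


print(mathml_power("x", "2"))
print(image_fallback("eq1.png", "x^2"))
```

The MathML version still says that x is an identifier and 2 is its exponent, so a reader application can rescale or restyle it along with the surrounding text. The image version keeps nothing but pixels and whatever the alt text happens to say.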

The eventual purpose of this page will be to look at various workflows starting with LaTeX (or perhaps XML), including math, and, hopefully, ending up with something that can be packaged as a credible e-book, usually with web-browsable XHTML as at least a byproduct.  The software packages that I’m currently aware of (emphasizing open source projects) include

More to come — stay tuned.