Formatting a Resume with XML

I wanted to polish my resume for my renewed job search. The problem I had faced before was that I kept multiple versions of my resume: HTML for posting on the web, text for sending in email, Word for sending to recruiters, and PDF for printing. When I wanted to make a change, I had to change every version. Cut and paste helped, but it was still a lot of work, and it was easy to overlook a typo in one version. I wanted a way to automate generating all the versions from a single master.

The solution was XML. I wrote an XML document that contained a structural description of my resume. In case you have been sleeping, XML is a generic markup language that looks similar to HTML and is a simplified subset of SGML. It is particularly well suited to representing the structure of a document. For a resume, the structure defines contact info, skills, education, and jobs.
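To give a rough idea of what such a structural description looks like, here is a minimal sketch that embeds a tiny resume document in a Python string and reads it with the standard library's xml.etree.ElementTree. The element names, attributes, and sample data are invented for illustration; they are not taken from my actual resume file or from any DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical resume markup. The element and attribute names are
# illustrative assumptions, not a real resume schema.
RESUME_XML = """\
<resume>
  <contact>
    <name>Jane Doe</name>
    <email>jane@example.com</email>
  </contact>
  <skills>
    <skill>XML</skill>
    <skill>Perl</skill>
  </skills>
  <education>
    <degree school="Example University" year="1995">BS Computer Science</degree>
  </education>
  <jobs>
    <job employer="Example Corp" start="1998" end="2001">
      <title>Software Engineer</title>
    </job>
  </jobs>
</resume>
"""


def section_names(xml_text):
    """Return the names of the top-level sections, in document order."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root]


def skills(xml_text):
    """Return the text of every <skill> element inside <skills>."""
    root = ET.fromstring(xml_text)
    return [s.text for s in root.findall("./skills/skill")]


if __name__ == "__main__":
    print(section_names(RESUME_XML))
    print(skills(RESUME_XML))
```

The point is that the markup says what each piece of the resume is, not how it should look, so any program can pull out the skills or the job history without guessing from the layout.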

Then I used XSL to transform the XML into HTML, text, RTF, and PDF output. XSL is the style language used with XML, and it has two parts. XSL Transformations (XSLT) is a language for describing transformations between XML documents. XSL-FO describes formatting objects that a formatter can turn into typeset output such as PDF, TeX, or RTF.

Since HTML is similar to XML, XSLT is commonly used to produce HTML from XML documents. Because a stylesheet must itself be well-formed XML, the HTML elements in it are written XHTML-style; with the html output method, the processor then serializes the result as standard HTML that browsers can handle. The result is clean, structured HTML. XSLT can also produce text files directly, but that requires some care to make sure all the whitespace ends up in the right place.
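The single-master idea and the whitespace problem can both be sketched in a few lines. Note that this sketch uses plain Python with the standard library instead of XSLT, so it illustrates the principle rather than the stylesheets I actually used; the element names and the functions clean, to_html, and to_text are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical master document. The second <skill> is deliberately
# surrounded by indentation to show the whitespace problem.
RESUME_XML = """<resume>
  <contact><name>Jane Doe</name></contact>
  <skills>
    <skill>XML &amp; XSLT</skill>
    <skill>
      Perl
    </skill>
  </skills>
</resume>"""


def clean(text):
    """Collapse whitespace so source indentation does not leak into output."""
    return " ".join((text or "").split())


def to_html(root):
    """Render the master as an HTML fragment, escaping special characters."""
    name = escape(clean(root.findtext("contact/name")))
    items = "".join(
        "<li>%s</li>" % escape(clean(s.text)) for s in root.iter("skill")
    )
    return "<h1>%s</h1><ul>%s</ul>" % (name, items)


def to_text(root):
    """Render the same master as plain text with explicit line breaks."""
    name = clean(root.findtext("contact/name"))
    lines = [name, "=" * len(name), ""]
    lines += ["* " + clean(s.text) for s in root.iter("skill")]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    root = ET.fromstring(RESUME_XML)
    print(to_html(root))
    print(to_text(root))
```

In HTML the browser collapses stray whitespace for you, which is why the HTML output is forgiving. In the text output every space and newline is visible, so the conversion has to normalize the content and place the line breaks itself.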

Originally, I created my own structure without a formal document type definition (DTD). Then I found the XML Resume Library project. The project had a DTD that was pretty similar to what I was already using, and it was easy to rewrite my resume to use it. A DTD rigorously specifies the structure of a document and can be used to validate a document. The advantage of validating against a standard DTD is that other systems can understand the structure of the document. For example, if recruiters ever start using XML-formatted documents, they will require a standard format so that their software can pull out the important information.
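To show what "specifying the structure" means in practice, here is a deliberately simplified sketch. It is not a DTD validator (Python's standard library does not include one); it only checks that the top-level elements appear in a fixed order, which is roughly what a single DTD content-model declaration expresses. The element names and the REQUIRED_ORDER list are assumptions made up for this example, not the XML Resume Library DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model, similar in spirit to the DTD declaration
#   <!ELEMENT resume (contact, skills, education, jobs)>
REQUIRED_ORDER = ["contact", "skills", "education", "jobs"]


def check_structure(xml_text):
    """Return a list of problems; an empty list means the structure matches."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "resume":
        problems.append("root element is <%s>, expected <resume>" % root.tag)
    found = [child.tag for child in root]
    if found != REQUIRED_ORDER:
        problems.append("children are %s, expected %s" % (found, REQUIRED_ORDER))
    return problems


if __name__ == "__main__":
    good = "<resume><contact/><skills/><education/><jobs/></resume>"
    bad = "<resume><skills/><contact/></resume>"
    print(check_structure(good))
    print(check_structure(bad))
```

A real validating parser does the same kind of check for every element and attribute in the document, which is what lets other software rely on the structure.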

The project also had more advanced XSL stylesheets. I modified their stylesheets to reflect the format I wanted. Currently, my stylesheets completely replace the standard ones, but eventually I will customize parts of the standard ones instead and gain the advantages of code sharing. I used XSLT files to produce the HTML and text formats. It is possible to use XSL-FO to produce PDF and RTF output, but I wanted the printed output to look like the HTML. The easiest way to do this was to import the HTML into Microsoft Word, save the Word file, and print to a PDF file. This was automated using Perl and is described in another article.

Generating the HTML and text was pretty easy. I used the libxslt and libxml2 libraries running under Cygwin on Windows 2000. Those are native C libraries for performing XSLT transformations and processing XML. I also used the Xalan XSLT processor and Xerces XML parser from the Apache Project. The Apache Project also has a free XSL-FO formatter, FOP.
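As a rough sketch of how the generation step can be scripted, the following drives xsltproc, the command-line tool that ships with libxslt, once per output format. The file names in the usage example and the helper names xsltproc_command and build_all are hypothetical, and this is not the script I actually used.

```python
import shutil
import subprocess


def xsltproc_command(stylesheet, source, output):
    """Build the argument list for: xsltproc --output OUTPUT STYLESHEET SOURCE."""
    return ["xsltproc", "--output", output, stylesheet, source]


def build_all(source, targets):
    """Run one transformation per (stylesheet, output) pair."""
    if shutil.which("xsltproc") is None:
        raise RuntimeError("xsltproc not found on PATH")
    for stylesheet, output in targets:
        # check=True raises CalledProcessError if a transformation fails.
        subprocess.run(xsltproc_command(stylesheet, source, output), check=True)


if __name__ == "__main__":
    # Hypothetical file names: one master document, one stylesheet per format.
    build_all("resume.xml", [
        ("html.xsl", "resume.html"),
        ("text.xsl", "resume.txt"),
    ])
```

With a script like this, changing the resume means editing one XML file and rerunning the build, which was the whole point of the exercise.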
