Blosxom: Less is More

Blosxom, a lightweight weblog program written in Perl, is a minimalist’s paradise. It embodies elegant simplicity in both design and implementation. The version 2 core weighs in at under 500 lines of Perl, but despite its size it is highly extensible.

In this document I describe how I use Blosxom on this site and how I have extended it, through the use of several plugins, to serve MathML.

Background

When I recently restructured and redesigned this site I had several goals in mind. I wanted to write posts as Markdown-formatted plain text files with inline LaTeX representations of mathematical expressions. I wanted to allow comments and trackbacks and provide full-content Atom feeds for entries and comments. I also wanted the structure of the site to be somewhere between that of a traditional static site and a weblog. Some pages should appear on the main page in chronological order and others should only be accessible through direct links from other posts or from the archives.

Now, if I want to write posts containing LaTeX expressions, then I need to render them somehow. Rather than call LaTeX and generate images for those expressions, I wanted to convert them to MathML. Presently MathML must be embedded in XHTML and all such pages must validate. If I am generating valid XHTML+MathML+SVG pages, then I get inline SVG for free.

I actually tried out many different mainstream programs and a few more esoteric ones and found that they could easily satisfy most of the requirements. However, I also found that getting any existing package to convert LaTeX to MathML and serve it properly was going to require some work.[1]

Ultimately, from the universe of weblog programs and content management systems, I ended up choosing Blosxom[2] and I have since managed to implement all of the above requirements. In some cases, this was as simple as installing an existing plugin. Other times I had to roll my own. Naturally, now I have to start writing about it.

By now I have modified almost every plugin I use in some way,[3] but that’s actually one of the best parts. You can take a vanilla Blosxom setup and do pretty much anything you want with it. Everything about Blosxom is hackable, usually without needing to modify Blosxom itself.

In the following I list some of the notable plugins that I use. Where possible, I have linked to the plugin’s homepage. In some cases, there are many versions of a plugin floating around but there is no clear home. In these cases the plugin can usually be obtained at the Unofficial Blosxom User Group’s plugin repository or the Blosxom CVS repository.

Blosxom Basics

Perhaps my favorite thing about Blosxom is that all posts are stored as simple flat files in a directory hierarchy. Each directory represents a category and categories can be nested simply by creating subdirectories.

Each post can be presented in various flavours with each flavour being denoted by its corresponding file extension. HTML, XHTML, plain text, and RSS are all examples of flavours one could define. Flavours need not correspond directly to a particular format though. When a particular post is requested, template files corresponding to the given flavour are loaded and the post is formatted accordingly.
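
To make the mapping concrete, here is a minimal Python sketch (not Blosxom’s actual Perl code) that splits a request path into a category, an entry name, and a flavour. The helper name `split_request` and the default flavour `html` are assumptions made for this illustration.

```python
import posixpath


def split_request(path, default_flavour="html"):
    """Split a request path into (category, entry name, flavour).

    Hypothetical helper for illustration; Blosxom does this in Perl
    and handles more cases (dates, index pages, and so on).
    """
    directory, filename = posixpath.split(path.strip("/"))
    name, ext = posixpath.splitext(filename)
    # The file extension selects the flavour; fall back to a default.
    flavour = ext.lstrip(".") or default_flavour
    # Each directory level is a (possibly nested) category.
    category = "/" + directory if directory else "/"
    return category, name, flavour


print(split_request("/projects/markdown-mode/index.rss"))
```

Once the flavour is known, the matching template files for that flavour would be loaded and filled in with the post’s content.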

First, there are a couple of “essential” plugins:

  • meta - For storing metadata in entry files. I use this to record things such as the creation and modification dates, short descriptions, keywords, the type of markup used, and whether or not I want comments enabled.

  • interpolate_fancy - Allows for conditional inclusion of various pieces of template files. For example, I only show comments on individual story pages, not on the main page. My template files rely heavily on this functionality and the pictureindex plugin also requires it to execute actions.
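
As a rough sketch of how metadata can live inside an entry file, the Python below assumes a layout in which the first line is the title, followed by `meta-key: value` lines, a blank line, and then the body. That layout is my understanding of the meta plugin’s convention and should be treated as an assumption; the function name `parse_entry` is hypothetical.

```python
def parse_entry(text):
    """Return (title, metadata dict, body) for one entry file.

    Assumed layout: title line, then zero or more "meta-key: value"
    lines, then the body of the post.
    """
    lines = text.split("\n")
    title = lines[0]
    meta = {}
    i = 1
    # Consume consecutive metadata lines directly below the title.
    while i < len(lines) and lines[i].startswith("meta-") and ":" in lines[i]:
        key, value = lines[i][len("meta-"):].split(":", 1)
        meta[key.strip()] = value.strip()
        i += 1
    body = "\n".join(lines[i:]).lstrip("\n")
    return title, meta, body


entry = "Blosxom: Less is More\nmeta-markup: markdown\nmeta-comments: on\n\nBody text."
print(parse_entry(entry))
```

A template could then test a value such as `comments` to decide whether to include the comment form on a story page.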

I store a good bit of metadata with each post and several other plugins I use need to access that metadata as well as the page title. These plugins help process that metadata:

  • entries_cache_meta - This plugin caches the metadata for each post to avoid opening and reading each file again, saving lots of overhead. I have used it in writing my own plugins and modified some existing plugins to work with it in order to make them more efficient. It also allows one to store the creation date of each post in a meta tag instead of using the file’s mtime.

  • meta_head - A plugin I wrote which uses the hash generated by entries_cache_meta to provide metadata in head templates. It is available to anyone upon request and I may eventually clean it up and release it.

Organization

I tend to write longer, more permanent posts that easily lend themselves to a hierarchical structure (e.g., short HOWTOs and homepages for my open source projects). From the outside, I’d like these pages to be organized like most static websites: with a categorical directory hierarchy. I also want my weblog entries to be organized in this way. At the same time, I want to provide lists of recent entries, feeds, and so on, like most weblogs. Here are a few plugins I use to support this site structure:

  • index_override - This plugin allows one to override category pages or even the main page with a single story page. I use this for my Markdown mode so that I can file other entries in the category /projects/markdown-mode/ but display the project homepage when someone visits that category.

  • hide - The hide plugin allows me to hide certain pages from the front page and feeds. For example, my accessibility statement is not really a “post” so I put /meta/accessibility.text in my hide file to hide it from the front page.

  • extensionless - Allows me to use simple extensionless URIs for my pages (see Cool URIs don’t change for background information).

  • canonicaluri - Another plugin which helps me to enforce URI rules regarding things such as trailing slashes.

  • find - A full-featured search plugin with an optional advanced search form.
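
The idea behind hiding entries can be sketched in a few lines of Python. This assumes a hide file with one entry path per line, which may not match the real plugin’s format; `visible_entries` is a hypothetical name.

```python
def visible_entries(entries, hide_file_text):
    """Drop every entry whose path is listed in the hide file.

    Assumes one path per line; blank lines are ignored.
    """
    hidden = {line.strip() for line in hide_file_text.splitlines() if line.strip()}
    return [path for path in entries if path not in hidden]


entries = ["/meta/accessibility.text", "/weblog/blosxom.text"]
print(visible_entries(entries, "/meta/accessibility.text\n"))
```

Hidden entries stay reachable by direct link; they are only filtered out of listings such as the front page and feeds.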

Processing Content

If I write and store content in Markdown-formatted plain-text files, I need plugins to process that content and convert it to XHTML before it’s sent to a browser. Here are a few plugins that help with this:

  • MultiMarkdown - This plugin parses my Markdown-formatted plain text entries and converts them to valid XHTML. MultiMarkdown supports a superset of the original Markdown syntax, allowing for footnotes, tables, etc. You can see the plain text source of any page on this site by adding a .text extension.

  • acronyms - Scans posts for abbreviations and acronyms and marks them accordingly. I have slightly modified this plugin so that it tags each instance of a given acronym but it only uses the title text for the first instance. In addition, I use CSS to only underline the first such instance:

    abbr, acronym {
        border: none;
    }
    
    abbr[title], acronym[title] {
        cursor: help;
        border-bottom: thin dotted black;
    }
    
  • pictureindex - A very flexible, templated image gallery plugin. I’ve modified this plugin slightly to provide links to the image file itself and to allow for a custom preview image.

These plugins assist in converting inline LaTeX to MathML and handle named entities:

  • itex - This plugin converts inline LaTeX expressions to MathML using itex2MML.

  • numeric_entities - Serving content containing named XHTML or MathML entities as application/xhtml+xml content can be troublesome. This plugin converts all such entities to their numeric equivalents.
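
The entity conversion can be illustrated with Python’s standard library. This is a minimal sketch rather than the plugin’s code: it uses the HTML 4 table `html.entities.name2codepoint`, so MathML-only entity names are not covered here and unknown names are left untouched. The five entities predefined by XML are also left alone because XML parsers already understand them.

```python
import re
from html.entities import name2codepoint

# Entities every XML parser knows; these can stay as named references.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def to_numeric_entities(text):
    """Replace named entities such as &nbsp; with numeric ones (&#160;)."""

    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave unknown or predefined names as-is
        return "&#%d;" % name2codepoint[name]

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, text)


print(to_numeric_entities("caf&eacute;&nbsp;&amp;&alpha;"))
```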

As I mentioned, since I am serving valid XHTML+MathML+SVG documents, I can include inline SVG graphics:

  • story_icon - This is a plugin I wrote which looks for an icon meta tag and includes the corresponding SVG icon inline. It is based on the include plugin and is currently unreleased. Just ask if you’d like to see it.

Serving Content

In order to ensure that supporting browsers are able to render pages containing MathML and SVG, I need to make sure each page is served to each browser with the correct MIME type:

  • xhtmlmime - This plugin sets the content type to application/xhtml+xml. I need to do this for most browsers in order to support inline MathML and SVG. Following Jacques Distler’s Movable Type modifications, I’ve modified the plugin to send text/html to Internet Explorer, Chimera, Camino, and KHTML-based browsers (except for MSIE with the MathPlayer plugin and MathML-enabled Camino builds).
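
A simplified version of that decision is sketched below in Python. It is an illustration, not the plugin’s logic: it assumes the relevant browsers can be recognised by substrings of the User-Agent header, that MathPlayer announces itself with the token `MathPlayer`, and it does not try to detect MathML-enabled Camino builds at all.

```python
# Substrings assumed to identify browsers that should receive text/html.
TEXT_HTML_AGENTS = ("MSIE", "Chimera", "Camino", "KHTML")


def content_type_for(user_agent):
    """Pick a Content-Type for a page containing inline MathML and SVG."""
    # Exception: Internet Explorer with the MathPlayer plugin installed.
    if "MSIE" in user_agent and "MathPlayer" in user_agent:
        return "application/xhtml+xml"
    if any(token in user_agent for token in TEXT_HTML_AGENTS):
        return "text/html"
    return "application/xhtml+xml"


print(content_type_for("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"))
```

User-Agent sniffing is fragile, so a rule like this needs revisiting as browsers change.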

Finally, there are a few plugins I use for efficiency and speed:

  • gzip - HTML gzip compression for faster transmission and reduced bandwidth.

  • lastmodified2 - Generates ETag and Last-Modified headers and handles If-None-Match and If-Modified-Since headers in requests to prevent sending duplicate copies of unchanged pages.

  • page_caching - A plugin I wrote to strike a middle ground between a fully static, faster site and a fully dynamic, slower site. It caches pages when they are first requested and returns the cached version on subsequent requests if it is still fresh. It is fairly stable (I’m presently using it to cache every page on this site), but it isn’t quite ready for public consumption. It is available upon request and I eventually hope to be able to release it.
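
The conditional-request handling that lastmodified2 provides can be sketched as follows. This Python is a simplified illustration rather than the plugin’s code: the function names are hypothetical, the ETag is just a hash of the page body, and weak validators (`W/"..."`) are not handled.

```python
import hashlib
from email.utils import formatdate, parsedate_to_datetime


def make_validators(body, mtime):
    """Return (ETag, Last-Modified) header values for a rendered page."""
    etag = '"%s"' % hashlib.sha256(body).hexdigest()
    last_modified = formatdate(mtime, usegmt=True)
    return etag, last_modified


def is_not_modified(request_headers, etag, last_modified):
    """True if the client's cached copy is current (respond with 304)."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # If-None-Match takes precedence over If-Modified-Since.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        return "*" in candidates or etag in candidates
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            return parsedate_to_datetime(last_modified) <= parsedate_to_datetime(since)
        except (TypeError, ValueError):
            return False  # unparseable date: send the full page
    return False


etag, last_modified = make_validators(b"<html>...</html>", 0)
print(is_not_modified({"If-None-Match": etag}, etag, last_modified))
```

When the check succeeds, the server can answer with a 304 status and no body, which is where the bandwidth saving comes from.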

Feedback and Syndication

  • feedback - Handles comments and trackbacks. This plugin supports Markdown-formatted comments, feedback notification and moderation, and Akismet spam protection. I’ve also slightly modified this plugin so that I can generate Atom feeds of comments.

  • autotrack - Scans new posts for links and automatically sends trackback pings to sites which support them.

  • atomfeed - Simply put, Atom rocks. I’ve entrusted MultiMarkdown and itex2MML to generate valid XHTML for my posts so I’ve disabled the built-in XML::Parser validation.

  • xml_ping_generic - Each time a new entry is posted, this plugin pings a list of URLs using the weblogs.com XML-RPC ping API. I’ve modified it to send extended pings and to ping Google in addition to weblogs.com and Technorati.

  • gsitemap - Generates an XML sitemap flavour which can be read by search engines such as Google.
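
For readers unfamiliar with XML-RPC pings, the Python sketch below builds the request body for a basic and an extended ping without sending anything. The method names `weblogUpdates.ping` and `weblogUpdates.extendedPing` and their parameter order reflect my understanding of the weblogs.com API and should be checked against the service being pinged; the site name and URLs are placeholders.

```python
import xmlrpc.client


def build_ping(site_name, site_url, feed_url=None):
    """Return the XML-RPC request body for a weblog update ping.

    With a feed URL, an extended ping is built instead of a basic one.
    """
    if feed_url is None:
        return xmlrpc.client.dumps(
            (site_name, site_url), methodname="weblogUpdates.ping"
        )
    # Assumed extended form: name, site URL, changed page URL, feed URL.
    return xmlrpc.client.dumps(
        (site_name, site_url, site_url, feed_url),
        methodname="weblogUpdates.extendedPing",
    )


# Placeholder values; nothing is sent over the network here.
print(build_ping("Example Weblog", "http://example.org/"))
```

The resulting XML would be sent as an HTTP POST to each ping endpoint in the list.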


  1. That is, any existing weblog software. Instiki does this beautifully, but it’s a Wiki.

  2. I must admit a slight bias toward Blosxom. I started using it just over two years ago when I started the previous version of this weblog. However, when I first began planning to restructure the site I had fully convinced myself to switch to something else.

  3. I have actually hacked on things so much that I am now terrified that one day someone might actually release a new version of Blosxom. Luckily everything is in Bazaar so I can back out patches if necessary.