Syntax highlighting for code blocks in Emacs Markdown Mode

July 13, 2015

Charl Botha on syntax highlighting in Markdown Mode:

When I’m editing my Markdown, I’d obviously like to see this language-specific highlighting interspersed with my normal Markdown highlighting. Sublime Text’s MarkdownEditing package does a superb job of this, but of course we’re currently rediscovering the universe that is Emacs.

DuckDuckGoing around, we run into at least two Emacs packages that do this: mmm-mode and polymode. We decided to try out both of them…

The simple and direct approach is to install mmm-mode and add something like this to your .emacs file, or equivalent:

(require 'mmm-mode)
(setq mmm-global-mode 'maybe)

(mmm-add-classes
 '((markdown-lisp
    :submode lisp-mode
    :front "^```lisp[\n\r]+"
    :back "^```$")))

(mmm-add-mode-ext-class 'markdown-mode nil 'markdown-lisp)

This asks mmm-mode, when in markdown-mode, to highlight all GFM-style fenced ```lisp blocks using lisp-mode. However, I have Markdown documents with code blocks in many different languages and I didn’t want to clutter my .emacs file with similar blocks for each language. Instead, I automated the creation of these “classes” as follows:

(defun my-mmm-markdown-auto-class (lang &optional submode)
  "Define a mmm-mode class for LANG in `markdown-mode' using SUBMODE.
If SUBMODE is not provided, use `LANG-mode' by default."
  (let ((class (intern (concat "markdown-" lang)))
        (submode (or submode (intern (concat lang "-mode"))))
        (front (concat "^```" lang "[\n\r]+"))
        (back "^```"))
    (mmm-add-classes (list (list class :submode submode :front front :back back)))
    (mmm-add-mode-ext-class 'markdown-mode nil class)))

;; Mode names that derive directly from the language name
(mapc 'my-mmm-markdown-auto-class
      '("awk" "bibtex" "c" "cpp" "css" "html" "latex" "lisp" "makefile"
        "markdown" "python" "r" "ruby" "sql" "stata" "xml"))

[Figure: Markdown Mode and MMM Mode in Action]

The function defined above works for languages where the language name and Emacs mode name are directly related (e.g., bibtex and bibtex-mode). The call to mapc applies the function to each of the languages in the list that follows. Since there is an optional mode argument, other cases can be handled easily as needed:

;; Mode names that differ from the language name
(my-mmm-markdown-auto-class "fortran" 'f90-mode)
(my-mmm-markdown-auto-class "perl" 'cperl-mode)
(my-mmm-markdown-auto-class "shell" 'shell-script-mode)
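
To make the result of these calls concrete, here is a sketch of what the perl line amounts to when written out by hand. It is simply the body of my-mmm-markdown-auto-class with lang bound to "perl" and submode bound to cperl-mode, and it assumes mmm-mode is already loaded as in the earlier snippet:

```elisp
;; Hand-expanded equivalent of
;;   (my-mmm-markdown-auto-class "perl" 'cperl-mode)
;; The class name is "markdown-" plus the language name, and the
;; front regexp is the fence followed by the language name.
(mmm-add-classes
 '((markdown-perl
    :submode cperl-mode
    :front "^```perl[\n\r]+"
    :back "^```")))

;; Activate the class whenever the major mode is markdown-mode.
(mmm-add-mode-ext-class 'markdown-mode nil 'markdown-perl)
```

Because the front regexp requires a line break immediately after the language name, a class for "c" should not also pick up blocks fenced as ```cpp.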

By default mmm-mode doesn’t automatically re-parse the buffer when new code blocks are added. However, you can ask it to automatically re-parse the buffer when Emacs is idle. This works well in my experience:

(setq mmm-parse-when-idle t)

If you prefer not to burn your idle cycles checking for buffers that need re-fontifying, you can instead run M-x mmm-parse-buffer as needed, or bind it to a key:

(global-set-key (kbd "C-c m") 'mmm-parse-buffer)
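
If you would rather not claim a global key for this, a mode-local binding is one alternative. This is only a sketch: it assumes markdown-mode exposes its keymap as markdown-mode-map and that your Emacs is recent enough (24.4 or later) to provide with-eval-after-load. The key C-c m is carried over from the global binding above and is not special.

```elisp
;; Bind the re-parse command only in Markdown buffers rather than globally.
;; The binding is deferred until markdown-mode has been loaded, so that
;; markdown-mode-map exists when define-key runs.
(with-eval-after-load 'markdown-mode
  (define-key markdown-mode-map (kbd "C-c m") #'mmm-parse-buffer))
```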

And yes, the code above does indeed define a markdown-markdown submode class for Markdown code blocks within Markdown documents. Look out![1]

[1] Of course, I couldn’t resist trying to nest a GFM code block inside a Markdown code block inside a Markdown document and, of course, things got weird.