Efficient BibTeX
August 16, 2008
The setup I describe here consists of a single monolithic BibTeX database that holds all references. GNU Emacs, together with various web-based utilities, is used to capture, tag, and format BibTeX entries efficiently. Finally, when authoring a paper, BibTool automatically extracts the required entries from the primary database.
Capturing References
Emacs has a mode for editing BibTeX files that can make creating new
entries very easy. Each entry type has a corresponding keybinding for
inserting a “skeleton” entry of that type, complete with all required
and optional fields. These commands all begin with C-c C-e (a prefix
in Emacs parlance). The most common is probably C-c C-e C-a, where the
final C-a is for article. Other common entry insertion commands are
C-c C-e C-t for technical reports (I use this for working papers) and
C-c C-e b for books.
When the point is on an entry, pressing C-j moves to the next field.
When you are finished editing the fields, pressing C-c C-c checks the
entry, cleans up the unused fields, and automatically generates the
reference key if it doesn’t already exist. Finally, C-c C-q formats
the entry nicely.
It is possible to customize both the algorithm used to generate keys
and how C-c C-q formats the entry. I use an algorithm that in most
cases generates a unique key that is still readable. It generates
keys of the form authorYYtitle, where author is the last name of the
first author, YY is the year of publication (omitting the century),
and title is the first word of the title (omitting words like the, an,
and, etc.). As for formatting, I choose to have the fields aligned at
the equals sign. I use the following in my ~/.emacs file to
accomplish this:
(setq bibtex-align-at-equal-sign t
      bibtex-autokey-name-year-separator ""
      bibtex-autokey-year-title-separator ""
      bibtex-autokey-titleword-first-ignore '("the" "a" "if" "and" "an")
      bibtex-autokey-titleword-length 30
      bibtex-autokey-titlewords 1)
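To make the authorYYtitle scheme concrete, here is a minimal Python sketch of the rule as described above. It is not how Emacs implements key generation; the function name make_key, the simplified name handling, and the sample entry are all assumptions made for illustration.

```python
import re

# Words skipped at the start of a title, mirroring the ignore list above.
IGNORED_WORDS = {"the", "a", "if", "and", "an"}


def make_key(author: str, year: str, title: str) -> str:
    """Build an authorYYtitle key (illustrative only, not Emacs's algorithm)."""
    # BibTeX separates multiple authors with " and "; keep only the first.
    first_author = re.split(r"\s+and\s+", author.strip())[0]
    # Accept both "Last, First" and "First Last" name orders.
    if "," in first_author:
        last_name = first_author.split(",")[0]
    else:
        parts = first_author.split()
        last_name = parts[-1] if parts else ""
    # Keep ASCII letters only so the key is a single plain token.
    last_name = re.sub(r"[^A-Za-z]", "", last_name).lower()

    # Two-digit year, omitting the century.
    yy = year.strip()[-2:]

    # First title word that is not in the ignore list.
    words = re.findall(r"[A-Za-z]+", title.lower())
    title_word = next((w for w in words if w not in IGNORED_WORDS), "")

    return f"{last_name}{yy}{title_word}"


# Hypothetical entry, used only to show the shape of the result.
print(make_key("Doe, Jane and Smith, John", "2008", "The Economics of Widgets"))
# prints: doe08economics
```

The sketch ignores details such as accented characters and name prefixes like "van", so it should be read as a description of the key format rather than a replacement for the Emacs command.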
It can also be useful to create a bookmark to your primary BibTeX
database. To do so, open the file, press C-x r m, and type a name for
the bookmark, such as bib. In the future you can open the file quickly
by pressing C-x r b and typing bib (or, using tab-completion, just
b TAB).
Several web-based tools, such as Google Scholar and JSTOR, can be configured to export BibTeX entries for papers. In Google Scholar’s preferences one can choose to display a BibTeX export link. JSTOR now provides this option as well, without any configuration. Even without these tools, once you become accustomed to your workflow, it is very fast to open your BibTeX database and either fill in the fields manually or copy and paste from a website.
Automated Entry Extraction
BibTeX files for specific LaTeX documents can be created using BibTool,
which can automatically extract the required entries from a master
BibTeX database. Running LaTeX creates an .aux file which contains the
keys of the BibTeX entries cited in the paper. If paper.tex is the name
of the LaTeX document and /path/to/research.bib is the path of the
master BibTeX database, then a paper.bib file can be created as
follows:
% latex paper.tex
% bibtool -i /path/to/research.bib -x paper.aux > paper.bib
The above command can be placed in a Makefile to automate the process. This setup removes the need to copy and paste BibTeX entries and ensures that only the necessary references are included.
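For readers curious about what the extraction step relies on, the following Python sketch shows how the cited keys can be read from the contents of an .aux file, where LaTeX records each citation as a \citation{...} line. This only illustrates the idea and is not how BibTool is implemented; the function name cited_keys and the sample keys are hypothetical.

```python
import re
from typing import List


def cited_keys(aux_text: str) -> List[str]:
    """Return the unique keys from \\citation{...} lines, in order of first use."""
    keys: List[str] = []
    for match in re.finditer(r"\\citation\{([^}]*)\}", aux_text):
        # A single \citation line may list several comma-separated keys.
        for key in match.group(1).split(","):
            key = key.strip()
            if key and key not in keys:
                keys.append(key)
    return keys


# Hypothetical .aux contents; a real file contains many other commands too.
sample_aux = "\n".join([
    r"\relax",
    r"\citation{doe08economics}",
    r"\citation{smith99example,doe08economics}",
    r"\bibstyle{plain}",
    r"\bibdata{paper}",
])

print(cited_keys(sample_aux))
# prints: ['doe08economics', 'smith99example']
```

The sketch does no special handling of \nocite{*}, which shows up in the .aux file as a literal * key, so treat it as an aid to understanding the .aux format rather than a substitute for BibTool.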
BibTool is also useful for “normalizing” databases, rewriting the keys
according to certain rules such as the ones described above. My
~/.bibtoolrsc file looks like this:
ignored.word = "on"
ignored.word = "the"
ignored.word = "a"
ignored.word = "an"
ignored.word = "if"
key.generation = on
key.number.separator = {}
key.base = {digit}
key.format = {{%s(key) # %-1p(author) # %-1p(editor) # %-5.1W(institution) # %-5.1W(organization) } %2d(year)%-T(title)}