Unicode characters are spoiling my LaTeX bibliography and I cannot find them

I was being driven a little up the wall by biblatex rendering errors which referred to Unicode characters within my .bib database.

First I learned that the degree-like symbol you get from typing Option + 0 in Mac OS is actually the “Masculine Ordinal Indicator” (U+00BA), and that you should use Option + Shift + 8 for a true degree sign (U+00B0).

That’s not much help though, since degree signs in your .bib file will still cause problems when the bibliography is built. Instead you need to write titles like “Global warming of 1.5 \textdegree\ C”, which almost renders properly. The only problem is a space between ° and C, which comes from the “\ ” after \textdegree; writing \textdegree{}C instead should avoid it.

Much more annoying was one ‘ZERO WIDTH NON-JOINER’ (U+200C) which snuck into my .bib file. The error logs don’t say what line it is on, and the character is invisible in TextMate. After trying a bunch of ineffective suggestions from various web forums, I found one that referenced this Unicode converter. Paste in your bibliography and it will list the contents character by character as Unicode code points, which lets you find whatever is producing the errors.
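If you would rather not paste a whole bibliography into a web page, the same character-by-character check can be sketched locally. This is only a rough Python illustration, not the converter mentioned above: the function names find_non_ascii and report are made up for this example, and it assumes the .bib file is saved as UTF-8.

```python
import unicodedata


def find_non_ascii(text):
    """Return (line, column, code point, name) for every non-ASCII character."""
    hits = []
    # Split on "\n" only, so unusual Unicode line separators are reported
    # as characters instead of silently changing the line count.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                name = unicodedata.name(ch, "UNNAMED")
                hits.append((line_no, col_no, f"U+{ord(ch):04X}", name))
    return hits


def report(path):
    """Print one line per non-ASCII character found in the file at `path`.

    Assumption: the file is UTF-8 encoded.
    """
    with open(path, encoding="utf-8") as handle:
        for line_no, col_no, code, name in find_non_ascii(handle.read()):
            print(f"line {line_no}, column {col_no}: {code} {name}")
```

Calling report("thesis.bib") (the file name is just a placeholder) prints the line and column of each character, so an invisible U+200C shows up by name even though the editor does not display it.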

4 thoughts on “Unicode characters are spoiling my LaTeX bibliography and I cannot find them”

Also useful:

How to find entries with commas at end of name(s)?

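Code like this can also help with tracking down hard-to-find errors, since it makes the offending character (here U+0301, the combining acute accent) print as a conspicuous row of asterisks: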

\DeclareUnicodeCharacter{0301}{*************************************}

View non-printable unicode characters

Online tool to display non-printable characters that may be hidden in copy&pasted strings.

I have been trying to track down the origin of a bewildering error for quite some time:

WARN – BibTeX subsystem: /var/folders/ph/lhhj3yyn07zgj7f61637c8dc0000gn/T/pYjUebZJjE/thesis.bib_16043.utf8, line 39733, warning: 1 characters of junk seen at toplevel

I just sorted it out! There was a comma after the closing } in the previous entry.
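For anyone chasing the same warning, a stray character between entries can be hunted down with a small brace-depth scan. The following is a rough Python sketch, not how Biber itself parses the file: find_toplevel_junk is a made-up name, and it assumes entries are delimited with braces (not parentheses) and that braces inside entries are balanced.

```python
def find_toplevel_junk(text):
    """Return (line, column, char) for non-space characters outside any entry.

    Assumptions: entries look like @type{...} with balanced braces;
    parenthesis-delimited entries are not handled.
    """
    junk = []
    depth = 0          # current brace nesting depth
    in_header = False  # True between '@' and the entry's opening brace
    line_no, col_no = 1, 0
    for ch in text:
        if ch == "\n":
            line_no += 1
            col_no = 0
            continue
        col_no += 1
        if depth == 0:
            if ch == "@":
                in_header = True      # start of "@article", "@book", ...
            elif ch == "{":
                depth = 1             # entry body begins
                in_header = False
            elif not in_header and not ch.isspace():
                junk.append((line_no, col_no, ch))
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
    return junk
```

Run over the text of a .bib file, this returns the position of things like a comma left after an entry's closing }, which is the kind of character the warning above was complaining about.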

