## Sunday, July 08, 2007

### [Linux] Text encoding

I recently had a problem with special Portuguese characters and LaTeX.
With my preamble (shown at the end of this post), the input must be in
latin1 encoding, but I was using utf-8 to edit an input file. It took me
quite a while to figure out why LaTeX would not compile my recently
edited file, but would happily compile my old files written in
Portuguese. The error I was getting was:

"Command \textcent unavailable in encoding T1"

whenever a special symbol was found (for instance, the letter 'a' with
a circumflex, 'â'). After reading up on encodings (Unicode, ASCII,
latin1) and searching for similar errors, I found that LaTeX does not
fully support UTF-8 (as of this date), and that the input should be
encoded in latin1. To check the encoding of a file in Vim, open the
file and type:

:set fileencoding

If it is utf-8 or anything other than 'latin1', LaTeX will give error
messages like the one above. It turned out that some of my recently
edited files were indeed in utf-8. My solution was then to type

:set fileencoding=latin1

inside GVim, save the file, and re-run LaTeX. Done.
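
For files edited outside Vim, the same conversion can be sketched in a few lines of Python. This is only an illustration, not part of the setup described here: the function name `utf8_to_latin1` is made up for the example, and it assumes the source file really is valid UTF-8 and contains only characters that exist in latin1.

```python
from pathlib import Path


def utf8_to_latin1(src, dst):
    """Re-encode a UTF-8 text file as latin1 (ISO-8859-1).

    Hypothetical helper for illustration. Works on raw bytes so that
    line endings are left untouched.
    """
    raw = Path(src).read_bytes()
    # Raises UnicodeDecodeError if the file is not valid UTF-8
    # (for example, if it is already latin1 with accented letters).
    text = raw.decode("utf-8")
    # Raises UnicodeEncodeError if a character has no latin1 equivalent.
    Path(dst).write_bytes(text.encode("latin-1"))
```

A call such as `utf8_to_latin1("input.tex", "input-latin1.tex")` (file names are placeholders) writes the converted copy and leaves the original alone, so a failed conversion cannot damage the source file.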

PS: my LaTeX preamble has the following:

\usepackage[T1]{fontenc}
\usepackage[latin1]{inputenc}
\usepackage[portuges]{babel}

I am using tetex-3.0_p1-r3
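
A likely explanation for why the error names \textcent: with the preamble above declaring latin1, every input byte is read as one latin1 character. The small Python sketch below (Python is used here only to inspect the bytes, it plays no role in the LaTeX run) shows that 'â' saved as UTF-8 becomes two bytes, and that the second one is the cent sign when read as latin1.

```python
import unicodedata

char = "\u00e2"  # 'a' with circumflex

# The same letter occupies two bytes in UTF-8 but one byte in latin1.
utf8_bytes = char.encode("utf-8")      # b'\xc3\xa2'
latin1_bytes = char.encode("latin-1")  # b'\xe2'

# Reading the UTF-8 bytes as if they were latin1 yields two characters.
misread = utf8_bytes.decode("latin-1")
for c in misread:
    print(hex(ord(c)), unicodedata.name(c))
# 0xc3 LATIN CAPITAL LETTER A WITH TILDE
# 0xa2 CENT SIGN
```

So a UTF-8 'â' reaches LaTeX as an A with tilde followed by a cent sign, which would account for the complaint about \textcent under the T1 font encoding.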

Ricardo Fabbri said...

Another cool thing I discovered:

In very recent Vim distributions (the most recent one in Gentoo has this), you can set the accents keymap, which works just like the us-acentos keymap but is restricted to text input inside Vim. I now use this keymap in Vim; in normal mode (after pressing Esc) the mappings stay out of the way, so navigation remains fast.

Ricardo Fabbri said...

To set the keymap:
:set keymap=accents

You can put this in your .vimrc.

Matt! said...

How incredibly handy. Never knew how to do that before - my problem was totally unrelated to LaTeX but I still needed to know the encoding. Nice post!

Ricardo Fabbri said...

I'm glad, Matt. I would never have imagined this post would be useful to so many people - it seems to have a high pagerank from Google! The internet rocks.

rasha said...

I always had problems with this when trying to correct my Portuguese essays: I write with Vim and LaTeX and spell-check with aspell. Now everything works!

Anonymous said...

Thank you very much!

Valerio Schiavoni said...

Saved my eyes, thanks for this!

Eduardo said...

Thanks for your help, I was going crazy with this.

George said...

Thanks for your help! I actually used gedit to change my encoding, simply by opening the file and saving it again with a different encoding...

All it takes is choosing Western (ISO-8859-1), which is the same as latin1.