The €uropean Connection

Assuming that the whole world just speaks English is really convenient, particularly when “English” is understood as the ex-colony’s variant rather than the language of England…

I’m quite entitled to say: when I was living in Michigan, I made it a point to acclimate myself to the American way of life. So I became used to Fahrenheit, Imperial (erm, “US”) measurements, the US keyboard layout, driving on the right, etc. Surprisingly, it was easy enough to switch to and from the US system when visiting Malta (even driving on the left, though I sometimes forgot to hit the clutch with the stick-shift, and more often my right hand would fly out and hit the door in an attempt to reach the gear lever).

Now we’re living in good old England, and I finally realize just how different the American assumptions are. For one thing, there’s much greater diversity here. Because of Europe, that is. My boss is Norsk, several collaborators are not native English speakers, etc. For the first time in my life I had to drop the convenient assumption that 7-bit ASCII is enough.

You see, with Maltese I could pretend. Windoze took so long to become Maltese-friendly that most people just got used to writing Maltese without any accents. And there’s so much redundancy in the language that it really doesn’t matter. Everything is still clearly understood. Nowadays, even though Windows caught up, and UTF-8 makes things really easy, many still can’t be bothered.

I could pretend only because most of my work was in English anyway. My only use of Maltese was social, so getting it right was not critical. Now that assumption has had to fall. I’m regularly using UTF-8 for text files (mostly in order to type foreign names correctly). I’ve also recently updated my MP3 filenames to use UTF-8. In most cases it doesn’t matter too much (French, Italian and German fit almost entirely within the 8-bit Latin-1 character set). But Greek doesn’t, and I do have some Greek tracks and albums.
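To make the Latin-1 point concrete, here is a minimal Python sketch that checks whether a name can be stored in Latin-1 at all. The helper name fits_latin1 and the sample titles are made up purely for illustration; they are not from my actual collection.

```python
def fits_latin1(name: str) -> bool:
    """Return True if every character of `name` exists in Latin-1 (ISO-8859-1)."""
    try:
        name.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical titles, chosen only to cover the languages mentioned above.
samples = [
    "Ça plane pour moi",   # French
    "Über den Wolken",     # German
    "Perché no",           # Italian
    "Ζορμπάς",             # Greek
    "Għanja",              # Maltese (the letter ħ is outside Latin-1)
]

for title in samples:
    # UTF-8 can encode every title; Latin-1 only some of them.
    utf8_bytes = len(title.encode("utf-8"))
    print(f"{title}: latin-1 ok = {fits_latin1(title)}, utf-8 bytes = {utf8_bytes}")
```

The French, German and Italian samples pass; the Greek and Maltese ones don’t, which is exactly why an 8-bit encoding stops being good enough.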

It’s nice that NTFS supports Unicode filenames (it stores them as UTF-16 rather than UTF-8, but the effect is the same). Clearly, Linux also has no issues. But that doesn’t mean everything is plain sailing. Some tools (e.g. FileSync on Windows) just get confused. Others that should know better (e.g. Cygwin) also have problems, while some (such as PuTTY and even a regular CIFS mount on Linux) don’t default properly. So I’ve set my PuTTY shells to default to UTF-8 encoding. I also installed a Cygwin patch to allow UTF-8. And when I need to sync files between my main repository (ext3 on a Linux box) and an NTFS copy, I just do it on the Linux box itself:

  • mount -t cifs -o iocharset=utf8,rw,username=xxx //imladris/xxx /mnt/
  • rsync -av --delete /opt/users/ /mnt/users/

Life just couldn’t be simpler.


