Unicode, Drupal, PHP, MySQL: A love story (not)

I have been trying to do something fairly simple: write a web page that renders well on a small screen (like a mobile phone) and displays content from the IUF website.  How hard could this be?

Well, it shouldn’t have been hard.

The content on the IUF site, which is handled by a content management system called Drupal, is stored in a MySQL database.  The text is encoded as UTF-8, a Unicode encoding, because that’s what you need when you have multilingual content on your site, and the IUF site works in Arabic, Chinese, Russian and many other languages that don’t use the Latin alphabet.

So it should have been fairly simple to write a few lines of code in the widely-used PHP programming language to read news headlines from the IUF’s database and show them on screen.

Except that it was showing gibberish every time there was an accented character.

I posted a message calling for help on the Drupal forums.  (Not a big response there.)  I wrote to three very smart friends who understand these things and they all had good ideas, and pointed me in different directions, but to no avail.  Nothing was working.  One suggested that I give up.  I nearly did give up.

And then I decided to search again, and found this page where a programmer named “HoboTraveler” (I’m guessing that’s not his real name) encountered a similar problem on January 21st, 2008, more than five years ago.  He writes asking for help, gets loads of tips, tries a million things, nearly gives up, and then, dozens of comments later, he solves the problem.

He writes:

I think I got it to work with multiple databases!

I’ve inserted the line:

mysql_query('SET NAMES utf8');

just after the mysql_connect() and it seems to work.

So I thought, what the hell, I’ll try that.  I had tried everything else.  And it worked.  “HoboTraveler”, whoever you are, you made my day.
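
Why would one line make the difference?  A likely explanation, and it is only an assumption about this setup, is that the connection defaulted to the latin1 character set, so MySQL converted the stored UTF-8 text to Latin-1 before handing it to PHP, and a page served as UTF-8 then could not display the result.  The sketch below simulates that without any database; the sample string and the variable names are made up for illustration, and it assumes PHP's mbstring extension is enabled.

```php
<?php
// Illustration only, no database involved.
// Assumption: the connection charset defaulted to latin1, so the server
// converted the stored UTF-8 text to Latin-1 before sending it.

$stored = "caf\xC3\xA9"; // "café" as UTF-8 bytes (é is 0xC3 0xA9)

// What a latin1 connection would plausibly deliver (simulated with mbstring):
// é becomes the single byte 0xE9.
$overLatin1 = mb_convert_encoding($stored, 'ISO-8859-1', 'UTF-8');

// What a utf8 connection delivers: the stored bytes, unchanged.
$overUtf8 = $stored;

// A page served as UTF-8 can only display valid UTF-8.
// A lone 0xE9 byte is not valid UTF-8, so the first check fails.
var_dump(mb_check_encoding($overLatin1, 'UTF-8'));
var_dump(mb_check_encoding($overUtf8, 'UTF-8'));
```

SET NAMES utf8 tells the server that the client sends and expects UTF-8, so no conversion takes place.  Note that the mysql_* functions quoted above were removed in PHP 7; with the mysqli extension, calling set_charset('utf8mb4') on the connection right after connecting does the same job.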

Lesson of the story: sometimes life is hard.  And thank God for Google.