By Andrew Smith
Since the 19th of this month (that’s 6 days ago; I don’t know where all that time has gone... oh yeah, tests) I’ve been importing the translated strings from Debian.
Right now I’ve done over 6 million (6036472) and I’ve only got to the end of the projects beginning with the letter “g”. By some simple (i.e. inaccurate) math, it will take me another 7 days to finish importing everything whose language code I could guess and which I could parse.
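For a rough sense of the throughput behind those numbers, here is a minimal Python sketch. It uses only the figures quoted above (6036472 strings, 6 days, an estimated 7 more days) and assumes the import rate stays constant, which is exactly the kind of inaccuracy the estimate already admits to. The variable names are made up for illustration.

```python
# Figures quoted in the post.
strings_imported = 6_036_472
days_elapsed = 6

# The post's own rough estimate of the time still needed.
days_remaining_estimate = 7

# Average throughput so far.
strings_per_day = strings_imported / days_elapsed
strings_per_second = strings_per_day / (24 * 60 * 60)

# Assumption: the rate stays constant for the rest of the import,
# so the projected total scales with the total number of days.
projected_total = strings_per_day * (days_elapsed + days_remaining_estimate)

print(f"{strings_per_day:,.0f} strings/day")
print(f"{strings_per_second:.1f} strings/second")
print(f"about {projected_total / 1e6:.1f} million strings projected in total")
```

That works out to roughly a million strings a day. The projected total is only as good as the 7-day guess it is built on.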
The process is driven by a dedicated PHP script I wrote. PHP because the rest of my code is PHP, and I wasn’t going to rewrite things in bash :) Turns out that it works pretty well. At first I thought I was going to run out of memory: the script quickly ate up 5% of RAM. Over the next few days it went back down, though, and now it sits at a comfortable 1.8%.
I’m running it manually (not through Apache; actually Apache isn’t allowed to read that script at all) in a screen session. Not using screen is one of the reasons I had to stop the first import attempt, which was running in a plain terminal.
The other reason I had to restart the import was my MySQL configuration. Given that I’m not a database guy, my MySQL had always been using the minimum amount of resources: the defaults from my-small.cnf in Slackware. I’ve replaced that with my-huge.cnf and that had a very nice effect: no more swapping!
In the first attempt, after about a day, MySQL was using 120% of my CPU (dual-core). Now, even after 6 days and 6 million strings inserted, it’s using on average 15% of CPU and 25% of RAM. Everything else on the server (Apache, Sendmail, Imapd, etc.) seems to be completely unaffected by the very heavy process.
One sucky thing about migrating from my-small.cnf to my-huge.cnf was that the InnoDB backends are incompatible. So I had to:
- figure this out,
- reconfigure the server using the old settings,
- dump the OSTD database into plain text SQL,
- delete the backend,
- reconfigure MySQL with the new settings, and
- import the old SQL from the plain text.
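The steps above could be sketched roughly as follows. This is not the exact set of commands used for the migration, just a dry-run plan written in Python that prints the shell commands in order without running anything. The paths (/etc/my.cnf, /etc/rc.d/rc.mysqld, /var/lib/mysql, the dump location), the InnoDB file names and the explicit DROP DATABASE step are all assumptions for illustration, and the sketch also assumes the MySQL credentials are already configured for the mysql and mysqldump clients.

```python
import shlex

# All paths below are assumptions for illustration; check them on the real system.
MY_CNF = "/etc/my.cnf"
RC_SCRIPT = "/etc/rc.d/rc.mysqld"
DATADIR = "/var/lib/mysql"


def innodb_migration_plan(database, dump_file, old_cnf, new_cnf):
    """Return the shell commands for the migration, in order, without running them."""
    q = shlex.quote
    innodb_files = [f"{DATADIR}/{name}" for name in ("ibdata1", "ib_logfile0", "ib_logfile1")]
    return [
        # 1. Go back to the old settings so the existing InnoDB files still load.
        f"cp {q(old_cnf)} {q(MY_CNF)}",
        f"{q(RC_SCRIPT)} restart",
        # 2. Dump the database to plain-text SQL (includes CREATE DATABASE).
        f"mysqldump --databases {q(database)} > {q(dump_file)}",
        # 3. Drop the database, stop the server, delete the shared InnoDB files.
        "mysql -e " + q(f"DROP DATABASE `{database}`"),
        f"{q(RC_SCRIPT)} stop",
        "rm " + " ".join(q(path) for path in innodb_files),
        # 4. Switch to the new settings; InnoDB recreates its files on startup.
        f"cp {q(new_cnf)} {q(MY_CNF)}",
        f"{q(RC_SCRIPT)} start",
        # 5. Reload the data from the plain-text dump.
        f"mysql < {q(dump_file)}",
    ]


plan = innodb_migration_plan("OSTD", "/root/ostd.sql", "/etc/my-small.cnf", "/etc/my-huge.cnf")
for command in plan:
    print(command)
```

Printing the plan instead of executing it is deliberate: step 3 is destructive, so the dump from step 2 should be checked by hand before anything gets deleted. Removing the shared InnoDB files is only safe when no other database on the server uses InnoDB.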
Luckily OSTD was the only MySQL user that was using the InnoDB backend, so none of my blogs were affected. Though it all worked out fine in the end – I’m quite surprised that there is no automagic way to “upgrade” the InnoDB backend. It’s bizarre to me that in this day and age of the cloud and enterprise scalability my storage backend would be tied to MySQL memory settings.
I’ve started to clean up the site in preparation for the completion of the import, when I’ll be announcing its release. Still not sure if I’m going to register a domain for it or not, but probably not at first.