I’ve started to run into serious performance issues with my SQL queries. I mentioned my concerns earlier, but now (still long before production time) I’m already experiencing clearly unacceptable performance.

I’ve added a couple of thousand translated strings to the database, and uploaded another PO file for the record. That would run the following query for each line in the PO file:

"SELECT Translation.TranslatedString FROM Translation,Language WHERE " .
"Translation.LanguageID = Language.LanguageID AND Language.LanguageCode = '%s' " .
"AND Translation.EnglishString = '%s'"

This I think is called a join, and my boring data structures and algorithms experience says it’s an n^2 algorithm. But I figured a couple of things:

  1. This has got to be almost the most basic kind of SQL query you can write. Select from table1 where the foreign key in table1 is the same as the primary key in table2. Database 101? Why is MySQL not smart enough to cache the result of the subquery, and only repeat it if table2 is changed in the meantime? Was that really such a hard optimisation to make, or did they leave it slow on purpose to encourage better query design?
  2. My training as a real software developer tells me to avoid race conditions, and despite the extremely low likelihood that the LanguageID for french is going to change during the runtime of this query, I should never assume, and rely on the DBMS for making that association.

I figured wrong. The query above, after I inserted enough records into Translation would take ages, my page would take 5 minutes to load during which time MySQL would use 97% of the CPU. Unacceptable.

I had to step over my reluctance of caching a key outside the DBMS and run a separate query to obtain Translation.LanguageID, and ended up with this query in my loop:

"SELECT Translation.TranslatedString FROM Translation,Language WHERE " .
"Translation.LanguageID = '%s' AND Translation.EnglishString = '%s'"

Works much faster. So much faster that I’ve almost stopped thinking about extra hardware for MySQL. Shame that I had to do this.