There are plenty of people in our industry (software development) who advise failing early and failing often. This is the story of one such project. The OSTD is a really cool idea: get free translations, or at least get started on them. I never pushed it too hard but I developed it to the …
Continue reading OSTD: an experiment that didn’t work

OSTD
Posts related to the Open Source Translation Database.
Disgusting:
for L in `cat lang.txt | cut -f 2,3,4,5 -d' ' | sed 's/^.//' | sed 's/.$//' | sort`; do echo -n "$L "; done
More disgusting:
cat lang.txt | sort | awk '{ a=substr($2$3, 2); sub(")$", "", a); print " \""$1"\", \""a"\", \"The <a href=\x27http://littlesvr.ca/ostd/\x27>OSTD</a>\"," ; }'
It reminds me of when I …
Continue reading I’m ashamed I wrote this

Translating software is hard. I know from my experience of starting two new open source projects (ISO Master and Asunder) the challenges of learning how to use Gettext, finding volunteers to do the translations, and encouraging and enabling them to translate my software. The work was worth it for me: I now have almost 70 …
Continue reading Announcing the Open Source Translation Database

I was going to show the OSTD to Chris Tyler, and earlier that day, because demos never work, I tried it from the Seneca network. Turns out the OSTD is already a victim of its own success. When translating the ISO Master POT file I get almost 6000 translated strings in 153 languages. If you …
Continue reading Size matters

Finally, a couple of days ago, the import of all the translated strings from most of the software in Debian into OSTD was completed. Now there is a grand total of 11236263 translated strings! It took 1059647 seconds, which is just over 12 days. That’s 0.094 seconds per translation. I’m sure it could be …
Continue reading Debian import complete

Since the 19th of this month (that’s 6 days ago; I don’t know where all that time has gone. Oh yeah, tests) I’ve been importing the translated strings from Debian. Right now I’ve done over 6 million (6036472) and I’ve only got to the end of the projects beginning with the letter “g”. Using some simple …
Continue reading 6 million translated strings and counting

Most of the po files in the Debian tarball follow the naming convention packagename_version_languagecode.po. So for all of those I could figure out the language code using a regular expression (or three) on the filename. Armed with that and the exceptions I mentioned in the last post on this topic, I was able to get …
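To illustrate the filename convention described above, here is a minimal sketch in Python of pulling the language code out with a single regular expression. It is not the code used for the OSTD import: the pattern, the function name `language_code` and the example filenames are all assumptions made for illustration, and it only handles codes shaped like `fr` or `pt_BR`.

```python
import re

# Hypothetical pattern for names of the form packagename_version_languagecode.po.
# Assumes the language code is 2-3 lowercase letters, optionally followed by
# an underscore and a 2-letter uppercase region (e.g. "fr" or "pt_BR").
PO_NAME = re.compile(
    r"^(?P<package>.+?)_(?P<version>[^_]+)_(?P<lang>[a-z]{2,3}(?:_[A-Z]{2})?)\.po$"
)

def language_code(filename):
    """Return the language code embedded in a PO filename, or None if the
    name does not follow the assumed convention."""
    match = PO_NAME.match(filename)
    return match.group("lang") if match else None

# Made-up filenames, used only to exercise the pattern.
for name in ["asunder_2.9_fr.po", "isomaster_1.3.9_pt_BR.po", "README.txt"]:
    print(name, "->", language_code(name))
```

Filenames that return None here would be the ones needing the extra expressions or hand-written exceptions.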
Continue reading Language codes, part 2

While analyzing the files I got from Debian I ran into a lot of language codes that weren’t in my database already. It was an interesting exercise, which involved learning about the existence of languages such as Javanese and of countries that I had already forgotten about. The problem is that some of the language codes are …
Continue reading Language codes, part 1

Christian Perrier from the debian-i18n list has done me a huge favour. He created a tarball with every translation in every language for every piece of software in Debian! You may imagine it’s huge, as did I, but I was shocked at just how big it is. Almost 2 GB of gzip-compressed PO files from …
Continue reading Lots of translations

Here’s an OSTD feature I was really excited about and looking forward to implementing: allow the user to put in some version control info for a project so that I can run a nightly cron job and pull any new translations that have been added to that project. Would have been a great feature, and …
Continue reading Oh, you Git!