One of the things I will need in my database is a table with all the language codes used in Linux locales. Things like en, fr, es, etc. There are lots, but where do I get a reliable list?
I’ve done some searching and found the IANA Language Subtag Registry. It’s a 45,000-line text file with records in this format:
%%
Type: language
Subtag: ab
Description: Abkhazian
Added: 2005-10-16
Suppress-Script: Cyrl
Of all those lines, only 1155 are for 2-letter codes, which are what I'm interested in. How do I get the language code and English name from there into a database? Piece of cake if you know some basic shell scripting:
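#!/bin/bash
# Print "code name" for each record: the Subtag value, then its first Description.
HAVECODE=0
while read -r LINE; do
    if echo "$LINE" | grep -q '^Subtag:'; then
        echo -n "$(echo "$LINE" | cut -f 2 -d' ') "
        HAVECODE=1
    elif echo "$LINE" | grep -q '^Description:'; then
        if [ "$HAVECODE" -eq 1 ]; then
            # Keep everything after the field name so multi-word names survive
            echo "$LINE" | cut -f 2- -d' '
        fi
        HAVECODE=0
    fi
done < languagelist.txt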
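The script above trusts that languagelist.txt only holds the records I care about. As a rough sketch of an alternative, assuming the input is the raw registry file in the record format shown earlier, awk could do the filtering itself and keep only 2-letter subtags of type "language". The function name extract_languages is made up for illustration:

```shell
#!/bin/bash
# Sketch only: print "code name" for every 2-letter language subtag in a
# registry-style file. extract_languages is a hypothetical helper name.
extract_languages() {
    awk '
        /^%%/            { type = ""; code = "" }   # record separator: reset state
        /^Type: /        { type = $2 }
        /^Subtag: /      { code = $2 }
        /^Description: / {
            # Print only the first Description of a 2-letter language record.
            if (type == "language" && code ~ /^[a-z][a-z]$/) {
                sub(/^Description: /, "")
                print code, $0
                code = ""
            }
        }
    ' "$1"
}
```

Calling it as extract_languages languagelist.txt should produce the same "code name" lines that the insert script expects.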
And insert it all into the database:
#!/bin/bash
# Insert each "code name" line from parselanguagelist.sh into the Language table.
./parselanguagelist.sh | while read -r LINE; do
    CODE=$(echo "$LINE" | cut -f 1 -d ' ')
    # Everything after the code is the name, which may contain spaces
    NAME=$(echo "$LINE" | cut -f 2- -d ' ')
    mysql -u user -ppassword -e "INSERT INTO Language (LanguageCode,LanguageEnglishName) VALUES('$CODE','$NAME');" ostd
    if [ $? -eq 0 ]
    then
        echo "Inserted $CODE ($NAME)"
    else
        echo "Failed to insert $CODE ($NAME)"
    fi
done
Done, 190 records. And next time I want to update the list (who knows, it might happen) I’ll just need to get a new list and use the MySQL feature that will let me either create or update a row depending on whether it already exists.
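One MySQL feature that fits that description is INSERT ... ON DUPLICATE KEY UPDATE. Here is a minimal sketch of how the statement could be built, assuming LanguageCode is the primary key or has a UNIQUE index on it (the table definition isn't shown here, so that part is a guess). The helper name upsert_sql is made up for illustration:

```shell
#!/bin/bash
# Sketch only: build an "insert or update" statement for one language.
# upsert_sql is a hypothetical helper; it assumes LanguageCode is a
# primary key or UNIQUE column, otherwise MySQL would always insert.
upsert_sql() {
    local code=$1 name=$2 q="'"
    # Double any single quotes so the name stays a valid SQL string literal.
    name=${name//$q/$q$q}
    printf "INSERT INTO Language (LanguageCode,LanguageEnglishName) VALUES('%s','%s') ON DUPLICATE KEY UPDATE LanguageEnglishName='%s';\n" \
        "$code" "$name" "$name"
}
```

Since the mysql client reads statements from standard input, something like upsert_sql "$CODE" "$NAME" | mysql -u user -ppassword ostd could take the place of the plain INSERT in the loop.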
I think it would have taken me quite a while to generate this list of SQL commands by hand :)