One of the things I will need in my database is a table with all the language codes used in Linux locales. Things like en, fr, es, etc. There are lots, but where do I get a reliable list?
I’ve done some searching and found the IANA Language Subtag Registry. It’s a 45,000-line text file made up of records in this format:
%%
Type: language
Subtag: ab
Description: Abkhazian
Added: 2005-10-16
Suppress-Script: Cyrl
Of all those lines, only 1155 belong to 2-letter codes, which are what I was interested in. How do I get the language code and English name from there into a database? Piece of cake if you know some basic shell scripting:
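#!/bin/bash
# Print one "code name" line per record in languagelist.txt.
HAVECODE=0
while IFS= read -r LINE
do
    if echo "$LINE" | grep -q '^Subtag:'
    then
        # Print the code and stay on the same line for the name.
        printf '%s ' "$(echo "$LINE" | cut -f 2 -d ' ')"
        HAVECODE=1
    elif echo "$LINE" | grep -q '^Description:'
    then
        # Only the first Description after a Subtag is used.
        if [ "$HAVECODE" -eq 1 ]
        then
            # Keep the whole name, including multi-word ones.
            echo "$LINE" | cut -f 2- -d ' '
        fi
        HAVECODE=0
    fi
done < languagelist.txt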
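The script above prints every Subtag it sees, so it relies on languagelist.txt already being cut down to the 2-letter language records. A rough sketch of doing the filtering and the parsing in one pass with awk, straight from the full registry, might look like this. The function name two_letter_languages is made up for illustration, and it assumes Type, Subtag and Description each start a line and appear in that order, as in the sample record above.

```shell
#!/bin/bash
# Sketch: list "code name" pairs for 2-letter language subtags.
# Reads "%%"-separated registry records on stdin.
# two_letter_languages is a hypothetical helper name.
two_letter_languages() {
    awk '
        /^%%/            { type = ""; code = ""; next }   # start of a new record
        /^Type: /        { type = $2; next }
        /^Subtag: /      { code = $2; next }
        /^Description: / {
            # Use only the first Description of a 2-letter language record.
            if (type == "language" && code ~ /^[a-z][a-z]$/) {
                sub(/^Description: /, "")
                print code, $0
            }
            code = ""
        }
    '
}
```

With the full registry saved as registry.txt (a placeholder name), running two_letter_languages < registry.txt should print lines like "ab Abkhazian", with multi-word names kept whole.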
And insert it all into the database:
#!/bin/bash
# Insert each "code name" pair printed by parselanguagelist.sh.
./parselanguagelist.sh | while read -r CODE NAME
do
    # Double any single quotes so the name is a valid SQL string literal.
    NAME_SQL=$(echo "$NAME" | sed "s/'/''/g")
    if mysql -u user -ppassword -e "INSERT INTO Language (LanguageCode,LanguageEnglishName) VALUES('$CODE','$NAME_SQL');" ostd
    then
        echo "Inserted $CODE ($NAME)"
    else
        echo "Failed to insert $CODE ($NAME)"
    fi
done
Done, 190 records. And next time I want to update the list (who knows, it might happen) I’ll just need to get a new list and use MySQL’s INSERT ... ON DUPLICATE KEY UPDATE, which either creates or updates a row depending on whether it already exists.
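The create-or-update feature mentioned above is MySQL’s INSERT ... ON DUPLICATE KEY UPDATE. Here is a minimal sketch of how a later refresh could use it, assuming LanguageCode is the primary key or has a unique index (nothing in this post shows the table definition). The helper name make_upserts is invented for the example.

```shell
#!/bin/bash
# Sketch: turn "code name" lines on stdin into MySQL upsert statements.
# make_upserts is a hypothetical helper name.
# Assumes LanguageCode is the primary key or has a unique index.
make_upserts() {
    while read -r CODE NAME
    do
        # Double single quotes so the name is a valid SQL string literal.
        NAME=$(printf '%s' "$NAME" | sed "s/'/''/g")
        printf "INSERT INTO Language (LanguageCode,LanguageEnglishName) VALUES('%s','%s') ON DUPLICATE KEY UPDATE LanguageEnglishName='%s';\n" "$CODE" "$NAME" "$NAME"
    done
}
```

Something like ./parselanguagelist.sh | make_upserts | mysql -u user -ppassword ostd would then send all the statements through a single mysql call instead of starting one per row.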
I think it would have taken me quite a while to generate this list of SQL commands by hand :)