Archive for the 'Safe For Seneca' Category

I knew there was a good reason I liked C

Saturday, December 24th, 2011

By Andrew Smith

I think I mentioned earlier how I almost started writing a website in C and quickly realised that wasn’t the right tool for the job, and switched entirely to PHP.

For the work I did today I needed a good set of data structures:

  • A set of files
    • Each with a set of English strings
      • Each with a set of 0 or more translations

Piece of cake, right? Right: it would have taken me half an hour in C, but it took me over six hours in PHP.

One of the problems was a surprise in scoping. Turns out in PHP there is no such thing as block scoping, and they forgot to mention that in the manual (isn’t it obvious there’s only per-function and per-file scope?). This created some very weird bugs that took a lot of printing to figure out.

Then there are the arrays. How can you have a programming language without 0-indexed arrays? PHP forces you to manage the indices yourself, since their ‘array’ is actually a hash table. No vector either, no list, basically only a hash table. I can get used to it I guess, but was it really so hard to have something more structured?

Then there are the classes. I needed to use them because there is no concept of a struct and the arrays are so awkward. I won’t go too much into it, let me just say that I suspect classes in PHP resemble classes in Perl. Sure you can have them, but don’t expect them to be easy to use.

All this complaining, you say, but didn’t it take you years to learn how things work in C, what’s the problem with taking the time to learn how things work in PHP? Well – I would accept that if it weren’t for the prevailing opinion that PHP is easy to learn. It isn’t: you can get started using it very quickly but its similarities to C in syntax make it harder (not easier) to learn it well.

And yeah, the same applies to JavaScript, but whatever. That’s for when I’m ready to do something with this same data I mentioned above in the browser. Gonna be fun :)

Some more evidence why unit tests are useless

Saturday, December 24th, 2011

By Andrew Smith

In the last year I’ve had to use CPAN several times. Perl people are all fancy and insist that all Perl modules have unit tests. Commendable, except when it doesn’t work.

At least half of the times I used CPAN the unit tests failed. CPAN, the smarty-pants, decides not to install modules whose unit tests failed. That’s a good thing, right? Wrong.

The problem is that unit tests usually fail not because the software is broken but because the unit tests are broken. Writing unit tests is somewhat similar to writing design documents – generally speaking they are obsolete as soon as you’ve written them, they are never complete, and they don’t account for all the possible environments.

So when CPAN fails, what am I as a user supposed to do, go file a bug report and wait a few months for someone to possibly fix it? Be real. I just run cpan -f to force an install despite failed unit tests. And given that I’ve had to do that 50% of the time I wonder if the typical cpan user simply adds that flag in by default.

Linux Stuff notifications on this wordpress blog

Friday, December 23rd, 2011

By Andrew Smith

I have this silly little website called Linux Stuff, where I post things I learn about Linux that I think might be useful for other people as well.

When a new article is posted I get an email about it: piece of cake using the PHP mail() function.

For some reason (I already forgot why) I now want a new post on this blog whenever a new article is added to Linux Stuff. It turned out to be not so hard.

First I found a file on the internet called remotepost.class.php, and edited that to look like this:

<?php
// Posts a new entry to a WordPress blog through its XML-RPC interface.
class remotePost
{
  private $client;
  private $wpURL = 'http://localhost/grumble/xmlrpc.php';
  private $ixrPath = '../grumble/wp-includes/class-IXR.php';
  private $uname = 'newuser';
  private $pass = 'newpass';
  public $postID;

  // $content is an array with 'title', 'description' and 'categories' keys
  function __construct($content)
  {
    if(!is_array($content)) throw new Exception('Invalid Argument');
    require_once $this->ixrPath; // IXR_Client ships with WordPress
    $this->client = new IXR_Client($this->wpURL);

    $this->postID = $this->postContent($content);
  }

  // Publishes the post and returns the response from the server (the new post ID)
  private function postContent($content)
  {
    if(!$this->client->query('metaWeblog.newPost','',$this->uname,$this->pass,$content,true)) throw new Exception($this->client->getErrorMessage());
    return $this->client->getResponse();
  }
}
?>

I made that file readable only by Apache, just in case, even though this is not a multi-user server. I created newuser in the WordPress Users panel.

Then in my Linux Stuff PHP on submit I added these few lines of code:

    /* Send it to Grumble Grumble too */
    $content['title'] = $got_articlename;
    $content['categories'] = array('Open Source');
    $content['description'] = "New post added to Linux Stuff: $got_articlename.";
    try
    {
      $posted = new remotePost($content);
      $pid = $posted->postID;
    }
    catch(Exception $e)
    {
      echo $e->getMessage();
    }

Piece of cake! Now I can post interesting things either here in WordPress or on Linux Stuff, and lucky you can learn about them :)

Looking at .inc files on an Apache server

Saturday, December 10th, 2011

By Andrew Smith

Ever since I learned how CGI works (a couple of lifetimes ago) I was bothered by the fact that the source code is accessible by the web server, and by extension – by anyone on the internet.

If your CGI module is properly loaded and configured then Apache will execute the files rather than display them, but if there is a problem loading your module then Apache will stupidly display the contents of your source code in the web browser.

This is a real problem because the source code usually includes credentials for your database and who knows what else.

With PHP it got a little better when it became almost completely integrated into Apache and it was very challenging to break PHP without breaking Apache too.

Today I thought – wait a minute. My php will interpret .php files, but what about all those .inc files? They are PHP yeah but with a different extension. Sure enough I looked at ostd/parsePoFile.inc in Firefox and the whole source is dumped right out.

It was an easy enough fix, adding a Files section to my httpd.conf, but come on guys! How hard would it have been to add .inc files to the default config? .ht* is there. Lame.
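A Files section for this might look like the following sketch. It is not necessarily the exact block used here: it assumes Apache 2.2-style access directives and that every include file ends in .inc.

```apache
# Refuse to serve any file whose name ends in .inc
<Files ~ "\.inc$">
    Order allow,deny
    Deny from all
</Files>
```

On Apache 2.4 the two access lines would be replaced by a single "Require all denied".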

Number of SQL queries per page

Friday, December 9th, 2011

By Andrew Smith

If someone updates a po file with 100 translations – I need to figure out whether each translation is already in the database, and if not – insert it. The result looks like this (more about “looks” later):

This is just a snippet.

I am concerned that running an SQL query like this:

"SELECT Translation.TranslatedString FROM Translation,Language WHERE " .
"Translation.LanguageID = Language.LanguageID AND Language.LanguageCode = '%s' " .
"AND Translation.EnglishString = '%s'"

for every line uploaded might be a bit too much. Is it?

The prevailing wisdom on the internet seems to be that more than a few queries per page is too much. But I wonder if that’s for viewing content rather than uploading. Perhaps for uploading content this is not too bad. I mean I hope a lot of uploading is going to happen but that’s not the most common use case.

Regardless, I can’t think of an easy way to optimise this. I would have to spend a lot more time on the database design. Maybe when (if?) the site gets popular I will have the motivation to learn more fancy DB optimisations.

Smaller PHP-generated HTML

Friday, December 9th, 2011

By Andrew Smith

I will have something like this on one of the pages in the website:

Each of those dropdown lists has the 190 language codes I mentioned in the previous post. It may not sound like a lot but 1900 <option> values really is a lot of HTML. It’s very easy to generate it all in PHP (one loop basically) but the result still has to be pulled over the internet, sucking up my bandwidth.

I had to think about this one for a bit, but I came up with a decent solution.

In my PHP loop I create two javascript arrays: one for visible name and one for value. I also have ten sets of dropdowns printed as HTML, but with the dropdown lists empty.

Then I write some javascript that will onload populate all the dropdown lists with all the names and values from the arrays.

This made the HTML sent to the browser probably 90% smaller, which made me very happy.

Just because it’s funny, here’s what some of the PHP code looks like. I’m glad no one else will be working with me on this so I won’t have to explain it :)

  $languages = array();
  for ( $rowNum = 0; ($row = mysql_fetch_row($result)) != FALSE; $rowNum++)
  {
    $languages[$row[0]] = $row[1];
  }
?>
    <script type="text/javascript">
    function contentOnLoad()
    {
      var languageCodes = new Array(<?php
  # make the PHP array a Javascript array
  $firstElement = TRUE;
  foreach ($languages as $code => $name)
  {
    if ($firstElement)
      $firstElement = FALSE;
    else
      echo ",";

    echo "'" . $code . "'";
  }

That’s PHP, HTML, and JS all in one, still cracks me up :)

Scraping data from a reliable source

Friday, December 9th, 2011

By Andrew Smith

One of the things I will need in my database is a table with all the language codes used in Linux locales. Things like en, fr, es, etc. There are lots, but where do I get a reliable list?

I’ve done some searching and found the IANA language subtag repository. It’s a 45000 line text file with contents in this format:

%%
Type: language
Subtag: ab
Description: Abkhazian
Added: 2005-10-16
Suppress-Script: Cyrl

Of all those records only 1155 lines are 2-letter codes, which is what I was interested in. How do I get the language code and English name from there into a database? Piece of cake if you know some basic shell scripting:

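#!/bin/bash
# Print "<subtag> <description>" for each record in languagelist.txt

HAVECODE=0   # becomes 1 once a Subtag line has been seen for the current record
while read -r LINE
do
  if echo "$LINE" | grep -q '^Subtag:'
  then
    echo -n "$(echo "$LINE" | cut -f 2 -d' ') "
    HAVECODE=1
  elif echo "$LINE" | grep -q '^Description:'
  then
    if [ "$HAVECODE" -eq 1 ]
    then
      # Keep the whole description, not just its first word
      echo "$LINE" | cut -f 2- -d' '
    fi
    HAVECODE=0
  fi
done < languagelist.txt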
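The same extraction can also be sketched with awk, filtering on record type and subtag length so that only two-letter language codes come out. This is an alternative illustration, not the script used for the site; the function name list_two_letter_languages is made up for the example, and it assumes records are separated by lines containing only "%%", as in the sample above.

```shell
#!/bin/bash
# Sketch: print "<code> <description>" for two-letter language subtags only.

list_two_letter_languages() {
  awk '
    /^%%/            { type = ""; code = ""; seen = 0; next }   # start of a new record
    /^Type: /        { type = $2; next }
    /^Subtag: /      { code = $2; next }
    /^Description: / {
      # Use only the first Description line of a two-letter language record
      if (type == "language" && length(code) == 2 && !seen) {
        sub(/^Description: /, "")
        print code, $0
        seen = 1
      }
    }
  ' "$@"
}
```

It reads the files named as arguments, or standard input if there are none, for example: list_two_letter_languages languagelist.txt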

And insert it all into the database:

#!/bin/bash
# Insert each "<code> <name>" line produced by the parser into the Language table

# read puts the first word in CODE and the rest of the line in NAME
./parselanguagelist.sh | while read -r CODE NAME
do
  if mysql -u user -ppassword -e "INSERT INTO Language (LanguageCode,LanguageEnglishName) VALUES('$CODE','$NAME');" ostd
  then
    echo "Inserted $CODE ($NAME)"
  else
    echo "Failed to insert $CODE ($NAME)"
  fi
done

Done, 190 records. And next time I want to update the list (who knows, it might happen) I’ll just need to get a new list and use the MySQL feature that will let me either create or update a row depending on whether it already exists.
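The MySQL feature in question is presumably INSERT ... ON DUPLICATE KEY UPDATE. A minimal sketch of how the insert statement could be generated in that form follows. It assumes Language.LanguageCode has a PRIMARY KEY or UNIQUE index (otherwise MySQL has nothing to detect a duplicate on), and the function name upsert_language_sql is made up for the example.

```shell
#!/bin/bash
# Sketch: build an "insert or update" statement for one Language row.
# Assumes LanguageCode carries a PRIMARY KEY or UNIQUE index.

upsert_language_sql() {
  local code="$1" name="$2"
  # Double any single quotes so an apostrophe in a name does not end the
  # SQL string early (backslashes are not handled here)
  code=$(printf '%s' "$code" | sed "s/'/''/g")
  name=$(printf '%s' "$name" | sed "s/'/''/g")
  printf "INSERT INTO Language (LanguageCode,LanguageEnglishName) VALUES('%s','%s') ON DUPLICATE KEY UPDATE LanguageEnglishName = VALUES(LanguageEnglishName);\n" "$code" "$name"
}
```

The output can be piped straight into the client, for example: upsert_language_sql ab Abkhazian | mysql -u user -ppassword ostd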

I think it would have taken me quite a while to generate this list of sql commands by hand :)

GUI for editing PHP live on the server

Thursday, December 8th, 2011

By Andrew Smith

I have an aversion to vi. It’s present on every Linux box so I had to learn how to use it, but I very much dislike it; it’s a pain in the ass to use. (Emacs is no better by the way)

I found a brilliant solution for that problem though. In Linux there is a thing called FUSE and another thing called sshfs.

That lets you mount a directory from the server onto your workstation filesystem, without any changes to the server (assuming you have sshd running, which of course you do). Then you can use your favourite GUI editor (mine is SciTE) to edit your PHP and CSS. Brilliant stuff.
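Mounting and unmounting can be sketched as a pair of shell functions. The function names and the DRY_RUN switch are made up for this example, and the host and paths in the usage line are placeholders; the underlying commands are the standard sshfs and fusermount ones.

```shell
#!/bin/bash
# Sketch: mount a remote directory over SSH with sshfs, and unmount it again.

# With DRY_RUN=1 the commands are printed instead of executed.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# mount_site user@host:/remote/dir /local/mountpoint
mount_site() {
  run mkdir -p "$2" && run sshfs "$1" "$2"
}

# unmount_site /local/mountpoint
unmount_site() {
  run fusermount -u "$1"
}
```

For example, mount_site me@server:/var/www/ostd ~/ostd would make the remote directory appear under ~/ostd, and unmount_site ~/ostd would detach it again.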

It also helps that my server is on the same LAN as my workstation, so I get LAN transfer speeds and the mounted network filesystem acts as fast as a hard drive.

Now I can use a nice editor to make my changes, a nice file manager to work with files and folders, and after making a change I just go to the browser and hit reload. This is so much more efficient than the command line!

Potentially I can even run svn commands from the workstation, but I don’t think I will try that because the svn on my workstation is much newer than that on my server, so I expect that would cause problems.

I do keep a terminal to the server running in case there’s a critical error in my php. Running php from the command line is sometimes an easier way to find the bug.

Version control for websites

Thursday, December 8th, 2011

By Andrew Smith

I was always a fan of version control, even for binaries like Word files for homework. This helps keep a history of stuff that may get deleted or changed, and helps a lot when working across multiple computers.

But I’ve never version controlled a website. Mostly because I never had one I was doing serious development on. In fact several of my websites were made using the Seamonkey editor once, uploaded to the server, and never touched again.

This project is different though, there will be a lot of serious code in it. Why not version control it? I decided to do it.

I had a Subversion repository for the project already, and in there I created a subdirectory for the stuff that will go on the server. On the server I checked out that subdirectory into the root of the webserver, as ‘ostd’. Then I added and committed all the php and css and images. Done!

Now I just have to remember to do commits when I’m done working on a feature or will be away from the project for a few days.

Making a good looking website

Thursday, December 8th, 2011

By Andrew Smith

I am the first to admit that I’m almost completely incapable of making something look good. I’m reasonably good at making things functional and easy to use, but when it comes to looks.. not me.

At first I thought I would start with just plain black and white PHP output, but I felt that was going to be a significant problem when trying to attract interest, even early in the project. Also I wouldn’t want to find out two months in that I should have written a WordPress skin rather than my own PHP.

So then what to do?

I quickly discounted WordPress. I have several WordPress installations and the maintenance is a nightmare. New versions come out all the time, and most of them fix security bugs, so an upgrade is required; constantly upgrading a WordPress installation and making sure that the new version still works was too much.

Then I looked around for website templates. There are actually quite a few websites with free templates. I chose one from http://www.webtemplateocean.com/ – it’s licenced Creative Commons Attribution, which worked for me.

Then the template (one HTML file, one CSS file, and a bunch of images) needed to be PHPified. Basically that means splitting the page into three parts – top, body, and bottom. The top and bottom are the same for every page, so they can be printed from PHP functions. The body can be different for every page.

I was quite happy with the result. My index.php has basically nothing in it other than the content of the main page, and the same is true for all the other pages.

I will add a link to the website once it’s a little more cleaned up, in a later post.