Computers, PHP, Programming, Technology, Web Programming

Moved to Amazon EC2

I’ve just moved the blog over to Amazon EC2 and so far everything seems to be going well.

I’d been considering the move for a while, and a new feature (well, I’m not sure how new it is, but I only just noticed it) is a smaller instance type. The virtual servers Amazon offer used to come in three sizes (small, medium and large), starting at $0.10 per hour*. Pretty quickly they added some bigger sizes (going all the way up to $2.00 per hour for quadruple extra large) as well as some more specialized types like GPU clusters. But it still meant the minimum price for an always-on server was about $74 a month, which is expensive for simple web hosting.

Now, however, their new micro instances are available at a pretty cool $0.02/hour (about $15 a month). For the performance you’re likely to get, it’s still probably not the most cost-effective solution for plain web hosting, but for complete access to a server with high availability (and the extra features that hosting on Amazon’s infrastructure provides, like being able to clone a whole server with one click) it’s pretty good.

One final note: remember that these numbers are not the final costs you’ll have to pay. You still pay for storage and data transfer, which in my case look like they’ll add about an extra 10%.

* Since then the price of the small instance has come down to $0.085/hour or about $63/month.
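
The monthly figures above follow from the hourly rates if you assume a 31-day month of continuous running. Here is a minimal sketch of that arithmetic; the function name and the flat 10% for storage and transfer are illustrative assumptions, not Amazon's billing rules.

```python
HOURS_PER_MONTH = 24 * 31  # assumption: the post's figures match a 31-day month


def monthly_cost(hourly_rate, extras=0.0):
    """Dollar cost of one always-on instance for a month.

    extras is a fraction added on top for storage and data transfer.
    """
    return hourly_rate * HOURS_PER_MONTH * (1 + extras)


rates = [("small (original)", 0.10), ("small (reduced)", 0.085), ("micro", 0.02)]
for name, rate in rates:
    base = monthly_cost(rate)
    with_extras = monthly_cost(rate, extras=0.10)
    print(f"{name}: ${base:.2f}/month, ${with_extras:.2f} with ~10% extras")
```

The 10% is only what storage and transfer looked like in my case, so treat the second column as a rough guide.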

Computers, PHP, Programming, Technology, Web Programming

FreeNAS

In an effort to get more storage to share between the three computers at home (two Windows and one MythTV) I set up yet another machine running FreeNAS.

FreeNAS is a small (about 30MB) operating system based on FreeBSD designed just to be a NAS (Network Attached Storage). You add hard drives to it and it makes them (optionally) available in several different ways, including:

  • CIFS/Samba
  • NFS
  • rsync
  • HTTP
  • FTP

After a few minor problems setting it up (like a power cable breaking and installing from an old CD-ROM drive that didn’t work) it works great. Copying a large (~40GB) chunk of files to it at once took a while but writing to and reading from it at more sensible levels isn’t noticeably slower than using local files (on a gigabit network).

Computers, Entertainment, PHP, Programming, Technology

Transcoding DVDs

Following my post about ripping DVDs, here is a method for transcoding the DVDs into something more manageable. I should point out that this is probably for the more technical amongst you. There are certainly easier ways to do it, but this has the advantage of being very automatable.

Since MythTV (and Linux in general) seems to like ffmpeg for video encoding/decoding, I figured I’d use that. You can get a binary version for Windows and read the documentation.

The actual command line I use to transcode is:
ffmpeg -i $in_file -vcodec xvid -qscale 5 -acodec copy $out_file

That means to use $in_file as input (a VOB file in my case), use the Xvid codec for the video, set the “quality” to 5, copy the audio straight from the original and save as $out_file. The quality in this case is just a simplification of lots of other settings that are available. 1 is perfect and 31 is the worst. 5 results in files that are about 500MB per hour, with MPEG artifacts that are visible when I’m sat at my desk but not when I sit on my bed six feet away, which is where I normally watch video from. It may be worth transcoding a short clip with a few different settings to see which you’re happy with.
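
One way to compare a few quality settings on a short clip is to generate one command per setting. This is only a sketch: the quality values, the output names and the input file name are made up for illustration, and the `-t` option (which limits the duration) is an addition that is not part of the command above.

```python
import shlex


def sample_commands(in_file, qualities=(3, 5, 8), seconds=60):
    """Build one ffmpeg command per quality setting for a short test clip."""
    commands = []
    for quality in qualities:
        out_file = f"sample_q{quality}.avi"  # hypothetical naming scheme
        commands.append([
            "ffmpeg", "-i", in_file,
            "-t", str(seconds),  # assumption: stop after this many seconds
            "-vcodec", "xvid", "-qscale", str(quality),
            "-acodec", "copy", out_file,
        ])
    return commands


# Print the commands rather than running them, so they can be checked first.
for command in sample_commands("VTS_01_1.VOB"):
    print(shlex.join(command))
```

Watching the resulting clips from wherever you normally sit is the real test; the numbers alone don't tell you much.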

I made the whole process semi-automatic by writing a CLI PHP script that checks for VOB files in a specific folder and transcodes the ones it finds. That way I can have the transcoding going on in the background while I rip the DVDs (and then leave it running overnight to finish). I could make it available to anyone who wants it, but a batch file doing the same thing would probably be more useful for people…
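
For anyone who wants to try the same idea, here is a rough Python sketch of that kind of script. It is not the PHP script itself. The folder name `rips` and the rule of skipping any VOB that already has an AVI beside it are assumptions; the ffmpeg options are the ones from the command above.

```python
import subprocess
from pathlib import Path


def pending_jobs(folder):
    """Return (vob, avi) path pairs for VOB files with no AVI next to them yet."""
    folder = Path(folder)
    if not folder.is_dir():
        return []
    jobs = []
    for vob in sorted(folder.iterdir()):
        if vob.suffix.lower() != ".vob":
            continue
        avi = vob.with_suffix(".avi")
        if not avi.exists():  # assumption: an existing AVI means already done
            jobs.append((vob, avi))
    return jobs


def transcode(vob, avi):
    """Run ffmpeg on one file with the same options as the command above."""
    subprocess.run(
        ["ffmpeg", "-i", str(vob), "-vcodec", "xvid", "-qscale", "5",
         "-acodec", "copy", str(avi)],
        check=True,  # stop if ffmpeg reports an error
    )


if __name__ == "__main__":
    for vob, avi in pending_jobs("rips"):  # "rips" is a hypothetical folder
        transcode(vob, avi)
```

Note that this does not check whether a VOB is still being ripped, so it is safest to point it at a folder that only receives finished files.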

There is one last caveat. I originally encoded the movies with MP3 audio and then halfway through decided I wanted to keep the 5.1 audio (which the above method does). However, the version of ffmpeg I used at first had a problem such that AVIs with AC3 audio played back with no sound. If you have a similar problem, make sure you have the latest version of ffmpeg you can get.

Computers, Javascript, PHP, Programming, Technology, Web Programming

QED Wiki and the Zend Framework

IBM are working on an impressive-looking product called QED Wiki, developed with the Zend Framework.

Fundamentally it’s a wiki like any other. But there is a cool layer on top of it that could be revolutionary (although like many Web 2.0 concepts will probably fall short and just be “cool” – we can hope). The interface allows you to create “situational applications” that can link different components together with the ease of a wiki.

It doesn’t really make much sense just reading about it so go watch the video about it.

On a related note, you can now get snapshots of PHP 6.

Computers, Languages, PHP, Programming, Technology, Web Programming

How much fluff is needed?

I’ve been sorting out exactly what needs recording for the language app (which I finally have an idea for a name for) and I was trying to decide how much extra instructor speech is needed. Situations aren’t described, for instance (no “Imagine an English man sitting next to a French woman”), and you aren’t asked to say things explicitly (“How do you ask someone if they speak English?”). Will this harm the process at all?

The best thing to do perhaps would be to avoid trying to be Pimsleur quite so exactly.

Computers, Languages, PHP, Programming, Technology, Web Programming, XML

Almost ready for a public viewing

The still unnamed language learning app is almost ready for a first public viewing. I’m just trying to get some audio of someone other than myself: firstly because I don’t really like hearing my own voice (and for this purpose my less-than-perfect pronunciation is too obvious), and secondly because I need at least two people just for it not to be confusing.

In the meantime I thought I’d share an example of the script file I’m using:

EntschuldigenSie.xml

It primarily contains English translations although one phrase is done in a few more languages.

It does highlight one possible issue. I had to change the German ß to ss. Although Windows seems perfectly fine with Unicode file names (internally it stores them as either UCS-2 or UTF-16, I’m not sure which), PHP refuses to open them (fopen, file and file_exists, for instance, just don’t work) and Apache 2 seems to have issues as well. For German there are workarounds, but for other languages it will get fiddly. This might not even be a problem on Linux, where it will ultimately reside, and it only affects file names, which only have to give you a rough idea of what’s inside. But still, it’s annoying…
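
The German workaround can be sketched as a small function that turns a file name into plain ASCII. The mapping table and the function name are illustrative assumptions, and the accent-stripping fallback is just one possible approach for other Latin-script languages.

```python
import unicodedata

# Conventional German replacements (an illustrative table, not exhaustive).
GERMAN = {
    "ß": "ss",
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
}


def ascii_filename(name):
    """Return an ASCII-only version of a file name."""
    for source, replacement in GERMAN.items():
        name = name.replace(source, replacement)
    # Split remaining accented letters into base letter plus accent,
    # then drop anything that is still not ASCII.
    decomposed = unicodedata.normalize("NFKD", name)
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(ascii_filename("Straße.xml"))
print(ascii_filename("Grüße.xml"))
```

For scripts that have no ASCII base letters this simply throws the characters away, which is exactly where it gets fiddly.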

Pimsleur, German, Windows, Apache, Unicode, UTF-16

Computers, Languages, PHP, Programming, Technology, Web Programming

Best bits of the language app are done

The most important bits of my cool language learning web app are done. Here’s a quick overview of how it works.

Everything is split into modules which are XML script files and accompanying audio files.

Currently one type of script is supported, a “conversation”. This contains a short (less than 10 sentences) conversation with sub elements all marked up in XML. Sub elements are phrases, terms and notes. At the moment phrases and terms are handled almost identically. Notes are little explanations or possible stumbling points (for example the test script I have alerts the listener to the difference in the ending between “Ich verstehe” and “Sie verstehen” in German).
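
To make that structure concrete, here is a guess at what a conversation script might look like and how the named elements could be pulled out. The element names come from the description above, but the attribute names, the layout and the sample content are hypothetical and are not the app's real format.

```python
import xml.etree.ElementTree as ET

# A hypothetical conversation script, not the real file format.
SCRIPT = """
<conversation lang="de">
  <phrase name="ich-verstehe" translation="I understand">Ich verstehe</phrase>
  <phrase translation="Do you understand English?">Verstehen Sie Englisch?</phrase>
  <term name="englisch" translation="English">Englisch</term>
  <note>The ending differs between "Ich verstehe" and "Sie verstehen".</note>
</conversation>
"""


def named_items(root):
    """Collect (tag, name, text) for every phrase or term with a name attribute."""
    items = []
    for element in root.iter():
        if element.tag in ("phrase", "term") and element.get("name"):
            items.append((element.tag, element.get("name"), element.text))
    return items


root = ET.fromstring(SCRIPT.strip())
for tag, name, text in named_items(root):
    print(f"{tag} {name}: {text}")
print("notes:", [note.text for note in root.iter("note")])
```

Only the named phrase and term would be candidates for repetition; the unnamed phrase is heard once as part of the conversation.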

Any element of a conversation that is to be repeated is named (literally – the XML tag is given a name attribute). The system keeps track of the number of times a named phrase/term is played to the user and when it was last played so the automatic repetition system can work.

A lesson is currently very simple. A module is loaded and the conversation is played straight through. Then the named phrases/terms are played* with translations. Then any phrases/terms scheduled for repetition are played*. The repetitions are actually determined before the conversation is played, however, so that if too many are required no new conversation is played.
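
The lesson flow can be sketched as follows. The play count and last-played time are the two things the system tracks; the rule for when an item is due (the gap doubles after each play) and the limit of ten repetitions are hypothetical placeholders, since the real scheduling rule isn't described here.

```python
from dataclasses import dataclass


@dataclass
class PlayRecord:
    times_played: int = 0
    last_played: float = 0.0  # timestamp in seconds


def is_due(record, now, base_interval=60.0):
    """Hypothetical rule: the wait before a repetition doubles after each play."""
    if record.times_played == 0:
        return False  # never introduced, so there is nothing to repeat
    wait = base_interval * 2 ** (record.times_played - 1)
    return now - record.last_played >= wait


def plan_lesson(history, new_items, now, max_repetitions=10):
    """Work out the repetitions first, then decide whether a new conversation fits."""
    due = [name for name, record in history.items() if is_due(record, now)]
    steps = []
    if len(due) <= max_repetitions:
        steps.append(("conversation", None))
        steps += [("introduce", name) for name in new_items]
    steps += [("repeat", name) for name in due]
    return steps


history = {"ich-verstehe": PlayRecord(times_played=1, last_played=0.0)}
print(plan_lesson(history, ["englisch"], now=130.0))
```

Setting `max_repetitions` low enough makes the plan consist of repetitions only, which is the "no new conversation" case.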

* Played in this case means a specific format. First the native version is played, then a pause, then the translation is played twice.
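
As a tiny illustration of that format, the sequence for one phrase could be built like this. The function name, the file names and the three-second pause are assumptions; only the order of the steps comes from the footnote.

```python
def playback_sequence(native_audio, translation_audio, pause_seconds=3.0):
    """Native version once, a pause, then the translation twice."""
    return [
        ("play", native_audio),
        ("pause", pause_seconds),  # assumption: the pause length is not specified
        ("play", translation_audio),
        ("play", translation_audio),
    ]


for step in playback_sequence("i-understand.mp3", "ich-verstehe.mp3"):
    print(step)
```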

Pimsleur, German