Any voice talent out there?

Oliver Brown

Foreign language voice talent needed for the still-unnamed language learning application.

I have a series of phrases I need recording with a total audio time of about five minutes. I need them in as many languages as possible (although if it isn’t English, German or Finnish then I’ll also need them translating - they are really simple by the way). My main requirements are that the recording is good quality and that you are a native or fluent speaker of the language.

After looking around the Internet for a bit I discovered I could technically afford the hourly rate of most voice actors marketing themselves on the Internet - except they all had rather high minimums which made my five minutes very expensive (although I realise five minutes of audio takes more than five minutes of work). As well as money I can also offer a link and a review which has to be worth something (after all people are paying me for links that have nothing to do with the content of the site and presumably think it’s worth it).

If you’re interested, email me with the language(s) you could do and a quote (and preferably a sample but I acknowledge that this approach is hardly targeting professionals).

TalkTalk service could change

Oliver Brown

To all those people with TalkTalk problems, the quality of your service may change.

I found this page on Sam Knows that is apparently a list of telephone exchanges with scheduling dates for TalkTalk unbundling. According to the information there, no exchanges have actually been switched completely to TalkTalk yet. When they do switch there should be a difference in quality. For most people it apparently couldn’t get worse so this should be seen as a good thing.

The date for the changeover is 31/08/06. I’m a little suspicious about the authenticity of those dates since a reasonable amount of physical activity is required to switch over and doing them all at once seems silly.

Multilingual pretty URLs

Oliver Brown

There is more and more emphasis on pretty URLs these days. With things like Ruby on Rails around to easily support them, and better knowledge and use of things like mod_rewrite, the days of horrible query strings are going away (excluding of course the most used websites - search engines). But how do you make your multilingual website have pretty URLs?

My language learning app uses the Zend Framework and so uses pretty URLs by default. I need the interface available in many languages, but then the URLs should be pretty in a localized way.

For example, starting a new Finnish lesson uses the following:

/lesson/new/fi

That would be the new action of the lesson controller with an extra language code parameter of fi.

In German this should be something like:

/lektion/neu/fi

By default this would access the neu action of the lektion controller.

The “simple” solution would be to write lots of controllers that just delegate to the real one, which is silly. Instead an extra layer has to be added to the routing process: some sort of look-up table mapping localized URL fragments to “real” canonical ones. This should be fairly simple with Zend Framework (although I haven’t actually tried yet).
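
A minimal sketch of the look-up table idea, written in Python for brevity rather than against Zend Framework's actual router. The table contents, the `canonicalize` function name, and the assumption that the interface language is already known (from a session or hostname, say) are all hypothetical. Only the first two fragments (controller and action) are translated; anything after that is treated as a parameter.

```python
# Hypothetical table: interface language -> localized fragment -> canonical fragment.
LOCALIZED_FRAGMENTS = {
    "de": {"lektion": "lesson", "neu": "new"},
}


def canonicalize(path: str, ui_language: str) -> str:
    """Map a localized pretty URL onto the canonical controller/action path."""
    table = LOCALIZED_FRAGMENTS.get(ui_language, {})
    parts = path.strip("/").split("/")
    # Translate only the controller and action; later fragments such as the
    # lesson language code "fi" are parameters and pass through untouched.
    translated = [
        table.get(part, part) if index < 2 else part
        for index, part in enumerate(parts)
    ]
    return "/" + "/".join(translated)


print(canonicalize("/lektion/neu/fi", "de"))
```

The canonical path can then be handed to the normal routing step, so there is still only one lesson controller to maintain.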

Just an important issue no-one seems to have brought up yet…

Charles Dunstone’s email address

Oliver Brown

Some of the people trying to cancel their TalkTalk subscriptions have apparently had success emailing Charles Dunstone, Carphone Warehouse’s CEO, directly. Two email addresses have been suggested. I haven’t personally tried either so I can’t guarantee any sort of result.

Unlimited is getting sillier

Oliver Brown

The Metro (the only newspaper I read, since it’s free) had an advert from ntl:Telewest for their broadband service that said:

Cable broadband has unlimited downloads…unlike BT, Sky, Orange or TalkTalk

BT and Sky at least (and possibly Orange, not sure) definitely offer “unlimited” packages that are restricted by a fair use policy. The fine print at the bottom of the ad said that ntl:Telewest also had a fair use policy. So how exactly are they “unlike” each other?

On a related note, the number of replies with TalkTalk complaints is getting really high. The irony is none of the broadband companies have tried competing on quality of service yet.

Galaxia ♠ Renaissance

Oliver Brown

Galaxia as a game may be dead, but it’s definitely not forgotten. Head on over to Galaxia ♠ Renaissance to read a lengthy (and getting lengthier) piece of fiction by former Galaxia player, Ashley Rayburn. It contains most of the well known players, the Consortium, the UGC and, of course, yours truly, Q. If you don’t know what Galaxia is, it won’t mean much to you, but go read it anyway. It’s funny.

Amazon Mechanical Turk

Oliver Brown

It’s so crazy it just might work.

I heard about AMT a while ago and thought it looked cool. But not much was happening with it.

Well now it’s beginning to take off more and it might be usable in my language app.

It’s essentially a work marketplace wrapped in a web service API. Your application creates a job request (called a Human Intelligence Task) which someone then completes with the result being sent back to your application. So far it’s commonly used for processing lots of small tasks (for example there’s one about verifying info about some restaurants that only pays $0.03 but there are over two thousand individual tasks available), but it can be used for anything.

The relevance is that it might be possible to get people to record audio for the language app through Amazon Mechanical Turk.

Using XHTML, XSLT and XForms for Xemplorary performance

Oliver Brown

Alliteration and bad pun. Good start :)

One of the features the language app will need is some sort of module editor. Although the XML format of the scripts is straightforward to anyone used to hand editing HTML, a lot of other people will not have a clue. Therefore a WYSIWYG editor would be a cool addition. And lots of X’s may be the way to go.

Although XForms support in browsers isn’t exactly stellar, the fact that only script editors will require it means that needing a plug-in or extension isn’t such a big thing. And I get brownie points for being Web 2.0 as well.

I’m going to assume you know what XForms and XSLT are. If you don’t, then go find out. I’ll probably explain in a future post, but for now just accept them as “cool” :P

Basically a module is included directly into the XHTML source of the page. The only change is the addition of a namespace declaration (which is normally absent from the modules). XSLT is then used to add some nice formatting to the conversation along with XForms stuff for editing (including adding/removing elements). This makes the server side code really easy since the whole XML of the module gets posted back to the server.

In theory the XSLT shouldn’t be needed since XForms can do repeating and stuff. The only problem is I don’t think it can handle recursion which is a bit of a limitation.

There is one bit of the XSL that I’m stuck on. I have the XML fragment in the head of the XHTML document. I need to be able to transform a copy of it and place it in the document body, but keep the original intact in the head. Does anyone have an XSL snippet to do that?

Almost ready for a public viewing

Oliver Brown

The still unnamed language learning app is almost ready for a first public viewing. I’m just trying to get some audio of someone other than myself. Firstly because I don’t really like hearing my own voice (and for this purpose my less than perfect pronunciation is too obvious) and secondly because I need at least two people just for it not to be confusing.

In the meantime I thought I’d share an example of the script file I’m using: EntschuldigenSie.xml. It primarily contains English translations although one phrase is done in a few more languages. It does highlight one possible issue. I had to change the German ß to ss. Although Windows seems perfectly fine with Unicode file names (internally it uses Unicode for storage (either UCS2 or UTF-16 - not sure which)) PHP refuses to open them (fopen, file and file_exists for instance just don’t work) and Apache 2 seems to have issues as well. For German there are workarounds but for other languages it will get fiddly. This might not even be a problem on Linux where it will ultimately reside and it only affects file names which only have to give you a rough idea of what’s inside. But still, it’s annoying…
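
One possible shape for the German workaround, sketched in Python rather than PHP. The `ascii_filename` name and the replacement table are assumptions for illustration, not what the app actually does. German letters get their conventional ASCII spellings, other accented letters lose their accents, and anything that still isn't ASCII becomes an underscore.

```python
import unicodedata

# Hypothetical replacement table using the conventional German ASCII spellings.
GERMAN_MAP = {
    "ß": "ss",
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
}


def ascii_filename(name: str) -> str:
    """Return an ASCII-only approximation of a file name."""
    out = []
    for ch in name:
        if ch in GERMAN_MAP:
            out.append(GERMAN_MAP[ch])
            continue
        # Decompose accented letters and drop the combining marks.
        decomposed = unicodedata.normalize("NFKD", ch)
        stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
        # Fall back to an underscore for anything that is still not ASCII.
        out.append(stripped if stripped and stripped.isascii() else "_")
    return "".join(out)


print(ascii_filename("Straße.xml"))
```

The fiddly part shows up straight away: the same table turns a Finnish ä into "ae", which is wrong for Finnish, so a real version would need a table per language.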

Best bits of the language app are done

Oliver Brown

The most important bits of my cool language learning web app are done. Here’s a quick overview of how it works.

Everything is split into modules which are XML script files and accompanying audio files. Currently one type of script is supported, a “conversation”. This contains a short (less than 10 sentences) conversation with sub elements all marked up in XML. Sub elements are phrases, terms and notes. At the moment phrases and terms are handled almost identically. Notes are little explanations or possible stumbling points (for example the test script I have alerts the listener to the difference in the ending between “Ich verstehe” and “Sie verstehe_n_” in German). Any element of a conversation that is to be repeated is named (literally - the XML tag is given a name attribute). The system keeps track of the number of times a named phrase/term is played to the user and when it was last played so the automatic repetition system can work.
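
A rough sketch of the named-element bookkeeping, in Python for illustration. The tag names (`conversation`, `phrase`, `term`, `note`), the sample module, and the `PlayRecord` structure are all guesses; only the `name` attribute and the two tracked values (play count and last played time) come from the description above.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical module; the real script format may use different tags.
SAMPLE_MODULE = """
<conversation>
  <phrase name="ich-verstehe">Ich verstehe</phrase>
  <phrase>Guten Tag</phrase>
  <term name="sie">Sie</term>
  <note>Watch the ending: "Ich verstehe" but "Sie verstehen".</note>
</conversation>
"""


@dataclass
class PlayRecord:
    times_played: int = 0
    last_played: Optional[float] = None  # timestamp of the most recent play


def named_items(xml_text: str) -> List[str]:
    """Collect the names of phrases and terms marked for repetition."""
    root = ET.fromstring(xml_text.strip())
    return [
        el.get("name")
        for el in root.iter()
        if el.tag in ("phrase", "term") and el.get("name")
    ]


def record_play(history: Dict[str, PlayRecord], name: str, now: float) -> None:
    """Update the per-user history after a named item has been played."""
    record = history.setdefault(name, PlayRecord())
    record.times_played += 1
    record.last_played = now


history: Dict[str, PlayRecord] = {}
for item in named_items(SAMPLE_MODULE):
    record_play(history, item, now=1000.0)
print(history)
```

Unnamed phrases and notes are simply skipped, so they are played as part of the conversation but never scheduled for repetition.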

A lesson is currently very simple. A module is loaded and the conversation is played straight through. Then the named phrases/terms are played* with translations. Then any phrases/terms scheduled for repetition are played*. The repetitions are actually determined before the conversation is played however so that if too many are required then no new conversation is played.

* Played in this case means a specific format. First the native version is played, then a pause, then the translation is played twice.
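
The lesson flow can be sketched as a function that builds a playlist. This is Python for illustration only; the `MAX_REPETITIONS` threshold, the function names, and the tuple-based step format are assumptions, since the post doesn't say how many repetitions count as "too many".

```python
from typing import List, Optional, Tuple

Step = Tuple[str, str]

MAX_REPETITIONS = 5  # assumed cut-off; the real value is not stated


def play_with_translation(name: str) -> List[Step]:
    """The "played" format: native version, a pause, then the translation twice."""
    return [
        ("native", name),
        ("pause", name),
        ("translation", name),
        ("translation", name),
    ]


def build_lesson(
    due: List[str],
    new_conversation: Optional[str],
    new_named: List[str],
    max_repetitions: int = MAX_REPETITIONS,
) -> List[Step]:
    """Build the ordered steps for one lesson."""
    steps: List[Step] = []
    # The repetitions are known up front, so the decision to skip the new
    # conversation is made before anything is played.
    if new_conversation is not None and len(due) <= max_repetitions:
        steps.append(("conversation", new_conversation))
        for name in new_named:
            steps.extend(play_with_translation(name))
    for name in due:
        steps.extend(play_with_translation(name))
    return steps


for step in build_lesson(["sie"], "EntschuldigenSie", ["ich-verstehe"]):
    print(step)
```

With more due repetitions than the threshold allows, the same call returns only the repetition steps and the new conversation waits for a later lesson.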