Tag Archives: programming

SoundEx

A couple of days ago, it was the 70th birthday of Donald Knuth (or rather, his webpage), who, as you may have found out, is a hero to many programmers. I think he’s still working on completing his masterwork, ‘The Art of Computer Programming’. Anyway, his birthday has not gone unnoticed in the blogosphere. It’s almost like everybody knows each other. No, really. Really.

That brings me to something (slightly) related: I was using DailyMotion to find older clips of the ’80s band Blondie, and the search results on that site always seem to include references to movies with blondes; some of them quite, let’s say, exquisite. I won’t link to a URL of such, but I encourage you to try it out. Obviously, DailyMotion is using a ‘soundex’ routine: this is an algorithm that indexes keywords by sound, based on (language-specific) rules. This works brilliantly for looking up people’s last names (or even first names), but not for searching for specific terms like the example I just mentioned.
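To see why that happens, here is a minimal sketch of the classic American Soundex in Python. It only illustrates the general idea: nothing here is known about what DailyMotion actually runs, and the function name `soundex` and the table name `CODES` are made up for this example.

```python
# Letter groups of classic American Soundex; vowels, Y, H and W get no digit.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the four-character Soundex code of a word (hypothetical helper)."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = [letters[0]]                # the first letter is kept as-is
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue                     # H and W do not separate equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            result.append(code)          # new consonant sound
        prev = code                      # a vowel resets the previous code
    return "".join(result)[:4].ljust(4, "0")


for name in ("Blondie", "Blonde", "Blond", "Bland", "Britney"):
    print(name, soundex(name))
```

Under this scheme ‘Blondie’, ‘Blonde’, ‘Blond’ and ‘Bland’ all collapse to the same code (B453), which is great for surnames and not so great for band names.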

Seriously: if I’m searching for ‘Blondie’, I’m expecting results for Blondie and not ‘Blond’, ‘Blonde’, ‘Blondes’, ‘Bland’, ‘Britney Spears’. And definitely not ‘2 h0t bl0nd3s k1ss1ng 3ach 0ther’ [1].

[1] Obviously, I used ‘some encryption’ there to ensure that your kids don’t end up on this kid-safe website when they google for ‘Blondie’. You’re welcome.

Round

Last night, I decided to follow the Iowa Caucus thing: apparently Obama and Huckabee are now considered the Democratic and Republican front runners, respectively. I also tuned in to hear their speeches and thought they were dull and boring. This brings me to the actual topic of this posting: over at the BBC, readers discuss the surprising win of the two White House hopefuls, and the following comment just stuck out (and trust me, I didn’t take it out of context). I took the liberty of highlighting the ‘offending’ portion:

As a Ron Paul supporter, I feel neutral about the Iowa results. We weren’t expecting to do well there, we beat Guiliani, and were only 3% away from 3rd place. On the other hand, 5th place people don’t get talked about. I am optomistic[sic] about Wyoming and to a lesser extend New Hampshire.

This is like saying that I was born 10% earlier than my twin brother. And I’m 45 grams sure he would protest that fact, since he has always claimed that it was he who kicked me out of my mother’s womb because he thought ‘I needed to grow up and get a life’ (which I eventually did for 73% of my life, which is 3 grams more than him).

3 percent of what, again? Maybe some people should be barred from using statistics and percentages.
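To make the point concrete, here is a small Python sketch with made-up round numbers. The 10% and 13% shares are assumptions for illustration only, not the actual caucus results, and the helper names are hypothetical too. A gap between two vote shares is measured in percentage points; expressed as a percentage, it depends entirely on which share you take as the base.

```python
def point_gap(share_a, share_b):
    """Gap between two vote shares, in percentage points."""
    return abs(share_a - share_b)


def relative_gap(share, base):
    """The same gap expressed as a percentage of `base`."""
    return 100 * abs(share - base) / base


fifth, third = 10, 13  # hypothetical vote shares, in percent

print(point_gap(fifth, third))      # the gap in percentage points
print(relative_gap(third, fifth))   # third place, relative to fifth place
print(relative_gap(fifth, third))   # fifth place, relative to third place
```

Three different numbers come out of one and the same gap, which is why ‘3% away’ says very little without naming the base.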

Way earlier, I read this brilliant UnCov commentary about Pownce, which is a Web 2.0 site that allows you to send ‘stuff’ to your loved ones. Which is more or less like sending mail with attachments, nonetheless. Too bad Pownce’s lead programmer pulled the post about ‘how to do rounding of floating point numbers’: it would be interesting to know why a lead programmer would round floating point numbers using strings (the pulled article had interesting comments on how to do this nicely using simple math, but alas).
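The pulled article is gone, so what follows is only a guess at the kind of thing those commenters meant: a minimal Python sketch of rounding with plain arithmetic instead of string formatting. The function name `round_half_up` and the choice of the ‘half rounds up’ rule are assumptions made for this example.

```python
import math


def round_half_up(value, decimals=0):
    """Round to `decimals` places using arithmetic only; halves go up."""
    factor = 10 ** decimals
    # Shift the digits of interest in front of the decimal point,
    # add one half, chop off the fraction, and shift back again.
    return math.floor(value * factor + 0.5) / factor


print(round_half_up(3.14159, 2))
print(round_half_up(2.5))
print(round_half_up(1.25, 1))
```

One caveat: binary floating point cannot represent most decimal fractions exactly, so a value like 2.675 is actually stored slightly below 2.675 and may round down rather than up. When that matters, Python’s `decimal` module is the safer route.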

Any.links

An assortment of links:

I was reading this article at More Intelligent Life written by Enid Stubin (who appears to be an assistant professor of English) and the first comment in the comment section literally says:

Brilliant. This chick can actually write.

Behold: the future of the Internet!

3 Quarks Daily links to a two-hour discussion between Dennett, Dawkins, Harris and Hitchens (in two parts). If you have some time and have read some of the material of any of the authors, you may find their commentary on religion and current events interesting.

So, Movable Type has gone open-source: the announcement was made this summer, but the official sources were only (finally) published a couple of weeks ago. You may remember that earlier versions of xsamplex ran on Movable Type 3. Heck, you can even steal my original MT template!

A couple of months ago, I happened to run into the sources of MyJabber IM: much to my surprise, the code had been open-sourced (GPL). It appears that the MyJabber site doesn’t exist anymore, so I assume that the (original) programmer’s goal was to ensure the program continues to live on. Good choice: the code is a bit ‘old-fashioned’ and relies heavily on the agsXMPP library (which is dual-licensed). [Note: this is all C# stuff]

And finally (as in: the last paragraph), Toshiba has developed a Micro Nuclear Reactor, which measures only 20 by 6 feet and can deliver up to 200 kW. I read that the reactor is self-sustaining and should last more than 40 years. If you want one of these things, you can apply for one at your local security agency. People with last names that rhyme with ‘laden’ do not need to apply. If you have a certificate in theoretical quantum physics, that is a plus. Oh: and you may need to allow UN inspectors on your property.

Sources

I was reading this posting at Slashdot (“OSS Music composer gaining attention”), which is about a developer who has started a Buzz-like music ‘tracker’ in C#. The part that caught my eye in the linked article was the following paragraph (italics mine):

The day the source code to Buzz got lost was a very sad day and there was absolutely nothing anyone could do. We’d just had an updated version of Buzz released and suddenly everyone realised there would *never* be another one.

Then I went back to the development log of Rosegarden (that outstanding MIDI composer for KDE/Linux), which reminds us that there was indeed a Windows branch:

1995-1996: Andy makes a sibling version of Rosegarden for Microsoft Windows, adding a significant amount of extra sequencer functionality. Then he loses the only copy of the source code in a hard disc crash. You can still have the 32-bit binaries if you like, but they might not work. Don’t come crying to us if you blow up your computer.

I think I have exactly one backup of my oldest sources (covering 1995 to 2002), which has been put on (exactly) one 700 MB CD-ROM. Compressed. Then, when working on my first Toshiba (2004?), all of my sources from then on were stored in a personal folder called ‘Sources’ (how original) with many (many) subfolders, all of them containing some sort of project, library or explanation. This folder has moved with me since then and currently takes up 3.5 GB of space. Naturally, I always include the executables too (that is in case I ever lose my sources [1]). And what not.

If you just started programming and you think you’re a hotshot: think about the megabytes of code you can write in the next 10 years. Oh: and don’t forget to make backups too.

[1] I did lose code over the years: missing in action are the original WordPlay/Scrabble server (PHP, this is the one I once demoed to explain separation of UI, code and data), an NNTP statistics collector (Python, this one actually worked too and I have no idea why I wrote it) and a directory synchronizer (Python too). I recently recovered that last one though (sheer luck) when cleaning up a directory on a ‘free hosting’ server (that was the same day I wrote code to extract passwords from a popular FTP program): the code actually still works, but I have no idea how or why I actually wrote it. I don’t understand the code anymore either, which is worse than actually losing code.

I can do this too…

Yes, so I needed to get into an older site and I forgot the password, because, well, back in 2000, I used an FTP program (FTP Explorer) that (at that time) was pretty popular. So, late last year, I was transferring data from the oldest laptop (Proteus) which had a copy of that FTP program. Before wiping the drive (refurbish, refurbish!), I was smart enough to preserve the FTP program’s registry entries and save them for future reference, hoping I might be able to break the algorithm one day.

Not long after, I ran into some code that purportedly showed the algorithm to decrypt the FTP program’s entries and passwords, albeit written entirely in Perl (link to Google cache). Luckily, I have no problems understanding Perl and it only took a couple of minutes to rewrite it in C#.

So, there you have it: you can find the C# sources right here. Coding was done in SharpDevelop. Note that you need to compile and build the executable yourself and, if you want, you can probably rewrite it in your other favourite programming language. Whatever.

That said, it was fun while it lasted. Ta-daa old website.