Sunday, 7 September 2008

The Results!

My apologies for taking this long to finish up... but at long last, the results of the Self-Documenting Code Contest are now in. You can see them below.

First things first - although a programming contest judged by a non-programmer sounds good on paper, it turns out to be a nightmare in practice. I think it's hard for us programmers to remember just how far removed programming is from ordinary language.

Initially, Sylvia had trouble understanding even the clearest of the entries. To get meaningful results I had to sit with her and give hints: "think of it as a recipe, or a series of instructions"; "print" means producing output; "string" means a piece of text; and (in most languages) an equals sign, unlike the familiar mathematical concept, does not mean true, timeless equality.

She also noted that the consistency of her scores was pretty low, since as she went through the entries she was learning more and recognizing constructs from previous ones. And finally she also complained that the whole process was really exhausting, like trying to follow a conversation in a foreign language you barely know.

So given all these problems, we hit on a compromise: Sylvia would only score the best entries, and her results would just be used as a tiebreaker - to separate entries that would otherwise be equal. (There are a lot of these. See below.) So rather than a number out of 10, her scores are in stars, out of 5.

A blog post isn't really the ideal place for giving details of every single entry. So I've put up the detailed results... in the form of a subreddit!

Check out sdcc1.reddit.com. This is a normal subreddit, so I hope to see everyone giving lots of comments and votes here. It'll be particularly interesting to see whether the final reddit-score matches the judges' opinions. (If I'd thought of this idea originally, I would have had the entire contest hosted here, with entrants simply posting their entries themselves. Maybe next time...)

Anyway, each of the 3 main judges gave each entry a score out of 10, for a total score out of 30. Sylvia then gave a score in stars, for separating entries that got the same score. Here are the top-rated entries:

18th: Leonardo, entry 1 [22**]
Equal 16th: Diego Essaya [22***]
Equal 16th: Tom Newton [22***]
15th: Kari Hoijarvi [22****]
14th: Mathias Hallman [23**]
13th: Tim [23****]
12th: John Evans [23*****]
11th: Chris Rebert [24**]
Equal 7th: Kristian Stangeland [24***]
Equal 7th: Fabien Le Lez [24***]
Equal 7th: Apfelmus [24***]
Equal 7th: Thomas Annadale [24***]
6th: Michael Sloan [24****]
5th: Reinout Heeck [25*]
4th: Kieran Elby [25****]

Congratulations to all - the field was pretty tough.
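
The ranking rule described above (judges' total first, Sylvia's stars only as a tiebreaker) can be sketched in a few lines of Python 3. This is an illustration, not the script actually used to produce the list; the name ranking_key is invented, and the four rows are copied from the results above.

```python
# (name, judges' total out of 30, Sylvia's stars out of 5), copied from the list above
entries = [
    ("Michael Sloan", 24, 4),
    ("Kieran Elby", 25, 4),
    ("Chris Rebert", 24, 2),
    ("Reinout Heeck", 25, 1),
]

def ranking_key(entry):
    # Hypothetical helper: compare by judges' total first, then by stars.
    name, judges_total, stars = entry
    return (judges_total, stars)

# Highest total first; stars only matter when the totals are equal.
ranked = sorted(entries, key=ranking_key, reverse=True)

for name, judges_total, stars in ranked:
    print(name, judges_total, "*" * stars)
```

Because Python compares tuples element by element, the stars are only consulted when two totals are equal, which is exactly the tiebreaker behaviour.
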
Now, let's take a closer look at the top 3. All of these scored a 9-8-9 from the judges:

3rd. Martin Ankerl, entry 2 [26***]
#!/usr/bin/ruby

class String
  def letters
    split("").sort
  end
end

query_letters = "documenting".letters
words = File.read('wordlist.txt').downcase.split

candidates = words.select do |word|
  (word.letters - query_letters).empty?
end

candidates.each do |first|
  candidates.each do |second|
    combined = first + second
    puts "#{first} #{second}" if combined.letters == query_letters
  end
end
An excellent entry in Ruby. Some very clear string interpolation, and the perfect place for a deferred if. However, the line
(word.letters - query_letters).empty?
was too programmery for Sylvia to follow, and she also had trouble with some of the less-clear English - such as downcase.split, and the class String block at the beginning.
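
For readers who trip over the same line: it keeps only the words whose letters all occur somewhere in "documenting". A rough Python 3 equivalent is sketched below. The function name uses_only_letters_from is made up for this illustration and is not part of Martin's entry.

```python
def uses_only_letters_from(word, query):
    # Ruby's Array#- drops every element that also appears in the other array,
    # so an empty result means no letter of `word` falls outside `query`.
    # Letter counts are ignored, just as in the Ruby version.
    return set(word) <= set(query)

print(uses_only_letters_from("dog", "documenting"))   # True
print(uses_only_letters_from("good", "documenting"))  # True: the repeated "o" is not counted
print(uses_only_letters_from("cat", "documenting"))   # False: "a" is not in the query
```

The filter is deliberately loose. It only shrinks the word list, and the exact letter-for-letter check happens later, when the combined letters are compared with query_letters.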


2nd. Arnar Birgisson [26****]
from urllib import urlopen

source_word = 'documenting'
wordlist_url = ''

wordlist = urlopen(wordlist_url).read().split('\n')
source_sorted = sorted(source_word)

for word1 in wordlist:
    for word2 in wordlist:

        if word2 < word1:
            continue

        combination_sorted = sorted(word1 + word2)

        if combination_sorted == source_sorted:
            print word1, word2
            print word2, word1
Another excellent entry, this time in Python. The logic really is self-explanatory here.

And the winner of the Self-Documenting Code Contest is...

1st. Ian Davis [26*****]
word_to_anagram = "documenting"

def remove_trailing_whitespace(text):
    return text.rstrip()

word_list = [ remove_trailing_whitespace(line) for line in file("wordlist.txt") ]

for first_word in word_list:
    for second_word in word_list:
        if sorted(word_to_anagram) == sorted(first_word + second_word):
            print word_to_anagram, "=", first_word, "+", second_word

A very short and extremely clear entry. No unnecessary work. I was particularly impressed with the line that generates word_list; it's crystal clear what's happening there. The output format is a little more complex than it needs to be, but it's still perfectly easy to follow.

Congratulations, Ian! I think you just won yourself an Internet or two.

5 comments:

Unknown said...

It seems you didn't get my mail... :(

> > Hi Laurie,
> >
> > here's my entry for the SDCC.
>
>
> Hi, could you resend in a different format? Plain text is fine.
> Hotmail is a bit rubbish and won't let me download files that it can't
> virus scan...

Sorry, I saw your reply just now - I switched to Google Mail, and web.de didn't forward this mail to it.
New address is [...]
I've uploaded the entry here: tinyurl{dot}com/632rm5

Laurie Cheers said...

Ah... I'm very sorry about that. For some reason I had the idea you had already re-sent it.

Of course it's too late to add it now... but feel free to add it to sdcc1.reddit.com yourself.

Unknown said...

Unfortunately, I did not come across this post until tonight. Nevertheless, here is an example which I think unquestionably would have won, using Revolution (http://www.runrev.com). It runs in about 1 second.

http://revuser.com/readable.htm

Tal-N said...

Hey Laurie, long time no see. Been trying to find a way to contact you to catch up. If you want to get hold of me to chat you can reach me at chrisblane@hotmail.com

- Tal-N

cracker said...

I am really dissatisfied by the submissions of this contest. I did not find the code to be self-documenting at all.

In very few of the submissions did I even see the use of the word anagram. How can you not use the domain language in your code?

Everyone has written code like "combination_sorted == source_sorted" when making a function like "isAnagram(combination_sorted, source_sorted)" would make the code more readable.

Sorting is done, but just looking at it gives no understanding of why it is done; it's only after you have read much of the code that you understand its purpose.

You have to read the entire code before it becomes obvious to you what it is doing. There are no functions that wrap the code into logical, understandable sub modules.

In self-documenting code you almost never have to see the implementation. The interfaces and abstractions make it obvious what is going on without actually reading the how. But in none of the submissions do I see any level of indirection used to model the problem.
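
The suggestion in this comment can be made concrete. What follows is only a sketch in Python 3, not any entrant's code: the names is_anagram and two_word_anagrams and the three-word sample list are invented for illustration, standing in for a real wordlist.txt.

```python
def is_anagram(candidate, target):
    # Two strings are anagrams when they contain the same letters
    # the same number of times, so their sorted letters are equal.
    return sorted(candidate) == sorted(target)

def two_word_anagrams(target, word_list):
    # Yield every ordered pair of words whose combined letters
    # can be rearranged into `target`.
    for first_word in word_list:
        for second_word in word_list:
            if is_anagram(first_word + second_word, target):
                yield first_word, second_word

# Tiny made-up word list, used here in place of wordlist.txt.
sample_words = ["gnome", "induct", "cat"]

for first_word, second_word in two_word_anagrams("documenting", sample_words):
    print(first_word, second_word)
```

The sorting still happens, but it now sits behind a name taken from the problem domain, which is the level of indirection the comment asks for. Whether that reads better to a non-programmer than the winning entry's single comparison is a separate question.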