Monday 4 August 2008

Announcing the Self-Documenting Code Contest

Programs must be written for people to read, and only incidentally for machines to execute.

-- Abelson & Sussman, Structure and Interpretation of Computer Programs

The Obfuscated C contest has always annoyed me.

If programs are for communicating with other programmers, why do we have a contest that encourages such complete perversions of best practises? (Ok, that's really a rhetorical question - I know perfectly well how much fun it can be to put hours of work into one devious, unreadable block of line-noise.)

But then I thought... someone should make another competition. One that encourages the opposite - learning to make
better code. A competition in which points are awarded for code that's simple, easy to read, more straightforward.

And here we are. As of today, Monday the 4th of August 2008, The Self-Documenting Code Contest (SDCC) is now officially open. I will be accepting submissions all month, until Monday the 1st of September.
EDIT: Correction - I will be accepting submissions until Monday the 18th of August.

The rules:

1) NO COMMENTS.
This contest is about self-documenting
code. In other words, your goal is to write a program whose logic is so obvious, it is its own documentation.
You may not use comments in your program. Nor may you emulate comments by using a purely decorative string value, or a paragraph-long identifier, etc.

2) Any language.
You may use any programming language - but you might want to bear in mind that if a judge is familiar with your chosen language, s/he will probably find it easier to read.

Currently the panel consists of 5 judges, coming from a wide range of programming backgrounds - academic, game programming, web programming, etc... and just to mix it up, my wife, who has no programming experience.

The judges have one thing in common: their knowledge of English. So ideally, that's your target: a program that can be understood by anyone who speaks English.

Unrealistic? Perhaps. But those are the rules everyone is working under.


Your objective:
Write a program that generates all two-word anagrams of the string "documenting". Here's a word list you might want to use: http://pragdave.pragprog.com/data/wordlist.txt.

When you're done, send the results to selfdocumenting@hotmail.com.

Good luck!
--
Laurie Cheers

33 comments:

Stefan Ciobaca said...

Are you sure this isn't someones homework? Can you send more than one solution?

Anonymous said...

Might I recommend posting a modified wordlist that doesn't include words with punctuation (e.g. hyphens, apostrophes)? The inclusion of symbols has bitten me twice so far.

Laurie Cheers said...

No, it's nobody's homework. :)

And sure, send as many solutions as you want.

PK said...

But does the letter case matter?
And if two words, say word1 and word2, are anagrams, word2 and word1 should be anagrams as well?

Unknown said...

I know what I'm doing tonight.

schipps
http://GirlDeveloper.com

Martin Ankerl said...

Hi, what about runtime constraints? I have a nice ruby solution that takes probably about two hours :-)

tst__ said...

I've made a version in Python. Runtime: 160 seconds on a Macbook Pro (2,2 GHz; 2GB Ram) only used one core.

Martin Ankerl said...

Hi again, I have just submitted a simple Ruby solution that takes about 8 seconds with Ruby, 4 seconds with JRuby. I have tried to make it as simple as possible but still fast enough.

tst__ said...

Wow, that's fast for ruby *scnr* ;)
New solution in Python: 1.08 seconds :p

Martin Ankerl said...

Its a readability contest, so who cares about speed ;-)

tst__ said...

True, but that's fun :)

H. Wilson said...

What, no prize?

EvilTeach said...

This is a fun little problem. Two clarifications though please.

Assume we are solving the pattern
"codcod".

Assume "doc" is in the wordlist.

Would "doc doc" be an expected solution? In fact, should it be expected to occur twice?

This effects the way the loops should be coded.

GrayShade said...

Must we use all the letters in the word?

I. e., is "int em" a solution?

tst__ said...
This comment has been removed by the author.
tst__ said...

evilteach:
You can't split "documenting" into two equal words, so don't worry ;)

grayshade:
wikipedia: "An anagram is a type of word play, the result of rearranging the letters of a word or phrase to produce a new word or phrase, *using all the original letters exactly once*"

EvilTeach said...

Thank you tst. I am aware of that.

How about it Laurie?

Is the program required to return both TUNED COMING and COMING TUNED,

or will one or the other suffice?

Laurie Cheers said...

Either is fine. Some people's entries are printing both pairs (coming tuned, and tuned coming), and others just one.

And I'm sorry people are feeling the need to trade-off readability for speed: I didn't intend that to be a factor. I guess I should have picked a problem that was trivial to compute.

The competition is intended to be solely about readability. Feel free to ignore efficiency, within reason.
(I expect you to at least run the program. So one that takes 100000 years is out).

David said...

for wordOne in wordList:
  for wordTwo in wordList:
    print wordOne, wordTwo

Prints all two-word anagrams of "dictionary." You may want to require that people print all and only two-word anagrams of "dictionary."

EvilTeach said...

Humm.... what happened to the rest of the posts?

I asked laurie another question the other day, and it is gone now.

Eric I. said...

I have a Ruby solution that comes in 1.78 seconds on a MacBook Pro (2.33 GHz).

I use a data structure encapsulated in a class to help make it run efficiently. But that class seems to move me away from realizing the stated goal of this contest. It's one more thing to explain/self-document without the use of comments.

Hmmmmm.......

mgsloan said...

I've got two implementations in haskell.

One takes about two seconds, and displays results as they are found. I think it might actually start processing while the wordlist is loading, too..

The other is so fast it's not really measurable. Kinda cheating though. Just prints out the results discovered via the previous method.

Martin Ankerl said...

With a bit of tweaking I have now a Ruby version that takes 0.532 seconds (Ruby 1.8.6, Linux, 2.2GHz). But it is very ugly code :)

For the results of this competition it would be very interesting to have a graph where one axis is the readability, the other axis is performance.

Martin Dobmeier said...

That's a really interesting AND difficult contest! I've been trying to code up something in Java and it's amazing how verbose Java can be sometimes ;). Especially in light of a challenge like this. Futhermore, I've found it quite difficult to strike the balance between readability and clutter, i.e. you could create a method with a meaningful name for every step of you algorithm, but that makes things even worse in my opinion.

Well, I'm still trying to improve my program. Btw, it completes in about 0.3 seconds...

Martin Ankerl said...

Why do my posts disappear? I have now a Ruby version that takes 0.165 seconds, without cheating in any way. Who said Ruby is slow? :-)
Unfortunately it is quite ugly and difficult to understand without comments.

Try it yourself, the code is here: http://gist.github.com/4667

Unknown said...

Why reinvent the wheel? Programming is inherently designed for machines. Natural or technical languages, in all cases it will get interpreted to machine code.

The complexity of code/documentation is dictated by the complexity of logic. By properly formulating your code logic and keepings things simple, a piece of code can be very easily read with minimal additional commenting required.

If you want to use a coding structure that more resemble human literature, I recommend rather simply creating a DSL (Domain Specific Language). I highly recommend ANTLR.

jkff said...

Hi, the wordlist.txt link is not working.

Sammy Larbi said...

Define "paragraph."

I like long variable names. =)

Sammy Larbi said...

To those looking for PragDave's missing word list: you can find it through the Internet Archive Wayback Machine at:

http://web.archive.org/web/20080120085902/http://pragdave.pragprog.com/data/wordlist.txt

Sammy Larbi said...

Sorry, that seems to have been cut off.

Here's PragDave's word list

Unknown said...

Unfortunately, I did not come across this post until tonight. Nevertheless, here is an example which I think unquestionably would have won, using Revolution (http://www.runrev.com). It runs in about 1 second.

http://revuser.com/readable.htm

Unknown said...

Does anyone have the word list floating around still please?

unnome said...

for the wordlist -->

https://gist.githubusercontent.com/calvinmetcalf/084ab003b295ee70c8fc/raw/314abfdc74b50f45f3dbbfa169892eff08f940f2/wordlist.txt