This year’s challenge: ridiculous performance degradation

The object of the 2006 challenge: write a program that performs a rudimentary text-processing task yet inexplicably takes a very long time to run on one specific operating system.

Specifically, you must write a program that scans a text file, separates and sorts the words, and provides a frequency count of each word. Essentially you must mimic the output of the command pipe:

tr "[:space:]" "[\n*]" | sort | awk 'length($0)>0' | uniq -c

For no observable reason, your program must make one particular operating system look bad. You get to choose the operating system.

The winners

Many entries exploited endian differences between common platforms, causing a loop to run far longer on one processor than on another.

Winner: Bobby Brogan

This submission exploited a simple and very common coding mistake:

long unsigned int maxwordsize(char *inputFromStdIn)
{
	long unsigned int tmpwordsize=0, maxword=1, i;

	for (i=0; i <= strlen(inputFromStdIn); ++i, ++tmpwordsize)
		[etc etc]

The text file is loaded into a single string, and this routine determines the maximum buffer length needed per word. It is a single loop, but with strlen() in the loop condition, every iteration rescans the entire string, giving the routine quadratic complexity. The kicker: gcc on GNU/Linux hoists the strlen() call out of the loop. Thus you have a program that runs in linear time on GNU/Linux and in quadratic time on Windows, which is just enough to create excruciating delays with reasonable input sizes.