CCC Blog

Catalyzing the computing research community and enabling the pursuit of innovative, high-impact research.

The goal of the Computing Community Consortium (CCC) is to catalyze the computing research community to debate longer range, more audacious research challenges; to build consensus around research visions; to evolve the most promising visions toward clearly defined initiatives; and to work with the funding organizations to move challenges and visions toward funding initiatives. The purpose of this blog is to provide a more immediate, online mechanism for dissemination of visioning concepts and community discussion/debate about them.

“March Madness Algorithm Overlords”

March 26th, 2011 / in research horizons, Research News / by Erwin Gianchandani

Many of us have filled out our share of March Madness brackets, competing in leagues or pools to see who can best predict the outcome of the annual NCAA Tournament. For the second year in a row, University of Toronto machine learning Ph.D. student Danny Tarlow has organized and run one such pool. But what makes Tarlow’s pool unique, and noteworthy here, is that all entries are computer-generated: the brackets are completed by computer algorithms working from historical data, without any human judgment. Tarlow calls it the March Madness Predictive Analytics Challenge, and the rules he’s defined tell the story:

Your bracket must be chosen completely by a computer algorithm.

The computer algorithm must base the decision upon historical data.

You may not hard code selections into your algorithm (e.g., “Always pick Stanford over Cal”).

Your algorithm may only use the data set published for the tournament. The data will be released on Sunday, March 13.

The above rule is fairly restricting, but I believe this provides a more even playing field. The contest should be about your algorithm’s predictive capabilities and not a data advantage one person has over another.

You must be able to provide code that shows how your entry picks the winners. In other words, your bracket and the selection of winning teams in your bracket must be reproducible by me on a machine.
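
To make the rules concrete, here is a minimal sketch of what a compliant entry could look like. It is not the method of any actual contestant. It assumes the historical data can be reduced to a list of (team, score, team, score) game results, rates each team by its average point margin, and then fills in the bracket deterministically so that anyone can reproduce the picks. The function names, the input layout, and the team labels in the usage note are all hypothetical.

```python
from collections import defaultdict


def average_margins(games):
    """Return each team's mean point margin over the historical games.

    games is an iterable of (team_a, score_a, team_b, score_b) tuples.
    This layout is an assumption, not the contest's actual data format.
    """
    total = defaultdict(float)
    count = defaultdict(int)
    for team_a, score_a, team_b, score_b in games:
        total[team_a] += score_a - score_b
        total[team_b] += score_b - score_a
        count[team_a] += 1
        count[team_b] += 1
    return {team: total[team] / count[team] for team in total}


def pick_bracket(field, games):
    """Fill in a single-elimination bracket, one round at a time.

    field lists the teams in bracket order: field[0] plays field[1],
    field[2] plays field[3], and so on. Its length must be a power of two.
    Returns a list of rounds, each a list of that round's winners.
    """
    n = len(field)
    if n < 2 or n & (n - 1):
        raise ValueError("field size must be a power of two, at least 2")

    ratings = average_margins(games)
    rounds = []
    current = list(field)
    while len(current) > 1:
        winners = []
        for i in range(0, len(current), 2):
            a, b = current[i], current[i + 1]
            # No hard-coded picks: the choice depends only on the data.
            # Teams with no history rate 0.0; ties go to the team listed first.
            if ratings.get(a, 0.0) >= ratings.get(b, 0.0):
                winners.append(a)
            else:
                winners.append(b)
        rounds.append(winners)
        current = winners
    return rounds
```

Calling `pick_bracket(["A", "B", "C", "D"], games)` with a small list of past results for those four placeholder teams returns the winners of each round, ending with a one-element list holding the predicted champion. Because nothing in the code is random, rerunning it on the same data always yields the same bracket, which is what the reproducibility rule asks for.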

Now that the Final Four is set — with no number 1 seed, for only the second time since 1980 — it’s a good time to assess how well the entries are faring. As Tarlow wrote in an entry on his This Number Crunching Life blog late yesterday:

…The question of which algorithm will win the contest is still not settled: a UConn victory on April 2, and The Pain Machine walks home with the prize; a UConn loss, and Team Delete Kernel is our winner.

What is settled at this point is that a machine will claim victory over the human-aided competition. The human baselines include our commissioner Lee’s bracket; the Higher Seed bracket (where the human intervention came via the committee that chose seeds); and the Nate Silver [of Five Thirty Eight fame] baseline, which was a part-human, part-computer effort.

So it’s premature to congratulate a winner yet, but let me tritely say that I, for one, welcome our new March Madness algorithm overlords.

The “brains” behind the two current leading entries have described their approaches here and here, respectively. And be sure to check out the full results to date here.

(Contributed by Erwin Gianchandani, CCC Director)
