Monday, November 19, 2007

Interpreting the 4th fastest supercomputer in the world

In response to my blog entry (and comments about it posted on this blog) about `Eka' making rank 4 in the Top500 list of the world's supercomputers, I got this email from my friend Viral Shah, who knows a bit about the field:

Ajay, it seems that the ranking of Eka at number 4 on the Top500 list has resulted in quite a lot of excitement. Hats off to the folks at CRD Labs for achieving the feat of assembling such a large computer in a short amount of time. As some of your readers noted, Eka is a cluster. It consists of roughly 2,000 nodes with roughly 15,000 processors, connected by InfiniBand. Some readers also noted that the benchmark is not representative of real scientific applications.
First, building a small cluster is quite easy. However, constructing and operating such a large cluster is no easy task. It requires serious skill to administer it, tune the hardware and software for performance, and run scientific applications on it. Second, the Top500 benchmark is an interesting one. Sure, it is not representative of a realistic workload, but over the years the bar has been set quite high. If a general-purpose computer does not achieve a good score on LINPACK (the Top500 benchmark), it is safe to conclude that something is terribly wrong. I am, of course, excluding special-purpose computers that are built to solve specific problems rather than to get a high LINPACK score.
That said, one needs to think this through clearly. Why was Eka built? Simply to show that we can do it, and to place a computer among the Top 10 supercomputers? To run specific scientific applications? I am guessing that the answer is "a bit of both". It is safe to assume that the full supercomputer will almost never be used to solve a single problem. What are the largest problems that will be run on Eka? What percentage of peak will they achieve? Would it have been a better idea to buy an "off the shelf" system such as the Cray XT4 or the SGI Altix and focus on programmer productivity, instead of getting a high LINPACK score?
Computers such as Eka achieve extremely high and unrealistic flop rates on the LINPACK benchmark. Typically, they can achieve over 70% of the peak flop rate (the number of floating-point operations per second). However, real applications often run at below 5% of the peak flop rate. Let's examine some other possibilities.
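To put those percentages in perspective, here is a minimal Python sketch of the arithmetic. The 100 teraflop peak is a hypothetical round number chosen for illustration, not Eka's actual figure, and `sustained_flops` is just an illustrative helper name; the 70% and 5% fractions are the ones quoted above.

```python
def sustained_flops(peak_flops: float, fraction_of_peak: float) -> float:
    """Return the sustained flop rate for a given fraction of peak (0.0 to 1.0)."""
    if not 0.0 <= fraction_of_peak <= 1.0:
        raise ValueError("fraction_of_peak must be between 0 and 1")
    return peak_flops * fraction_of_peak


# Hypothetical peak rate (assumption): 100 teraflops, picked to keep the numbers round.
PEAK = 100e12

linpack_rate = sustained_flops(PEAK, 0.70)   # LINPACK-style run at 70% of peak
real_app_rate = sustained_flops(PEAK, 0.05)  # real application at 5% of peak

print(f"LINPACK:          {linpack_rate / 1e12:.0f} teraflops")
print(f"Real application: {real_app_rate / 1e12:.0f} teraflops")
print(f"Gap:              {linpack_rate / real_app_rate:.0f}x")
```

Under these assumptions, a machine that sustains 70% of peak on LINPACK delivers roughly fourteen times the flop rate that a real application running at 5% of peak would see on the same hardware.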
Note that the software industry has been one of India's strong points. It is becoming increasingly clear that software is the key. For example, Apple's success with the iPhone and iPod has as much to do with well-designed software as with the hardware. If you ask me, the big event at Supercomputing'07 was not that Eka placed at No. 4 on the Top500 list. For me, the most exciting event was one that you will not hear about in the media. It has to do with the other part of the HPC Challenge, often called the beauty contest. Instead of asking "which computer can run LINPACK the fastest", it asks "which programming language implements the benchmarks most elegantly".
The winners of the class II challenge this year were IBM's X10 and Interactive Supercomputing's Python Star-P. For me, the most surprising and coolest part was the revelation that some of the compiler work for X10 was done at IBM's research labs in India. This is cutting-edge compiler technology, and the fact that part of the team was based in India is a strong statement about HPC innovation in India.