CCC supported three scientific sessions at this year’s AAAS Annual Conference, and in case you weren’t able to attend in person, we are recapping each session. This week, we are summarizing the highlights of the session, “How Big Trends in Computing are Shaping Science.” In Part 3, we hear from Dr. Jayson Lynch, a Research Scientist in the FutureTech lab at the Massachusetts Institute of Technology, who explains how quickly algorithms are improving.
Dr. Jayson Lynch began their presentation by addressing the cliffhanger that Manso ended on: how quickly are algorithms improving, and will these algorithms serve as a partial solution to the growing need for compute? “The main role of algorithms,” said Dr. Lynch, “is to solve a problem or do a task. We can break down how long this process takes into how many fundamental computations the task requires and how long it takes to do each of those computations.” The figure below shows how tasks completed over time can be represented as the speed of the algorithm (tasks per computation) times the speed of the hardware (computations per unit time).
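In symbols, that decomposition looks roughly like this; this is a sketch of the relationship described above, not Dr. Lynch’s exact notation:

```latex
% Sketch of the throughput decomposition described above (not the speaker's exact notation):
% tasks completed per unit time = (tasks per computation) x (computations per unit time)
\[
\underbrace{\frac{\text{tasks}}{\text{time}}}_{\text{overall performance}}
\;=\;
\underbrace{\frac{\text{tasks}}{\text{computation}}}_{\text{algorithmic efficiency}}
\times
\underbrace{\frac{\text{computations}}{\text{time}}}_{\text{hardware speed}}
\]
```

Improving either factor, the algorithm or the hardware, increases overall performance, which is why the two are compared side by side throughout the talk.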
Dr. Lynch referenced the 2010 report to the President and Congress, Designing a Digital Future: Federally Funded Research and Development in Networking and Information Technology, which motivated the research they shared during their presentation. “This report suggests that these algorithmic improvements contribute as much to the progress in our ability to compute as advances in hardware have, if not more. This has been shown in research on calculating electrostatic potentials and on simulations related to magnetic fields in fusion plants. In both of these areas, the speedups resulting from algorithmic improvements are comparable to or larger than the effective speedup from hardware.”
To evaluate algorithmic speedups across the board, Dr. Lynch and their team surveyed 113 problems and more than 1,000 algorithms that solve them. To evaluate the efficiency of these algorithms, Lynch looked at how the number of computations scales with the input size. The history of matrix chain multiplication optimization is shown in the graph below. Originally, no algorithm existed to solve this kind of problem, until one was discovered in 1953 that runs in cubic time. Roughly 30 years later, this was improved to near-linear time, and it was later proven that this is the best any algorithm can do in terms of asymptotic complexity.
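For readers who want a concrete feel for the cubic-time approach referenced above, here is a minimal sketch of the classic dynamic-programming solution to the matrix chain ordering problem; it illustrates the O(n^3) method, not the later near-linear algorithm, and the example dimensions are made up for illustration.

```python
def matrix_chain_order(dims):
    """Minimum number of scalar multiplications needed to evaluate A1*A2*...*An,
    where matrix Ai has shape dims[i-1] x dims[i].
    Classic O(n^3) dynamic program over increasing chain lengths."""
    n = len(dims) - 1                        # number of matrices in the chain
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):           # length of the sub-chain
        for i in range(1, n - length + 2):   # start of the sub-chain
            j = i + length - 1               # end of the sub-chain
            cost[i][j] = min(
                cost[i][k] + cost[k + 1][j] + dims[i - 1] * dims[k] * dims[j]
                for k in range(i, j)         # split point
            )
    return cost[1][n]

# Example: matrices of shapes 10x30, 30x5, 5x60 -> 4,500 scalar multiplications
print(matrix_chain_order([10, 30, 5, 60]))
```

The cubic cost comes from the three nested levels of iteration: chain length, starting index, and split point.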
“When we plug numbers into these kinds of algorithms,” explained Dr. Lynch, “we can begin to measure the degree of performance gain for specific problems.” The figure below displays the algorithmic progress of three algorithms over time. “The maximum subarray problem, for example, went from n^3 time down to linear time, which is an n^2 improvement. If n equals one million, then we see a trillion-fold improvement, which is enormous. In comparison, the gray line on this graph shows hardware improvement during that same time period. Across the span of 30 years, we see a 30,000-fold performance gain. This degree of improvement is also incredible, but we can see that some of these algorithmic improvements are substantially larger.”
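To make that n^3-to-linear jump concrete, here is a minimal sketch contrasting a brute-force cubic solution to the maximum subarray problem with the well-known linear-time approach (Kadane’s algorithm); the input data is made up, and this illustrates the asymptotic gap rather than the exact algorithms covered in the survey.

```python
def max_subarray_cubic(a):
    """Try every (start, end) pair and re-sum each slice from scratch: O(n^3)."""
    return max(sum(a[i:j + 1]) for i in range(len(a)) for j in range(i, len(a)))

def max_subarray_linear(a):
    """Kadane's algorithm: a single pass over the array, O(n)."""
    best = current = a[0]
    for x in a[1:]:
        current = max(x, current + x)   # either extend the current run or start over
        best = max(best, current)
    return best

data = [3, -7, 2, 5, -1, 4, -10, 6]
assert max_subarray_cubic(data) == max_subarray_linear(data) == 10
```

Both functions return the same answer; the difference is that the linear version does a constant amount of work per element, which is where the n^2 factor of improvement comes from.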
The figure below shows the average improvement of algorithm families per year for very large datasets, where n equals one billion. Over half of the families (shown on the far left) have not seen substantial improvement since the first algorithm was published; however, many of those on the right have seen enormous improvements over time. Averaging across all of these algorithm families, we see that 43% have increased their speeds faster than Moore’s Law. For smaller datasets, however, where n equals only one thousand, algorithms are improving more slowly than hardware.
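To see why the dataset size matters so much, consider a hypothetical family whose best algorithm improved from quadratic to quasi-linear time; the back-of-the-envelope calculation below uses made-up complexity classes and ignores constant factors, so it is only meant to show how the same asymptotic improvement pays off far more at n equals one billion than at n equals one thousand.

```python
# Back-of-the-envelope: operation-count savings from replacing a hypothetical
# O(n^2) algorithm with an O(n log n) one (constant factors ignored).
import math

for n in (1_000, 1_000_000_000):
    quadratic = n ** 2
    quasilinear = n * math.log2(n)
    print(f"n = {n:>13,}: ~{quadratic / quasilinear:,.0f}x fewer operations")

# Prints roughly 100x for n = 1,000 and roughly 33 million x for n = 1,000,000,000.
```

At small n the constant factors and hardware speed dominate, while at very large n the asymptotic behavior of the algorithm overwhelms everything else.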
“So can we expect these huge increases to continue?” Dr. Lynch asked the audience. “We can prove lower bounds on these algorithms, and we can see, in the figure below, that 65% of these algorithms have already reached optimality, meaning there are no more asymptotic gains that can be achieved. So this is a huge triumph for computer science, but it may mean that progress is slowing down. However, this only applies to exact, serial algorithms. If we allow approximation, or change the types of computers running these algorithms, such as future quantum computers, GPUs, or parallel processors, we can get around these lower bounds.”
Dr. Lynch wrapped up their presentation, saying, “As we have seen, algorithmic efficiency is incredibly important, and advances are being made alongside hardware improvements. Because of their ties to problem size and models of computation, these gains are intimately intertwined with gains in hardware.”
Thank you so much for reading, and stay tuned tomorrow for our summary of Mehmet Belviranli’s presentation on the future of heterogeneous computing.