This two-day course introduces attendees to a thorough approach to optimization techniques for contemporary computing architectures. It is based on Andrei's career-long experience tuning the performance of various software systems, from machine learning research to high-performance libraries to Facebook-scale computing backends.
A large category of applications has no upper bound on desired speed: there is no point of diminishing returns in making the code faster. Better speed means less power consumed for the same work, more workload for the same data center expense, better features for the end user, better machine learning, better analytics, and more. Yet information on writing fast code is scant and difficult to find. Software engineering folklore is rife with tales of optimizations, and programmers commonly argue about whether one piece of code should be faster than another, or about what to do to improve the performance of a system, small or large. In contrast, this course teaches systematic, scientific approaches to measuring and improving code performance.
Optimization has received increased attention during the past decade, a trend that is likely to intensify. Serial execution speed has stalled, and once everything that can be parallelized has been, the single-thread speed of sheer algorithmic execution remains the lasting bottleneck in any system. Systematic algorithmic improvements that work in conjunction with hardware tradeoffs are key to improving performance. Optimization has always been an art, and optimizing code on contemporary hardware in particular has become a task of formidable complexity, because modern hardware has peculiarities that are not sufficiently understood and explored. This class offers a deep dive into this fascinating world.
Please note: This course is being actively developed. The actual course might contain more topics and slight variations on the topics outlined below.
This course is aimed at C++ programmers for whom the efficiency of generated code is a primary concern.
The format is a highly interactive lecture. Questions during the lecture are encouraged. Use of laptops for trying out examples is allowed.
"The explanations of Andrei led to many 'Aha!' moments and made it definitely worth my time."
S. Montellese, Supercomputing Systems AG
"Thanks to Andrei for sharing his knowledge!"
P. Kaczmarczyk, ESG Elektroniksystem- und Logistik GmbH
"Bringing my application to speed of light seems possible after this course."
C. Lang, Supercomputing Systems AG