One of the most pressing challenges in advancing AI technologies is power efficiency. As the world grapples with the urgent need for sustainable solutions, the computing powerhouses behind large AI models have come under scrutiny for their colossal energy requirements. Today’s large language models (LLMs), for instance, not only consume a staggering amount of power during training but can also draw enough energy in ongoing operation to rival the power consumption of a small city. The machines underpinning generative AI models are essentially high-performance computing (HPC) systems, or “supercomputers,” architected for massively parallel processing.
According to Satoshi Matsuoka, Director of the RIKEN Center for Computational Science, the symbiotic relationship between AI and HPC will only deepen in the coming years. AI technologies will facilitate more rapid advancements in HPC, while supercomputers, in turn, will enable increasingly advanced AI algorithms. Dr. Matsuoka’s cutting-edge work includes development of the Fugaku supercomputer, currently ranked No. 2 on the TOP500 list and among the world’s most efficient machines of its caliber. Both Fugaku and the preceding TSUBAME series of supercomputers he developed serve as a testament to the importance of HPC advancements for more energy-efficient AI.
With the 2023 International Conference for High Performance Computing, Networking, Storage, and Analysis (SC23) fast approaching, we sat down with Dr. Matsuoka to discuss the relationships between supercomputing power efficiency and AI sustainability. Dr. Matsuoka shared insights on everything from how his team and the HPC industry have advanced efficiency in large-scale machines to where he thinks the most significant gains in power efficiency will come from to why the SC conference is a must-attend event for anyone working at the cutting edge of AI sustainability.
A Powerful Start
Dr. Matsuoka’s journey into the world of computing began in junior high school, during the nascent days of the microprocessor revolution. He recalls his fascination with computing sprouting after encountering so-called 8-bit microprocessors and early home computers such as the Commodore PET and Apple II. “I started programming games on borrowed machines and even made some money from them. It wasn’t long before I used my earnings to buy my own machine,” Matsuoka shares with a hint of nostalgia.
Fast-forward to his college years, when Matsuoka sharpened his game programming skills while studying computer science in earnest, applying newly learned techniques such as compilers and real-time programming to interactive games. He earned his doctorate from the University of Tokyo in 1993, marking a significant turning point in his career. “During my doctorate, we were experimenting with connecting personal computers over networks and developing software for parallel programming. That became my first research lab,” he explains.
Dr. Matsuoka’s pioneering work in power efficiency wasn’t born out of an initial academic focus, but rather out of real-world problems. During his tenure at Tokyo Tech, the facility staff took notice of his lab’s power consumption, which nearly matched that of the supercomputer in the university’s own computing facility. “They were baffled and said, ‘It’s not like you have another supercomputer in here,’ but, in reality, we were pretty close for that time,” Matsuoka says. Instead of ending his projects, the university promoted him to full professor and entrusted him with overseeing the supercomputing facility at Tokyo Tech, setting the stage for his groundbreaking contributions to HPC.
Making Important Strides Toward Sustainability
When it comes to power efficiency in supercomputers, Dr. Matsuoka identified a significant challenge: power consumption increases linearly with each added node. As a simple illustration, if one node uses 300 watts, then 1,000 nodes would consume 300 kilowatts. Performance gains in modern supercomputers primarily come from adding nodes and increasing parallelism, supplemented by faster processors and other technological advancements. Considering this reality, Matsuoka and his teams have had to get creative over the years.
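As a back-of-the-envelope sketch, the linear scaling described above can be expressed in a few lines of code (the 300-watt-per-node figure is the article’s own illustrative number, not a spec for any particular machine):

```python
def cluster_power_kw(node_watts: float, num_nodes: int) -> float:
    """Total cluster power draw in kilowatts, assuming power
    scales linearly with the number of nodes."""
    return node_watts * num_nodes / 1000.0

# 1,000 nodes at 300 W each draw 300 kW in total.
print(cluster_power_kw(300, 1_000))  # 300.0
```

Doubling the node count to scale performance doubles the power bill under this model, which is why efficiency gains must come from somewhere other than sheer node count.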
In 2010, Matsuoka’s team at Tokyo Tech achieved their first major power efficiency breakthrough with the TSUBAME2.0 supercomputer. This followed their 2006 success in creating TSUBAME1.0, which was the fastest supercomputer in Japan at the time. Given the university’s focus on sustainability, the team aimed for a significant speedup in TSUBAME2.0 while maintaining the same power profile as its predecessor.
“With TSUBAME2.0, we achieved a 20x speedup at the same power consumption as TSUBAME1.0 by embracing new technologies like GPUs,” notes Matsuoka. “We’ve always aimed to identify and implement the best technologies available for our applications and overall architecture. Back then, GPUs offered excellent throughput and parallelism, but were far from being a proven technology for high-performance computing.”
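The efficiency gain Matsuoka quotes reduces to a simple performance-per-watt ratio: a 20x speedup at unchanged power is, by definition, a 20x improvement in performance per watt. The FLOPS and wattage values below are illustrative placeholders, not TSUBAME’s actual figures:

```python
def perf_per_watt(flops: float, watts: float) -> float:
    """Energy efficiency expressed as floating-point operations
    per second per watt of power consumed."""
    return flops / watts

# Hypothetical baseline system vs. an upgrade that delivers
# 20x the FLOPS at the same power draw.
base = perf_per_watt(1.0e14, 1.0e6)
upgraded = perf_per_watt(20.0e14, 1.0e6)
print(upgraded / base)  # 20.0
```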
Matsuoka notes that the use of GPUs in large-scale computing is now the standard but was groundbreaking at the time. Despite lacking a blueprint for integrating thousands of GPUs, his team successfully met the challenge through programming innovations and new technologies. “We pioneered methods to program GPUs and reduce power requirements, which have since influenced GPU-based machines in high-performance cloud environments,” he says.
Throughout the development of successive TSUBAME versions at Tokyo Tech and the Fugaku supercomputer at RIKEN, Dr. Matsuoka has maintained a focus on performance-per-watt principles. “My overarching career goal has been to maximize performance within given energy and financial constraints while being environmentally responsible,” states Matsuoka.
Data Center Cooling: A Hidden Energy Hog
Matsuoka points out that cooling has historically been a major power drain in data centers. “In some cases, more than half of a data center’s 10-megawatt consumption went towards cooling,” he says. This led Matsuoka and his teams to explore more efficient cooling systems from the days of TSUBAME1.0 onwards. “Cooling efficiency drove our supercomputer development over generations. We introduced rack-level liquid cooling with TSUBAME2.0, then full liquid cooling for TSUBAME3.0, overcoming the challenges posed by its scale; this has since become common practice,” he adds.
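Cooling overhead of this kind is commonly quantified as Power Usage Effectiveness (PUE), the ratio of total facility power to the power delivered to IT equipment. Matsuoka does not use the term here, but a minimal sketch of the arithmetic behind his example might look like this:

```python
def pue(total_facility_kw: float, it_load_kw: float) -> float:
    """Power Usage Effectiveness: total facility power divided by
    the power consumed by the IT equipment alone. 1.0 is ideal."""
    return total_facility_kw / it_load_kw

# If half of a 10 MW facility's power goes to cooling and other
# overhead, only 5 MW reaches the IT equipment: a PUE of 2.0.
print(pue(10_000, 5_000))  # 2.0
```

Modern liquid-cooled facilities aim for PUE values much closer to 1.0, which is precisely the direction the TSUBAME cooling work pushed.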
Putting the Progress into Perspective
Today’s leading HPC systems are more energy efficient than smartphones on a performance-per-watt basis. “The Fugaku supercomputer, with nearly 160,000 nodes and over 7.5 million cores, is more powerful yet far more efficient than the combined compute power of all the smartphones sold in Japan last year, or about 20 million units,” Matsuoka asserts.
Gains aside, even a 20-MW supercomputer will consume the equivalent power of almost 10,000 U.S. households. Moreover, a hyperscaler cloud that contains supercomputers may consume the household power equivalency of a small city, so there is still significant work to be done to improve power efficiency—especially when you consider the significant and growing demands of AI workloads.
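The household comparison above follows from simple arithmetic. The ~2 kW average household draw used below is an illustrative assumption chosen to match the article’s rough figure, not a number from the article itself (actual U.S. household averages vary):

```python
def household_equivalents(system_mw: float, avg_household_kw: float = 2.0) -> float:
    """Number of average households whose combined continuous draw
    matches a system of the given megawatt rating.

    avg_household_kw is an assumed average draw, not a measured figure.
    """
    return system_mw * 1000.0 / avg_household_kw

# A 20 MW supercomputer at an assumed 2 kW per household:
print(household_equivalents(20))  # 10000.0
```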
Different Disciplines, Same Underlying Game
Although modern AI methodologies have existed for years, they only recently have matured into practical tools with the potential to transform society. Matsuoka attributes this advancement to the capabilities of modern supercomputers.
“Concepts like deep learning were developed decades ago, but the computers of that era lacked the horsepower to support the training and inference of these complex models,” he explains. “While data availability was also a factor, the recent intersection of big data with HPC systems has catalyzed the AI revolution. Today, tech giants like Google, Amazon, OpenAI, and Microsoft operate machines on par with the most powerful national supercomputers dedicated to scientific research.”
Matsuoka highlights that the computational requirements for language and generative models have skyrocketed in just the last few years. This has led traditional HPC-focused vendors like NVIDIA to pivot toward AI. NVIDIA’s data-center GPUs, from the Tesla line through the A100 Tensor Core GPU, for example, have transitioned from being supercomputer accelerators to the go-to hardware for training large-scale neural networks, a task demanding immense parallelism.
According to Matsuoka, the boundaries between various computing disciplines are increasingly blurring, thanks in part to advancements in AI.
“There used to be clear demarcations between cloud computing and both classical and supercomputing. Those lines are now dissipating,” he says. “In today’s landscape, companies vie for the fastest and most powerful supercomputers, leading to a more unified field. Even HPC experts from national labs are now being recruited by AI-focused vendors.”
AI Meets HPC: A Symbiotic Relationship
Amid the growing focus on AI, Matsuoka predicts that societal demand for more energy-efficient AI technologies will intensify quickly. He warns that if unchecked, the energy consumption of AI could negate progress made in reducing carbon emissions in other sectors. While he acknowledges that HPC and supercomputers are essential for driving AI advancements, Matsuoka also sees potential for AI to reciprocate by fostering innovations that could propel HPC forward. As such, he’s eager to see what synergies unfold.
On the topic of power efficiency, Matsuoka’s attention is increasingly shifting toward innovations in memory technology: “Most methods to accelerate GPUs are already well-understood, so future improvements are likely to focus less on CPUs and GPUs and more on enhancing the performance of the memory and the interconnect fabric. The most promising avenues for reducing energy consumption per data movement mainly lie in memory technologies and packaging of chips, as well as advanced photonics used in interconnects, so I’m keeping a close eye on that space.”
Get Ahead of the Sustainability Curve with Insights from SC23
For those interested in learning about power efficiency in large-scale systems and the latest technologies propelling AI, Matsuoka highly recommends the SC conference series.
“If your organization aims to be a frontrunner in the AI field and is keen to stay updated on cutting-edge research and technology related to high-performance computing, the SC conference is an invaluable resource,” he offers. “Leading cloud providers often send their teams to the SC conference, and even have exhibit booths there, to gather insights and make connections around innovations in parallel supercomputing—topics that are often overlooked at mainstream AI events. The wealth of exhibits and papers available at SC can provide a critical advantage for staying ahead in the rapidly evolving AI landscape.”
Dive deep into performance-intensive computing at SC23, happening from November 12-17, 2023, in Denver, Colorado. Join tech visionaries, collaborate with industry leaders, and explore the future of technology.