Accelerate AI applications! IBM announces new Telum processors
2024-09-03 14:41:20
Today, IBM announced the architecture details of its upcoming IBM Telum® II processor and IBM Spyre™ accelerator at Hot Chips 2024. These new technologies are designed to significantly scale the processing capacity of next-generation IBM Z mainframe systems and to accelerate the combined use of traditional AI models and large language models through new AI integration approaches.
The major innovations announced by IBM include the IBM Telum II processor, the I/O acceleration unit, and the IBM Spyre accelerator. The Telum II processor and IBM Spyre accelerator will be manufactured by Samsung Foundry on its high-performance, energy-efficient 5-nanometer process node.
Specifically, the Telum II processor has eight high-performance cores running at 5.5GHz, with 36MB of L2 cache per core, and its on-chip cache capacity grows by 40% to a total of 360MB. The virtual L4 cache per processor drawer is 2.88GB, also a 40% increase over the previous generation. The integrated AI accelerator enables low-latency, high-throughput AI inference within a transaction, for example to enhance fraud detection during financial transactions, and delivers four times the compute capacity per chip of the previous generation. The Telum II chip also integrates a new I/O acceleration unit, a data processing unit (DPU). The design raises I/O density by 50%, which can significantly increase data-handling capacity and further improve the overall efficiency and scalability of IBM Z.
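The cache figures above can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted in the announcement; the variable names are illustrative, and treating 1GB as 1000MB is an assumption made here for convenience.

```python
# Figures quoted in the announcement.
CORES = 8
L2_PER_CORE_MB = 36
ON_CHIP_CACHE_MB = 360
VIRTUAL_L4_MB = 2880   # 2.88GB, assuming 1GB = 1000MB
INCREASE = 0.40        # stated generation-over-generation growth


def implied_previous(current: float, increase: float) -> float:
    """Size the previous generation would need for the stated increase to hold."""
    return current / (1 + increase)


# Previous-generation sizes implied by "40% more".
prev_on_chip_mb = implied_previous(ON_CHIP_CACHE_MB, INCREASE)  # about 257MB
prev_l4_mb = implied_previous(VIRTUAL_L4_MB, INCREASE)          # about 2057MB

# L2 attached to the eight cores, and what is left of the 360MB total.
core_l2_mb = CORES * L2_PER_CORE_MB                  # 288MB
remaining_mb = ON_CHIP_CACHE_MB - core_l2_mb         # 72MB

# How many chips' worth of on-chip cache make up the per-drawer virtual L4.
chips_worth = VIRTUAL_L4_MB / ON_CHIP_CACHE_MB       # 8.0

print(f"implied previous on-chip cache: {prev_on_chip_mb:.0f}MB")
print(f"implied previous virtual L4:    {prev_l4_mb:.0f}MB")
print(f"L2 on the eight cores: {core_l2_mb}MB, remainder: {remaining_mb}MB")
print(f"virtual L4 / on-chip cache = {chips_worth:.1f}")
```

Two things stand out. Eight cores at 36MB each account for 288MB, so the 360MB total includes a further 72MB that the text here does not break down. And the per-drawer virtual L4 is exactly eight times the on-chip total, which is consistent with eight processors' worth of cache being pooled per drawer, although the announcement as reported here does not state the chip count.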
The Spyre accelerator is an enterprise-class accelerator designed to provide scalable capabilities for complex AI models and generative AI use cases. It offers up to 1TB of memory and is built to work in tandem across the eight cards of a regular I/O drawer, supporting AI workloads across the mainframe while consuming no more than 75W per card. Each chip has 32 compute cores that support the int4, int8, fp8, and fp16 data types, suiting both low-latency and high-throughput AI applications.
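As a rough illustration of what those per-card figures add up to for a full drawer, the following sketch multiplies them out. It assumes one Spyre chip per card, memory spread evenly across the eight cards, and 1TB = 1024GB; none of these assumptions is stated in the announcement, so the results are upper-bound estimates rather than published specifications.

```python
# Figures quoted in the announcement.
CARDS_PER_DRAWER = 8
MAX_WATTS_PER_CARD = 75
CORES_PER_CHIP = 32
MAX_MEMORY_GB = 1024       # "up to 1TB", assuming 1TB = 1024GB

# Assumption (not from the announcement): one accelerator chip per card.
CHIPS_PER_CARD = 1

# Upper bound on accelerator power for a fully populated drawer.
max_drawer_watts = CARDS_PER_DRAWER * MAX_WATTS_PER_CARD

# Compute cores available across the drawer under the one-chip assumption.
drawer_cores = CARDS_PER_DRAWER * CHIPS_PER_CARD * CORES_PER_CHIP

# Memory per card if the 1TB is split evenly (also an assumption).
memory_per_card_gb = MAX_MEMORY_GB // CARDS_PER_DRAWER

print(f"max accelerator power per drawer: {max_drawer_watts}W")
print(f"compute cores per drawer:         {drawer_cores}")
print(f"memory per card:                  {memory_per_card_gb}GB")
```

Under these assumptions a full drawer draws at most 600W for the accelerator cards and exposes 256 compute cores, with 128GB of memory per card.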
Tina Tarquinio, vice president of product management for IBM Z and LinuxONE at IBM, said the Telum II processor and Spyre accelerator are designed to deliver secure, energy-efficient, high-performance enterprise computing. The result of years of research and development, these innovations will be incorporated into the next-generation IBM Z platform to help customers use large language models and generative AI at scale.
According to IBM, the Telum II processor will serve as the central processor for the next-generation IBM Z and IBM LinuxONE platforms and is expected to be available to IBM Z and LinuxONE customers in 2025. The IBM Spyre accelerator is still in technology preview and is also expected to launch in 2025.
According to a recent Morgan Stanley research report, electricity demand from generative AI will grow by 75% a year over the next few years, and its energy consumption in 2026 may be comparable to Spain's total consumption in 2022. People in the industry believe it is increasingly important to support appropriately sized foundation models and hybrid architectures for AI workloads.