LineShine: China Returns to the Top of Supercomputers with 2.2 Exaflops Using Only CPUs, and Two Italian Systems Make the Top 10
The 67th edition of the TOP500 ranking saw a system debut directly in first place: LineShine, a supercomputer installed at the National Supercomputing Centre in Shenzhen, achieved 2.198 Exaflop/s on the HPL benchmark, overtaking the American El Capitan (1.809 Exaflop/s). It is the first time China has led the ranking since 2017, when Sunway TaihuLight in Wuxi held the top spot with 93 Petaflops. More importantly, it is the first time a system has surpassed 2 exaflops of sustained double-precision performance using only CPUs, without any GPU accelerators.
The significance of the achievement lies in how it was accomplished: LineShine is built entirely from domestic Chinese components, with no GPUs and no chips from Intel, AMD, or NVIDIA. It reaches the top despite US sanctions, which limited China's access to advanced chips and had, since 2019, prompted the country to stop submitting Linpack results for its leading systems. The machine, built by the Shenzhen Cloud Computing Center, was able to undergo TOP500 testing precisely because it was developed without public funding. Jack Dongarra, one of the organizers of the ranking, acknowledged that China had managed to surpass the leading US systems with a machine that does not rely on GPUs.
The ARMv9 LingKun Architecture
LineShine is based on the proprietary LingKun platform: 304-core LX2 processors on the ARMv9 architecture, for a total of 13.79 million cores at 1.55 GHz, the LingQi interconnect, and the Kylin operating system. The computing power comes from 20,480 compute nodes in an asymmetric NUMA configuration: each node hosts two LX2 processors, each processor contains two dies, and each die is organized into four NUMA domains of 38 cores, pairing 4 GB of HBM memory with 128 GB of off-package DDR. The LingQi interconnect adopts a dual-plane, multi-rail fat-tree topology with 1.6 Tb/s per node. The declared theoretical peak is 2.736 Exaflop/s, which puts HPL efficiency at around 80%. The system draws about 42.2 megawatts, for an energy efficiency of 52.07 Gigaflops/Watt. The machine was first previewed in April in a presentation by chief designer Lu Yutong of NSCC Shenzhen, but at the time it was not clear that it was already built and operational, hence the surprise when the results were submitted.
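The per-processor core count, the HPL efficiency, and the power draw quoted above can be cross-checked from the other figures in this section. The following is a minimal sketch in Python: the constant and function names are illustrative assumptions, and the only inputs are numbers quoted in the article.

```python
# Figures quoted in the article (names are illustrative, not official).
RMAX_EFLOPS = 2.198          # measured HPL performance, Exaflop/s
RPEAK_EFLOPS = 2.736         # declared theoretical peak, Exaflop/s
GFLOPS_PER_WATT = 52.07      # reported energy efficiency

CORES_PER_NUMA_DOMAIN = 38
NUMA_DOMAINS_PER_DIE = 4
DIES_PER_PROCESSOR = 2


def cores_per_processor() -> int:
    """Cores in one LX2 processor, derived from the NUMA layout."""
    return CORES_PER_NUMA_DOMAIN * NUMA_DOMAINS_PER_DIE * DIES_PER_PROCESSOR


def hpl_efficiency(rmax_eflops: float, rpeak_eflops: float) -> float:
    """Fraction of the theoretical peak achieved on HPL."""
    return rmax_eflops / rpeak_eflops


def power_megawatts(rmax_eflops: float, gflops_per_watt: float) -> float:
    """Power draw implied by HPL performance and energy efficiency."""
    flops = rmax_eflops * 1e18
    watts = flops / (gflops_per_watt * 1e9)
    return watts / 1e6


if __name__ == "__main__":
    print(f"cores per processor: {cores_per_processor()}")
    print(f"HPL efficiency: {hpl_efficiency(RMAX_EFLOPS, RPEAK_EFLOPS):.1%}")
    print(f"implied power: {power_megawatts(RMAX_EFLOPS, GFLOPS_PER_WATT):.1f} MW")
```

The layout of 2 dies, 4 NUMA domains per die, and 38 cores per domain reproduces the 304 cores per processor, and the performance and efficiency figures are consistent with the roughly 80% HPL efficiency and 42.2 megawatts stated above.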
What Does This Record Really Measure?
LineShine's lead also holds on the HPCG benchmark, which is more representative of real workloads: it reaches 22.00 HPCG-Petaflop/s, ahead of El Capitan (17.41) and Fugaku (16.00). The picture changes, however, on HPL-MxP, the mixed-precision test that approximates inference workloads: here LineShine is only fourth, with 7.92 Exaflop/s and a mere 3.6x speedup over HPL, while El Capitan remains first at 16.7 Exaflop/s with a 9.2x speedup. The gap reflects a CPU-only design that lacks dedicated accelerators for lower precision: dominance in full-precision computing does not automatically transfer to typical artificial intelligence workloads.
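The speedup figures are simply the ratio of each system's HPL-MxP result to its HPL result. A short Python sketch, using only the numbers quoted in this article (the dictionary and function names are illustrative assumptions):

```python
# Results quoted in the article, in Exaflop/s.
HPL = {"LineShine": 2.198, "El Capitan": 1.809}
HPL_MXP = {"LineShine": 7.92, "El Capitan": 16.7}


def mxp_speedup(system: str) -> float:
    """Ratio of mixed-precision (HPL-MxP) to double-precision (HPL) performance."""
    return HPL_MXP[system] / HPL[system]


if __name__ == "__main__":
    for name in HPL:
        print(f"{name}: {mxp_speedup(name):.1f}x")
```

Rounded to one decimal place, the ratios come out at 3.6x for LineShine and 9.2x for El Capitan, matching the figures above.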
With LineShine, five systems now surpass the exascale threshold on HPL, and for the first time they are distributed across three continents: behind the Chinese system and El Capitan are Frontier (1.353 Exaflop/s, Oak Ridge), Aurora (1.012, Argonne), and JUPITER Booster (1.000, Jülich), the only European system in the group.
Lower down, the top 10 welcomes two new Italian industrial entries: Eni HPC7 in sixth place (571.5 Petaflops, on the same HPE Cray EX255a architecture with AMD Instinct MI300A as El Capitan) and Eni HPC6 in eighth, with Microsoft Eagle, Fugaku, and Alps completing the top 10. In the Green500, nothing changes: KAIROS, ROMEO-2025, and Levante remain at the top, all on BullSequana XH3000 with Grace Hopper.
The official ranking shows an unusual architectural variety in the top 10, with Chinese ARMv9 CPUs, AMD APUs, Intel designs, NVIDIA's Grace Hopper, and Fujitsu's A64FX. The TOP500's own commentary sums it up better than any verdict: "There is no single dominant technological path to leadership-class computing; manufacturers are instead pursuing a variety of CPU, GPU, APU, and custom accelerator approaches."