
Tsinghua University makes a breakthrough with an AI optical training chip! Results published in Nature

2024-08-10


Xindongxi reported on August 10 that the research group of Professor Fang Lu of the Department of Electronic Engineering of Tsinghua University and the research group of Academician Dai Qionghai of the Department of Automation have proposed the first fully forward intelligent optical computing training architecture and developed the "Taichi-II" optical training chip, freeing large-scale optical computing systems from their reliance on offline training and enabling the training of large-scale optical neural networks. The results were published this week in Nature, a top international academic journal.

According to an article published by the Department of Electronic Engineering at Tsinghua University, the launch of Taichi-II fills in the missing core piece of the puzzle for large-scale training in intelligent optical computing.

Optical computing offers high computing power and low power consumption, making it a major frontier for accelerating intelligent computing. A Nature reviewer commented that "the ideas proposed in this paper are very novel, and the training process for this type of optical neural network (ONN) is unprecedented. The proposed method is not only effective but also easy to implement, and could potentially become a widely adopted tool for training optical neural networks and other optical computing systems."

The Department of Electronic Engineering of Tsinghua University is the first affiliation of the paper. Professor Fang Lu and Professor Dai Qionghai are the corresponding authors; Xue Zhiwei, a doctoral student in the department, and Zhou Tianquan, a postdoctoral fellow, are co-first authors. Xu Zhihao, a doctoral student in the department, and Dr. Yu Shaoliang of Zhijiang Laboratory also contributed to the work. The project was supported by the Ministry of Science and Technology of China, the National Natural Science Foundation of China, the Beijing National Research Center for Information Science and Technology, and the Tsinghua University-Zhijiang Laboratory Joint Research Center.

1. Clever use of symmetry frees optical computing from GPU dependence

Optical computing promises to improve the speed and energy efficiency of machine learning applications. However, current methods for effectively training these models are limited by their reliance on computer simulation.

Earlier, the general-purpose intelligent optical computing chip "Taichi", published in the international academic journal Science, pushed optical computing from principle verification to large-scale experimental application for the first time. Its system-level energy efficiency of 160 TOPS/W brought hope for inference on complex intelligent tasks, but it did not unlock the "training power" of optical computing.

Compared with inference, model training demands far more computing power. Electrical training architectures require a close match between the forward and backward propagation models, which imposes stringent alignment requirements on the physical optical computing system: gradient calculation is difficult, offline modeling is slow, and mapping errors are large, restricting the scale and efficiency of optical training.

Fang Lu and Dai Qionghai's research team found the key in the symmetry of photon propagation: with fully forward optical training, they broke through the constraints that electrical training architectures impose on physical optical computing.

According to Xue Zhiwei, co-first author of the paper, under the Taichi-II architecture the backpropagation step of gradient descent is transformed into forward propagation through the optical system. Training an optical neural network can thus be accomplished with two forward propagations, one of the data and one of the error. The two forward propagations are naturally aligned, which ensures accurate calculation of the physical gradient; the resulting high training accuracy can support large-scale network training.

The modulation and propagation of the physical optical system map onto the activation and connection of a neural network; training the modulation modules can therefore drive the weight optimization of an arbitrary network, ensuring the speed and energy efficiency of training.

Since backpropagation is not required, the Taichi-II architecture no longer relies on electrical computing for offline modeling and training, making it possible to achieve accurate and efficient optical training of large-scale neural networks.
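The two-forward-pass idea described above can be illustrated with a toy numerical sketch. This is an illustrative analogy under stated assumptions, not the paper's actual implementation: the symmetric transmission matrix `W`, the elementwise `modulation` vector, and all parameter values are made up for demonstration. In a reciprocal optical medium the transmission matrix satisfies W = W^T, so sending the output error forward through the same medium yields exactly the quantity W^T·err that backpropagation would otherwise need:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy reciprocal "optical medium": a symmetric transmission matrix,
# so propagation applies the same matrix in either direction (W == W.T).
n = 8
A = rng.normal(size=(n, n))
W = (A + A.T) / 2
W /= np.linalg.norm(W, 2)       # normalize spectral norm for stable training

def propagate(v, modulation):
    """Elementwise modulation followed by propagation through the medium."""
    return W @ (modulation * v)

x = rng.normal(size=n)          # input field
target = rng.normal(size=n)     # desired output field
mod = np.ones(n)                # trainable modulation (the "weights")
lr = 0.05

initial_err = np.linalg.norm(propagate(x, mod) - target)
for _ in range(2000):
    y = propagate(x, mod)       # forward pass 1: the data
    err = y - target
    back = W @ err              # forward pass 2: the error
                                # (reciprocity: W @ err == W.T @ err)
    mod -= lr * (back * x)      # exact gradient of 0.5*||y - target||^2
final_err = np.linalg.norm(propagate(x, mod) - target)
print(initial_err, final_err)
```

The point of the sketch is that no transposed (backward) pass through the medium is ever needed: because the medium is reciprocal, the error signal is simply launched forward through the same physical system, and the gradient for the modulation falls out of the two forward results.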

2. Training optical networks with millions of parameters is an order of magnitude faster

Using light as the computing medium and building computing models with the controllable propagation of light, optical computing naturally has the characteristics of high speed and low power consumption. Using the full forward propagation of light to implement training can greatly improve the speed and energy efficiency of optical network training.

Measured results show that Taichi-II is capable of training a variety of different optical systems and has demonstrated excellent performance across a range of tasks:

1. Large-scale learning: it breaks the trade-off between computing accuracy and efficiency, speeding up the training of optical networks with millions of parameters by one order of magnitude and improving the accuracy of representative intelligent classification tasks by 40%.

2. Intelligent imaging in complex scenes: in low-light environments (light intensity of only sub-photon per pixel), it achieves all-optical processing at an energy efficiency of 5.40×10^6 TOPS/W, a system-level energy-efficiency improvement of six orders of magnitude; in complex imaging applications such as non-line-of-sight scenes, it achieves intelligent imaging at kilohertz frame rates, a two-order-of-magnitude efficiency improvement.

3. Topological photonics: it can automatically search for non-Hermitian exceptional points without relying on any model priors, providing a new approach for the efficient and accurate analysis of complex topological systems.

3. Advancing applications and theory, providing new computing power for large AI models

Like yin and yang standing apart, Taichi-I and Taichi-II respectively achieve energy-efficient AI inference and training; like yin and yang in harmony, together they constitute the complete life cycle of large-scale intelligent computing.

"'Establish the way of Taiji, uniting the yin and yang of heaven and earth.' This is how we describe the Taichi series, a pair of dialectically collaborative architectures. We believe they will work together to inject new impetus into the computing power behind future large AI models and lay a new foundation of optical computing power," said Fang Lu.

Building on the prototype chips, the research team is actively advancing the industrialization of intelligent optical chips and has deployed applications on a variety of edge intelligent systems.

Two generations of Taichi chips have revealed the huge potential of intelligent optical computing. Through continued efforts in the field, including the Taichi series, intelligent optical computing platforms are expected to open new paths toward high-speed, energy-efficient computing for large AI models, general artificial intelligence, and complex intelligent systems, with lower resource consumption and smaller marginal cost.

Source: Tsinghua University, Nature