
Aiming at 32Tbps! Intel reveals silicon photonics integration roadmap, OCI chiplets lay the foundation for future AI infrastructure

2024-08-01


Xindongxi reported on July 31. The Optical Fiber Communication Conference (OFC) is recognized as the largest and most prestigious international event in optical communications, and a bellwether for cutting-edge optical communication technology. At this year's OFC, Intel's Integrated Photonics Solutions (IPS) team shared its breakthrough progress in high-bandwidth interconnect technology: an industry-leading, fully integrated OCI (Optical Compute Interconnect) chiplet, co-packaged with an Intel CPU and running real data.

For data center and high-performance computing (HPC) applications, Intel's OCI chiplet enables co-packaged optical I/O. It supports 64 channels of 32 Gbps each in each direction over optical fiber up to 100 meters long, and is expected to meet AI infrastructure's growing demand for higher bandwidth, lower power consumption and longer reach.

Intel has not disclosed the exact dimensions of the OCI chip, but recently released photos show how the OCI chip compares to the eraser on the end of a standard No. 2 pencil.

Recently, Song Jiqiang, vice president of Intel Research and director of Intel China Research Institute, spoke in depth with Xindongxi and other media about the technical details of the OCI chiplet. He shared Intel's future innovation roadmap for silicon photonics integration: by increasing the line rate, the number of wavelengths per fiber, the number of fibers, and the number of polarization modes, Intel expects to scale the performance of future OCI generations and build devices with bandwidth of up to 32 Tbps.

Intel is delivering OCI chiplets to various internal and external customers. Specific customer applications and product requirements will determine the sequence and timing of these expansion plans.

1. Electric to silicon photonics ≈ horse-drawn carriage to motorcycle

As the development of generative AI accelerates, large models require high computing density, large memory capacity and bandwidth, and are difficult to deploy in a single server node, so cross-rack connections are required. Large computing clusters mean longer transmission distances and higher I/O bandwidth requirements.

Song Jiqiang said that AI applications have pushed the required ratio of memory to compute to a new level, and that memory is accessed frequently, so memory channels and latency will shape how large-scale application services are delivered in the future. This calls for new approaches that increase compute and storage density while reducing power consumption and size, so that more compute and memory chips fit in a limited space.

In the past, electrical I/O used copper wires to interconnect chips. Copper wires are fast and consume little power, but their effective transmission distance is very limited: about 1 meter.

If you build a cluster in the entire data center, you will also face the problem of large cluster footprint, long cables, and high power consumption for long-distance transmission, making it difficult to achieve a balance between high computing power and energy saving. There are many server nodes in a data center, and there is an upper limit on the power supply. In addition to chips in the rack, there are other places such as I/O that consume power, so the power actually allocated to each chip is very limited.

Song Jiqiang said that over the past two or three decades, the share of power needed for I/O in the overall computing process has kept rising. If it continues to grow at the current pace with current technology, I/O will consume all the power supplied to the rack, leaving too little for computation and for reads and writes in the memory chips. New technical solutions are therefore needed to reduce the power used for I/O.

Intel compares traditional electrical I/O to a horse-drawn carriage, with limited transmission speed and distance. Within 100 meters, where higher density and more flexible data transmission are needed, silicon photonics integration is like a lightweight motorcycle: fast, flexible, efficient and energy-saving. Beyond 100 meters, for long-distance transmission, using pluggable optical transceivers is like switching to a car, with larger capacity and higher speed.

Optical I/O and pluggable optical transceivers are both silicon photonics interconnect solutions. They offer low power consumption and are suited to longer-distance transmission.

The pluggable optical transceiver solution is relatively mature and can connect directly to the electronic integrated circuit (EIC) interface to extend the transmission distance. However, it is bulky and usually requires a high-speed serializer/deserializer (SerDes) or digital signal processing (DSP), so it has high power consumption, low bandwidth density and long latency.

With silicon photonics integration technology, optical I/O can achieve multi-Tbps bandwidth with low power consumption, high bandwidth density, low latency and longer reach, meeting the needs of AI scale-out.

Co-packaging an OCI chiplet (or any optical I/O solution) with a CPU, GPU or SoC can improve I/O bandwidth density, overall energy efficiency, latency and cost. It can also make resource utilization more efficient through new architectures that support resource disaggregation, such as HBM or CXL memory pooling.

In the future, Intel will provide different solutions for different transmission distances, including OCI optoelectronic co-packaging and pluggable solutions.
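The distance ranges described in this section can be summarized in a small lookup. The following is only an illustrative sketch: the thresholds (about 1 meter for copper, 100 meters for co-packaged optical I/O) come from the figures quoted above, while the function name and the returned labels are assumptions made for this example, not Intel terminology.

```python
# Reach thresholds as quoted in the article; names below are hypothetical.
COPPER_MAX_M = 1.0        # electrical I/O over copper: about 1 meter
OPTICAL_IO_MAX_M = 100.0  # co-packaged optical I/O: up to about 100 meters


def pick_interconnect(distance_m: float) -> str:
    """Return the interconnect class the article associates with a link length."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    if distance_m <= COPPER_MAX_M:
        return "electrical I/O (copper)"
    if distance_m <= OPTICAL_IO_MAX_M:
        return "co-packaged optical I/O (OCI)"
    return "pluggable optical transceiver"


for d in (0.5, 30, 500):
    print(f"{d} m -> {pick_interconnect(d)}")
```

In practice the choice also depends on power budget, latency and cost, which this sketch deliberately ignores.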

2. Co-packaged with the CPU: how does Intel's OCI chiplet drive energy efficiency?

The Intel OCI chiplet is a complete physical layer optical I/O device that includes a silicon photonic integrated circuit (PIC) with on-chip dense wavelength division multiplexing lasers and semiconductor optical amplifiers, and an EIC for controlling the PIC and connecting to the host.

The EIC is concerned with how specific signals are used and which components they connect to; it acts as a conversion and adaptation layer for the protocol. The PIC is more concerned with stable optical transmission, tuning and sending signals, and sustainable evolution, for example how to convert cleanly between the electrical and optical media.

EIC uses standard CMOS process nodes, while PIC uses Intel's silicon photonics manufacturing process based on 300mm silicon wafers. Usually, EIC uses a relatively advanced process to be close to or aligned with the main chip to be supported, while PIC uses a more mature process.

Since such computing components are not pluggable, they have lower power consumption and can effectively improve the integration of silicon photonic interconnects while increasing bandwidth and extending transmission distances, thereby achieving performance and energy consumption improvements and helping to increase cluster density.

Intel's fully integrated OCI chiplet reaches a bidirectional data rate of 4 Tbps and is compatible with PCIe Gen5. It supports 64 channels of 32 Gbps each in each direction (Song Jiqiang said this is sufficient for current data centers) over a transmission distance of up to 100 meters (because of transmission delay, the distance in practical applications may be limited to a few dozen meters).
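The remark about transmission delay can be made concrete with a rough propagation-delay estimate. The sketch below assumes a group index of about 1.468, a typical textbook value for standard single-mode fiber; that value and the function name are assumptions for illustration, not figures from Intel, and the estimate covers only time of flight in the fiber, not conversion or protocol latency.

```python
C_VACUUM_M_PER_S = 299_792_458  # speed of light in vacuum
GROUP_INDEX = 1.468             # assumed typical value for single-mode fiber


def one_way_delay_ns(length_m: float, group_index: float = GROUP_INDEX) -> float:
    """Time of flight through a fiber of the given length, in nanoseconds."""
    return length_m * group_index / C_VACUUM_M_PER_S * 1e9


for length in (1, 30, 100):
    print(f"{length:>4} m: {one_way_delay_ns(length):6.1f} ns one-way")
```

Under this assumption, light needs roughly 5 ns per meter of fiber, so a 100 meter link adds close to half a microsecond each way, which is one plausible reason latency-sensitive deployments would prefer shorter runs.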

It uses 8 pairs of optical fibers, each carrying 8 wavelengths with dense wavelength division multiplexing (DWDM). Power consumption is only 5 pJ (picojoules) per bit, about one third that of a pluggable optical transceiver module.

According to Song Jiqiang, Intel is confident that improvements in device and package design, manufacturing process and bandwidth scaling will bring energy consumption below 3.5 pJ per bit in subsequent product generations.
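To put the per-bit energy figures in perspective, they can be converted into link power. This is a back-of-the-envelope sketch only: it assumes the per-bit energy applies to the full 4 Tbps bidirectional rate, and the 15 pJ/bit value for a pluggable module is merely implied by the "one third" comparison above rather than stated in the article. The function name is hypothetical.

```python
def io_power_watts(bits_per_second: float, picojoules_per_bit: float) -> float:
    """Power = data rate x energy per bit (1 pJ = 1e-12 J)."""
    return bits_per_second * picojoules_per_bit * 1e-12


BIDIRECTIONAL_BPS = 4e12  # nominal 4 Tbps, as quoted in the article

cases = {
    "OCI today (5 pJ/bit)": 5.0,
    "OCI target (3.5 pJ/bit)": 3.5,
    "pluggable (about 3x OCI, implied)": 15.0,  # assumption derived from the 1/3 ratio
}
for label, pj_per_bit in cases.items():
    print(f"{label}: {io_power_watts(BIDIRECTIONAL_BPS, pj_per_bit):.1f} W")
```

Under these assumptions the three cases work out to roughly 20 W, 14 W and 60 W for the same traffic, which illustrates why per-bit energy matters when the rack power budget is fixed.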

At OFC 2024, Intel gave a live optical link demonstration, showing a transmitter-to-receiver connection between two data center CPU platforms over a single-mode fiber (SMF) patch cord.

The CPUs generate the traffic and measure the bit error rate. The two data center CPUs send and receive data to and from each other, each co-packaged with an OCI chiplet. The OCI chiplet converts the CPU's electrical I/O signals into light and carries them over optical fiber between the two data center nodes or systems.

As shown in the figure, the electrical signals in the hosts on both sides are converted into light by the optical-electrical conversion chip. The demonstration showed the transmitter's optical spectrum, with 8 wavelengths at 200 GHz spacing on a single fiber (1.6 THz in total), along with a 32 Gbps transmitter eye diagram indicating strong signal quality.

In the figure, the colored portions are light, with different colors representing different wavelengths. The wavelengths are spaced far enough apart in frequency that they do not interfere with one another during modulation and demodulation. They can be combined and carried on a single optical fiber; that is, multiple bands "multiplex" one fiber, analogous to frequency-division multiplexing in wireless communications.

Because the bandwidth of light is very large, we can select a relatively stable bandwidth and cut it into many different bands, which appear to be different colors of light to the human eye. In fact, they are bands of different frequencies, and the signal to be transmitted can be stably modulated on each band. After photoelectric modulation, the signal is transmitted through optical fiber.

Song Jiqiang shared the performance roadmap for Intel's OCI chiplets. Technology iteration has three main directions: the number of wavelengths per fiber, the data rate per wavelength, and the number of fibers.

An optical fiber can be divided into different bands for transmission. Currently, 8 bands can be transmitted stably, the data rate modulated onto each band is 32 Gbps, and 8 fiber pairs can be placed together without interfering with one another. Multiplying the three gives a one-way data rate of about 2 Tbps and a two-way data rate of about 4 Tbps.

In the future, if the number of bands stays at 8 and the rate per band is raised to 64 Gbps, the one-way data rate doubles to 4 Tbps. If the technology then moves to 16 bands, the one-way rate rises to 8 Tbps. Bandwidth can keep increasing step by step along this path.
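The scaling arithmetic above is a simple product of three factors, which the following sketch reproduces. The function name and the step labels are illustrative assumptions; the numeric inputs are the ones quoted in the text, and the article's Tbps figures are the rounded results.

```python
def one_way_gbps(wavelengths_per_fiber: int, gbps_per_wavelength: int, fiber_pairs: int) -> int:
    """One-way bandwidth = wavelengths per fiber x rate per wavelength x fiber pairs."""
    return wavelengths_per_fiber * gbps_per_wavelength * fiber_pairs


steps = [
    ("current: 8 wavelengths at 32 Gbps, 8 fiber pairs", 8, 32, 8),
    ("rate per wavelength doubled to 64 Gbps", 8, 64, 8),
    ("16 wavelengths at 64 Gbps", 16, 64, 8),
]
for label, wavelengths, rate, pairs in steps:
    one_way = one_way_gbps(wavelengths, rate, pairs)
    print(f"{label}: {one_way} Gbps one-way, {2 * one_way} Gbps bidirectional")
```

The exact products are 2048, 4096 and 8192 Gbps one-way, which the article rounds to 2, 4 and 8 Tbps. Increasing the fiber count or adding polarization modes would multiply the result further in the same way.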

3. Future integration with GPUs, and multiple differentiating advantages

Compared with discrete and pluggable solutions, co-packaging the OCI chiplet with the CPU requires holistic thermal management and ensuring signal density and signaling frequency at the package level. Intel's current technology is already able to meet these needs.

In the future, OCI chiplets can provide communication and can also be integrated with compute chips such as CPUs, GPUs and IPUs. Through silicon photonics integration and advanced packaging, Intel can build higher-density I/O chiplets and then combine them with other xPUs. Many different types of compute and interconnect chips will be built from such chiplets, and the application prospects are promising.

Song Jiqiang further explained that the remaining challenge in integrating with other types of chips lies not at the technical level but at the implementation level. What needs attention is bandwidth density. For example, when the distance between the optical and electrical interfaces is limited, how can these optical-electrical conversion interfaces be fitted in? Is the achievable bandwidth density sufficient within a given size?

He said that to make OCI chiplets more flexible and reduce integration effort, the usual approach is to use electrical interfaces between the host xPU and the I/O that have been standardized through a robust IP ecosystem, such as UCIe, PCIe and Ethernet.

He also talked about the differentiated advantages of Intel's solution.

First, Intel can mass-produce highly integrated lasers at the wafer level, with higher yield and reliability and lower total cost. Industrial capability only exists once a technology has moved from theory to high-yield production.

Existing external laser solutions require specialized optical fibers, which are costly and have no examples of large-scale deployment. The advantage of on-chip lasers is that they can transmit over ordinary optical fiber. Since no external light source is required, no polarization-maintaining fiber (PMF, a special optical fiber needed to connect an external light source to a passive silicon photonic integrated circuit) is required.

When making a laser transmitter, it is relatively simple to make separate devices. Making a laser on a wafer has technical barriers. It is necessary to be able to bond different types of semiconductors well at the wafer level, and then use semiconductor manufacturing processes to form a control circuit. Optical devices including light sources, modulators, amplifiers, optical waveguides, detectors, etc. must be realized at the wafer level.

Secondly, Intel has a high-volume, field-proven platform with devices that have industry-leading reliability.

Intel's OCI chiplets are built on an in-house, production-proven silicon photonics integration platform. Since 2015, this platform has shipped more than 8 million optical transceiver modules for hyperscale data center connectivity (including over 8 million silicon photonic integrated circuits and more than 32 million integrated lasers), for applications requiring data rates of 100 Gbps, 200 Gbps and 400 Gbps.

Its reliability has been verified on millions of devices. Data shows that the lasers' failures-in-time (FIT) rate is below 0.1, which means fewer than one failure per 10 billion device-hours.
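The relationship between the FIT figure and "10 billion hours" follows from the standard definition of FIT as failures per billion device-hours. The sketch below works through that conversion and, as a purely illustrative extension, estimates expected failures per year if all 32 million lasers mentioned above ran continuously at the 0.1 FIT bound; the continuous-operation assumption and the function names are not from the article.

```python
HOURS_PER_FIT_WINDOW = 1e9  # FIT = failures per 10^9 device-hours
HOURS_PER_YEAR = 8760


def hours_per_failure(fit: float) -> float:
    """Average device-hours accumulated per failure at a given FIT rate."""
    return HOURS_PER_FIT_WINDOW / fit


def expected_failures_per_year(fit: float, devices: int) -> float:
    """Expected failures per year for a fleet running around the clock (assumption)."""
    return fit / HOURS_PER_FIT_WINDOW * HOURS_PER_YEAR * devices


print(f"{hours_per_failure(0.1):.1e} device-hours per failure")
print(f"{expected_failures_per_year(0.1, 32_000_000):.1f} expected failures per year across 32 million lasers")
```

At the 0.1 FIT bound this works out to about 10 billion device-hours per failure, or on the order of a few dozen laser failures per year across the entire shipped base under the stated assumption.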

Furthermore, building the photonic and CMOS circuits on two separate chips (silicon photonic integrated circuit and electronic integrated circuit) ensures scalability and performance optimization, without the compromises and trade-offs required to combine two very different technologies on a single chip.

Intel's experience in advanced packaging, systems and platforms also enables it to optimize optical I/O solutions. Intel is investing in new silicon photonics manufacturing process nodes to achieve leading device performance, higher density, better coupling and better economics, and will continue to improve the performance, cost and reliability of on-chip lasers and optical transceivers.

Conclusion: From technical prototype to commercial solution

Intel Research has been deeply involved in the field of silicon photonics for more than 25 years and is a pioneer and leader in silicon photonic integration. Intel is the first in the industry to develop and deliver silicon photonic connection devices in batches to large cloud service providers, and is working with customers to transform OCI chip technology prototypes into scalable, commercial solutions.

In terms of cost, Intel believes that over time and with higher volumes, the total interconnect cost per bit for optical I/O will be comparable to electrical I/O at the system level. The higher performance of optical I/O will also help improve performance at the system level.

To achieve this goal, Intel is currently developing the second-generation silicon photonics manufacturing process node, which is expected to reduce chip area by more than 40% and power consumption by more than 15%, thereby improving economic benefits and making progress in optical coupling efficiency, laser power, etc.