
Intel executives: AI models will gradually shift from the cloud to the edge

2024-07-26


· Cloud processing adds latency, transmitting data to the cloud is costly, and keeping data there raises security concerns. Intel Senior Vice President Sachin Katti said that AI is pushing out to the edge, and large models may gradually move from the cloud to the edge.


Sachin Katti, senior vice president and general manager of Intel's Network and Edge Group.

"We expect AI to be deployed and applied more at the edge to process local data. Over time, AI models may gradually shift from the cloud to the edge." On July 24, at the 2024 Intel Network and Edge Computing Industry Conference, Intel Senior Vice President and General Manager of the Network and Edge Business Unit Sachin Katti said that current AI mainly runs in the cloud. As edge devices generate large amounts of data locally, the cost of transmitting all data to the cloud is quite high. The evolution to edge computing is the general trend.

Data security and real-time performance drive AI from the cloud to the edge

Sachin Katti said that humanity is now in the AI-assisted era, in which AI helps people work more efficiently. After that will come the AI-assistant era. "When you drive past a fast-food restaurant, an AI agent can take your order, and corporate workflows can likewise be completed by AI. Further into the future, we may find agents interacting with one another, much as people collaborate, to deliver department-level solutions."

Sachin Katti said that today's AI growth is concentrated mainly in the cloud, but the evolution toward edge computing is inevitable. "In the past, the AI we talked about at the edge was basically machine vision or time-series-based automation. Edge AI has now progressed from machine vision to edge applications such as large language models and generative AI. Intel will continue to provide the capabilities to accelerate the deployment of generative AI and large language models at the edge."
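For a concrete sense of what "deployment at the edge" looks like, here is a minimal inference sketch using Intel's OpenVINO toolkit. The article names no specific toolkit, so this choice, the file names, and the input shape are all illustrative assumptions:

```python
# A minimal edge-inference sketch, assuming OpenVINO as the runtime.
# "model.xml" and the input shape are placeholders, not from the article.
import numpy as np
import openvino as ov

core = ov.Core()

# Load a model already converted to OpenVINO's IR format
# ("model.xml" / "model.bin" are hypothetical file names).
model = core.read_model("model.xml")

# Compile for whatever accelerator the edge node has;
# "CPU" could also be "GPU" or "NPU" depending on the hardware.
compiled = core.compile_model(model, device_name="CPU")

# Dummy input standing in for a locally captured camera frame --
# the data is processed on the device and never leaves it.
frame = np.random.rand(1, 3, 224, 224).astype(np.float32)
result = compiled(frame)
```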

Beyond the cost of transmitting edge data, data security and real-time performance are important forces pushing AI from the cloud to the edge. Guo Wei, vice president of Intel's Marketing Group and general manager of the Intel China Network, Edge and Channel Data Center Division, said that on one hand enterprises are wary of putting their data in the cloud, and on the other hand edge computing helps meet real-time requirements.

"This year, more than half of our customers are exploring solutions based on large edge models," said Chen Wei, vice president of Intel and general manager of Intel's Network and Edge Business Unit in China. From the perspective of edge computing, the model size is not the bigger the better, but should be suitable for the actual needs of market application scenarios. "The deployment of edge computing needs to consider many factors, such as latency, practicability, adjustable optimization of micro data, and information security."

Edge tuning is limited by data volume

"The characteristic of edge is fragmentation." Zhang Yu, CTO of Intel China Network and Edge Business Unit and Intel Senior Chief AI Engineer, said that different users have different requirements for computing power and performance. A common challenge of edge tuning is the limitation of data volume. The amount of data that a school or a factory can actually use for training is very small. The data of different companies and schools are also different. Auto parts production factories and machining factories encounter different problems. A unified model cannot be used to detect product defects. The model must be trained with company-specific data.

At the same time, Zhang Yu said, "Training requires annotation so that the machine knows what you care about. On a production line, the people actually operating the AI-enabled equipment are the workers; how could they spare the energy to annotate data while production is running?" Edge tuning therefore has to rely on automated annotation methods to complete labeling when data is scarce. "At the edge, end users want a business deployment rather than a technical solution. What they ask for is convenience: easy to deploy and easy to manage once deployed. That is often the customers' pain point."
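One common form of automated annotation is pseudo-labeling, sketched below under my own assumptions (the article does not say which method Intel uses): an existing model labels the unlabeled images itself, and only its high-confidence predictions are kept as training data, so no line worker has to annotate by hand.

```python
# A pseudo-labeling sketch; the threshold and function name are
# illustrative assumptions, not anything the article specifies.
import torch

@torch.no_grad()
def pseudo_label(model, unlabeled_images, threshold=0.9):
    """Return (image, label) pairs the model is confident about."""
    model.eval()
    labeled = []
    for image in unlabeled_images:
        probs = torch.softmax(model(image.unsqueeze(0)), dim=1)
        confidence, label = probs.max(dim=1)
        # Keep only high-confidence predictions as training labels;
        # low-confidence samples would be set aside for human review.
        if confidence.item() >= threshold:
            labeled.append((image, label.item()))
    return labeled
```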

Guo Wei said that model training alone is still not enough to solve real industry problems, and the demand for stronger inference capability has been especially evident this year. Putting large models into production inevitably involves balancing computing power across device, edge, and cloud. "For a standard vertical large-model application, the model is deployed mainly in the cloud. But the needs of industry deployment will inevitably push AI computing power out to the edge and the device side."

Sachin Katti said that the main workloads at the edge are inference and continuous learning. Sometimes a model deployed at the edge turns out not to perform as expected, or after running for a period of time the original model needs to be fine-tuned.
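That deploy, monitor, fine-tune loop can be sketched simply. The monitor below is an illustration of the idea, not anything the article specifies; the window size and accuracy threshold are arbitrary assumptions. It tracks rolling accuracy over spot-checked predictions and flags the model for fine-tuning when performance drifts below a target.

```python
# A drift-monitoring sketch, assuming spot-checked ground truth is
# occasionally available at the edge. All thresholds are illustrative.
from collections import deque

class DriftMonitor:
    def __init__(self, window=500, min_accuracy=0.95):
        self.results = deque(maxlen=window)   # rolling record of correctness
        self.min_accuracy = min_accuracy

    def record(self, prediction, ground_truth):
        self.results.append(prediction == ground_truth)

    def needs_finetuning(self):
        # Only judge once the window holds enough samples.
        if len(self.results) < self.results.maxlen:
            return False
        return sum(self.results) / len(self.results) < self.min_accuracy
```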

How much computing power is needed at the edge? Sachin Katti said that computing power and power consumption are positively correlated. An edge deployment draws about 200 watts and a cloud deployment 1-2 kilowatts, while a single rack in a data center draws up to 100 kilowatts; summed over an entire data center, consumption can reach 50-100 megawatts. At high levels of compute or power draw, cooling efficiency and cooling capacity become key variables, since large-scale data and computing generate a great deal of heat. "We currently use liquid cooling technology to cool the cluster effectively. Existing liquid cooling can already handle a 100-kilowatt cluster and is expected to scale to 300 kilowatts in the future. So there is another very important factor that limits computing power: whether you have enough capacity to cool the overall environment effectively."
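The quoted figures imply the following back-of-envelope arithmetic. The rack count is my own illustrative assumption; the article gives only the per-tier numbers.

```python
# Back-of-envelope power arithmetic from the figures quoted above.
edge_device_w = 200        # ~200 W per edge deployment
cloud_node_w = 1_500       # 1-2 kW per cloud deployment (midpoint assumed)
rack_w = 100_000           # up to 100 kW per data-center rack

# Hypothetical data-center size -- the article states no rack count.
racks_per_datacenter = 750

datacenter_w = rack_w * racks_per_datacenter
print(f"One rack draws as much as {rack_w // edge_device_w} edge devices")  # 500
print(f"One rack draws as much as {rack_w // cloud_node_w} cloud nodes")    # 66
print(f"Whole data center: {datacenter_w / 1e6:.0f} MW")  # 75 MW, within the 50-100 MW range
```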