
Amazon acquires a chip company

2024-08-23


Amazon has reached an agreement to acquire Perceive, a chipmaker and AI model compression company based in San Jose, California, and a subsidiary of publicly traded Xperi, for $80 million in cash. Perceive develops neural network inference solutions focused on delivering large AI models on edge devices.

Amazon did not reveal specific plans for the technology. An Amazon spokesperson said: "We are excited to have signed an agreement to acquire Perceive and bring its talented team to join our efforts to bring large language models and multimodal experiences to devices able to run at the edge."

Xperi has been seeking a buyer for Perceive since the beginning of this year. Most of Perceive's 44 employees are expected to join Amazon after the deal is completed. Amazon said it does not expect the transaction to require regulatory approval and called it a conventional acquisition.

An introduction to Perceive's chips

Perceive is led by co-CEOs Murali Dharan and Steve Teig. Its employees are distributed worldwide, and it operates a laboratory in Idaho. Teig drove the creation of Perceive during his tenure as CTO of Xperi, where he oversaw technology development, including core audio and imaging innovations, and led the company's machine learning team. Dharan previously oversaw the strategic direction, management, and growth of Xperi's licensing business, and now leads Perceive's business operations, including sales, marketing, customer success, and operations.

Perceive offers products for serving large AI models on edge devices. Its flagship product is the Ergo AI processor, which can run data center-grade neural networks in a variety of environments, even under tight power constraints.

Ergo is an AI processor that brings breakthrough performance and energy efficiency to edge devices. It can run large neural networks at full frame rate and supports a range of network architectures and types, including standard CNNs, RNNs, and LSTMs. Ergo is flexible and powerful enough to handle a wide variety of machine learning tasks, from object classification and detection, to image segmentation and pose estimation, to audio signal processing and language. It can even multitask, since Ergo can run multiple networks at once.

Despite its processing power, Ergo requires no external DRAM, and its small 7 mm x 7 mm package makes it ideal for compact devices such as cameras, laptops, or AR/VR glasses.
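
Perceive's pitch rests on model compression: shrinking networks until their weights fit in on-chip memory, which is what lets Ergo avoid external DRAM. The article does not describe Perceive's proprietary compression method; the sketch below only illustrates the general idea with post-training dynamic quantization in PyTorch, using a toy stand-in network.

```python
# A minimal sketch of one common model-compression technique
# (post-training dynamic quantization) -- NOT Perceive's actual method.
import io

import torch
import torch.nn as nn

# Toy stand-in network; a real edge workload would be a CNN/RNN/LSTM
# like the ones Ergo targets.
model = nn.Sequential(
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Replace fp32 Linear weights with int8 weights (roughly 4x smaller);
# activations are still computed in floating point at runtime.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

def serialized_bytes(m: nn.Module) -> int:
    """Size of the serialized weights, a rough proxy for on-chip footprint."""
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.getbuffer().nbytes

print(f"fp32 weights: {serialized_bytes(model):,} bytes")
print(f"int8 weights: {serialized_bytes(quantized):,} bytes")
```

Shrinking weights this way is what makes "no external DRAM" plausible in a compact package, at some accuracy cost that compression methods try to minimize.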

Perceive says Ergo is 20 to 100 times more energy efficient than competing products, drawing just 9 mW while running inference on 30 fps video. That means devices can deliver much longer battery life and generate less heat, enabling smaller, more capable designs.
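
The 9 mW figure translates into an energy budget per inference with simple arithmetic; the battery capacity in the sketch below is an assumed example for context, not a number from the article.

```python
# Back-of-the-envelope check on the quoted efficiency claim.
power_w = 9e-3   # 9 mW while running inference on 30 fps video (quoted above)
fps = 30         # one inference per frame

energy_per_inference_j = power_w / fps
print(f"{energy_per_inference_j * 1e3:.1f} mJ per inference")  # 0.3 mJ

# Assumed example: a small 5 Wh wearable battery holds 5 * 3600 J.
battery_j = 5 * 3600
print(f"~{battery_j / energy_per_inference_j:,.0f} inferences per charge")
```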

In early 2023, Perceive launched its Ergo 2 AI processor. According to the company, the chip delivers the performance required for more complex use cases, including those involving transformer models, larger neural networks, multiple networks running simultaneously, and multimodal inputs, while maintaining industry-leading power efficiency.

“With the new Ergo 2 processor, we’ve expanded our capabilities to provide device makers with a path to build their most ambitious products,” said Steve Teig, Perceive founder and CEO, outlining the market opportunity for the latest Ergo chips. “These include transformer models for language or vision processing, higher frame rate video processing, and even combining multiple large neural networks in a single application.”

Ergo 2 runs four times faster than Perceive’s first-generation Ergo chip and has significantly more processing power than typical chips designed for tinyML. Product developers can now leverage advanced neural networks such as YOLOv5, RoBERTa, GANs, and U-Nets to deliver fast, accurate results. All Ergo 2 processing is done on-chip, eliminating the need for external memory and improving energy efficiency, privacy, and security. The Ergo 2 chip enables:

Running MobileNet V2 at 1,106 inferences per second

Running ResNet-50 at 979 inferences per second

Running YOLOv5-S at 115 inferences per second
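
These throughput figures are easier to compare once converted to per-inference latency and real-time headroom; the sketch below does that arithmetic, assuming one inference per video frame and no batching.

```python
# Convert the quoted Ergo 2 throughput numbers into per-inference latency
# and the number of 30 fps streams one chip could in principle keep up with.
throughput_ips = {
    "MobileNet V2": 1106,
    "ResNet-50": 979,
    "YOLOv5-S": 115,
}

for net, ips in throughput_ips.items():
    latency_ms = 1000 / ips   # milliseconds per inference
    streams = ips / 30        # concurrent 30 fps streams (idealized)
    print(f"{net}: {latency_ms:.2f} ms/inference, ~{streams:.1f} streams at 30 fps")
```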

To provide the performance enhancements required to run these large networks, the Ergo 2 chip uses a pipelined architecture and unified memory design, which increases its flexibility and overall operational efficiency. As a result, Ergo 2 can support higher-resolution sensors and a wider range of applications, including:

Language processing applications such as speech-to-text and sentence completion

Audio applications such as acoustic echo cancellation and richer audio event detection

Demanding video processing tasks such as video super-resolution and pose detection

The Ergo 2 processor measures 7 mm x 7 mm, is manufactured by GlobalFoundries on its 22FDX platform, and can operate without external DRAM. Its low power consumption also means it requires no cooling. The chip can run multiple heterogeneous networks simultaneously, providing intelligent video and audio capabilities for devices such as enterprise-class cameras for security, access control, thermal imaging, or retail video analytics; for industrial use cases such as visual inspection; and for consumer products such as laptops, tablets, and advanced wearables.

AWS's history of self-developed chips

AWS has been building its own custom silicon for AI workloads and cloud optimization for years, thanks largely to its acquisition of Annapurna Labs more than a decade ago. That acquisition has allowed AWS to build its Graviton processors, Inferentia chips, and Trainium machine learning processors for training AI models in the cloud.

This year, Amazon took a major step forward in advancing artificial intelligence (AI) technology.

At its annual re:Invent conference, AWS unveiled two new custom chips: Trainium2 and Graviton4. The two chips represent a bold effort by Amazon Web Services to meet growing demand for AI capabilities, especially as the market faces a severe shortage of the high-performance graphics processing units (GPUs) produced primarily by Nvidia.

The need for increased computing power stems from the growing popularity of generative AI, which requires powerful infrastructure to train and deploy models. Nvidia’s GPUs are reportedly sold out until 2024, and industry sources, including TSMC’s CEO, predict that the supply crunch could last until 2025. With this in mind, Amazon’s new chips aim to reduce its reliance on Nvidia by offering alternatives tailored specifically for AI workloads.

The Trainium2 chip is designed for training large-scale AI models, offering four times the performance and twice the energy efficiency of the previous generation. According to Amazon, the chip can deliver 65 exaflops of compute when deployed in cloud clusters of up to 100,000 units. That capability can cut the time to train complex models (such as those with hundreds of billions of parameters) from months to weeks. These advances position Trainium2 as a leading option for AI training infrastructure.
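
The headline numbers imply a per-chip figure worth spelling out; the sketch below does the division. The 16-week baseline is an assumed example to illustrate the "months to weeks" claim, not a figure from Amazon.

```python
# Rough arithmetic implied by the Trainium2 claims quoted above.
cluster_flops = 65e18   # 65 exaflops across a full cluster (quoted)
chips = 100_000         # cluster size (quoted)

per_chip = cluster_flops / chips
print(f"~{per_chip / 1e12:.0f} TFLOPS per chip")  # ~650 TFLOPS

# "Months to weeks": with 4x the performance of the previous generation,
# an (assumed) 16-week training run shrinks to ~4 weeks, ignoring
# real-world scaling losses.
baseline_weeks = 16
print(f"{baseline_weeks / 4:.0f} weeks at 4x performance")
```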

Recognized for its potential, the Trainium2 chip has already attracted interest from several industry players, including Anthropic, a company focused on building user-friendly AI models. Co-founder Tom Brown stressed that Trainium2 will enable Anthropic to scale quickly, with processing speeds four times faster than previous models. The collaboration between AWS and companies like Anthropic illustrates the growing trend of using proprietary cloud technologies to streamline AI operations.

Meanwhile, the Graviton4 chip is Amazon's most powerful and efficient processor to date, tailored for a variety of cloud workloads. Compared with the previous-generation Graviton3, the fourth-generation chip is expected to deliver 30% better performance, 50% more cores, and 75% more memory bandwidth. These improvements let users cut operating costs and speed up data processing, making the chip well suited to enterprises running high-performance databases and intensive analytics applications.
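
Treating the quoted gains as whole-chip totals, one derived ratio is worth noting: memory bandwidth grows faster than core count, so bandwidth per core also improves. A quick check under that assumption:

```python
# Derived ratio from the quoted Graviton4-vs-Graviton3 figures,
# assuming the percentages are whole-chip totals.
cores = 1.50       # +50% cores
bandwidth = 1.75   # +75% memory bandwidth

print(f"memory bandwidth per core: {bandwidth / cores:.2f}x")  # ~1.17x
```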

Early adopters of Graviton4 include companies such as Datadog, Epic Games, and SAP. For example, Roman Visintine, chief cloud engineer at Epic, noted that the chip excels in latency-sensitive workloads, especially online gaming experiences. This kind of cloud service optimization matters in a competitive environment where fast data access and processing are critical to success.

Amazon's announcement also highlights a larger trend in the tech industry, where companies are increasingly investing in custom chip solutions to meet specific computing needs, especially for artificial intelligence and machine learning tasks. By developing proprietary hardware, Amazon hopes to stand out and reduce its reliance on established chipmakers such as Nvidia and AMD.

As artificial intelligence technology continues to advance and spread across fields from healthcare to entertainment, the need for efficient, high-performance chips will only grow. Technology analysts expect the new Amazon chips not only to meet immediate demand but also to lay the foundation for future AI development.

Notably, the launch of these chips comes at a strategic moment: Microsoft has also announced its own chip development for AI and cloud services. This has sparked fierce competition in the AI hardware space, pushing companies to innovate quickly.

AWS Trainium2 and Graviton4 chips are expected to reach customers soon, with Trainium2 arriving sometime next year and Graviton4 already in preview. As the tech industry continues to shift toward cloud computing and AI-driven solutions, Amazon is expected to play a major role in this digital transformation.

AI chips have great potential

AWS's sustained focus on chips not only serves the company's own business needs; it also underscores, once again, the potential of AI chips, both in the cloud and at the edge.

According to Futurum Intelligence, Nvidia held 92% of the AI GPU market and 75% of the overall data center AI semiconductor market in 2023. That dominance is set to continue in an already massive market, which is expected to grow by nearly half in 2024.

The analyst firm estimates that the total market for AI processors and accelerators in data centers will reach $56.3 billion in 2024, up 49.3% from $37.7 billion in 2023. It predicts the market will grow at a 29.7% compound annual growth rate over the next five years, reaching $98.4 billion in 2026 and $138.3 billion in 2028.
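
The headline 2023-to-2028 figures are internally consistent, which a short compound-growth check confirms:

```python
# Sanity check: does a 29.7% CAGR connect the quoted 2023 and 2028 figures?
base_2023 = 37.7     # $B (quoted above)
target_2028 = 138.3  # $B (quoted above)

implied_cagr = (target_2028 / base_2023) ** (1 / 5) - 1
print(f"implied CAGR 2023-2028: {implied_cagr:.1%}")          # ~29.7%
print(f"2028 at 29.7% CAGR: ${base_2023 * 1.297 ** 5:.1f}B")  # ~$138.4B
```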

Futurum divides the AI data center processor market into four categories: CPUs, GPUs, specialized accelerators (called XPUs), and proprietary cloud accelerators produced by companies such as Google, AWS, and Microsoft.

In 2023, CPUs accounted for 20.5% of the market, GPUs for 73.5%, and XPUs and cloud-specific products for about 3% each.

1. CPUs accounted for 20% of data center AI processing in 2023 and will continue to play a significant role. Futurum estimates they will grow at a five-year CAGR of 28%, from $7.7 billion in 2023 to $26 billion in 2028, and puts Nvidia's 2023 share at 37%, followed by Intel at 23%.

2. GPUs accounted for 74% of the chips used in data center AI applications in 2023 and will grow at a five-year CAGR of 30%, from $28 billion in 2023 to $102 billion in 2028. Futurum estimates Nvidia holds a 92% share of the AI GPU market.

3. XPUs will grow at a five-year CAGR of 31%, from $1 billion in 2023 to $3.7 billion in 2028.

4. Public cloud AI accelerators will grow at a five-year CAGR of 35%, from $1.3 billion in 2023 to $6 billion in 2028.

Futurum excluded AI processors and accelerators from this study if they were not available to the public in data centers, so AI chipsets designed and used by Meta, Tesla, and Apple were not included.

Geographically, North America dominated the market with a 55% share in 2023. Europe, the Middle East, and Africa (EMEA) and Asia Pacific (APAC) follow as significant markets, while Latin America (LATAM) remains a developing region with strong growth potential.

Visual and audio analysis were the largest use cases in 2023. Futurum predicts the top three use cases in 2028 will be visual and audio analysis; simulation and modeling; and text generation, analysis, and summarization.

In edge AI specifically, new research from Omdia predicts the edge AI processor market will generate $60.2 billion in revenue by 2028, growing at an 11% compound annual growth rate.

Omdia's latest edge processor forecast states that as industries and devices adopt artificial intelligence, rising demand for hardware is driving revenue growth. One area fueling market growth is the personal computer sector, with expanding product offerings from major suppliers such as Intel, AMD, and Apple. PC vendors are reportedly marketing the inclusion of AI processors in their devices as a "unique selling point."

In addition to the PC sector, the report also highlights the rapid adoption of AI processors in areas such as automobiles, drones, security cameras and robots.

Seen against this backdrop, the intent behind the AWS acquisition is clear.