2024-08-14
Machine Heart Report
Editor: Zenan, Chen Chen
Gemini Live competes with GPT-4o, and Google’s entire AI mobile phone family is launched.
Before GPT-4o entered the iPhone, Google Gemini took the lead in completing the mobile version.
Early Wednesday morning, while people were still waiting for OpenAI's "Strawberry" large model, Google officially released Gemini Live and a series of Pixel hardware products at the Made by Google event.
At today's event, Google confidently conducted a 100% live demonstration, although there were some minor issues.
The presenter asked the phone to recognize an image twice (on a Samsung phone, no less), and it failed both times.
But as Google says, we have entered the "Age of Gemini".
The Gemini AI features released today will arrive first on the Pixel 9 series and will later come to other Android phones running Android 15.
Gemini Live: Benchmarking GPT-4o, now live
Gemini Live is Google's response to OpenAI's advanced voice mode. The feature is almost identical to the ChatGPT one, which has been in alpha testing.
Gemini Live provides a mobile conversation experience that allows users to have a free-flowing conversation with Gemini, and even interrupt or change the topic like on a regular phone call, without typing.
Google describes it in a blog post: You can talk to Gemini Live (through the Gemini app) and choose from 10 new natural voices to respond (OpenAI only provides 3 voices). You can even speak at your own pace or interrupt it in the middle of an answer to ask additional questions, just like in a normal conversation.
Video link: https://mp.weixin.qq.com/s?__biz=MzA3MzI4MjgzMw==&mid=2650930230&idx=2&sn=822b96951da8ef70408c0c546c6c5ae5&chksm=84e43848b393b15e320f663d6c311ccab54157b0885da6dee24ce8e5260beed4153dfb2a432a&token=2010422951&lang=zh_CN#rd
Gemini Live can also be used hands-free. You can keep talking with Gemini while the app runs in the background or the phone is locked, and the conversation can be paused and resumed at any time.
Gemini Live will also integrate with the functionality of various Android apps, such as Keep, to increase the usability of Gemini.
Starting today, Google is rolling out the feature, in English only, to Gemini Advanced users on Android phones, and will expand it to iOS and more languages in the coming weeks.
However, during the live demonstration, Gemini Live failed twice when asked about a concert poster, and the presenter had to switch phones before it worked properly. Despite these hiccups, the demonstration succeeded in the end: Gemini Live extracted the relevant information from the picture and checked the calendar to give the user an accurate answer.
It is worth noting that, according to product manager Leland Rechis, Google does not allow Gemini Live to imitate any voice other than these 10. Google may be doing this to avoid conflicts with copyright law. Previously, OpenAI ran into a dispute with actress Scarlett Johansson, who plays Black Widow, over a ChatGPT voice that she said sounded like her own.
Overall, the feature seems like a great way to delve deeper into a topic in a more natural way than using a simple Google search. Google notes that Gemini Live is a step forward for Project Astra, the multimodal AI model that the company debuted during Google I/O. Currently, Gemini Live only supports voice conversations, and Google hopes to add real-time video understanding capabilities in the future.
With the support of chips, Google's hardware family is here
At the same time as the launch of Gemini Live, Google also launched a new generation of smart hardware devices, earlier than Apple and Huawei.
The newly released hardware this morning includes Pixel 9, Pixel 9 Pro and Pixel 9 Pro XL, as well as a foldable phone Pixel 9 Pro Fold. They are all powered by the new Google Tensor G4 chip, which can bring various generative AI capabilities.
The Pixel 9 phones feature a new look that puts the camera front and center, revamping the iconic camera module for a better feel in the hand. Google claims the phones are twice as durable as the Pixel 8.
For the first time, the Pixel Pro models are available in two sizes: Pixel 9 Pro (6.3 inches) and Pixel 9 Pro XL (6.8 inches), both equipped with Super Actua displays and 42 MP front cameras. Apart from display size, charging speed and battery, the Pixel 9 Pro and Pixel 9 Pro XL have the same specifications and features.
It is worth noting that the Pixel 9 phone uses Google's new custom chip Tensor G4. This is a new generation of high-performance mobile phone chips designed to improve daily use cases, such as opening applications faster, browsing the web, and so on.
Tensor G4 was designed together with Google DeepMind, is manufactured by Samsung, and uses the Arm architecture. It has been optimized to run the most advanced AI models, and it will be the first processor to run the multimodal Gemini Nano model, which means a large model running entirely on the phone can understand text, images and audio.
From what we know so far, the Tensor G4 relies on current-generation cores, which means it will start to look like a previous-generation chipset as soon as September. Its unchanged GPU core, a Mali part, also means there is no support for ray tracing (the GPU line that supports it is called Immortalis). Still, measured against its own predecessor, the performance improvement is considerable.
Of course, as a chip developed with DeepMind's involvement, Tensor G4 offers strong AI computing power. Google revealed that it has an "industry-leading" output speed of 45 tokens per second.
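To put the 45 tokens per second figure in perspective, here is a minimal back-of-the-envelope sketch. It assumes a constant output rate and ignores the time spent processing the prompt. The function name and the response lengths are hypothetical and chosen only for illustration.

```python
def generation_time_seconds(num_tokens: int, tokens_per_second: float = 45.0) -> float:
    """Time needed to emit num_tokens at a steady output rate.

    The default of 45 tokens/s is the figure quoted by Google; treating it as a
    constant rate is a simplifying assumption.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# Hypothetical response lengths, for illustration only.
for n in (45, 90, 450):
    print(f"{n} tokens -> {generation_time_seconds(n):.1f} s")
```

Under these assumptions, a 90-token reply would take about 2 seconds and a 450-token reply about 10 seconds.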
To ensure that the on-device AI experience runs smoothly, Google has also upgraded the memory of the Pixel 9 series: the Pixel 9 comes with 12GB of RAM, while the Pixel 9 Pro and Pixel 9 Pro XL come with 16GB.
The Pixel series has always been the showcase for Google's technology. The new phones are integrated with Gemini Live and will go on sale in August. Google said that buyers of the Pixel 9 Pro, Pixel 9 Pro XL and Pixel 9 Pro Fold will receive a one-year Gemini Advanced subscription. Compared with the iPhone 16 series, which is expected to integrate OpenAI's large model, Google appears to have moved faster this time.
Google introduced a range of generative AI capabilities for the Pixel.
Among them, Pixel Studio can help you turn ideas into images on your phone. It combines an on-device diffusion model running on Tensor G4 with the Imagen 3 text-to-image model in the cloud.
Google's new image generation model Imagen 3 was first announced at the I/O conference in May. The model has been improved in detail, lighting and the reduction of distracting artifacts, and its ability to understand prompts has been significantly enhanced. Alongside today's event, Google DeepMind submitted a paper on Imagen 3 to arXiv.
Pixel Screenshots helps you save, organize, and recall important information you want to remember for later.
Let's say you have a friend who loves squirrels and her birthday is coming up. You can look for gifts on Google Chrome and take screenshots of squirrel shirts, squirrel coasters, and all things squirrel-related. Pixel Screenshots will analyze the content of all these images and help you search for this information in the app. Then, you just open the app and search for "squirrel" and these results will pop up. It will also include links to all the content you found, as well as summaries and related information about the content you are viewing.
One of the most common things people do on their phones is check the weather. Pixel Weather can provide more accurate weather information, and Gemini Nano will also generate a custom AI weather report to let people know the weather conditions of the day.
Photography is a core strength of every modern phone, and the Pixel 9 adds AI shooting features that make it quicker to get a usable shot.
Often, a designated photographer is left out of group photos. With Add Me, you can take photos with everyone present without having to carry a tripod or ask strangers for help.
With the redesigned Panorama, you can now take detail-rich photos even in low light. It’s the highest quality low-light panorama on any smartphone.
In addition, the Magic Editor in Google Photos has new editing features that help you get the photo you want. The automatic framing function can recompose a shot, and you can simply type what you want to see (for example, adding wildflowers to an open field) to turn your ideas into reality.
Smart call recording powered by large models is now built into Android. Clear Calling further improves audio quality, and the new Call Notes feature sends a private summary and a full transcript of the call as soon as the user hangs up. So when you get a call back, there's no need to scramble for a pen and paper to take notes. To protect privacy, call recording runs entirely on the device.
The latest Pixel 9 devices are the first Android phones to come with a new Satellite SOS feature, so users can contact emergency responders and share their location via satellite, even without a mobile network. Satellite SOS will be available first on Pixel 9 devices in the United States, regardless of your carrier plan. This feature will be available for free for the first two years on a Pixel.
Finally, pricing: Pixel 9, Pixel 9 Pro and Pixel 9 Pro XL are all available for pre-order, starting at $799, $999 and $1099 respectively. Pixel 9 and Pixel 9 Pro XL will be available on August 22 from the Google Store and Google retail partners. Pixel 9 Pro will be available in the United States on September 4, while Pixel 9 Pro Fold will also be available in other markets in the coming weeks.
References:
https://blog.google/products/pixel/google-pixel-9-pro-xl/
https://www.androidauthority.com/google-tensor-g4-explained-3466184/