
Gemini is on the phone! Google's Pixel 9 cashes in on the futures issued by Apple and OpenAI

2024-08-14



Author: Jessica

Early this morning, Google held its "Made by Google 2024" press conference at its headquarters in Mountain View, in Silicon Valley. The annual event is usually held in October, but it was reportedly moved up to the summer so the products could meet the public early and get ahead of Apple's new iPhone launch in September.

As rumored, Google unveiled its new generation of Android flagship phones at the event: the Pixel 9, Pixel 9 Pro, Pixel 9 Pro XL and the foldable Pixel 9 Pro Fold, along with the Pixel Buds Pro 2 wireless earbuds and the Pixel Watch 3.

But beyond the new hardware, Google's main focus, and the biggest protagonist of the show, was still the ubiquitous AI.


In the official promotional video, Gemini is asked to write a "breakup letter" to the user's old phone, and a striking "Oh Hi, AI." flashes on screen.

Two months ago, Apple officially announced Apple Intelligence, with the iPhone 16 series coming as the key device that will fully run Apple's AI. OpenAI, meanwhile, has been dropping teasers almost daily, from the rumored "Q*" project to cryptic "Strawberry" emojis, keeping everyone on edge. Google knows it can't wait any longer: it has launched an upgraded Gemini assistant and more than a dozen new AI features on Android, and they are available immediately, not futures.

Rick Osterloh, who leads Google's platforms and devices team, also took a jab at rivals at the start of his speech:

“There’s been so much promise about AI and so much ‘coming soon’ talk about it. Today we’re showing real progress, and you’ll see live demonstrations of new Pixel products, Android features, and AI experiences, with Gemini at the heart of it all — we’re fully entering the Gemini era.”

1

The Gemini assistant gets an upgrade, and Gemini Live is available starting today

Google's on-device AI is driven by its lightweight multimodal model Gemini Nano, now joined by the more flexible Gemini 1.5 Flash. Gemini currently supports 45 languages, covers more than 200 countries and regions, and can run on hundreds of phone models. Users can trigger it to perform tasks through pictures, videos or voice commands.

The Gemini assistant can now work with more apps, including Calendar, Tasks, Google Keep and YouTube Music.

For example, if you happen to see a Sabrina Carpenter concert poster, you can open Gemini, take a photo, and ask "Am I free when she comes to San Francisco this year?" Gemini will extract relevant information from the picture, connect to the calendar, and give an answer.
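For readers who want to see what this image-plus-question pattern looks like in code, a rough equivalent can be put together with Google's public Gemini API. The sketch below is only an illustration of the multimodal flow, not the Pixel's on-device integration; the API key, file name and model choice are placeholders and assumptions.

```python
# Minimal sketch: ask a Gemini model a question about a photo via the public API.
# Placeholder API key and file name; this is not the Pixel's built-in assistant.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

poster = Image.open("concert_poster.jpg")
response = model.generate_content(
    [poster, "Which artist, city and date does this poster advertise? Answer briefly."]
)
print(response.text)
```

Checking "am I free?" on top of that answer is then a tool-calling step against the calendar, a pattern sketched a little further down.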


There was a funny moment here. When the presenter showed this example, it failed twice and only succeeded on the third try, which at least proved that everything was being tested live on stage rather than staged. When Gemini finally answered, the audience applauded, and the presenter let out a sigh of relief: "Thank you, demo god."

Gemini can also understand what is displayed on the screen. When you are drooling over a food-tour vlog, just tell Gemini to "create a list of the food the blogger eats in this video", and it will hook into the YouTube video and pull the required information from the captions, so you can visit the same spots yourself next time.


There are many examples like this: setting timed reminders that sync to Tasks, creating music playlists, drafting personalized emails and sending them with Gmail, and so on. As Gemini coordinates work across more everyday apps, it saves users real time in both work and life.
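Under the hood, this kind of app coordination is essentially tool calling: the model decides when to invoke a function such as "create a reminder" and with what arguments. The sketch below uses the public google-generativeai SDK's automatic function calling to illustrate the idea; create_reminder is a made-up stand-in for a real Tasks or Calendar integration, and the model name is an assumption.

```python
# Sketch of tool calling: the model chooses to call create_reminder() when appropriate.
# create_reminder is a hypothetical stub, not a real Google Tasks integration.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

def create_reminder(title: str, due: str) -> dict:
    """Create a reminder with a title and an ISO-8601 due time."""
    print(f"[stub] reminder set: {title} at {due}")
    return {"status": "ok"}

model = genai.GenerativeModel("gemini-1.5-flash", tools=[create_reminder])
chat = model.start_chat(enable_automatic_function_calling=True)
reply = chat.send_message("Remind me to buy concert tickets on Friday at 9 am.")
print(reply.text)  # e.g. a confirmation that the reminder was created
```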

What makes these experiences smoother and smarter is Gemini Live, the new voice feature Google officially launched.

As a voice rival to GPT-4o, Gemini Live lets users interact with Gemini in the most natural, real-time way. From mock interviews, speaking practice and brainstorming to any other conversational need, Gemini Live aims to feel like chatting with a real person. You can pause, interrupt or change topic at any time during the conversation, and you can choose from 10 voices of different genders and personalities.

What’s even more exciting is that Google is much more straightforward than OpenAI this time.

While GPT-4o's voice mode is still hidden away in a limited trial for a small number of users, Google announced outright that access to the English Android version of Gemini Live opens to all paid subscribers starting today, and will expand to iOS and more languages in the coming weeks.

Say no more, say no more!


2

More than 10 AI updates: automatic call summaries, screenshot search, image editing, real-time translation and more

With the Gemini models at the core, Google has also shipped a large batch of practical AI features unique to Pixel devices.

1. New Weather app: uses AI to improve forecast accuracy. It can precisely predict when rainfall will start and stop and generate a personalized weather report, sparing you from checking each data point one by one.

2. Call Notes: This new "Call Notes" feature will automatically generate a private summary and detailed record of the conversation after the call ends. When you need to record important information such as time and address but don't have a pen and paper, just turn on Call Notes and all text records will be saved in the call log. (To protect privacy, this feature runs completely on the local device, and both parties on the call will receive a notification when it is turned on.)


3. Pixel Screenshots: We are all used to taking screenshots to save information, but digging back through hundreds of screenshots when you need one is a pain. This new app helps you save, organize and find screenshot information easily. Suppose you have a screenshot with the access code for the homestay you are about to stay in, but you can't remember it when you arrive. Open Pixel Screenshots and simply ask; it quickly finds the corresponding screenshot and extracts the text from the image (see the sketch after this list for the general idea).

4. Pixel Studio: a new AI image-creation app driven by an on-device diffusion model running on the Tensor G4 chip plus the cloud-based Imagen 3 text-to-image model. It can generate ideas, adjust styles and create personal stickers from natural-language prompts.
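As flagged in the Pixel Screenshots item above, the core idea there is straightforward: index the text extracted from screenshots, then answer a natural-language question with a similarity search. The toy sketch below is not Google's on-device pipeline (which relies on Gemini Nano); it assumes the OCR text has already been extracted and simply ranks screenshots against a query.

```python
# Toy sketch of screenshot search: rank the OCR text of saved screenshots against a query.
# Not Google's implementation; the screenshot texts here are invented examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

screenshots = {
    "IMG_0412.png": "Welcome! Your door access code is 4921, valid until checkout.",
    "IMG_0398.png": "Boarding pass: gate B22, seat 14A, boarding starts 18:05.",
    "IMG_0377.png": "Guest Wi-Fi: CasaVerde_Guest, password sunflower2024.",
}

vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(screenshots.values())

def find_screenshot(query: str) -> str:
    """Return the filename whose extracted text best matches the query."""
    scores = cosine_similarity(vectorizer.transform([query]), doc_matrix)[0]
    return list(screenshots.keys())[scores.argmax()]

print(find_screenshot("what is the access code for the homestay"))  # IMG_0412.png
```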

Pixel phones also have two new AI-driven features for photography and video:

1. Add Me: This "Add Me" feature is very fun! As the name suggests, it can put you into the photo. During the demonstration, two staff members invited NBA star Jimmy Butler over, opened the camera and swiped to Add Me mode. First, staff member A took a photo with Jimmy; then, guided by an augmented-reality overlay, staff member B stepped into the frame. The result was a clear full-body photo of all three, with no extra photographer required.


2. Magic Editor: Using generative AI technology, users can reimagine and edit photos in the Magic Editor, such as enlarging the frame, moving objects, changing the background, or even selecting a small area and asking to "add a hot air balloon."
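The "select a region and describe what should appear" workflow of Magic Editor is, at heart, prompt-guided inpainting. Google hasn't published Magic Editor's models, but the open-source diffusers library can illustrate the same idea; the checkpoint, file names and GPU requirement below are assumptions made for this sketch.

```python
# Illustrative inpainting (not Magic Editor itself): regenerate a masked region from a prompt.
# Assumes a CUDA GPU and local files "photo.png" and "mask.png"
# (white pixels in the mask mark the area to repaint).
import torch
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting", torch_dtype=torch.float16
).to("cuda")

photo = Image.open("photo.png").convert("RGB")
mask = Image.open("mask.png").convert("RGB")

result = pipe(prompt="a hot air balloon in the sky", image=photo, mask_image=mask).images[0]
result.save("photo_with_balloon.png")
```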


Plus a cute Made You Look feature: parents know how hard it is to get small children to look at the camera, so the Pixel 9 Pro Fold adds a "Made You Look" option that plays funny animated clips on the outer screen to grab a child's attention.


Staying with the camera, Google has made four AI improvements for accessibility and special scenarios:

1. Guided Frame: Designed for people with visual impairments or low vision, it helps users take great photos and selfies through voice guidance. The latest update improves object recognition, smart face filtering in group photos, and focus in complex scenes, and can be enabled directly from the camera settings.

2. Magnifier: This is a unique application for Pixel phones that uses AI to help low-vision users magnify the world around them. New features include searching for specific words in the environment, using picture-in-picture mode to view scene details, selecting the best lens to magnify, and enabling the selfie light feature to use as a mirror.


Using Magnifier to identify menus and airport information signs

3. Dual-screen mode for real-time transcription: The dual-screen mode specifically designed for foldable phones allows users to place their phones on a tabletop so that multiple people can view the real-time transcription of the conversation at the same time. This is very helpful for conversations during meetings or group dinners.


4. Real-time subtitle translation: adds support for seven new languages, including Korean and Chinese, expanding the availability of real-time captions and transcription, which work even without an internet connection.


In addition, the Pixel smartwatch adds features that automatically detect sleep and switch on sleep mode, help users plan runs, track running progress and offer daily running suggestions. It also debuts loss-of-pulse detection, combining the Pixel Watch 3's sensors, AI and signal-processing algorithms to detect loss-of-pulse events caused by cardiac arrest, respiratory failure, overdose and similar emergencies.
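Google hasn't detailed the algorithm behind loss-of-pulse detection, but the basic signal-processing idea can be shown with a toy example: watch a pulse-like sensor signal and flag a sustained window in which the periodic oscillation disappears. Everything below (the synthetic signal, sampling rate and amplitude threshold) is invented purely for illustration.

```python
# Toy illustration only: flag "loss of pulse" when a pulse-like signal flatlines.
# Synthetic data and thresholds are invented; this is not Google's algorithm.
import numpy as np

def pulse_lost(signal: np.ndarray, fs: int, window_s: int = 10, min_swing: float = 0.1) -> bool:
    """True if the last window_s seconds show no pulse-sized oscillation."""
    window = signal[-window_s * fs:]
    return float(window.max() - window.min()) < min_swing

fs = 50                                    # 50 samples per second
t = np.arange(0, 30, 1 / fs)
ppg = 0.5 * np.sin(2 * np.pi * 1.2 * t)    # ~72 bpm pulse-like waveform
ppg[20 * fs:] = 0.01 * np.random.randn(len(t) - 20 * fs)  # pulse stops at t = 20 s

print(pulse_lost(ppg, fs))  # True: the last 10 seconds contain only low-level noise
```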

Finally, Google also revealed several projects that are currently underway and will be released soon:

One is Project Astra, introduced at the I/O conference, which lets Gemini see the surrounding environment through the camera and interact with it. Its first practical application will land in Gemini Live, turning it into a more agent-like, all-round AI assistant.

The other is Gemini Research. This feature is designed to handle more advanced reasoning, planning and memory: it creates multi-step research plans, integrates complex information from across the web, and automatically generates well-structured research reports. It is expected to roll out to advanced users within a few months, which should greatly improve research efficiency and save time and effort.

3

Google launches four Pixel 9 phones, Buds Pro 2 headphones, and a smartwatch to round out its AI hardware lineup

All of the AI functions above, including the upgraded Gemini assistant and the freshly launched AI apps, are built into the new hardware Google announced and are on their way to consumers.

The full range of features and prices are summarized below:

Pixel 9 series phones

Google released four Pixel 9 series phones: Pixel 9, Pixel 9 Pro, Pixel 9 Pro XL and Pixel 9 Pro Fold. All are equipped with the latest Google Tensor G4 chip, supporting various AI performance enhancements.


Image credit: Sam Rutherford / Engadget

• Pixel 9: 6.3-inch Actua display, 12GB memory, 50MP main camera and 48MP ultra-wide-angle camera on the rear, 10.5MP front camera. Starting at $799, available in four colors: Obsidian Black, Porcelain White, Holly Green and Peony Pink.

• Pixel 9 Pro: 6.3-inch Super Actua display, 16GB memory, 42MP front camera, triple rear camera (50MP main camera, 48MP ultra-wide angle and 48MP telephoto). Starting at $999, available in Obsidian Black, Porcelain White, Hazelnut and Rose.

• Pixel 9 Pro XL: a 6.8-inch Super Actua display, starting at $1,099; memory, design and color options are the same as the Pro.

• Pixel 9 Pro Fold: Google's foldable, with its largest phone display ever and its thinnest foldable design to date. 16GB of memory and a camera configuration similar to last year's Pixel Fold. Starting at $1,799.

All four phones offer up to seven years of OS and security updates, enhancing durability and user experience.

Pixel Buds Pro 2 wireless earphones

Google launched a new generation of wireless earbuds with improved sound quality and connectivity. With Pixel Buds Pro 2, users can talk to Gemini hands-free without pulling out their phone.


Pixel Watch 3

Available in two sizes (41mm and 45mm), with a larger screen and deeper Google ecosystem integration, such as Nest camera and doorbell video streaming, Google TV remote control, offline Google Maps and more. The watch also provides AI-driven workout suggestions and up to 24 hours of battery life, extendable to 36 hours with power-saving mode turned on.


Overall, this release from Google feels genuinely substantial.

A few days before the event, Google's official Twitter account answered fans' anticipation with: "We just don't want to hide anymore!"

Today, Google not only brought the new Pixel 9 series hardware, but, more importantly, demonstrated its generative AI technology in real use. From the smarter Gemini assistant to the many AI features that make daily life more convenient, Google clearly wants to answer the "futures sellers" with action: AI should not be just a slogan; it should be woven deeply into everyday life so that users genuinely get a more efficient, smarter experience.

Launching ahead of Apple's fall event not only wins Gemini more news cycles, it also buys the Gemini assistant more time to improve. How it actually performs will depend on user feedback once the devices hit the market.

With OpenAI's recent talent departures and dented reputation, Google may really be ready to strengthen its Android lineup across the board and fight a head-on comeback battle against Apple.