Far more sincere than OpenAI: 11 highlights from Google's press conference in one article

2024-08-14

Just as people on X were losing their minds over the OpenAI teasers that "Strawberry Brother" kept posting at random, Google arrived with its MadeByGoogle24 press conference.

To upstage Google, OpenAI even published a blog post a few minutes after the press conference started. It was a throwaway post, yet for its sake they had their AI "Strawberry Brother" play the Riddler for two days.

OpenAI is now like the boy who cried wolf. I have no expectations of it and no trust in it anymore. Every time, all I have left are those two curses: XXX, XX!

Google, on the other hand, delivered a few pleasant surprises precisely because I expected nothing.

I have summarized 11 highlights. Once you finish this article, you have effectively watched the press conference.

1. Google is going to rebuild Android based on Gemini.

They defined a term called AI OS, and Google wants to bring AI OS to everyone.

Gemini currently supports 45 languages in more than 200 countries and regions, runs on hundreds of phone models from dozens of device manufacturers, and is supported on billions of devices around the world.

2. Gemini’s image recognition on mobile phones failed.

The first thing they demonstrated was Gemini's image recognition capabilities.

For a multimodal AI, recognizing what the camera captures seems to have become a must-have function. The most challenging part is information-dense content such as photographed manuals and schedules: the model has to recognize the image, understand the text in it, and then give an answer.

The presenter photographed a paper concert poster with the tour schedule printed on it, then asked Gemini to check his calendar and pick a date when he could go to see Sabrina Carpenter's show.

Then came the inevitable live-demo mishap.

Gemini failed on the first two attempts. I could feel the awkwardness from across the Pacific. Twice in a row.

The demo also specifically mentioned that it was running on a Samsung Galaxy S24 Ultra. Could Samsung have had a hand in this?

The presenter hurriedly swapped devices on stage and tried again. Fortunately, the third attempt recognized the image content successfully.

Gemini gave a very specific date: Sabrina will arrive in San Francisco on November 9, 2024, and the presenter had no other plans that day, so he could go to the show.

Applause finally broke out, and you could plainly see how relieved the presenter was.

3. Cross-software interaction is very convenient.

Gemini is now able to understand and analyze video content directly on your phone.

While watching a video, you can call out Gemini to summarize the key points for you or answer your questions about the video content.

For example, after watching a food video on YouTube at night, you don't have to identify the dishes one by one. Gemini will automatically generate a list of the foods that appear in the video and add it to your personal "to try" list.

Good news for foodies.

You can also have it build a list of attractions or itinerary suggestions from travel videos on YouTube.

For people like me who need background music even on a walk, you can also ask Gemini to make a "K-pop playlist perfect for walking in Seoul," and it will recommend suitable music based on the scene, mood, or type of activity you describe.

This makes finding music more intuitive and personal.

4. Writing is very fast and the results are good.

Gemini can also help you write emails on your mobile phone in just a few seconds.

The presenter demonstrated two scenarios. The first was a polite reminder to his landlord, asking her to come and repair the power supply module at home.

The second one was to write an apology letter to the professor for being absent due to illness (it seemed like this guy had done this many times before).

Gemini also has an interaction design that makes it easy to polish the text and send the email.

Seeing Gemini finish the apology letter in just a few seconds, the guy could hardly hold back his laughter.

5. Gemini Live’s real-time conversation effect is pretty good, but it is only low-latency TTS.

Google has launched a real-time conversation feature similar to GPT-4o that can be interrupted at any time, which they call Gemini Live.

There are 10 voices to choose from.

The woman who gave the demonstration chatted with Gemini Live for a long time. The sound quality was good and the latency was low enough, but it looked like low-latency TTS (text-to-speech) rather than a natively multimodal large model like GPT-4o.

The reason: there was no demonstration of any emotional understanding or expression, and knowing Google, if the feature existed they would have shown it off like crazy. Also, in some of the longer answers, you can still clearly feel the delay.

So it is actually a low-latency TTS conversation.

It is currently limited to Gemini Advanced subscribers, at US$20 per month, and is available immediately.

6. Pixel 9 is the first phone to be equipped with multimodal Gemini Nano.

This is the most powerful on-device AI model ever released on a phone, and is three times more powerful than the previous AI used on the Pixel 8 Pro.

The Pixel 9’s processors (TPU and Tensor G4) can generate up to 45 words in one second, which is twice as fast as before.

The regular version of the Pixel 9 has 12GB of memory, while the Pro version has 16GB. And the most exciting thing is that they finally have satellite calling...

All I can say here is that we are far ahead!

The products released this time include three candy-bar phones and a folding phone. The regular series includes a basic version of the Pixel 9 with a 6.3-inch display, a Pixel 9 Pro XL with a 6.8-inch screen, and a new smaller 6.3-inch Pixel 9 Pro.

To be honest, I think it's a bit ugly...

There is also a new foldable, the Pixel 9 Pro Fold.

Even uglier...

7. Call Notes can help you record key information during a call.

Now, Pixel's "Call Assistant" has become more powerful with the addition of the "Call Notes" feature.

After you've made a call, it gives you a completely private summary of the call, so you can easily access the phone number, time, details and other information you don't want to forget, even if you don't have a pen and paper with you during the call.

Moreover, the entire process is run locally, so there is basically no privacy issue.

The presenter gave an example. He was recently thinking about changing his hairstyle, but his barber couldn't do the style he wanted and recommended another barber shop over the phone.

The problem is, he forgot to write down that shop's phone number. With Call Notes, he can easily look it up afterward.

7. The screenshot feature similar to Recall is pretty cool.

There’s a familiar scenario: You see something on your phone that you want to remember, and maybe you write it down in your mind or take a screenshot to save it.

But often, you either forget what you were trying to remember or can’t find it when you need it.

So they built a new app for this.

You can use AI to quickly search all your saved screenshots. For example, if you have dozens of pictures of bicycles on your phone, you can search for "bicycles" and all of them will appear.

You can also ask something more complex, like the price of a T-shirt, and Pixel Screenshots not only finds the original image but also answers in natural language based on the information in it.

8. Pixel Studio, an unremarkable on-device AI image generator.

Every Pixel 9 phone comes with the new Pixel Studio, their first-ever image generator on a phone.

I think the results are merely average, though usable.

For example, the generated beach fire pit at sunset looks very ordinary.

9. The AI camera is great for taking group photos.

The Pixel camera is said to be the first AI camera.

I don't understand most of the specs, but this group-photo scenario is very interesting.

Often we can't get everyone into a group photo, because one friend always has to act as the photographer.

It uses a simple on-screen interface to guide you through taking photos, such as asking you to hand the camera to someone else so that you can swap positions. Then, you can align the people in the new photo based on their outlines in the first photo and take another photo. The resulting image will combine the two photos together, making it look like everyone is in the same photo at the same time.

It’s really awesome and solves a major pain point in taking group photos.

10. New watches and headphones.

They launched the Pixel Watch 3.

And a pair of Pixel Buds Pro 2 earbuds.

The earbuds can wake up Gemini so you can talk to it at any time.

11. Project Astra, which is comparable to GPT-4o.

Project Astra was unveiled at Google's conference a few months ago as a direct rival to GPT-4o, a natively multimodal large model.

In the future, you will also be able to use Astra inside Gemini Live.

For example, you can share your camera while talking to Gemini, so you can directly show a problem you're having on your calculus homework or ask for help with the next step in furniture assembly.

And, you can integrate your most frequently used apps into Gemini Live, so it can help you take actions in conversations and messages, and pull information from apps like Google Calendar.

So you can text a neighbor right from Gemini Live, share details about a business, and check your calendar all at the same time, all without having to open another app.

It's pretty cool, a combination of GPT-4o and Apple.

It's a pity that, for now, it is still just pie in the sky.

On the AI side, the integration with hardware is quite interesting. Gemini Live, at least, is not vaporware and can be used today.

It is still much better than OpenAI, which can only make empty promises.

I hope Google gets better and better and beats OpenAI to death.

>/ Author: Kha'Zix, Wenwen, Xiaorui, Dawn_E