
Holographic AR glasses are here! Zuckerberg unboxed them, Jensen Huang was among the first to try them on, and the Llama 3.2 large model was released

2024-09-26


Author | Cheng Qian, ZeR0

Editor | Xinyuan

Zhidongxi reported on September 26 that at 1:15 a.m. Beijing time today, Meta Connect 2024, the annual conference known as the "Spring Festival Gala of the MR world," officially kicked off.

Meta CEO Mark Zuckerberg took the stage in a black T-shirt and released a slate of new products: the Quest 3S headset, the Llama 3.2 large model, Ray-Ban smart glasses, and holographic AR glasses.

The first new hardware product Zuckerberg unveiled was the Quest 3S, at a striking price of $299.99 (equivalent to approximately RMB 2,110).

Although its performance is slightly inferior to that of the Quest 3, Meta's first consumer-grade MR all-in-one headset, its starting price is $200 lower, roughly 1/11 the price of the Apple Vision Pro, making it an outstanding value!

There is also a new large model: Meta released the multimodal Llama 3.2, comprising 90B- and 11B-parameter visual large language models and 1B- and 3B-parameter lightweight plain-text models.

With Llama 3.2, Meta AI introduced new multimodal features and now supports voice interaction with multiple voice options (including some celebrity voices). Zuckerberg demonstrated a live voice chat with Meta AI, which went very smoothly.

There are also what Meta calls the most advanced AR glasses ever: the holographic AR glasses "Orion."

NVIDIA founder and CEO Jensen Huang has already tried them on.

01.

Quest 3S: budget-friendly version priced at $300,

performance nearly identical to Quest 3

First of all, the affordable version of the Quest device is here!

Meta priced the Quest 3S a full $200 (equivalent to about RMB 1,406) below the Quest 3, with performance that is almost the same.

The Meta Quest 3S 128GB version is priced at $299.99 (about RMB 2,110), and the 256GB version at $399.99 (about RMB 2,813). The 512GB Quest 3 costs $499.99 (approximately RMB 3,516).

The two headsets use the same processor, the Qualcomm Snapdragon XR2 Gen 2 chip. The key to the significant price drop is that the Quest 3S swaps the pancake lenses for infinite lenses.

Judging from the on-site demonstration, the Quest 3S's 4K display is very clear, and it also supports Dolby Atmos surround sound.

Meta rebuilt Horizon OS for spatial computing, so it can better support basic 2D applications such as YouTube, Facebook, and Instagram.

Meta added spatial audio and improved the contrast and color of passthrough, making the picture more realistic and immersive.

Zuckerberg announced that Meta is partnering with Microsoft so that Windows 11 PCs can deliver a seamless virtual desktop experience.

Meta already offers multi-screen support and the ability to interact directly with what's happening on the display; for example, you can drag an interface from your laptop directly onto the Quest device.

To create a more realistic metaverse environment, Meta launched Hyperscape: users simply scan the room they are in with a mobile phone, and can then "recreate" that room at any time by putting on the headset.

The headset lets you watch concerts from a front-row seat, watch high-definition blockbusters in a home theater, work out, and more.

Additionally, the Quest 3S is compatible with Meta's library of thousands of apps and games, as well as upcoming Quest 3 and 3S exclusives like Batman: Arkham Shadow.

For those who are new to XR, or who have been waiting for Quest and Quest 2 devices to drop in price, the Quest 3S may be the better choice.

02.

Llama 3.2: visual models surpass GPT-4o mini; 1B on-device model rivals Gemma

On the open-source AI front, Meta released a new multimodal large model, Llama 3.2.

Llama 3.2 comes in 90B and 11B visual language model sizes, as well as 1B and 3B lightweight plain-text models that can run locally on a device, each in pre-trained and instruction-tuned versions.

Download address: https://www.llama.com/

The 1B and 3B models support a 128K-token context, have been adapted for Qualcomm and MediaTek hardware, and are optimized for Arm processors.

The 3B model outperforms the Gemma 2 2.6B and Phi 3.5-mini models on tasks such as instruction following, summarization, prompt rewriting, and tool use; the 1B model's performance is comparable to Gemma's.

The 90B and 11B visual models are drop-in replacements for their corresponding text models and outperform closed models such as Claude 3 Haiku and GPT-4o mini on image-understanding tasks.

For example, if you ask in which month a company's sales were highest last year, Llama 3.2 can reason over the available charts and quickly provide an answer.

It can also reason over maps, helping answer questions such as the distance along a specific route marked on a map.

The vision models can also help tell a story by extracting details from an image, understanding the scene, and then crafting a sentence or two as an image caption.
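
As a concrete illustration of the chart-reading and captioning capabilities described above, here is a minimal sketch of image question answering with a Llama 3.2 vision model through the Hugging Face transformers library. It assumes transformers >= 4.45 with Mllama support; the checkpoint is gated (access approval required), and the image URL is a placeholder.

```python
# Minimal sketch: ask Llama 3.2 Vision a question about a chart image.
# Assumes transformers >= 4.45; checkpoint is gated and URL is a placeholder.
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"

model = MllamaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# Placeholder URL: substitute any chart image you want the model to read.
image = Image.open(requests.get("https://example.com/sales_chart.png", stream=True).raw)

messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Which month had the highest sales last year?"},
    ]}
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, add_special_tokens=False, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(output[0], skip_special_tokens=True))
```

The same pattern works for captioning: swap the question for a prompt like "Describe this scene in two sentences."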

Unlike some other open multimodal models, both the pre-trained and the aligned models can be fine-tuned for custom applications using torchtune and deployed locally using torchchat.

The 11B- and 90B-parameter multimodal models required a new model architecture that supports image reasoning.

Meta's training pipeline consists of multiple stages: starting from a pre-trained Llama 3.1 text model, it first adds an image adapter and encoder, then pre-trains on large-scale noisy data, and then trains on medium-scale, high-quality in-domain and knowledge-augmented data.

In the later stages of training, Meta uses an approach similar to that of the text models, with multiple rounds of alignment involving supervised fine-tuning, rejection sampling, and direct preference optimization. The result is a set of models that can take image and text prompts simultaneously and deeply understand and reason about the combination of the two.
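
To make the adapter idea concrete, here is a schematic PyTorch sketch, not Meta's actual code: the dimensions, the zero-initialized gate, and the layer placement are illustrative assumptions. It shows how a text decoder's hidden states can attend to image-encoder features through an inserted cross-attention layer.

```python
# Schematic sketch (illustrative, not Meta's implementation) of an image
# adapter: text hidden states attend to vision-encoder features.
import torch
import torch.nn as nn

class ImageCrossAttentionAdapter(nn.Module):
    """Lets text hidden states attend to image features from a vision encoder."""
    def __init__(self, d_model: int = 4096, n_heads: int = 32):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)
        # Learnable gate, initialized to zero so the adapter starts as a no-op
        # and is eased in during pre-training on image-text data (an assumption).
        self.gate = nn.Parameter(torch.zeros(1))

    def forward(self, text_hidden, image_features):
        attended, _ = self.cross_attn(self.norm(text_hidden), image_features, image_features)
        return text_hidden + torch.tanh(self.gate) * attended

# Toy usage: batch of 2, 16 text tokens, 64 image patch embeddings.
adapter = ImageCrossAttentionAdapter(d_model=256, n_heads=8)
text_hidden = torch.randn(2, 16, 256)
image_features = torch.randn(2, 64, 256)
print(adapter(text_hidden, image_features).shape)  # torch.Size([2, 16, 256])
```

In this sketch, the zero gate means the model initially behaves exactly like the text-only Llama 3.1 it starts from, which is consistent with the staged pipeline described above.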

For the 1B- and 3B-parameter lightweight models, Meta leveraged powerful teacher models to create smaller models with better performance, making these the first high-performance lightweight Llama models that can run efficiently on devices.
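
The article only says that a powerful teacher model is used; the sketch below shows the common knowledge-distillation loss (temperature-smoothed KL divergence plus ordinary cross-entropy) as one plausible formulation, not Meta's published recipe.

```python
# Standard logit-distillation loss: an assumption, not Meta's exact method.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: student matches the teacher's temperature-smoothed distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary next-token cross-entropy on the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage: vocabulary of 10, batch of 4 token positions.
student = torch.randn(4, 10, requires_grad=True)
teacher = torch.randn(4, 10)
labels = torch.randint(0, 10, (4,))
print(distillation_loss(student, teacher, labels))
```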

Meta shrinks the existing Llama models while recovering as much knowledge and performance as possible, using one-shot structured pruning from Llama 3.1 8B.
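
Structured pruning removes whole channels or heads rather than individual weights, so the resulting model is genuinely smaller and faster. The sketch below prunes the hidden channels of one MLP block; the selection criterion (L2-norm ranking) and the keep ratio are illustrative assumptions, since the article does not specify Meta's criterion.

```python
# Illustrative one-shot structured pruning of an MLP block's hidden channels.
# The L2-norm importance score is an assumption, not Meta's stated criterion.
import torch
import torch.nn as nn

def prune_mlp_hidden(up: nn.Linear, down: nn.Linear, keep_ratio: float = 0.5):
    """Drop whole hidden channels of an MLP block, shrinking both matrices."""
    importance = up.weight.norm(dim=1)              # one score per hidden channel
    k = int(up.out_features * keep_ratio)
    keep = importance.topk(k).indices.sort().values
    new_up = nn.Linear(up.in_features, k, bias=up.bias is not None)
    new_down = nn.Linear(k, down.out_features, bias=down.bias is not None)
    with torch.no_grad():
        new_up.weight.copy_(up.weight[keep])        # keep selected output rows
        new_down.weight.copy_(down.weight[:, keep]) # keep matching input columns
        if up.bias is not None:
            new_up.bias.copy_(up.bias[keep])
        if down.bias is not None:
            new_down.bias.copy_(down.bias)
    return new_up, new_down

up, down = nn.Linear(512, 2048), nn.Linear(2048, 512)
up, down = prune_mlp_hidden(up, down, keep_ratio=0.25)
print(up.weight.shape, down.weight.shape)  # (512, 512) and (512, 512)
```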

In post-training, the researchers used an approach similar to Llama 3.1's, producing the final chat models by performing several rounds of alignment on top of the pre-trained models.

Meta is also sharing the first official Llama Stack release, which will greatly simplify how developers work with Llama models across environments, whether single-node, local, cloud, or on-device, enabling turnkey deployment of applications with tool support, retrieval-augmented generation (RAG), and integrated safety.
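
For a sense of what that looks like in practice, here is a hedged sketch using the llama-stack-client Python SDK against a locally running Llama Stack server. The port, model name, and exact parameter names are assumptions and may vary across SDK versions.

```python
# Hedged sketch: querying a locally running Llama Stack server.
# Port, model name, and parameter names are assumptions that may differ
# across llama-stack-client versions.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:5000")

response = client.inference.chat_completion(
    model_id="meta-llama/Llama-3.2-3B-Instruct",
    messages=[{"role": "user", "content": "Summarize what Llama Stack provides."}],
)
print(response.completion_message.content)
```

The appeal of the stack is that the same client code can point at a local, cloud, or on-device distribution just by changing the server it talks to.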

03.

Meta AI: a variety of celebrity voices to choose from;

photo editing and real-time translation made more convenient

With Llama 3.2, Meta AI now has a voice.

Now you can talk to Meta AI by voice and have it answer your questions or tell you jokes. Meta has also added many familiar voices to choose from, such as that of British actress Judi Dench.