2024-09-30
Strongly recommended by AI guru Andrej Karpathy!
He even predicted that this AI application could open up "an opportunity as big as ChatGPT."
It is NotebookLM, an experimental AI product from Google, powered by Gemini 1.5 Pro, Google's most powerful model right now.
Recently, the app has surged in popularity, all because of the launch of a new feature:
Upload a file (text, audio, or video), and the AI can not only extract the key points as text, but also use the Audio Overview feature to turn the file into an AI-generated conversational podcast that discusses the document's content.
Two AI hosts, with lifelike human voices and intonation, discuss the document enthusiastically and wrap up with concluding remarks.
△ Karpathy fed in the C code for training GPT-2 and generated a conversational podcast
This is really cool!
And Karpathy isn't the only one praising it. A look around the major online platforms shows that users generally approve of NotebookLM.
@elvis, a KOL in the AI industry, also left a comment under Karpathy's post:
Karpathy's remark that this is "reminiscent of a ChatGPT moment" is definitely not an exaggeration!
Truly getting multiple models to work together will unlock unique content formats and user experiences like NotebookLM's.
How do you use NotebookLM?
Using it is very simple: just open the trial page and drag and drop the files you want processed.
The source can be a Google Doc, a link to a website or a video, or even just a large block of pasted text.
Each notebook supports up to 50 uploaded files, and each file is limited to 500,000 words.
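As a rough illustration of those two limits, here is a minimal Python sketch that checks a batch of text files before uploading. It is not part of NotebookLM: the names `check_notebook`, `count_words`, and `FileCheck` are hypothetical, the constants simply restate the figures quoted above, and counting whitespace-separated tokens is an assumption about what "words" means (it will undercount for languages such as Chinese that are written without spaces).

```python
from dataclasses import dataclass

# Limits as quoted in the article; treat them as assumptions, not official constants.
MAX_FILES_PER_NOTEBOOK = 50
MAX_WORDS_PER_FILE = 500_000


@dataclass
class FileCheck:
    name: str
    words: int
    within_limit: bool


def count_words(text: str) -> int:
    """Approximate word count: number of whitespace-separated tokens."""
    return len(text.split())


def check_notebook(files: dict[str, str]) -> tuple[bool, list[FileCheck]]:
    """Return (fits, per-file report) for a mapping of file name -> text."""
    report = []
    for name, text in files.items():
        words = count_words(text)
        report.append(FileCheck(name, words, words <= MAX_WORDS_PER_FILE))
    # The batch fits only if both the file count and every file's size are in range.
    fits = len(files) <= MAX_FILES_PER_NOTEBOOK and all(c.within_limit for c in report)
    return fits, report


if __name__ == "__main__":
    ok, report = check_notebook({"o1_system_card.txt": "a short stand-in for the real text"})
    print(ok, [(c.name, c.words) for c in report])
```

A check like this only helps you spot oversized inputs early; the limits actually enforced are whatever the product applies at upload time.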
Here we uploaded the OpenAI o1 system card and could then choose what kind of content to create.
Built-in text options include Q&A, quizzes, a table of contents, a timeline, and a summary, alongside an in-depth audio conversation between two hosts.
If you have more specific needs, you can also type your own prompt.
We tried asking questions in Chinese, and the AI understood them.
Unfortunately, NotebookLM does not support answering in Chinese, even if you explicitly ask for it.
If you choose to create audio, you will need to wait anywhere from a few minutes to about ten minutes, depending on the length of the document.
In the meantime, let's take a look at the Gemini model behind it.
NotebookLM is powered by Gemini 1.5 Pro, Google's current flagship large model.
Gemini 1.5 Pro supports an ultra-long 128k-token context, which is the foundation for interpreting long documents.
In a recent upgrade, Gemini 1.5 Pro's mathematics and reasoning capabilities surpassed those of OpenAI's o1-preview.
Okay, the audio we generated is ready now. Readers who are comfortable with English can have a listen.
Readers who are less comfortable with English can get a feel for it from the text version of the AI podcast, transcribed and translated by yet another AI, nesting-doll style.
Simply uploading documents and generating content is only one of the practical ways to use NotebookLM.
Someone also shared a method in which students record their classes and have the AI sort out the key points at home, which was widely praised as well.
(This is not to say you should stop paying attention in class.)
Specifically, you can follow these steps:
Record the class on your phone;
Skip the computer during class and just jot down brief key points with paper and pencil;
After class, scan the notes, upload them together with the recording to NotebookLM, and let it expand the notes based on the details in the recording.
In addition, you can create a weekly audio review of the key points you have learned.
An interaction paradigm different from simple chat
In fact, NotebookLM did not become a hit immediately after its debut.
It had already appeared at the Google I/O conference in May last year, when it was still an AI notebook project called Project Tailwind.
It was not until July last year that it was renamed NotebookLM.
At first, it was available only to users in some parts of the United States, and its features still revolved around a basic chat mode.
△ A document guide automatically generated by NotebookLM (from Google's official website)
On the 11th of this month, NotebookLM announced that it was opening up to users worldwide and added a major new feature: Audio Overview.
Google's official description reads:
"The new Audio Overview feature turns documents, slides, charts, and more into engaging discussions with one click."
Because the form of interaction is so new, the AI voices are lifelike, and the discussion really sounds like a live podcast, people took to it immediately.
As of the last couple of days, NotebookLM can not only take YouTube videos as input, but also supports more than 100 languages.
Now Karpathy's public endorsement has added even more to NotebookLM's popularity.
As Karpathy put it, the main reason NotebookLM has become so popular is that it offers an interaction paradigm different from simple chat.
According to Karpathy, NotebookLM removes two major barriers to enjoying large models:
First, chatting is actually quite hard.
Some people struggle to communicate with others in daily life, let alone with a chatbot, where they have to keep coming up with questions.
The nice thing about NotebookLM is that one of the two generated AI podcast hosts takes on the role of asking questions and guiding the conversation.
You just drop in your documents, audio, and video, wait for generation to finish, and then sit back and enjoy listening to the AIs chat about your files.
Second, reading is not easy either.
In this fragmented age of information overload, settling into a comfortable position, or listening while driving, as others discuss what you need to know is much easier than reading it yourself, even when what you would be reading is the condensed version an AI has already summarized for you (yes, we really are that lazy! [doge]).
In the spirit of always wanting better, some users have also voiced hopes that NotebookLM will go a step further.
After trying it out, Yuchen Jin, co-founder and CTO of Hyperbolic Labs, summarized two limitations:
One is that it "can't see"; that is, it cannot process the images in a document.
However, the Gemini model behind it is multimodal, so NotebookLM will presumably catch up before long.
The other is that users cannot steer the content of the AI podcast.
Yuchen Jin fed it two tweets, and it generated nearly 13 minutes of audio, but because it defaulted to a general audience, it covered many very basic concepts.
Being able to specify the target audience for the podcast, or the topic, direction, and angle of the discussion, would be a real bonus.
One More Thing
No sooner said than done: developers have already come up with an open-source version of NotebookLM!
For the time being, though, it can only be fed PDFs.
All we can say is, humans really are interesting!
In the past, we struggled to turn audio into text, working to convert broadcasts, meeting recordings, and the like into transcripts.
Now we are using large models to turn text back into podcasts...
Interesting, very interesting [doge].
Reference links:
[1] https://notebooklm.google/
[2] https://x.com/karpathy/status/1840112692910272898
[3] https://x.com/omarsar0/status/1840145774874898506
[4] https://x.com/Yuchenj_UW/status/1840203324571943403
[5] https://github.com/gabrielchua/open-notebooklm