I suspect that changing the underlying model to Gemini 2.5 Pro would produce better transcripts, but right now there's no way of determining what model is being used.
The TTS is amazing, but the audio overviews are frankly useless for me.
That said, Google is screwing the pooch as usual by trying to make it another walled garden. Slap an API on NotebookLM already! The market research has already been done - there’s even an unofficial API https://www.reddit.com/r/notebooklm/comments/1eti9iz/api_for...
The LLM built into YouTube is one of the few LLM chatbots bolted onto existing apps that I actually find useful. Not just for summaries but questions like "what is the timestamp in this 2 hour video where they talk about _____".
Wow, I gave up searching for specific timestamps in long videos before. Never again.
Thank you!
If you're not already a Premium subscriber you may want to stick with other tools. I didn't mean to advertise YouTube Premium :)
https://open.substack.com/pub/lawsen/p/notebooklm-podcasts-b...
Generate a deep technical briefing, not a light podcast overview. Focus on technical accuracy, comprehensive analysis, and extended duration, tailored for an expert listener. The listener has a technical background comparable to a research scientist on an AGI safety team at a leading AI lab. Use precise terminology found in the source materials. Aim for significant length and depth. Aspire to the comprehensiveness and duration of podcasts like 80,000 Hours, running for 2 hours or more.
- List the characters in chapter [x] and add a small description about each one.
- What's [x] device used for?
- What happened in chapter [x]?
It works very well, without hallucinations, and it references all of its answers.
If I encounter a paper that is too difficult for me to digest just by reading, then I take a step back, feed it into NotebookLM, and listen to that summary. I've only done this a few times, but so far it hasn't failed to give me the overview and momentum that I need to take another stab and successfully dig into the paper and digest it on my own.
As others have noted, it can gloss over certain details and miss important points from time to time, but overall it does a fantastic job of giving me an introduction to a complex topic and making it far less intimidating / overwhelming.
I can say, "Hey, NotebookLM, explain the difference between feature X and feature Y to me," or, "How do I configure Z to work the way we want?" And while the answers still kinda suck because the documentation is pretty shitty, it's way faster than digging through the PDFs. And it cites the PDFs so I can (with some trouble) find the actual documentation in the PDF if I need it.
The worst part of it is that it only accepts 50 PDFs at once.
Honestly, though, the best use for it I've seen was when my GM added the PDF rulebooks for our TTRPG to NotebookLM. We were then able to ask NotebookLM rules questions, and it would answer us pretty well. That's what it's really great for.
I don't care about the audio features at all. The first thing I do is close the audio pane.
The podcast thing is more a novelty to me.
Is there an easy way to simply have text read to me unaltered?
https://www.youtube.com/watch?v=K3pYZwol6Dc&t=73s
Transcript of the fridge scene:
Fridge (after a bar code was scanned): "Ah, there we go."
Gilfoyle: "It's bad enough that it has to talk. Does it need fake vocal tics like 'uh'?"
Dinesh: "Well it just makes it sound more human."
Gilfoyle: "Humans are shit. This thing is addressing problems that don't exist. It's solutionism at its worst. We are dumbing down machines that are inherently superior."
I would like to have a Gilfoyle mode for NotebookLM where the machine answers only with cold precision instead of endless "Mmmhmm", "Yeah!", "Amazing!", "That's so cool!".
To generate them, we’ve scanned the physical book pages, and then with a simple Python script fed the images into GCP’s Document AI to extract the text en masse, and concatenated the results together into a text-only version of the chapter. Give that text to NotebookLM and run with it.
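For anyone wanting to try the scan-to-text step, here is a rough sketch of what such a script might look like. It is not the actual script from the comment. The names `chapter_text` and `documentai_extract`, the `page_*.png` file layout, and the `PROJECT_ID` / `LOCATION` / `PROCESSOR_ID` placeholders are all assumptions made for illustration, and the Document AI call follows the `google-cloud-documentai` client as I understand it, so check it against the current docs before relying on it.

```python
from pathlib import Path
from typing import Callable

# Placeholder values (assumptions): replace with your own GCP settings.
PROJECT_ID = "my-project"
LOCATION = "us"
PROCESSOR_ID = "my-ocr-processor-id"


def documentai_extract(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """OCR one scanned page with Document AI (hypothetical wrapper).

    Requires `pip install google-cloud-documentai` and GCP credentials.
    Non-"us" locations may also need a regional api_endpoint client option.
    """
    from google.cloud import documentai  # imported lazily so the rest runs offline

    client = documentai.DocumentProcessorServiceClient()
    name = client.processor_path(PROJECT_ID, LOCATION, PROCESSOR_ID)
    request = documentai.ProcessRequest(
        name=name,
        raw_document=documentai.RawDocument(content=image_bytes, mime_type=mime_type),
    )
    return client.process_document(request=request).document.text


def chapter_text(
    page_dir: str,
    extract: Callable[[bytes], str],
    pattern: str = "page_*.png",
) -> str:
    """Run `extract` on every page image, in filename order, and join the text."""
    pages = sorted(Path(page_dir).glob(pattern))
    return "\n\n".join(extract(page.read_bytes()).strip() for page in pages)
```

Usage would be something like `Path("chapter.txt").write_text(chapter_text("scans/ch01", documentai_extract))`, then uploading `chapter.txt` as a NotebookLM source. Passing the extractor in as a function means you can swap in another OCR tool, or a stub for testing, without touching the concatenation logic. Zero-padded filenames (`page_001.png`) keep the pages in order when sorted.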
One thing I'll note is they only cover the "high level" aspects. No depth. I'd recommend them for someone who is either already very knowledgeable or for someone not at all knowledgeable who is looking for an overview before they plan to do deeper learning/studying through reading.
Yep. This is what I have used them (sparingly) for — a scaffold to build the deeper learning onto. My brain struggles to retain information when it doesn’t have a high-level understanding of how/why a system works and how individual parts connect and interact, even if it is all eventually revealed later.
It's good to get the big picture about the discussion with 300+ comments.
In the last one I listened to, one host would repeat a keyword or phrase the other host had just said for emphasis, except they did it incessantly, with multiple words in every sentence for many sentences in a row.
So true.
Of course one can invest more in better authenticity, but for what it is, I believe it is good bang for the effort.
Also, if you listen to it for a while and get over the initial cringe, it becomes enjoyable, at least for me. Some visitors even asked if it was AI-generated. lol
Excited and frightened about the future where it's more real. This was a cool comparison I came across recently [2]
Interestingly, I saw today that Descript's avatars are made to sound and look non-realistic on purpose, I guess to avoid all kinds of issues, but they claim they want to leave something authentic on the table for real content. I think that is a good move.
[1] - https://resonancy.io/case-studies/flava-process-digitization [2] - https://yummy-fir-7a4.notion.site/dia
HN is the worst place to get product feedback (and I'm sure the NotebookLM team has internal metrics that validate their approach)
That brief TTS-like moment was the only time I was reminded that the voices were not human.
https://notebooklm.google.com/notebook/c36ea335-6686-474d-bf...
The fumu fumu is at 01:50.
The podcast is about the impact of AI on higher education in Japan. I prompted NotebookLM briefly in Japanese about the topic, and it collected ten sources in Japanese and English that it used as the basis for the audio overview.
I don't use it a lot, but it's useful when you want to have an engaging audio interface to long (50p+) reports, which you wouldn't normally read because it's not your area of expertise or you don't have time, but you can listen while doing some cardio or chores.
You can use Hacker Podcast to compare.
anyfactor•11h ago
riffic•11h ago
jszymborski•10h ago
crazygringo•8h ago
You have to scroll down a couple pages' worth before you even realize this might be SO long you need to collapse it. So then you've got to scroll back UP a couple pages, find the teensy [-] link...
It's enough to just post the link to the list of languages. The list itself doesn't belong in a comment here, when it's that long.
riffic•7h ago
behnamoh•8h ago
I'm glad the name of my native language is written correctly. In many cases, people say "Farsi", which is offensive to many Iranians because it's the Arabic version of the word "Parsi" (unlike Persian, Arabic doesn't have "p", "g", "ch", "zh").
It's like someone calling English "Anglaise" because that's how the French say it.
PS: Contrary to common belief, Persian and Arabic are totally different languages, though they have borrowed words from one another (think English and French). Persian is an Indo-European language whereas Arabic is Semitic (the same family as Hebrew and Aramaic).
myth_drannon•8h ago
And TIL that Aramaic replaced Hebrew in Judea because the Persian Empire maintained Aramaic as the official administrative language, and Jews brought it back when returning from the Babylonian captivity.
FlyingSnake•8h ago
Wikipedia says Farsi should be avoided in Western languages, but what about others? Persian is called Farsi in the Indian subcontinent due to the deep historical connections we share. We have proverbs saying Farsi is the sign of a learned person, etc.
crazygringo•8h ago
That is the case for some other languages, though. We call the language German rather than Deutsch because Germani was the Latin name for tribes in the area, for example.
Or native names get modified too -- in English we don't call it Espanish, just Spanish, even though it comes from español.
The names of languages in other languages tend to get modified in tons of different and random ways for lots of reasons. Is there really a reason to take offense at it?
It doesn't bother me that Italians call me an americano instead of an American. It's just a letter change. So why is it so bothersome that it's called Farsi rather than Parsi? Can't the change from "p" to "f" be seen as an interesting historical quirk, due to the fascinating effect of Arabic on European languages in the Middle Ages? At the same time that we got Arabic words like "algebra" and "alcohol"?
omneity•5h ago
I’m also quite curious about the sounds of “ch” and “zh” which exist in Arabic as ش and ج, or did you mean something else?
behnamoh•5h ago
"zh" is written as "ژ" in Persian (sounds like bourgeoisie in French).
omneity•5h ago
In Arabic[1], there are two close phonemes: `/dʒ/` for `ج` and `/ʃ/` for `ش`
The difference between the two phonemes is minimal, and they are practically affricates[2] of each other (where `d` or `t` can precede a `ʒ` or a `ʃ`), so it seems these sounds are present in both Arabic and Persian.
These variations are also within the dialectal distribution of either language. For example `ج` is pronounced `/dʒ/` in Algeria and `/ʒ/` in Morocco.
0: https://en.wikipedia.org/wiki/Help:IPA/Persian
1: https://en.wikipedia.org/wiki/Help:IPA/Arabic
2: https://en.wikipedia.org/wiki/Affricate