Just because the author was unable to wrangle an LLM into doing novel research doesn't mean it's impossible. We already have examples of LLMs either doing novel research or aiding significantly with it.
I'm also a researcher and agree wholeheartedly with the article. LLMs can maybe help you sift through existing literature or help with creative writing. At most, they can be used for background research in hypothesis generation, by finding pairs of related terms in the literature which can be assembled into a network of relationships. They can help with a few tasks suitable for an undergrad research assistant.
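The "related term pairs" approach described above can be sketched roughly as follows: count how often terms co-occur across paper abstracts, and keep the strong pairs as edges in a relationship network. The terms and abstracts here are made-up toy data, not any real literature-mining pipeline.

```python
# Toy sketch of literature-based term-pair mining: terms that co-occur
# in multiple abstracts become edges in a candidate relationship network.
from collections import Counter
from itertools import combinations

# Hypothetical term sets extracted from three abstracts.
abstracts = [
    {"curcumin", "inflammation", "nf-kb"},
    {"nf-kb", "inflammation", "arthritis"},
    {"curcumin", "arthritis"},
]

pair_counts = Counter()
for terms in abstracts:
    # Sort so each unordered pair is counted under one canonical key.
    for a, b in combinations(sorted(terms), 2):
        pair_counts[(a, b)] += 1

# Pairs seen in more than one abstract form the network's edges.
network = [pair for pair, n in pair_counts.items() if n > 1]
```

Chaining such edges (A relates to B, B relates to C) is what lets this style of background research suggest indirect A–C hypotheses worth testing.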
> we obtain the first statistically significant conclusion on current LLM capabilities for research ideation: we find LLM-generated ideas are judged as more novel (p < 0.05) than human expert ideas while being judged slightly weaker on feasibility.
It's a bit better than just finding related pairs. And that's with Sonnet 3.5, which is basically ancient at this point.
Pretty much what I would expect. The paper also seems to be doing exactly what I described; I don't understand how the technique is better than that.
The article says:
> Yet, every time I tried to get LLMs to perform novel research, they fail because they don’t have access to existing literature on the topic.
You say:
> LLMs can maybe help you sift through existing literature
> they can be used or background research in hypothesis generation by finding pairs of related terms in the literature
As far as I can see, these two positions are mutually exclusive. Aren’t you disagreeing with the article?
Because AI can inadvertently say nasty things, which could damage the company's image.
It may very well be the case that Apple, too, finds themselves pressured into going all in on LLMs.
Besides my stance that LLMs can serve specific tasks very well and are likely going to take a place similar to spreadsheets and databases over the coming years, hasn’t Apple already? Rarely has Apple tried to appear so unified on one goal across their product stack as they did with Apple Intelligence, the vast majority of which is heavily LLM focused.
The author appears to skip entirely over their attempt and its subsequent failure, which leaves the central point the piece is trying to make rather unsubstantiated and made me check whether this wasn't posted in 2022. That's coming from someone like myself who is also very confident that there is a large chasm between LLMs and whatever AGI may end up being.
But it's not copying it. That is the entire point. It's using the training data to adjust floating point numbers. If you train on a single piece of data over and over again, then yes, it can replicate it, just like you can memorize lines of a school play, but it's still not copied/compressed in the traditional, deterministic sense.
You can't argue "we don't know how they work, or our own brains work with any certainty" and then over-trivialize what they do on the next argument.
People suffer brain damage and come out the other side with radically different personalities. What happened to their "qualia" or "sense of self"? Where is their "soul"? It's just a mechanistic emergent property of their biological neural network.
Who is to say our brains aren't just very highly parameterized biological floating point machines? That is the true Occam's razor here, as uncomfortable as that might make people.
I believe it's quite possible that what is happening during training is in certain ways similar to what is happening to a child learning the world, although there are many practical differences (and I don't even mean the difference between human neurons and the ones in a neural network).
Is there anything to feel uncomfortable about? It's been a long time since people started discussing the concept of "a self doesn't exist, we're just X" where X was the newest concept popular during that time. I'm 100% sure LLMs are not the last one.
(BTW, as for LLMs themselves, there are still two big engineering problems to solve: quite small context windows and hallucinations. The first requires a lot of money to solve; the second needs special approaches and a lot of trial and error, and even then the last 1% might be almost impossible to get working reliably.)
Humans mis-remember and make up things all the time, completely unintentionally. It could be a fundamental flaw in large neural networks. Impressive data compression and ability to generalize, but impossible to make "perfect".
If AI becomes cheap and fast enough, it's likely a simple council of models will be enough to alleviate 99% of the problem here.
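A minimal sketch of that "council of models" idea: ask several independent models the same question and accept an answer only when a majority agree, flagging everything else for review. The `models` here are hypothetical stand-ins (plain functions) for real model API calls, and the quorum rule is an assumption, not an established recipe.

```python
# Majority-vote "council of models": accept an answer only when more
# than a quorum fraction of the models agree; otherwise return None.
from collections import Counter

def council_answer(question, models, quorum=0.5):
    answers = [ask(question) for ask in models]
    answer, votes = Counter(answers).most_common(1)[0]
    # No sufficient majority -> treat the answer as unreliable.
    return answer if votes / len(answers) > quorum else None

# Toy stand-in models: two agree, one "hallucinates".
models = [lambda q: "Paris", lambda q: "Paris", lambda q: "Lyon"]
result = council_answer("What is the capital of France?", models)
```

This only suppresses hallucinations that are uncorrelated across models; if the models share training data and fail the same way, the council votes for the same wrong answer.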
AI is one in a long long long line of new technologies. It is generating a lot of investment, new corporate processes and directives, declarations like "new era" and "civilizational milestone," etc.
If someone thinks any of the above are wrong or misguided, it's a mistake to "blame" or look to AI as the primary cause.
The primary cause is our system: humans are actors in the US economic system and when a new technology is rolling out, usually the response is the same and differs only in magnitude.
Don't hate the player, hate the game.
So without further ado:
* If LLMs can indeed produce wholly novel research independently, without any external sources, then prove it. Cite sources, unlike the chatbot that told you it can do that thing. Show us actual results from said research, or products that were made from it. We keep hearing that these things exponentially increase the speed of research and development, but nobody seems to have proof of this that's uniquely specific to LLMs and didn't rely on older, proven ML techniques or concepts.
* If generative AI really can output Disney quality at a fraction of the cost, prove it with clips. Show me AI output that can animate on 2s, 4s, and 1s in a single video and knows when to use any of the above for specific effects. Show me output that’s as immaculate as old Disney animation, or heck, even modern ToonBoom-like animation. Show me the tweens.
* Prove your arguments. Stop regurgitating hypeslop from CEBros, actually cite sources, share examples, demonstrate its value relative to humanity.
All that people like us (myself and the author) have been politely asking for since this hype bubble inflated is for boosters to show actual evidence of their claims. Instead, we just get carefully curated sizzle reels and dense research papers making claims, instead of actual, tangible evidence that we can then attempt to recreate for ourselves to validate the claims in question.
Stop insulting us and show some f*king proof, or go back to playing with LLMs until you can make them do the things you claim they can do.
echelon•1h ago
I see the claims being levied against LLMs, but in the generative media world these models are nothing short of revolutionary.
In addition to being an engineer, I'm also a filmmaker. This tech brings orders-of-magnitude changes to the production cycle:
- Films can be made 5,000x cheaper (a $100M Disney film will be matched by small studios on budgets of $20,000.)
- Films can be made 5x faster (end-to-end, not accounting for human labor hour savings. A 15 month production could feasibly be done in 3 months.)
- Films can be made with 100x fewer people. (Studios of the future will be 1-20 people.)
Disney and Netflix are going to be facing a ton of disruptive pressure. It'll be interesting to see how they navigate.
Advertising and marketing? We've already seen ads on TV that were made over a weekend [1] for a few thousand dollars. I've talked to customers that are bidding $30k for pharmaceutical ad spots they used to bid $300k for. And the cost reductions are just beginning.
[1] https://www.npr.org/2025/06/23/nx-s1-5432712/ai-video-ad-kal...
suddenlybananas•1h ago
I do not believe this is true.
viraptor•1h ago
How does this work? If the quality ads are easier to produce, wouldn't there be more competition for the same spot with more leftover money for bidding? Why would this situation reduce the cost of a spot?
JimDabell•1h ago
> Using AI-powered tools, they were able to achieve an amazing result with remarkable speed and, in fact, that VFX sequence was completed 10 times faster than it could have been completed with traditional VFX tools and workflows
> The cost of [the special effects without AI] just wouldn’t have been feasible for a show in that budget
— https://www.theguardian.com/media/2025/jul/18/netflix-uses-g...
Discussed on Hacker News here: https://news.ycombinator.com/item?id=44602779
Cthulhu_•1h ago
In theory (idk, it probably exists already) you can generate a script and feed it into an AI that generates a film. Novelty aside, who is going to watch it? And what if you generate a hundred films a day? A thousand?
This probably isn't a hypothetical scenario, as low-effort / generated content is already a thing in writing, video, and music alike. It's an enormous long tail on e.g. YouTube, Amazon, etc., relying on people passively consuming content without paying too much attention to it. The background muzak of everything.
As someone smarter than me summarized: AI-generated stuff is content, not art. AI-generated films will be content, not art. There may be something compelling in there, but ultimately it'll flood the market, become ubiquitous, and disappear into the background as AI-generated noise that only a few people will seek out or watch intentionally.