Just because the author was unable to wrangle an LLM into doing novel research doesn't mean it's impossible. We already have examples of LLMs either doing or significantly aiding novel research.
I'm also a researcher and agree wholeheartedly with the article. LLMs can maybe help you sift through existing literature or help with creative writing; at most they can be used for background research in hypothesis generation, by finding pairs of related terms in the literature which can be put together into a network of relationships. They can help with a few tasks suitable for an undergrad research assistant.
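To make that concrete, here's roughly the pipeline I mean, as a toy sketch with made-up abstracts (my own illustration of Swanson-style literature-based discovery, not anyone's production system):

```python
# Toy Swanson-style "ABC" discovery: terms A and C that never co-occur
# directly but both co-occur with some bridge term B are hypothesis
# candidates. Abstracts and terms below are made up.
from collections import Counter
from itertools import combinations

abstracts = [
    "fish oil reduces blood viscosity",
    "raynaud syndrome involves high blood viscosity",
    "magnesium affects vascular tone",
]
terms = ["fish oil", "blood viscosity", "raynaud syndrome", "magnesium"]

# Count how often each pair of terms appears in the same abstract.
cooccur = Counter()
for text in abstracts:
    present = [t for t in terms if t in text]
    for a, b in combinations(sorted(present), 2):
        cooccur[(a, b)] += 1

def linked(x, y):
    return cooccur[tuple(sorted((x, y)))] > 0

# Propose A-C pairs with no direct link but a shared intermediate B.
for a, c in combinations(terms, 2):
    if not linked(a, c):
        bridges = [b for b in terms if b not in (a, c) and linked(a, b) and linked(b, c)]
        if bridges:
            print(f"hypothesis: {a!r} may relate to {c!r} via {bridges}")
```

The interesting proposals are the pairs that never co-occur directly but share a bridge term; an LLM can help at the step of judging which bridges are meaningful rather than coincidental.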
> we obtain the first statistically significant conclusion on current LLM capabilities for research ideation: we find LLM-generated ideas are judged as more novel (p < 0.05) than human expert ideas while being judged slightly weaker on feasibility.
It's a bit better than just finding related pairs. And that's with Sonnet 3.5, which is basically ancient at this point.
Pretty much what I would expect. The paper also seems to be doing exactly what I described; I don't understand how the technique is better than that.
The article says:
> Yet, every time I tried to get LLMs to perform novel research, they fail because they don’t have access to existing literature on the topic.
You say:
> LLMs can maybe help you sift through existing literature
> they can be used or background research in hypothesis generation by finding pairs of related terms in the literature
As far as I can see, these two positions are mutually exclusive. Aren’t you disagreeing with the article?
Researchers using GPT to summarize papers may be helping humans create novel research, but it certainly isn't GPT doing any such thing itself.
Because AI can inadvertently say nasty things, which could potentially damage the company's image.
It may very well be the case that Apple too finds themselves pressured into going all out on LLMs.
Besides my stance that LLMs can serve specific tasks very well and are likely going to take a place similar to spreadsheets and databases over the coming years, hasn't Apple already? Rarely has Apple tried to appear so unified on one goal across their product stack as they did with Apple Intelligence, the vast majority of which is heavily LLM-focused.
The author appears to skip entirely over their attempt and subsequent failure, which leaves the point the piece is trying to make rather unsubstantiated and made me check whether this wasn't posted in 2022. That goes double for someone like me, who is also very confident that there is a large chasm between LLMs and whatever AGI may end up being.
But it's not copying it. That is the entire point. It's using the training data to adjust floating point numbers. If you train on a single piece of data over and over again, then yes, it can replicate it, just like you can memorize lines of a school play, but it's still not copied/compressed in the traditional, deterministic sense.
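A toy illustration of the memorization point (my own sketch, not anything from the model vendors): repeated gradient steps on a single sample will reproduce it exactly, yet the "copy" exists only as adjusted weights, not as stored bytes:

```python
# Train a linear map on one (x, y) pair over and over until it can
# replicate y from x. The data ends up encoded in W's floats; there is
# no literal stored copy of y anywhere.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=8)            # one training example (input)
y = rng.normal(size=8)            # its target output
W = np.zeros((8, 8))              # the "weights": just floating point numbers

lr = 0.05 / (x @ x)               # step size scaled so the loop converges
for _ in range(2000):             # train on the same piece repeatedly
    err = W @ x - y               # prediction error
    W -= lr * np.outer(err, x)    # gradient step on squared error

print(np.allclose(W @ x, y, atol=1e-6))  # True: the sample is memorized
```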
You can't argue "we don't know how they work, or how our own brains work, with any certainty" and then over-trivialize what they do in the next argument.
People suffer brain damage and come out the other side with radically different personalities. What happened to their "qualia" or "sense of self"? Where is their "soul"? It's just a mechanistic emergent property of their biological neural network.
Who is to say our brains aren't just very highly parameterized biological floating point machines? That is the true Occam's razor here, as uncomfortable as that might make people.
I believe it's quite possible that what is happening during training is in certain ways similar to what is happening to a child learning the world, although there are many practical differences (and I don't even mean the difference between human neurons and the ones in a neural network).
Is there anything to feel uncomfortable about? It's been a long time since people started discussing the idea that "a self doesn't exist, we're just X," where X is whatever concept was popular at the time. I'm 100% sure LLMs won't be the last one.
(BTW, as for LLMs themselves, there are still two big engineering problems to solve: quite small context windows and hallucinations. The first mostly requires a lot of money; the second needs special approaches and a lot of trial and error, and even then the last 1% might be almost impossible to get working reliably.)
Humans misremember and make things up all the time, completely unintentionally. It could be a fundamental flaw of large neural networks: impressive data compression and ability to generalize, but impossible to make "perfect".
If AI becomes cheap and fast enough, it's likely a simple council of models will be enough to alleviate 99% of the problem here.
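For instance (a minimal sketch of that council idea; `ask` is a stand-in for whatever model APIs you'd actually call, not a real library function):

```python
# Ask several independent models the same question and only trust an
# answer that reaches a quorum; otherwise escalate instead of guessing.
from collections import Counter

def council(question, models, ask, quorum=2):
    answers = [ask(model, question) for model in models]
    best, votes = Counter(answers).most_common(1)[0]
    return best if votes >= quorum else None  # None = no consensus, flag for review

# Canned stand-in "models" just to show the flow:
fake = {"a": "Paris", "b": "Paris", "c": "Lyon"}
print(council("Capital of France?", ["a", "b", "c"], lambda m, q: fake[m]))  # Paris
```

Exact-string voting is the naive version; real answers would need semantic comparison, which is its own hard problem.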
AI is one in a long long long line of new technologies. It is generating a lot of investment, new corporate processes and directives, declarations like "new era" and "civilizational milestone," etc.
If someone thinks any of the above are wrong or misguided, it's a mistake to "blame" or look to AI as the primary cause.
The primary cause is our system: humans are actors in the US economic system and when a new technology is rolling out, usually the response is the same and differs only in magnitude.
Don't hate the player, hate the game.
So without further ado:
* If LLMs can indeed produce wholly novel research independently without any external sources, then prove it. Cite sources, unlike the chatbot that told you it can do that thing. Show us actual results from said research or products that were made from it. We keep hearing that these things exponentially increase the speed of research and development, but seemingly nobody has proof of this that's uniquely specific to LLMs and doesn't rely on older, proven ML techniques or concepts.
* If generative AI really can output Disney quality at a fraction of the cost, prove it with clips. Show me AI output that can animate on 2s, 4s, and 1s in a single video and knows when to use any of the above for specific effects. Show me output that’s as immaculate as old Disney animation, or heck, even modern ToonBoom-like animation. Show me the tweens.
* Prove your arguments. Stop regurgitating hypeslop from CEBros, actually cite sources, share examples, demonstrate its value relative to humanity.
All that people like us (myself and the author) have been politely asking for since this hype bubble inflated is for boosters to show actual evidence of their claims. Instead, we just get carefully curated sizzle reels and dense research papers making claims, rather than actual, tangible evidence that we can attempt to recreate for ourselves to validate the claims in question.
Stop insulting us and show some f*king proof, or go back to playing with LLMs until you can make them do the things you claim they can do.
Everything revolving around these LLMs so far has been tech hype culture and similar "think of the future" vibes. IMO we never see proof of this because right now it simply doesn't exist.
Compare with a headline from today:
>OpenAI’s ChatGPT to hit 700 million weekly users, up 4x from last year
I don't think there was that much hype when ChatGPT launched. Just an awful lot of people using it because it's kind of cool.
The critics seem to do a certain amount of goalpost moving: pointing out that it's doing well by having unprecedented user growth is met with "f*king prove LLMs can indeed produce wholly novel research." Is anyone actually claiming they produce wholly novel research?
There is a similar effect with the featured blog post where the guy makes some perfectly reasonable arguments why he doesn't really like LLMs and doesn't want to work on them and then instead of titling it "Why I hate LLMs" goes with "Why I hate AI." But there's some quite cool stuff in non LLM AI like AlphaFold and trying to cure diseases. If you are talking about LLMs why not put LLM in the title?
I would argue companies performing layoffs because they think LLMs can do the work of a human is practically the same thing. There was even another article posted here today saying a bunch of companies are hiring humans again to fix the terrible work LLMs are doing, so it's fair to assume C-level jerks do think LLMs can produce something to the level of novel research and are finding out LLMs can't.
I was actually thinking about it, and there could be a simple test: remove all knowledge of X from the knowledge corpus and train an LLM on that corpus. For X one can imagine anything: differential calculus, logarithms, the Riemann Hypothesis, the special theory of relativity, Fermat's theorems, ... Now ask the AI the questions which actually led to the discovery of X.
If the AI is able to rediscover X while not knowing about X, we can take that as proof of intelligence.
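The excision step is where this gets hard in practice. As a first approximation you'd filter the corpus on surface mentions of X, something like this toy sketch (the term list and documents are placeholders; catching paraphrases and results derived from X is the genuinely difficult part):

```python
# Drop every document that mentions the target concept before training.
import re

EXCISE_TERMS = ["riemann", "zeta function", "critical line"]  # concept X
pattern = re.compile("|".join(map(re.escape, EXCISE_TERMS)), re.IGNORECASE)

def filtered_corpus(docs):
    """Yield only documents with no surface mention of concept X."""
    for doc in docs:
        if not pattern.search(doc):
            yield doc

docs = [
    "The Riemann hypothesis concerns zeros of the zeta function.",
    "Prime counting grows roughly like n / log n.",
]
print(list(filtered_corpus(docs)))  # only the prime-counting line survives
```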
I don't think this makes any sense and suspect it requires a deep misunderstanding of “research” to even consider this an issue.
Research is inherently something done in response to, and grounded in, the external.
It was quite convincing, and I could see lower-budget studios trying to make it work. (There is a truckload of garbage-tier animation on all platforms.)
The person who submitted it is an experienced producer who used something like 600 prompts to generate the end result, so it's not exactly few-shot prompting from novices with no film experience. But it happened.
Then again, the astroturfing done (presumably) by big LLM is off the charts, so who knows if this was actually what happened.
echelon•6mo ago
I see the claims being levied against LLMs, but in the generative media world these models are nothing short of revolutionary.
In addition to being an engineer, I'm also a filmmaker. This tech brings orders-of-magnitude changes to the production cycle:
- Films can be made 5,000x cheaper (a $100M Disney film will be matched by small studios on budgets of $20,000.)
- Films can be made 5x faster (end-to-end, not accounting for human labor hour savings. A 15 month production could feasibly be done in 3 months.)
- Films can be made with 100x fewer people. (Studios of the future will be 1-20 people.)
Disney and Netflix are going to be facing a ton of disruptive pressure. It'll be interesting to see how they navigate.
Advertising and marketing? We've already seen ads on TV that were made over a weekend [1] for a few thousand dollars. I've talked to customers that are bidding $30k for pharmaceutical ad spots they used to bid $300k for. And the cost reductions are just beginning.
[1] https://www.npr.org/2025/06/23/nx-s1-5432712/ai-video-ad-kal...
suddenlybananas•6mo ago
I do not believe this is true.
viraptor•6mo ago
How does this work? If the quality ads are easier to produce, wouldn't there be more competition for the same spot with more leftover money for bidding? Why would this situation reduce the cost of a spot?
JimDabell•6mo ago
> Using AI-powered tools, they were able to achieve an amazing result with remarkable speed and, in fact, that VFX sequence was completed 10 times faster than it could have been completed with traditional VFX tools and workflows
> The cost of [the special effects without AI] just wouldn’t have been feasible for a show in that budget
— https://www.theguardian.com/media/2025/jul/18/netflix-uses-g...
Discussed on Hacker News here: https://news.ycombinator.com/item?id=44602779
Karawebnetwork•6mo ago
Personally, I'm not particularly impressed. Yes, I'm impressed by the technology and the fact that we've reached a point where something like this is even possible, but in my opinion, it's soulless and suffers from the same problems as other AI videos. More emphasis was placed on length than quality, and I've seen shorter, traditionally produced videos that had more heart. That's probably because these videos were created by amateurs who thought the AI would fill in all the gaps, but that only underscores the need for human artists with a keen eye.
Cthulhu_•6mo ago
In theory (idk it probably exists already) you can generate a script and feed it into an AI that generates a film. Novelty aside, who is going to watch it? And what if you generate a hundred films a day? A thousand?
This probably isn't a hypothetical scenario, as low-effort / generated content is already a thing in writing, video, and music alike. It's an enormous long tail on e.g. YouTube, Amazon, etc., relying on people passively consuming content without paying too much attention to it. The background muzak of everything.
As someone smarter than me summarized, AI-generated stuff is content, not art. AI-generated films will be content, not art. There may be something compelling in there, but ultimately it'll flood the market, become ubiquitous, and disappear into the background as AI-generated noise that only a few people will seek out or watch intentionally.
echelon•6mo ago
That's not fair. Do you know how many dreamers and artists and great ideas wither away on the vine? It's tragic.
Movies are going to be like books today. And that's not a bad thing.
Distribution is always the hard part. Indie games, indie music. You've still got to market yourself and find your audience.
But the difference is that now it's possible. And you don't have to obey some large capital distributor and mind their oversight and meddling.
CyberDildonics•6mo ago
Says who?
> Do you know how many dreamers and artists and great ideas wither away on the vine?
How many?
> Movies are going to be like books today.
People are already creating movies with $1k cameras that look good enough to distribute. A lot of them are horror movies because of the budget and most of them are terrible with huge glaring mistakes in editing, pacing, framing, etc. but they can still make money.
> And you don't have to obey some large capital distributor and mind their oversight and meddling.
How many movies have you worked on?
Most experienced executives can help guide priorities and make sure there aren't any big overlooked problems.
echelon•6mo ago
Over a dozen.
> People are already creating movies with $1k cameras
Nobody wants to make an iPhone movie. They want $200k glass optics and a VFX budget like Nolan's or Villeneuve's.
I'm tired of people from outside the profession telling us we should be happy with quaint little things. That's not how the ideal world works. In the ideal world, everyone can tell the exact story in their minds and not be beset by budget.
My imagination is my ideal world. I won't listen to naysayers, because you're so far behind my dreams.
If this wasn't about to be practical, I'd relent. But the technology works. It's tangible, usable, and practical.
I've been saying that on HN since Deep Dream. My friends in IATSE also called this crazy. It's not. It's not just coming, it's here.
> A lot of them are horror movies because of the budget
Tell me about it. Been there, done that. It sucks to be so creatively stifled when what I wanted to make as a youth were fantasy movies like Lord of the Rings.
I got creative. I did mocap and VFX. It still wasn't what I dreamed of.
> How many?
Film school attendance is over 100,000 students annually. Most of them were never able to land a high-autonomy role or follow through on their dreams. I know hundreds of people with this story.
> A lot of them are horror movies because of the budget and most of them are terrible with huge glaring mistakes in editing, pacing, framing, etc
Sound. Sound is the biggest thing people fuck up. But the rest are real failure cases too.
> Most experienced executives can help guide priorities and make sure there aren't any big overlooked problems.
They're not as important as you think. They're just there to mind the investment.
When the cost of production drops to $30k, this entire model falls apart.
Studios were only needed for two things: (1) distribution, (2) capital. The first one is already solved.
queenkjuul•6mo ago
What?
echelon•6mo ago
But if you're using photons, good gear is very costly.
echelon•6mo ago
These aren't prompted end-to-end. There's a tremendous amount of work being done.
For end-to-end, go to Show Runner AI. Or look up SpongeBob AI on YouTube.