
AI isn't "just predicting the next word" anymore

https://stevenadler.substack.com/p/ai-isnt-just-predicting-the-next
8•gmays•1h ago

Comments

anishgupta•1h ago
It's getting way better, and we have to acknowledge how far we've come in the last 4 years. Interestingly, one of the key examples of this is within VS Code: AI is able to predict the next word not based on the generic training data, but in the context of the repo (while manually editing).
i7l•1h ago
LLMs, not AIs. AI has mostly never been only about predicting the next word.
teom•1h ago
I mean, the article pretty much confirms that AI is basically just predicting the next word.

It works well and can be used for a lot of things, but still.

boredatoms•52m ago
This article needs to be put through a summarizer
xscott•49m ago
For or against, I don't know why the "just predicting" or "stochastic parrots" criticism was ever insightful. People make one word after another and frequently repeat phrases they heard elsewhere. It's kind of like criticizing a calculator for making one digit after another.
anon373839•32m ago
It isn’t a criticism; it’s a description of what the technology is.

In contrast, human thinking doesn’t involve picking a word at a time based on the words that came before. The mechanics of language can work that way at times - we select common phrasings because we know they work grammatically and are understood by others, and it’s easy. But we do our thinking in a pre-language space and then search for the words that express our thoughts.

I think kids in school ought to be made to use small, primitive LLMs so they can form an accurate mental model of what the tech does. Big frontier models do exactly the same thing, only more convincingly.
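A "small, primitive LLM" of the kind described above can be sketched in a few lines. This is my own toy illustration, not anything from the article: a bigram model that counts, in a tiny corpus, which word follows which, and "predicts" the most frequent successor.

```python
from collections import Counter, defaultdict

# Toy corpus; a real model would train on vastly more text.
corpus = "the cat sat on the mat and the cat slept".split()

# Count how often each word follows each other word (a bigram table).
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_word(word):
    """Predict the most frequent next word seen in training."""
    counts = follows[word]
    return counts.most_common(1)[0][0] if counts else None

print(next_word("the"))  # "cat" follows "the" twice, "mat" once
```

Frontier models replace the count table with a neural network over a long context window, but the interface is the same: given what came before, emit a distribution over the next token.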

Nevermark•23m ago
Levels of algorithm people confuse:

1. Model architecture. Calculation of outputs from inputs.

2. Training algorithm: alters parameters in the architecture based on training data, often inputs and outputs vs. targets, but it can be more complex than that.

3. The class of problem being solved, i.e. approximation, prediction, etc.

4. The actual instance of the problem being solved, i.e. approximation of chemical reaction completion vs. temperature, or prediction of textual responses.

5. The embodiment of the problem, i.e. the actual data. How much, how complex, how general, how noisy, how accurate, how variable, how biased, ...?

6. The algorithm that is actually learned from (5) in the form of (3), in order to perform (4); this has no limit in complexity, or in the sub-problems that must be solved for successful results.
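The distinction between levels (1) and (2) above can be made concrete with a minimal sketch (my own toy example, fitting y = 2x + 1 with one weight and one bias):

```python
# Level 1: the model architecture — how outputs are computed from inputs.
def model(x, w, b):
    return w * x + b

# Level 2: the training algorithm — how parameters change given data.
def train_step(x, target, w, b, lr=0.1):
    err = model(x, w, b) - target
    # One gradient-descent step on squared error.
    return w - lr * err * x, b - lr * err

# Levels 3-5: the problem class (approximation), the instance (fit
# y = 2x + 1), and the data embodying it.
data = [(x, 2 * x + 1) for x in (0.0, 0.5, 1.0, 1.5, 2.0)]

w, b = 0.0, 0.0
for _ in range(300):
    for x, y in data:
        w, b = train_step(x, y, w, b)
print(round(w, 2), round(b, 2))
```

Level (6), the solution actually learned, is trivial here because the data is trivial; the point of the list is that nothing bounds its complexity when the data is rich.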

Data can be unbounded in complexity. Therefore, actual (successful) solutions are necessarily unbounded in complexity.

The "no limit, unbounded" part of (6) is missed by many people. Accurately predicting, say, the whole stock market would require a model to learn everything from economic theory, geopolitics, human psychology, natural resources and their extraction, crime, electronic information systems and their optimizations, game theory, ...

That isn't a model I would call "just a stock price predictor".

The misconception that training a model to predict creates something that "just" predicts is prevalent, but ... well, I struggle for words to describe how deeply ignorant, wrong, and category-violating that misconception is.

Human language is an artifact created by complex beings. A high level of understanding of how those complex beings operate in conversation, writing, speeches, legal theory, .... on and on ... their knowledge, their assumptions, their modeling of each other in their interactions, ... on and on ... becomes necessary to mimic general written artifacts between people even a little bit.

LLMs, at the point of being useful, were never "just" prediction machines.

I am astonished there were technical people still saying such a thing.

furyofantares•9m ago
It never was "just predicting the next word", in the sense that that was always a reductive description of artifacts that are plainly more than what the phrase implies.

And also, they are still "just predicting the next word", literally in terms of how they function and are trained. And there are still cases where it's useful to remember this.

I'm thinking specifically of chat psychosis, where people go down a rabbit hole with these things, thinking they're gaining deep insights because they don't understand the nature of the thing they're interacting with.

They're interacting with something that does really good - but fallible - autocomplete based on 3 major inputs.

1) They are predicting the next word based on the pre-training data, internet data, which makes them fairly useful on general knowledge.

2) They are predicting the next word based on RL training data, which causes them to be able to perform conversational responses rather than autocomplete style responses, because they are autocompleting conversational data. This also causes them to be extremely obsequious and agreeable, to try to go along with what you give them and to try to mimic it.

3) They are autocompleting the conversation based on your own inputs and the entire history of the conversation. This, combined with 2), means you are, to a large extent, talking to yourself, or rather to something that is very adept at mimicking and going along with your inputs.

Who, or what, are you talking to when you interact with these? Something that predicts the next word, with varying accuracy, based on a corpus of general knowledge plus a corpus of agreeable question/answer format plus yourself. The general knowledge is great as long as it's fairly accurate, the sycophantic mirror of yourself sucks.
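The "literally how they function" claim above is just the autoregressive loop: predict a distribution over the next token given everything so far, sample one, append it, repeat. A minimal sketch with a made-up hand-written probability table standing in for the network (all names and numbers here are my own invention):

```python
import random

# Toy next-token distributions, keyed on the last token only.
# A real LLM conditions on the entire context window, but the
# generation loop itself is the same.
table = {
    "hello": {"there": 0.6, "world": 0.4},
    "there": {"friend": 1.0},
    "world": {"peace": 1.0},
    "friend": {"<eos>": 1.0},
    "peace": {"<eos>": 1.0},
}

def generate(prompt, max_tokens=10):
    tokens = prompt.split()
    for _ in range(max_tokens):
        dist = table.get(tokens[-1])
        if not dist:
            break
        # Sample the next token from the predicted distribution,
        # append it, and repeat: this loop is "predicting the next word".
        words, probs = zip(*dist.items())
        nxt = random.choices(words, weights=probs)[0]
        if nxt == "<eos>":
            break
        tokens.append(nxt)
    return " ".join(tokens)

print(generate("hello"))
```

Everything interesting about an actual model lives inside the table lookup that a neural network replaces; the wrapper that produces the conversation is exactly this loop.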

We can't have nice things because of AI scrapers

https://blog.metabrainz.org/2025/12/11/we-cant-have-nice-things-because-of-ai-scrapers/
1•LorenDB•17s ago•0 comments

Is it a joke?

https://novalis.org/blog/2025-11-06-is-it-a-joke.html
3•luu•1m ago•0 comments

Roadmap for using transcranial ultrasound to learn more about consciousness

https://news.mit.edu/2026/new-tool-could-tell-us-how-consciousness-works-0112
1•jjoe•1m ago•0 comments

An Archaeology of Tracking on Government Websites

https://www.flux.utah.edu/paper/singh-pets26
3•luu•2m ago•0 comments

Small Kafka: Tansu and SQLite on a Free T3.micro (AWS Free Tier)

https://blog.tansu.io/articles/broker-aws-free-tier
2•enether•3m ago•0 comments

Ford CEO Jim Farley Says Physical Buttons Still Being Figured Out

https://fordauthority.com/2026/01/ford-ceo-jim-farley-says-physical-buttons-still-being-figured-o...
1•bookofjoe•4m ago•0 comments

JVM Rainbow – Mixing Java Kotlin Scala Clojure and Groovy

https://github.com/Hakky54/java-tutorials/tree/main/jvm-rainbow
1•hakky54•4m ago•1 comments

Firefox DevTools Hides Unreferenced CSS Variables – Stefan Judis Web Development

https://www.stefanjudis.com/notes/firefox-devtools-unreferenced-css-variables/
2•janandonly•5m ago•0 comments

Categorical Foundations for Cute Layouts

https://arxiv.org/abs/2601.05972
1•zvr•6m ago•1 comments

Hegseth Wants to Integrate Grok into Pentagon Networks

https://arstechnica.com/ai/2026/01/hegseth-wants-to-integrate-musks-grok-ai-into-military-network...
2•zelon88•6m ago•0 comments

Giving coding agents situational awareness (from shell prompts to agent prompts)

https://dave.engineer/blog/2026/01/agent-situations/
1•dave1010uk•6m ago•1 comments

Running Lean at Scale

https://harmonic.fun/news#blog-post-lean
4•eab-•7m ago•0 comments

Atheist's Wager

https://en.wikipedia.org/wiki/Atheist%27s_wager
1•olalonde•7m ago•0 comments

Could police crackdowns help criminal networks?

https://phys.org/news/2025-12-police-crackdowns-criminal-networks.html
1•PaulHoule•8m ago•0 comments

A brilliant warning about the gamification of everyday life

https://www.theguardian.com/books/2026/jan/06/the-score-by-c-thi-nguyen-review-a-brilliant-warnin...
1•herbertl•9m ago•0 comments

2026 Internet Blackout in Iran

https://en.wikipedia.org/wiki/2026_Internet_blackout_in_Iran
2•pykello•10m ago•0 comments

Epic accuses Health Gorilla of improperly accessing medical records

https://www.channel3000.com/news/epic-accuses-silicon-valley-based-network-of-allowing-improper-a...
1•primitivesuave•10m ago•0 comments

Meta Unveils Nuclear-Power Plan to Fuel Its AI Ambitions

https://www.wsj.com/tech/ai/meta-unveils-sweeping-nuclear-power-plan-to-fuel-its-ai-ambitions-65c...
1•gmays•10m ago•0 comments

We built the "Excel of Finance Apps," but the growth isn't there. Next move?

1•caoxhua•10m ago•3 comments

How We Synchronize .NET's Virtual Monorepo

https://devblogs.microsoft.com/dotnet/how-we-synchronize-dotnets-virtual-monorepo/
1•jayd16•11m ago•0 comments

Navigating the volatile silicon market: updates on memory and storage pricing

https://frame.work/de/en/blog/updates-on-memory-pricing-and-navigating-the-volatile-memory-market
1•layer8•12m ago•0 comments

Premeditatio Malorum and Anxiety

https://jondeaton.github.io/post/premeditatio_malorum/
1•semiinfinitely•14m ago•0 comments

How does Golang know time.Now?

https://tpaschalis.me/golang-time-now/
1•fanf2•15m ago•0 comments

Bevy 0.18

https://bevy.org/news/bevy-0-18/
2•_han•15m ago•0 comments

StudentRisk AI – Predicting student dropout and wellbeing using AI and analytics

https://studentrisk.admnwizard.com/dashboard
1•sureshwin•17m ago•1 comments

Show HN: DeepFace now supports DB-backed vector search for face recognition

https://sefiks.com/2026/01/01/introducing-brand-new-face-recognition-in-deepface/
1•serengil•20m ago•0 comments

Give the Internet an Infinite Word Search and the Internet Will Draw Dicks on It

https://gizmodo.com/give-the-internet-an-infinite-word-search-and-the-internet-will-draw-a-dick-o...
1•yathern•21m ago•0 comments

Mailchimp Free Plan Now Supports Only 250 Contacts

https://blog.groupmail.io/mailchimp-free-plan-changes-2026/
2•apparent•26m ago•1 comments

Ask HN: Fastest Security Event Info Channels?

2•timnetworks•26m ago•0 comments

Why doesn't Google Maps show events?

https://tommaso-girotto.co/blog/a-universal-events-app
1•tgirotto•29m ago•2 comments