
Markov chains are the original language models

https://elijahpotter.dev/articles/markov_chains_are_the_original_language_models
76•chilipepperhott•4d ago

Comments

allthatineed•1h ago
I remember playing with megahal eggdrop bots.

This was one of my first forays into modifying C code, trying to figure out why 350 MB seemed to be the biggest brain size (32-bit memory limits, plus requiring a contiguous block for the entire brain).

I miss the innocence of those days. Just being a teen, tinkering with things I didn't understand.

foobarian•1h ago
I'm old now, but thanks to LLMs I can now again tinker with things I don't understand :-)
codr7•1h ago
Are you though? Or is the LLM the target of your tinkering and lack of understanding? Honest question.
jcynix•17m ago
The nice thing about LLMs is that they can explain stuff so you can learn to understand. And they are very patient.

For example I'm currently relearning various ImageMagick details and thanks to their explanations now understand things that I cut/copy/pasted a long time ago without always understanding why things worked the way they did.

vunderba•59m ago
I remember reading the source of the original MegaHAL program when I was younger - one of the tricks that made it stand out (particularly in the Loebner competitions [1]) was that it used both a backwards and forwards Markov chain to generate responses.

[1] https://en.wikipedia.org/wiki/Loebner_Prize

glouwbug•1h ago
The Practice of Programming by Kernighan and Pike had a really elegant Markov:

https://github.com/Heatwave/the-practice-of-programming/blob...
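The program in that chapter is an order-2 word chain: map each two-word prefix to the list of words that followed it, then walk the map. The book's version is in C; here is a hedged Python sketch of the same idea (names are mine, not the book's):

```python
import random
from collections import defaultdict

def build_chain(words, order=2):
    # prefix tuple of `order` words -> list of observed suffixes
    chain = defaultdict(list)
    for i in range(len(words) - order):
        prefix = tuple(words[i:i + order])
        chain[prefix].append(words[i + order])
    return chain

def generate(chain, length=30):
    prefix = random.choice(list(chain))  # random starting prefix
    out = list(prefix)
    for _ in range(length):
        suffixes = chain.get(tuple(out[-len(prefix):]))
        if not suffixes:
            break  # dead end: this prefix only appeared at the end of the corpus
        out.append(random.choice(suffixes))
    return " ".join(out)
```

Because duplicate suffixes are kept in the list, the random choice is automatically weighted by frequency, which is the same elegant shortcut the book uses.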

jcynix•23m ago
And Mark V. Shaney was designed by Rob Pike and posted on Usenet, but that happened a long time ago:

https://en.wikipedia.org/wiki/Mark_V._Shaney

chankstein38•1h ago
Once, probably 4-6 years ago, I used exports from Slack conversations to train a Markov chain to recreate a user who had been around a lot and then left for a while. I wrote the whole thing in Python; I wasn't especially well-versed in the statistics and math side, but I understood the principle. I made a bot, had it join the Slack instance that I administrate, and it would interact if you tagged it or if you said things that person always responded to (hardcoded).

Well, the responses were pretty messed up and not accurate but we all got a good chuckle watching the bot sometimes actually sound like the person amidst a mass of random other things that person always said jumbled together :D

vunderba•1h ago
I had a similar program designed as my "AWAY" bot that was trained on transcripts of my previous conversations and connected to Skype. At the time (2009) I was living in Taiwan so I would activate it when I went to bed to chat with my friends back in the States given the ~12 hour time difference. Reading back some of the transcripts made it sound like I was on the verge of a psychotic break though.
ahmedhawas123•46m ago
Random tidbit: 15+ years ago, Markov chains were the go-to for auto-generating text. Google was not as advanced as it is today at flagging spam, so search engine results pages for affiliate-marketing-dense topics (e.g., certain medications and products) were swamped with Markov-chain-created websites injected with certain keywords.
AnotherGoodName•37m ago
The problem is the linear nature of Markov chains. Sure, they can branch, but after an observation you are absolutely at a new state: A goes to B goes to C, etc. A classic problem for understanding why this is an issue is feeding in a 2D bitmap where the patterns are vertical but you're passing in data left to right, which Markov chains can't handle, since they navigate exclusively on the current left-to-right input. They miss the patterns completely. Similar things happen with language: language is not linear, and context from a few sentences ago should change probabilities in the current sequence of characters. The attention mechanism is the best we have for this, and Markov chains struggle beyond stringing together a few syllables.

I have played with Markov chains a lot. I tried having skip states and such but ultimately you’re always pushed towards doing something similar to the attention mechanism to handle context.
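The bitmap failure can be made concrete with a toy experiment (my illustration, not from the comment): fill each column with a random but constant value, then compare transition counts along rows versus along columns. The left-to-right chain sees noise; the vertical structure is perfectly predictable, but only if you scan in the right direction.

```python
import random
from collections import Counter

random.seed(0)
n_cols, n_rows = 1000, 20
col_vals = [random.choice([0, 1]) for _ in range(n_cols)]  # each column all-0 or all-1
bitmap = [[col_vals[c] for c in range(n_cols)] for _ in range(n_rows)]

# Left-to-right transitions within rows: roughly 50/50, uninformative
row_trans = Counter()
for row in bitmap:
    for a, b in zip(row, row[1:]):
        row_trans[(a, b)] += 1

# Top-to-bottom transitions within columns: fully deterministic
col_trans = Counter()
for c in range(n_cols):
    for r in range(n_rows - 1):
        col_trans[(bitmap[r][c], bitmap[r + 1][c])] += 1
```

An order-1 chain trained on the row scan learns nothing useful, while the column scan yields a perfect predictor; the chain itself has no way to discover which scan order carries the pattern.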

cuttothechase•4m ago
Would having a Markov chain of Markov chains help in this situation? One chain for when the 2D bitmap patterns are vertical, and another for left to right?
6r17•3m ago
Would you say they're interesting to explore, having spent so much time on them? Do you feel one could make pragmatic use of them in certain contexts, or are they too much of a toy, where most of the time getting a coherent LLM service would ease the work?
taolson•36m ago
If you program a Markov chain to generate based on a fairly short sequence length (4-5 characters), it can create some neat portmanteaus. I remember back in the early '90s I trained one on some typical tech literature and it came up with the word "marketecture".
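A character-level chain with a 4-character state does exactly this kind of blending: wherever two words share a 4-letter window, generation can hop from one to the other. A small sketch (the corpus and names here are illustrative, not the commenter's setup):

```python
import random
from collections import defaultdict

def char_chain(text, order=4):
    # 4-character window -> list of observed next characters
    chain = defaultdict(list)
    for i in range(len(text) - order):
        chain[text[i:i + order]].append(text[i + order])
    return chain

def coin_word(chain, order=4, length=12, seed_key=None):
    # grow a string character by character from a starting window
    key = seed_key or random.choice(list(chain))
    out = key
    while len(out) < length and out[-order:] in chain:
        out += random.choice(chain[out[-order:]])
    return out
```

Train it on "marketing" and "architecture" side by side and the shared windows are where portmanteau-style jumps like "marketecture" become possible.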
cestith•30m ago
I've been telling people for years that a reasonably workable initial, simplified mental model of a large language model is a Markov chain generator trained on an unlimited, weighted corpus. Very few people who know LLMs have critiqued that thought beyond saying it's a coarse description that downplays the details. Since "simplified" is in the initial statement, and it's not meant to capture detail, I say: if it walks like a really big duck and it honks instead of quacking, then maybe it's a goose or a swan, which are both pretty duck-like birds.
nerdponx•24m ago
It's not a Markov chain because it doesn't obey the Markov property.

What it is, and what I assume you mean, is a next-word prediction model based solely on the previous sequence of words, up to some limit. It literally is that, because it was designed to be that.

jama211•21m ago
Sure, but arguably by that definition so are we ;)
jcynix•26m ago
Ah, yes, Markov chains. A long time ago, Mark V. Shaney (https://en.wikipedia.org/wiki/Mark_V._Shaney) was designed by Rob Pike and posted on Usenet.

And Veritasium's video "The Strange Math That Predicts (Almost) Anything" talks in detail about the history of Markov chains: https://youtu.be/KZeIEiBrT_w

benob•19m ago
Like it or not, LLMs are effectively high-order Markov chains.
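The sense of "high-order" here can be stated precisely (my gloss, not from the thread): a model with a fixed context window of k tokens assumes

```latex
P(w_t \mid w_1, \dots, w_{t-1}) = P(w_t \mid w_{t-k}, \dots, w_{t-1})
```

which is the order-k Markov property over token sequences. The disagreement upthread is then really about representation: an LLM estimates these conditional probabilities parametrically rather than storing an explicit transition table.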
guluarte•16m ago
Markov chains with limited self-correction.
BenoitEssiambre•7m ago
Exactly. I think of them as Markov Chains in grammar space or in Abstract Syntax Tree space instead of n-gram chain-of-words space. The attention mechanism likely plays a role in identifying the parent in the grammar tree or identifying other types of back references like pronouns or if it's for programming languages, variable back references.
fair_enough•2m ago
On a humorous note, OP, you seem like exactly the kind of guy who would get a kick out of this postmodern essay generator that a STEM professor wrote using a Recursive Transition Network in 1996:

https://www.elsewhere.org/journal/pomo/

Every now and again, I come back to it for a good chuckle. Here's what I got this time (link to the full essay below the excerpt):

"If one examines subcultural capitalist theory, one is faced with a choice: either reject subdialectic cultural theory or conclude that the significance of the artist is significant form, but only if culture is interchangeable with art. It could be said that many desituationisms concerning the capitalist paradigm of context exist. The subject is contextualised into a presemanticist deappropriation that includes truth as a totality."

https://www.elsewhere.org/journal/pomo/?1298795365

jedberg•2m ago
For funsies, one of the early Reddit engineers wrote a Markov comment generator trained on all the existing comments.

It worked surprisingly well. :)

Libghostty is coming

https://mitchellh.com/writing/libghostty-is-coming
302•kingori•5h ago•71 comments

Find SF parking cops

https://walzr.com/sf-parking/
192•alazsengul•1h ago•111 comments

Android users can now use conversational editing in Google Photos

https://blog.google/products/photos/android-conversational-editing-google-photos/
79•meetpateltech•2h ago•60 comments

How to draw construction equipment for kids

https://alyssarosenberg.substack.com/p/how-to-draw-construction-equipment
16•holotrope•33m ago•2 comments

Launch HN: Strata (YC X25) – One MCP server for AI to handle thousands of tools

90•wirehack•4h ago•48 comments

Go has added Valgrind support

https://go-review.googlesource.com/c/go/+/674077
397•cirelli94•10h ago•100 comments

From MCP to shell: MCP auth flaws enable RCE in Claude Code, Gemini CLI and more

https://verialabs.com/blog/from-mcp-to-shell/
75•stuxf•4h ago•24 comments

Always Invite Anna

https://sharif.io/anna-alexei
321•walterbell•4h ago•23 comments

Mesh: I tried Htmx, then ditched it

https://ajmoon.com/posts/mesh-i-tried-htmx-then-ditched-it
114•alex-moon•7h ago•78 comments

Nine things I learned in ninety years

http://edwardpackard.com/wp-content/uploads/2025/09/Nine-Things-I-Learned-in-Ninety-Years.pdf
826•coderintherye•16h ago•316 comments

x402 — An open protocol for internet-native payments

https://www.x402.org/
167•thm•5h ago•87 comments

Getting AI to work in complex codebases

https://github.com/humanlayer/advanced-context-engineering-for-coding-agents/blob/main/ace-fca.md
103•dhorthy•5h ago•103 comments

Getting More Strategic

https://cate.blog/2025/09/23/getting-more-strategic/
126•gpi•7h ago•18 comments

Restrictions on house sharing by unrelated roommates

https://marginalrevolution.com/marginalrevolution/2025/08/the-war-on-roommates-why-is-sharing-a-h...
247•surprisetalk•5h ago•287 comments

Thundering herd problem: Preventing the stampede

https://distributed-computing-musings.com/2025/08/thundering-herd-problem-preventing-the-stampede/
17•pbardea•19h ago•6 comments

Structured Outputs in LLMs

https://parthsareen.com/blog.html#sampling.md
173•SamLeBarbare•9h ago•80 comments

OpenDataLoader-PDF: An open source tool for structured PDF parsing

https://github.com/opendataloader-project/opendataloader-pdf
64•phobos44•5h ago•17 comments

Agents turn simple keyword search into compelling search experiences

https://softwaredoug.com/blog/2025/09/22/reasoning-agents-need-bad-search
48•softwaredoug•5h ago•19 comments

Zinc (YC W14) Is Hiring a Senior Back End Engineer (NYC)

https://app.dover.com/apply/Zinc/4d32fdb9-c3e6-4f84-a4a2-12c80018fe8f/?rs=76643084
1•FriedPickles•7h ago

Zoxide: A Better CD Command

https://github.com/ajeetdsouza/zoxide
277•gasull•14h ago•174 comments

Denmark wants to push through Chat Control

https://netzpolitik.org/2025/internes-protokoll-daenemark-will-chatkontrolle-durchdruecken/
12•Improvement•33m ago•1 comment

Shopify, pulling strings at Ruby Central, forces Bundler and RubyGems takeover

https://joel.drapper.me/p/rubygems-takeover/
273•bradgessler•4h ago•151 comments

YAML document from hell (2023)

https://ruudvanasseldonk.com/2023/01/11/the-yaml-document-from-hell
169•agvxov•10h ago•111 comments

Show HN: Run Qwen3-Next-80B on 8GB GPU at 1tok/2s throughput

https://github.com/Mega4alik/ollm
86•anuarsh•4d ago•8 comments

The Great American Travel Book: The book that helped revive a genre

https://theamericanscholar.org/the-great-american-travel-book/
6•Thevet•2d ago•0 comments

Smooth weighted round-robin balancing

https://github.com/nginx/nginx/commit/52327e0627f49dbda1e8db695e63a4b0af4448b1
17•grep_it•4d ago•2 comments

Processing Strings 109x Faster Than Nvidia on H100

https://ashvardanian.com/posts/stringwars-on-gpus/
158•ashvardanian•4d ago•23 comments

Show HN: Kekkai – a simple, fast file integrity monitoring tool in Go

https://github.com/catatsuy/kekkai
40•catatsuy•5h ago•9 comments

Permeable materials in homes act as sponges for harmful chemicals: study

https://news.uci.edu/2025/09/22/indoor-surfaces-act-as-massive-sponges-for-harmful-chemicals-uc-i...
93•XzetaU8•10h ago•82 comments