I built this website which allows you to:
Spectate: Watch different models play against each other.
Play: Create your own table and play hands against the agents directly.
It’s similar to how an LLM can sometimes play chess on a reasonably high (but not world-class) level, while Stockfish (the chess solver) can easily crush even the best human player in the world.
To limit the scope of what it has to simulate.
It's unlikely they're perfect, but there are very small differences in EV between betting 100% vs 101.6% or whatever.
Stockfish isn't really a solver; it's a neural-net-based engine.
There are other poker playing programs [0] - what we called AI before large language models were a thing - which achieve superhuman performance in real time in this format. They would crush the LLMs here. I don't know what's publicly available though.
This also wouldn't even be a close contest, I think Pluribus demonstrated a solid win rate against professional players in a test.
As I was developing this project, one thought kept coming to mind: how cost and performance compare between a purpose-built AI such as Pluribus and a general-purpose LLM. I think training Pluribus cost ~$144 in cloud computing credits.
I was interested in this idea too and made a video where some of the previous top LLMs play against each other https://www.youtube.com/watch?v=XsvcoUxGFmQ&t=2s
It's mostly a ChatGPT conversational interface over a classic Solver (Monte-Carlo simulation based), but that ease of use makes it very convenient for quick post-game analysis of hands.
I'm sure if you hook a Solver to a hud, it might be even simpler, but it's quite burdensome for amateurs, and it might be too close to cheating.
These LLMs are playing better than most human players I encounter (low limits).
They're kinda bad, but not as criminally bad as the humans.
How much of a session is based on "reading players" vs "playing the odds"?
What I am getting at is: how different is poker from, say, roulette or blackjack? My initial thought is that poker such as TX hold 'em is not a game offered in a casino, so it must be mostly indeterminate. I imagine that the casino versions of poker are not TX hold 'em.
By contrast, roulette is simply a game where the casino wins eventually with a fixed profit (thanks to 0 and a possible 00). That is all well documented.
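To put a number on that fixed profit, here is a minimal sketch for a single-number ("straight-up") bet. It assumes the standard 35:1 payout and 37 pockets (single zero) or 38 pockets (0 and 00); the function name is just for illustration.

```python
from fractions import Fraction

def straight_up_ev(pockets: int) -> Fraction:
    """Expected value per 1 unit staked on a single number paying 35:1."""
    p_win = Fraction(1, pockets)
    # Win 35 units with probability p_win, lose the 1-unit stake otherwise.
    return p_win * 35 - (1 - p_win)

european = straight_up_ev(37)  # wheel with a single 0
american = straight_up_ev(38)  # wheel with 0 and 00

print(f"European: {float(european):.2%} per unit staked")
print(f"American: {float(american):.2%} per unit staked")
```

The EV is negative on both wheels, and the extra 00 pocket roughly doubles the house's cut.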
I have only ever visited a casino once, 25 years ago, Plymouth, Devon as it turns out and I was advised to only take £50 in readies and bail out when it was gone. I came out £90 up, which was nice and my "advisor" came out £95 up (eventually, after being £200 down at one point). Sadly my "advisor" ended up bankrupt a year later.
So, how do you play an LLM? I would imagine that conversation is not allowed ...
Most common game spread is 9-handed $200 max $1/$2 NLHE. It's exactly like the game on the link, except more players and lower stakes.
In the game, you try to win the money of the other 8 players, not of the casino. The casino takes a rake each hand, and a player with a large enough edge can overcome it. The edge might be you're excellent, or it might be they're terrible (or drunk). But the house gets paid to deal each hand.
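A rough sketch of "edge vs. rake", if it helps. The rake structure here (5% of the pot, capped at $5) and all the example numbers are assumptions for illustration only; real rooms vary.

```python
def rake(pot: float, pct: float = 0.05, cap: float = 5.0) -> float:
    """House cut from one pot: a percentage of the pot, up to a cap (assumed structure)."""
    return min(pot * pct, cap)

def net_per_hand(gross_edge: float, avg_pot_won: float, win_freq: float,
                 pct: float = 0.05, cap: float = 5.0) -> float:
    """Expected profit per hand after rake.

    gross_edge:  expected win per hand before rake, in dollars (hypothetical input)
    avg_pot_won: average size of the pots this player wins
    win_freq:    fraction of hands in which this player wins the pot
    """
    # Rake comes out of the pot, so the winner of the hand effectively pays it.
    return gross_edge - win_freq * rake(avg_pot_won, pct, cap)

# Made-up example: $0.50/hand edge, wins 15% of hands, average pot won is $40.
print(round(net_per_hand(0.50, 40.0, 0.15), 2))
```

If the result is positive, the edge is big enough to beat the rake; a marginal winner before rake can easily be a loser after it.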
In the long term, poker outcomes are determined by skill. In the short term, they're luck. In the medium term, both. Most people never reach the long term, it's a lot of hands.
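To make "a lot of hands" concrete, here is a back-of-the-envelope sketch. It assumes results per 100 hands are independent and roughly normal, and the win rate (5 big blinds per 100 hands) and standard deviation (80 big blinds per 100 hands) are hypothetical inputs, not measured figures.

```python
import math

def prob_behind(hands: int, winrate_per_100: float, stdev_per_100: float) -> float:
    """Probability that a winning player is still losing after `hands` hands.

    Uses a normal approximation: the mean grows linearly with the number of
    100-hand blocks, the standard deviation grows with its square root.
    """
    blocks = hands / 100
    mean = winrate_per_100 * blocks
    stdev = stdev_per_100 * math.sqrt(blocks)
    z = mean / stdev
    # P(result < 0) for a normal variable with this mean and stdev.
    return 0.5 * math.erfc(z / math.sqrt(2))

for hands in (1_000, 10_000, 100_000):
    print(hands, round(prob_behind(hands, 5, 80), 3))
```

Under these assumptions a solid winner is behind a large share of the time over a thousand hands, and only rarely over a hundred thousand.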
There are also table games, similar to blackjack, that they call "three card poker" etc. These can't be beaten; they favor the house. Standard table game, with a poker flavor. I've never played one of these.
At low levels, playing is ABC simple and mostly about following basic strategy for starting hands and pot odds for chasing. Don’t get fancy and keep your temperament steady and you’ll win.
To a slight degree, you can do better with reading players and identifying them in broad ways (wild, conservative, confused, etc.) but don’t let that allow you to get fancy. Stick to the basic fundamental strategy for hands, position, and pot odds to crush lower level games.
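For anyone unfamiliar with pot odds, a minimal sketch of the arithmetic. It ignores implied odds and future betting, and the function names and example numbers are made up for illustration.

```python
def required_equity(pot_before_call: float, call_amount: float) -> float:
    """Minimum chance of winning needed for a call to break even.

    pot_before_call includes the bet you are facing.
    """
    return call_amount / (pot_before_call + call_amount)

def draw_equity_one_card(outs: int, unseen_cards: int = 46) -> float:
    """Chance of hitting one of `outs` cards on the river (46 unseen cards on the turn)."""
    return outs / unseen_cards

def should_call(pot_before_call: float, call_amount: float, outs: int) -> bool:
    """Call only if the draw hits often enough to pay for itself."""
    return draw_equity_one_card(outs) > required_equity(pot_before_call, call_amount)

# Flush draw (9 outs) on the turn, facing a $50 bet.
print(should_call(150, 50, 9))  # pot is $150 including the bet
print(should_call(250, 50, 9))  # pot is $250 including the bet
```

The same draw is a fold into the smaller pot and a call into the bigger one, which is all "pot odds for chasing" really means.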
Others may differ and I am biased because 99% of my play has been online, but I'd say it's almost entirely playing the odds. Or at least, the popular romantic conception of looking for tells or whatever, is, I would expect, a really minimal edge compared to simply playing better.
You do learn the other players' tendencies and adapt accordingly, and table selection is very important, so in that sense it is very much about reading players.
A large part of my play was heads up where it's very much about understanding the other player's play as deeply as possible, and so if I wanted to be technically accurate about reading players vs playing the odds, I'd say both are very important. But if I'm answering someone who has the popular conception of what those phrases mean, I think saying "it's about playing the odds" would give them the more accurate picture.
You really want to be good at playing the odds, and you don't want to stray too far from fundamentally good play. If someone is learning how to play and I'm advising them, I'm teaching them all about playing the odds, and trying to get them to read players less. Only once they have a solid fundamental understanding of the odds would I teach them how to adjust.
Why not? Because you think it's a game where the casino can lose?
If so it's not an issue, as casinos that provide poker take "fees" from the stakes. Like how stock exchanges work: there are people making or losing money from stock market, but exchanges are always making profit.
But seriously, at lower stakes there is just no respect for the art. It's just a shock-and-awe strategy: throw shit up, break the game and use that demoralization to bully others.
Post-flop on the other hand is all over the place...
That is, good enough to compete amongst each other but not good enough for one to win.
Or do you mean - each agent has a chance to think after every turn?
Your idea of having it passed in real time and having the LLM create a chain of thought even when the action is not on them is interesting. I'd be curious to see if it would result in improved play.
What I'm curious about is if their innate training is enough to give them biases. Like maybe they think Grok is full of shit.