I wonder if something similar will happen here.
@eieio please open source the Go code, would be fun to poke at.
It's kind of annoying and expensive to get a bunch of IPv4s to evade limits, but it's really easy to get a TON of IPv6s.
The other Big Trick I know is to persist rate limits after a client disconnects so that they can't disconnect -> reconnect to refresh their limits.
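Not necessarily how this server does it, but a minimal Go sketch of that trick: key the limiters by client IP (not by connection) and only expire them on idle time, never on disconnect. The golang.org/x/time/rate limiter and the specific numbers here are just stand-ins.

    import (
        "sync"
        "time"

        "golang.org/x/time/rate"
    )

    // limiterEntry pairs a token bucket with the last time we saw this client,
    // so the limit survives disconnect -> reconnect and is only dropped after
    // a long idle period.
    type limiterEntry struct {
        lim      *rate.Limiter
        lastSeen time.Time
    }

    type RateLimits struct {
        mu      sync.Mutex
        clients map[string]*limiterEntry // keyed by client IP, not by connection
    }

    func NewRateLimits() *RateLimits {
        return &RateLimits{clients: make(map[string]*limiterEntry)}
    }

    // Allow looks up (or creates) the limiter for this IP. Nothing is removed
    // when a websocket closes, so reconnecting doesn't refresh the budget.
    func (r *RateLimits) Allow(ip string) bool {
        r.mu.Lock()
        defer r.mu.Unlock()
        e, ok := r.clients[ip]
        if !ok {
            e = &limiterEntry{lim: rate.NewLimiter(rate.Limit(10), 20)} // made-up numbers: 10 moves/sec, burst of 20
            r.clients[ip] = e
        }
        e.lastSeen = time.Now()
        return e.lim.Allow()
    }

    // Sweep occasionally drops entries that have been idle long enough that
    // keeping their state no longer matters.
    func (r *RateLimits) Sweep(idle time.Duration) {
        r.mu.Lock()
        defer r.mu.Unlock()
        for ip, e := range r.clients {
            if time.Since(e.lastSeen) > idle {
                delete(r.clients, ip)
            }
        }
    }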
My blog describing it is pretty sparse, sorry about that. Happy to answer any questions that folks have about the architecture.
Not that it was necessary, but I got really into building this out as a single process that could handle many (10k+/sec) moves for thousands of concurrent clients. I learned a whole lot! And I found golang to be a really good fit for this, since you mostly want to give tons and tons of threads concurrent access to a little bit of shared memory.
The dependency graph is between pieces you're interacting with? Meaning if you move a queen and are trying to capture a pawn, and there's potentially a rook that can capture your queen, those three are involved in that calculation, and if you moved your queen but the rook also captures your queen at the same time, one of them wins? How do you determine that?
(edit: I realized I didn't answer your question. If we receive a capture for a piece we're optimistically tracking, that always takes precedence, since once a piece is captured it can't move anymore!)
* clients send a token with each move
* they either receive a cancel or an accept for each token, depending on whether the move is valid. If they receive an accept, it comes with the sequence number of the move (everything has a seqnum) and the ID of the piece they captured, if applicable
* clients receive batches of captures and moves. if a move captured a piece, it's guaranteed that that capture shows up in the same batch as the move (rough message shapes sketched below)
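To make that concrete, here's roughly the shape I'd imagine those messages taking, written as Go structs; the real protocol is protobuf over the websockets, and all of the field names here are my guesses.

    // Client -> server: every move carries a client-chosen token so the client
    // can match it to the eventual accept/cancel.
    type MoveRequest struct {
        Token    uint64
        PieceID  uint64
        ToX, ToY uint16
    }

    // Server -> client: exactly one of these two arrives per token.
    type MoveAccepted struct {
        Token      uint64
        Seq        uint64 // everything has a sequence number
        CapturedID uint64 // ID of the captured piece, 0 if nothing was captured
    }

    type MoveCanceled struct {
        Token uint64
    }

    // Server -> client: moves and captures arrive in batches; if a move in a
    // batch captured a piece, that capture is guaranteed to be in the same batch.
    type Batch struct {
        Moves    []Move
        Captures []Capture
    }

    type Move struct {
        Seq      uint64
        PieceID  uint64
        ToX, ToY uint16
    }

    type Capture struct {
        Seq     uint64
        PieceID uint64
    }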
So when you make a move we:
* Write down all impacted squares for the move (2 most of the time, 4 if you castle)
* Write down its move token
* If you moved a piece that is already tracked optimistically from a prior not-yet-acked-or-canceled move, we note that dependency as well (see the sketch below)
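In other words, something like one of these records per optimistic move. This is a sketch with invented names; the real client presumably lives in the browser, but Go keeps it consistent with the rest of this thread.

    type Square struct{ X, Y uint16 }

    // pendingMove is the optimistic record for a move we've sent but not yet
    // had accepted or canceled.
    type pendingMove struct {
        token      uint64   // the token we sent with the move
        pieceID    uint64   // the piece we optimistically moved
        squares    []Square // impacted squares: 2 normally, 4 for a castle
        capturedID uint64   // piece we think we captured after the fact, 0 if none
        dependsOn  []uint64 // tokens of earlier not-yet-acked moves this one builds on
    }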
We maintain this state separate from our ground truth from the server and overlay it on top. When we receive a new move, we compare it with our optimistic state. If the move occupies the same square as a piece that we've optimistically moved, we ask "is it possible that we inadvertently captured this piece?" That requires that the piece is of the opposite color and that we made a move that could have captured a piece (moving a pawn straight forward, for example, is not a valid capturing move).
If there's a conflict - if you moved a piece to a square that is now occupied by a piece of the same color, for example - we back out your optimistically applied move. We then look for any moves that depended on it - moves that touch the same squares or share the same move token (because you optimistically moved a piece twice before receiving a response).
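Continuing the sketch above (same invented names, same caveats), the revert cascade looks roughly like this; the concrete example below walks through it.

    type client struct {
        pending map[uint64]*pendingMove // keyed by move token
    }

    func (p *pendingMove) touches(s Square) bool {
        for _, q := range p.squares {
            if q == s {
                return true
            }
        }
        return false
    }

    // overlaps reports whether two pending moves share a square or a token.
    func (p *pendingMove) overlaps(o *pendingMove) bool {
        if p.token == o.token {
            return true
        }
        for _, s := range o.squares {
            if p.touches(s) {
                return true
            }
        }
        return false
    }

    // undo would restore the board squares pm changed; elided here.
    func (c *client) undo(pm *pendingMove) {}

    // revertCascade backs out every optimistic move that touches the conflicting
    // square, then transitively backs out anything that depended on those moves.
    func (c *client) revertCascade(conflict Square) {
        var queue []*pendingMove
        for _, pm := range c.pending {
            if pm.touches(conflict) {
                queue = append(queue, pm)
            }
        }
        for len(queue) > 0 {
            pm := queue[0]
            queue = queue[1:]
            if _, ok := c.pending[pm.token]; !ok {
                continue // already reverted via another dependency
            }
            c.undo(pm)
            delete(c.pending, pm.token)
            for _, other := range c.pending {
                if other.overlaps(pm) {
                    queue = append(queue, other)
                }
            }
        }
    }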
So concretely, imagine you have this state:
_ _ _ _
K B _ R
You move the bishop out of the way, and then you castle:
_ _ B _
_ R K _
Then a piece of the same color moves to where your bishop was! We notice that, revert the bishop move, notice that it used the same square as your castle, and revert that too. There's some more bookkeeping here. For example, we also have to track the IDs of the pieces that you moved (if you move a bishop and then we receive another move for the same bishop, that move takes precedence).
Returning the captured piece ID from the server ack is essential, because we potentially simulate after-the-fact captures (you move a bishop to a square, a rook of the opposite color moves to that square, and we decide you probably captured that rook and don't revert your move). We track that, and when we receive our ack we compare that state with the ID of the piece we actually captured.
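So on the ack, the reconciliation is roughly this (again a sketch building on the types above; restorePiece and removePiece are stand-ins for whatever edits the local board state):

    func (c *client) restorePiece(id uint64) {} // put a wrongly-removed piece back
    func (c *client) removePiece(id uint64)  {} // take a captured piece off the board

    // handleAccept reconciles an optimistic move with the server's ack. The
    // CapturedID on the ack is ground truth; if our after-the-fact capture
    // guess was wrong we fix the board up.
    func (c *client) handleAccept(a MoveAccepted) {
        pm, ok := c.pending[a.Token]
        if !ok {
            return // this move was already reverted or superseded
        }
        if pm.capturedID != a.CapturedID {
            if pm.capturedID != 0 {
                c.restorePiece(pm.capturedID) // we guessed a capture that didn't happen
            }
            if a.CapturedID != 0 {
                c.removePiece(a.CapturedID) // the server says we captured this piece
            }
        }
        delete(c.pending, a.Token) // the move is now part of ground truth
    }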
I think that's most of it? It was a real headache but very satisfying once I got it working.
On this, I found Go to strike the right balance: no worrying about memory management, yet decent concurrency primitives and decent performance (memory use is especially impressive). I also did a multiplayer single-server Go app with pseudo-realtime updates (long polling waiting for updates on related objects).
My goal with the board architecture was "just be fast enough that I'm limited by serialization and syscalls for sending back to clients" and Go made that really easy to do; I spend a few hundred nanos holding the write lock and ~15k nanos holding the read lock (but obviously many readers can do that at once), and that was enough for me.
I definitely have some qualms with it, but after this experience it's hard to imagine using something else for a backend with this shape.
My biggest goal was "make sure that my bottleneck is serialization or syscalls for sending to the client." Those are both things I can parallelize really well, so I could (probably) scale my way out of them vertically in a pinch.
So I tried to pick an architecture that would make that true; I evaluated a ton of different options but eventually did some napkin math and decided that a 64-million-entry uint64 array behind a single mutex was probably ok[1].
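For scale, the napkin-math version of that layout looks something like the following. The 1000x1000 board grid, the one-uint64-per-square encoding, and the 100x100-square snapshot are my assumptions (reading footnote [1]), not necessarily the real layout.

    import "sync"

    const (
        boardsPerSide = 1000              // assume the 1M boards tile a 1000x1000 grid
        rowLen        = boardsPerSide * 8 // 8,000 squares per global row
        totalSquares  = rowLen * rowLen   // 64,000,000 squares -> 64M uint64s (~512MB)
    )

    // World is the whole ground-truth state: one uint64 per square (piece ID,
    // type, and color packed in somehow) behind a single RWMutex.
    type World struct {
        mu      sync.RWMutex
        squares []uint64
    }

    func NewWorld() *World {
        return &World{squares: make([]uint64, totalSquares)}
    }

    // ApplyMove holds the write lock only for the handful of squares a move touches.
    func (w *World) ApplyMove(from, to int, piece uint64) {
        w.mu.Lock()
        w.squares[from] = 0
        w.squares[to] = piece
        w.mu.Unlock()
    }

    // Snapshot copies a 100x100-square window under the read lock: 100 copies
    // of 100 contiguous uint64s, not 10,000 scattered reads (see footnote [1]).
    func (w *World) Snapshot(x0, y0 int) [][]uint64 {
        out := make([][]uint64, 100)
        w.mu.RLock()
        for r := 0; r < 100; r++ {
            row := make([]uint64, 100)
            copy(row, w.squares[(y0+r)*rowLen+x0:])
            out[r] = row
        }
        w.mu.RUnlock()
        return out
    }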
To validate that I made a script that spins up ~600 bots, has 100 of them slam 1,000,000 moves through the server as fast as possible, and has the other 500 request lots of reads. This is NOT a perfect simulation of load, but it let me take profiles of my server under a reasonable amount of load and gave me a decent sense of my bottlenecks, whether changes were good for speed, etc.
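The load script itself doesn't have to be fancy; the shape is basically the below (sendRandomMove and requestSnapshot are stand-ins for a real websocket/HTTP client, and splitting the 1,000,000 moves evenly across the 100 writers is my reading of the setup):

    func sendRandomMove()  {} // stand-in: send one random-ish move over the websocket
    func requestSnapshot() {} // stand-in: request a board snapshot

    // runLoadTest: 100 writer bots push ~1M moves total as fast as they can
    // while 500 reader bots hammer snapshot reads.
    func runLoadTest() {
        var wg sync.WaitGroup
        for i := 0; i < 100; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                for n := 0; n < 10_000; n++ { // 100 bots x 10k moves = 1M moves
                    sendRandomMove()
                }
            }()
        }
        for i := 0; i < 500; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                for n := 0; n < 1_000; n++ {
                    requestSnapshot()
                }
            }()
        }
        wg.Wait()
    }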
I had a plan to move from a single RWMutex to a row-locking approach with 8,000 of them. I didn't want to do this because it's more complicated and I might mess it up. So instead I just measure the number of nanos that I hold my mutex for and send that to a Loki instance. This was helpful during testing (at one point my read lock time went up 10x!) but more importantly gave me a plan for what to do if prod was slow - I can look at that metric and only tweak the mutex if it's actually a problem.
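Measuring that is cheap. Building on the World sketch above (and assuming "time" is imported), it's roughly: time the critical section and hand the duration to whatever ships it off; the observe callback here stands in for the Loki pipeline.

    // ApplyMoveTimed is ApplyMove plus a measurement of how long the write lock
    // was actually held, so "is the single mutex a problem?" gets answered by
    // data rather than guesswork.
    func (w *World) ApplyMoveTimed(from, to int, piece uint64, observe func(heldNanos int64)) {
        w.mu.Lock()
        start := time.Now()
        w.squares[from] = 0
        w.squares[to] = piece
        held := time.Since(start)
        w.mu.Unlock()
        observe(held.Nanoseconds()) // e.g. push to Loki / a histogram
    }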
I also took some free wins like using protobufs instead of JSON for websockets. I was worried about connection overhead so I moved to GET polling behind Cloudflare's cache for global resources instead of pushing them over websockets.
And then I got comfortable with the fact that I might miss something! There are plenty more measurements I could have taken (if there was money on the line I would have measured some things like "number of TCP connections sending 0 moves this server can support" but I was lazy) but...some of the joy of projects like this is the firefighting :). So I was just ready for that.
Oh and finally I consulted with some very talented systems/performance engineer friends and ran some numbers by them as a sanity check.
It looks like this was way more work than I needed to do! I think I could comfortably 25x the current load and my server would be ok. But I learned a lot and this should all make the next project faster to build :)
[1] I originally did my math wrong and modeled the 100x100 snapshots I send to clients as 10,000 reads from main memory instead of 100 copies of 100 uint64s, which led me down a very different path... I'm not used to thinking about this stuff!
I won
Building fortresses (since the enemy can't cross the board border by capture) is also fun.
1) fill corner board with pieces
2) cover the border (from inside) with pawns
3) cover the promotion square on the border with the king
The king can't exit the board, the pawns can't walk backward to leave space, and the filler pieces don't allow those two to make way.
---
the real holy grail that will never be achieved I fear
I enjoyed the SC2 UI when selecting pieces
Evidently you can move between boards but not capture between boards :-( It's extra weird because it's not that the movement isn't projected (e.g. the queen's blue lines all point correctly across board boundaries); it's just that the lines always stop at every piece on the other board, regardless of color.
So, I guess as an exercise in scale, well done! As one million chess boards, caveat gamator
I'm sorry you don't like that decision! But I think that I stand by it.
I work in chess tech, but in a very different direction (structured games, coaching, serious play). It's inspiring to see chess reimagined like this!
pavel_lishin•7h ago
Neat, though I expected every individual board to have "turns" - I didn't expect that I could just pick a random board, liberate the black queen, and have her clean up every single white piece on the board without my "opponent" getting to do anything in return.
NooneAtAll3•4h ago
get the most kills, make a cool shape
or take a piece veeeeery far away from home
eieio•5h ago
Anyway, yeah, I guess I could have gone with turns here, but I thought that building a more realtime MMO thing where pieces could cross boards would be a little more interesting and novel. I also didn't feel like a turn-based version of this would ever complete.
Certainly a queen can go wipe out a whole board, but the game tries to place you next to other active players when you join, which hopefully promotes some interesting counterplay to that. And I think playing chess in realtime like this against someone is pretty fun. But I understand why it might not be for everyone!
eieio•39m ago
[1] https://news.ycombinator.com/item?id=40800869 for example
dang•1h ago
(Oh and I still owe you an email. I haven't forgotten!)
eieio•43m ago
(and thanks, I'm in no rush!!)