There is also the aspect of reinforcing existing memories, which is easy to miss with this criterion: no single experience will significantly alter your life, but remove all of them and suddenly a change happens.
I for one had my gears grinding trying to recall my own patterns and predict the "AI"'s moves, especially past round 20.
Played 145 times
Max credits 49
Max combo 8
Win 56 (39%)
Draw 35 (24%)
Loss 54 (37%)
My "this game" stats: Played 94 times
Current credits 45
Max credits 49
Max combo 8
Win 39 (41%)
Draw 25 (27%)
Loss 30 (32%)
I was mostly playing as if against a living opponent. I was not exploiting any assumptions about it being a Markov learner (I don't even understand how it learns, or how to exploit it).
While it cannot learn to beat me most of the time, it is playable.
How do you know it learns with a Markov chain? How exactly? (What states does the chain have? Other details?)
Why does the camera shake at every move? To me that's very annoying.
The game needs a stat for the total credits I've spent, including previous rounds, so that I can see the overall spent/won balance. I am clearly winning on that measure.
> How do you know it learns with a Markov chain? How exactly? (What states does the chain have? Other details?)
I wrote it! It uses five chains of different lengths; each of them estimates the next move, and then a standard ensemble method resolves those estimates into a single prediction.
> Why does the camera shake at every move? To me that's very annoying.
Yeah, it is. In the first version the camera actually spun around on the spot too, which was cool but made you sick, even on a phone. So I hacked it down to this, but it's not great.
> The game needs a stat for the total credits I've spent, including previous rounds, so that I can see the overall spent/won balance. I am clearly winning on that measure.
In the menu there is a "statistics" screen which contains more than the live stats; it might have more of what you want!
In any case, Markov chains are a simple and elegant mechanism for making predictions, easy enough to understand that even eleventh-grade school kids (which is when probability theory was introduced in my schooling) can follow the popular "weather forecasting" tutorial using MCs.
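For anyone who hasn't seen that tutorial, the whole idea boils down to a transition table and a sampling step. A minimal sketch, with made-up transition probabilities:

```python
import random

# Toy two-state weather Markov chain, in the spirit of the classic
# tutorial. Rows are today's weather; entries are the probabilities
# of tomorrow's weather. (Probabilities here are invented.)
TRANSITIONS = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

def step(state, rng=random.random):
    """Sample tomorrow's weather given today's state."""
    r = rng()
    total = 0.0
    for nxt, p in TRANSITIONS[state].items():
        total += p
        if r < total:
            return nxt
    return nxt  # guard against floating-point rounding
```

Prediction is then just "which next state is most probable from the current one" — no history beyond the present state is needed, which is what makes the model so cheap.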
Going further, Hidden Markov Models (HMMs) add a layer of hidden states and associated emission probabilities on top of MCs. To learn these, check out Lawrence Rabiner's beautiful tutorial: https://ieeexplore.ieee.org/document/18626
Apart from their simplicity and mathematical elegance it is remarkable how little data and electricity these models require to do a good job.
For example, it is not necessarily trying to beat you. This was inspired by an experience with https://luduxia.com/whichwayround/ where I found most traffic came from people trying to break it, until one day it hit a mailing list and exploded.
This time I updated the game a bit and submitted a link, and it went nowhere when I posted it some days ago. Then I woke to a Linode alert about sustained outgoing traffic, and it turns out the post had been resurrected.
38 wins (44%), 21 draws (24%), 28 losses (32%)
Max combo 7
I think I could have kept going too.
At that time I was quite tired of polishing and repolishing a paper draft at my university. (I write very poorly). And there it was, an announcement of this fun competition. The deadline was just an hour away.
I had no time to implement any respectable algorithm myself.
So, all my submission did was take the history of my opponent's plays, up to the present point in the ongoing joust, and extend it by each of the three possible completions. I would compress each of the three resulting strings; whichever completion gave the shortest compressed string, I took to be my opponent's predicted play. I then played whatever beat that prediction.
This did not win the competition, but survived the knockout for many rounds, beyond my expectation. If I remember correctly, I had used plain old zip. A compressor that converges faster would have been better.
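As a rough sketch of the idea (using Python's zlib here in place of whatever zip variant was actually used — the compressor choice is an assumption):

```python
import zlib

BEATS = {"R": "P", "P": "S", "S": "R"}  # what beats each move

def predict_next(history: str) -> str:
    """history is a string over 'RPS' of the opponent's past moves.

    Extend it with each candidate move, compress, and assume the
    opponent will play whichever continuation compresses best
    (i.e. is most predictable given the history).
    """
    best_move, best_len = None, float("inf")
    for move in "RPS":
        compressed = zlib.compress((history + move).encode())
        if len(compressed) < best_len:
            best_move, best_len = move, len(compressed)
    return best_move

def respond(history: str) -> str:
    """Play the move that beats the predicted opponent move."""
    return BEATS[predict_next(history)]
```

Against a highly repetitive opponent (say, all rock), the all-rock continuation compresses best, so the predictor expects rock and answers with paper.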
In essence my strategy was similar to a Markov chain predictor. It would have been an exact match had I used a Markov-model-based compressor.
The number of rounds of play against one's opponent was not large. Had it been, Lempel-Ziv, being a universal compressor, would have been hard to beat.
Of course we know from game-theoretic analysis what the Nash strategy is for this game, but where is the fun in playing that?
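For reference, that Nash strategy is just uniform randomization, which as a sketch is a one-liner:

```python
import random

def nash_move() -> str:
    # The mixed-strategy Nash equilibrium for RPS: play each move with
    # probability 1/3. No opponent can do better than break even against
    # it in expectation -- but you can't do better than break even either,
    # which is exactly why it's boring to play.
    return random.choice("RPS")
```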
One might think that a transformer-based player would be an excellent strategy, but not necessarily: one needs to learn the potentially non-stationary strategies of the opponent from very few examples, so (near) zero-shot online learning is required.
If I had more time, I would have tried context tree weighting, a universal learner for stochastic strings. Its failure mode would be non-ergodic play by the opponent, since it assumes the opponent, too, is just another stochastic parrot.
Of course these things will always have to assume a semi-strategic opponent.
Following your link I found this
https://daniel.lawrence.lu/programming/rps/
Submitted.
https://news.ycombinator.com/item?id=44105859
There goes my afternoon, nicely nerd sniped :) Thank you.
Probably not. Most writing is self-congratulatory garbage. The best writing I've ever read was not meant to be fancy but got right to the point from the heart, by people who couldn't care less what others thought about them. Your writing in this comment was fine and easy to read.
I am still very bad as far as writing research papers goes. That's one area where my advisor's feedback helped a lot; anything I wrote would need multiple passes.
My writing is bad, partly because of bad grammar and partly because once the cool results of the research are done, my heart is really not in writing them up. It becomes a chore.
EDIT: I see no reason why your comment got down-voted.
As mentioned, I was slightly surprised to see this trending this morning given I thought it had gone nowhere in a prior submission.
Some tech details no one asked about:
* Custom WebGL2 renderer
* Leaderboards are in FoundationDB (talk about overkill)
The Markov chains are a fairly standard 5-chain ensemble setup. Each chain is configured to look at the last k of your previous moves (for k in 1..5) and base its estimate of the next move on that. The ensemble mechanism takes care of working out which chain to go with. This is obviously harder than if it cheated, but it's more fun. The actual Markov chain component took about an hour to write; the stats were worse.
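A hedged sketch of such an ensemble — the count tables, accuracy-based scoring, and tie-breaking here are my guesses at "standard", not necessarily what the game actually does:

```python
from collections import defaultdict

MOVES = "RPS"

class ChainEnsemble:
    """Five order-k Markov predictors (k = 1..5) with a simple
    pick-the-historically-most-accurate ensemble rule."""

    def __init__(self, max_order=5):
        self.orders = range(1, max_order + 1)
        # One count table per order: last-k-moves context -> counts of
        # which move followed that context.
        self.tables = {k: defaultdict(lambda: {m: 0 for m in MOVES})
                       for k in self.orders}
        self.scores = {k: 0 for k in self.orders}  # per-chain hit count
        self.history = ""

    def predict(self) -> str:
        # Trust the chain with the best track record that has data
        # for the current context.
        for k in sorted(self.orders, key=lambda k: self.scores[k],
                        reverse=True):
            ctx = self.history[-k:]
            if len(ctx) == k and sum(self.tables[k][ctx].values()) > 0:
                counts = self.tables[k][ctx]
                return max(MOVES, key=lambda m: counts[m])
        return "R"  # no data yet: arbitrary fallback

    def observe(self, move: str) -> None:
        # Credit each chain that would have predicted `move`,
        # then update its counts, then extend the history.
        for k in self.orders:
            ctx = self.history[-k:]
            if len(ctx) == k:
                counts = self.tables[k][ctx]
                if sum(counts.values()) and \
                        max(MOVES, key=lambda m: counts[m]) == move:
                    self.scores[k] += 1
                counts[move] += 1
        self.history += move
```

Feed it the opponent's moves via `observe`, call `predict` to get the expected next move, and play the counter to that.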
If you're interested in writing RPS players there have been many efforts (http://www.rpscontest.com/), including things linked elsewhere here like https://daniel.lawrence.lu/programming/rps/ .