Sutton and Barto book implementation

https://github.com/ivanbelenky/RL

80•ivanbelenky•9mo ago

Comments

sage76•9mo ago

Damn this is a lot of work. Bookmarked.

ivanbelenky•9mo ago

It has not been stress tested or optimized, so tread lightly. Thanks a lot for appreciating the work.

mark_l_watson•9mo ago

Very nice, thanks for doing this.

I have experimented a lot with the "official" Common Lisp and Python examples for the Sutton/Barto RL book, and I will enjoy your implementations also!

For reference, original examples in Lisp and Python: http://incompleteideas.net/book/code/code2nd.html

A bunch of implementations with all kinds of use cases (e.g., using OpenAI RL Gym, etc.):

Here are some resources with code examples and implementations related to the Sutton and Barto "Reinforcement Learning: An Introduction" book:

Code for Sutton & Barto Book: Reinforcement Learning: An Introduction: The official website for the book provides links to various software and re-implementations in different languages, including Python, Julia, and Lisp. This is a great starting point to find code directly associated with the book's examples and exercises.

Link: http://incompleteideas.net/book/code/code2nd.html

jovsa/rl-examples-sutton-and-barto-book on GitHub: This repository offers Python implementations of examples from the book, organized by chapter. It includes code for figures and examples from various chapters, covering topics like Gridworld, Blackjack, and the Mountain Car task.

Link: https://github.com/jovsa/rl-examples-sutton-and-barto-book

kamenbliznashki/sutton_barto on GitHub: This repository provides Python implementations of RL algorithms for the examples and figures in the Sutton and Barto book. It covers a wide range of topics from multi-armed bandits to policy gradient methods.

Link: https://github.com/kamenbliznashki/sutton_barto

boldyshev/sutton on GitHub: This repository contains Python implementations of example experiments (figures) and programming exercises from the second edition of the book. Chapters are added as the author studies the book, making it a potentially growing resource.

Link: https://github.com/boldyshev/sutton

AntonioSerrano/Implementation-of-RL-algorithms-from-Sutton-and-Barto-2018 on GitHub: This repository offers implementations in Python using OpenAI Gym and Tensorflow, covering exercises and solutions to complement the book and David Silver's RL course. It includes various algorithms like Dynamic Programming, Monte Carlo, Temporal Difference, and Policy Gradient methods.

Link: https://github.com/AntonioSerrano/Implementation-of-RL-algor...

ivanbelenky•9mo ago

My code is most probably not as good as anything above. I've done this exploring while studying. No linter, no typechecker, grug engineer mentality. But thanks nevertheless for the comment :)

mark_l_watson•9mo ago

well, it looks good to me.

mark_l_watson•9mo ago

I want to add a second comment:

Professors White & White (a husband and wife team) have a very good set of courses on RL on Coursera:

https://www.coursera.org/specializations/reinforcement-learn...

ivanbelenky•9mo ago

Lovely!

AndrewKemendo•9mo ago

Let me know if anyone fills out the true online Sarsa section with a working example in a robot

vlad•9mo ago

The authors were professor and grad student at UMass Amherst, and are the current winners of the Turing Award.

https://www.cics.umass.edu/

https://www.nsf.gov/news/ai-pioneers-andrew-barto-richard-su...

ultrasounder•9mo ago

Super helpful while I come up to speed with this field in general. Currently taking XCS234 (RL @ Stanford Online) and this book is referenced for everything.
