Zeno is an open-source toolkit for verifiable, deterministic reward functions for RL on LLMs.
While the initial release focuses on Python code generation, the goal is broader: make RL reward design for LLMs transparent, modular, and extensible across domains (math, retrieval, reasoning, tool use, etc.).
What's in Zeno for now?
- Auditable, stateless reward functions for Python code: docstrings, ruff linting, type hints, recursion, and more
- Works directly with Hugging Face's TRL or any RL loop: plug reward functions in as needed
- MIT licensed and minimal
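To make the "stateless reward function" idea concrete, here is a minimal sketch of the pattern, not Zeno's actual API: a deterministic docstring check built on the standard library's `ast` module, wrapped in the list-in, list-out shape that TRL's `GRPOTrainer` accepts for custom reward functions. The names `docstring_reward` and `reward_fn` are illustrative.

```python
import ast


def docstring_reward(completion: str) -> float:
    """Score 1.0 if every function in the completion has a docstring, else 0.0.

    Deterministic and stateless: the same input always yields the same score,
    so the reward is fully auditable.
    """
    try:
        tree = ast.parse(completion)
    except SyntaxError:
        return 0.0  # unparseable code earns no reward
    funcs = [n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]
    if not funcs:
        return 0.0
    return float(all(ast.get_docstring(f) is not None for f in funcs))


def reward_fn(completions: list[str], **kwargs) -> list[float]:
    # TRL's GRPOTrainer calls custom reward functions with a batch of
    # completions and expects one float per completion back.
    return [docstring_reward(c) for c in completions]
```

Because the function is pure, it can be unit-tested in isolation and composed with other checks (linting, type hints) by summing or weighting their scores.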
Roadmap: Python code is just the starting point. Extensions for math problem solving, planning, and agentic behaviors are planned next.
Repo: https://github.com/think-a-tron/zeno
Docs and more details are in the README.
Comments, critiques, and real-world use cases encouraged, especially if you want to push beyond code.