
Show HN: RULER – Easily apply RL to any agent

https://openpipe.ai/blog/ruler
81•kcorbitt•7mo ago
Hey HN, Kyle here, one of the co-founders of OpenPipe.

Reinforcement learning is one of the best techniques for making agents more reliable, and has been widely adopted by frontier labs. However, adoption in the outside community has been slow because it's so hard to implement.

One of the biggest challenges when adapting RL to a new task is the need for a task-specific "reward function" (a way of measuring success). This is often difficult to define, and requires high-quality labeled data and/or significant domain expertise to generate.

RULER is a drop-in reward function that works across different tasks without any of that complexity.

It works by showing N trajectories to an LLM judge and asking it to rank them relative to each other. This sidesteps the calibration issues that plague most LLM-as-judge approaches. Combined with GRPO (which only cares about relative scores within groups), it just works (surprisingly well!).
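
To illustrate why relative scores are enough, here is a minimal sketch of the group-relative advantage step as it is usually described for GRPO: each score is normalized by its group's mean and standard deviation. The function name, the `eps` term, and the example scores are assumptions for illustration, not code from the ART repo.

```python
from statistics import mean, pstdev


def group_relative_advantages(scores, eps=1e-8):
    """Normalize judge scores within one group of trajectories.

    Hypothetical helper: subtract the group mean and divide by the
    group standard deviation. eps avoids division by zero when every
    trajectory in the group receives the same score.
    """
    mu = mean(scores)
    sigma = pstdev(scores)
    return [(s - mu) / (sigma + eps) for s in scores]


# Two hypothetical judges that agree on the ranking but not on the scale:
# judge_b is just 0.5 + 0.5 * judge_a.
judge_a = [0.2, 0.5, 0.8]
judge_b = [0.6, 0.75, 0.9]

adv_a = group_relative_advantages(judge_a)
adv_b = group_relative_advantages(judge_b)

# Both judges yield (almost exactly) the same advantages, so a judge that is
# poorly calibrated in absolute terms still gives the same training signal.
print(adv_a)
print(adv_b)
```

Because shifting or rescaling all scores in a group leaves the advantages essentially unchanged, the judge only needs to get the relative ordering and spacing within a group right.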

We have a full writeup on the blog, including results on 4 production tasks. On all 4 tasks, small Qwen 2.5 models trained with RULER+GRPO beat the best prompted frontier model, despite being significantly smaller and cheaper to run. Surprisingly, they even beat models trained with hand-crafted reward functions on 3/4 tasks! https://openpipe.ai/blog/ruler

Repo: https://github.com/OpenPipe/ART

Comments

someoneontenet•7mo ago
Love these write ups!
kcorbitt•7mo ago
Thanks! If there are any topics that you'd find particularly interesting, let me know and I can try to find time. :)
sadiq•7mo ago
Excellent, look forward to giving this a go.

I was looking at: https://arxiv.org/abs/2506.18254 but your approach is even more general.

kcorbitt•7mo ago
I really like RLPR for when you have a known-good answer to compare to as well!
spmurrayzzz•7mo ago
Might end up being some confusion with the RULER benchmark from NVIDIA given the (somewhat shared) domain: https://github.com/NVIDIA/RULER

EDIT: by shared I only mean the adjacency to LLMs/AI/ML. RL is a pretty big differentiator though, and the project looks great

kcorbitt•7mo ago
Dang, hadn't seen that. Namespace collision strikes again.
swyx•7mo ago
yeah unfortunately for you this is one of the well known long context benchmarks. too late tho, soldier on.
ndgold•7mo ago
Dope
maxrmk•7mo ago
Very cool. Do you do anything to mitigate ordering bias in the evaluation function, or do you just expect it to average out over time?
kcorbitt•7mo ago
No, we don't do anything. Theoretically we could judge several times with different ordering.

We could measure order bias really easily though; we just need to look at the average score by rollout position across many runs. I'll add that to my list of experiments!
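
A rough sketch of both ideas, assuming the judge is a callable that takes a list of trajectories and returns one score per trajectory in the same order. The names `mean_score_by_position`, `judge_with_shuffles`, and the `judge` callable are hypothetical and not part of RULER or ART.

```python
import random
from collections import defaultdict


def mean_score_by_position(runs):
    """Average judge score at each rollout position across many runs.

    runs: list of score lists, each in the order shown to the judge.
    A flat profile suggests little order bias; a slope suggests the
    judge favors early or late positions.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for scores in runs:
        for pos, score in enumerate(scores):
            totals[pos] += score
            counts[pos] += 1
    return [totals[p] / counts[p] for p in sorted(counts)]


def judge_with_shuffles(trajectories, judge, n_passes=4, seed=0):
    """Judge several times with different orderings and average per trajectory."""
    rng = random.Random(seed)
    sums = [0.0] * len(trajectories)
    for _ in range(n_passes):
        order = list(range(len(trajectories)))
        rng.shuffle(order)
        scores = judge([trajectories[i] for i in order])
        # Map each score back to the trajectory it was given to.
        for pos, idx in enumerate(order):
            sums[idx] += scores[pos]
    return [s / n_passes for s in sums]
```

Usage would look something like `mean_score_by_position(logged_scores)` on scores collected from past runs, and `judge_with_shuffles(group, judge)` in place of a single judge call, at the cost of `n_passes` times as many judge requests.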

swyx•7mo ago
how does o3 on the customer support agent task so dreadfully underperform qwen?