frontpage.

Show HN: Mirror Parliament where users vote on top of politicians and draft laws

https://github.com/fokdelafons/lustra
1•fokdelafons•52s ago•0 comments

Ask HN: Opus 4.6 ignoring instructions, how to use 4.5 in Claude Code instead?

1•Chance-Device•2m ago•0 comments

We Mourn Our Craft

https://nolanlawson.com/2026/02/07/we-mourn-our-craft/
1•ColinWright•5m ago•0 comments

Jim Fan calls pixels the ultimate motor controller

https://robotsandstartups.substack.com/p/humanoids-platform-urdf-kitchen-nvidias
1•robotlaunch•8m ago•0 comments

Exploring a Modern SMPTE 2110 Broadcast Truck with My Dad

https://www.jeffgeerling.com/blog/2026/exploring-a-modern-smpte-2110-broadcast-truck-with-my-dad/
1•HotGarbage•8m ago•0 comments

AI UX Playground: Real-world examples of AI interaction design

https://www.aiuxplayground.com/
1•javiercr•9m ago•0 comments

The Field Guide to Design Futures

https://designfutures.guide/
1•andyjohnson0•10m ago•0 comments

The Other Leverage in Software and AI

https://tomtunguz.com/the-other-leverage-in-software-and-ai/
1•gmays•11m ago•0 comments

AUR malware scanner written in Rust

https://github.com/Sohimaster/traur
3•sohimaster•14m ago•1 comment

Free FFmpeg API [video]

https://www.youtube.com/watch?v=6RAuSVa4MLI
3•harshalone•14m ago•1 comment

Are AI agents ready for the workplace? A new benchmark raises doubts

https://techcrunch.com/2026/01/22/are-ai-agents-ready-for-the-workplace-a-new-benchmark-raises-do...
2•PaulHoule•19m ago•0 comments

Show HN: AI Watermark and Stego Scanner

https://ulrischa.github.io/AIWatermarkDetector/
1•ulrischa•19m ago•0 comments

Clarity vs. complexity: the invisible work of subtraction

https://www.alexscamp.com/p/clarity-vs-complexity-the-invisible
1•dovhyi•20m ago•0 comments

Solid-State Freezer Needs No Refrigerants

https://spectrum.ieee.org/subzero-elastocaloric-cooling
2•Brajeshwar•21m ago•0 comments

Ask HN: Will LLMs/AI Decrease Human Intelligence and Make Expertise a Commodity?

1•mc-0•22m ago•1 comment

From Zero to Hero: A Brief Introduction to Spring Boot

https://jcob-sikorski.github.io/me/writing/from-zero-to-hello-world-spring-boot
1•jcob_sikorski•22m ago•1 comment

NSA detected phone call between foreign intelligence and person close to Trump

https://www.theguardian.com/us-news/2026/feb/07/nsa-foreign-intelligence-trump-whistleblower
8•c420•23m ago•1 comment

How to Fake a Robotics Result

https://itcanthink.substack.com/p/how-to-fake-a-robotics-result
1•ai_critic•23m ago•0 comments

It's time for the world to boycott the US

https://www.aljazeera.com/opinions/2026/2/5/its-time-for-the-world-to-boycott-the-us
3•HotGarbage•23m ago•0 comments

Show HN: Semantic Search for terminal commands in the Browser (No Backend)

https://jslambda.github.io/tldr-vsearch/
1•jslambda•24m ago•1 comment

The AI CEO Experiment

https://yukicapital.com/blog/the-ai-ceo-experiment/
2•romainsimon•25m ago•0 comments

Speed up responses with fast mode

https://code.claude.com/docs/en/fast-mode
4•surprisetalk•29m ago•0 comments

MS-DOS game copy protection and cracks

https://www.dosdays.co.uk/topics/game_cracks.php
4•TheCraiggers•30m ago•0 comments

Updates on GNU/Hurd progress [video]

https://fosdem.org/2026/schedule/event/7FZXHF-updates_on_gnuhurd_progress_rump_drivers_64bit_smp_...
2•birdculture•31m ago•0 comments

Epstein took a photo of his 2015 dinner with Zuckerberg and Musk

https://xcancel.com/search?f=tweets&q=davenewworld_2%2Fstatus%2F2020128223850316274
14•doener•31m ago•2 comments

MyFlames: View MySQL execution plans as interactive FlameGraphs and BarCharts

https://github.com/vgrippa/myflames
1•tanelpoder•32m ago•0 comments

Show HN: LLM of Babel

https://clairefro.github.io/llm-of-babel/
1•marjipan200•32m ago•0 comments

A modern iperf3 alternative with a live TUI, multi-client server, QUIC support

https://github.com/lance0/xfr
3•tanelpoder•33m ago•0 comments

Famfamfam Silk icons – also with CSS spritesheet

https://github.com/legacy-icons/famfamfam-silk
1•thunderbong•34m ago•0 comments

Apple is the only Big Tech company whose capex declined last quarter

https://sherwood.news/tech/apple-is-the-only-big-tech-company-whose-capex-declined-last-quarter/
4•elsewhen•37m ago•0 comments

Recursive Language Models: the paradigm of 2026

https://www.primeintellect.ai/blog/rlm
5•skhameneh•1mo ago

Comments

obiefernandez•1mo ago
The RLM framing basically turns long-context handling into an RL problem over what to remember and where to route it: main model context vs Python vs sub-LLMs. That’s a nice instantiation of The Bitter Lesson, but it also means performance is now tightly coupled to whatever reward signal you happen to define in those environments. Do you have any evidence yet that policies learned on DeepDive / Oolong-style tasks transfer to “messy” real workloads (multi-week code refactors, research over evolving corpora, etc.), or are we still in the “per-benchmark policy” regime?
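
To make that framing concrete, here is the control loop as I understand it from the post, as a runnable toy. Every name is mine, and the hard-coded script stands in for the learned policy, so treat this as a sketch of the shape rather than Prime Intellect's code:

    # Toy RLM control loop (my reading, hypothetical names). The controller
    # never loads the full input; per step it either runs Python over the
    # blob, routes a slice to a cheap sub-model, or answers.
    from dataclasses import dataclass, field

    @dataclass
    class Env:
        blob: str                      # the huge input, kept out of context
        def read(self, start, end):    # explicit, metered reads
            return self.blob[start:end]

    @dataclass
    class Controller:
        context: list = field(default_factory=list)
        def decide(self, step):
            # Stand-in for the learned routing policy: a fixed script here.
            return [("python", None), ("sub_llm", (0, 40)), ("answer", None)][step]

    def sub_llm(chunk):
        # Stand-in for a cheap sub-model call.
        return f"summary({len(chunk)} chars)"

    def run(env, ctl):
        for step in range(10):
            kind, arg = ctl.decide(step)
            if kind == "python":       # compute over the blob without reading it
                ctl.context.append(f"len={len(env.blob)}")
            elif kind == "sub_llm":    # delegate a slice to the cheap model
                ctl.context.append(sub_llm(env.read(*arg)))
            elif kind == "answer":
                return ctl.context

    print(run(Env(blob="x" * 100_000), Controller()))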

The split between main model tokens and sub-LLM tokens is clever for cost and context rot, but it also hides the true economic story. For many users the cost that matters is total tokens across all calls, not just the controller’s context. Some of your plots celebrate higher “main model token efficiency” while total tokens rise substantially. Do you have scenarios where RLM is strictly more cost-efficient at equal or better quality, or is the current regime basically “pay more total tokens to get around context limits”?
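
To be concrete about the accounting, here's a toy with made-up numbers showing how the headline metric and the bill can diverge:

    # Toy cost accounting: "main model token efficiency" can look great while
    # total tokens (and cost) grow, because sub-LLM calls are tokens too.
    # All numbers are invented for illustration.
    calls = [
        {"model": "controller", "tokens": 20_000,  "usd_per_mtok": 15.0},
        {"model": "sub_llm",    "tokens": 300_000, "usd_per_mtok": 0.5},
        {"model": "sub_llm",    "tokens": 280_000, "usd_per_mtok": 0.5},
    ]

    controller_tokens = sum(c["tokens"] for c in calls if c["model"] == "controller")
    total_tokens = sum(c["tokens"] for c in calls)
    total_cost = sum(c["tokens"] * c["usd_per_mtok"] / 1e6 for c in calls)

    print(controller_tokens, total_tokens, round(total_cost, 2))
    # -> 20000 600000 0.59. The controller looks 30x "more efficient", but the
    # honest comparison is this total cost vs a plain long-context call (or a
    # plain agent) at equal quality.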

math-python is the most damning data point: same capabilities, but the RLM harness makes models worse and slower. That feels like a warning that “more flexible scaffold” is not automatically a win; you’re introducing an extra layer of indirection that the model has not been optimized for. The claim that RL training over the RLM will fix this is plausible, but untestable until you actually show a model that beats a strong plain-tool baseline on math with less wall-clock time and fewer tokens.

Oolong and verbatim-copy are more encouraging: the controller treating large inputs as opaque blobs and then using Python + sub-LLMs to scan/aggregate is exactly the kind of pattern humans write by hand in agents today. One thing I’d love to see is a comparison vs a well-engineered non-RL agent baseline that does essentially the same thing but with hand-written heuristics (chunk + batch + regex/SQL/etc.). Right now the RLM looks like a principled way to let the model learn those heuristics, but the post doesn’t really separate “benefit from architecture” vs “benefit from just having more structure/tools than a vanilla single call.”
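
For reference, the hand-written baseline I have in mind is only a dozen lines; the names and the stub model below are mine, not the post's:

    # Hand-rolled baseline: chunk the blob, filter with a cheap regex, map
    # surviving chunks through a sub-model, reduce. No RL anywhere.
    import re

    def chunks(text, size=4000):
        return [text[i:i + size] for i in range(0, len(text), size)]

    def baseline_answer(blob, pattern, ask_llm):
        hits = [c for c in chunks(blob) if re.search(pattern, c)]  # cheap filter
        partials = [ask_llm(c) for c in hits]                      # batched map
        return ask_llm("\n".join(partials))                        # final reduce

    # Stub standing in for a real sub-LLM call, just to make this runnable:
    def stub_llm(text):
        m = re.search(r"invoice total: \d+", text)
        return m.group(0) if m else text[:40]

    blob = ("noise " * 500 + "invoice total: 42 " + "noise " * 500) * 3
    print(baseline_answer(blob, r"invoice total", stub_llm))  # invoice total: 42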

On safety / robustness: giving the model a persistent Python REPL and arbitrary pip installs is powerful, but it also dramatically expands the attack surface if this ever runs on untrusted inputs. Are you treating RLM as strictly a research/eval harness, or do you envision it being exposed in production agent systems? If the latter, sandboxing guarantees and resource controls probably matter as much as reward curves.
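
To be concrete about the floor I'd expect: at minimum, hard resource caps on the REPL child process, which is still far short of real isolation. A POSIX-only sketch:

    # Minimal resource limits for an untrusted REPL child (POSIX). This caps
    # CPU time and memory only; it is NOT a sandbox: no filesystem or network
    # isolation, just the floor before "arbitrary pip" on untrusted input.
    import resource, subprocess, sys

    def set_limits():
        resource.setrlimit(resource.RLIMIT_CPU, (5, 5))                 # 5 s CPU
        resource.setrlimit(resource.RLIMIT_AS, (512 << 20, 512 << 20))  # 512 MB

    proc = subprocess.run(
        [sys.executable, "-c", "print(sum(range(10**6)))"],
        preexec_fn=set_limits, capture_output=True, text=True, timeout=10,
    )
    print(proc.stdout.strip())  # 499999500000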