
RL algorithms are less bitter-lesson-pilled than 2015-era deep learning

1•rajap•3mo ago
The real issue isn't reward shaping or curriculum learning - everyone complains about those. The deeper problem is that we're hardcoding the credit assignment timescale into our algorithms.

Discount factors (γ), n-step returns, GAE λ parameters - these are human priors about temporal abstraction baked directly into the learning signal. PPO's GAE(λ) literally tells the algorithm "here's how far into the future you should care about consequences." We're not learning this, we're imposing it. Different domains need different λ values. That's manual feature engineering, RL-style.
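To make the knob concrete, here is a minimal sketch of GAE as it appears in PPO-style implementations. The trajectory and the blank critic are toy illustrations; the point is that λ directly sets how far back credit for a sparse terminal reward propagates:

```python
# Sketch: how GAE(lambda) hardcodes a credit-assignment horizon.
# The trajectory and zeroed value function below are illustrative toys.
import numpy as np

def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over one episode.

    `values` has len(rewards) + 1 entries (bootstrap value appended).
    """
    deltas = rewards + gamma * values[1:] - values[:-1]  # TD residuals
    advantages = np.zeros_like(rewards, dtype=float)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # exponentially weighted sum of future TD residuals, decayed by gamma*lam
        running = deltas[t] + gamma * lam * running
        advantages[t] = running
    return advantages

rewards = np.array([0.0, 0.0, 0.0, 1.0])  # sparse reward at the last step
values = np.zeros(5)                      # a blank critic, for clarity

print(gae(rewards, values, lam=0.0))  # credit stays local to the final step
print(gae(rewards, values, lam=1.0))  # credit propagates to the earliest action
```

With λ = 0 only the final transition gets an advantage; with λ = 1 the first action is credited almost fully. That entire spectrum is a hand-tuned prior.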

Biological learning doesn't have a global discount factor slider. Dopamine and temporal difference learning in the brain operate at multiple timescales simultaneously - the brain learns which timescales matter for which situations. Our algorithms? They get a single γ parameter tuned by grad students.
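One way to picture the alternative (this is an assumption-laden toy, not a named algorithm): run a bank of TD(0) learners, one per timescale, and let downstream machinery decide which horizon matters where, instead of committing to a single global γ:

```python
# Sketch (illustrative, not a published method): a bank of tabular TD(0)
# value learners, one per discount factor, instead of one global gamma.
import numpy as np

class MultiTimescaleTD:
    def __init__(self, n_states, gammas=(0.5, 0.9, 0.99), alpha=0.1):
        self.gammas = gammas
        self.alpha = alpha
        self.V = np.zeros((len(gammas), n_states))  # one value table per timescale

    def update(self, s, r, s_next):
        """One TD(0) update applied at every timescale simultaneously."""
        for i, g in enumerate(self.gammas):
            td_error = r + g * self.V[i, s_next] - self.V[i, s]
            self.V[i, s] += self.alpha * td_error

    def values(self, s):
        # A learned mixer over these heads is where "the brain picks the
        # timescale" would go; here we just expose the whole vector.
        return self.V[:, s]
```

On a chain where reward arrives two steps away, the long-horizon head values the start state near 0.99 while the short-horizon head values it near 0.5: the same experience, read at different timescales.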

Even worse: exploration strategies are domain-specific hacks. ε-greedy for Atari, continuous noise processes for robotics, count-based bonuses for sparse rewards. We're essentially doing "exploration engineering" for each domain, like it's 2012 computer vision all over again.
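The two most common hacks, side by side, make the point: each injects a hand-tuned knob (ε, σ) rather than anything learned. Both snippets are the textbook versions, not any particular library's API:

```python
# Sketch: two standard exploration hacks, each with its own hand-tuned knob.
import numpy as np

rng = np.random.default_rng(0)

def epsilon_greedy(q_values, epsilon=0.1):
    """Discrete-action exploration (the Atari-era default)."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))  # uniform random action
    return int(np.argmax(q_values))              # greedy action

def gaussian_noise_action(policy_mean, sigma=0.2, low=-1.0, high=1.0):
    """Continuous-action exploration (the robotics default)."""
    noise = rng.normal(0.0, sigma, size=policy_mean.shape)
    return np.clip(policy_mean + noise, low, high)
```

Neither ε nor σ is learned from the environment; both are schedules or constants chosen per domain, which is exactly the "exploration engineering" complaint.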

Compare this to supervised learning after 2015: we stopped engineering features and just scaled end-to-end architectures, from convnets to transformers. The architecture learned what mattered. RL in 2025? Still tweaking γ, λ, exploration coefficients, entropy bonuses for every new task.

True bitter-lesson compliance would mean learning your own temporal abstractions (dynamic γ), learning how to explore (meta-RL over exploration strategies), and learning credit assignment windows (adaptive eligibility traces). Some promising directions exist - options frameworks, meta-RL, world models with learned abstraction - but they're not mainstream because they're compute-hungry and unstable. We keep returning to human priors because they're cheaper. That's the opposite of the bitter lesson.
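A toy version of "learning your own γ", loosely inspired by meta-gradient RL: parameterize the discount as sigmoid(η) so it stays in (0, 1), and nudge η down the gradient of an outer TD loss. Everything here (the loss, the finite-difference gradient, the step sizes) is illustrative scaffolding, not a real training setup:

```python
# Toy sketch of a learned discount: gamma = sigmoid(eta), with eta updated
# by a finite-difference meta-gradient on a TD loss. Illustrative only.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def td_loss(gamma, transitions, V):
    """Mean squared TD error over a batch of (s, r, s') transitions."""
    errs = [r + gamma * V[s2] - V[s1] for s1, r, s2 in transitions]
    return float(np.mean(np.square(errs)))

def meta_step(eta, transitions, V, meta_lr=1.0, eps=1e-4):
    """One meta-update of eta; returns the new eta and gamma = sigmoid(eta)."""
    grad = (td_loss(sigmoid(eta + eps), transitions, V)
            - td_loss(sigmoid(eta - eps), transitions, V)) / (2 * eps)
    eta = eta - meta_lr * grad
    return eta, sigmoid(eta)
```

In a real system the outer loss would be return on held-out experience and the gradient would flow through the inner update, but the shape of the idea is the same: γ becomes a learned quantity instead of a grad-student-tuned constant.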

The irony is stark: RL researchers talk about "end-to-end learning" while manually tuning the most fundamental learning signal parameters. Imagine if vision researchers were still manually setting feature detector orientations in 2025. That's where RL is.

I predict: The next major RL breakthrough won't come from better policy gradient estimators. It'll come from algorithms that discover their own temporal abstractions and exploration strategies through meta-learning at scale. Only then will RL be bitter-lesson-pilled.