
Learning to Reason in 13 Parameters

https://arxiv.org/abs/2602.04118
1•nicholascarolan•42s ago•0 comments

Convergent Discovery of Critical Phenomena Mathematics Across Disciplines

https://arxiv.org/abs/2601.22389
1•energyscholar•56s ago•1 comments

Ask HN: Will GPU and RAM prices ever go down?

1•alentred•1m ago•0 comments

From hunger to luxury: The story behind the most expensive rice (2025)

https://www.cnn.com/travel/japan-expensive-rice-kinmemai-premium-intl-hnk-dst
1•mooreds•2m ago•0 comments

Substack makes money from hosting Nazi newsletters

https://www.theguardian.com/media/2026/feb/07/revealed-how-substack-makes-money-from-hosting-nazi...
3•mindracer•3m ago•0 comments

A New Crypto Winter Is Here and Even the Biggest Bulls Aren't Certain Why

https://www.wsj.com/finance/currencies/a-new-crypto-winter-is-here-and-even-the-biggest-bulls-are...
1•thm•3m ago•0 comments

Moltbook was peak AI theater

https://www.technologyreview.com/2026/02/06/1132448/moltbook-was-peak-ai-theater/
1•Brajeshwar•4m ago•0 comments

Why Claude Cowork is a math problem Indian IT can't solve

https://restofworld.org/2026/indian-it-ai-stock-crash-claude-cowork/
1•Brajeshwar•4m ago•0 comments

Show HN: Built a space travel calculator with vanilla JavaScript v2

https://www.cosmicodometer.space/
1•captainnemo729•4m ago•0 comments

Why a 175-Year-Old Glassmaker Is Suddenly an AI Superstar

https://www.wsj.com/tech/corning-fiber-optics-ai-e045ba3b
1•Brajeshwar•4m ago•0 comments

Micro-Front Ends in 2026: Architecture Win or Enterprise Tax?

https://iocombats.com/blogs/micro-frontends-in-2026
1•ghazikhan205•6m ago•0 comments

These White-Collar Workers Actually Made the Switch to a Trade

https://www.wsj.com/lifestyle/careers/white-collar-mid-career-trades-caca4b5f
1•impish9208•7m ago•1 comments

The Wonder Drug That's Plaguing Sports

https://www.nytimes.com/2026/02/02/us/ostarine-olympics-doping.html
1•mooreds•7m ago•0 comments

Show HN: Which chef knife steels are good? Data from 540 Reddit threads

https://new.knife.day/blog/reddit-steel-sentiment-analysis
1•p-s-v•7m ago•0 comments

Federated Credential Management (FedCM)

https://ciamweekly.substack.com/p/federated-credential-management-fedcm
1•mooreds•7m ago•0 comments

Token-to-Credit Conversion: Avoiding Floating-Point Errors in AI Billing Systems

https://app.writtte.com/read/kZ8Kj6R
1•lasgawe•8m ago•1 comments

The Story of Heroku (2022)

https://leerob.com/heroku
1•tosh•8m ago•0 comments

Obey the Testing Goat

https://www.obeythetestinggoat.com/
1•mkl95•9m ago•0 comments

Claude Opus 4.6 extends the LLM Pareto frontier

https://michaelshi.me/pareto/
1•mikeshi42•9m ago•0 comments

Brute Force Colors (2022)

https://arnaud-carre.github.io/2022-12-30-amiga-ham/
1•erickhill•12m ago•0 comments

Google Translate apparently vulnerable to prompt injection

https://www.lesswrong.com/posts/tAh2keDNEEHMXvLvz/prompt-injection-in-google-translate-reveals-ba...
1•julkali•12m ago•0 comments

(Bsky thread) "This turns the maintainer into an unwitting vibe coder"

https://bsky.app/profile/fullmoon.id/post/3meadfaulhk2s
1•todsacerdoti•13m ago•0 comments

Software development is undergoing a Renaissance in front of our eyes

https://twitter.com/gdb/status/2019566641491963946
1•tosh•14m ago•0 comments

Can you beat ensloppification? I made a quiz for Wikipedia's Signs of AI Writing

https://tryward.app/aiquiz
1•bennydog224•15m ago•1 comments

Spec-Driven Design with Kiro: Lessons from Seddle

https://medium.com/@dustin_44710/spec-driven-design-with-kiro-lessons-from-seddle-9320ef18a61f
1•nslog•15m ago•0 comments

Agents need good developer experience too

https://modal.com/blog/agents-devex
1•birdculture•16m ago•0 comments

The Dark Factory

https://twitter.com/i/status/2020161285376082326
1•Ozzie_osman•16m ago•0 comments

Free data transfer out to internet when moving out of AWS (2024)

https://aws.amazon.com/blogs/aws/free-data-transfer-out-to-internet-when-moving-out-of-aws/
1•tosh•17m ago•0 comments

Interop 2025: A Year of Convergence

https://webkit.org/blog/17808/interop-2025-review/
1•alwillis•19m ago•0 comments

Prejudice Against Leprosy

https://text.npr.org/g-s1-108321
1•hi41•19m ago•0 comments

Show HN: PILF, the ultimate solution to catastrophic forgetting in AI models

https://github.com/dmf-archive/PILF
31•NetRunnerSu•7mo ago

Comments

Ifkaluva•7mo ago
It’s an interesting idea; I have two questions.

- Surprise is detected by the norm of the gradients. So, doesn’t this suggest that the model already has a way of adjusting to surprise?

- Is there a danger of model instability when the gradients become larger and the learning rate is also increased?

NetRunnerSu•7mo ago
1. An overly strong surprise is like PTSD in humans: it permanently rewrites the model's previously learned experience. That is what we want to avoid.

2. It's bound to happen, and our PILR-S is designed to keep the learning rate within the bell curve, decreasing as the surprise decreases (less new information, less learning).
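The mechanism described in this exchange, surprise measured by gradient norm and the learning rate kept within a "bell curve" of surprise, can be sketched roughly as follows. The function name and the `mu`/`sigma` knobs are illustrative assumptions, not names from the PILF codebase:

```python
import math

def gated_lr(base_lr, surprise, mu, sigma):
    """Scale the learning rate by a Gaussian ("bell curve") of surprise.

    Moderate surprise (near mu) receives close to the full base_lr, while
    extreme surprise -- the "PTSD" case -- is damped toward zero instead
    of being amplified by the large gradients that produced it.
    """
    return base_lr * math.exp(-((surprise - mu) ** 2) / (2 * sigma ** 2))

# The surprise signal could be the total gradient norm of a step, e.g.:
#   surprise = sum(p.grad.norm() for p in model.parameters())
print(gated_lr(1e-3, surprise=1.0, mu=1.0, sigma=0.5))  # near the full rate
print(gated_lr(1e-3, surprise=5.0, mu=1.0, sigma=0.5))  # heavily damped
```

This also answers the stability question: because the gate decays super-exponentially in the surprise, larger gradients yield a smaller effective step rather than a runaway update.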

derefr•7mo ago
But doesn’t this lead to the opposite problem: creating a model that can never learn to let go of an early-life mental model picked up from a skewed dataset?

By analogy to humans: if this model were raised in a cult, and then let out into the real world, it would be seemingly incapable of unlearning the cult’s indoctrination, despite the real-world data all contradicting it — as all of this real-world data would be too surprising for the model to accept.

Or, for a maybe-more-likely situation you might encounter in e.g. incremental model re-training of old models for chronologically-newer info: a model trained this way would “stubbornly” refuse to accept any major shift in scientific consensus on a topic.

The human cognitive architecture seems to solve this problem by 1. buffering this rejected-for-being-too-out-there info in a way where it can at least be pattern-recognized; and then 2. noticing when a lot of different, seemingly independent, seemingly trustworthy sources begin matching on the rejected pattern. At that point, the human brain seems to swing the other way — experiencing a “crisis of faith” per se.
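The two-step "crisis of faith" mechanism proposed here, buffering rejected information and capitulating once enough independent sources agree, could be sketched like this. Everything below is a hypothetical illustration of the comment's idea, not part of PILF:

```python
class CrisisOfFaithBuffer:
    """Buffer samples rejected as too surprising instead of discarding
    them, keyed by the pattern they support. Once `quorum` distinct
    sources have independently matched the same rejected pattern, signal
    that the model should re-open itself to that update."""

    def __init__(self, quorum=3):
        self.quorum = quorum
        self.sources_by_pattern = {}

    def reject(self, pattern, source):
        """Record a rejected (pattern, source) pair; return True when the
        pattern has reached quorum and a "crisis of faith" should fire."""
        sources = self.sources_by_pattern.setdefault(pattern, set())
        sources.add(source)
        return len(sources) >= self.quorum

buf = CrisisOfFaithBuffer(quorum=3)
buf.reject("new scientific consensus", "source_a")         # below quorum
buf.reject("new scientific consensus", "source_b")         # below quorum
print(buf.reject("new scientific consensus", "source_c"))  # quorum reached
```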

NetRunnerSu•7mo ago
That's a brilliant and crucial point. You've pinpointed the central dialectic of this architecture: the trade-off between stability (resisting catastrophic forgetting) and plasticity (updating core beliefs).

You are absolutely right that a poorly configured model could become "dogmatic," incapable of escaping an early "cult" indoctrination. This cognitive rigidity, however, is not a hardcoded flaw but a tunable personality trait.

This is where the remaining hyperparameters come into play. We still define:

1. The initial `learning_rate`, setting its baseline openness.

2. The `sigma_threshold` for the surprise EMA, which defines its "trust window." (This can be adjusted at any time; it does not affect any past training progress. For generative models such as LLMs, you can even try letting them set it themselves.)

A narrow sigma creates a conservative, "skeptical" model, while a wider sigma creates a more "open-minded" one that is more willing to entertain paradigm shifts. So, the paradigm shift is this: we are no longer micromanaging how the model learns moment-to-moment. Instead, we are defining its cognitive temperament or learning style. Your "crisis of faith" mechanism is the logical next step—a meta-learning process we are actively exploring. Thank you for the incredibly sharp insight.
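The "trust window" over a surprise EMA described above might look something like the sketch below. The class name, the accept/reject interface, and the choice to keep updating the EMA on rejected samples are all assumptions for illustration, not the PILF implementation:

```python
class TrustWindow:
    """Track a running mean/variance of surprise via EMA and gate updates.

    A sample whose surprise lies more than `sigma_threshold` standard
    deviations above the running mean falls outside the trust window and
    is rejected. A narrow threshold yields a "skeptical" model; a wide
    one, a more "open-minded" one.
    """

    def __init__(self, sigma_threshold=2.0, decay=0.99):
        self.sigma_threshold = sigma_threshold
        self.decay = decay
        self.mean = 0.0
        self.var = 1.0

    def accepts(self, surprise):
        z = (surprise - self.mean) / (self.var ** 0.5 + 1e-8)
        ok = z <= self.sigma_threshold
        # Design choice: update the EMA even for rejected samples, so the
        # window can slowly drift toward a persistently surprising data
        # distribution instead of staying dogmatic forever.
        self.mean = self.decay * self.mean + (1 - self.decay) * surprise
        self.var = self.decay * self.var + (1 - self.decay) * (surprise - self.mean) ** 2
        return ok

w = TrustWindow(sigma_threshold=2.0)
print(w.accepts(0.1))   # in-distribution surprise: accepted
print(w.accepts(50.0))  # extreme outlier: rejected
```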

alienbaby•7mo ago
Doesn't this lead you to now trying to dynamically adjust sigma to respond successfully?

NetRunnerSu•7mo ago
You've hit on the core. We don't manually tweak sigma directly in operation; instead, sigma_threshold is a high-level cognitive trait. The beauty lies in the system's inherent drive for realignment: even from random initializations, PILF converges by minimizing surprise. With G²MoE in the future, the model gains the theoretical capacity to self-adjust its own hyperparameters, akin to a more fundamental Gödel Agent.[^1]

Ultimately, wallet balance is the true ultimate hyperparameter.

[^1] https://arxiv.org/abs/2410.04444

upghost•7mo ago
This looks absolutely fantastic; please accept my meagre professional jealousy. I have long bemoaned manual hyperparam fiddling. I have on occasion dabbled with nonparametric ("genetic") methods of hyperparam tuning inspired by AutoML... but then you still have to manually tune the evolutionary hyperparams.

Finding a way to derive this from the gradients is amazing.

NetRunnerSu•7mo ago
This is definitely not just another machine learning method. It comes from a complete cognitive science theory, rooted in a complete understanding of intelligence and consciousness.

https://github.com/dmf-archive/IPWT

:)

hackingonempty•7mo ago
Parameters I'd Like to Fiddle

vermilingua•7mo ago
Caution: this appears to be part of a very involved sci-fi LARP (as I understand it), so I’d take whatever claims it makes with a grain of salt.

NetRunnerSu•7mo ago
You can git clone it and run it yourself: science fiction with enough precision is futurology.

alienbaby•7mo ago
Ooohhhh.