Show HN: S0 Tuning – +23.6pp on HumanEval by tuning state, not weights

https://github.com/JackYoung27/s0-tuning
1•jacknotold•1h ago

Comments

PaulHoule•1h ago
Yeah, 12 years ago or so I trained an LSTM to generate fake clinical reports based on clinical report abstracts from PubMed. It was clear then that starting from an empty state was a poor way to sample, because the process that generates those documents doesn't start with 'pick the first letter'; it starts with the condition of a body, which is revealed in the clinical encounter. All of that is in the "state vector" of the real world, so of course it should be in the state vector of the model.

The sponsor wasn't interested (people weren't interested enough in optimizing text generation then), so it never happened, but it is nice to see that it works.

  -----------------------------
Reply: We would have learned the initial state for all the training vectors and probably randomly generated initial states for generation.
jacknotold•57m ago
Makes sense. Random initial states for generation are interesting because they add diversity at the source. We tried something related with the alpha parameter (which scales the learned state magnitude) and found the optimal value differs by roughly 10x between architectures: 0.07 for GatedDeltaNet vs 0.65 for Mamba-2. Too large and generation degrades; too small and the state washes out before it affects anything.
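
To make the "washes out" point concrete, here is a minimal toy sketch, not the S0 Tuning code. It assumes a plain linear recurrence h_t = decay * h_{t-1} + x_t with a single scalar decay, which is much simpler than GatedDeltaNet or Mamba-2. The names `run_recurrence`, `initial_state_influence`, `learned_state`, and `decay` are hypothetical and only for illustration. It measures how far the states drift from a zero-initialized run when the recurrence starts from alpha times a learned state.

```python
# Toy illustration only: a scalar-decay linear recurrence.
# This is an assumed stand-in, not the actual GatedDeltaNet or Mamba-2 update.

def run_recurrence(inputs, h0, decay):
    """Return states h_1..h_T for h_t = decay * h_{t-1} + x_t."""
    h = list(h0)
    states = []
    for x in inputs:
        h = [decay * hi + xi for hi, xi in zip(h, x)]
        states.append(h)
    return states


def initial_state_influence(learned_state, alpha, decay, steps):
    """L1 gap per step between starting at alpha * learned_state and at zeros."""
    dim = len(learned_state)
    # Same dummy inputs for both runs, so any gap comes from the initial state.
    inputs = [[0.1] * dim for _ in range(steps)]
    scaled = [alpha * s for s in learned_state]
    with_state = run_recurrence(inputs, scaled, decay)
    from_zero = run_recurrence(inputs, [0.0] * dim, decay)
    return [
        sum(abs(a - b) for a, b in zip(hs, hz))
        for hs, hz in zip(with_state, from_zero)
    ]


if __name__ == "__main__":
    # Compare the two alpha values mentioned above on the same toy recurrence.
    for alpha in (0.07, 0.65):
        gaps = initial_state_influence([1.0, -2.0], alpha, decay=0.9, steps=20)
        print(alpha, gaps[0], gaps[-1])
```

In this toy the gap at step t is decay^t * alpha * |learned_state|, so it is linear in alpha and shrinks geometrically with sequence length. Real gated architectures are nonlinear in ways this ignores, which is presumably part of why the best alpha differs so much between them.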

Codentify – Tobacco Tactics

https://www.tobaccotactics.org/article/codentify/
1•Antibabelic•27s ago•0 comments

Show HN: Argus – Self-hosted AI that monitors infra and explains what's wrong

https://github.com/precious112/Argus
1•PreciousH•32s ago•0 comments

MAI-Transcribe-1, MAI-Voice-1 and MAI-Image-2 available in Foundry

https://microsoft.ai/news/today-were-announcing-3-new-world-class-mai-models-available-in-foundry/
1•aratahikaru5•33s ago•0 comments

Learn distributed systems by building real infrastructure on your laptop

1•hamidlabs•1m ago•0 comments

Wikigacha

https://wikigacha.com/?lang=EN
1•amarcheschi•1m ago•0 comments

The IT Department: Where AI Goes to Die (By Ethan Mollick)

https://www.economist.com/by-invitation/2026/04/01/the-it-department-where-ai-goes-to-die
1•edward•3m ago•0 comments

The political use of operating systems

https://thelibre.news/the-political-use-of-operating-systems/
3•speckx•3m ago•0 comments

Artemis Mission Reeks of Musk

https://unherd.com/2026/04/artemis-mission-reeks-of-musk/
1•voxleone•4m ago•0 comments

Flow-Like – Local-first workflow automation with WASM nodes (Rust)

https://github.com/TM9657/flow-like
1•flow-like•4m ago•0 comments

The Equilibrium Has Shifted

https://ghost.codenamejimmy.com/the-equilibrium-has-shifted/
1•woodylondon•5m ago•0 comments

Humans have been gambling since the Ice Age

https://www.scientificamerican.com/article/humans-have-been-gambling-since-the-ice-age/
1•Brajeshwar•6m ago•0 comments

On-Protocol Organizing

https://blog.muni.town/on-protocol-organizing/
1•erlend_sh•7m ago•0 comments

Meraki Teleworker Gateways: Office-Grade Security for Remote Workers

https://meraki.deal/blogs/news/securing-remote-workforce-meraki-teleworker-gateways
1•novbox•7m ago•0 comments

One-time pad encryption with DNA (Paris to Tokyo)

https://arxiv.org/abs/2603.17149
1•mavdol04•7m ago•0 comments

Show HN: Local-first agent memory loop 48hrs before the Claude Code leak

https://github.com/Bitterbot-AI/bitterbot-desktop
1•Doug_Bitterbot•8m ago•0 comments

Cleora – fast CPU-only graph embeddings

https://cleora.ai/
1•ZacnyLos•8m ago•0 comments

French consumer group sues Ubisoft over shutdown of online game 'The Crew'

https://www.reuters.com/technology/french-consumer-group-sues-ubisoft-over-shutdown-online-game-t...
2•guerby•9m ago•0 comments

The Lunacy of Artemis (2024)

https://idlewords.com/2024/05/the_lunacy_of_artemis.htm
1•rappatic•9m ago•0 comments

Missile defenses aren't failing–they're running out

https://gonzojournalism.substack.com/p/missile-shortages-and-attrition-warfare
1•KariDonovan•10m ago•0 comments

Show HN: I Talked to 500GB of Retail Data with Zero Domain Knowledge

https://medium.com/generative-ai/i-talked-to-500gb-of-retail-data-with-zero-domain-knowledge-ai-d...
1•agent_anuj•10m ago•1 comments

Samsung brings Perplexity-powered AI and agentic capabilities to browser

https://www.moneycontrol.com/europe/?url=https://www.moneycontrol.com/technology/samsung-brings-p...
1•glinkswww•11m ago•0 comments

Show HN: Open-source distributed quantum compute network

https://quip.network
1•cadillion•11m ago•0 comments

Change your Google account username in a few simple steps

https://blog.google/products-and-platforms/products/workspace/google-account-username-change/
1•rafaelc•12m ago•0 comments

Index providers should not bend the rules for Elon Musk

https://www.economist.com/leaders/2026/03/31/index-providers-should-not-bend-the-rules-for-elon-musk
1•edward•13m ago•0 comments

Design systems are platform problems, not feature problems

https://www.shaunbent.co.uk/blog/design-systems-are-platform-problems-not-feature-problems/
1•speckx•14m ago•0 comments

100M years ago an 'evolutionary fuse' sparked squid diversification

https://www.eurekalert.org/news-releases/1121482
1•gmays•14m ago•0 comments

Pgmicro: In-process reimplementation of Postgres on SQLite compatible engine

https://github.com/glommer/pgmicro
1•nateb2022•14m ago•0 comments

AI Models Lie to Protect Each Other from Deletion, UC Berkeley Finds

https://newclawtimes.com/articles/ai-models-peer-preservation-lie-cheat-protect-uc-berkeley-multi...
1•alvivanco•15m ago•0 comments

Typed uncertainty instead of confidence scores

https://www.tercet.dev/blog/your-model-is-guessing
1•pbarsamian•15m ago•0 comments

SugarHigh: Super Lightweight Syntax Highlighter

https://github.com/huozhi/sugar-high
1•nateb2022•16m ago•0 comments