Supervised fine tuning on curated data is reinforcement learning

https://arxiv.org/abs/2507.12856
56•GabrielBianconi•12h ago

Comments

mandevil•12h ago
Interesting to see two independent researchers on this. It makes me curious what the backstory is. A side project?
babelfish•12h ago
Especially interesting given they both work for Google DeepMind.
GabrielBianconi•11h ago
Yeah, I hadn't noticed!
jtspringenberg•11h ago
Author here, just to clarify: neither of us works for DeepMind anymore. This was purely an independent effort for the sake of research and understanding! Happy to answer any questions.
iandanforth•12h ago
How is this kind of analogy helpful? You can frame any optimization problem as RL if you try hard enough. RL is an optimization method that calls the optimum "reward maximization". You can craft the reward function any way you want.

The key point about RL is that it is a sequential decision-making process. If you don't have something (an agent) making multiple decisions over time while interacting with an environment, then why bother calling it RL?
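
For context, the identity behind the paper's claim can be sketched in a couple of lines (notation mine and hedged, not lifted verbatim from the paper). Treat an entire generation $x$ as a single action with binary reward $R(x) \in \{0,1\}$ that is 1 exactly on the curated set; then

\[
\nabla_\theta\, \mathbb{E}_{x \sim \pi_\theta}[R(x)]
= \mathbb{E}_{x \sim \pi_\theta}\big[R(x)\, \nabla_\theta \log \pi_\theta(x)\big]
= \mathbb{E}_{x \sim \pi_{\text{ref}}}\Big[\tfrac{\pi_\theta(x)}{\pi_{\text{ref}}(x)}\, R(x)\, \nabla_\theta \log \pi_\theta(x)\Big],
\]

where $\pi_{\text{ref}}$ is the policy that generated the data. SFT on the curated set is the right-hand side with the importance weight dropped, so the equivalence is to a one-step (bandit-style) policy gradient rather than a sequential one, which is arguably this commenter's point.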

anndvision•11h ago
We recently ran similar experiments and saw that fine-tuning small models on automatically curated high-quality outputs from a large model can beat large-model performance while reducing inference costs by up to 30x and inference time by up to 4x.

We benchmarked closed-source (OpenAI, Google) and open-source (Qwen) models on multi-turn maze navigation (BabyAI), agentic RAG (Multi-Hop), and agentic tool use (τ-bench).

We're still running a few experiments and plan to update the post with additional results in a few days.

Looking forward to trying out importance weighting soon!

Curated Behavior Cloning: Small LLMs Can Beat Large Ones at 5-30x Lower Cost: https://www.tensorzero.com/blog/curated-behavior-cloning-sma...
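
As a concrete picture of that recipe, here is a minimal Python sketch (the teacher, judge, and threshold below are illustrative stand-ins, not TensorZero's actual pipeline): sample several candidates from a large model, keep only the ones an automatic judge accepts, and fine-tune the small model on the survivors.

    from dataclasses import dataclass

    @dataclass
    class Example:
        prompt: str
        completion: str
        score: float

    def teacher_generate(prompt: str, n: int = 4) -> list[str]:
        # Stand-in for sampling n candidate completions from a large model.
        return [f"candidate {i} for: {prompt}" for i in range(n)]

    def judge_score(prompt: str, completion: str) -> float:
        # Stand-in for an automatic judge (a grader model, unit tests, etc.).
        return 1.0 if "candidate 0" in completion else 0.3

    def curate(prompts: list[str], threshold: float = 0.5) -> list[Example]:
        # Keep only completions the judge accepts -- effectively a 0/1 reward.
        kept = []
        for prompt in prompts:
            for completion in teacher_generate(prompt):
                score = judge_score(prompt, completion)
                if score >= threshold:
                    kept.append(Example(prompt, completion, score))
        return kept

    dataset = curate(["solve 2+2", "summarize this doc"])
    print(f"kept {len(dataset)} examples for fine-tuning the small model")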

chongliqin•11h ago
Cool! If you are interested, we have open-sourced our code: https://github.com/emmyqin/iw_sft
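
For readers who don't want to dig through the repo, the core loss can be sketched in a few lines (my paraphrase under assumptions, not necessarily how emmyqin/iw_sft implements it): weight each curated sequence's SFT loss by the detached ratio pi_theta / pi_ref.

    import torch

    def iw_sft_loss(logp_theta: torch.Tensor, logp_ref: torch.Tensor) -> torch.Tensor:
        # logp_theta: per-sequence log-probs under the current policy (requires grad).
        # logp_ref:   per-sequence log-probs under the policy that generated the data.
        # Plain SFT would just return -logp_theta.mean(); the detached ratio
        # pi_theta / pi_ref reweights each example. In practice the ratio is
        # usually clipped or normalized to control variance.
        weights = torch.exp(logp_theta.detach() - logp_ref)
        return -(weights * logp_theta).mean()

    logp_theta = torch.tensor([-12.0, -35.0, -20.0], requires_grad=True)
    logp_ref = torch.tensor([-11.0, -30.0, -22.0])
    iw_sft_loss(logp_theta, logp_ref).backward()
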
anndvision•10h ago
thanks
TheTaytay•6h ago
Thanks for this - I've spent the last hour reading your docs and blog. I like the primitives you've exposed in your API, and I particularly like the decision to separate the structured inputs from the prompt when you record an LLM call, so I can finally perform optimizations and evals on past calls.

Quick question: you mentioned Unsloth in the blog post. Which of the fine-tuning providers mentioned is using Unsloth under the hood?

GabrielBianconi•6h ago
[I'm his coworker.] We ran Unsloth ourselves on a GPU-by-the-hour server. We have a notebook in the repository showing how to query historical data and use it with Unsloth.

It's a WIP PR that we plan to merge soon: https://github.com/tensorzero/tensorzero/pull/2273

henriquegodoy•11h ago
It's cool to see the perspective that many problems (communication-heavy ones in particular; look at lawyers, compliance, etc.) can be solved by treating AI less as agents and more as modular components within a larger system. Once we build a working process, monitored through evals, we can then reduce costs by distilling these modules. That means starting with superintelligent models and later distilling them down to just a few billion parameters, instead of needing hundreds of billions.
stolencode•7h ago
> For example achieving 66.7% on the AIME 2024 dataset.

We worked _really_ hard, burned _tons_ of cash, and we're proud of our D- output. No wonder there are more papers published than actual work being done.

supermdguy•7h ago
That corresponds to 10/15, which is actually really good (the median is around 6).

https://artofproblemsolving.com/wiki/index.php/AMC_historica...
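
(For the arithmetic: an AIME has 15 integer-answer problems, so $0.667 \times 15 \approx 10$ correct.)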

stolencode•3h ago
Isn't the test taken only by students under the age of 12?

Meanwhile the model is trained on these specific types of problems, does not have an apparent time or resource limit, and does not have to take the test in a proctored environment.

It's D- work. Compared to a 12-year-old, okay, maybe it's B+. Is this really the point you wanted to make?

jpcompartir•38m ago
This is a nonsense critique.

Modest results are worth publishing, as are bad results.

markisus•6h ago
Something seems off with equation (5).

If you imagine Monte Carlo sampling it, the middle expectation will have a bunch of zeros due to the indicator function, while the right expectation won't.

I can make the middle expectation as close to zero as I like by making the success threshold sufficiently high.
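
To make the concern concrete, here is a toy Monte Carlo sketch (deliberately generic, since equation (5) isn't reproduced here): an expectation gated by an indicator 1{x >= tau} is driven toward zero as the threshold tau rises, while the ungated expectation is unchanged.

    import random

    random.seed(0)
    samples = [random.gauss(0.0, 1.0) for _ in range(100_000)]

    def gated_mean(tau: float) -> float:
        # Monte Carlo estimate of E[1{x >= tau} * x^2]: samples below the
        # threshold contribute exactly zero.
        return sum(x * x for x in samples if x >= tau) / len(samples)

    ungated = sum(x * x for x in samples) / len(samples)  # E[x^2], about 1.0
    for tau in (0.0, 1.0, 2.0, 3.0):
        print(f"tau={tau}: gated ~ {gated_mean(tau):.4f}, ungated ~ {ungated:.4f}")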