Frontier Model Training Methodologies

https://djdumpling.github.io/2026/01/31/frontier_training.html
1•xdotli•50m ago

Comments

xdotli•50m ago
How do labs train a frontier, multi-billion-parameter model? We look at seven open-weight frontier models: Hugging Face’s SmolLM3, Prime Intellect’s Intellect-3, Nous Research’s Hermes 4, OpenAI’s gpt-oss-120b, Moonshot’s Kimi K2, DeepSeek’s DeepSeek-R1, and Arcee’s Trinity series. This blog attempts to distill the techniques, motivations, and considerations behind training these models, with an emphasis on training methodology over infrastructure.

These notes largely follow the structure of Hugging Face’s SmolLM3 report because it is so extensive, and they are currently supplemented with notes from other reports, including Intellect-3, gpt-oss-120b, Hermes 4, DeepSeek, and Kimi. While this blog explores some infrastructure-related ideas, such as in-flight weight updates and multi-client orchestrators, those reports mention many others, such as expert parallelism and quantization. Hugging Face writes more about gpt-oss-120b’s infrastructure here.

Recreating Daejeon, South Korea, in Blender and Unreal Engine [video]

https://www.youtube.com/watch?v=dOsO3tO9PF8
1•nogajun•1m ago•0 comments

Ferrari's New Jony Ive–Designed EV Is Swathed in Glass and Aluminum

https://www.wired.com/story/ferrari-ev-jony-ive-design/
1•divbzero•4m ago•0 comments

Cloud meeting recorders record everyone in the room. Not just you

https://thoth-app.com/blog/2026-05-13-why-your-meeting-recorder-shouldnt-upload-your-audio/
1•MattVePhD•7m ago•0 comments

Show HN: Presentforme.ai – Make slide decks explain themselves

1•cheecheongfan•7m ago•0 comments

What's so special about Emacs? [video]

https://www.youtube.com/watch?v=mJZDmO5yOxE
1•internet_points•9m ago•0 comments

The power struggle in the narrow seas, a visual story

https://ig.ft.com/maritime-chokepoints/
1•helsinkiandrew•11m ago•0 comments

A look inside ITER, the world's largest fusion energy project

https://www.cnet.com/science/climate/inside-the-worlds-biggest-bet-on-fusion-energy/
2•giuliomagnifico•11m ago•0 comments

Google Health Sucks

https://joebaldwin.me.uk/blog/google-ruins-fitbit/
2•edent•13m ago•0 comments

Heimdall: Formally Verified eBPF-to-Rust Migration

https://arxiv.org/abs/2605.25411
1•igortru•19m ago•0 comments

Contrastive Decoding Diffing: Recovering Finetuning Data Without Weight Access

https://arxiv.org/abs/2605.25902
1•Timofeibu•20m ago•0 comments

Cognitive Security as an AI Safety Cause Area

https://www.lesswrong.com/posts/KGcE7eAdfxHchk25X/cognitive-security-as-an-ai-safety-cause-area
1•joozio•20m ago•0 comments

How to make a well-structured business architecture diagram?

https://www.processon.io/blog/business-architecture-diagrams
1•kapababala•21m ago•0 comments

Orchestrating AI code review at scale

https://blog.cloudflare.com/ai-code-review/
1•pramodbiligiri•23m ago•0 comments

Switchberry: Sometimes a good time costs extra [video]

https://www.youtube.com/watch?v=wxFHw57XGjA
1•teleforce•26m ago•0 comments

We need to add 6k seats to Congress

https://www.usatoday.com/story/opinion/2026/05/25/congress-larger-size-house-representatives/9014...
2•Cider9986•29m ago•0 comments

In-Browser Container Builds

https://ochagavia.nl/blog/fully-in-browser-container-builds/
1•gurjeet•30m ago•0 comments

Bird–Meertens Formalism

https://en.wikipedia.org/wiki/Bird%E2%80%93Meertens_formalism
1•tosh•30m ago•0 comments

The first class of AI natives is graduating

https://www.wsj.com/tech/ai/ai-natives-graduates-job-cuts-6bab8ac9
1•FDETalkDotCom•31m ago•1 comment

Show HN: A high-performance audio visualizer using Rust, WASM, and React

https://audiofftimage.netlify.app/
1•dmaynard•34m ago•1 comment

Show HN: Building Production MPC Wallets: Architecture, Solana Implementation

https://nethsara.substack.com/p/byowbuild-your-own-wallet-a-field
1•nethsarask•35m ago•0 comments

Show HN: GPTFortress, a 24/7 live-stream playing Dwarf Fortress with GPT-5

https://www.twitch.tv/gptfortress
1•leostera•38m ago•0 comments

AI guardrails stripped from Meta and Google models in minutes

https://www.ft.com/content/5630ed79-a263-41ed-9a1a-321617ae310e
4•thunderbong•38m ago•1 comment

Ship Early, Learn Fast: What 10 Days of User Feedback Taught Me About My App

https://qebapps.statichost.page/devnotes/ship-early-learn-fast/
1•qeb_newsairy•42m ago•0 comments

The Quiet Death of the Senior Individual Contributor

https://medium.com/@yalovoy/the-quiet-death-of-the-senior-individual-contributor-why-staff-engine...
1•zero-ground-445•42m ago•0 comments

Show HN: Riot, a modern multicore actor-based ecosystem for OCaml

https://riot.ml
1•leostera•42m ago•0 comments

Why can't anyone build a decent deployment platform for plain HTML?

https://foliodrop.app
1•jaxxchen•46m ago•1 comment

Microsoft to Publishers: Don't Block the AI Bots

https://www.adexchanger.com/publishers/microsoft-to-publishers-dont-block-the-ai-bots/
3•SVI•51m ago•0 comments

Zero-knowledge encryption may not stop password theft if servers are hacked

https://techxplore.com/news/2026-02-knowledge-encryption-password-theft-servers.html
2•Ember_Wipe•52m ago•0 comments

AI Making Work Easy for Data Analysts and Founders

https://anallyst.app/
1•Sechele•55m ago•0 comments