Ask HN: Relatively SoTA LLM Agents from Scratch?

4•solsane•1mo ago
As we know, OpenAI is not so open.

In 2023, I was playing with transformers and RNNs, and I understood how they worked from top to bottom (e.g. I made my own Keras and could whiteboard small nets). I can throw things together in Keras or TF pretty quickly.

I got a job and never touched that again. Data and compute notwithstanding, how hard would it be to make a pet project foundation model using the latest techniques? I've heard about MoE and things like that, and I figure we're not just throwing a bunch of layers and dropout together in Keras anymore.

Comments

huevosabio•1mo ago
Olmo is, AFAIK, the only SOTA-ish model with fully open source code and data. The team's report is fantastic: https://www.datocms-assets.com/64837/1763662397-1763646865-o...

It should give you an idea of how hard it is to do a SOTA model from scratch!

If you relax the SOTA aspect, Karpathy's nanochat has you covered: https://github.com/karpathy/nanochat

walpurginacht•1mo ago
I'd suggest reading HuggingFace's writeup on how they trained SmolLM3:

https://huggingface.co/spaces/HuggingFaceTB/smol-training-pl...

It gives rare, detailed insight into the entire process.

bjourne•1mo ago
Read this article: https://dl.acm.org/doi/10.1145/3712285.3759827

Training algorithms are relatively simple (base training, fine-tuning, RL), but scale, i.e. the engineering infrastructure, is critical. The authors recommend a 128 GPU cluster minimum and many petabytes of training data.
