frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Ask HN: Opus 4.6 ignoring instructions, how to use 4.5 in Claude Code instead?

2•Chance-Device•6h ago•0 comments

Ask HN: Anyone Using a Mac Studio for Local AI/LLM?

49•UmYeahNo•2d ago•30 comments

Ask HN: Ideas for small ways to make the world a better place

21•jlmcgraw•1d ago•22 comments

Ask HN: Non AI-obsessed tech forums

34•nanocat•1d ago•28 comments

Ask HN: 10 months since the Llama-4 release: what happened to Meta AI?

45•Invictus0•1d ago•11 comments

Ask HN: Who wants to be hired? (February 2026)

139•whoishiring•5d ago•525 comments

Ask HN: Who is hiring? (February 2026)

313•whoishiring•5d ago•515 comments

LLMs are powerful, but enterprises are deterministic by nature

5•prateekdalal•15h ago•7 comments

AI Regex Scientist: A self-improving regex solver

7•PranoyP•1d ago•1 comments

Tell HN: Another round of Zendesk email spam

105•Philpax•3d ago•54 comments

Ask HN: Non-profit, volunteers run org needs CRM. Is Odoo Community a good sol.?

2•netfortius•23h ago•1 comments

Ask HN: Is Connecting via SSH Risky?

19•atrevbot•2d ago•37 comments

Ask HN: Has your whole engineering team gone big into AI coding? How's it going?

18•jchung•2d ago•14 comments

Ask HN: Is there anyone here who still uses slide rules?

123•blenderob•4d ago•122 comments

Ask HN: How does ChatGPT decide which websites to recommend?

5•nworley•2d ago•11 comments

Ask HN: Mem0 stores memories, but doesn't learn user patterns

9•fliellerjulian•3d ago•6 comments

Kernighan on Programming

171•chrisjj•5d ago•61 comments

Ask HN: Why LLM providers sell access instead of consulting services?

5•pera•1d ago•13 comments

Ask HN: Is it just me or are most businesses insane?

8•justenough•2d ago•7 comments

Ask HN: What is the most complicated Algorithm you came up with yourself?

3•meffmadd•1d ago•7 comments

Ask HN: Anyone Seeing YT ads related to chats on ChatGPT?

2•guhsnamih•1d ago•4 comments

Ask HN: Does global decoupling from the USA signal comeback of the desktop app?

5•wewewedxfgdf•2d ago•3 comments

We built a serverless GPU inference platform with predictable latency

5•QubridAI•2d ago•1 comments

Ask HN: Does a good "read it later" app exist?

8•buchanae•3d ago•18 comments

Ask HN: Have you been fired because of AI?

17•s-stude•4d ago•15 comments

Ask HN: Anyone have a "sovereign" solution for phone calls?

12•kldg•4d ago•1 comments

Ask HN: Cheap laptop for Linux without GUI (for writing)

15•locusofself•4d ago•16 comments

GitHub Actions Have "Major Outage"

53•graton•5d ago•17 comments

Ask HN: Has anybody moved their local community off of Facebook groups?

23•madsohm•5d ago•20 comments

Ask HN: OpenClaw users, what is your token spend?

14•8cvor6j844qw_d6•5d ago•6 comments
Open in hackernews

Ask HN: What's something interesting you learned from training your own GPT?

2•amadeuswoo•1w ago
Not using APIs, actually training a model from scratch, even a small one

What surprised you about the data, the training process, or the output?

Comments

linolevan•1w ago
For tiny models, the SFT data mixture is unbelievably critical to usability. They are unable to generalize in almost any way. If you don't have multi-turn conversations, they will not be able to do multi-turn conversations. If you have multi-turn conversations which are just chatting, and then single turn conversations for math, it will be unable to do math in a multi-turn setting. This is much less true for bigger models.
dlcarrier•1w ago
Neural network development platforms are even more bloated and broken than the record set by FPGA development platforms and even mobile phone development platforms.
baranmelik•1w ago
That it's really easy to overfit a model