frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

Ask HN: What's something interesting you learned from training your own GPT?

2•amadeuswoo•1d ago
Not using APIs, actually training a model from scratch, even a small one

What surprised you about the data, the training process, or the output?

Comments

linolevan•1d ago
For tiny models, the SFT data mixture is unbelievably critical to usability. They are unable to generalize in almost any way. If you don't have multi-turn conversations, they will not be able to do multi-turn conversations. If you have multi-turn conversations which are just chatting, and then single turn conversations for math, it will be unable to do math in a multi-turn setting. This is much less true for bigger models.
dlcarrier•1d ago
Neural network development platforms are even more bloated and broken than the record set by FPGA development platforms and even mobile phone development platforms.
baranmelik•18h ago
That it's really easy to overfit a model

Ask HN: How far has "vibe coding" come?

4•pigon1002•4h ago•8 comments

The Anti-Pomodoro Technique: Focus on Taking Breaks, Not Watching the Timer

4•kentich•1h ago•1 comments

Ask HN: Books to learn 6502 ASM and the Apple II

97•abkt•2d ago•67 comments

Ask HN: Who do you follow via RSS feed?

67•znpy•2d ago•51 comments

Designing programming languages beyond AI comprehension

4•mr_bob_sacamano•15h ago•6 comments

Ask HN: DDD was a great debugger – what would a modern equivalent look like?

56•manux81•3d ago•60 comments

Ask HN: What recent UX changes make no sense to you?

29•superasn•1d ago•34 comments

Ask HN: Has Show HN become LLM-prompt-centric?

8•piratesAndSons•15h ago•3 comments

Ask HN: What's the Point Anymore?

60•fnoef•1d ago•73 comments

Ask HN: Gmail spam filtering suddenly marking everything as spam?

210•goopthink•4d ago•122 comments

Ask HN: Where to find cool companies to work for?

5•truetaurus•16h ago•4 comments

How much recurring income do you generate in 2026 and from what?

9•djshah•1d ago•4 comments

Ask HN: What's the current best local/open speech-to-speech setup?

256•dsrtslnd23•6d ago•61 comments

Ask HN: Vibe Researching" with AI – Anyone Using It for Real?

8•spenceXu•1d ago•5 comments

Ask HN: Notification Overload

7•fractal618•2d ago•8 comments

Ask HN: European alternative to Vercel/Cloudflare for hosting

11•vldszn•1d ago•14 comments

Where can I find startups looking for fractional product leads?

6•stulogy•1d ago•3 comments

How to DeGoogle Myself?

12•neuralkoi•1d ago•1 comments

Ask HN: How to prevent Claude/GPT/Gemini from reinforcing your biases?

29•akshay326•2d ago•22 comments

I built a C++ runtime with immutable objects and no GIL

5•gamarino•1d ago•3 comments

Ask HN: How much emphasis to put on unit testing and when?

9•theturtlemoves•2d ago•18 comments

Tell HN: I cut Claude API costs from $70/month to pennies

40•ok_orco•3d ago•25 comments

Ask HN: If Everyone Can "Build" a SaaS, What Becomes Valuable?

10•spenceXu•1d ago•8 comments

Ask HN: Can a MMO be vibe coded?

3•radicalethics•20h ago•4 comments

Ask HN: What usually happens after a VC asks for a demo?

12•stijo•4d ago•7 comments

Tell HN: JumpCloud 2FA appears to be down

2•sgammon•1d ago•0 comments

Why did the developer go broke?

7•oxqbldpxo•1d ago•6 comments

Generative AI failed to replace SaaS

3•AIFairy•1d ago•2 comments

Ask HN: What's something interesting you learned from training your own GPT?

2•amadeuswoo•1d ago•3 comments

Ask HN: If OpenAI stops its free Web service (ChatGPT)

3•JPLeRouzic•1d ago•2 comments