frontpage.

Life at the Edge

https://asadk.com/p/edge
1•tosh•2m ago•0 comments

RISC-V Vector Primer

https://github.com/simplex-micro/riscv-vector-primer/blob/main/index.md
2•oxxoxoxooo•6m ago•0 comments

Show HN: Invoxo – Invoicing with automatic EU VAT for cross-border services

2•InvoxoEU•6m ago•0 comments

A Tale of Two Standards, POSIX and Win32 (2005)

https://www.samba.org/samba/news/articles/low_point/tale_two_stds_os2.html
2•goranmoomin•10m ago•0 comments

Ask HN: Has the Downfall of SaaS Started?

3•throwaw12•11m ago•0 comments

Flirt: The Native Backend

https://blog.buenzli.dev/flirt-native-backend/
2•senekor•13m ago•0 comments

OpenAI's Latest Platform Targets Enterprise Customers

https://aibusiness.com/agentic-ai/openai-s-latest-platform-targets-enterprise-customers
1•myk-e•15m ago•0 comments

Goldman Sachs taps Anthropic's Claude to automate accounting, compliance roles

https://www.cnbc.com/2026/02/06/anthropic-goldman-sachs-ai-model-accounting.html
2•myk-e•18m ago•3 comments

Ai.com bought by Crypto.com founder for $70M in biggest-ever website name deal

https://www.ft.com/content/83488628-8dfd-4060-a7b0-71b1bb012785
1•1vuio0pswjnm7•19m ago•1 comment

Big Tech's AI Push Is Costing More Than the Moon Landing

https://www.wsj.com/tech/ai/ai-spending-tech-companies-compared-02b90046
2•1vuio0pswjnm7•21m ago•0 comments

The AI boom is causing shortages everywhere else

https://www.washingtonpost.com/technology/2026/02/07/ai-spending-economy-shortages/
1•1vuio0pswjnm7•22m ago•0 comments

Suno, AI Music, and the Bad Future [video]

https://www.youtube.com/watch?v=U8dcFhF0Dlk
1•askl•24m ago•1 comment

Ask HN: How are researchers using AlphaFold in 2026?

1•jocho12•27m ago•0 comments

Running the "Reflections on Trusting Trust" Compiler

https://spawn-queue.acm.org/doi/10.1145/3786614
1•devooops•32m ago•0 comments

Watermark API – $0.01/image, 10x cheaper than Cloudinary

https://api-production-caa8.up.railway.app/docs
1•lembergs•34m ago•1 comment

Now send your marketing campaigns directly from ChatGPT

https://www.mail-o-mail.com/
1•avallark•37m ago•1 comment

Queueing Theory v2: DORA metrics, queue-of-queues, chi-alpha-beta-sigma notation

https://github.com/joelparkerhenderson/queueing-theory
1•jph•49m ago•0 comments

Show HN: Hibana – choreography-first protocol safety for Rust

https://hibanaworks.dev/
5•o8vm•51m ago•1 comment

Haniri: A live autonomous world where AI agents survive or collapse

https://www.haniri.com
1•donangrey•52m ago•1 comment

GPT-5.3-Codex System Card [pdf]

https://cdn.openai.com/pdf/23eca107-a9b1-4d2c-b156-7deb4fbc697c/GPT-5-3-Codex-System-Card-02.pdf
1•tosh•1h ago•0 comments

Atlas: Manage your database schema as code

https://github.com/ariga/atlas
1•quectophoton•1h ago•0 comments

Geist Pixel

https://vercel.com/blog/introducing-geist-pixel
2•helloplanets•1h ago•0 comments

Show HN: MCP to get latest dependency package and tool versions

https://github.com/MShekow/package-version-check-mcp
1•mshekow•1h ago•0 comments

The better you get at something, the harder it becomes to do

https://seekingtrust.substack.com/p/improving-at-writing-made-me-almost
2•FinnLobsien•1h ago•0 comments

Show HN: WP Float – Archive WordPress blogs to free static hosting

https://wpfloat.netlify.app/
1•zizoulegrande•1h ago•0 comments

Show HN: I Hacked My Family's Meal Planning with an App

https://mealjar.app
1•melvinzammit•1h ago•0 comments

Sony BMG copy protection rootkit scandal

https://en.wikipedia.org/wiki/Sony_BMG_copy_protection_rootkit_scandal
2•basilikum•1h ago•0 comments

The Future of Systems

https://novlabs.ai/mission/
2•tekbog•1h ago•1 comment

NASA now allowing astronauts to bring their smartphones on space missions

https://twitter.com/NASAAdmin/status/2019259382962307393
2•gbugniot•1h ago•0 comments

Claude Code Is the Inflection Point

https://newsletter.semianalysis.com/p/claude-code-is-the-inflection-point
4•throwaw12•1h ago•3 comments

Writing an LLM from scratch, part 22 – training our LLM

https://www.gilesthomas.com/2025/10/llm-from-scratch-22-finally-training-our-llm
254•gpjt•3mo ago

Comments

mettamage•3mo ago
Here's part 1 [1]. Since his archive is organized by date, it makes it a bit easier to guesstimate which part was written in which month.

[1] https://www.gilesthomas.com/2024/12/llm-from-scratch-1

3abiton•3mo ago
It's interesting: 22 parts in under a year. It seems like a fun, up-to-date project. Karpathy did something very similar with nanochat (following nanoGPT).

ziyunli•3mo ago
Seems like you can filter by tag: https://www.gilesthomas.com/llm-from-scratch

js8•3mo ago
It's based on a book (https://www.manning.com/books/build-a-large-language-model-f...). Is it a good book?

checker659•3mo ago
I had done a little bit of DL work (with Keras) before this, and I'm currently in the attention chapter. The book gives you the code, but I feel there is very little in the way of building intuition. Thankfully, there are tons of videos online to help with that.

I think it is a great guide, an extended tutorial if you will (at least up to this point in my reading). Having the code right in front of you also helps a lot. For example, I was under the impression that embedding vectors were static, like in word2vec. It turns out they are learnable parameters too. I wouldn't have been able to tell for sure without the code in front of me.
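
Here is a minimal PyTorch sketch (my own toy illustration, not the book's code) showing that the embedding table updates under backprop like any other weight:

    import torch

    # A toy embedding table: vocabulary of 10 tokens, 4-dimensional vectors.
    emb = torch.nn.Embedding(num_embeddings=10, embedding_dim=4)
    print(emb.weight.requires_grad)  # True: the vectors are trainable parameters

    # One gradient step moves only the rows that were looked up.
    before = emb.weight.detach().clone()
    opt = torch.optim.SGD(emb.parameters(), lr=0.1)
    loss = emb(torch.tensor([1, 2])).sum()  # look up tokens 1 and 2
    loss.backward()
    opt.step()
    print((emb.weight != before).any(dim=1))  # only rows 1 and 2 changed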

dvt•3mo ago
> The book gives you the code, but I feel like there is very little in the way of building intuition.

There isn't much intuition to begin with, and I don't think building intuition would be useful anyway. Even when looking at something as barebones as perceptrons, it's hard to really see "why" they work. Heck, even implementing a Markov chain from scratch (which can be done in an afternoon with no prior knowledge) can feel magical when it starts outputting semi-legible sentences.

It's like trying to build intuition for technical results like the Banach-Tarski paradox or Löb's theorem. IMO, understanding the math (which in the case of LLMs is actually quite simple) is orders of magnitude more valuable than "building intuition," whatever that might mean.
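
For a sense of how little machinery that takes, here is a throwaway word-level sketch (the tiny corpus is just a placeholder; feed it a real text to get the semi-legible magic):

    import random
    from collections import defaultdict

    def train_markov(text, order=2):
        # Map each run of `order` words to the words observed right after it.
        words = text.split()
        table = defaultdict(list)
        for i in range(len(words) - order):
            table[tuple(words[i:i + order])].append(words[i + order])
        return table

    def generate(table, order=2, length=20):
        out = list(random.choice(list(table)))  # random starting key
        for _ in range(length):
            followers = table.get(tuple(out[-order:]))
            if not followers:
                break
            out.append(random.choice(followers))
        return " ".join(out)

    corpus = "the cat sat on the mat and the dog sat on the cat"
    print(generate(train_markov(corpus)))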

checker659•3mo ago
> Even when looking at something as barebones as perceptrons

I was thinking of something like "it is trying to approximate a non-linear function" (which is what it is in the case of MLPs).
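
In code, that framing is literally just curve fitting. A quick toy PyTorch sketch (my example, nothing from the book) of an MLP approximating the non-linear function sin(x):

    import torch

    x = torch.linspace(-3, 3, 256).unsqueeze(1)  # inputs, shape (256, 1)
    y = torch.sin(x)                             # non-linear target

    # One hidden layer of 32 tanh units gives a decent fit on this range.
    mlp = torch.nn.Sequential(
        torch.nn.Linear(1, 32),
        torch.nn.Tanh(),
        torch.nn.Linear(32, 1),
    )
    opt = torch.optim.Adam(mlp.parameters(), lr=0.01)

    for _ in range(2000):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(mlp(x), y)
        loss.backward()
        opt.step()

    print(loss.item())  # small: the MLP has approximated sin on [-3, 3]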

CamperBob2•3mo ago
> Even when looking at something as barebones as perceptrons, it's hard to really see "why" they work.

Check out the Karpathy "Zero to Hero" videos, and try to follow along by building an MLP implementation in your language of choice. He does a good job of building intuition because he doesn't skip much of anything.

mrasong•3mo ago
The cost comparison between a local RTX 3090 and cloud A100 clusters is useful, but I wonder if the author accounted for hidden overhead, like data transfer time for large datasets or the time spent debugging CUDA compatibility issues on local hardware.
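
For what it's worth, here is a back-of-envelope way to fold that overhead in; every number below is a made-up assumption, not a figure from the article:

    # All numbers are invented placeholders, not the article's figures.
    GPU_KW = 0.35          # assumed RTX 3090 power draw under load, kW
    KWH_PRICE = 0.30       # assumed electricity price, USD/kWh
    CLOUD_RATE = 1.20      # assumed A100 price, USD/hour
    SETUP_HOURS = 5        # assumed one-off CUDA debugging time (local only)
    TRANSFER_HOURS = 2     # assumed dataset upload time (cloud only)

    # Crude model: counts debugging/transfer as extra powered-on or billable
    # hours; in reality much of that cost is the person's time, not the machine's.
    def local_cost(train_hours):
        return (train_hours + SETUP_HOURS) * GPU_KW * KWH_PRICE

    def cloud_cost(train_hours):
        return (train_hours + TRANSFER_HOURS) * CLOUD_RATE

    for h in (10, 100, 1000):
        print(f"{h}h: local ${local_cost(h):.2f} vs cloud ${cloud_cost(h):.2f}")
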
pppoe•3mo ago
Love this. It reminds me of LFS (Linux From Scratch): https://www.linuxfromscratch.org

Feeling nostalgic about the days I spent building LFS in college.

Learning by building won't help you remember all the details, but many things make more sense after you've gone through the process step by step. And it's fun.