
RustGPT: A pure-Rust transformer LLM built from scratch

https://github.com/tekaratzas/RustGPT
94•amazonhut•1h ago

Comments

techsystems•1h ago
> ndarray = "0.16.1"
> rand = "0.9.0"
> rand_distr = "0.5.0"

Looking good!

kachapopopow•1h ago
I was slightly curious:

    cargo tree
    llm v0.1.0 (RustGPT)
    ├── ndarray v0.16.1
    │   ├── matrixmultiply v0.3.9
    │   │   └── rawpointer v0.2.1
    │   │   [build-dependencies]
    │   │   └── autocfg v1.4.0
    │   ├── num-complex v0.4.6
    │   │   └── num-traits v0.2.19
    │   │       └── libm v0.2.15
    │   │       [build-dependencies]
    │   │       └── autocfg v1.4.0
    │   ├── num-integer v0.1.46
    │   │   └── num-traits v0.2.19 (*)
    │   ├── num-traits v0.2.19 (*)
    │   └── rawpointer v0.2.1
    ├── rand v0.9.0
    │   ├── rand_chacha v0.9.0
    │   │   ├── ppv-lite86 v0.2.20
    │   │   │   └── zerocopy v0.7.35
    │   │   │       ├── byteorder v1.5.0
    │   │   │       └── zerocopy-derive v0.7.35 (proc-macro)
    │   │   │           ├── proc-macro2 v1.0.94
    │   │   │           │   └── unicode-ident v1.0.18
    │   │   │           ├── quote v1.0.39
    │   │   │           │   └── proc-macro2 v1.0.94 (*)
    │   │   │           └── syn v2.0.99
    │   │   │               ├── proc-macro2 v1.0.94 (*)
    │   │   │               ├── quote v1.0.39 (*)
    │   │   │               └── unicode-ident v1.0.18
    │   │   └── rand_core v0.9.3
    │   │       └── getrandom v0.3.1
    │   │           ├── cfg-if v1.0.0
    │   │           └── libc v0.2.170
    │   ├── rand_core v0.9.3 (*)
    │   └── zerocopy v0.8.23
    └── rand_distr v0.5.1
        ├── num-traits v0.2.19 (*)
        └── rand v0.9.0 (*)

yep, still looks relatively good.

cmrdporcupine•23m ago
linking both rand-core 0.9.0 and rand-core 0.9.3 which the project could maybe avoid by just specifying 0.9 for its own dep on it
tonyhart7•1h ago
Is this satire, or is there context behind this comment that I need to know?
stevedonovan•1h ago
These are a few well-chosen dependencies for a serious project.

Rust projects can really go bananas on dependencies, partly because it's so easy to include them

obsoleszenz•57m ago
The project only has 3 dependencies, which I interpret as a sign of quality.
Charon77•1h ago
Absolutely love how readable the entire project is
yieldcrv•1h ago
Never knew Rust could be that readable. Makes me think other Rust engineers are stuck in a masochistic ego driven contest, which would explain everything else I've encountered about the Rust community and recruiting on that side.
jmaker•1h ago
Not sure what you’re alluding to but that’s just ordinary Rust without performance or async IO concerns.
emporas•57m ago
It is very procedural/object oriented. This is not considered good Rust practice. Iterators make it more functional, which is better, more succinct that is, and enums more algebraic. But it's totally fine for a thought experiment.
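To make the contrast concrete, here is a minimal sketch of the same computation in both styles, followed by an enum used as a small algebraic type. The names `dot_loop`, `dot_iter`, and `Activation` are hypothetical illustrations and are not taken from the RustGPT source.

```rust
/// Procedural style: index-based loop with a mutable accumulator.
/// Panics if `b` is shorter than `a`.
fn dot_loop(a: &[f32], b: &[f32]) -> f32 {
    let mut acc = 0.0;
    for i in 0..a.len() {
        acc += a[i] * b[i];
    }
    acc
}

/// Iterator style: pair the elements, multiply, and sum.
/// `zip` stops at the shorter slice instead of panicking.
fn dot_iter(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// An enum as a closed set of alternatives; `match` must cover every variant.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Activation {
    Relu,
    Tanh,
}

impl Activation {
    fn apply(self, x: f32) -> f32 {
        match self {
            Activation::Relu => x.max(0.0),
            Activation::Tanh => x.tanh(),
        }
    }
}

fn main() {
    let a = [1.0, 2.0, 3.0];
    let b = [4.0, 5.0, 6.0];
    println!("loop: {}, iter: {}", dot_loop(&a, &b), dot_iter(&a, &b));
    println!("relu(-1.5) = {}", Activation::Relu.apply(-1.5));
    println!("tanh(0.0) = {}", Activation::Tanh.apply(0.0));
}
```

Both functions return the same value for equal-length inputs; which one reads better is largely a matter of taste, as the replies here suggest.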
koakuma-chan•26m ago
It's AI generated
Revisional_Sin•18m ago
How do you know? The over-commenting?
koakuma-chan•9m ago
I know because this is how an AI generated project looks. Clearly AI generated README, "clean" code, the way files are named, etc.
cmrdporcupine•6m ago
To me it looks like an LLM-generated README, but not necessarily the source (or at least not all of it).

Or there's been a cleaning pass done over it.

ndai•1h ago
I’m curious where you got your training data? I will look myself, but saw this and thought I’d ask. I have a CPU-first, no-backprop architecture that works very well on classification datasets. It can do single‑example incremental updates which might be useful for continuous learning. I made a toy demo to train on tiny.txt and it can predict next characters, but I’ve never tried to make an LLM before. I think my architecture might work well as an on-device assistant or for on-premises needs, but I want to work with it more before I embarrass myself. Any open-source LLM training datasets you would recommend?
kachapopopow•1h ago
Hugging Face has plenty of OpenAI and Anthropic user-to-assistant chains. Beware, there be dragons (hallucinations), but they are good enough for instruction training. I actually recommend distilling Kimi K2 instead for instruction-following capabilities.
electroglyph•1h ago
https://huggingface.co/datasets/NousResearch/Hermes-3-Datase...
kachapopopow•1h ago
This looks rather similar to what I got when I asked an AI to implement a basic XOR problem solver. I guess fundamentally there's only a limited number of ways to implement this.
Goto80•59m ago
Nice. Mind putting a license on that?
untrimmed•56m ago
As someone who has spent days wrestling with Python dependency hell just to get a model running, a simple cargo run feels like a dream. But I'm wondering, what was the most painful part of NOT having a framework? I'm betting my coffee money it was debugging the backpropagation logic.
taminka•39m ago
lowkey ppl who praise cargo seem to have no idea of the tradeoffs involved in dependency management

the difficulty of including a dependency should be proportional to the risk you're taking on, meaning it shouldn't be as difficult as it is in, say, C, where every other library is continually reinventing the same 5 utilities, but also not as easy as it is with npm or cargo, because you get insane dependency clutter, and all the related issues like security, build times, etc

how good a build system is isn't the same as how easy it is to include a dependency. modern languages should have a consistent build system, but having a centralised package repository that anyone can freely push to and pull from, and having those dependencies freely take on any number of other dependencies, is a bad way to handle dependencies

itsibitzi•7m ago
What tool or ecosystem does this well, in your opinion?
quantumspandex•5m ago
Security is another problem, and should be tackled systematically. Artificially making dependency inclusion hard is not it and is detrimental to the more casual use cases.
codetiger•21m ago
I guess, resource utilization like GPU, etc
ricardobeat•3m ago
Have you tried uv [1]? It has removed 90% of the pain of running python projects for me.

[1] https://github.com/astral-sh/uv

enricozb•50m ago
I did this [0] (gpt in rust) with picogpt, following the great blog by jaykmody [1].

[0]: https://github.com/enricozb/picogpt-rust

[1]: https://jaykmody.com/blog/gpt-from-scratch/

bigmuzzy•47m ago
nice
abricq•28m ago
This is great! Congratulations. I really like your project, especially how easy it is to peek at.

Do you plan on moving forward with this project? I understand that all the training is done on the CPU, and that you have next steps for optimizing that. Are you considering GPU acceleration?

Also, do you have any benchmarks on known hardware? E.g., how long would it take to train on a latest-gen MacBook or your own computer?

ramon156•10m ago
Cool stuff! I can see some GPT comments that can be removed

// Increased for better learning

this doesn't tell me anything

    // Use the constants from lib.rs
    const MAX_SEQ_LEN: usize = 80;
    const EMBEDDING_DIM: usize = 128;
    const HIDDEN_DIM: usize = 256;

these are already defined in lib.rs, why not use them (as the comment suggests)
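A minimal sketch of what importing the constants could look like, written as one file so it runs on its own. The inline `config` module is an assumption standing in for `lib.rs`, and `describe_model` is a hypothetical helper; in the actual crate the import would presumably be something like `use llm::{...}`, going by the package name `llm` in the `cargo tree` output earlier in the thread. The values are the ones quoted above.

```rust
// Stand-in for src/lib.rs: the single place where the sizes are defined.
mod config {
    pub const MAX_SEQ_LEN: usize = 80;
    pub const EMBEDDING_DIM: usize = 128;
    pub const HIDDEN_DIM: usize = 256;
}

// Stand-in for src/main.rs: import the constants instead of redeclaring them,
// so a change in one place cannot drift out of sync with the other.
use config::{EMBEDDING_DIM, HIDDEN_DIM, MAX_SEQ_LEN};

/// Hypothetical helper that only exists to show the imported values in use.
fn describe_model() -> String {
    format!(
        "seq_len={} embedding_dim={} hidden_dim={}",
        MAX_SEQ_LEN, EMBEDDING_DIM, HIDDEN_DIM
    )
}

fn main() {
    println!("{}", describe_model());
}
```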

jlmcgraw•3m ago
Some commentary from the author here: https://www.reddit.com/r/rust/comments/1nguv1a/i_built_an_ll...
