Show HN: Tiny Diffusion – A character-level text diffusion model from scratch

https://github.com/nathan-barry/tiny-diffusion
33•nathan-barry•4d ago
This is a character-level language diffusion model for text generation.

The model is a modified version of Nanochat's GPT implementation and is trained on Tiny Shakespeare!

It has only 10.7 million parameters, so you can try it out locally.

Comments

yugretcx•1h ago
Why do these text diffusion demos always look like the number of allowed tokens is fixed for a specific unfilled region?

Is this the case?

I.e., if the region only has four tokens (here, characters) but the model calculates that the best word is “forget”, does it just abandon the best fit or truncate it to fit?

Are there text diffusion models with lax infill directives?

rand0mwalk•20m ago
Tokens start as a special [MASK] token. Then, as the diffusion process runs, they are "unmasked", i.e., sampled.

So yes, you define a sequence of [MASK] tokens with some length ahead of time.

In practice, if a model wants to write a shorter sequence, it'll just fill the remaining tokens with empty content. If it wants to write a longer sequence, you'll have to identify this and extend the sequence with more [MASK] tokens. This is typically obvious since there's no "end of sequence" token present if the model wants to generate more.
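The length handling described above could be sketched roughly as follows. This is only an illustration, not code from the tiny-diffusion repo: the `[EOS]` token name and the helpers `needs_extension`, `extend`, and `trim` are hypothetical, and it assumes anything after the end-of-sequence token is filler.

```python
MASK = "[MASK]"
EOS = "[EOS]"  # assumed name for the end-of-sequence token


def needs_extension(tokens):
    """True when every slot is filled but no end-of-sequence token appeared."""
    return MASK not in tokens and EOS not in tokens


def extend(tokens, extra):
    """Append `extra` fresh [MASK] slots so decoding can continue."""
    return list(tokens) + [MASK] * extra


def trim(tokens):
    """Drop the end-of-sequence token and any filler that follows it."""
    return list(tokens[:tokens.index(EOS)]) if EOS in tokens else list(tokens)


short = ["h", "i", EOS, " ", " "]   # model finished early, rest is filler
full = ["f", "o", "r", "g"]         # region filled up, no EOS yet

print(trim(short))                  # keep only the text before EOS
print(needs_extension(full))        # the region ran out of room
print(extend(full, 2))              # add two more [MASK] slots
```

In a real sampler the extended sequence would be fed back into the unmasking loop, and how many slots to add at a time would be a tuning choice.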

nathan-barry•20m ago
Yes, this is the case. During training, the model gets a sequence of text (e.g., 512 tokens long) with a percentage of the tokens masked out (replaced with a special <MASK> token). It learns how to unmask those tokens to reconstruct the original text.
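As a rough sketch of that masking step, assuming character-level tokens and a fixed mask ratio: the function name `mask_sequence` and the `mask_ratio` parameter are made up for illustration, and the actual repo may pick positions and ratios differently.

```python
import random

MASK = "<MASK>"


def mask_sequence(tokens, mask_ratio, rng):
    """Replace a random subset of tokens with <MASK>.

    Returns the corrupted sequence plus a {position: original token} dict,
    which is what the model would be trained to predict.
    """
    n_mask = int(len(tokens) * mask_ratio)
    positions = set(rng.sample(range(len(tokens)), n_mask))
    masked = [MASK if i in positions else t for i, t in enumerate(tokens)]
    targets = {i: tokens[i] for i in positions}
    return masked, targets


tokens = list("To be, or not to be")
masked, targets = mask_sequence(tokens, 0.5, random.Random(0))
print(masked)
print(targets)
```

Only the masked positions carry a training target here; the unmasked characters serve as context.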

In the case that you mentioned, if we had 4 <MASK> tokens in a row, all we are doing for decoding is predicting what those 4 tokens should be.

Generally, this does not seem to be a significant problem, as there are usually multiple ways to express an idea in varying lengths. Also, confidence-aware parallel decoding can usually avoid the scenario you mentioned: with a well-trained model, decoding the highest-confidence tokens first tends to steer away from it.
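A toy sketch of confidence-aware parallel decoding, under loud assumptions: `toy_predict` is a hypothetical stand-in for the model that already knows the answer and is simply more confident next to revealed characters, and the `threshold` rule is one plausible commit policy, not necessarily the one the repo uses.

```python
MASK = "<MASK>"


def toy_predict(tokens, target):
    """Hypothetical model stand-in: {position: (token, confidence)}.

    Confidence rises with the number of already-revealed neighbours.
    """
    preds = {}
    for i, t in enumerate(tokens):
        if t != MASK:
            continue
        neighbours = [tokens[j] for j in (i - 1, i + 1) if 0 <= j < len(tokens)]
        revealed = sum(n != MASK for n in neighbours)
        preds[i] = (target[i], 0.4 + 0.3 * revealed)
    return preds


def decode(tokens, predict, threshold=0.9):
    """Each step, commit every masked position whose confidence clears the
    threshold; if none do, commit the most confident ones so we still progress."""
    tokens = list(tokens)
    steps = 0
    while MASK in tokens:
        preds = predict(tokens)
        best = max(conf for _, conf in preds.values())
        cutoff = min(threshold, best)
        for i, (tok, conf) in preds.items():
            if conf >= cutoff:
                tokens[i] = tok
        steps += 1
    return tokens, steps


start = ["f", MASK, MASK, MASK, MASK, "t"]
out, steps = decode(start, lambda toks: toy_predict(toks, "forget"))
print("".join(out), steps)
```

Several positions can be committed in the same step, which is where the "parallel" part comes from; the low-confidence positions wait until their context has been filled in.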

simonw•22m ago
This is really neat.

I noticed the diffusion-process.py demo was using matplotlib in a window, but I figured it would be cute if it used a terminal UI instead - so I had Claude Code convert it to use curses. Code and demo GIF here: https://gist.github.com/simonw/9033ebd8dd17b4c0ad101ddda7a54...
