How's Linear so fast? A technical breakdown

https://performance.dev/how-is-linear-so-fast-a-technical-breakdown
202•howToTestFE•3h ago•106 comments

Building from zero after addiction, prison, and a felony

https://gavinray97.github.io/blog/building-from-zero-after-addiction-prison-felony
225•gavinray•3h ago•114 comments

If LLMs Have Human-Like Attributes, Then So Does Age of Empires II

https://arxiv.org/abs/2605.31514
61•ketchup32613•3h ago•39 comments

Show HN: I Derived a Pancake

https://www.absurdlyoptimized.com/recipes/pancakes/
24•bkazez•2d ago•3 comments

Making peace with your unlived dreams (2023)

https://nik.art/making-peace-with-your-unlived-dreams/
90•herbertl•4h ago•38 comments

Silurus/ooxml: Pixel-faithful Office documents, rendered in the browser

https://github.com/yukiyokotani/office-open-xml-viewer
100•maxloh•4h ago•34 comments

Powering up a module from the IBM 604: an electronic calculator from 1948

https://www.righto.com/2026/06/ibm-604-thyraton-tube-module.html
62•elpocko•5h ago•19 comments

What is the purpose of the lost+found folder in Linux and Unix? (2014)

https://unix.stackexchange.com/questions/18154/what-is-the-purpose-of-the-lostfound-folder-in-lin...
97•tosh•2d ago•37 comments

My automated doubt development process

https://www.alexself.dev/blog/automated-doubt
36•aself101•4h ago•14 comments

Do we fear the serializable isolation level more than we fear subtle bugs?

https://blog.ydb.tech/do-we-fear-the-serializable-isolation-level-more-than-we-fear-subtle-bugs-5...
21•b-man•4d ago•5 comments

Cloning a Sennheiser BA2015 battery pack

https://blog.brixit.nl/cloning-a-sennheiser-ba2015-accu-pack/
92•zdw•1d ago•15 comments

LLMs are eroding my software engineering career and I don't know what to do

https://human-in-the-loop.bearblog.dev/llms-are-eroding-my-software-engineering-career-and-i-dont...
733•poisonfountain•9h ago•689 comments

Show HN: Lathe – Use LLMs to learn a new domain, not skip past it

https://github.com/devenjarvis/lathe
207•devenjarvis•11h ago•41 comments

The 29th International Obfuscated C Code Contest (IOCCC) 2025 Winners

https://www.ioccc.org/2025/
352•matt_d•16h ago•85 comments

A Fundamental Principle of Aeronautical Engineering Has Been Overturned

https://www.tohoku.ac.jp/japanese/2026/05/press20260512-02-DMR.html
5•mhb•5d ago•1 comments

The architecture of the internet creates risks for democracy

https://www.science.org/doi/10.1126/science.aei2409
64•Anon84•1h ago•70 comments

Proliferate (YC S25) is hiring to building open source Codex

https://www.ycombinator.com/companies/proliferate/jobs/L3copvK-founding-engineer
1•pablo24602•5h ago

Why isn't the U.S. better at soccer?

https://www.natesilver.net/p/why-isnt-the-us-better-at-soccer
38•7777777phil•2h ago•78 comments

The complete IPv4 address space, mapped

https://worldip.io/
24•theanonymousone•4h ago•9 comments

Backrest – a web UI and orchestrator for restic backup

https://github.com/garethgeorge/backrest
66•flexagoon•5d ago•5 comments

A visual introduction to kernel functions

https://kelvinpaschal.com/blog/kernel-functions/
20•Kelvinidan•2d ago•1 comments

Anthropic, please ship an official Claude Desktop for Linux

https://github.com/anthropics/claude-code/issues/65697
409•predkambrij•9h ago•235 comments

Podman 6: machine usability improvements (2025)

https://blog.podman.io/2025/10/podman-6-machine-usability-improvements/
86•daesorin•8h ago•6 comments

Splash Is a Colour Format

https://www.todepond.com/lab/splash/
41•tobr•4d ago•46 comments

An Ohio Valley 100k-watt FM signal is severed in broad daylight

https://www.radioworld.com/news-and-business/headlines/an-ohio-valley-100000-watt-fm-signal-is-se...
124•pkaeding•20h ago•120 comments

The gamers taking on the industry to stop it switching off games

https://www.bbc.com/news/articles/c8e8e7g0r82o
87•Brajeshwar•6h ago•97 comments

Win16 Memory Management

http://www.os2museum.com/wp/win16-memory-management/
125•supermatou•2d ago•64 comments

I design with Claude more than Figma now

https://blog.janestreet.com/i-design-with-claude-code-more-than-figma-now-index/
226•MrBuddyCasino•17h ago•209 comments

Public Domain Image Archive

https://pdimagearchive.org/
235•davidbarker•21h ago•32 comments

The week I lost the plot at a startup where the tools worked as advertised

https://nlkw.de/en/blog/getting-fried-part-1-cogentiv/
19•nilaloeber•5h ago•2 comments

What Are Tokens in LLMs?

https://bearisland.dev/posts/tokens-and-tokenization/
9•s1monb•1h ago

Comments

Tiberium•1h ago
The article comes from the "personal" experience of an LLM so it's a very trusted source!

/s

Tiberium•1h ago
> This isn’t because the model can’t count. It’s because it never sees the letters at all.

> The chunks aren’t characters and they aren’t words. They’re something more specific, and the specificity matters more than most people realize.

> Those numbers are real, but they hide what a token actually is.

> GPT-4’s vocabulary isn’t Claude’s. Claude’s isn’t Llama’s.

> The model never sees text. It sees a sequence of integer indices into its own private alphabet.

> So tokens aren’t “roughly like words” or “kind of like characters”. They’re the atoms of perception for one specific model, and they’re the only language that model speaks.

> The same sentence is nine tokens to GPT-4 and seven tokens to Llama 3. Not because Llama is smarter or the sentence changed, but because the two models have different vocabularies.

> That’s it. No clever scoring, no neural network.

Could people who use LLMs to write articles at least prompt them for a better style? I'm really tired of the default Claude style (a lot of Chinese models reuse the same style).

s1monb•1h ago
I appreciate the feedback. My main focus was on the visual elements, and not so much "ridding the text of AI-traces".

What did you think about the more visual elements?

Simon

s1monb•58m ago
I will do better and link to the research and related sources in the next iteration.

Tiberium•56m ago
I was just pointing out how the article is clearly LLM-written, probably including the interactive widgets. It's especially obvious because someone writing such an article in 2026 would at least find what the newest tokenizers are, instead of mentioning LLaMA 2/3 (!), and GPT's old tokenizer that they have not used since GPT-4o (or something close).

And, more obviously, the fact that GPT-4 is being directly named even though that model is over 3 years old by now: "Ask GPT-4, Claude, or Gemini today and they will usually answer three.".

Sorry, I just think that the article wasn't produced by a human at all.

s1monb•36m ago
> It's especially obvious because someone writing such an article in 2026 would at least find what the newest tokenizers are

The underlying BPE algorithm, which is the main focus of this article, is the one used by modern tokenizers today.

> The fact that GPT-4 is being directly named even though that model is over 3 years old by now

That is fair. Will be updated

> Sorry, I just think that the article wasn't produced by a human at all.

While I have used an LLM to help me write and explain my content, my hope is that most readers do not share this opinion of yours. Not everything touched by AI is slop, and I wanted to share the notes I created for myself.
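
For anyone following the BPE point in this thread, here is a minimal character-level sketch of the byte-pair-encoding merge loop. It assumes whitespace-split words and no byte-level handling or pre-tokenization rules, so it is a toy and not how any particular model's tokenizer is implemented. The names `most_frequent_pair`, `merge_pair`, `train_bpe` and `encode` are hypothetical and do not come from the article or from any library.

```python
from collections import Counter


def most_frequent_pair(words):
    """Return the most frequent adjacent symbol pair, weighted by word count."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def train_bpe(corpus, num_merges):
    """Learn an ordered list of merges from a whitespace-separated corpus."""
    words = dict(Counter(tuple(w) for w in corpus.split()))
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:  # nothing left to merge
            break
        words = merge_pair(words, pair)
        merges.append(pair)
    return merges


def encode(word, merges):
    """Apply the learned merges, in training order, to a single word."""
    symbols = tuple(word)
    for pair in merges:
        symbols = next(iter(merge_pair({symbols: 1}, pair)))
    return list(symbols)


if __name__ == "__main__":
    merges = train_bpe("low low low lower lowest", num_merges=2)
    print(merges)
    print(encode("lowest", merges))
```

With two merges on this tiny corpus, `l`, `o` and `w` get fused into one `low` symbol, so `encode("lowest", merges)` yields `['low', 'e', 's', 't']`. A different corpus or merge count gives a different vocabulary, which is one reason the same sentence can come out as a different number of tokens for different models.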