frontpage.

India told millions to get degrees. Now even peon jobs are out of reach

https://www.indiatoday.in/jobs/story/graduate-unemployment-in-india-what-sc-ruling-reveals-about-...
1•rustoo•16s ago•0 comments

Show HN: VPets.net – a cozy pixel pet world

https://vpets.net/start
1•solidarnosc•9m ago•1 comments

Nemotron 3 Ultra is open weight and open data [video]

https://www.youtube.com/watch?v=D8LIIvQVGS4
1•TheJCDenton•10m ago•0 comments

The First AI QFT Textbook

https://www.math.columbia.edu/~woit/wordpress/?p=15735
1•jjgreen•14m ago•0 comments

Data centers consumed 264B gallons of water as drought hits nearly 63% of US

https://www.barchart.com/story/news/2339834/ai-data-centers-water-consumption-breaks-264-billion-...
7•yogthos•14m ago•2 comments

Compression and Intelligence 3blue1brown [video]

https://www.youtube.com/watch?v=l6DKRf-fAAM
2•2bird3•15m ago•0 comments

Painting that made Turner's name gets second public showing since 1799

https://www.thetimes.com/culture/art/article/painting-turner-abergavenny-bridge-rcvx8hglh
1•bookofjoe•15m ago•1 comments

Investing Is Compression

https://arxiv.org/abs/2604.10758
1•lisper•16m ago•0 comments

The Zebra v4.4.1 Chronicles: Independent Audit

https://github.com/Alex74SjS3/THE-ZCASH-ZEBRA-v4.4.1-CHRONICLES
1•Alex74-SjS3•20m ago•0 comments

Spyro the Dragon returns with a new game after almost two decades

https://www.theguardian.com/games/2026/jun/07/spyro-the-dragon-returns-with-a-new-game-after-almo...
2•TechTechTech•23m ago•0 comments

Thoughts on starting new projects with LLM agents

https://eli.thegreenplace.net/2026/thoughts-on-starting-new-projects-with-llm-agents/
2•zdw•24m ago•0 comments

VibeOS: First ever AI-native operating system

https://vibeos.sh/
1•doener•26m ago•0 comments

Flock Safety Price List [pdf]

https://www.omniapartners.com/suppliers-files/E-J/Flock_Safety/Contract_Documents/R250203/5_29_20...
2•ourmandave•28m ago•0 comments

A Portrait of the Software Engineer, 2031

https://jamesjboyer.substack.com/p/a-portrait-of-the-software-engineer
1•aesthetics1•28m ago•0 comments

Ask HN: Is Facebook registration procedure broken?

2•stefanos82•28m ago•0 comments

I built a sentiment analyzer for Hacker News (as an MCP server)

https://mcpize.com/mcp/sentiment-analyzer
1•Lord_Dontavious•29m ago•0 comments

VibeOS – Hallucinated Operating System [video]

https://www.youtube.com/watch?v=z3pV6FHvcgM
2•doener•31m ago•0 comments

Academics set out vision for planetary survival

https://www.theguardian.com/environment/2026/jun/04/world-inequality-lab-equality-academics-plane...
4•worik•34m ago•0 comments

The future is controlled by companies who control the physical bottlenecks of AI

https://silicon-frontier.com/research/silicon-control
1•momentmaker•34m ago•0 comments

Why are there so many canines in fine art?

https://www.theatlantic.com/magazine/2026/07/the-dogs-gaze-thomas-w-laqueur/687312/
1•prismatic•34m ago•0 comments

Got a job, dropped this for 3 months – MaskOps, Polars PII masking in Rust

https://github.com/fcarvajalbrown/MaskOps
1•fcarvajalbrown•36m ago•0 comments

1D Image Tokenizers and Autoregressive Models for Dynamic Resolution Generations

https://arxiv.org/abs/2604.24885
1•PaulHoule•38m ago•0 comments

Expert Selections in MoE Transformer Models Reveal Almost as Much as Text

https://arxiv.org/abs/2602.04105
4•busserweiser•39m ago•0 comments

Small modular nuclear reactor reaches criticality in first test

https://arstechnica.com/science/2026/06/first-us-test-of-modular-reactor-reaches-criticality/
1•NedCode•39m ago•0 comments

NEC PC Engine LT Recap and LCD Bias Fix (Necromancy)

https://hitmanmcc.com/entry/pc-engine-lt-necromancy
1•zdw•41m ago•0 comments

The spelling error made 200B times a day (2025)

https://nbailey.ca/post/spelling-error/
2•NaOH•42m ago•0 comments

The US Only Has One Political Party [video]

https://www.youtube.com/watch?v=GUVf6DkDkgA
2•joe_mamba•43m ago•0 comments

Show HN: Claude Code on Slack/Discord/Telegram for flat $20/mo – no API bills

https://lobsteady.com
1•jvalansi•44m ago•0 comments

How much do amd64 microarchitecture levels help in Go?

https://lemire.me/blog/2026/06/06/how-much-do-amd64-microarchitecture-levels-help-in-go/
1•zdw•45m ago•0 comments

Why add an agent skill to a CLI that has a context command?

https://www.andreagrandi.it/posts/why-add-agent-skill-cli-context-command/
2•andreagrandi•50m ago•0 comments

What Are Tokens in LLMs?

https://bearisland.dev/posts/tokens-and-tokenization/
9•s1monb•1h ago

Comments

Tiberium•45m ago
The article comes from the "personal" experience of an LLM so it's a very trusted source!

/s

Tiberium•40m ago
> This isn’t because the model can’t count. It’s because it never sees the letters at all.

> The chunks aren’t characters and they aren’t words. They’re something more specific, and the specificity matters more than most people realize.

> Those numbers are real, but they hide what a token actually is.

> GPT-4’s vocabulary isn’t Claude’s. Claude’s isn’t Llama’s.

> The model never sees text. It sees a sequence of integer indices into its own private alphabet.

> So tokens aren’t “roughly like words” or “kind of like characters”. They’re the atoms of perception for one specific model, and they’re the only language that model speaks.

> The same sentence is nine tokens to GPT-4 and seven tokens to Llama 3. Not because Llama is smarter or the sentence changed, but because the two models have different vocabularies.

> That’s it. No clever scoring, no neural network.

Could people who use LLMs to write articles at least prompt them to have a better style? I'm really tired of the default Claude style (a lot of Chinese models also reuse the same style).

s1monb•37m ago
I appreciate the feedback. My main focus was on the visual elements, and not so much "ridding the text of AI-traces".

What did you think about the more visual elements?

Simon

s1monb•32m ago
I will do better and link to the research and related sources in the next iteration.

Tiberium•29m ago
I was just pointing out how the article is clearly LLM written, probably including the interactive widgets. It's especially obvious because someone writing such an article in 2026 would at least find what the newest tokenizers are, instead of mentioning LLaMA 2/3 (!), and GPT's old tokenizer that they dropped since GPT-4o (or something close).

And, more obviously, the fact that GPT-4 is being directly named even though that model is over 3 years old by now: "Ask GPT-4, Claude, or Gemini today and they will usually answer three.".

Sorry, I just think that the article wasn't produced by a human at all.

s1monb•9m ago
> It's especially obvious because someone writing such an article in 2026 would at least find what the newest tokenizers are

The underlying BPE algorithm, which is the main focus of this article, is the one used by modern tokenizers today.
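
For anyone who has not seen it, here is a minimal sketch of BPE training on a toy corpus. It is an illustration only, not the article's code and not any production tokenizer: the names `train_bpe`, `merge_pair`, and `encode` are made up for this example, and it works on characters of whole words, whereas real tokenizers typically operate on bytes with extra pre-tokenization rules.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def train_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules by repeatedly merging the most frequent pair."""
    # Each word starts as a tuple of single characters, weighted by how often it occurs.
    corpus = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs across the corpus.
        pairs = Counter()
        for symbols, freq in corpus.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite the corpus with the chosen pair merged.
        new_corpus = Counter()
        for symbols, freq in corpus.items():
            new_corpus[merge_pair(symbols, best)] += freq
        corpus = new_corpus
    return merges


def encode(word, merges):
    """Tokenize a word by applying the learned merges in the order they were learned."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


merges = train_bpe(["hug", "hug", "hug", "pug", "hugs"], num_merges=2)
print(merges)                   # [('u', 'g'), ('h', 'ug')]
print(encode("hugs", merges))   # ['hug', 's']
print(encode("bug", merges))    # ['b', 'ug']
```

Two different training corpora would produce two different merge lists, which is why the same sentence can split into a different number of tokens under different vocabularies.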

> The fact that GPT-4 is being directly named even though that model is over 3 years old by now

That is fair. Will be updated

> Sorry, I just think that the article wasn't produced by a human at all.

While I have used an LLM to help me write and explain my content, my hope is that most readers do not share this opinion of yours. Not everything touched by AI is slop, and I wanted to share the notes I created for myself.