
Brookhaven Lab's RHIC Concludes 25-Year Run with Final Collisions

https://www.hpcwire.com/off-the-wire/brookhaven-labs-rhic-concludes-25-year-run-with-final-collis...
20•gnufx•1h ago•2 comments

I Write Games in C (yes, C)

https://jonathanwhiting.com/writing/blog/games_in_c/
90•valyala•3h ago•62 comments

SectorC: A C Compiler in 512 bytes

https://xorvoid.com/sectorc.html
49•valyala•3h ago•10 comments

The AI boom is causing shortages everywhere else

https://www.washingtonpost.com/technology/2026/02/07/ai-spending-economy-shortages/
165•1vuio0pswjnm7•9h ago•211 comments

Hoot: Scheme on WebAssembly

https://www.spritely.institute/hoot/
136•AlexeyBrin•8h ago•25 comments

We have broken SHA-1 in practice

https://shattered.io/
6•mooreds•25m ago•2 comments

Stories from 25 Years of Software Development

https://susam.net/twenty-five-years-of-computing.html
81•vinhnx•6h ago•10 comments

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
843•klaussilveira•23h ago•252 comments

Al Lowe on model trains, funny deaths and working with Disney

https://spillhistorie.no/2026/02/06/interview-with-sierra-veteran-al-lowe/
58•thelok•5h ago•8 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
1075•xnx•1d ago•615 comments

The F Word

http://muratbuffalo.blogspot.com/2026/02/friction.html
10•zdw•3d ago•0 comments

We Mourn Our Craft

https://nolanlawson.com/2026/02/07/we-mourn-our-craft/
286•ColinWright•2h ago•333 comments

Reinforcement Learning from Human Feedback

https://rlhfbook.com/
88•onurkanbkrc•8h ago•5 comments

Start all of your commands with a comma (2009)

https://rhodesmill.org/brandon/2009/commands-with-comma/
508•theblazehen•3d ago•187 comments

Microsoft Account bugs locked me out of Notepad – are Thin Clients ruining PCs?

https://www.windowscentral.com/microsoft/windows-11/windows-locked-me-out-of-notepad-is-the-thin-...
30•josephcsible•1h ago•22 comments

Vocal Guide – belt sing without killing yourself

https://jesperordrup.github.io/vocal-guide/
222•jesperordrup•13h ago•80 comments

U.S. Jobs Disappear at Fastest January Pace Since Great Recession

https://www.forbes.com/sites/mikestunson/2026/02/05/us-jobs-disappear-at-fastest-january-pace-sin...
228•alephnerd•3h ago•177 comments

Show HN: I saw this cool navigation reveal, so I made a simple HTML+CSS version

https://github.com/Momciloo/fun-with-clip-path
20•momciloo•3h ago•2 comments

Selection Rather Than Prediction

https://voratiq.com/blog/selection-rather-than-prediction/
11•languid-photic•3d ago•3 comments

72M Points of Interest

https://tech.marksblogg.com/overture-places-pois.html
34•marklit•5d ago•5 comments

Coding agents have replaced every framework I used

https://blog.alaindichiappari.dev/p/software-engineering-is-back
242•alainrk•7h ago•385 comments

France's homegrown open source online office suite

https://github.com/suitenumerique
592•nar001•7h ago•263 comments

A Fresh Look at IBM 3270 Information Display System

https://www.rs-online.com/designspark/a-fresh-look-at-ibm-3270-information-display-system
42•rbanffy•4d ago•8 comments

History and Timeline of the Proco Rat Pedal (2021)

https://web.archive.org/web/20211030011207/https://thejhsshow.com/articles/history-and-timeline-o...
20•brudgers•5d ago•4 comments

Unseen Footage of Atari Battlezone Arcade Cabinet Production

https://arcadeblogger.com/2026/02/02/unseen-footage-of-atari-battlezone-cabinet-production/
119•videotopia•4d ago•36 comments

Where did all the starships go?

https://www.datawrapper.de/blog/science-fiction-decline
87•speckx•4d ago•97 comments

Learning from context is harder than we thought

https://hy.tencent.com/research/100025?langVersion=en
205•limoce•4d ago•112 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
282•isitcontent•23h ago•38 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
292•dmpetrov•23h ago•156 comments

Show HN: Kappal – CLI to Run Docker Compose YML on Kubernetes for Local Dev

https://github.com/sandys/kappal
25•sandGorgon•2d ago•13 comments

Speed up responses with fast mode

https://code.claude.com/docs/en/fast-mode
33•surprisetalk•2h ago

Comments

thehamkercat•2h ago
Interesting. The output price per MTok is insane.
1123581321•1h ago
Could be a use for the $50 extra usage credit. It requires extra usage to be enabled.

> Fast mode usage is billed directly to extra usage, even if you have remaining usage on your plan. This means fast mode tokens do not count against your plan’s included usage and are charged at the fast mode rate from the first token.

minimaxir•1h ago
After exceeding the ever-shrinking session limit with Opus 4.6, I continued with extra usage for only a few minutes and it consumed about $10 of the credit.

I can't imagine how quickly this Fast Mode goes through credit.

arcanemachiner•59m ago
It has to be. The timing is just too close.
simonw•1h ago
The one question I have that isn't answered by the page is how much faster?

Obviously they can't make promises but I'd still like a rough indication of how much this might improve the speed of responses.

l1n•1h ago
2.5x faster or so (https://x.com/claudeai/status/2020207322124132504).
zurfer•1h ago
6x more expensive
scosman•1h ago
Yeah, is this Cerebras/Groq speed, or do I just skip the queue?
krm01•1h ago
Will this mean that when cost matters more than latency, replies will now take longer?

I’m not in favor of the ad model ChatGPT proposes, but business models like this suffer from similar traps.

If it works for them, the logical next step is to convert more users to fast mode, which naturally means slowing things down for those who didn’t pick/pay for it.

We’ve seen it with iPhones being slowed down to make the newer model seem faster.

Not saying it’ll happen. I love Claude. But these business models almost always invite dark patterns to move the bottom line.

speedping•1h ago
> $30 / $150 per MTok

Umm, no thank you
pedropaulovc•1h ago
Where is this perf gain coming from? Running on TPUs?
pronik•1h ago
While it's an excellent way to make more money in the moment, I think this might become a standard no-extra-cost feature within several months (see Opus becoming way cheaper and the default model within months). Mental load management while using agents will become even more important, it seems.
giancarlostoro•1h ago
Yeah especially once they make an even faster fast mode.
Nition•1h ago
Note that you can't use this mode to get the most out of a subscription - they say it's always charged as extra usage:

> Fast mode usage is billed directly to extra usage, even if you have remaining usage on your plan. This means fast mode tokens do not count against your plan’s included usage and are charged at the fast mode rate from the first token.

Although if you visit the Usage screen right now, there's a deal you can claim for $50 free extra usage this month.

IMTDb•1h ago
I’m curious what’s behind the speed improvements. It seems unlikely it’s just prioritization, so what else is changing? Is it new hardware (à la Groq or Cerebras)? That seems plausible, especially since it isn’t available on some cloud providers.

Also wondering whether we’ll soon see separate “speed” vs “cleverness” pricing on other LLM providers too.

pshirshov•1h ago
> so what else is changing?

Let me guess. Quantization?

sothatsit•1h ago
There are a lot of knobs they could tweak. Newer hardware and traffic prioritisation would both make a lot of sense. But they could also lower batching windows to decrease queueing time at the cost of lower throughput, or keep the KV cache in GPU memory at the expense of reducing the number of users they can serve from each GPU node.
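One of those knobs is easy to put numbers on. A minimal sketch of the batching-window tradeoff (my own toy model, not anything Anthropic has published): with Poisson arrivals at rate λ, a scheduler that collects requests for a window W launches batches of about λW requests, and each request waits W/2 on average before its batch starts. The arrival rate below is an assumption.

```python
# Toy model of the batching-window knob (my own sketch, not anything
# from Anthropic): with Poisson arrivals at rate lam, a window of W
# seconds collects ~lam*W requests per batch, and a request waits
# W/2 seconds on average before its batch launches.

lam = 200.0  # assumed arrival rate, requests/sec across one pool

for window_ms in (5, 20, 50, 100):
    window_s = window_ms / 1000.0
    mean_batch = lam * window_s       # E[batch size] = lam * W
    mean_wait_ms = window_ms / 2.0    # E[wait before launch] = W / 2
    print(f"window={window_ms:>3} ms  mean batch={mean_batch:5.1f}  "
          f"mean pre-batch wait={mean_wait_ms:5.1f} ms")
```

Shrinking the window cuts queueing delay but shrinks batches, and with them per-GPU throughput.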
Nition•1h ago
I wonder if they might have mostly implemented this for themselves to use internally, and it is just prioritization but they don't expect too many others to pay the high cost.
sothatsit•1h ago
Roon said as much here [0]:

> codex-5.2 is really amazing but using it from my personal and not work account over the weekend taught me some user empathy lol it’s a bit slow

[0] https://nitter.net/tszzl/status/2016338961040548123

jstummbillig•1h ago
> It seems unlikely it’s just prioritization

Why does this seem unlikely? I have no doubt they are optimizing all the time, including inference speed, but why could this particular lever not entirely be driven by skipping the queue? It's an easy way to generate more money.
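For what the prioritization theory would look like mechanically, here is a minimal sketch: one scheduler, two tiers, with fast-tier requests always served first and FIFO ordering within a tier. The tier names and request shape are invented for illustration.

```python
import heapq
import itertools

# Minimal "skip the queue" sketch: one scheduler, two tiers, FIFO
# within a tier. Tier names and the request shape are invented.

FAST, STANDARD = 0, 1          # lower number = served first
_order = itertools.count()     # tie-breaker preserves FIFO per tier
_queue: list[tuple[int, int, str]] = []

def submit(prompt: str, tier: int) -> None:
    heapq.heappush(_queue, (tier, next(_order), prompt))

def next_request() -> str:
    tier, _, prompt = heapq.heappop(_queue)
    return prompt

submit("standard job A", STANDARD)
submit("standard job B", STANDARD)
submit("fast job C", FAST)     # arrives last, served first

print([next_request() for _ in range(3)])
# ['fast job C', 'standard job A', 'standard job B']
```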

singpolyma3•1h ago
Until everyone buys it. Like a fast pass at an amusement park where the fast line is still two hours long.
servercobra•1h ago
It's a good way to squeeze extra out of a bunch of people without actually raising prices.
sothatsit•1h ago
At 6x the cost, and with it requiring full API pricing, I don’t think this is going to be a concern.
kingstnap•22m ago
It comes from batching and multiplexing streams on a GPU. More people sharing one GPU makes everyone run slower but increases overall token throughput.

Mathematically it comes from the fact that the transformer block is a parallel algorithm. If you batch harder, increasing parallelism, you get more total throughput but fewer tokens/s per stream; batch less and each stream speeds up at the cost of throughput. There is also a dial where you can speculatively decode harder when serving fewer users.

It's true for basically all hardware and most models. You can draw a Pareto curve of throughput per GPU vs. tokens per second per stream: more tokens/s per stream, less total throughput.

See this graph for actual numbers:

Token Throughput per GPU vs. Interactivity (gpt-oss 120B • FP4 • 1K/8K). Source: SemiAnalysis InferenceMAX™

https://inferencemax.semianalysis.com/
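A toy version of that Pareto curve (the constants below are invented, not SemiAnalysis numbers): assume one decode step over a batch of b sequences costs a fixed time plus a small per-sequence time. Per-stream speed is 1/t(b) and GPU throughput is b/t(b), so throughput climbs as interactivity falls.

```python
# Toy Pareto curve (constants invented, not SemiAnalysis data):
# one decode step over a batch of b sequences takes
#   t(b) = t0 + k*b   seconds
# so per-stream speed is 1/t(b) tok/s and GPU throughput is b/t(b).

t0 = 0.020   # assumed fixed cost per decode step, seconds
k = 0.0005   # assumed marginal cost per extra sequence, seconds

for b in (1, 8, 32, 128, 512):
    t = t0 + k * b
    print(f"batch={b:>4}  per-stream={1.0 / t:6.1f} tok/s  "
          f"GPU total={b / t:8.1f} tok/s")
```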

solidasparagus•1h ago
I pay $200 a month and don't get any included access to this? Ridiculous
pedropaulovc•1h ago
Well, you can burn your $50 bonus on it
bakugo•1h ago
The API price is 6x that of normal Opus, so look forward to a new $1200/mo subscription that gives you the same amount of usage if you need the extra speed.
MuffinFlavored•1h ago
I always wondered this: does the math really come out that bad? 6x?

Is the writing on the wall for $100-$200/mo users that current pricing is basically subsidized, and $400/mo+ is coming sooner than we think?

Are they getting us all hooked and then going to raise prices, or will inference costs come down to offset?
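On the 6x math itself, a back-of-envelope using the prices quoted in this thread ($30/$150 per MTok for fast mode, and the 6x multiple, which implies roughly $5/$25 for regular Opus); the session token counts are invented for illustration:

```python
# Back-of-envelope using prices quoted in this thread: $30/$150 per
# MTok for fast mode; the 6x multiple implies ~$5/$25 for regular
# Opus. Session token counts below are invented for illustration.

MTOK = 1_000_000
session_in, session_out = 2_000_000, 150_000  # assumed usage

def cost(tok_in, tok_out, price_in, price_out):
    return tok_in / MTOK * price_in + tok_out / MTOK * price_out

regular = cost(session_in, session_out, 5, 25)
fast = cost(session_in, session_out, 30, 150)
print(f"regular ${regular:.2f}  fast ${fast:.2f}  ratio {fast / regular:.1f}x")
# regular $13.75  fast $82.50  ratio 6.0x
```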

kingforaday•1h ago
But it says "Available to all Claude Code users on subscription plans (Pro/Max/Team/Enterprise) and Claude Console."

Is this wrong?

behindsight•1h ago
It's explicitly called out as excluded in the blue info bubble they have there.

> Fast mode usage is billed directly to extra usage, even if you have remaining usage on your plan. This means fast mode tokens do not count against your plan’s included usage and are charged at the fast mode rate from the first token.

https://code.claude.com/docs/en/fast-mode#requirements

sothatsit•58m ago
I think this is just worded in a misleading way. It’s available to all users, but it’s not included as part of the plan.
hmokiguess•1h ago
Give me a slow mode that’s cheaper instead lol
jhack•1h ago
The pricing on this is absolutely nuts.
nick49488171•1h ago
For us mere mortals, how fast does a normal developer go through a MTok? How about a good power user?
clbrmbr•1h ago
I’d love to hear from engineers who find the extra speed a big unlock for them.

The deadline piece is really interesting. I suppose there are a lot of people now who are basically limited by how fast their agents can run, and on very aggressive timelines with funders breathing down their necks?

sothatsit•1h ago
If it could help you avoid needing to context switch between multiple agents, that could be a big mental load win.
maz1b•1h ago
AFAIK, they don't have any deals or partnerships with Groq or Cerebras or any of those kinds of companies, so how did they do this?
hendersoon•1h ago
Could well be running on Google TPUs.
tcdent•58m ago
Inference is run on shared hardware already, so they're not giving you the full bandwidth of the system by default. This most likely just allocates more resources to your request.
esafak•59m ago
It's a good way to address the price insensitive segment. As long as they don't slow down the rest, good move.
paxys•57m ago
Looking at the "Decide when to use fast mode" section, it seems the future they want is:

- Long running autonomous agents and background tasks use regular processing.

- "Human in the loop" scenarios use fast mode.

Which makes perfect sense, but the question is - does the billing also make sense?

l5870uoo9y•54m ago
It doesn’t say how much faster it is, but my experience with OpenAI’s “service_tier=priority” option on SQLAI.ai is that it’s about twice as fast.
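For reference, that OpenAI option is a per-request parameter. A minimal sketch (the model name and prompt are placeholders; check OpenAI's current docs for the supported tiers):

```python
# Hedged sketch of OpenAI's priority option the parent mentions.
# service_tier is a real Chat Completions parameter; the model name
# and prompt here are placeholders, so check current docs before
# relying on this.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.chat.completions.create(
    model="gpt-4.1",                    # placeholder model
    messages=[{"role": "user", "content": "Write a SQL query."}],
    service_tier="priority",            # pay more, get faster processing
)
print(resp.service_tier)                # tier the request actually ran on
print(resp.choices[0].message.content)
```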
simianwords•38m ago
Whatever optimisation is going on is at the hardware level, since the fast option persists across a session.