
Will supercapacitors come to AI's rescue?

https://spectrum.ieee.org/supercapacitor-2671883490
51•mfiguiere•9mo ago

Comments

Animats•9mo ago
Is that kind of load variation from large data centers really a problem to the power grid? There are much worse intermittent loads, such as an electric furnace or a rolling mill.
paulkrush•9mo ago
Edit: It's interesting that the GPUs are causing issues on the grid before they cause issues with the data center's own power distribution.
mystified5016•9mo ago
Read the article.
toast0•9mo ago
I suspect it's more of a problem for the data center's energy bill. My understanding is that large electric customers pay a demand charge in addition to the volumetric charge for the kWh used at whatever time-of-use or wholesale rates apply. The demand charge is based on the maximum kW used (or sometimes just the connection size) and may also have a penalty rate if the power factor is poor. Smoothing out short-duration surges probably makes a lot of things nicer for the rate payer, including helping manage fluctuations from the utility.

There's probably something that could be done on the individual systems so that they don't modulate power use quite so fast, too; at some latency cost, of course. If you go all the way to the extremes, you might add a zero crossing detector and use it to time clock speed increases.
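The demand-charge point is easy to make concrete. A minimal sketch with invented rates (`energy_rate` and `demand_rate` are illustrative, not any real tariff): two load profiles that use identical energy but have different peaks get very different bills.

```python
def monthly_bill(load_kw, interval_hours=0.25,
                 energy_rate=0.08, demand_rate=15.0):
    """Energy charge on total kWh plus a demand charge on the peak kW."""
    energy_kwh = sum(kw * interval_hours for kw in load_kw)
    peak_kw = max(load_kw)
    return energy_kwh * energy_rate + peak_kw * demand_rate

spiky  = [100, 900, 100, 900]   # same total energy, high peak
smooth = [500, 500, 500, 500]   # same total energy, low peak
# Smoothing the peaks cuts the demand charge; the energy charge is unchanged.
```

Under these made-up rates the spiky profile pays roughly 80% more, entirely from the peak-kW term.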

timewizard•9mo ago
If you have a working thermometer you can predict when furnaces are going to run.

If you want to smooth out data centers then you need hourly pricing to force them to manage their demand into periods where excess grid capacity is not being used to serve residential loads.

hinkley•9mo ago
Large customers pay not by wattage but by… I’m spacing on the word but essentially how much their power draw fucks up the sine waves for voltage and current in the power grid.

I imagine common power rail systems in hyperscaler equipment help a bit with this, but for sure switching PSUs chop up the input voltage and smooth it out. And that leads to very strange power draws.

murderfs•9mo ago
You're probably thinking of power factor, which is usually not a big deal for datacenters. All of your power supplies are going to have active PFC, and anything behind a double conversion UPS is going to get PFC from the UPS. The biggest contributors are probably going to be the fans in the air conditioning units.
Animats•9mo ago
This isn't about power factor. That's a current vs. voltage thing within one cycle. It's about demand side ramp rate - how fast load can go up and down.

Ramp rate has been a generation side thing for a century. Every morning, load increases from the pre-dawn low, and which generators can ramp up output at what speed matters. Ramp rate is usually measured in megawatts/minute. Big thermal plants, atomic and coal, have the lowest ramp rates, a few percent per minute.

Ramp rate demand side, though, is a new thing. There are discussions about it [1] but it's not currently something that's a parameter in electric energy bills.

[1] https://www.aceee.org/files/proceedings/2012/data/papers/019...
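The ramp-rate framing above can be sketched numerically (the figures below are invented for illustration): take a per-second load trace from a hypothetical GPU cluster and scale its worst swing to the MW-per-minute units that generators are rated in.

```python
def max_ramp_mw_per_min(load_mw, step_seconds):
    """Largest step-to-step change in a load trace, scaled to MW/minute."""
    steps_per_min = 60 / step_seconds
    return max(abs(b - a) for a, b in zip(load_mw, load_mw[1:])) * steps_per_min

# Hypothetical cluster swinging 50 MW within one second between compute
# bursts and all-reduce waits: a 3000 MW/min ramp, orders of magnitude
# beyond the few-percent-per-minute ramp rate of a big thermal plant.
gpu_trace = [10, 60, 10, 60]   # MW, sampled every second
```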

hinkley•9mo ago
For factories it’s both, but if by “this” you mean the article, I agree. Bursty traffic is the problem they’re most trying to solve.

It’s akin (but other side of the coin) to when NAS hardware learned to stagger drive spool up during power on to avoid brownouts. Startup current on spinning platters is ridiculous. You have to do something to buffer, and you can do it on the supply or demand side. Or for large problems, both.

quickthrowman•9mo ago
You’re thinking of power factor. My utility will force companies to use power factor correction equipment (a capacitor bank) if their power factor is too low.

The data center issue is not related to power factor.

oakwhiz•9mo ago
Power factor, but most power supplies have a good amount of power factor correction on them now, and datacenters can have PFC as an independent system and/or inside some types/modes of UPSes. Oddly enough the capacitive power factor of power supplies cancels out with some of the inductive loads of supporting mechanical equipment.
oakwhiz•9mo ago
There is often a demand flux surcharge as well. Not just demand but delta in demand over some time period.
changoplatanero•9mo ago
Yes, it's a problem for the grid, and the power companies don't allow large clusters to oscillate their power like this. The workaround during big training runs is to fill in the idle time on the GPUs with dummy operations to keep the power load constant. Capacitors would let them save that wasted power.
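The dummy-operation trick can be sketched in a few lines. This is a CPU stand-in for illustration only; a real cluster would issue throwaway matrix multiplies on the GPU rather than the meaningless arithmetic below.

```python
import time

def busy_wait(duration_s):
    """Burn CPU with dummy work for duration_s so the power draw stays level.

    Stand-in for launching throwaway GPU kernels during communication gaps.
    """
    end = time.monotonic() + duration_s
    x = 1.0
    while time.monotonic() < end:
        x = x * 1.0000001 + 1e-9  # pointless arithmetic, steady load
    return x

def training_loop(steps, compute_s, comm_s):
    """Compute burst, then fill the network-sync gap with dummy work."""
    for _ in range(steps):
        time.sleep(compute_s)   # stand-in for the real forward/backward pass
        busy_wait(comm_s)       # instead of idling while gradients sync
```

The grid sees one flat load instead of a square wave; the cost is that the "idle" half of every step still burns full power.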
nancyminusone•9mo ago
Inb4 a startup is created to sell power load idle cycle compute time in AI training data centers.
mystified5016•9mo ago
Those loads aren't nearly as intermittent. Your furnace likely runs for tens of minutes at a time. These datacenters are looking at second-to-second loads.

Drawing high intermittent loads at high frequency likely makes the utility upset and leads to over-building supply to the customer to cope with peak load. If you can shave down those peaks, you can use a smaller(cheaper) supply connection. A smoother load will also make the utility happy.

Remember that electricity generation cannot ramp up and down quickly. Big transient loads can cause a lot of problems through the whole network.

paulkrush•9mo ago
"Thousands of GPUs all linked together turning on and off at the same time." So supercapacitors allow for simpler software? Reduced latency? At a low cost?
mjevans•9mo ago
They serve as 'spot demand moderation', an extension of UPS and power smoothing. In this case it's flattening spikes out into smooth slopes.
sonium•9mo ago
Or you simply use the pytorch.powerplant_no_blow_up operator [1]

[1] https://www.youtube.com/watch?v=vXsT6lBf0X4

janalsncm•9mo ago
Pretty much. From the article:

> Another solution is dummy calculations, which run while there are no spikes, to smooth out demand.

metaphor•9mo ago
Paraphrasing this[1]?

[1] https://github.com/pytorch/pytorch/pull/132936/files#diff-98...

0cf8612b2e1e•9mo ago

  One solution is to rely on backup power supplies and batteries to charge and discharge, providing extra power quickly. However, much like a phone battery degrades after multiple recharge cycles, lithium-ion batteries degrade quickly when charging and discharging at this high rate.
Is this really a problem for an industrial installation? I would imagine that a properly sized facility would have adequate cooling + capacity to only run the batteries within optimal spec. Solar plants are already charging/discharging their batteries daily.
jeffbee•9mo ago
In addition to what you said, nothing is forcing or even encouraging anyone to use lithium-ion batteries in fixed service, such as a rack full of computers.
pixl97•9mo ago
Eh, I think part of the problem here is the speed of load switching. From the article it looks like the loads could generate dozens to hundreds of demand spikes per minute. With most battery operated loads that I've ever messed with we're not switching loads like that. It's typically 'oh a fault, switch to battery' then some time later you check the power circuit to see if it's up and switch back.

This looks a whole lot more like high-frequency load smoothing. Really it seems to me like an extension of what a motherboard already does: even if you have a battery backup on your PC, you still have capacitors on the board for voltage fluctuations.

lstodd•9mo ago
in a properly designed install you can actually use the compressors and fans for smoothing load spikes. won't be much, but why not.

edit: otherwise I'm not getting what the entire article is about. it's as contrary to what I know about datacenter design as it can get.

it's.. just wrong.

touisteur•9mo ago
I'm thinking of sequences of 'put the sharded dataset through the ten thousand 2 kW GPUs, then wait on network (all-reduce), then spike again' - a mostly-synchronous all-on/all-off loop. Watching how quickly they reach boost frequency, I can see where the worries come from.
lstodd•9mo ago
does anyone actually do those kind of loads over entire dcs?

because if so, I have some nice east-european guys to teach them proper load-balancing.

0cf8612b2e1e•9mo ago
Wouldn’t that situation arise when a company is training their top end model? Facebook/Google/DeepSeek probably trained on thousands of collocated GPUs. The bigger the cluster, the bigger the sync delays between batches as the model data gets shunted back and forth.
lstodd•9mo ago
basically it's a load-spreading problem. this can be mitigated entirely by the control plane alone. unless you think they really do a cluster-per-datacenter, and in that case I cannot believe the DCs in question were not designed for both peak and transient loads, in both AC and power supply. Besides, it would be stupid.
jwatte•9mo ago
Yes, the data center was designed for peak load. Unfortunately, the data center is hooked up to the grid. De-coupling that connection is what this article is literally about.

Then again, some data centers just use re-generation from giant flywheels, where the grid powers the flywheel to build up inertia, and thus load can be smoothed that way. The flywheels need to keep running at 60 Hz for at least 30 seconds to give the Diesel generators time to start, should the grid fail.

Run a flywheel at the main transformer, run batteries at your PDC and add supercaps at each point of load, and you may very well be able to show a much smoother load to the grid.

jwatte•9mo ago
Yes, they do, and no, they don't want to wait longer for the result just because you think they shouldn't.
amelius•9mo ago
Maybe a superconducting superinductor would be a better fit.
lstodd•9mo ago
that would be a blackhole bomb.
janalsncm•9mo ago
I am curious about what the load curves look like in these clusters. If the “networking gap” is long enough you might just be able to have a secondary workload that trains intermittently.

Slightly related, you can actually hear this effect depending on your GPU. It’s called coil whine. When your GPU is doing calculations, it draws more power and whines. Depending on your training setup, you can hear when it’s working. In other words, you want it whining all the time.

touisteur•9mo ago
You might need more memory for this secondary training workload. But yeah, donating/selling the 'network' time for high-intensity, low memory footprint workloads (thinking number crunching, monte-carlo stuff, maybe brute-force through a series of problems...) might end up making sense.
hulitu•9mo ago
> Will Supercapacitors Come to AI's Rescue?

Yes, just like the octopussies. /s

blt•9mo ago
What is causing demand bursts in AI workloads? I would have expected that AI training is almost the exact opposite. Load a minibatch, take a gradient step, repeat forever. But the article claims that "each step of the computation corresponds to a massive energy spike."
wmf•9mo ago
If the cores go idle (or just much less loaded) in between steps because they're waiting for network communication that would cause the problem.
sdenton4•9mo ago
Bad input pipelines are a big cause of spikiness - you might have to wait a non-trivial fraction of a second for the next batch of inputs to arrive. If you can run 20+ training steps per second on a decent batch size, it can take some real engineering to get enough data lined up and ready to go fast enough. (I work on audio models, where data is apparently quite heavy compared to images or text...)
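The usual fix for this kind of input-pipeline stall is prefetching: load the next batches on a background thread so the accelerator never waits on I/O. A minimal, framework-free sketch (real pipelines would use something like tf.data or a PyTorch DataLoader; the queue depth here is an arbitrary choice):

```python
import queue
import threading

def prefetch(batches, depth=4):
    """Yield batches while a background thread keeps `depth` of them loaded."""
    q = queue.Queue(maxsize=depth)
    sentinel = object()  # marks end of the batch stream

    def producer():
        for b in batches:
            q.put(b)         # blocks when the buffer is full
        q.put(sentinel)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        b = q.get()          # training loop only waits if the buffer is empty
        if b is sentinel:
            return
        yield b
```

Usage: `for batch in prefetch(load_batches()): train_step(batch)` - the producer's (slow) loading overlaps with the consumer's compute.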
lstodd•9mo ago
if that is the case, my, I am appalled.

where do you get those fractions of seconds? network? storage?

sdenton4•9mo ago
It's not soooo appalling once you dig further into what's happening.

There's been a tug-of-war between compute bottlenecks and memory bottlenecks in accelerators for quite a while. Big increases in FLOPs requires faster memory access, so you tend to get hardware releases alternating between emphasizing compute gains and bus bandwidth gains. Data I/O is just the next layer out... once your base training is running fast enough, it becomes progressively harder to keep the model fed with data.

blt•9mo ago
I can see how such a phenomenon could happen at the level of a single machine, but if we're using a whole data center full of GPU machines it should be possible to spread out those spikes evenly over time. Still weird that the article implies spikiness is a fundamental property of AI workloads rather than a design oversight that can be fixed at the software level.
jfim•9mo ago
When running data parallel training, basically all the nodes taking part in the training run the same training loop in lockstep. So you'd have all nodes running the forward and backward passes on the GPU, then they'd wait for the gradient to be reduced across all nodes and then the weights get updated and another iteration can be run. For the first part the GPU is working, but when waiting on the network it's idle. The spikes are basically synchronous across all nodes doing the training.

The only way to spread the spikes would be to make the training run slower, but that'd be a hard sell considering training can sometimes be measured in days.
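The lockstep pattern described above can be simulated without any GPUs: each "rank" writes its gradient, blocks on a barrier standing in for the all-reduce, then reads the same reduced total. The barrier is exactly where the synchronized, cluster-wide power dip would occur. (A toy sketch; real frameworks use torch.distributed collectives, not Python threads.)

```python
import threading

WORLD = 4                       # number of simulated ranks
barrier = threading.Barrier(WORLD)
grads = [0] * WORLD
log = []                        # (rank, step, reduced_total) tuples

def worker(rank):
    for step in range(2):
        grads[rank] = rank + step   # forward/backward: GPU busy, load spikes
        barrier.wait()              # "all-reduce": GPU idle, load dips
        log.append((rank, step, sum(grads)))  # every rank sees the same total
        barrier.wait()              # hold the next step until all have read

threads = [threading.Thread(target=worker, args=(r,)) for r in range(WORLD)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because every rank hits the barrier together, the busy/idle phases line up across the whole cluster, which is what turns per-node jitter into one big synchronized power swing.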

blt•9mo ago
I agree with this part of your response: If we were to require that distributed training generates the exact same sequence of weight updates as serial SGD on a single machine, then we would need a barrier like that. However, there is a lot of research on distributed optimization that addresses this issue by relaxing the "exactly equivalent to serial SGD" requirement, including classic papers [1] and more recent ones [2].

Basically, there are two properties of ML optimization that save us: 1) the objective is a huge summation over many small objectives (the losses for each training data point), and 2) the same noise-robustness that makes SGD work in the first place can give us robustness against further noise caused by out-of-order updates.

So I think this issue can be overcome fairly easily. Does anyone know if the big LLM-training companies use asynchronous updates like [1,2]? Or do they still use a big barrier?

[1] https://proceedings.neurips.cc/paper_files/paper/2011/hash/2...

[2] https://arxiv.org/abs/2401.09135

sdenton4•9mo ago
I think there's been some tendency against things like HogWild in favor of reproducibility and (its close cousin) debug-friendliness.
blt•9mo ago
Understandable. However, I went down that rabbit hole once, and learned that even summing a 1D array with a GPU is nondeterministic (because it is broken down like merge sort and subject to the scheduler). I guess I assumed that practitioners had fully embraced randomness due to things like that.
tzs•9mo ago
> Another solution is dummy calculations, which run while there are no spikes, to smooth out demand. This makes the grid see a consistent load, but it also wastes energy doing unnecessary work.

Oh god... I can see it now. Someone will try to capitalize on the hype of LLMs and the hype of cryptocurrency and build a combined LLM-training and cryptocurrency-mining facility that runs the mining between training spikes.

ludicity•9mo ago
Oh man, I really, really wish that you hadn't said this and also that you were wrong.
permo-w•9mo ago
you really think this isn't literally the first thing that happened the second hosting these models became commercially viable?
jgalt212•9mo ago
There wouldn't be any crypto or LLMs if the Fed hadn't printed $7 trillion.
ijustlovemath•9mo ago
YCW27
candiddevmike•9mo ago
From the same founders who brought you (or didn't, actually) maritime fusion
FridgeSeal•9mo ago
It’s ok though because YC invests in the team, better just give them another chance!!
x-complexity•9mo ago
Dummy calculations aren't even needed, if you allow the LLMs to pre-compute on the given context before inference:

https://arxiv.org/abs/2504.13171

It should be noted that this type of inference is less useful on time-sensitive tasks, but most tasks truthfully don't require such time sensitivity (there exists slack time between when the task is given & when questions are asked).

wongarsu•9mo ago
There are already some providers offering cheap LLM services that will give you a response within 24 hours instead of within seconds. That allows them to schedule tasks during low-request hours when they have spare capacity and use better batching. For some automated tasks this is perfectly acceptable. A bit of effort to accommodate, but easy to justify when it halves your inference costs
sandis•9mo ago
Any examples of such providers?
dghlsakjg•9mo ago
Certain tasks at OpenAI when I checked a few months ago. Embedding for one.
wongarsu•9mo ago
OpenAI [1], Azure OpenAI, Anthropic [2], and Parasail [3] for all the "open source" models. There are others that I was thinking of, but those are the first I could find without my notes. Typically the batch API is 50% cheaper than live inference.

1: https://platform.openai.com/docs/guides/batch

2: https://docs.anthropic.com/en/docs/build-with-claude/batch-p...

3: https://docs.parasail.io/parasail-docs/batch/batch-quickstar...

cranberryturkey•9mo ago
https://infernetprotocol.com
DaSHacka•9mo ago
>implying its not already happening
Merrill•9mo ago
Wouldn't it be better to arrange the network and software to run the GPUs continuously at optimal usage?

Otherwise a lot of expensive GPU capital is idle between bursts of computation.

Didn't DeepSeek do something like this to get more system level performance out of less capable GPUs?

WatchDog•9mo ago
Sounds like an issue that would be cheaper to address by just adjusting the software.
bcoates•9mo ago
Are these GPU DCs entirely passive cooled?

I'm surprised it's not cheaper to modulate all those compressor motors they presumably already have

gitroom•9mo ago
lmao the amount of weird fixes folks float for this problem is insane - tbh i feel like half of it really comes down to software folks not wanting to tweak their pipelines
rini17•9mo ago
Uhm, I was under the impression you are contractually obliged not to do that to the grid? As a wholesale customer, not a small kettle user, I mean. Or is that just my European bureaucratically minded approach, and in the US everyone just rides the grid raw?
krunck•9mo ago
I wonder what voltage is used on the caps? The higher the voltage the greater the energy density(assuming the dielectric can handle it):

E = (CV^2)/2

where E is the stored energy, C is the capacitance, and V is the applied voltage