Norway's 2 petabytes of Huawei flash storage and LLM training

https://www.blocksandfiles.com/flash/2026/05/22/norways-2-petabytes-of-huawei-flash-storage-and-llm-training/5244910

31•rbanffy•1h ago

Comments

7e•43m ago

2 PB? They will not come close to training on that amount. Maybe years from now.

Den_VR•32m ago

Could probably LoRA with that

sgt•19m ago

I think they will not train on the full 2 PB but will use it as the data lake to start, and then apply a more targeted approach.

jauntywundrkind•38m ago

384 core cpu cluster? 2 petabytes?

Dell just launched a 2U that fits almost 10 petabytes in it. It's probably not 384 core capable but that is very doable right now, Epyc chips are 192 cores each! https://www.techradar.com/pro/dell-launches-record-shatterin...

100ms•11m ago

5x 400 Gbit running to a 2U box, whoa. The PCIe lanes must have heat shielding.

More seriously there is a sensibility limit on extreme density where it's not needed. The idea that you're just going to magically get 2 TBit/s out of those ports seems unlikely even with tweaked software, and you're stuck with a power and comms hotspot that's liable to dictate the remainder of your network design.

At max utilisation that 2U would take about 12 hours to drain, and that assumes peak, likely unachievable throughput with the box otherwise completely out of service. Not a great start.
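For anyone who wants to check the drain-time figure, here is a rough sketch of the arithmetic. It assumes exactly 10 PB (the box is quoted upthread as "almost 10"), decimal units, and the full 5 x 400 Gbit/s line rate with no protocol overhead, so it is a lower bound rather than a measurement. The function name is made up for illustration.

```python
def drain_time_hours(capacity_pb: float, link_gbit: float, links: int) -> float:
    """Hours needed to read the whole box once at the aggregate line rate.

    Assumptions (not from any vendor spec): decimal units (1 PB = 1e15 bytes),
    no protocol overhead, all links saturated for the entire transfer.
    """
    capacity_bits = capacity_pb * 1e15 * 8       # bytes -> bits
    rate_bits_per_s = link_gbit * 1e9 * links    # aggregate throughput
    return capacity_bits / rate_bits_per_s / 3600


# 10 PB over 5 x 400 Gbit/s (2 Tbit/s aggregate)
print(f"{drain_time_hours(10, 400, 5):.1f} hours")
```

Under those assumptions it comes out at roughly 11 hours, which is in line with the "12 hours" above once you allow for any real-world overhead.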

Den_VR•36m ago

> He asserted that any country with its own language that did not have a sovereign LLM trained in that language was at a disadvantage as a globally trained, English-speaking LLM would not know about that country’s history, news and culture that was described in the local language.

I don’t know that this is true. But whatever sounds true enough and gets funding seems to be what flies these days.

redanddead•29m ago

They made the cultural case. You have no idea how strong this is in places like Quebec, the Nordics, France, Russia, etc.

sgt•20m ago

Can confirm that. Norway may have a small population, but if you live there you'll think it's truly the center of the world (aside from the US. Norwegians love America)

ipsum2•33m ago

This is how much storage the average r/datahoarder user has in their basement. Fewer than 100 hard drives.

arjie•12m ago

But not in flash. I have an appreciable fraction of that but in spinning rust.

Levitz•30m ago

>As Husnes put it; Norway is a small country solving a problem every non-English-speaking nation will face: how do you build AI that reflects your language, your culture and your history? AI needs custodians, not just builders.

I'm afraid the answer is, mostly you don't.

Such a thing requires strong political will that, at least in my environment, seems basically impossible to align.

The costs are prohibitive, but beyond that, the type of person who cares about local representation like that is either completely fine with letting foreign companies implement it (after all, you can use ChatGPT in Basque if you want to) or is against the idea of AI altogether.

kreyenborgi•29m ago

Ad for Huawei?

solenoid0937•28m ago

> The Olivia system is an HPE Cray Supercomputing EX system, with 448 GPUs and 64,512 CPU cores.

Training a sovereign LLM with this meager hardware, as opposed to a LoRA on some open source model, seems like a huge mistake and a potential red flag.

There is no way these people have the resources to train a fully fledged LLM, so claiming that is their goal makes me think they don't intend for the LLM to be useful.

Which begs the question, whose money are they wasting - and why?

sgt•22m ago

That's what they have access to right now. I am sure that will change in the future as the project progresses.

What do you suggest, that they stop and wait until they have the right HW?

otabdeveloper4•17m ago

> meager hardware

Qwen was made on a cluster about that size.

And this is before anybody ever thought about optimizing the training process. (Currently it's just pytorch analyst-as-coder slop, with extremely overprovisioned quantizations, etc.)

kristjansson•6m ago

DeepSeek claims to have trained on something like 2k H800, this is ~0.5k GH200 … it’s not nothing. Sure they’re not going to _serve_ it at scale, but that’s not the point?

Also the line between “finetuning a base model” and “man this is a real good initialization” gets pretty blurry at scale.

Altogether a pretty presumptuous take.

kvam•27m ago

As a Norwegian this sounds like a mistake. Who will use this LLM? Where? For what? The underlying data could be made more easily searchable and digestible for agents in general if the goal is better knowledge of Norwegian culture.

spwa4•22m ago

Exactly, if there's one thing transformers are good at it's translation. One I've found particularly nice: any question ChatGPT can answer in English it can answer in French. I'm assuming Norwegian too. So there's no point.

sgt•18m ago

There's quite a bit more to culture and language than just being able to have transformers come up with believable language and/or dialect.

otabdeveloper4•15m ago

They're only good at it because they were trained on massive amounts of English and French data.

sisve•11m ago

The point is that Norway will have its own LLM, and will not depend on another state or a private company. The goal is not to be the best model, but to have a model that includes more Norwegian data than other LLMs and that is not skewed toward other sources.

dalemhurley•21m ago

Hard disagree. This is the first step not the last and proves to other countries that this can be done.

TrackerFF•23m ago

I'm a Norwegian, and I use the national library almost every day for searching through texts. They have truly one of the best working user interfaces (and functionality) for searching through the massive amounts of text.

dalemhurley•23m ago

How about that, they actually asked for permission to use data and the companies said yes.

arjie•13m ago

This can’t be right. 2 PB of flash is like $200k. It’s within reach of many individuals. Then again I guess you don’t need that much storage so maybe it is.

devttyeu•8m ago

More like $1M at current prices at this scale / level of performance.

If you go with HDD arrays probably $50k
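To compare the three price guesses in this subthread on the same footing, here is a quick per-terabyte sketch. The dollar totals are the commenters' own estimates, not quotes, and the helper name and the decimal 1 PB = 1000 TB conversion are assumptions for illustration.

```python
def usd_per_tb(total_usd: float, capacity_pb: float = 2.0) -> float:
    """Implied price per terabyte for a given total, assuming 1 PB = 1000 TB."""
    return total_usd / (capacity_pb * 1000)


# Totals are the rough figures quoted in the comments above, not real pricing.
estimates = [
    ("flash, low guess", 200_000),
    ("flash, high guess", 1_000_000),
    ("HDD arrays", 50_000),
]

for label, total in estimates:
    print(f"{label}: ${usd_per_tb(total):.0f}/TB")
```

That works out to about $100/TB, $500/TB and $25/TB respectively, so the disagreement is really about which per-terabyte price applies at this performance level.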
