Pangu's Sorrow: The Sorrow and Darkness of Huawei's Noah Pangu LLM R&D Process

https://github.com/moonlightelite/True-Story-of-Pangu/blob/main/README.md
12•guardiangod•8h ago

Comments

yms_hi•7h ago
Calling a paper already determined to be AI-generated an "incident"? That is a major point of suspicion in the entire text.
nirui•4h ago
Is the article a translation from Chinese? You need deep knowledge of Chinese net slang and Huawei slang to understand it correctly.

And all those unnecessary emotional expressions made the article hard to read.

Here are the takeaways I extracted:

1. The author claims to be "an employee of the Pangu Large Model Team and Huawei Noah's Ark Laboratory", a lower-ranking "small worker". The first 4 bullet points are supposed to prove that they have insider knowledge, which should authenticate the claims that follow. Why Huawei named its teams in this odd way is unexplained, but it does deserve some psychiatric analysis.

2. "At the beginning, our (Huawei, editor's note) computing power was very limited..." (detail followed), "...At the same time, other domestic companies such as Alibaba (which published Qwen, editor's note) and Zhipu were training on GPUs and had already figured out the right method. The gap between Pangu and its competitors was getting bigger and bigger"

3. "In this situation, Wang Yunhe ('the current director of Noah', editor's note) and his small model laboratory took action. They claimed that they inherited and transformed from the old 135B parameters, and through training a short few hundred B of data, the average improvement of various indicators was about ten points. In fact, this was their first masterpiece of applying the shell to the large model. Huawei's laymen led the experts, which made the leaders completely unaware of this nonsense. They only thought that there must be some algorithm innovation. After internal analysis, they actually used Qwen (which is published by Alibaba, editor's note) 1.5 110B for continued training.", "By adding layers, expanding the ffn dimension, and adding some mechanisms from the Pangu pi paper, they gathered about 135B parameters. In fact, the old 135B has 107 layers, while this model has only 82 layers, and the various configurations are also different. After training, the distribution of many parameters of the new 135B of unknown origin is almost exactly the same as that of Qwen 110B. Even the class name of the model code was Qwen at the time, and they were too lazy to even change the name. The subsequent model is the so-called 135B V2. This model was also provided to many downstreams at the time, even including external customers."

And that's about it.

Also, yeah, the article was indeed a translation from Chinese. The [original post] was written in Chinese and was then translated into English by github.com/moonlightelite. That's why it felt odd to read.

[original post]: https://web.archive.org/web/20250706034203/https://github.co...

After reading the article, I feel this is less a case of whistleblowing and more an attack on Wang Yunhe. That's why there are so many emotional expressions, (maybe) to appeal to Huawei and/or the future employer of this individual. But that's just my personal feeling.

Bitchat – A decentralized messaging app that works over Bluetooth mesh networks

https://github.com/jackjackbits/bitchat
358•ananddtyagi•9h ago•154 comments

I extracted the safety filters from Apple Intelligence models

https://github.com/BlueFalconHD/apple_generative_model_safety_decrypted
390•BlueFalconHD•13h ago•259 comments

Intel's Lion Cove P-Core and Gaming Workloads

https://chipsandcheese.com/p/intels-lion-cove-p-core-and-gaming
179•zdw•10h ago•22 comments

Neanderthals operated prehistoric “fat factory” on German lakeshore

https://archaeologymag.com/2025/07/neanderthals-operated-fat-factory-125000-years-ago/
70•hilux•3d ago•17 comments

Show HN: I wrote a "web OS" based on the Apple Lisa's UI, with 1-bit graphics

https://alpha.lisagui.com/
339•ayaros•14h ago•106 comments

Show HN: Piano Trainer – Learn piano scales, chords and more using MIDI

https://github.com/ZaneH/piano-trainer
55•FinalDestiny•2d ago•14 comments

Building the Rust Compiler with GCC

https://fractalfir.github.io/generated_html/cg_gcc_bootstrap.html
161•todsacerdoti•11h ago•24 comments

A non-anthropomorphized view of LLMs

http://addxorrol.blogspot.com/2025/07/a-non-anthropomorphized-view-of-llms.html
151•zdw•10h ago•148 comments

Why English doesn't use accents

https://www.deadlanguagesociety.com/p/why-english-doesnt-use-accents
128•sandbach•11h ago•157 comments

The first time I was almost fired from Apple

https://www.engineersneedart.com/blog/almostfired/almostfired.html
203•chmaynard•2d ago•73 comments

Portability of Tar Features

https://mgorny.pl/articles/portability-of-tar-features.html
11•Bogdanp•3d ago•1 comment

Crypto 101 – Introductory course on cryptography

https://www.crypto101.io/
163•pona-a•11h ago•12 comments

Fictional K-pop bands zoom to top of US music charts

https://www.bbc.com/news/articles/clyl1zyv1y2o
16•ranit•2d ago•8 comments

High Performance Image Sensor Processing Using FPGAs [pdf]

https://oda.uni-obuda.hu/bitstream/handle/20.500.14044/10350/Gabor_S_Becker_ertekezes.pdf
39•teleforce•6h ago•1 comment

LLMs should not replace therapists

https://arxiv.org/abs/2504.18412
95•layer8•11h ago•110 comments

I Ported SAP to a 1976 CPU. It Wasn't That Slow

https://github.com/oisee/zvdb-z80/blob/master/ZVDB-Z80-ABAP.md
26•weinzierl•3h ago•9 comments

Opencode: AI coding agent, built for the terminal

https://github.com/sst/opencode
207•indigodaddy•15h ago•56 comments

Thesis: Interesting work is less amenable to the use of AI

https://remark.ing/rob/rob/Thesis-interesting-work-ie
59•koch•12h ago•30 comments

Async Queue – One of my favorite programming interview questions

https://davidgomes.com/async-queue-interview-ai/
130•davidgomes•16h ago•118 comments

Nobody has a personality anymore: we are products with labels

https://www.freyaindia.co.uk/p/nobody-has-a-personality-anymore
400•drankl•10h ago•310 comments

Swedish Campground (2004)

https://www.folklore.org/Swedish_Campground.html
84•CharlesW•9h ago•24 comments

Get the location of the ISS using DNS

https://shkspr.mobi/blog/2025/07/get-the-location-of-the-iss-using-dns/
283•8organicbits•20h ago•79 comments

Backlog.md – Markdown‑native Task Manager and Kanban visualizer for any Git repo

https://github.com/MrLesk/Backlog.md
142•mrlesk•13h ago•32 comments

There's a COMPUTER inside my DS flashcart [video]

https://www.youtube.com/watch?v=uq0pJmd7GAA
56•surprisetalk•9h ago•11 comments

Functions Are Vectors (2023)

https://thenumb.at/Functions-are-Vectors/
192•azeemba•17h ago•93 comments

Uncommon Uses of Python in Commonly Used Libraries (2022)

https://eugeneyan.com/writing/uncommon-python/
21•sebg•3d ago•2 comments

Show HN: A Language Server Implementation for SystemD Unit Files

https://github.com/JFryy/systemd-lsp
36•arandomhuman•8h ago•17 comments

The era of full stack chip designers

https://chipinsights.substack.com/p/the-era-of-full-stack-chip-designers
16•bharathw30•5h ago•4 comments

Show HN: Modernized file manager and program manager from Windows 3.x

https://github.com/brianluft/heirloom
34•electroly•9h ago•7 comments

Jane Street barred from Indian markets as regulator freezes $566M

https://www.cnbc.com/2025/07/04/indian-regulator-bars-us-trading-firm-jane-street-from-accessing-securities-market.html
421•bwfan123•19h ago•246 comments