
Qwen3-Next

https://qwen.ai/blog?id=4074cca80393150c248e508aa62983f9cb7d27cd&from=research.latest-advancements-list
73•tosh•2h ago

Comments

Jgoauh•56m ago
Seems impressive. I believe better architectures are really the path forward; I don't think you need more than 100B params, given what this model and GPT OSS 120B can achieve.
NitpickLawyer•40m ago
New arch seems cool, and it's amazing that we have these published in the open.

That being said, qwen models are extremely overfit. They can do some things well, but they are very limited in generalisation compared to closed models. I don't know if it's simply scale, or training recipes, or regimes. But if you test them out of distribution (OOD), the models utterly fail to deliver, where the closed models still provide value.

vintermann•35m ago
Could you give some practical examples? I don't know what Qwen's 36T-token training set is like, so I don't know what it's overfitting to...
NitpickLawyer•28m ago
Take math and coding for example:

- in math, if they can solve a problem, or a class of problems, they'll solve it. If you use a "thinking" model + maj@x, you'll get strong results. But if you try, for example, to have the model consider a particular way or method of exploring a problem, it'll default to "solving" mode. It's near impossible to have it do anything else with a math problem other than solve it. Say "explore this part, in this way, using this method". Can't do it. It'll maybe play a bit, but then enter "solving" mode and continue to solve it as it was trained.

In practice, this means that "massively parallel" test-time compute becomes harder to do with these models, because you can't "guide" them towards certain aspects of a problem. They are extremely "stubborn".

- in coding it's even more obvious. Ask them to produce any 0-shot, often-tested and often-shown thing (SPA, game, visualisation, etc.) and they do it. Convincingly.

But ask them to look at a piece of code and extract meaning, and they fail. Or ask them to reverse an implementation: figure out what a function does and reverse its use, or make it do something else, and they fail.

elbear•22m ago
It sounds like some people.
croemer•48m ago
ERR_NAME_NOT_RESOLVED
croemer•48m ago
https://archive.is/JH9XL
jychang•42m ago
Coolest part of Qwen3-Next, in my opinion, is that they do MTP without adding another un-embedding matrix.

Deepseek R1 also has an MTP layer (layer 61) https://huggingface.co/deepseek-ai/DeepSeek-R1/blob/main/mod...

But Deepseek R1 adds embed_tokens and shared_head.head tensors, which are [129280, 7168] or about 2GB in size at FP8.

Qwen3-Next doesn't have that, so it saves a few GB in active parameters for MTP, which is a Big Deal.
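
A quick sanity check on that ~2GB figure, using the shapes above and one byte per weight at FP8:

    # Size of DeepSeek-R1's two extra MTP tensors (embed_tokens and
    # shared_head.head), each [129280, 7168], at one byte per weight (FP8).
    vocab, hidden = 129280, 7168
    one_tensor_gib = vocab * hidden / 1024**3
    print(f"one tensor:   {one_tensor_gib:.2f} GiB")      # ~0.86 GiB
    print(f"both tensors: {2 * one_tensor_gib:.2f} GiB")  # ~1.73 GiB, i.e. ~2GB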

puilp0502•19m ago
What kind of benefit does Multi-Token Prediction bring to the inference side? Is it only relevant in pretraining efficiency?
rfoo•11m ago
It could be a better draft model for speculative decoding than a separately trained EAGLE head, etc.
jychang•1m ago
Speculative decoding!

It makes inference a LOT faster.
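
For anyone unfamiliar: a cheap draft predictor (here, the MTP head) guesses several tokens ahead, and the full model verifies all of them in one forward pass, so one big-model step can yield multiple tokens. A minimal sketch, with hypothetical draft_tokens/verify helpers standing in for the real implementations:

    # Sketch of a speculative decoding loop. draft_tokens() and verify() are
    # hypothetical stand-ins: draft_tokens() is the cheap predictor (e.g. an
    # MTP head); verify() is one forward pass of the full model that returns
    # the accepted prefix of the draft plus its own token at the mismatch.
    def speculative_decode(prompt_ids, k=4, max_new=256):
        ids = list(prompt_ids)
        produced = 0
        while produced < max_new:
            draft = draft_tokens(ids, k)        # k cheap guesses
            accepted, fix = verify(ids, draft)  # matched prefix + corrected token
            ids += accepted + [fix]             # best case k+1 tokens per big step
            produced += len(accepted) + 1
        return ids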

slimebot80•36m ago
Complete newbie here - some questions, if I may!

This stuff can run on a local machine without internet access, correct?

And it can pretty much match Nano Banana? https://github.com/PicoTrex/Awesome-Nano-Banana-images/blob/...

Also -- what are the specs for a machine to run it (even if slowly!)?

Davidzheng•31m ago
Isn't this one a text model?
slimebot80•29m ago
Ah, maybe! I am lost reading this page with all the terminology
prawel•27m ago
What you mean is Qwen Image and Qwen Image Edit; you can run those on a local machine, using the Draw Things application for example.

The model discussed here is a text model, similar to ChatGPT. You can also run it on your local machine, but not yet, as apps need to be updated with Qwen3-Next support (llama.cpp, Ollama, etc.).

NitpickLawyer•23m ago
This model can be run completely offline, yes. You'll need anywhere from 60-200 GB of RAM (either VRAM for high speeds, or a combination of VRAM and RAM, or just CPU+RAM). The active params are really low (3B), so it'll likely run fine even on CPU. You should get 10-15+ t/s even on old DDR4 systems. Offload some experts to a GPU (can be as low as 8-16 GB) and you'll see greater speeds.

This has nothing to do with nano banana, or image generation. For that you want the qwen image edit[1] models.

1 - https://huggingface.co/Qwen/Qwen-Image-Edit

dragonwriter•21m ago
> This stuff can run on a local machine without internet access, correct?

Yes.

> And it can pretty much match Nano Banana?

No, Qwen3-Next is not a multimodal model, it has no image generation function.

mynti•23m ago
For anyone curious about what the Gated Delta Network is: https://arxiv.org/pdf/2412.06464
yekanchi•18m ago
How much VRAM does it require?
DiabloD3•12m ago
That's not a meaningful question on its own. Models can be quantized to fit into much smaller memory footprints, and not all MoE layers (in MoE models) have to be kept in VRAM to maintain performance.
yekanchi•5m ago
I mean 4-bit quantized. I can roughly calculate VRAM for dense models from the model size, but I don't know how to do it for MoE models.
NitpickLawyer•2m ago
A good rule of thumb is that one param takes one unit of storage. The "default" unit of storage these days is bf16 (i.e. 16 bits per weight). So an 80B model is ~160GB of weights. Then you have quantisation, usually 8-bit or 4-bit, meaning each weight is "stored" in 8 or 4 bits. So an 80B model is ~80GB at fp8 and ~40GB at fp4/int4.

But in practice you need a bit more than that: you also need some space for context, the kv cache, potentially a model graph, etc.

So in practice you'll need 20-50% more RAM than this rule of thumb suggests.

For this model, you'll need anywhere from 50GB (tight) to 200GB (full) of RAM, though it also depends on how you run it. With MoE models you can selectively load some experts (parts of the model) in VRAM while offloading the rest to RAM. Or you can run it fully on CPU+RAM, since the active parameter count is low (3B). This should work pretty well even on older (DDR4) systems. A calculator version of the rule of thumb is sketched below.
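
A minimal sketch of that arithmetic in Python (the 20% overhead factor is just the ballpark above, not a measured number):

    # Back-of-the-envelope RAM estimate: weights + ~20% for context/kv cache.
    def est_ram_gb(params_billions, bits_per_weight, overhead=1.2):
        weights_gb = params_billions * bits_per_weight / 8  # ~GB of weights
        return weights_gb * overhead

    for bits in (16, 8, 4):
        print(f"80B @ {bits}-bit: ~{est_ram_gb(80, bits):.0f} GB")
    # 80B @ 16-bit: ~192 GB,  8-bit: ~96 GB,  4-bit: ~48 GB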

irthomasthomas•14m ago
llm -m qwen3-next-80b-a3b-thinking "An ASCII of spongebob"

Here's a classic ASCII art representation of SpongeBob SquarePants:

     .------.
    /  o o  \
   |         |
   |  \___/  |
    \_______/


  llm -m chutes/Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8 \
  "An ASCII of spongebob"

  Here's an ASCII art of SpongeBob SquarePants:

      .--..--..--..--..--..--.
    .' \  (`._   (_)     _   \
  .'    |  '._)         (_)  |
  \ _.')\      .----..--.'  /
  |(_.'  |    /    .-\-.  \---.
  \     0|    |   ( O| O) |    |
   |  _  |  .--.____.'._.-.  |
   \ (_) | o         -` .-`  |
    |    \   |`-._ _ _ _ _\ /
    \    |   |  `. |_||_|   |
     | o |    \_       \    |     -.   .-.
     |.-.  \    `--..-'   O |     `.`-' .'
   _.'  .' |     `-.-'      /-.__   ' .-'
  .' `-.` '.|='=.='=.='=.='=|._/_ `-'.'
  `-._  `.  |________/\_____|    `-.'
     .'   ).| '=' '='\/ '=' |
     `._.`  '---------------'
             //___\   //___\
               ||       ||
               ||_.-.   ||_.-.
              (_.--__) (_.--__)
dev_hugepages•3m ago
memorized: https://www.asciiart.eu/cartoons/spongebob-squarepants

Auwalsani

https://blog.cloudflare.com/cloudflare-sse-gartner-magic-quadrant-2025/
1•Maidanragi•1m ago•0 comments

Go Podcast

https://www.youtube.com/channel/UC2MDU-j8SJUjDKcwmUNrn8A
1•gethly•2m ago•0 comments

U.S. Senators call on ICE to halt use of facial recognition

https://www.markey.senate.gov/news/press-releases/markey-wyden-and-merkley-demand-ice-stop-using-...
1•Improvement•4m ago•0 comments

Simple Licensing

https://rslstandard.org/
1•Bogdanp•7m ago•0 comments

We engineered RAG to be 50% faster

https://elevenlabs.io/blog/engineering-rag
1•lharries•10m ago•0 comments

All examples from The LaTeX Companion book (3rd edition)

https://ctan.org/pkg/tlc3-examples
1•teleforce•11m ago•0 comments

Setting up rooted Android emulator with Frida and mitmproxy

https://www.trickster.dev/post/setting-up-rooted-android-emulator-with-frida-and-mitmproxy/
1•rl1987•14m ago•0 comments

ASK HN: Would you want to join a computer science newsletter?

1•bilalcg•15m ago•0 comments

What 30k Free Users Taught Me About Charging $10/Month

4•evermike•19m ago•1 comments

Show HN: I made a generative online drum machine with ClojureScript

https://dopeloop.ai/beat-maker/
1•chr15m•19m ago•0 comments

Elizabeth I and the 'Blackamoors': the deportation that never was

https://www.mirandakaufmann.com/blog/elizabeth-i-and-the-blackamoors-the-deportation-that-never-was
1•fanf2•22m ago•0 comments

Scientists pioneer 'animal internet': dog phones and touch screens for parrots

https://www.ft.com/content/10dc5add-ed1c-4640-8ec4-25c6b41eed09
1•stareatgoats•24m ago•1 comments

XAI exposed: Grok admits "humans ordered delays" – FTC Case 191913595

https://www.quora.com/profile/Vic-LU-16/Title-%E6%A8%99%E9%A1%8C-xAI-exposed-Grok-admits-humans-o...
1•VIC513•28m ago•2 comments

iPhones 17 and the Sugar Water Trap

https://stratechery.com/2025/iphones-17-and-the-sugar-water-trap/
1•tosh•29m ago•0 comments

Android Platform-Tools 36.0.0 Released – Not yet Open‑Sourced

https://github.com/nmeum/android-tools/issues/185
4•uneven9434•31m ago•0 comments

Google admits the open web is in 'rapid decline'

https://www.theverge.com/news/773928/google-open-web-rapid-decline
4•sMarsIntruder•34m ago•1 comments

A framework for pricing AI products

https://stripe.com/blog/a-framework-for-pricing-ai-products
1•soheilpro•36m ago•0 comments

Is Go Over? [audio]

https://gopodcast.dev/episodes/059-is-go-over-with-john-arundel
1•gus_leonel•40m ago•0 comments

Visualgpt

https://visualgpt.io/
1•Jennifer_z•42m ago•1 comments

Nintendo's latest patents on Pokémon mechanics should not have been granted

https://www.pcgamer.com/gaming-industry/an-embarrassing-failure-of-the-us-patent-system-videogame...
4•chabad360•46m ago•0 comments

Decompiling the GPL violated Linux kernel using Evolutionary Algorithms

https://far.chickenkiller.com/computing/decompiling-the-kernel-using-ea/
1•farooqkz•48m ago•0 comments

Google is shutting down Tables, its Airtable rival

https://techcrunch.com/2025/09/11/google-is-shutting-down-tables-its-airtable-rival/
2•jmsflknr•50m ago•0 comments

I built a free Chrome extension to summarize anything with 17 AI models

1•huizhu•50m ago•0 comments

Forward Intro Emails

https://also.roybahat.com/introductions-and-the-forward-intro-email-14e2827716a1
2•Nars088•52m ago•0 comments

Playing the Field with My A.I. Boyfriends

https://www.newyorker.com/magazine/2025/09/15/playing-the-field-with-my-ai-boyfriends
1•FinnLobsien•53m ago•0 comments

Larry Ellison briefly becomes richest person

https://www.bbc.com/news/articles/cx2rp992y88o
2•ksec•53m ago•1 comments

Music Arena Leaderboard

https://huggingface.co/spaces/MusicArena/MusicArenaBoard
1•jinqueeny•55m ago•0 comments

The Year of Linux on Smartphones Maybe

https://grigio.org/the-year-of-linux-on-smartphones-maybe/
3•grigio•57m ago•1 comments

Show HN:New Qwen3-Next is coming(Free&Unlimited to use)

https://mixhubai.com/ai-models/qwen3-next
3•sugusd•57m ago•0 comments

Apple Store unreachable due to Preordering

https://www.apple.com/us/shop/goto/store
2•mrwaffle•1h ago•1 comments