Rio de Janeiro's "homegrown" LLM appears to be a merge of an existing model

https://github.com/nex-agi/Nex-N2/issues/4

103•unrvl22•2h ago

Comments

unrvl22•2h ago

The municipality of Rio de Janeiro (via its IT company IplanRIO) released Rio-3.5-Open-397B, presented as a homegrown Qwen3.5 fine-tune that beats comparable open models on benchmarks. The linked issue argues it's actually a weighted merge of ~60% Nex-N2 Pro + ~40% Qwen3.5-397B-A17B - Nex-N2 having been released about a week earlier.

Lucasoato•39m ago

So the problem isn’t the missing attribution to Qwen, but the fact that they didn’t mention Nex-N2 Pro, right?

Aurornis•19m ago

The problem is that they claimed to have made a big achievement with their homegrown post-training, and they expected to receive a lot of praise for it.

Then researchers looked at the weights and there is no post-training at all.

They are now attributing both models they merged, but their excuse for the lack of post-training is to claim they accidentally uploaded the wrong files.

DonsDiscountGas•19m ago

I didn't know model merging like that was possible. (Obviously possible from a pure software standpoint but I'm surprised it's effective)

AnotherGoodName•1h ago

This is fascinating that it worked though. Can we just merge all the open weight models and get something better?

_3u10•1h ago

No, they need the same arch, but you can distill them into a single model. And yes, if you use the API directly Claude will often say it’s an open weight model (likely the ones it was distilled from)

wds•1h ago

I imagine it'd work the same as merging all the good-tasting foods to get an even tastier one

dindunuf•55m ago

that kinda worked in llama 1/2 era, not between different models but between finetunes of the same model. the briefly legendary Mythomax was IIRC a merge of 5+ tunes, some of which were merges themselves.

avereveard•48m ago

Most merges improve a small subset of "feeling" benchmarks (too small, too specific, or out of distribution) and tend to show degradation on actual benchmarks, with especially punishing results on long-chain benchmarks.

They also only work on matching architectures (i.e. finetunes/LoRAs of the same model).
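For readers unfamiliar with the idea, here is a minimal sketch of a plain linear (weighted-average) merge, and of why it only makes sense when both checkpoints share an architecture. The `Weights` type, the `linear_merge` helper, and the toy tensor names are hypothetical stand-ins made up for illustration; real checkpoints hold multi-dimensional tensors, and this is not the tooling anyone in the thread used.

```python
from typing import Dict, List

# Hypothetical stand-in for a checkpoint: tensor name -> flattened values.
Weights = Dict[str, List[float]]


def linear_merge(a: Weights, b: Weights, alpha: float) -> Weights:
    """Return alpha * a + (1 - alpha) * b, tensor by tensor."""
    # Different tensor names mean different architectures: there is
    # nothing meaningful to average element by element.
    if a.keys() != b.keys():
        raise ValueError("tensor names differ; architectures do not match")
    merged: Weights = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"shape mismatch for {name!r}")
        merged[name] = [
            alpha * x + (1.0 - alpha) * y for x, y in zip(a[name], b[name])
        ]
    return merged


# Toy example with made-up values: a 60/40 blend of a fine-tune and its base.
finetune = {"layer0.weight": [1.0, 2.0], "layer0.bias": [0.5]}
base = {"layer0.weight": [0.0, 1.0], "layer0.bias": [0.0]}
merged = linear_merge(finetune, base, alpha=0.6)
print(merged)
```

The name and shape checks are the "same arch" requirement in practice: averaging only works when every tensor in one model has a counterpart of the same shape in the other.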

AlienRobot•1h ago

The model's webpage at https://huggingface.co/prefeitura-rio/Rio-3.5-Open-397B says it's a merge now. It previously didn't contain this paragraph:

>The model is built via a merge of https://huggingface.co/nex-agi/Nex-N2-Pro and https://huggingface.co/Qwen/Qwen3.5-397B-A17B, proceeded by On-Policy Distillation from a stronger model. We detected an incorrect upload in the previous version, where the base merged version was upload instead of the final distilled model. We are sorry for the confusion and apologize profusely.

Incidentally, are people using GitHub issues as blogs now?

jonchurch_•8m ago

It wasn’t framed as an issue, which is the norm breakage I think you’re reacting to, as in they didn’t ask that the README be updated etc., but it is common now for folks to use a project’s issue tracker to name and shame them in a place they can’t easily ignore.

Whether that’s right, prosocial, or professional is up for debate (as well as whether any single definition of etiquette can be expected in 2026 on an issue tracker).

But surely you can see the optics reason why someone would take their complaint to the repo directly? It pressures the maintainers to respond, it allows for a pile-on from the internet, and it turns any decision to lock down a hostile thread into its own kind of statement.

The maintainers should absolutely post an official response and lock the thread though, it’s getting ugly in there.

zinodaur•1h ago

Oh no, someone is profiting off of their work without proper attribution!?!?

internet2000•1h ago

Attribution isn't the relevant part. Lying about your lab's capabilities is.

Planktonne•1h ago

That's also something all the AI companies have been doing.

dofm•46m ago

Lying about model capability is right now the lingua franca of the cloud AI business model, almost; they yes-and each other's lies because they are in a position of needing to generate interest, including going as far as needing to trigger regulatory capture.

(It's not news to anyone who has worked in sales-led businesses that salespeople are prone to believing the claims of other salespeople, I guess).

functionmouse•53m ago

leopards ate my face

adrian_b•53m ago

I do not see anyone lying.

The model card says:

> Post-trained from Qwen 3.5 397B

The model card also says that they use an inference framework based on "SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs" by Shi et al.:

https://arxiv.org/abs/2510.05069

So the sources seem properly attributed.

They only claim that what they did to "Qwen 3.5 397B" has improved the LLM, including, as expected, with "strong performance in Portuguese".

alfiedotwtf•1h ago

Wasn’t it already obvious given the awfully familiar parameter numbers?

ekjhgkejhgk•1h ago

One funny thing about incompetence is that the incompetent don't have the competence to know that their incompetence is straightforward for a competent person to verify.

root-parent•56m ago

You just described every single vibe coder...

carlosjobim•45m ago

Why would they care? They get their salaries and pensions and bonuses, and the tax payer is footing the bill.

thimabi•32m ago

I wouldn’t describe what happened here as incompetence. As a “carioca”, I am pleasantly surprised to know that the government’s IT department is involved in AI work — even without the budget to create its own models from scratch.

arcticfox•25m ago

This seems kind of insane though, every time I go to Rio I think of the potential of AI/technology to solve some problems and leave it even more paradisiacal... But working on their own model? Wtf? There are a million applications of existing ones there that should be followed up on instead.

MadrasTh0rn•1h ago

Not surprised

fkozlowski•58m ago

I'm honestly surprised that they even had the inclination to attempt creating a model. I guess it's bullish that a municipal IT department had the guts to try this?

yieldcrv•55m ago

Didn’t the last thread about this have someone from the lab or an enthusiast in Rio saying exactly that?

It's a fine-tune of Qwen

Not a conspiracy

daemonologist•36m ago

The allegation here is that it's not actually a fine-tune of Qwen, but instead an undisclosed mashup (merge) of someone else's fine-tune of Qwen and the original model. Rio subsequently said that the model was in fact a merge, that they did additional fine-tuning after the merge, and that they accidentally uploaded the base merge instead of the version with additional fine-tuning. But this seems like quite an oversight...

jrm4•31m ago

“Well, Steve (Jobs), I think it’s more like we both had this rich neighbor named Xerox, and I broke into his house to steal the TV set, but I found out that you had already stolen it.”

-- Bill Gates

wunderlotus•14m ago

lmao i really hope this is a real quote cuz it’s a banger

ckcheng•3m ago

Apparently:

https://www.folklore.org/A_Rich_Neighbor_Named_Xerox.html

hintymad•2m ago

> Every weight tensor in Rio is, to thousands of standard deviations, the same 0.6/0.4 blend of Nex and Qwen — across all 60 layers and every component of the network. Other finetunes cannot be explained as interpolations.

I find it amazing how robust the current deep learning models are. A simple linear combination of every weight did not degrade the performance of the model, but enhanced it.
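As a rough illustration of how a claim like "this tensor is a 0.6/0.4 blend" can be checked, here is a minimal least-squares sketch. It assumes the released tensor `r` and the two candidate parents `a` and `b` are available as flat lists of floats; `fit_blend` is a hypothetical helper, and this is not the statistical test used in the linked issue, just the simplest version of the idea.

```python
from typing import List, Tuple


def fit_blend(r: List[float], a: List[float], b: List[float]) -> Tuple[float, float]:
    """Fit r ~= alpha * a + (1 - alpha) * b; return (alpha, residual norm)."""
    # Rearranged: (r - b) = alpha * (a - b), a one-parameter least-squares fit.
    d = [x - y for x, y in zip(a, b)]
    t = [x - y for x, y in zip(r, b)]
    denom = sum(v * v for v in d)
    if denom == 0.0:
        raise ValueError("a and b are identical; alpha is undetermined")
    alpha = sum(u * v for u, v in zip(t, d)) / denom
    residual = sum((u - alpha * v) ** 2 for u, v in zip(t, d)) ** 0.5
    return alpha, residual


# Toy tensors with made-up values.
a = [1.0, 2.0, 3.0, -1.0]
b = [0.0, 1.0, 5.0, 2.0]
blend = [0.6 * x + 0.4 * y for x, y in zip(a, b)]
unrelated = [1.0, 0.0, 0.0, 0.0]

alpha_blend, res_blend = fit_blend(blend, a, b)
alpha_other, res_other = fit_blend(unrelated, a, b)
print(alpha_blend, res_blend)
print(alpha_other, res_other)
```

A true interpolation gives a residual near zero and the same alpha for every tensor in the model; an independently trained fine-tune would be expected to leave a clearly nonzero residual.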
