frontpage.

Qwen3-Coder-Next

https://qwen.ai/blog?id=qwen3-coder-next
221•danielhanchen•1h ago•106 comments

Agent Skills

https://agentskills.io/home
213•mooreds•3h ago•142 comments

Deno Sandbox

https://deno.com/blog/introducing-deno-sandbox
25•johnspurlock•21m ago•2 comments

Prek: A better, faster, drop-in pre-commit replacement, engineered in Rust

https://github.com/j178/prek
56•fortuitous-frog•1h ago•25 comments

What's up with all those equals signs anyway?

https://lars.ingebrigtsen.no/2026/02/02/whats-up-with-all-those-equals-signs-anyway/
463•todsacerdoti•8h ago•134 comments

France dumps Zoom and Teams as Europe seeks digital autonomy from the US

https://apnews.com/article/europe-digital-sovereignty-big-tech-9f5388b68a0648514cebc8d92f682060
110•AareyBaba•1h ago•35 comments

Defining Safe Hardware Design [pdf]

https://people.csail.mit.edu/rachit/files/pubs/safe-hdls.pdf
9•rachitnigam•43m ago•0 comments

Kilobyte is precisely 1000 bytes

https://waspdev.com/articles/2026-01-11/kilobyte-is-1000-bytes
12•surprisetalk•1h ago•22 comments

Heritability of intrinsic human life span is about 50%

https://www.science.org/doi/10.1126/science.adz1187
79•XzetaU8•2d ago•50 comments

Bunny Database

https://bunny.net/blog/meet-bunny-database-the-sql-service-that-just-works/
133•dabinat•5h ago•63 comments

The Everdeck: A Universal Card System (2019)

https://thewrongtools.wordpress.com/2019/10/10/the-everdeck/
42•surprisetalk•6d ago•11 comments

Launch HN: Modelence (YC S25) – App Builder with TypeScript / MongoDB Framework

13•eduardpi•1h ago•4 comments

Show HN: difi – A Git diff TUI with Neovim integration (written in Go)

https://github.com/oug-t/difi
36•oug-t•4h ago•37 comments

Show HN: Sandboxing untrusted code using WebAssembly

https://github.com/mavdol/capsule
34•mavdol04•3h ago•15 comments

Show HN: C discrete event SIM w stackful coroutines runs 45x faster than SimPy

https://github.com/ambonvik/cimba
6•ambonvik•1h ago•2 comments

Migrate Wizard – IMAP Based Email Migration Tool

https://migratewizard.com/#features
6•techstuff123•35m ago•3 comments

Floppinux – An Embedded Linux on a Single Floppy, 2025 Edition

https://krzysztofjankowski.com/floppinux/floppinux-2025.html
216•GalaxySnail•13h ago•140 comments

Emerge Career (YC S22) is hiring a product designer

https://www.ycombinator.com/companies/emerge-career/jobs/omqT34S-founding-product-designer
1•gabesaruhashi•5h ago

Show HN: Octosphere, a tool to decentralise scientific publishing

https://octosphere.social/
6•crimsoneer•43m ago•5 comments

Tadpole – A modular and extensible DSL built for web scraping

https://tadpolehq.com/
6•zachperkitny•1h ago•1 comment

Data Brokers Can Fuel Violence Against Public Servants

https://www.wired.com/story/how-data-brokers-can-fuel-violence-against-public-servants/
58•achristmascarl•2h ago•21 comments

The next steps for Airbus' big bet on open rotor engines

https://aerospaceamerica.aiaa.org/the-next-steps-for-airbus-big-bet-on-open-rotor-engines/
10•CGMthrowaway•2h ago•5 comments

Banning lead in gas worked. The proof is in our hair

https://attheu.utah.edu/health-medicine/banning-lead-in-gas-worked-the-proof-is-in-our-hair/
249•geox•16h ago•181 comments

Anki ownership transferred to AnkiHub

https://forums.ankiweb.net/t/ankis-growing-up/68610
518•trms•21h ago•207 comments

Show HN: Safe-now.live – Ultra-light emergency info site (<10KB)

https://safe-now.live
130•tinuviel•8h ago•58 comments

Athena Parthenos: A Reconstruction (2000)

http://www.goddess-athena.org/Museum/Sculptures/Alone/Parthenos_reconstruction_x.htm
11•joebig•4d ago•0 comments

X offices raided. French prosecutors investigate child abuse images & deepfakes

https://apnews.com/article/france-x-investigation-seach-elon-musk-1116be84d84201011219086ecfd4e0bc
24•labrador•1h ago•6 comments

Archive.today is directing a DDoS attack against my blog?

https://gyrovague.com/2026/02/01/archive-today-is-directing-a-ddos-attack-against-my-blog/
300•gyrovague-com•2d ago•126 comments

How does misalignment scale with model intelligence and task complexity?

https://alignment.anthropic.com/2026/hot-mess-of-ai/
228•salkahfi•17h ago•71 comments

GitHub experiences various partial outages/degradations

https://www.githubstatus.com?todayis=2026-02-02
254•bhouston•20h ago•96 comments

Qwen3-Coder-Next

https://qwen.ai/blog?id=qwen3-coder-next
218•danielhanchen•1h ago

Comments

danielhanchen•1h ago
For those interested, made some Dynamic Unsloth GGUFs for local deployment at https://huggingface.co/unsloth/Qwen3-Coder-Next-GGUF and made a guide on using Claude Code / Codex locally: https://unsloth.ai/docs/models/qwen3-coder-next
ranger_danger•1h ago
What is the difference between the UD and non-UD files?
danielhanchen•1h ago
UD stands for "Unsloth-Dynamic", which upcasts important layers to higher bits. Non-UD quants are just standard llama.cpp quants. Both still use our calibration dataset.
CamperBob2•30m ago
Please consider authoring a single, straightforward introductory-level page somewhere that explains what all the filename components mean, and who should use which variants.

The green/yellow/red indicators for different levels of hardware support are really helpful, but far from enough IMO.

danielhanchen•14m ago
Oh good idea! UD-Q4_K_XL (Unsloth Dynamic 4-bit, Extra Large) is what I generally recommend for most hardware - MXFP4_MOE is also OK
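
For illustration, a minimal sketch of serving that quant locally via llama.cpp's llama-server, launched from Bun (TypeScript). The -hf shorthand and flag names follow llama.cpp's CLI as I understand it and may differ by version; treat this as an assumption, not a recipe:

    // Sketch: serve the UD-Q4_K_XL quant mentioned above via llama.cpp's
    // llama-server. Flag names follow llama.cpp's CLI as of recent versions
    // and may differ on yours.
    const server = Bun.spawn([
      "llama-server",
      "-hf", "unsloth/Qwen3-Coder-Next-GGUF:UD-Q4_K_XL", // fetches from Hugging Face
      "--ctx-size", "65536", // 64k context, roughly what agentic coding needs
      "--port", "8080",
    ], { stdout: "inherit", stderr: "inherit" });

    console.log(`llama-server (pid ${server.pid}) on http://localhost:8080`);
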
binsquare•1h ago
How did you do it so fast?

Great work as always btw!

danielhanchen•13m ago
Thanks! :) We're early access partners with them!
simonw•1h ago
This GGUF is 48.4GB - https://huggingface.co/Qwen/Qwen3-Coder-Next-GGUF/tree/main/... - which should be usable on higher end laptops.

I still haven't experienced a local model that fits on my 64GB MacBook Pro and can run a coding agent like Codex CLI or Claude Code well enough to be useful.

Maybe this will be the one? This Unsloth guide from a sibling comment suggests it might be: https://unsloth.ai/docs/models/qwen3-coder-next

vessenes•1h ago
I'm thinking the next step would be to include this as a 'junior dev' and let Opus farm simple stuff out to it. It could be local, but if it's on Cerebras, it could be realllly fast.
ttoinou•1h ago
Cerebras already has GLM 4.7 in the code plans
vessenes•1h ago
Yep. But this is like 10x faster; 3B active parameters.
ttoinou•1h ago
Cerebras is already 200-800 tps, do you need even faster?
overfeed•27m ago
Yes! I don't try to read agent tokens as they are generated, so if code generation decreases from 1 minute to 6 seconds, I'll be delighted. I'll even accept 10s -> 1s speedups. Considering how often I've seen agents spin wheels with different approaches, faster is always better, until models can 1-shot solutions without the repeated "No, wait..." / "Actually..." thinking loops
danielhanchen•1h ago
It works reasonably well for general tasks, so we're definitely getting there! Probably Qwen3 CLI might be better suited, but haven't tested it yet.
1dom•1h ago
I run Qwen3-Coder-30B-A3B-Instruct gguf on a VM with 13gb RAM and a 6gb RTX 2060 mobile GPU passed through to it with ik_llama, and I would describe it as usable, at least. It's running on an old (5 years, maybe more) Razer Blade laptop that has a broken display and 16gb RAM.

I use opencode and have done a few toy projects and little changes in small repositories and can get pretty speedy and stable experience up to a 64k context.

It would probably fall apart if I wanted to use it on larger projects, but I've often set tasks running on it, stepped away for an hour, and had a solution when I returned. It's definitely useful for smaller projects: scaffolding, basic bug fixes, extra UI tweaks, etc.

I don't think "usable" is a binary thing though. I know you write a lot about this, but it'd be interesting to understand what you're asking the local models to do, and what it is about what they do that you consider unusable on a relative monster of a laptop?

regularfry•6m ago
I've had usable results with qwen3:30b, for what I was doing. There's definitely a knack to breaking the problem down enough for it.

What's interesting to me about this model is how good it allegedly is with no thinking mode. That's my main complaint about qwen3:30b, how verbose its reasoning is. For the size it's astonishing otherwise.

embedding-shape•1h ago
> I still haven't experienced a local model that fits on my 64GB MacBook Pro and can run a coding agent like Codex CLI or Claude code well enough to be useful

I've had mild success with GPT-OSS-120b (MXFP4, ends up taking ~66GB of VRAM for me with llama.cpp) and Codex.

I'm wondering if one could crowdsource chat logs for GPT-OSS-120b running with Codex, then seed another post-training run to fine-tune the 20b variant with the good runs from 120b, and whether that'd make a big difference. Both models with the reasoning_effort set to high are actually quite good compared to other downloadable models, although the 120b is just about out of reach for 64GB, so getting the 20b better for specific use cases seems like it'd be useful.
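
A rough sketch of what that curation step could look like. The log file, its shape, and the testsPassed quality flag are all hypothetical, not a real Codex log format:

    // Hypothetical: filter "good" GPT-OSS-120b/Codex session logs into a
    // JSONL fine-tuning set for the 20b variant.
    type Turn = { role: "system" | "user" | "assistant"; content: string };
    type Session = { messages: Turn[]; testsPassed: boolean };

    const sessions: Session[] = await Bun.file("codex-logs.json").json();

    const lines = sessions
      .filter((s) => s.testsPassed) // crude proxy for a "good run"
      .map((s) => JSON.stringify({ messages: s.messages }));

    await Bun.write("finetune-20b.jsonl", lines.join("\n") + "\n");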

gigatexal•1h ago
I have a 128GB M3 Max MacBook Pro. Running the GPT-OSS model on it via LM Studio, once the context gets large enough the fans spin to 100% and it's unbearable.
pixelpoet•13m ago
Laptops are fundamentally a poor form factor for high performance computing.
dust42•47m ago
Unfortunately Qwen3-Next is not well supported on Apple silicon; it seems the Qwen team doesn't really care about Apple.

On an M1 64GB, Q4_K_M on llama.cpp gives only 20 tok/s, while on MLX it is more than twice as fast. However, MLX has problems with KV cache consistency, especially with branching. So while in theory it is twice as fast as llama.cpp, it often redoes the prompt processing all over again, which completely trashes performance, especially with agentic coding.

So the agony is deciding whether to endure half the possible speed but get much better KV caching in return, or to have twice the speed but then often have to sit through prompt processing again.

But who knows, maybe Qwen gives them a hand? (hint, hint)

ttoinou•41m ago
I can run nightmedia/qwen3-next-80b-a3b-instruct-mlx at 60-74 tps using LM Studio. What did you try? What benefit do you get from KV caching?
dust42•23m ago
KV caching means that when you have a 10k-token prompt, all follow-up questions return immediately - this is standard with all inference engines.

Now if you are not happy with the last answer, you may want to simply regenerate it or change your last question - this is branching of the conversation. Llama.cpp is capable of re-using the KV cache up to that point while MLX does not (I am using the MLX server from the MLX community project). I haven't tried with LM Studio. Maybe worth a try, thanks for the heads-up.
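
For concreteness, roughly what that branching looks like against llama.cpp's HTTP server; cache_prompt asks it to reuse the KV cache for the shared prefix. Endpoint and field names follow llama.cpp's server API as I understand it; verify against your version:

    // Sketch of conversation branching with llama.cpp's server. With
    // cache_prompt: true the server reuses the KV cache for the shared
    // prefix, so only the new suffix gets prompt-processed.
    async function complete(prompt: string): Promise<string> {
      const res = await fetch("http://localhost:8080/completion", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ prompt, n_predict: 256, cache_prompt: true }),
      });
      const data = (await res.json()) as { content: string };
      return data.content;
    }

    const prefix = "<the 10k-token conversation so far>";
    const first = await complete(prefix + "\nQ: first attempt");
    // Branch: same prefix, different follow-up. An engine without prefix
    // reuse would reprocess all 10k tokens here.
    const retry = await complete(prefix + "\nQ: rephrased question");
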

dehrmann•45m ago
I wonder if the future in ~5 years is almost all local models? High-end computers and GPUs can already do it for decent models, but not SOTA models. 5 years is enough time to ramp up memory production, for consumers to level up their hardware, and for models to optimize down to lower-end hardware while still being really good.
manbitesdog•30m ago
Plus a long queue of yet-undiscovered architectural improvements
infinitezest•26m ago
A lot of manufacturers are bailing on consumer lines to focus on enterprise from what I've read. Not great.
regularfry•4m ago
Even without leveling up hardware, 5 years is a loooong time to squeeze the juice out of lower-end model capability. Although in this specific niche we do seem to be leaning on Qwen a lot.
organsnyder•12m ago
They run fairly well for me on my 128GB Framework Desktop.
vessenes•1h ago
3B active parameters, and slightly worse than GLM 4.7 on benchmarks. That's pretty amazing! With better orchestration tools being deployed, I've been wondering if faster, dumber coding agents paired with wise orchestrators might be overall faster than using, say, Opus 4.5 on the bottom for coding. At least we might want to delegate simple tasks to these guys.
doctorpangloss•1h ago
Time will tell. All this stuff will get more adoption when Anthropic, Google and OpenAI raise prices.
Alifatisk•20m ago
They can only raise prices as long as people buy their subscriptions / pay for their API. The Chinese labs are closing in on the SOTA models (I would say they are already there) and offer insanely cheap prices for their subscriptions. Vote with your wallet.
markab21•1h ago
It's getting a lot easier to do this using sub-agents with tools in Claude. I have a fleet of Mastra agents (TypeScript). I use those agents inside my project as CLI tools to do repetitive tasks that gobble tokens such as scanning code, web search, library search, and even SourceGraph traversal.

Overall, it's allowed me to maintain more consistent workflows as I'm less dependent on Opus. Now that Mastra has introduced the concept of Workspaces, which allow for more agentic development, this approach has become even more powerful.
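
As a sketch of that pattern (Mastra specifics omitted; the endpoint, model name, and task here are placeholders), a token-hungry chore wrapped as a CLI the main agent can shell out to might look like:

    // Hypothetical sub-agent CLI: scan files matching a glob and have a cheap
    // OpenAI-compatible endpoint summarize them, printing to stdout for the
    // orchestrating agent to read.
    const pattern = Bun.argv[2] ?? "src/**/*.ts";
    const glob = new Bun.Glob(pattern);

    let digest = "";
    for await (const path of glob.scan(".")) {
      const text = await Bun.file(path).text();
      digest += `--- ${path} ---\n${text.slice(0, 2000)}\n`; // cap per-file size
    }

    const res = await fetch("http://localhost:8080/v1/chat/completions", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: "qwen3-coder-next",
        messages: [{ role: "user", content: `Summarize per file:\n${digest}` }],
      }),
    });
    const data = (await res.json()) as {
      choices: { message: { content: string } }[];
    };
    console.log(data.choices[0].message.content);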

solumunus•37m ago
Are you just exposing mastra cli commands to Claude Code in md context? I’d love you to elaborate on this if you have time.
IhateAI•5m ago
Do you actually ship anything is the question, or is this all just (expensive) magic tricks you're performing on yourself? Genuine question.
endymion-light•1h ago
Looks great - I'll try to check it out on my gaming PC.

On a misc note: What's being used to create the screen recordings? It looks so smooth!

zamadatix•1h ago
Can anyone help me understand the "Number of Agent Turns" vs "SWE-Bench Pro (%)" figure? I.e. what does the spread of Qwen3-Coder-Next from ~50 to ~280 agent turns represent for a fixed score of 44.3%: that sometimes it takes that spread of agent turns to achieve said fixed score for the given model?
edude03•1h ago
Essentially, the more turns you have, the more likely the agent is to fail, since the error compounds per turn. Agentic models are tuned for "long horizon tasks", i.e. being able to go many, many turns on the same problem without failing.
zamadatix•1h ago
Much appreciated, but I mean more around "what do the error bars in the figure represent" than what the turn scaling itself is.
esafak•51m ago
For the tasks in SWE-Bench Pro they obtained a distribution of agent turns, summarized as the box plot. The box likely describes the inter-quartile range while the whiskers describe some other range. You'd have to read their report to be sure. https://en.wikipedia.org/wiki/Box_plot
jsnell•51m ago
That's a box plot, so those are not error bars but a visualization of the distribution of a metric (min, max, median, 25th percentile, 75th percentile).

The benchmark consists of a bunch of tasks. The chart shows the distribution of the number of turns taken over all those tasks.

yorwba•50m ago
SWE-Bench Pro consists of 1865 tasks. https://arxiv.org/abs/2509.16941 Qwen3-Coder-Next solved 44.3% (826 or 827) of these tasks. To solve a single task, it took between ≈50 and ≈280 agent turns, ≈150 on average. In other words, a single pass through the dataset took ≈280000 agent turns. Kimi-K2.5 solved ≈84 fewer tasks, but also only took about a third as many agent turns.
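
In other words, the figure is just the five-number summary of per-task turn counts, something like the following (turn counts invented for illustration):

    // Five-number summary behind a box plot: min, Q1, median, Q3, max over
    // per-task agent-turn counts. The sample numbers here are made up.
    function quantile(sorted: number[], q: number): number {
      const pos = (sorted.length - 1) * q;
      const lo = Math.floor(pos);
      const hi = Math.ceil(pos);
      return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
    }

    const turnsPerTask = [52, 90, 140, 150, 155, 210, 278]; // hypothetical
    const sorted = [...turnsPerTask].sort((a, b) => a - b);

    console.log({
      min: sorted[0],
      q1: quantile(sorted, 0.25),
      median: quantile(sorted, 0.5),
      q3: quantile(sorted, 0.75),
      max: sorted[sorted.length - 1],
    });
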
throwaw12•1h ago
We are getting there, as a next step please release something to outperform Opus 4.5 and GPT 5.2 in coding tasks
gordonhart•1h ago
By the time that happens, Opus 5 and GPT-5.5 will be out. At that point will a GPT-5.2 tier open-weights model feel "good enough"? Based on my experience with frontier models, once you get a taste of the latest and greatest it's very hard to go back to a less capable model, even if that less capable model would have been SOTA 9 months ago.
tosh•1h ago
It feels like the gap between open weight and closed weight models is closing though.
theshrike79•51m ago
More like open local models are becoming "good enough".

I got stuff done with Sonnet 3.7 just fine; it did need a bunch of babysitting, but it was still a net positive for productivity. Now local models are at that level, closing in on the current SOTA.

When "anyone" can run an Opus 4.5 level model at home, we're going to be getting diminishing returns from closed online-only models.

cirrusfan•1h ago
I think it depends on what you use it for. Coding, where time is money? You probably want the Good Shit, but also want decent open-weights models to keep prices sane rather than sama's 20k/month nonsense. Something like basic sentiment analysis? You can get good results out of a 30B MoE that runs at a good pace on a midrange laptop. Researching things online with many sources and decent results I'd expect to be doable locally by the end of 2026 if you have 128GB RAM, although it'll take a while to resolve.
bwestergard•51m ago
What does it mean for U.S. AI firms if the new equilibrium is devs running open models on local hardware?
selectodude•42m ago
OpenAI isn’t cornering the market on DRAM for kicks…
rglullis•44m ago
I'm going in the opposite direction: with each new model, I try harder to optimize my existing workflows by breaking tasks down so that I can delegate them to the less powerful models, and only rely on the newer ones if the results are not acceptable.
yorwba•38m ago
When Alibaba succeeds at producing a GPT-5.2-equivalent model, they won't be releasing the weights. They'll only offer API access, like for the previous models in the Qwen Max series.

Don't forget that they want to make money in the end. They release small models for free because the publicity is worth more than they could charge for them, but they won't just give away models that are good enough that people would pay significant amounts of money to use them.

thepasch•17m ago
If an open weights model is released that’s as capable at coding as Opus 4.5, then there’s very little reason not to offload the actual writing of code to open weight subagents running locally and stick strictly to planning with Opus 5. Could get you masses more usage out of your plan (or cut down on API costs).
Keyframe•19m ago
I'd be happy with something that's close to or the same as Opus 4.5 that I can run locally, at a reasonable (the same) speed as the Claude CLI, and at a reasonable budget (within $10-30k).
segmondy•5m ago
Try KimiK2.5 and DeepSeekv3.2-Speciale
IhateAI•4m ago
Just code it yourself, you might surprise yourself :)
skhameneh•1h ago
It's hard to overstate just how wild this model might be if it performs as claimed. The claims are that it can perform close to Sonnet 4.5 for assisted coding (SWE-Bench) while using only 3B active parameters. This is obscenely small for the claimed performance.
cirrusfan•1h ago
If it sounds too good to be true…
theshrike79•53m ago
Should be possible with optimised models, just drop all "generic" stuff and focus on coding performance.

There's no reason for a coding model to contain all of ao3 and wikipedia =)

noveltyaccount•26m ago
I think I like coding models that know a lot about the world. They can disambiguate my requirements and build better products.
regularfry•3m ago
I generally prefer a coding model that can google for the docs, but separate models for /plan and /build is also a thing.
MarsIronPI•20m ago
But... but... I need my coding model to be able to write fanfiction in the comments...
alexellisuk•1h ago
Is this going to need 1x or 2x of those RTX PRO 6000s to allow for a decent KV for an active context length of 64-100k?

It's one thing running the model without any context, but coding agents build it up close to the max and that slows down generation massively in my experience.

segmondy•3m ago
One 6000 should be fine; the Q6_K_XL GGUF will be almost on par with the raw weights and should let you have 128k-256k context.
Soerensen•1h ago
The agent orchestration point from vessenes is interesting - using faster, smaller models for routine tasks while reserving frontier models for complex reasoning.

In practice, I've found the economics work like this:

1. Code generation (boilerplate, tests, migrations) - smaller models are fine, and latency matters more than peak capability

2. Architecture decisions, debugging subtle issues - worth the cost of frontier models

3. Refactoring existing code - the model needs to "understand" before changing, so context and reasoning matter more

The 3B active parameters claim is the key unlock here. If this actually runs well on consumer hardware with reasonable context windows, it becomes the obvious choice for category 1 tasks. The question is whether the SWE-Bench numbers hold up for real-world "agent turn" scenarios where you're doing hundreds of small operations.
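
A toy sketch of that routing split between the three categories above. The model names and endpoints are placeholders, not real products:

    // Toy router: category 1 goes to a small local model, categories 2 and 3
    // to a frontier model behind a hosted API.
    type Task = { kind: "boilerplate" | "architecture" | "refactor"; prompt: string };

    function pickModel(task: Task): { baseUrl: string; model: string } {
      if (task.kind === "boilerplate") {
        return { baseUrl: "http://localhost:8080/v1", model: "qwen3-coder-next" };
      }
      return { baseUrl: "https://api.example.com/v1", model: "frontier-model" };
    }

    console.log(pickModel({ kind: "boilerplate", prompt: "write a migration" }));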

cirrusfan•1h ago
I find it really surprising that you're fine with low-end models for coding - I went through a lot of open-weights models, local and "local", and I consistently found the results underwhelming. GLM-4.7 was the smallest model I found to be somewhat reliable, but that's a sizable 350B and stretches the definition of local-as-in-at-home.
NitpickLawyer•51m ago
You're replying to a bot, fyi :)
CamperBob2•6m ago
If it weren't for the single em-dash (really an en-dash, used as if it were an em-dash), how am I supposed to know that?

And at the end of the day, does it matter?

syntaxing•1h ago
Is the Qwen3-Next architecture ironed out in llama.cpp?
orliesaurus•53m ago
how can anyone keep up with all these releases... what's next? Sonnet 5?
Squarex•39m ago
Well there are rumors sonnet 5 is coming today, so...
gessha•35m ago
Tune it out, come back in 6 months, the world is not going to end. In 6 months, you’re going to change your API endpoint and/or your subscription and then spend a day or two adjusting. Off to the races you go.
cedws•50m ago
I kind of lost interest in local models. Then Anthropic started saying I’m not allowed to use my Claude Code subscription with my preferred tools and it reminded me why we need to support open tools and models. I’ve cancelled my CC subscription, I’m not paying to support anticompetitive behaviour.
wahnfrieden•48m ago
OpenAI committed to allowing it btw. I don't know why Anthropic gets so much love here
jmathai•46m ago
Probably because the alternatives are OpenAI, Google, Meta. Not throwing shade at those companies but it's not hard to win the hearts of developers when that's your competition.
cedws•45m ago
Thanks, I’ll try out Codex to bridge until local models get to the level I need.
rustyhancock•44m ago
Cause they make the best coding model.

It's that simple. Everyone else is trying to compete in other ways, and Anthropic is pushing to dominate the market.

They'll eventually lose their performance edge, and suddenly they'll be back to being cute and fluffy.

I've cancelled a Claude sub, but still have one.

bheadmaster•33m ago
Agreed.

I've tried all of the models available right now, and Claude Opus is by far the most capable.

I had an assertion triggered in a fairly complex open-source C library I was using, and Claude Opus not only found the cause, but wrote a self-contained reproduction I could add to a GitHub issue. It also added tests for that issue and fixed the underlying problem.

I am sincerely impressed by the capabilities of Claude Opus. Too bad its usage is so expensive.

varispeed•27m ago
On the other hand, I feel like 5.2 is getting progressively dumbed down. It used to work well, but now the initial few prompts go in the right direction and then it goes off the rails, reminding me more of GPT-3.5.

I wonder what they are up to.

tomashubelbauer•36m ago
Anthropic banned my account when I whipped up a solution to control Claude Code running on my Mac from my phone when I'm out and about. No commercial angle, just a tool I made for myself since they wouldn't ship this feature (and still haven't). I wasn't their biggest fanboy to begin with, but it gave me the kick in the butt needed to go and explore alternatives until local models get good enough that I don't need to use hosted models altogether.
RationPhantoms•29m ago
There is a weaponized malaise employed by these frontier model providers, and I feel like that dark pattern, the one you pointed out, and others are used to rate-limit certain subscriptions.
bri3d•17m ago
They have two products:

* Subscription plans, which are (probably) subsidized and definitely oversubscribed (i.e., 100% of subscribers could not use 100% of their tokens 100% of the time).

* Wholesale tokens, which are (probably) profitable.

If you try to use one product as the other product, it breaks their assumptions and business model.

I don't really see how this is weaponized malaise; capacity planning and some form of over-subscription is a widely accepted thing in every industry and product in the universe?

darkwater•28m ago
I control it with SSH and sometimes tmux (and termux+wireguard leads to a surprisingly stable connection in general). Why did you need more than that?
tomashubelbauer•24m ago
I didn't like the existing SSH applications for iOS and I already have a local app that I made that I have open 24/7, so I added a screen that used xterm.js and Bun.spawn with Bun.Terminal to mirror the process running on my Mac to my phone. This let me add a few bells and whistles that a generic SSH client wouldn't have, like notifications when Claude Code was done working etc.
redblacktree•15m ago
How did this work? The ban, I mean. Did you just wake up to an email and find that your creds no longer worked? Were you doing things to sub-process out to the Claude Code CLI, or something else?
tomashubelbauer•11m ago
I left a sibling comment detailing the technical side of things. I used the `Bun.spawn` API with the `terminal` key to give CC a PTY and mirrored it to my phone with xterm.js. I used SSE to stream CC data to xterm.js and a regular request to send commands out from my phone. In my mind, this is no different than using CC via SSH from my phone - I was still bound by the same limits and wasn't trying to bypass them, Anthropic is entitled to their different opinion of course.

And yeah, I got three (for some reason) emails titled "Your account has been suspended" whose content said "An internal investigation of suspicious signals associated with your account indicates a violation of our Usage Policy. As a result, we have revoked your access to Claude.". There is a link to a Google Form which I filled out, but I don't expect to hear back.

I did nothing even remotely suspicious with my Anthropic subscription so I am reasonably sure this mirroring is what got me banned.

Edit: BTW I have since iterated on doing the same mirroring using OpenCode with Codex, then Codex with Codex and now Pi with GPT-5.2 (non-Codex) and OpenAI hasn't banned me yet and I don't think they will as they decided to explicitly support using your subscription with third party coding agents following Anthropic's crackdown on OpenCode.
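
A rough sketch of the mirroring setup described above. The `terminal` spawn option is taken from the parent comments rather than verified against Bun's docs, and the SSE wiring is an assumption:

    // Spawn Claude Code in a PTY and stream its output as SSE for an
    // xterm.js client on the phone. The terminal option follows the parent
    // comments' description of Bun.spawn and may differ across Bun versions.
    const cc = Bun.spawn(["claude"], { terminal: { cols: 120, rows: 40 } } as any);

    Bun.serve({
      port: 3000,
      fetch() {
        // GET /stream -> text/event-stream that the phone-side xterm.js renders
        const decoder = new TextDecoder();
        return new Response(
          new ReadableStream({
            async start(controller) {
              for await (const chunk of cc.stdout) {
                controller.enqueue(`data: ${JSON.stringify(decoder.decode(chunk))}\n\n`);
              }
              controller.close();
            },
          }),
          { headers: { "Content-Type": "text/event-stream" } },
        );
      },
    });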

giancarlostoro•29m ago
I do wonder if they locked things down due to people abusing their CC token.
whywhywhywhy•20m ago
Taking umbrage over how I use the compute I'm paying for, via the harness they want me to use it within, as long as I'm just doing personal tasks for myself and not trying to power an app's API with it, seems like such a waste of their time to focus on, and it only causes brand perception damage with their customers.

Could have just turned a blind eye.

echelon•18m ago
Doesn't matter. Nobody should have that power.

If a company is going to automate our jobs, we shouldn't be giving them money and data to do so. They're using us to put ourselves out of work, and they're not giving us the keys.

I'm fine with non-local, open weights models. Not everything has to run on a local GPU, but it has to be something we can own.

I'd like a large, non-local Qwen3-Coder that I can launch in a RunPod or similar instance. I think on-demand non-local cloud compute can serve as a middle ground.

cedws•18m ago
In what way would it be abused? The usage limits apply all the same, they aren't client side, and hitting that limit is within the terms of the agreement with Anthropic.
bri3d•16m ago
The subscription services have assumptions baked in about usage patterns; they're oversubscribed and subsidized. If 100% of subscription customers use 100% of their tokens 100% of the time, their business model breaks. That's what wholesale / API tokens are for.

> hitting that limit is within the terms of the agreement with Anthropic

It's not, because the agreement says you can only use CC.

Nemi•5m ago
> The subscription services have assumptions baked in about the usage patterns; they're oversubscribed and subsidized.

Selling dollars for $.50 does that. It sounds like they have a business model issue to me.

behnamoh•5m ago
> It's not, because the agreement says you can only use CC.

it's like Apple: you can use macOS only on our Macs, iOS only on iPhones, etc. but at least in the case of Apple, you pay (mostly) for the hardware while the software it comes with is "free" (as in free beer).

CamperBob2•16m ago
How do I "abuse" a token? I pass it to their API, the request executes, a response is returned, I get billed for it. That should be the end of the conversation.
bri3d•13m ago
You can buy this product, right here: https://platform.claude.com/docs/en/about-claude/pricing

That's not the product you're buying when you use a Claude Code token, though.

disiplus•5m ago
I'm downloading it as we speak to try to run it on a 32GB 5090 + 128GB DDR5. I will compare it to GLM-4.7-Flash, which was my local model of choice.
ossicones•48m ago
What browser use agent are they using here?
valcron1000•32m ago
Still nothing to compete with GPT-OSS-20B for local use with 16GB of VRAM.
storus•22m ago
Does Qwen3 allow adjusting context during an LLM call, or does the housekeeping need to be done before/after each call but not while a single LLM call with multiple tool calls is in progress?
Alifatisk•19m ago
As always, the Qwen team is pushing out fantastic content

Hope they update the model page soon https://chat.qwen.ai/settings/model

ionwake•16m ago
Will this run on an Apple M4 Air with 32GB RAM?

I'm currently using Qwen 2.5 16B, and it works really well.

Robdel12•2m ago
I really really want local or self hosted models to work. But my experience is they’re not really even close to the closed paid models.

Does anyone any experience with these and is this release actually workable in practice?