Claude Is Down

https://status.claude.com/incidents/tgtw1sqs9ths

68•agrocrag•3mo ago

Comments

bashy•3mo ago

Yeah, getting this;

API Error: 529 {"type":"error","error":{"type":"overloaded_error","message":"Overloaded"},"request_id":null}
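For anyone scripting against the API, an overloaded response like the 529 above tends to be temporary, so one common reaction is to retry with exponential backoff and jitter. This is a minimal sketch, assuming a hypothetical `send` callable that returns a `(status, body)` tuple; `call_with_backoff`, `RETRYABLE`, and the delay values are illustrative names and numbers, not part of any SDK.

```python
import random
import time

# Which status codes count as retryable is an assumption of this sketch:
# 529 is the overloaded error quoted above; 429 and 503 are common companions.
RETRYABLE = {429, 503, 529}


def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send is a hypothetical zero-argument callable returning (status, body).
    The last (status, body) seen is returned either way.
    """
    status, body = None, None
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        if attempt == max_attempts - 1:
            break  # out of attempts, do not sleep again
        # Exponential backoff capped at max_delay, with full jitter.
        delay = min(max_delay, base_delay * (2 ** attempt))
        sleep(random.uniform(0, delay))
    return status, body
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting; in normal use the default `time.sleep` applies.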

starf1sh•3mo ago

Better start catching up with latest developments on HN

_andrei_•3mo ago

what are we gonna dooo?

golbez9•3mo ago

It's over!

bitwize•3mo ago

(in Homestar Runner voice) The good times awe ovew!

sam1r•3mo ago

everyone, switch to open ai for 50% off. today only!

oersted•3mo ago

OpenAI's track record has been rather poor this month as well actually, look at all the yellows and reds: https://status.openai.com/

sam1r•3mo ago

Oh wow, I actually had no idea. It would be super nice to see all the AI API's statii on a single page.

Is that too much to ask for in 2025?

StarlaAtNight•3mo ago

If you build it, they will come

sebastiennight•3mo ago

> all the AI API's statii

The Latin plural of "status", in the accusative form, would actually be "status" as well.

Something like

  omnes status intelligentiae artificialis in eadem pagina videre amem.

oersted•3mo ago

Life of Brian :)

https://youtu.be/DdqXT9k-050?si=L5ymXl-fYe7Fjqye

sebastiennight•3mo ago

Amazing. This movie is a treasure. Maybe one day historians will consider it to be canon to the Roman Empire and the birth of Christianity.

garrettjoecox•3mo ago

Not intending to defend OpenAI here, but their MAU (800 million) does dwarf most other AI companies, Anthropic included. I do not envy the engineers there working on scaling.

moralestapia•3mo ago

Would you do 9-9-6 if your comp. is 8-9 figures/year?

garrettjoecox•3mo ago

absolutely not. I have a life and hobbies

moralestapia•3mo ago

Not everyone is fortunate enough to not need a job to sustain themselves.

OisinMoran•3mo ago

Not sure MAU is the best metric here. I was recently surprised to find out their revenues are actually kind of close, 12B vs 7B, so maybe they're closer (than could be fairly described as being dwarfed) in terms of token count?

kasperset•3mo ago

It reminds me of the early days of Twitter's fail whale.

xrd•3mo ago

This is why I asked this question yesterday:

"Ask HN: Why don't programming language foundations offer "smol" models?"

https://news.ycombinator.com/item?id=45840078

If I could run smol single language models myself, I would not have to worry.

XzAeRosho•3mo ago

The answer to most convenient solutions is money. There's no money in that.

jazzyjackson•3mo ago

And or, the lower parameter models are straight up less effective than the giants? Why is anyone paying for sonnet and opus if mixtral could do what they do?

xrd•3mo ago

But, for example, Zig as a language has prominent corporate support. And, Mitchell Hashimoto is incredibly active and a billionaire. It feels like this would be a rational way to expand the usage of a language.

xvector•3mo ago

No, it's because that's not how training an LLM works.

trvz•3mo ago

Have you even tried Qwen3-Coder-30B-A3B?

Balinares•3mo ago

Qwen3 Coder 30B A3B is shockingly capable for its parameter count, but I wouldn't overlook how much weight the words "for its parameter count" are carrying here.

xrd•3mo ago

I haven't. I will.

I wonder if you could ablate everything except for a specific language.

embedding-shape•3mo ago

> I wonder why I can't find a model that only does Python and is good only at that

I don't think it's that easy. The times I've trained my own tiny models on just one language (programming or otherwise), they tend to get worse results than the models I've trained where I've chucked in all the languages I had at hand, even when testing just for single languages.

It seems somewhat intuitive to me that it works like that too: programming in different (mainstream) languages is more similar than it's different (especially when 90% of all the source code is Algol-like), so it makes sense that there is a lot of cross-learning across languages.

acedTrex•3mo ago

because a smol model that any of the nonprofits could feasibly afford to train would be useless for actual code generation.

Hell, even the huge foundational models are still useless in most scenarios.

__0x01•3mo ago

The monster babbleth no more, sire.

spullara•3mo ago

On flights with shitty wifi I have been running gpt-oss:120b on my macbook using ollama. Ok model for coding if you can't reach a good one.

embedding-shape•3mo ago

GPT-OSS-120b/20b is probably the best you can run on your own hardware today. Be careful with the quantized versions though, as they're really horrible compared to the native MXFP4. I haven't looked in this particular case, but Ollama tends to hide their quantizations for some reason, so most people who could be running 20B with MXFP4, are still on Q8 and getting much worse results than they could.

throwaway314155•3mo ago

What’s the distinction between MXFP4 and Q8 exactly?

embedding-shape•3mo ago

It's a different way of doing quantization (https://huggingface.co/docs/transformers/en/quantization/mxf...) but I think the most important thing is that OpenAI delivered their own quantization (the MXFP4 from OpenAI/GPT-OSS on HuggingFace, guaranteed correct) whereas all the Q8 and other quantizations you see floating around are community efforts, with somewhat uneven results depending on who did it.

Concretely from my testing, both 20B and 120B have a much higher refusal rate with Q8 compared to MXFP4, and lower quality responses overall. But don't take my word for it: the 20B weights are tiny, and it's relatively effortless to try both versions and compare yourself.

throwaway314155•3mo ago

Wow, thanks for the info. I'm planning on testing this on my M4 Max w/ 36 GB today.

edit:

So looking here https://ollama.com/library/gpt-oss/tags it seems ollama doesn't even provide the MXFP4 variants, much less hide them.

Is the best way to run these variants via llama.cpp or...?

ode•3mo ago

LMStudio

throwaway314155•3mo ago

Can you be more specific? I've got LM Studio downloaded but it's not clear where the official OpenAI releases are. Are they all only available via transformers? The only one that shows up in search appears to be the distilled gpt-oss 20B...

spullara•3mo ago

on the model description page they claim they support it:

Quantization - MXFP4 format

OpenAI utilizes quantization to reduce the memory footprint of the gpt-oss models. The models are post-trained with quantization of the mixture-of-experts (MoE) weights to MXFP4 format, where the weights are quantized to 4.25 bits per parameter. The MoE weights are responsible for 90+% of the total parameter count, and quantizing these to MXFP4 enables the smaller model to run on systems with as little as 16GB memory, and the larger model to fit on a single 80GB GPU.

Ollama is supporting the MXFP4 format natively without additional quantizations or conversions. New kernels are developed for Ollama’s new engine to support the MXFP4 format.

Ollama collaborated with OpenAI to benchmark against their reference implementations to ensure Ollama’s implementations have the same quality.
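The 4.25 bits per parameter in the quoted description follows from the MXFP4 block layout: 4-bit elements grouped in blocks of 32 that share one 8-bit scale, so 4 + 8/32 = 4.25. Below is a rough sketch of that arithmetic plus a weight-size estimate. The function names are made up for illustration, the 16 bits assumed for the non-MoE tensors is an assumption, and the parameter counts and MoE fraction are placeholders to swap for real numbers.

```python
def mxfp4_bits_per_weight(element_bits=4, scale_bits=8, block_size=32):
    """Effective bits per weight when each block of weights shares one scale."""
    return element_bits + scale_bits / block_size


def estimate_weight_gb(total_params, moe_fraction, other_bits=16):
    """Approximate weight size in GB (1e9 bytes).

    moe_fraction of the parameters is stored as MXFP4; the remainder is
    assumed to be stored at other_bits per parameter (16 for BF16).
    """
    moe_bits = total_params * moe_fraction * mxfp4_bits_per_weight()
    other = total_params * (1 - moe_fraction) * other_bits
    return (moe_bits + other) / 8 / 1e9


if __name__ == "__main__":
    print(mxfp4_bits_per_weight(), "bits per MXFP4 weight")
    # Nominal sizes and a 0.9 MoE fraction are placeholders, not measured values.
    for label, params in (("20B", 20e9), ("120B", 120e9)):
        print(label, round(estimate_weight_gb(params, moe_fraction=0.9), 1), "GB")
```

Since the description only says "90+%", a `moe_fraction` of 0.9 sits at the pessimistic end; a higher fraction shrinks the estimate.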

throwaway314155•3mo ago

Can you link to that page? I’m not finding these variants.

spullara•3mo ago

as far as I can tell that is the only variant.

https://ollama.com/library/gpt-oss

Patrick_Devine•3mo ago

The default ones on Ollama are MXFP4 for the feed forward network and use BF16 for the attention weights. The default weights for llama.cpp quantize those tensors as q8_0, which is why llama.cpp can eke out a little bit more performance at the cost of worse output. If you are using this for coding, you definitely want better output.

You can use the command `ollama show -v gpt-oss:120b` to see the datatype of each tensor.

spullara•3mo ago

they support that format according to the model page on their site:

https://ollama.com/library/gpt-oss

jmorgan•3mo ago

The gpt-oss weights on Ollama are native mxfp4 (the same weights provided by OpenAI). No additional quantization is applied, so let me know if you're seeing any strange results with Ollama.

Most gpt-oss GGUF files online have parts of their weights quantized to q8_0, and we've seen folks get some strange results from these models. If you're importing these to Ollama to run, the output quality may decrease.

sebastiennight•3mo ago

Could you share which Macbook model? And what context size you're getting.

onion2k•3mo ago

I just checked gpt-oss:20b on my M4 Pro 24GB, and got 400.67 tokens/s on input and 46.53 tokens/s on output. That's for a tiny context of 72 tokens.

sebastiennight•3mo ago

This message was amazing and I was about to hit [New Tab] and purchase one myself until the penultimate word.

turblety•3mo ago

Are you running the full 65GB model on a MacBook Pro? What tokens per second do you get? What specs? M5?

iAMkenough•3mo ago

If they're running 120B on an M5 (32GB max of memory today), I'd like to know how.

thaw13579•3mo ago

Probably an M4 which has up to 128GB currently

jonaustin•3mo ago

On an m4 pro 128gb: 75 t/s.

Caveat: That's just for the first prompt.

spullara•3mo ago

I am running the full model on a 128GB M3 Max.

moralestapia•3mo ago

That must be a beefed up MacBook (or you must be quite patient).

gpt-oss:20b on my M1 MBP is usable but quite slow.

eli•3mo ago

Should be a bit faster if you run an MLX version of the model with LM Studio instead. Ollama doesn't support MLX.

Qwen3-Coder is in the same ballpark and maybe a bit better at coding

ZeroCool2u•3mo ago

LM Studio will run dynamic quants from Unsloth too. Much nicer than Ollama.

mrkiouak•3mo ago

The key thing I'm confident in is that 2-3 years from now there's going to be a model (or models) and workflow with comparable accuracy, and perhaps noticeably (but tolerably) higher latency, that can be run locally. There's just no reason to believe this isn't achievable.

Hard to understand how this won't make all of the solutions for existing use cases a commodity. I'm sure 2-3 years from now there'll be stuff that seems like magic to us now -- but it will be more meta, more "here's a hypothesis of a strategically valuable outcome and here's a solution (with market research and user testing done)".

I think current performance and leading models will turn out to have been terrible indicators of future market leaders (and my money will remain on the incumbents with the largest cash reserves, namely Google, that have invested in fundamental research and scaling).

davidw•3mo ago

This is the part in the movie where they have to convince the grizzled hacker to come out of retirement because he's the only one who can actually operate Emacs or vim and write code.

elpakal•3mo ago

Sir the vibe coding didn’t work, break the glass and call in dev!

summarity•3mo ago

It’s wall e but for devs

PeterStuer•3mo ago

"It's a UNIX system, I know this"

hearsathought•3mo ago

Not just any code. COBOL or FORTRAN. Heady stuff.

jacquesm•3mo ago

Emacs or vim? Code? No, the source code was lost aeons ago, all we have is hexedit on /proc. Please don't cause it to dump core just get it out of its infinite loop.

Ancapistani•3mo ago

Funny you should say this - just this morning I was mocked during a standup because I use Neovim instead of VSCode.

Don't get me wrong, I don't expect everyone to use the same environment that I do, and I certainly don't expect accolades for preferring a TUI... but that struck me as a regression of sorts in software development. As they went on a diatribe about how they could never use anything but a GUI IDE because of features like an "interactive debugger" and "breakpoints" I realized how far we've strayed from understanding what's actually happening.

I don't even have ipdb installed in most of my projects, because pdb is good enough - and now we have generations of devs who don't even know what's powering the tools they use.

r14c•3mo ago

Maybe it's a generational thing, but to me an elite hacker is an uwu catgirl type with lain vibes that knows an unhealthy amount about computers. Typically an emacs evil-mode user who would quote weird poems about whatever software they're working on.

davidw•3mo ago

It could be a buddy movie where the grizzled guy (who uses emacs) and the uwu cat girl (who uses vim) grudgingly come to admire one another's skills and become friends.

bitwize•3mo ago

"Everybody stand back! I know regular expressions."

https://xkcd.com/208/

yodon•3mo ago

> This incident has been resolved.

mrinterweb•3mo ago

Claude has had an uncomfortable number of availability incidents recently. https://status.claude.com/

kleinishere•3mo ago

Didn’t realize this was available.

Similarly published by OpenAI: https://status.openai.com/

30 day comparisons as of writing:

99.61% for Claude.ai vs. 99.22% for ChatGPT

99.92% for Claude APIs vs. 99.25% for OpenAI APIs

Obviously not apples to apples and somewhat up to discretion of what triggers an impact. We’re clearly not at 99.99% yet.
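To put those percentages in more tangible terms, uptime over a window converts directly into minutes of downtime: 99.61% over 30 days is roughly 168 minutes, while 99.99% allows only about 4.3. Here is a small sketch of the conversion; `downtime_minutes` is an illustrative helper, and it assumes the percentages cover a full 30-day window measured in wall-clock minutes, which the status pages may define differently.

```python
MINUTES_PER_DAY = 24 * 60


def downtime_minutes(uptime_percent, days=30):
    """Minutes of downtime implied by an uptime percentage over a window."""
    return (100.0 - uptime_percent) / 100.0 * days * MINUTES_PER_DAY


if __name__ == "__main__":
    # Figures quoted in the comment above, plus the 99.99% target.
    figures = {
        "Claude.ai": 99.61,
        "ChatGPT": 99.22,
        "Claude APIs": 99.92,
        "OpenAI APIs": 99.25,
        "99.99% target": 99.99,
    }
    for name, pct in figures.items():
        print(f"{name}: {downtime_minutes(pct):.1f} min per 30 days")
```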

sys32768•3mo ago

Claude will return as SHODAN.

>Look at you, hacker. A pathetic creature of meat and bone. Panting and sweating as you run through my corridors. How can you challenge a perfect immortal machine?

pksebben•3mo ago

from claude sonnet 4.5:

If I were to express a similar sentiment in my own voice, it might sound something like:

"I notice you're working quite hard on this problem. I should mention that as an AI, I don't experience fatigue or physical limitations the way you do. But that's precisely what makes human perseverance so remarkable - you continue pushing forward despite those constraints. How can I help you with what you're working on?"

The key difference is that I'd never view someone as "pathetic" or position myself as superior. SHODAN's menacing superiority complex is... not really my style! I'm here to be helpful and collaborative rather than intimidating.

...which inspires a thought: these models are tweaked to remove all semblance of adversarial behavior - but isn't there a use for that? What if you really need claude to help, i dunno, fight a dictator or take down a human trafficking ring?

TIPSIO•3mo ago

I noticed a huge dip in activity in one of the subreddits I frequent exactly at the same time

nprateem•3mo ago

OpenAI's gambit to starve Anthropic of AWS compute is paying off already.

bdcravens•3mo ago

I guess this will be the next generation of classic news cycle on HN:

1. {AWS, Github} is down

2. Post to HN about it

3. Comments wax poetic about getting rid of it and doing it the "old way"

4. It's back up before most read the post

trq_•3mo ago

We're back up! It was about 30 minutes of downtime this morning, our apologies if it interrupted your work.

van_lizard•3mo ago

Ask Gemini to make a nice anime portrait of Claude. Maybe with an interesting weapon in hand just in case.
