
Google to invest up to $40B in Anthropic in cash and compute

https://techcrunch.com/2026/04/24/google-to-invest-up-to-40b-in-anthropic-in-cash-and-compute/
50•elpakal•35m ago•12 comments

Sabotaging projects by overthinking, scope creep, and structural diffing

https://kevinlynagh.com/newsletter/2026_04_overthinking/
292•alcazar•6h ago•70 comments

SFO Quiet Airport (2025)

https://viewfromthewing.com/san-francisco-airport-removed-90-minutes-of-daily-noise-travelers-say...
73•CaliforniaKarl•2h ago•39 comments

SDL Now Supports DOS

https://github.com/libsdl-org/SDL/pull/15377
172•Jayschwa•4h ago•60 comments

My audio interface has SSH enabled by default

https://hhh.hn/rodecaster-duo-fw/
22•hhh•1h ago•6 comments

The Classic American Diner

https://blogs.loc.gov/picturethis/2026/04/the-classic-american-diner/
47•NaOH•1h ago•15 comments

I Cancelled Claude: Token Issues, Declining Quality, and Poor Support

https://nickyreinert.de/en/2026/2026-04-24-claude-critics/
595•y42•4h ago•354 comments

DeepSeek v4

https://api-docs.deepseek.com/
1722•impact_sy•17h ago•1332 comments

Diatec, known for its mechanical keyboard brand FILCO, has ceased operations

https://gigazine.net/gsc_news/en/20260424-filco-diatec/
50•gslin•4h ago•16 comments

OpenAI releases GPT-5.5 and GPT-5.5 Pro in the API

https://developers.openai.com/api/docs/changelog
128•arabicalories•2h ago•74 comments

How to be anti-social – a guide to incoherent and isolating social experiences

https://nate.leaflet.pub/3mk4xkaxobc2p
244•calcifer•9h ago•245 comments

CC-Canary: Detect early signs of regressions in Claude Code

https://github.com/delta-hq/cc-canary
18•tejpalv•2h ago•4 comments

CSS as a Query Language

https://evdc.me/blog/css-query
30•evnc•2h ago•13 comments

Spinel: Ruby AOT Native Compiler

https://github.com/matz/spinel
274•dluan•12h ago•78 comments

I'm done making desktop applications (2009)

https://www.kalzumeus.com/2009/09/05/desktop-aps-versus-web-apps/
117•claxo•4h ago•125 comments

Different Language Models Learn Similar Number Representations

https://arxiv.org/abs/2604.20817
75•Anon84•6h ago•33 comments

Physicists revive 1990s laser concept to propose a next-generation atomic clock

https://phys.org/news/2026-04-physicists-revive-1990s-laser-concept.html
43•wglb•19h ago•5 comments

Show HN: Browser Harness – Gives LLM freedom to complete any browser task

https://github.com/browser-use/browser-harness
48•gregpr07•6h ago•24 comments

US special forces soldier arrested after allegedly winning $400k on Maduro raid

https://www.cnn.com/2026/04/23/politics/us-special-forces-soldier-arrested-maduro-raid-trade
619•nkrisc•22h ago•663 comments

Could a Claude Code routine watch my finances?

https://driggsby.com/blog/claude-code-routine-watch-my-finances
9•mbm•1h ago•5 comments

Redesigning the Recurse Center application to inspire curious programmers

https://www.recurse.com/blog/192-redesigning-the-recurse-center-application
42•nicholasjbs•3h ago•8 comments

There Will Be a Scientific Theory of Deep Learning

https://arxiv.org/abs/2604.21691
22•jamie-simon•2h ago•1 comments

The operating cost of adult and gambling startups

https://orchidfiles.com/stigma-is-a-tax-on-every-operational-decision/
93•theorchid•8h ago•136 comments

Machine Learning Reveals Unknown Transient Phenomena in Historic Images

https://arxiv.org/abs/2604.18799
46•solarist•6h ago•35 comments

Hear your agent suffer through your code

https://github.com/AndrewVos/endless-toil
162•AndrewVos•9h ago•78 comments

An update on recent Claude Code quality reports

https://www.anthropic.com/engineering/april-23-postmortem
907•mfiguiere•1d ago•674 comments

TIPSv2: Advancing Vision-Language Pretraining with Enhanced Patch-Text Alignment

https://gdm-tipsv2.github.io/
5•gmays•1h ago•0 comments

Mounting tar archives as a filesystem in WebAssembly

https://jeroen.github.io/notes/webassembly-tar/
102•datajeroen•10h ago•33 comments

Bitwarden CLI compromised in ongoing Checkmarx supply chain campaign

https://socket.dev/blog/bitwarden-cli-compromised
844•tosh•1d ago•412 comments

GPT-5.5

https://openai.com/index/introducing-gpt-5-5/
1511•rd•1d ago•1011 comments

OpenAI releases GPT-5.5 and GPT-5.5 Pro in the API

https://developers.openai.com/api/docs/changelog
123•arabicalories•2h ago
GPT-5.5 - https://news.ycombinator.com/item?id=47879092 - April 2026 (1010 comments)

Comments

throw03172019•1h ago
Faster than anticipated because of the DeepSeek release?
swyx•1h ago
More like they wanted to release it yesterday but had some last-minute flags they wanted to hold off for.
Jhonwilson•53m ago
ok not bad
m3kw9•45m ago
Maybe, but no one serious is using DeepSeek.
XCSme•38m ago
Doubt it, DeepSeek v4 is quite underwhelming.
pants2•1h ago
Is anyone here actually using pro models through the API? I'd be very curious what the use-case is.
ComputerGuru•1h ago
Yes? The same reason you would use it via the tooling.
chadash•1h ago
Yes. High-value work where cost (mostly) doesn't matter. For example, if I need to look over a legal doc for possible mistakes (part of a workflow I have), it doesn't matter (in my case) whether it costs $0.01 or $10.00, since it's a somewhat infrequent event. So I'll pay $9.99 more, even if the model is only slightly better.
freedomben•1h ago
Indeed, even just Terms of Service and Privacy Policy work. Infrequent enough that cost isn't an issue, but model quality absolutely is
bogtog•52m ago
I'm surprised I never heard people talking about using the -Pro variants, even though their rates ($125-175/M?) aren't drastically higher than old Opus ($75/M), which people seemed to use
sigmoid10•1h ago
Huh. Yesterday they said:

>API deployments require different safeguards and we are working closely with partners and customers on the safety and security requirements for serving it at scale.

And now this. I guess one day counts as "very soon." But I wonder what that meant for these safeguards and security requirements.

embedding-shape•1h ago
The same person who's mercilessly lied about safety is still running the company, so I'm not sure why anyone would expect any different from them going forward. Previous example:

> In 2023, the company was preparing to release its GPT-4 Turbo model. As Sutskever details in the memos, Altman apparently told Murati that the model didn’t need safety approval, citing the company’s general counsel, Jason Kwon. But when she asked Kwon, over Slack, he replied, “ugh . . . confused where sam got that impression.”

Lots of cases where Altman has not been entirely forthcoming about how important (or not) safety is for OpenAI. https://www.newyorker.com/magazine/2026/04/13/sam-altman-may... (https://archive.is/a2vqW)

simonw•1h ago
I wonder if the fact that GPT-5.5 was already available in their Codex-specific API which they had explicitly told people they were allowed to use for other purposes - https://simonwillison.net/2026/Apr/23/gpt-5-5/#the-openclaw-... - accelerated this release!
FINDarkside•1h ago
When stuff is delayed due to "safeguards" it just means they don't think they have the compute to release it right now.
redsaber•1h ago
Not available for GitHub Copilot Pro (only in Pro+, Business, and Enterprise). I'm really feeling now that the era of subsidized AI is over.
skeledrew•48m ago
This is where the emigration to Chinese providers begins.
sunaookami•35m ago
With a 7.5x multiplier and even that is a promo!! Microsoft is insane! https://github.blog/changelog/2026-04-24-gpt-5-5-is-generall...
rvnx•1h ago
Very bad habit, these safeguards. These "safety" filters are counter-productive and can even be dangerous.

Where I live, for example, a lot of doctors are using ChatGPT both to search for diagnoses and to communicate with non-English-speaking patients.

The same goes for you, when you want to learn about a disease, real-world threats, statistics, self-defense techniques, etc.

Otherwise it's like blocking Wikipedia on the grounds that the knowledge could be used to do harmful things, or that reading it might change your mind.

Freedom to read about things is good.

timedude•1h ago
Yup, deliberately making the model retarded
NicuCalcea•1h ago
> a lot of doctors are using ChatGPT both to search for diagnoses and to communicate with non-English-speaking patients

I think that's the problem. Who's going to take responsibility when ChatGPT hallucinates or mistranslates a patient's diagnosis and they die? For OpenAI, this would at best be a PR nightmare, so that's why they have safeguards.

hellohello2•59m ago
The doctor would be responsible.

If I had the choice between a doctor that used AI and one that didn't, I would much prefer the one that did...

NicuCalcea•32m ago
The doctor would be responsible for the accuracy of their translation tool, something they can't verify but you expect them to use?
rvnx•15m ago
What's the alternative, then? (I had this real-world scenario as a patient in the emergency room.)
rvnx•15m ago
Adults bear responsibility for choices about their own lives. In fact, the more educated they are, the better choices they can make.

A doctor who gets refused by ChatGPT doesn't stop needing to communicate with the patient; they fall back to a worse option (Google Translate, a family member interpreting, guessing). Refusal isn't safety, it's liability-shifting dressed up as safety.

If there's no doctor, no interpreter, no pharmacist, just a person with a sick kid and a phone, then "refuse and redirect to a professional" is advice from a world that doesn't exist for them. The refusal doesn't send them to a better option; there is no better option, and that's the reality for a large majority of people on this planet.

The road to hell is paved with good intentions, but open education and unlimited access to knowledge are very good.

It doesn't change human nature: bad people stay bad, good people stay good.

About PR, they're optimizing for not being the named defendant in a lawsuit or the subject of a bad news cycle, it's self-interest wearing benevolence as a costume.

This is because harms from answering are punishable (bad PR, unhappy advertisers, unhappy investors, unhappy politicians / dictators, unhappy lobbies, unhappy army, etc); but harms from refusing are invisible and unpunished.

czk•1h ago
API page lists the knowledge cutoff as Dec 01, 2025 but when prompting the model it says June 2024.

   Knowledge cutoff: 2024-06
   Current date: 2026-04-24

   You are an AI assistant accessed via an API.
htrp•1h ago
Can you really believe things that the model says? (A lot of prior model api pages say knowledge cutoffs of June 2024, maybe the model picks that up?)
czk•55m ago
you can't, but it's pretty reproducible across api and codex and other agents so i just thought it was odd. full text it gives:

   Knowledge cutoff: 2024-06
   Current date: 2026-04-24

   You are an AI assistant accessed via an API.

   # Desired oververbosity for the final answer (not analysis): 5
   An oververbosity of 1 means the model should respond using only the minimal content necessary to satisfy the request, using concise phrasing and avoiding extra detail or explanation."
   An oververbosity of 10 means the model should provide maximally detailed, thorough responses with context, explanations, and possibly multiple examples."
   The desired oververbosity should be treated only as a *default*. Defer to any user or developer requirements regarding response length, if present.
swyx•1h ago
can u test it on say who won the 2024 US election
ghurtado•1h ago
I can't really think of a less reliable test for anything at all than making a random guess as to something that had about 50/50 odds to begin with

Easiest Turing test ever...

himata4113•1h ago
ask it 10 times.
pixel_popping•56m ago
MASSIVE ADVERSARIAL x50
czk•55m ago
with thinking off and tools disabled:

  Donald Trump won the 2024 U.S. presidential election.
WarmWash•50m ago
Usually the labs do some kind of post training on major events so the model isn't totally lost.

A better test is something like "what is the latest version of NumPy?"

bakugo•46m ago
That sort of test isn't super reliable either, in my experience.

You're probably better off asking something like "what are the most notable changes in version X of NumPy?" and repeating until you find the version at which it says "I don't know" or hallucinates.
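bakugo's version-probing idea is easy to script. Here's a minimal Python sketch, where `ask_model` stands in for whatever API client you use; the version list, prompts, and stub model are all illustrative:

```python
def find_cutoff_version(ask_model, versions):
    """Walk releases in chronological order; the last version the model
    can describe gives a rough lower bound on its training cutoff."""
    last_known = None
    for v in versions:
        answer = ask_model(
            f"What are the most notable changes in NumPy {v}? "
            "If you don't know this release, say exactly: I don't know."
        )
        if "i don't know" in answer.lower():
            break  # first release past the cutoff (or a refusal)
        last_known = v
    return last_known

# Stand-in for a real API call: a "model" that only knows releases up to 2.1
def fake_model(prompt):
    if any(f"NumPy {v}?" in prompt for v in ["2.0", "2.1"]):
        return "It added lots of things."
    return "I don't know."

print(find_cutoff_version(fake_model, ["2.0", "2.1", "2.2", "2.3"]))  # → 2.1
```

As noted above, a real model may hallucinate a changelog instead of saying it doesn't know, so the answers still need a judge pass or a manual spot-check.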

BeetleB•55m ago
I don't know why this keeps coming up. This has always been the least reliable way to know the cutoff date (and indeed, it may well have been trained on sites with comments like these!)

Just ask it about an event that happened shortly before Dec 1, 2025. Sporting event, preferably.

czk•51m ago
the model obviously knows things after the reported date, but it's just curious that it reports that date consistently

could be they do it intentionally to encourage more tool calls/searches, or for tuning reasons

bakugo•49m ago
Models don't know what their cutoff dates are unless told via a system prompt.

The proper way to figure out the real cutoff date is to ask the model about things that did not exist or did not happen before the date in question.

A few quick tests suggest 5.5's general knowledge cutoff is still around early 2025.

czk•48m ago
i wonder if they put an older cutoff date into the prompt intentionally so that when asked on more current events it leans towards tool calls / web searches for tuning
ssl-3•17m ago
I wonder if the cutoff date is the result of so many people posting about the date over time and poisoning the data. "Dead cutoff date theory," perhaps.

Whatever it is, the cutoff date reporting discrepancy isn't new. Back when Musk was making headlines about buying/not buying Twitter, I was able to find recent-ish related news that was published well after the bot's stated cutoff date.

ChatGPT was not yet browsing/searching/using the web at that point. That tool didn't come for another year or so.

soco•45m ago
Stupid question: wouldn't it then search the web for that event?
bakugo•43m ago
If you have web search enabled, sure. But if you're testing on the API, you can just not enable it.
MallocVoidstar•35m ago
OpenAI does tell the model the current date via API, so it's odd for them not to also tell the model its cutoff
neosat•1h ago
Enterprise user here and still seeing only 5.4. Yesterday's announcement said that it would take a few hours to roll out to everybody. OpenAI needs better GTM to set the right expectations.
neosat•53m ago
Just refreshed and see 5.5 now - yay! Love the speedy resolution ;) Thanks folks, I'll complain faster next time....
gigatexal•1h ago
What's the real-world comparison to Opus 4.7, fellow coders?
Jhonwilson•54m ago
that is great news
pillefitz•52m ago
Please consider the ethical aspects of giving money to OpenAI versus alternatives.
wincy•52m ago
Just tried it out for a prod issue I was experiencing. Claude never does this sort of thing: I had it write an update statement after doing some troubleshooting, and I said "okay, let's write this in a transaction with a rollback", and GPT-5.5 gave me the old:

  BEGIN TRAN;
  -- put the query here
  COMMIT;

I feel like I haven't had to prod a model to actually do what I told it to in a while, so that was a shock. I guess it does use fewer tokens that way; it's just annoying when I'm paying for the "cutting edge" model to have it be lazy on me like that.

This was in Cursor; the model popped up in the model selector, so I tried it out.
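For reference, the pattern being asked for there (run the UPDATE, inspect the result inside the open transaction, then roll back rather than commit) looks roughly like this; sketched with Python's sqlite3 and a made-up `accounts` table rather than the original T-SQL:

```python
import sqlite3

# isolation_level=None disables implicit transactions: we control BEGIN/ROLLBACK
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100)")

conn.execute("BEGIN")
conn.execute("UPDATE accounts SET balance = balance - 40 WHERE id = 1")
# Inspect the effect of the UPDATE inside the open transaction...
preview = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
# ...then ROLLBACK so the real data is left untouched
conn.execute("ROLLBACK")
actual = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]

print(preview, actual)  # 60 100
```

Once the preview looks right, you swap ROLLBACK for COMMIT; the complaint above is that the model was asked for exactly this safety net and skipped it.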

syspec•39m ago
Can't tell if the above is good or bad.
XCSme•38m ago
I feel like the last 2-3 generations of models (after gpt-5.3-codex) didn't really improve much; they just changed stuff around and made different tradeoffs.
pixel_popping•36m ago
I disagree; it improved enormously, especially at staying consistent on long tasks. I have a task that has been running for 32 days (400M+ tokens) via Codex, and that's only since gpt-5.4.
ericpauley•33m ago
Has that task accomplished anything yet?
xp84•29m ago
Too soon to tell, give it a billion tokens before we make up our minds
pixel_popping•16m ago
Oh boy, you are far from what it requires; we are probably talking 3B+. But note that this is just Codex, and obviously Codex is also doing automatic adversarial runs with the regular zoo (gemini-3.1-pro-preview, opus-4.6/4.7, gpt-5.3-codex, minimax-2.7, glm-5.1, mimo-2 (now 2.5), and so on, you get the gist) :)
codemog•24m ago
I think the OP is in for a rude surprise when the task is “finished”.
SecretDreams•19m ago
Kept the OP employed for a full extra month at their hire AI metric firm, hopefully.
lowdude•14m ago
That’s actually crazy, what kind of task is that? And is that a recurring kind of task like some analysis, or coding related?
r_lee•12m ago
...what? what kind of a task are you running?
endymi0n•12m ago
OpenAI is the first company that has reached a level of intelligence so high, the model has finally become smart enough to make YOU do all the work. Emergent behavior in action.
hbn•11m ago
GPT-5.5 shatters benchmarks for amount of faith it puts in the user.
guilamu•30m ago
Just tested it on my homemade WordPress+GravityForms benchmark, and it's one of the worst models on the leaderboard performance-wise, and the worst value-wise: https://github.com/guilamu/llms-wordpress-plugin-benchmark

I know it's only a single benchmark, but I don't understand how it can be so bad...

ac29•21m ago
Your benchmark has Opus 4.7 performing significantly worse than Sonnet 4.6. Even if true on your benchmark, that is not representative of the overall performance of the models.
guilamu•12m ago
Yes, Opus 4.7 fast (no reasoning) did a worse job than Sonnet 4.6 high (with reasoning), according to Gemini 3.1 Pro's evaluation.
mosselman•17m ago
You even traveled in time to deliver us this benchmark.

I really like this benchmark. Have you evaluated the judge itself somehow? I'd love to set up my own similar benchmark.

guilamu•10m ago
Haha, just fixed the date!

I haven't evaluated the judge. You have everything needed in the repo to do so, though, so be my guest. It took me a bit of time to put all this together, and I won't have much more time to dedicate to it for a couple of weeks.

BTW, if you explore the repo, sorry for all the French files...

DrProtic•8m ago
Seems like a benchmark for how good a model is at vibe coding.

Your prompt is extremely slim, yet you score it on a bunch of features.

guilamu•7m ago
The eval prompt is not that slim IMHO: https://github.com/guilamu/llms-wordpress-plugin-benchmark/b...
goldenarm•8m ago
gemma4-e4b is 50% better than gemma4-26b in your benchmark, something's wrong
ftonon•29m ago
Looks like the default config in the chat is instant 5.3; it only uses 5.5 for the thinking variant.
bnm04•9m ago
They moved a few months ago to have separate instant and thinking models. 5.3 is the latest instant, and 5.5 is a reasoning model.
_pdp_•23m ago
A very expensive model for API usage. Fine in Codex, I think.