frontpage.

Interop 2025: A Year of Convergence

https://webkit.org/blog/17808/interop-2025-review/
1•ksec•9m ago•0 comments

JobArena – Human Intuition vs. Artificial Intelligence

https://www.jobarena.ai/
1•84634E1A607A•13m ago•0 comments

Concept Artists Say Generative AI References Only Make Their Jobs Harder

https://thisweekinvideogames.com/feature/concept-artists-in-games-say-generative-ai-references-on...
1•KittenInABox•17m ago•0 comments

Show HN: PaySentry – Open-source control plane for AI agent payments

https://github.com/mkmkkkkk/paysentry
1•mkyang•19m ago•0 comments

Show HN: Moli P2P – An ephemeral, serverless image gallery (Rust and WebRTC)

https://moli-green.is/
1•ShinyaKoyano•28m ago•0 comments

The Crumbling Workflow Moat: Aggregation Theory's Final Chapter

https://twitter.com/nicbstme/status/2019149771706102022
1•SubiculumCode•33m ago•0 comments

Pax Historia – User and AI powered gaming platform

https://www.ycombinator.com/launches/PMu-pax-historia-user-ai-powered-gaming-platform
2•Osiris30•34m ago•0 comments

Show HN: I built a RAG engine to search Singaporean laws

https://github.com/adityaprasad-sudo/Explore-Singapore
1•ambitious_potat•39m ago•0 comments

Scams, Fraud, and Fake Apps: How to Protect Your Money in a Mobile-First Economy

https://blog.afrowallet.co/en_GB/tiers-app/scams-fraud-and-fake-apps-in-africa
1•jonatask•39m ago•0 comments

Porting Doom to My WebAssembly VM

https://irreducible.io/blog/porting-doom-to-wasm/
1•irreducible•40m ago•0 comments

Cognitive Style and Visual Attention in Multimodal Museum Exhibitions

https://www.mdpi.com/2075-5309/15/16/2968
1•rbanffy•42m ago•0 comments

Full-Blown Cross-Assembler in a Bash Script

https://hackaday.com/2026/02/06/full-blown-cross-assembler-in-a-bash-script/
1•grajmanu•47m ago•0 comments

Logic Puzzles: Why the Liar Is the Helpful One

https://blog.szczepan.org/blog/knights-and-knaves/
1•wasabi991011•58m ago•0 comments

Optical Combs Help Radio Telescopes Work Together

https://hackaday.com/2026/02/03/optical-combs-help-radio-telescopes-work-together/
2•toomuchtodo•1h ago•1 comment

Show HN: Myanon – fast, deterministic MySQL dump anonymizer

https://github.com/ppomes/myanon
1•pierrepomes•1h ago•0 comments

The Tao of Programming

http://www.canonical.org/~kragen/tao-of-programming.html
2•alexjplant•1h ago•0 comments

Forcing Rust: How Big Tech Lobbied the Government into a Language Mandate

https://medium.com/@ognian.milanov/forcing-rust-how-big-tech-lobbied-the-government-into-a-langua...
3•akagusu•1h ago•0 comments

PanelBench: We evaluated Cursor's Visual Editor on 89 test cases. 43 fail

https://www.tryinspector.com/blog/code-first-design-tools
2•quentinrl•1h ago•2 comments

Can You Draw Every Flag in PowerPoint? (Part 2) [video]

https://www.youtube.com/watch?v=BztF7MODsKI
1•fgclue•1h ago•0 comments

Show HN: MCP-baepsae – MCP server for iOS Simulator automation

https://github.com/oozoofrog/mcp-baepsae
1•oozoofrog•1h ago•0 comments

Make Trust Irrelevant: A Gamer's Take on Agentic AI Safety

https://github.com/Deso-PK/make-trust-irrelevant
7•DesoPK•1h ago•4 comments

Show HN: Sem – Semantic diffs and patches for Git

https://ataraxy-labs.github.io/sem/
1•rs545837•1h ago•1 comment

Hello world does not compile

https://github.com/anthropics/claudes-c-compiler/issues/1
35•mfiguiere•1h ago•20 comments

Show HN: ZigZag – A Bubble Tea-Inspired TUI Framework for Zig

https://github.com/meszmate/zigzag
3•meszmate•1h ago•0 comments

Metaphor+Metonymy: "To love that well which thou must leave ere long" (Sonnet 73)

https://www.huckgutman.com/blog-1/shakespeare-sonnet-73
1•gsf_emergency_6•1h ago•0 comments

Show HN: Django N+1 Queries Checker

https://github.com/richardhapb/django-check
1•richardhapb•1h ago•1 comment

Emacs-tramp-RPC: High-performance TRAMP back end using JSON-RPC instead of shell

https://github.com/ArthurHeymans/emacs-tramp-rpc
1•todsacerdoti•1h ago•0 comments

Protocol Validation with Affine MPST in Rust

https://hibanaworks.dev
1•o8vm•2h ago•1 comment

Female Asian Elephant Calf Born at the Smithsonian National Zoo

https://www.si.edu/newsdesk/releases/female-asian-elephant-calf-born-smithsonians-national-zoo-an...
5•gmays•2h ago•1 comment

Show HN: Zest – A hands-on simulator for Staff+ system design scenarios

https://staff-engineering-simulator-880284904082.us-west1.run.app/
1•chanip0114•2h ago•1 comment

GPT-5-Codex-Mini – A more compact and cost-efficient version of GPT-5-Codex

https://github.com/openai/codex/releases/tag/rust-v0.56.0
56•wahnfrieden•3mo ago

Comments

vessenes•3mo ago
Looks like a leak: https://platform.openai.com/docs/models does not list it, and codex-mini-latest says that it's based on 4o. I wonder if it will be faster than Codex; gpt-5-nano and -mini are still very slow for me over the API, surprisingly so.
ChadNauseam•3mo ago
I noticed the same thing with -mini. It can be even slower than the full-fat version. I'm guessing their infra for it is heavily cost-optimized to help them offer it at such a low price.
wahnfrieden•2mo ago
They also just announced that they're giving the Pro tier prioritized access to Codex models. Are you on Pro? If not, you might be getting deprioritized.
vessenes•2mo ago
Yeah in my mind it was smaller -> faster. But seems like it might be 'a bit smaller and batched' to hit the price target.
simonw•3mo ago
They announced it on Twitter yesterday: https://x.com/OpenAIDevs/status/1986861734619947305 and https://x.com/OpenAIDevs/status/1986861736041853368

> GPT-5-Codex-Mini allows roughly 4x more usage than GPT-5-Codex, at a slight capability tradeoff due to the more compact model.

> Available in the CLI and IDE extension when you sign in with ChatGPT, with API support coming soon.

cmdtab•3mo ago
If any OpenAI devs are reading this comment section: is it possible for us to get API access at runable.com?
bgwalter•3mo ago
All "AI" providers are cutting corners in their models right now because the subsidized cost is unsustainable.

Grok's latest update made it far worse than the version right after the Grok-4 release. It makes outright mistakes now. Copilot cut corners long ago. Google "AI" was always horrible.

The whole "AI" experiment was an outrageously expensive IP laundering parlor trick that is meeting economic realities now.

stavros•3mo ago
That's a very long-winded way of saying "it was subsidized so it could capture a large market segment, and now that's stopping", which is what SV companies have done since checks notes forever.
bgwalter•3mo ago
An LLM would have generated four pages on this topic in order to increase the token count!

LLMs are advertised for serious applications. I don't recall that CPUs generally hallucinate except for the FDIV bug. Or that AirBnB rents you apartments that don't exist in 30% of all cases. Or that Uber cars drive into a river during 20% of all rides.

stavros•3mo ago
Are we talking about economics, or about hallucinations?

"CPUs don't hallucinate" would be a reasonable argument if CPUs were an alternative to LLMs, which they aren't, so I'm not really sure what argument you're making there.

Seems like you're saying "a calculator makes fewer mistakes than an accountant", which is true, but I still pay an accountant to do my taxes, and not a calculator.

bgwalter•3mo ago
I was obviously responding to your "SV companies have been doing that forever". You have introduced the general topic.
stavros•3mo ago
I don't see how CPU bugs have anything to do with subsidizing a product to capture market share, can you elaborate?
bgwalter•3mo ago
Certainly!

Thinking ...

- The user is asking about the connection between CPU bugs and price dumping in order to capture market share.

- The user appears to miss the original thread starter that mentions cutting corners in models after the subsidy phase is over.

- The mentions of CPUs, AirBnB and Uber appear to be examples where certain quality standards were usually kept even after the subsidy phase.

Generating response ...

NamlchakKhandro•3mo ago
If you don't want hallucinations:

- set temp to 0

- be more specific

But I'd argue that if your LLM isn't hallucinating, then it's useless
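
For context, "temp to 0" means setting the sampling temperature to zero, i.e. (near-)deterministic greedy decoding - it makes outputs repeatable, not factual, which is the pushback in the replies below. A minimal sketch, assuming the openai npm package (model and prompt are only examples):

  import OpenAI from "openai";

  const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

  const completion = await client.chat.completions.create({
    model: "gpt-4o-mini",  // example model name
    temperature: 0,        // greedy decoding: repeatable runs, not grounded answers
    messages: [{ role: "user", content: "Summarize the FDIV bug in one sentence." }],
  });

  console.log(completion.choices[0].message.content);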

eru•2mo ago
How would setting temp to 0 preclude hallucinations?
wahnfrieden•2mo ago
It wouldn’t
nicce•3mo ago
Not saying that you are completely wrong, but you could try rephrasing this to make for a better conversation.

I agree that many new model versions are worse than the previous ones. But it is also related to the base rules of the model: they try to please you and manipulate you into liking them, way too much.

simonw•3mo ago
Charging developers $200/month for Claude Code and getting to a billion in ARR sounds like a pretty great business to be in to me, especially with this growth rate:

> Claude Code is reportedly close to generating $1 billion in annualized revenue, up from about $400 million in July.

https://techcrunch.com/2025/11/04/anthropic-expects-b2b-dema...

bgwalter•3mo ago
So Misanthropic claims that 416,666 software developers ($1B ÷ $2,400 per year) have bought their expensive $200 subscription, when there are 4.4 million software developers in the US.

That sounds reasonable, given that 10% of software developers are talkers who need someone to output something that looks like a deliverable.

We were, however, talking about profits here, not revenue.

simonw•3mo ago
Presumably their "$1bn ARR from Claude Code" number isn't just the $200/month subscribers, they have $20/month and $100/month plans too, both of which their internal analytics could be crediting to Claude Code based on API usage patterns.

That $1bn number was in a paywalled Information article which was then re-reported by TechCrunch, so the actual source of the number isn't clear. I'm assuming someone leaked it to the Information; they appear to have some very useful sources.

I doubt this is just US developers - they've boasted about how successful they are in Europe recently too:

> Businesses across Europe are trusting Claude with their most important work. As a result, EMEA has become our fastest-growing region, with a run-rate revenue that has grown more than 9x in the past year.

https://www.anthropic.com/news/new-offices-in-paris-and-muni...

felipeerias•3mo ago
Relative to its competitors, Anthropic seems to have a higher share of professional users paying premium subscriptions, which is probably more sustainable in the long term.
crazylogger•2mo ago
Anecdotally, a Max subscriber gets something like $100 worth of usage per day. The more people use Claude Code, the more Anthropic loses, so it sounds like a classic "selling a dollar for 85 cents" business to me.

As soon as users are confronted with their true API cost, the appearance of this being a good business falls apart. At the end of the day, there is no moat around large language models - OpenAI, Anthropic, Google, DeepSeek, Alibaba, Moonshot... any company can make a SOTA model if they wish, so in the long run it's guaranteed to be a race to the bottom where nobody can turn a profit.

simonw•2mo ago
> Anecdotally, a Max subscriber gets something like $100 worth of usage per day.

Where are you getting that number from?

Anthropic added quite strict limits on usage - visible via the /usage command inside Claude Code. I would be surprised if those limits still turn out to result in expensive losses for them.

crazylogger•2mo ago
This is just personal experience + reddit anecdotes. I've been using CC from day one (when API pricing was the only way to pay for CC), then I've been on the $20 Pro plan, and I get a solid $5+ worth of usage in each 5h session, times 5-10 sessions per week (so overall a 5-10x subsidy over a month). And I extrapolated that $200 subscribers must be getting roughly 10x Pro's usage. I do feel the actual limit fluctuates each week as Claude Code engages in this new subsidy war with OAI Codex, though.

My theory is this:

- we know from benchmarks that open-weight models like Deepseek R1 and Kimi K2's capabilities are not far behind SOTA GPT/Claude

- open-weight API pricing (e.g. on openrouter) is roughly 1/10~1/5 that of GPT/Claude

- users can more or less choose to hook their agent CLI/IDEs to either closed or open models

If these points are true, then the only reason people are primarily on CC & Codex plans is that they are subsidized by at least 5~10x. When confronted with true costs, users will quickly switch to the lowest-cost inference vendor, and we get perfect competition + zero margin for all vendors.

wahnfrieden•2mo ago
The benchmarks lie. Go try coding full-time with R1 vs Codex or GPT-5 (in Codex). The latter is firmly preferred even by those who have no issue with budgeting tokens for their productivity.
esafak•3mo ago
The Chinese open source models don't have this problem -- and they're state of the art!
lostmsu•3mo ago
GPT-5 and GPT-5-Codex are already not clever enough for anything interesting.
asadm•3mo ago
Not correct. Are you using high reasoning? I am using Codex every day.
pogue•3mo ago
What sort of stuff are you using it for?
asadm•2mo ago
Computer vision things these days.
wahnfrieden•2mo ago
Medium reportedly uses even more reasoning than high when it chooses to, whereas high is fixed at a high, but not maximal, level.
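
"High reasoning" here refers to the reasoning-effort setting. A minimal sketch of what that looks like over the API, assuming the openai npm package's Responses API (the model name and prompt are only examples; API access for the Codex models was not yet available at the time of this thread):

  import OpenAI from "openai";

  const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

  // Higher effort buys more thinking time; lower effort responds faster.
  const response = await client.responses.create({
    model: "gpt-5",                 // example model name
    reasoning: { effort: "high" },  // "low" | "medium" | "high"
    input: "Plan a refactor of this module to remove the global state.",
  });

  console.log(response.output_text);
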
jeswin•3mo ago
What's your definition of interesting?

I'm halfway through writing a TypeScript-to-native-code translator (via .NET), compiling a large enough subset of current code with a lot of help from GPT-5 and Codex CLI. It has completely blown me away.

I'd like to give you a concrete example which stood out (from, by now, dozens). I wanted d.ts files for the .NET Standard Libs. One immediately obvious problem is that .NET allows classes/interfaces to be redefined if the generic type arity is different. For example, there can be SomeClass<int> and SomeClass<int, int>, which are completely separate. TypeScript, of course, doesn't allow this - you can have one with all types defined, but it'd obviously be a mess.

I was stuck with (quite ugly): const users = new List_1<User>(...); instead of const users = new List<User>(...);

So GPT comes up with this:

  declare const __unspecified: unique symbol;
  type __ = typeof __unspecified;

  // Your arity-anchored delegates exist elsewhere:
  //   import("internal/System").Action_0
  //   import("internal/System").Action_1<T1>
  //   import("internal/System").Action_2<T1, T2>
  //   import("internal/System").Action_3<T1, T2, T3>
  //   ... up to 17

  export type Action<
    T1 = __, T2 = __, T3 = __, // ... continue through T17 = __
  > =
    [T1] extends [__] ? import("internal/System").Action_0 :
    [T2] extends [__] ? import("internal/System").Action_1<T1> :
    [T3] extends [__] ? import("internal/System").Action_2<T1, T2> :
    /* next lines follow the same pattern … */
    import("internal/System").Action_3<T1, T2, T3>;
This lets me write:

  const a: Action<number> = (n) => {};        // OK (void)
  const f: Func<number, string> = (s) => 20;  // OK (string -> number)
A human could come up with this, of course. But doing this at scale (there are many such problems which crop up) would take a lot of effort. Btw, I'm using Claude for the grunt work (because it's faster), but GPT-5 is doing all the architecture/thinking/planning/review.
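
A minimal, self-contained distillation of the same dispatch pattern - hypothetical local Action_0/1/2 aliases stand in for the internal/System imports, and the arity is capped at 2 for brevity:

  // Arity-anchored aliases; the original imports these from "internal/System".
  type Action_0 = () => void;
  type Action_1<T1> = (a: T1) => void;
  type Action_2<T1, T2> = (a: T1, b: T2) => void;

  // Unique sentinel meaning "this type argument was not supplied".
  declare const __unspecified: unique symbol;
  type __ = typeof __unspecified;

  // One exported name that dispatches on how many type arguments were given.
  type Action<T1 = __, T2 = __> =
    [T1] extends [__] ? Action_0 :
    [T2] extends [__] ? Action_1<T1> :
    Action_2<T1, T2>;

  const f0: Action = () => {};                      // resolves to Action_0
  const f1: Action<number> = (n) => {};             // resolves to Action_1<number>
  const f2: Action<number, string> = (n, s) => {};  // resolves to Action_2<number, string>

The tuple wrapping ([T1] extends [__]) disables distribution over naked type parameters, so each check fires only when the sentinel itself was left in place.
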
nawgz•2mo ago
Ternary chains are pretty common in TS, since it’s the main control flow. Are you a comfortable TS user normally?
jeswin•2mo ago
I'm a somewhat comfortable TS user.

Are you saying ternary chains using sentinels for arity inference are pretty common? I would disagree.

> since it’s the main control flow

Perhaps you're saying ternary chains are common in TS code? That's a very different thing though - the code above is not for runtime behavior.

nawgz•2mo ago
For instance if you wrote a type to extend number to something like range<min,max> - obviously a toy - it would look very similar to what you’ve posted. So I’m struggling to see what the insight from the LLM is. Anytime one needs to iterate in TS, arity or otherwise, that is the technique to use…
jeswin•2mo ago
> For instance if you wrote a type to extend number to something like range<min,max> - obviously a toy - it would look very similar to what you’ve posted.

Why would Range need a sentinel?

My point is that using a sentinel to bridge TypeScript's lack of generic arity-based specialization is a non-trivial problem. After you mentioned it, I looked for examples on Google and couldn't find anything that matches precisely.

I'm not claiming humans can't solve this, or that GPT-5 invented something fundamentally new. My original point was about productivity at scale: having a model apply the right solution across dozens of similar problems, rather than me manually figuring out each one.

nawgz•2mo ago
What is a sentinel?
lostmsu•2mo ago
I actually also used it on a .NET codebase, specifically https://github.com/m4rs-mt/ILGPU

It is just poor at designing a generic solution despite repeated requests to follow the design of existing alternatives (present in the same repo). It tended to plug holes in a broken architecture it came up with on its own instead of redesigning, or trying to simplify its code enough to keep it in its own head. TBH I suspect this might be limited purely by context length.

It produced fine(-ish) initial bits so a few tests would pass, but it dug itself into a hole by introducing provenance and could not keep track of it properly. You can see it: https://github.com/lostmsu/ILGPU/tree/Vulkan-GPT-5-Stuck

TBH2: this was a huge request. But also there are already other backends it could just mirror.

nawgz•2mo ago
Couldn't you just replace `__` with `never`?
EnPissant•2mo ago
The GPT-5 family is the most intelligent model family, IMO. It's the only one I have found that can refactor complex code and have the result be better than what came before it.
RestartKernel•2mo ago
I'd say this holds true for LLMs in general depending on your standards of interest, but Codex has blown Claude out of the water for me personally. It seems to have much better code "taste" than any other model.
simonw•3mo ago
I managed to get GPT-5-Codex-Mini to draw me a pelican. It's not a very good one! https://static.simonwillison.net/static/2025/codex-hacking-m...

For comparison, here's GPT-5-Codex (not mini) https://static.simonwillison.net/static/2025/codex-hacking-d... and full GPT-5: https://static.simonwillison.net/static/2025/codex-hacking-g...

I had quite a fun time getting those pelicans though... since GPT-5 Codex Mini isn't officially available via API yet, I instead had OpenAI's Codex CLI tool extend itself (in Rust) to add a "codex prompt ..." tool which uses their existing custom auth scheme and backend API, then used that to generate the pelicans. Full details here: https://simonwillison.net/2025/Nov/9/gpt-5-codex-mini/

aurareturn•2mo ago
Just curious, do you think LLM makers are deliberately adding training data for your pelican test by now?
simonw•2mo ago
I'll know if they do, because I'll notice that a new model is suspiciously great at drawing pelicans riding bicycles while still sucking at drawing other things.
RestartKernel•2mo ago
Can't believe I'm saying this, but GPT-5's pelican is the most impressive improvement over 4o I've seen. I wonder if Codex's MoE does not contain any expert fit for this task due to its fine-tuning on code.
wahnfrieden•2mo ago
Make sure you check GPT-5.1’s from a couple days ago
karolcodes•2mo ago
Wait, it's not out yet, is it?
wahnfrieden•2mo ago
It's being tested already on huggingface under another name
hnidiots3•2mo ago
Codex isn’t even a good model.
k4rli•2mo ago
It's been praised here. However, in my experience Sonnet 4.5 has produced results as good, if not better.
conception•2mo ago
What I discovered is that different models are simply better at different things, and it's very hard to predict which one will be better than the others at a certain task. Thus no one has similar experiences, because no one is doing the same things in the same way.
7thpower•2mo ago
Based on what? It’s been great for me.
pietz•2mo ago
Since they also limited the usage of Codex CLI quite a bit, this might help. I really like the gpt-5-codex model. It has the most impressive dynamic reasoning effort I've seen: responding instantly to simple questions while thinking for a long time when necessary. It's also interesting how close it is to gpt-5 in many benchmarks. It's not just a good coding model; it's a great agentic model.