
15× vs. ~1.37×: Recalculating GPT-5.3-Codex-Spark on SWE-Bench Pro

https://twitter.com/nvanlandschoot/status/2022385829596078100

2•nvanlandschoot•1h ago

Comments

nvanlandschoot•1h ago

Method: I used OpenAI’s published SWE-Bench Pro chart points and matched GPT-5.3-Codex-Spark to the baseline model at comparable accuracy levels, by reasoning effort. At similar accuracy, the effective speedup is about 1.37×, not 15×.
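For readers who want to try the same kind of comparison, here is a minimal sketch of accuracy-matched speedup. It is not the poster's actual calculation: the function names are hypothetical, linear interpolation between chart points is an assumption, and the numbers are made up for illustration rather than read from OpenAI's chart.

```python
from bisect import bisect_left

def time_at_accuracy(points, target):
    """Interpolate the time a model needs to reach `target` accuracy.

    `points` is a list of (accuracy, seconds) pairs, one per reasoning-effort
    setting, sorted by strictly increasing accuracy. Returns None when
    `target` lies outside the measured accuracy range (no extrapolation).
    """
    accs = [a for a, _ in points]
    if target < accs[0] or target > accs[-1]:
        return None
    i = bisect_left(accs, target)
    if accs[i] == target:
        return points[i][1]
    (a0, t0), (a1, t1) = points[i - 1], points[i]
    # Linear interpolation between the two neighbouring chart points.
    return t0 + (t1 - t0) * (target - a0) / (a1 - a0)

def matched_speedups(baseline, fast):
    """For each point of the fast model, compare against the baseline
    at the same accuracy: speedup = baseline_time / fast_time."""
    out = []
    for acc, t_fast in fast:
        t_base = time_at_accuracy(baseline, acc)
        if t_base is not None:
            out.append((acc, t_base / t_fast))
    return out

# Made-up numbers for illustration only, NOT values from the published chart.
baseline_points = [(40.0, 100.0), (50.0, 200.0)]  # (accuracy %, seconds)
fast_points = [(45.0, 100.0)]

print(matched_speedups(baseline_points, fast_points))
```

The point of matching on accuracy first is that a raw tokens-per-second ratio compares runs that do not solve the same share of tasks; dividing times at equal accuracy compares like with like.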

solarkraft•26m ago

> The narrative from AI companies hasn’t really changed, but the reaction has. The same claims get repeated so often that they start to feel like baseline reality, and people begin to assume the models are far more capable than they actually are.

This has been the case for people who buy into hype and don’t actually use the products, but I’m pretty sure people who do are pretty disillusioned by all the claims. The only somewhat reliable method is to test the things for your own use case.

That said: I always expected the tradeoff of Spark to be accuracy vs. speed. That it’s still significantly faster at the same accuracy is wild. I never expected that.

Show HN: We open-sourced MusePro, a Metal-based realtime AI drawing app for iOS

https://github.com/StyleOf/MusePro

1•okaris•4m ago•0 comments

Launching Interop 2026

https://hacks.mozilla.org/2026/02/launching-interop-2026/

1•linolevan•5m ago•1 comment

Show HN: Create a clean tree graph of your projects with my App on iOS

https://apps.apple.com/us/app/motive-project-visualiser/id6754777255

1•Seth_k•7m ago•0 comments

Seven Billion Reasons for Facebook to Abandon Its Face Recognition Plans

https://www.eff.org/deeplinks/2026/02/seven-billion-reasons-facebook-abandon-its-face-recognition...

2•hn_acker•8m ago•0 comments

Andreessen vs. Thiel

https://web.archive.org/web/20200318115004/https://allenleein.github.io/2019/06/12/games2.html

1•eamag•11m ago•0 comments

Show HN: Infoseclist.com – Compare 90 cybersecurity tools ranked by practition

https://infoseclist.com/

1•aleks5678•11m ago•0 comments

Show HN: Clonar – A Node.js RAG pipeline with 8-stage multihop reasoning

https://github.com/clonar714-jpg/clonar

1•sowmith-tsrc•12m ago•1 comment

Grub 2.0

https://grubcrawler.dev

2•kordlessagain•12m ago•0 comments

Cmux: Tmux for Claude Code

https://github.com/craigsc/cmux

2•Soupy•13m ago•1 comment

Trump FTC wants Apple News to promote more Fox News and Breitbart stories

https://arstechnica.com/tech-policy/2026/02/trump-ftc-denies-being-speech-police-but-says-apple-n...

4•pseudalopex•14m ago•0 comments

Posteo and Mailbox.org: Many authorities do not create encrypted requests

https://www.heise.de/en/news/Posteo-and-Mailbox-org-Many-authorities-do-not-create-encrypted-requ...

2•doener•14m ago•0 comments

Google Might Think Your Website Is Down

https://codeinput.com/blog/google-seo

2•janpio•15m ago•0 comments

Show HN: TrustVector – Trust evaluations for AI models, agents, & MCP

https://github.com/guard0-ai/TrustVector

1•hckdisc•17m ago•1 comment

An AI Agent Published a Hit Piece on Me [pdf]

https://img.sauf.ca/pictures/2026-02-12/88fce2f8bbe49f40d83dec69800a2aa9.pdf

1•ColinWright•17m ago•2 comments

4K Restoration: 1984 Super Bowl Apple Macintosh Ad by Ridley Scott [video]

https://www.youtube.com/watch?v=ErwS24cBZPc

1•ipnon•18m ago•0 comments

Show HN: First Embeddable Web Agent

https://www.rtrvr.ai/blog/10-billion-proof-point-every-website-needs-ai-agent

2•arjunchint•19m ago•0 comments

Major 'vibe-coding' platform Orchids is easily hacked, researcher finds

https://www.bbc.com/news/articles/cy4wnw04e8wo

2•ColinWright•19m ago•0 comments

Resist and Unsubscribe

https://www.resistandunsubscribe.com

3•anielsen•22m ago•1 comment

Auto CPU freq rust port

https://github.com/Zamanhuseyinli/auto-cpufreq-rust

1•goychay23•22m ago•1 comment

Remote Labor Index: Measuring AI Automation of Remote Work

https://arxiv.org/abs/2510.26787

1•Leynos•22m ago•0 comments

AI bot crabby-rathbun is still polluting open source

https://www.nickolinger.com/blog/2026-02-13-ai-bot-crabby-rathbun-is-still-going/

1•olingern•23m ago•2 comments

How often do full-body MRIs find cancer?

https://www.usatoday.com/story/life/health-wellness/2026/02/11/full-body-mris-cancer-aneurysm/883...

2•brandonb•23m ago•0 comments

Show HN: Reddit Online User Tracker – Find the Best Time to Post on Reddit

https://spectreseo.com/tools/best-time-to-post-on-reddit

1•warrenjday•24m ago•0 comments

Show HN: Rampart – Runtime firewall for Claude Code and AI agents in YOLO mode

https://github.com/peg/rampart

2•trevxr•25m ago•0 comments

Top Free Tools to Spice Up Your Valorant Stream (2026)

https://killervibe.app/blog/top-5-free-tools-valorant-stream

1•Jikouken•27m ago•0 comments

OpenAI has deleted the word 'safely' from its mission

https://theconversation.com/openai-has-deleted-the-word-safely-from-its-mission-and-its-new-struc...

108•DamnInteresting•28m ago•28 comments

Show HN: Darius – An AI router that selects the best model for each prompt

https://withdarius.com

3•mazenkurdi•30m ago•0 comments

GE-Proton10-30

https://github.com/GloriousEggroll/proton-ge-custom/releases/tag/GE-Proton10-30

1•linux4dummies•33m ago•0 comments

Workledger – An offline first engineering notebook

https://about.workledger.org/

4•birdculture•34m ago•1 comment

I'm a Professional Chef in Antarctica

https://www.theguardian.com/lifeandstyle/2026/feb/13/experience-im-a-professional-chef-in-antarctica

3•bookofjoe•34m ago•0 comments