GPT-5 on SWE-bench: Cost and performance deep-dive

https://mini-swe-agent.com/latest/blog/2024/01/15/gpt-5-on-swe-bench-cost--performance-deep-dive/
4•lieret•6mo ago

Comments

lieret•6mo ago
We evaluated the new GPT models with a minimal agent on SWE-bench Verified. GPT-5 scores 65%, mini 60%, nano 35%. Still behind Opus 4 (68%), on par with Sonnet 4 (65%). But a lot cheaper, especially mini!

Cost is tricky to compare with agents, because agents succeed fast, but fail slowly. If an agent doesn't succeed, it should just continue trying until it succeeds, or hits a run time limit. And that's (almost) what happens.
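
To make the "succeed fast, fail slowly" point concrete, here is a minimal sketch of how failed runs can dominate the average cost per task. Only the 65% success rate comes from the numbers above; the step counts, the step limit, and the per-step price are made-up placeholders, not measurements from the benchmark.

```python
def expected_cost(success_rate, steps_on_success, step_limit, cost_per_step):
    """Average cost per task when failed runs keep going until the step limit."""
    cost_success = steps_on_success * cost_per_step
    cost_failure = step_limit * cost_per_step
    return success_rate * cost_success + (1 - success_rate) * cost_failure


def failure_share(success_rate, steps_on_success, step_limit, cost_per_step):
    """Fraction of the average cost that is spent on runs that never succeed."""
    total = expected_cost(success_rate, steps_on_success, step_limit, cost_per_step)
    failed = (1 - success_rate) * step_limit * cost_per_step
    return failed / total


if __name__ == "__main__":
    # Hypothetical values: 20 steps for a typical success, 250-step limit,
    # $0.01 per step. Only success_rate=0.65 is taken from the thread.
    params = dict(success_rate=0.65, steps_on_success=20,
                  step_limit=250, cost_per_step=0.01)
    print(f"average cost per task: ${expected_cost(**params):.3f}")
    print(f"share spent on failed runs: {failure_share(**params):.0%}")
```

Under these assumed numbers most of the budget goes to the minority of runs that fail, which is why the step or runtime limit has such a large effect on any cost comparison.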

But even so, it's very clear that

1. GPT-5 is cheaper than Sonnet 4
2. GPT-5-mini is _incredibly_ cheap for what it provides (you only sacrifice some 5 percentage points, but end up paying maybe 1/5th of the total cost)
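
One rough way to read the second point is as cost per resolved instance. The sketch below uses only the figures quoted in this comment (65% vs. 60% resolved, and roughly 1/5th of the total cost); the "1/5th" is itself an approximation, so treat the result as a back-of-the-envelope estimate rather than a benchmark number.

```python
def cost_per_resolved(relative_total_cost, resolve_rate):
    """Relative spend needed per successfully resolved instance."""
    return relative_total_cost / resolve_rate


# GPT-5 is the baseline (total cost normalized to 1.0, 65% resolved).
gpt5 = cost_per_resolved(relative_total_cost=1.0, resolve_rate=0.65)

# GPT-5-mini: roughly 1/5th of the total cost, 60% resolved.
mini = cost_per_resolved(relative_total_cost=0.2, resolve_rate=0.60)

# Ratio below 1 means mini pays less per resolved instance than GPT-5.
ratio = mini / gpt5

if __name__ == "__main__":
    print(f"mini pays about {ratio:.0%} of GPT-5's cost per resolved instance")
```

Even after accounting for the lower success rate, mini comes out at a bit more than a fifth of GPT-5's cost per resolved instance under these assumptions.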

All of the code to reproduce our numbers is open-source. There's a box at the bottom of the post with the exact command to run in order to reproduce them.

Also very happy to answer questions here!

techpineapple•6mo ago
I'm curious whether this might help with Cursor's lighting-money-on-fire problem.

https://pivot-to-ai.com/2025/07/09/cursor-tries-setting-less...

Is this enough of a price difference to make Cursor profitable?

lieret•6mo ago
I think gpt-5-mini should really help them. At least from these benchmark scores, there probably shouldn't be a huge performance degradation from letting gpt-5-mini drive most of the workflow. Of course, users might still want to just run with the latest and greatest (but even then gpt-5 will be cheaper, I think).