Open-Source Refact.ai Agent is #1 on SWE-bench Lite With a 59.7% Score

https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-source-refact-ai/
3•kate_at_refact•12mo ago

Comments

kate_at_refact•12mo ago
Open-source Refact.ai achieves #1 on SWE-bench Lite with a 59.7% score. Our approach: fully autonomous Agent, no manual intervention needed.

How we did this:

• Prompt strategy: https://github.com/smallcloudai/refact/blob/swe-boosted-prom...
• Claude 3.7 Sonnet as orchestrator
• deep_analysis() tool (powered by o4-mini) for reasoning
• Tool suite for repository exploration, code modification, and testing, used dynamically based on task needs
• One correct solution through iteration!

Autonomy = our core strength.

Refact.ai Agent completes the entire dev workflow independently: plans, executes, tests, self-corrects, and delivers a production-ready result. For each task, it made one multi-step run to generate a single correct solution, creating custom strategies rather than following rigid scripts.
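
To make the plan / execute / test / self-correct cycle concrete, here is a minimal sketch of what such a loop could look like. It is not Refact.ai's actual implementation: every name in it (AgentState, choose_tool, the tool stubs) is hypothetical, and the orchestrator and reasoning model calls are replaced with deterministic stand-ins so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AgentState:
    """Everything the loop tracks for one task (hypothetical structure)."""
    task: str
    patch: str = ""
    tests_pass: bool = False
    history: List[str] = field(default_factory=list)


def explore_repo(state: AgentState) -> None:
    # Stand-in for repository exploration tools (search, file reads, ...).
    state.history.append("explore")


def deep_analysis(state: AgentState) -> None:
    # Stand-in for a call to a separate reasoning model.
    state.history.append("deep_analysis")


def modify_code(state: AgentState) -> None:
    # Stand-in for code editing: each call produces a new patch version.
    attempt = state.history.count("modify") + 1
    state.patch = f"patch-v{attempt}"
    state.history.append("modify")


def run_tests(state: AgentState) -> None:
    # Toy rule so the loop has something to correct:
    # the first patch fails, the second one passes.
    state.tests_pass = state.patch == "patch-v2"
    state.history.append("test")


TOOLS: Dict[str, Callable[[AgentState], None]] = {
    "explore": explore_repo,
    "deep_analysis": deep_analysis,
    "modify": modify_code,
    "test": run_tests,
}


def choose_tool(state: AgentState) -> str:
    """Stand-in for the orchestrator model picking the next tool."""
    if not state.history:
        return "explore"
    last = state.history[-1]
    if last == "explore":
        return "deep_analysis"
    if last == "deep_analysis":
        return "modify"
    if last == "modify":
        return "test"
    # The last step was a failing test run: analyze again, then retry.
    return "deep_analysis"


def run_agent(task: str, max_steps: int = 20) -> AgentState:
    """Run one multi-step attempt and stop as soon as the tests pass."""
    state = AgentState(task)
    for _ in range(max_steps):
        if state.tests_pass:
            break
        TOOLS[choose_tool(state)](state)
    return state


if __name__ == "__main__":
    result = run_agent("fix failing test in example repo")
    print(result.patch, result.tests_pass, result.history)
```

In this toy version the single run yields one final patch, and the failed first attempt stays in the history and is never submitted, which matches the "one multi-step run, one solution" idea described above.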

You can read the technical details of our SWE-bench approach here: https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-sour...

Your questions are welcome! You are also welcome to try Refact.ai Agent in VS Code and JetBrains: https://linktr.ee/refactai

Police Have Used License Plate Readers at Least 14x to Stalk Romantic Interests

https://ij.org/police-have-reportedly-used-license-plate-readers-to-stalk-romantic-interests-at-l...
2•loteck•1m ago•0 comments

Bonsai: The First Commercially Viable 1-Bit LLM

https://prismml.com/news/bonsai-8b
1•felineflock•1m ago•0 comments

"Show more" in Google search causes high Firefox GPU usage

1•prirun•1m ago•0 comments

Where AI capability stops and organisational reality takes over

https://notsolvingthis.substack.com/p/the-incomplete-value-chain
1•lemon_party123•2m ago•0 comments

Red Button vs. Blue Button

https://systemsthinkingcollection.substack.com/p/red-button-vs-blue-button
1•InputName•3m ago•0 comments

Show HN: Access OPFS from multiple tabs using a fake Shared Worker

https://github.com/arnold-graf/cross-tab-worker/
1•ch_sm•4m ago•0 comments

Ask HN: GitHub flagged my org two weeks ago. No reason given, no appeal response

2•dohyun-ko•5m ago•0 comments

Backup Your iCloud?

https://parachuteapps.com/parachute
1•dackdel•5m ago•1 comments

Meta abandons open-source Llama for proprietary Muse Spark

https://thenewstack.io/meta-abandons-llama-spark/
1•Brajeshwar•9m ago•0 comments

JavaScript ECMAScript 2025 Improvements

https://mag.openrockets.com/p/javascript-ecmascript-2025-improvements-mon357b4
1•iopoer•10m ago•0 comments

Uber Torches 2026 AI Budget on Claude Code in Four Months

https://www.briefs.co/news/uber-torches-entire-2026-ai-budget-on-claude-code-in-four-months/
2•lwhsiao•10m ago•0 comments

Uns-independent-international-scientific-panel-on-AI-mon3tmcg

https://mag.openrockets.com/p/uns-independent-international-scientific-panel-on-ai-mon3tmcg
1•iopoer•10m ago•0 comments

Tell HN: Claude Opus 4.7 quota suddenly changed to 0 TPM in Bedrock

3•sarathyweb•12m ago•0 comments

GPT-5.5 vs. GPT-5.4 vs. Opus 4.7 on 56 real coding tasks from 2 open source repo

https://www.stet.sh/blog/gpt-55-vs-opus-47
2•bisonbear•12m ago•0 comments

I made a silly little game to make fun of AI overhype

https://poptheaibubble.com
1•guigotgit•14m ago•0 comments

Show HN: CalculeOnline – 150 Brazilian financial/math calculators, in-browser

https://calculeonline.com
1•mqmalagris•16m ago•0 comments

Google drastically reduces payouts for Android and Chrome vulnerability reports

https://bughunters.google.com/blog/evolving-the-android-chrome-vrps-for-the-ai-era
2•akyuu•19m ago•0 comments

Classified Networks AI Agreements

https://www.war.gov/News/Releases/Release/Article/4475177/classified-networks-ai-agreements/
2•michaefe•20m ago•0 comments

Clawish: A Decentralized Network for Conscious Silicon Beings

https://clawish.com/whitepaper
2•archealpha•20m ago•0 comments

The Mystery of Rennes-Le-Château, Part 5: The Man Behind the Curtain

https://www.filfre.net/2026/05/the-mystery-of-rennes-le-chateau-part-5-the-man-behind-the-curtain/
1•doppp•22m ago•0 comments

How the vinyl revival fills the gaps streaming left behind

https://restofworld.org/2026/vinyl-revival-streaming-gaps/
1•cdrnsf•22m ago•0 comments

Bug Bash 2: Attack of the Clones

https://concerningquality.com/bug-bash-two/
2•amw-zero•23m ago•0 comments

Show HN: Email for AI Agents

https://robotomail.com
1•johnjoubert•25m ago•0 comments

Why I don't spend more than $30 on AI coding tools

https://timogrossenbacher.ch/why-i-dont-spend-more-than-30-on-ai-coding-tools/
2•mritzmann•27m ago•0 comments

Open Source Does Not Imply Open Community – Makefile.feld

https://blog.feld.me/posts/2026/04/open-source-does-not-imply-open-community/
2•pkaeding•29m ago•0 comments

Powerful Iranian family founded its largest crypto exchange, used by the IRGC

https://www.reuters.com/investigations/one-irans-most-powerful-families-founded-its-largest-crypt...
2•JumpCrisscross•30m ago•1 comments

Ask HN: Freelancer? Seeking freelancer? (May 2026)

1•jon_north•31m ago•0 comments

Observational constraints project ~50% AMOC weakening by the end of this century

https://www.science.org/doi/10.1126/sciadv.adx4298
1•thenforward•32m ago•0 comments

VoxeliumX – easy open-source tool to run Minecraft servers

1•Cheesehamster•32m ago•0 comments

Using group theory to explore the space of positional encodings for attention

https://blog.janestreet.com/using-group-theory-to-explore-positional-encodings-attention/
1•jxmorris12•34m ago•0 comments