Open-Source Refact.ai Agent is #1 on SWE-bench Lite With a 59.7% Score

https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-source-refact-ai/
3•kate_at_refact•1y ago

Comments

kate_at_refact•1y ago
Open-source Refact.ai achieves #1 on SWE-bench Lite with a 59.7% score. Our approach: fully autonomous Agent, no manual intervention needed.

How we did this:

• Prompt strategy: https://github.com/smallcloudai/refact/blob/swe-boosted-prom...
• Claude 3.7 Sonnet as orchestrator
• deep_analysis() tool (powered by o4-mini) for reasoning
• Tool suite for repository exploration, code modification, and testing. Used dynamically based on task needs
• One correct solution through iteration!

Autonomy = our core strength.

Refact.ai Agent completes the entire dev workflow independently: plans, executes, tests, self-corrects, and delivers a production-ready result. For each task, it made one multi-step run to generate a single correct solution, creating custom strategies rather than following rigid scripts.
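To make the plan, execute, test, self-correct cycle described above concrete, here is a minimal sketch of such a loop. It is an illustration only and not Refact.ai's actual code: `run_agent`, `RunResult`, `demo`, and the three callbacks are hypothetical names, and the "model" is a toy stub rather than a real LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RunResult:
    solved: bool
    attempts: int
    log: List[str] = field(default_factory=list)


def run_agent(
    plan: Callable[[str, List[str]], str],   # hypothetical: task + feedback -> patch
    apply_patch: Callable[[str], None],      # hypothetical: applies the patch
    run_tests: Callable[[], bool],           # hypothetical: True when tests pass
    task: str,
    max_steps: int = 5,
) -> RunResult:
    """Repeat plan -> execute -> test until tests pass or the budget runs out."""
    log: List[str] = []
    for attempt in range(1, max_steps + 1):
        patch = plan(task, log)   # plan, using feedback from earlier attempts
        apply_patch(patch)        # execute the change
        if run_tests():           # test the result
            log.append(f"attempt {attempt}: tests passed")
            return RunResult(True, attempt, log)
        # Record the failure so the next planning step can self-correct.
        log.append(f"attempt {attempt}: tests failed")
    return RunResult(False, max_steps, log)


def demo() -> RunResult:
    """Toy run: the 'repository' is a counter and tests pass once it reaches 3."""
    state = {"value": 0}

    def toy_plan(task: str, log: List[str]) -> str:
        return "+1"

    def toy_apply(patch: str) -> None:
        state["value"] += 1

    def toy_tests() -> bool:
        return state["value"] >= 3

    return run_agent(toy_plan, toy_apply, toy_tests, task="reach 3")


if __name__ == "__main__":
    result = demo()
    print(result.solved, result.attempts)
```

The single returned `RunResult` mirrors the "one multi-step run, one solution" idea: the loop keeps iterating inside one run instead of sampling several independent candidates and picking among them.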

You can read the technical details of our SWE-bench approach here: https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-sour...

Your questions are welcome! You are also welcome to try Refact.ai Agent in VS Code and JetBrains: https://linktr.ee/refactai
