Open-Source Refact.ai Agent is #1 on SWE-bench Lite With a 59.7% Score

https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-source-refact-ai/
3•kate_at_refact•1y ago

Comments

kate_at_refact•1y ago
Open-source Refact.ai achieves #1 on SWE-bench Lite with a 59.7% score. Our approach: fully autonomous Agent, no manual intervention needed.

How we did this:

• Prompt strategy: https://github.com/smallcloudai/refact/blob/swe-boosted-prom...
• Claude 3.7 Sonnet as orchestrator
• deep_analysis() tool (powered by o4-mini) for reasoning
• Tool suite for repository exploration, code modification, and testing. Used dynamically based on task needs
• One correct solution through iteration!

Autonomy = our core strength.

Refact.ai Agent completes the entire dev workflow independently: plans, executes, tests, self-corrects, and delivers a production-ready result. For each task, it made one multi-step run to generate a single correct solution, creating custom strategies rather than following rigid scripts.
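
A rough sketch of what a plan / execute / test / self-correct loop of this kind might look like. This is not Refact.ai's code: apart from deep_analysis, every name here (RunState, choose_tool, explore_repo, modify_code, run_tests, run_agent) is hypothetical, and the model calls are replaced by deterministic stubs so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class RunState:
    """Everything the agent knows during one run (illustrative fields only)."""
    task: str
    patch: str = ""
    tests_passed: bool = False
    history: List[str] = field(default_factory=list)


# Hypothetical tools. A real agent would read a repository and run a test suite.
def explore_repo(state: RunState) -> str:
    return f"located code related to: {state.task}"


def deep_analysis(state: RunState) -> str:
    # Stand-in for a call to a reasoning model; returns a fixed plan.
    return "plan: guard against empty input"


def modify_code(state: RunState) -> str:
    # The first attempt is deliberately wrong so the loop has something to correct.
    attempts = sum(1 for h in state.history if h.startswith("modify_code"))
    state.patch = "fixed patch" if attempts else "buggy patch"
    return f"wrote {state.patch}"


def run_tests(state: RunState) -> str:
    state.tests_passed = state.patch == "fixed patch"
    return "tests passed" if state.tests_passed else "tests failed"


TOOLS: Dict[str, Callable[[RunState], str]] = {
    "explore_repo": explore_repo,
    "deep_analysis": deep_analysis,
    "modify_code": modify_code,
    "run_tests": run_tests,
}


def choose_tool(state: RunState) -> str:
    """Stand-in for the orchestrator model: pick the next tool from the run so far."""
    if not state.history:
        return "explore_repo"
    last = state.history[-1]
    if last.startswith("explore_repo"):
        return "deep_analysis"
    if last.startswith("deep_analysis"):
        return "modify_code"
    if last.startswith("modify_code"):
        return "run_tests"
    # Tests failed: go back to analysis before retrying the edit.
    return "deep_analysis"


def run_agent(task: str, max_steps: int = 12) -> RunState:
    """One multi-step run that stops at the first passing patch or the step budget."""
    state = RunState(task=task)
    for _ in range(max_steps):
        name = choose_tool(state)
        result = TOOLS[name](state)
        state.history.append(f"{name}: {result}")
        if state.tests_passed:
            break
    return state


if __name__ == "__main__":
    final = run_agent("fix crash on empty list")
    for line in final.history:
        print(line)
```

The point of the sketch is the control flow: the next tool is chosen from the run history rather than from a fixed script, a failed test run feeds back into another analysis and edit, and the run ends with a single patch.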

You can read tech details on our SWE-bench approach: https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-sour...

Your questions are welcome! You're also welcome to try Refact.ai Agent in VS Code and JetBrains: https://linktr.ee/refactai

Training within 10 hours of bedtime measurably hurts recovery

https://tryterra.co/research/best-time-to-workout-for-sleep
1•kyriakosel•1m ago•0 comments

Company Was an American Success Story. Until MAHA Influencers Sank It

https://www.thefp.com/p/this-company-was-an-american-success-until-maha-sank-it
1•rmason•1m ago•0 comments

Trump Order Would Push Banks to Check Clients' Citizenship Status

https://www.wsj.com/finance/regulation/trump-order-would-push-banks-to-check-clients-citizenship-...
1•petethomas•6m ago•0 comments

You Can't Park at 0.1c

https://samirvarma.substack.com/p/you-cant-quietly-park-at-01c
1•qsi•7m ago•0 comments

What are account recovery options with FusionAuth?

https://fusionauth.io/community/forum/topic/3133/what-are-account-recovery-options-with-fusionauth
1•mooreds•12m ago•0 comments

Why "Fast AI" and "Safe AI" Were Never in Conflict

https://www.enkryptai.com/blog/why-fast-ai-and-safe-ai-were-never-actually-in-conflict
1•mooreds•13m ago•0 comments

Etymonline: An Online Etymology Dictionary

https://www.etymonline.com
1•airstrike•14m ago•0 comments

Debunking the job-hopping myth: A Data-Driven Look at Tenure and Turnover [pdf]

https://www.nirsonline.org/wp-content/uploads/2025/09/25-Debunking-Job-Hopping-Myth_FINAL.pdf
2•gmays•16m ago•0 comments

Enterprise AI: Mystery Meat, Kill Zones, Cognitive Surrender, Vibe Bombs

https://kyield.com/insights/newsletter/2026/05/vibe-bombs-cognitive-surrender.html
1•mooreds•16m ago•0 comments

Apple Taps Virtual Avatar Firm Animato's Expertise and Intellectual Property

https://www.macrumors.com/2026/05/19/apple-acquires-animato/
1•mgh2•18m ago•0 comments

Cyoda-Go: The Enterprise Edbms

https://github.com/Cyoda-platform/cyoda-go
1•petethomas•22m ago•0 comments

Alternatives to HN for "tech outside of AI" discussion?

2•summonerOS•23m ago•0 comments

Customizing an LLM for Enterprise Software Engineering

https://arxiv.org/abs/2605.16517
1•azhenley•23m ago•0 comments

Zephex is hosted MCP that gives AI coding editors persistent project context

https://zephex.dev
1•zephex•27m ago•0 comments

Repugnant Economics

https://marginalrevolution.com/marginalrevolution/2026/05/repugnant-economics.html
3•paulpauper•31m ago•0 comments

Show HN: SafeRun – Replay debugging and inline prevention for AI agents

1•Tidianez•31m ago•0 comments

Old Space, New Space: A Commercial Revolution in Innovation?

https://www.nber.org/papers/w35212
1•paulpauper•31m ago•0 comments

Australia and Pax Silica: The Quiet Foundations of a New Western Order?

https://sldinfo.com/2026/05/australia-and-pax-silica-the-quiet-foundations-of-a-new-western-order/
1•Gaishan•32m ago•0 comments

Sen. Cassidy casts deciding vote in legislation to end Iran war [video][40s]

https://www.youtube.com/watch?v=S-h8oM7kGEE
2•Bender•33m ago•0 comments

Vacuum flux has memory too [4]

1•sargstuff•33m ago•0 comments

Meta Begins 8k Global Job Cuts in Asian Hub of Singapore

https://finance.yahoo.com/sectors/technology/articles/meta-begins-8-000-global-004153394.html
3•doppp•36m ago•0 comments

Copyright Office Rejected My Attempt to Copyright a Tweet

https://www.techdirt.com/2014/08/04/copyright-office-rejected-my-attempt-to-copyright-tweet/
2•danhite•41m ago•1 comment

Ember: 365-day audited record of AI models vs. Polymarket, scored by Brier

https://emberfyi.com/
1•emberfyi•42m ago•0 comments

AI video editing is blowing my mind

https://aivideoediting.io/
1•pekingzcc•47m ago•0 comments

Architect of the UK Online Safety Act Calls for Its Complete Repeal

https://prestonbyrne.com/2026/05/19/architect-of-the-uk-online-safety-act-calls-for-its-complete-...
5•iamnothere•48m ago•0 comments

Rendezvous: A serverless, Zoom-like video conferencing web app

https://github.com/predatorray/rendezvous
2•zetaplusae•54m ago•0 comments

Generalization Dynamics of LM Pre-Training

https://jiaxin-wen.github.io/blog/generalization-dynamics
1•gmays•55m ago•0 comments

Iteration

https://blog.viktomas.com/posts/iteration/
1•luca-sctr•56m ago•0 comments

GPU telemetry anomaly: 146W idle draw on A100 (white paper)

https://github.com/mikebains41-debug/ai-gpu-energy-optimizer-/blob/main/WHITEPAPER.md
1•mikebains•1h ago•0 comments

Who Wins the Future: Chips vs. Frontier LLMs

https://medium.com/@vektormemory/who-wins-the-future-chips-vs-frontier-llms-1e8e0ca42641
2•vektormemory•1h ago•1 comment