Open-Source Refact.ai Agent is #1 on SWE-bench Lite With a 59.7% Score

https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-source-refact-ai/
3•kate_at_refact•1y ago

Comments

kate_at_refact•1y ago
Open-source Refact.ai achieves #1 on SWE-bench Lite with a 59.7% score. Our approach: fully autonomous Agent, no manual intervention needed.

How we did this:

• Prompt strategy: https://github.com/smallcloudai/refact/blob/swe-boosted-prom...
• Claude 3.7 Sonnet as orchestrator
• deep_analysis() tool (powered by o4-mini) for reasoning
• Tool suite for repository exploration, code modification, and testing, used dynamically based on task needs
• One correct solution through iteration!

Autonomy = our core strength.

Refact.ai Agent completes the entire dev workflow independently: plans, executes, tests, self-corrects, and delivers a production-ready result. For each task, it made one multi-step run to generate a single correct solution, creating custom strategies rather than following rigid scripts.

You can read tech details on our SWE-bench approach: https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-sour...

Questions are welcome! You can also try Refact.ai Agent in VS Code and JetBrains: https://linktr.ee/refactai
