
Open-Source Refact.ai Agent is #1 on SWE-bench Lite With a 59.7% Score

https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-source-refact-ai/
3•kate_at_refact•8mo ago

Comments

kate_at_refact•8mo ago
Open-source Refact.ai achieves #1 on SWE-bench Lite with a 59.7% score. Our approach: fully autonomous Agent, no manual intervention needed.

How we did this:

• Prompt strategy: https://github.com/smallcloudai/refact/blob/swe-boosted-prom...
• Claude 3.7 Sonnet as orchestrator
• deep_analysis() tool (powered by o4-mini) for reasoning
• Tool suite for repository exploration, code modification, and testing. Used dynamically based on task needs
• One correct solution through iteration!

Autonomy = our core strength.

Refact.ai Agent completes the entire dev workflow independently: plans, executes, tests, self-corrects, and delivers a production-ready result. For each task, it made one multi-step run to generate a single correct solution, creating custom strategies rather than following rigid scripts.
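
To make the plan / execute / test / self-correct loop described above concrete, here is a minimal sketch. It is not Refact.ai's actual implementation: the `AgentRun` class, the `scripted_policy` function, and every stub tool are hypothetical stand-ins. Only the name `deep_analysis` is borrowed from the post, and here it is a plain string function rather than a model call. The policy plays the role of the orchestrator model and picks one tool per step until it submits a single solution.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Tool = Callable[[str], str]
Step = Tuple[str, str]  # (tool name, tool result)


@dataclass
class AgentRun:
    """One multi-step run that ends with a single submitted solution (hypothetical)."""
    task: str
    tools: Dict[str, Tool]
    max_steps: int = 10
    log: List[Step] = field(default_factory=list)

    def execute(self, policy: Callable[[str, List[Step]], Step]) -> str:
        for _ in range(self.max_steps):
            name, arg = policy(self.task, self.log)
            if name == "submit":
                return arg
            # Tools are chosen dynamically by the policy, not by a fixed script.
            self.log.append((name, self.tools[name](arg)))
        raise RuntimeError("step budget exhausted without a solution")


def scripted_policy(task: str, log: List[Step]) -> Step:
    """Stand-in for the orchestrator model: explore, analyze, patch, test, retry."""
    if not log:
        return ("explore", task)
    last_name, last_result = log[-1]
    if last_name == "explore":
        return ("deep_analysis", last_result)
    if last_name == "deep_analysis":
        return ("patch", last_result)
    if last_name == "patch":
        return ("test", last_result)
    if last_result == "pass":
        # Tests passed: submit the most recent patch as the single solution.
        return ("submit", next(r for n, r in reversed(log) if n == "patch"))
    # Tests failed: self-correct by re-analyzing the failure.
    return ("deep_analysis", last_result)


def make_stub_tools() -> Dict[str, Tool]:
    """Stub tools; the test tool fails once, then passes, to exercise the retry path."""
    attempts = {"n": 0}

    def run_tests(_patch: str) -> str:
        attempts["n"] += 1
        return "pass" if attempts["n"] >= 2 else "fail"

    return {
        "explore": lambda query: f"files relevant to: {query}",
        "deep_analysis": lambda ctx: f"plan based on: {ctx}",
        "patch": lambda plan: f"diff for: {plan}",
        "test": run_tests,
    }


if __name__ == "__main__":
    agent = AgentRun(task="fix failing test", tools=make_stub_tools())
    print(agent.execute(scripted_policy))
    print([name for name, _ in agent.log])
```

In this sketch the first test run fails, so the policy routes back through analysis and patching before submitting. A real agent would replace `scripted_policy` with model calls and the stubs with actual repository, editing, and test tools.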

You can read tech details on our SWE-bench approach: https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-sour...

Your questions are welcome! You're also welcome to try Refact.ai Agent in VS Code and JetBrains: https://linktr.ee/refactai
