
Open-Source Refact.ai Agent is #1 on SWE-bench Lite With a 59.7% Score

https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-source-refact-ai/
3•kate_at_refact•9mo ago

Comments

kate_at_refact•9mo ago
Open-source Refact.ai achieves #1 on SWE-bench Lite with a 59.7% score. Our approach: fully autonomous Agent, no manual intervention needed.

How we did this:

• Prompt strategy: https://github.com/smallcloudai/refact/blob/swe-boosted-prom...
• Claude 3.7 Sonnet as orchestrator
• deep_analysis() tool (powered by o4-mini) for reasoning
• Tool suite for repository exploration, code modification, and testing. Used dynamically based on task needs
• One correct solution through iteration!

Autonomy = our core strength.

Refact.ai Agent completes the entire dev workflow independently: plans, executes, tests, self-corrects, and delivers a production-ready result. For each task, it made one multi-step run to generate a single correct solution, creating custom strategies rather than following rigid scripts.
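
To make the plan / execute / test / self-correct loop concrete, here is a minimal sketch of that kind of control flow. It is not Refact.ai's actual implementation: `run_agent`, `AgentRun`, and every `demo_*` name are hypothetical, the tools are stubs, and no LLM is called. It only illustrates an orchestrator picking tools dynamically and iterating until tests pass.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AgentRun:
    """Record of one multi-step run (hypothetical structure)."""
    steps: List[str] = field(default_factory=list)
    solved: bool = False


def run_agent(
    task: str,
    choose_tool: Callable[[str, List[str]], str],
    tools: Dict[str, Callable[[str], str]],
    tests_pass: Callable[[List[str]], bool],
    max_steps: int = 10,
) -> AgentRun:
    """Pick a tool, apply it, check the result, and repeat until solved or out of budget."""
    run = AgentRun()
    for _ in range(max_steps):
        # The orchestrator decides the next tool from the task and the history so far.
        tool_name = choose_tool(task, run.steps)
        observation = tools[tool_name](task)
        run.steps.append(f"{tool_name}: {observation}")
        if tests_pass(run.steps):
            run.solved = True
            break
    return run


# Stub orchestrator (assumption): cycles explore -> modify -> test.
def demo_choose_tool(task: str, history: List[str]) -> str:
    order = ["explore", "modify", "test"]
    return order[len(history) % len(order)]


# Stub tools (assumption): stand-ins for repository exploration, editing, and testing.
demo_tools: Dict[str, Callable[[str], str]] = {
    "explore": lambda task: "located relevant file",
    "modify": lambda task: "applied patch",
    "test": lambda task: "ran test suite",
}


# Stub check (assumption): the first test run fails, the second succeeds,
# which forces one round of self-correction.
def demo_tests_pass(history: List[str]) -> bool:
    return sum(step.startswith("test:") for step in history) >= 2


result = run_agent("fix bug", demo_choose_tool, demo_tools, demo_tests_pass)
```

In this toy run the loop takes two explore / modify / test cycles before stopping, which mirrors the "one multi-step run, one final solution" idea on a very small scale.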

You can read the technical details of our SWE-bench approach here: https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-sour...

Your questions are welcome! You're also welcome to try Refact.ai Agent in VS Code and JetBrains: https://linktr.ee/refactai
