Open-Source Refact.ai Agent is #1 on SWE-bench Lite With a 59.7% Score

https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-source-refact-ai/
3•kate_at_refact•10mo ago

Comments

kate_at_refact•10mo ago
Open-source Refact.ai achieves #1 on SWE-bench Lite with a 59.7% score. Our approach: fully autonomous Agent, no manual intervention needed.

How we did this:

• Prompt strategy: https://github.com/smallcloudai/refact/blob/swe-boosted-prom...
• Claude 3.7 Sonnet as orchestrator
• deep_analysis() tool (powered by o4-mini) for reasoning
• Tool suite for repository exploration, code modification, and testing, used dynamically based on task needs
• One correct solution through iteration!

Autonomy = our core strength.

Refact.ai Agent completes the entire dev workflow independently: plans, executes, tests, self-corrects, and delivers a production-ready result. For each task, it made one multi-step run to generate a single correct solution, creating custom strategies rather than following rigid scripts.
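
For readers who want a concrete picture of the plan, execute, test, self-correct loop described above, here is a minimal Python sketch. It is not Refact.ai's actual code: `run_agent`, the `edit` and `run_tests` tools, and the scripted `fake_orchestrator` are all hypothetical stand-ins. In a real setup the orchestrator would be a model call rather than a hard-coded policy.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types: an action is (tool_name, argument);
# history is a list of (action, observation) pairs.
Action = Tuple[str, str]
History = List[Tuple[Action, str]]


def run_agent(
    orchestrator: Callable[[str, History], Action],
    tools: Dict[str, Callable[[str], str]],
    task: str,
    max_steps: int = 20,
) -> str:
    """Run one multi-step episode and return a single final answer."""
    history: History = []
    for _ in range(max_steps):
        tool_name, arg = orchestrator(task, history)
        if tool_name == "finish":
            return arg
        tool = tools.get(tool_name)
        observation = tool(arg) if tool else f"unknown tool: {tool_name}"
        history.append(((tool_name, arg), observation))
    raise RuntimeError("step budget exhausted without a final answer")


def fake_orchestrator(task: str, history: History) -> Action:
    """Scripted stand-in for a model: edit, test, retry on failure, finish on pass."""
    if not history:
        return ("edit", "first attempt")
    (last_tool, _), last_obs = history[-1]
    if last_tool == "edit":
        return ("run_tests", "")
    if last_tool == "run_tests" and last_obs == "FAIL":
        return ("edit", "second attempt")
    return ("finish", "patch ready")


def make_tools() -> Dict[str, Callable[[str], str]]:
    """Toy tools: the fake test suite passes only after two edits."""
    state = {"edits": 0}

    def edit(arg: str) -> str:
        state["edits"] += 1
        return f"applied {arg}"

    def run_tests(_: str) -> str:
        return "PASS" if state["edits"] >= 2 else "FAIL"

    return {"edit": edit, "run_tests": run_tests}


result = run_agent(fake_orchestrator, make_tools(), "fix the bug")
```

The point of the sketch is the control flow: the orchestrator picks the next tool from the history so far, a failed test run feeds back into another edit, and the episode ends with exactly one returned solution or an explicit error when the step budget runs out.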

You can read tech details on our SWE-bench approach: https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-sour...

Your questions are welcome! You're also welcome to try Refact.ai Agent in VS Code and JetBrains: https://linktr.ee/refactai
