
Open-Source Refact.ai Agent is #1 on SWE-bench Lite With a 59.7% Score

https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-source-refact-ai/
3•kate_at_refact•1y ago

Comments

kate_at_refact•1y ago
Open-source Refact.ai achieves #1 on SWE-bench Lite with a 59.7% score. Our approach: fully autonomous Agent, no manual intervention needed.

How we did this:

• Prompt strategy: https://github.com/smallcloudai/refact/blob/swe-boosted-prom...
• Claude 3.7 Sonnet as orchestrator
• deep_analysis() tool (powered by o4-mini) for reasoning
• Tool suite for repository exploration, code modification, and testing, used dynamically based on task needs
• One correct solution through iteration!

Autonomy = our core strength.

Refact.ai Agent completes the entire dev workflow independently: plans, executes, tests, self-corrects, and delivers a production-ready result. For each task, it made one multi-step run to generate a single correct solution, creating custom strategies rather than following rigid scripts.
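
To make the plan, execute, test, self-correct loop above concrete, here is a minimal sketch in Python. It is not Refact.ai's implementation: the tool names other than deep_analysis, the AgentState fields, and the hard-coded toy_policy are all hypothetical stand-ins, and no model is called. In the setup described above, the choice of the next tool would come from the orchestrator model rather than from fixed rules.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AgentState:
    """Everything the (hypothetical) orchestrator can see between steps."""
    task: str
    patch: str = ""
    tests_passed: bool = False
    notes: List[str] = field(default_factory=list)
    history: List[str] = field(default_factory=list)


Tool = Callable[[AgentState], None]


def run_agent(task: str,
              choose_tool: Callable[[AgentState], str],
              tools: Dict[str, Tool],
              max_steps: int = 10) -> AgentState:
    """Single multi-step run: pick a tool, apply it, repeat until 'finish'."""
    state = AgentState(task=task)
    for _ in range(max_steps):
        name = choose_tool(state)
        if name == "finish":
            break
        tools[name](state)
        state.history.append(name)
    return state


# --- Stub tools (assumptions; real tools would touch a repository) ---
def explore(state: AgentState) -> None:
    state.notes.append("located suspect function")


def edit(state: AgentState) -> None:
    # The first attempt is deliberately wrong, so the loop has to self-correct.
    state.patch = "fix-v2" if "analysis" in state.notes else "fix-v1"
    state.tests_passed = False


def run_tests(state: AgentState) -> None:
    state.tests_passed = state.patch == "fix-v2"


def deep_analysis(state: AgentState) -> None:
    # Stand-in for a call to a separate reasoning model.
    state.notes.append("analysis")


TOOLS: Dict[str, Tool] = {
    "explore": explore,
    "edit": edit,
    "run_tests": run_tests,
    "deep_analysis": deep_analysis,
}


def toy_policy(state: AgentState) -> str:
    """Rule-based stand-in for the orchestrator's tool choice."""
    if not state.history:
        return "explore"
    last = state.history[-1]
    if last in ("explore", "deep_analysis"):
        return "edit"
    if last == "edit":
        return "run_tests"
    # After a test run: stop on success, otherwise analyse and retry.
    return "finish" if state.tests_passed else "deep_analysis"


if __name__ == "__main__":
    result = run_agent("fix failing test", toy_policy, TOOLS)
    print(result.history, result.patch)
```

The run ends with one patch after a failed first attempt, which mirrors the "one solution through iteration" idea: the loop retries inside a single run instead of sampling several candidate solutions and picking one.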

Technical details on our SWE-bench approach are in the blog post: https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-sour...

Questions are welcome! You can also try Refact.ai Agent in VS Code and JetBrains: https://linktr.ee/refactai
