Open-Source Refact.ai Agent is #1 on SWE-bench Lite With a 59.7% Score

https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-source-refact-ai/
3•kate_at_refact•9mo ago

Comments

kate_at_refact•9mo ago
Open-source Refact.ai achieves #1 on SWE-bench Lite with a 59.7% score. Our approach: fully autonomous Agent, no manual intervention needed.

How we did this:

• Prompt strategy: https://github.com/smallcloudai/refact/blob/swe-boosted-prom...
• Claude 3.7 Sonnet as orchestrator
• deep_analysis() tool (powered by o4-mini) for reasoning
• Tool suite for repository exploration, code modification, and testing. Used dynamically based on task needs
• One correct solution through iteration!

Autonomy = our core strength.

Refact.ai Agent completes the entire dev workflow independently: plans, executes, tests, self-corrects, and delivers a production-ready result. For each task, it made one multi-step run to generate a single correct solution, creating custom strategies rather than following rigid scripts.
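
To make the plan / execute / test / self-correct cycle concrete, here is a minimal sketch of such a loop. It is not Refact.ai's implementation: the tool names, the `choose_tool` policy (a stand-in for the orchestrator model), and the simulated "first patch fails, second passes" behavior are all hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tools. Each one mutates a shared state dict and returns an
# observation string, standing in for real repository/test tooling.
def explore_repo(state: dict) -> str:
    state["explored"] = True
    return "located suspicious function"

def modify_code(state: dict) -> str:
    state["patch_attempts"] += 1
    state["needs_patch"] = False
    # Simulated outcome: the first patch is wrong, the second is right.
    state["patched_correctly"] = state["patch_attempts"] >= 2
    return f"applied patch #{state['patch_attempts']}"

def run_tests(state: dict) -> str:
    state["tests_pass"] = state["patched_correctly"]
    # A failing test run sends the agent back to patching (self-correction).
    state["needs_patch"] = not state["tests_pass"]
    return "tests passed" if state["tests_pass"] else "tests failed"

TOOLS: Dict[str, Callable[[dict], str]] = {
    "explore_repo": explore_repo,
    "modify_code": modify_code,
    "run_tests": run_tests,
}

def choose_tool(state: dict) -> str:
    """Hypothetical stand-in for the orchestrator model's next-step decision."""
    if not state["explored"]:
        return "explore_repo"
    if state["needs_patch"]:
        return "modify_code"
    return "run_tests"

def run_agent(max_steps: int = 10) -> List[Tuple[str, str]]:
    """Run one multi-step attempt and return the (tool, observation) history."""
    state = {
        "explored": False,
        "needs_patch": True,
        "patch_attempts": 0,
        "patched_correctly": False,
        "tests_pass": False,
    }
    history: List[Tuple[str, str]] = []
    for _ in range(max_steps):
        tool = choose_tool(state)
        observation = TOOLS[tool](state)
        history.append((tool, observation))
        if state["tests_pass"]:
            break  # a single solution is delivered once tests pass
    return history

if __name__ == "__main__":
    for tool, observation in run_agent():
        print(f"{tool}: {observation}")
```

In this sketch the loop picks tools based on the current state rather than a fixed script, and a failed test run triggers another patch attempt within the same run. A real agent would replace `choose_tool` with a model call and the stubs with actual tooling.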

You can read the technical details of our SWE-bench approach here: https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-sour...

Your questions are welcome! You're also welcome to try Refact.ai Agent in VS Code and JetBrains: https://linktr.ee/refactai
