Open-Source Refact.ai Agent is #1 on SWE-bench Lite With a 59.7% Score

https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-source-refact-ai/
3•kate_at_refact•9mo ago

Comments

kate_at_refact•9mo ago
Open-source Refact.ai achieves #1 on SWE-bench Lite with a 59.7% score. Our approach: fully autonomous Agent, no manual intervention needed.

How we did this:

• Prompt strategy: https://github.com/smallcloudai/refact/blob/swe-boosted-prom...
• Claude 3.7 Sonnet as orchestrator
• deep_analysis() tool (powered by o4-mini) for reasoning
• Tool suite for repository exploration, code modification, and testing. Used dynamically based on task needs
• One correct solution through iteration!

Autonomy = our core strength.

Refact.ai Agent completes the entire dev workflow independently: plans, executes, tests, self-corrects, and delivers a production-ready result. For each task, it made one multi-step run to generate a single correct solution, creating custom strategies rather than following rigid scripts.
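
To make the plan / execute / test / self-correct loop concrete, here is a minimal sketch in Python. It is not Refact.ai's implementation: the tool names (explore, patch, run_tests), the toy in-memory repo, and the scripted choose_action policy are all hypothetical stand-ins. In a real agent, the orchestrator model would pick the next tool and deep_analysis would call a reasoning model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Tool = Callable[[str], str]


@dataclass
class AgentState:
    task: str
    # Each entry is (tool_name, argument, result).
    history: List[Tuple[str, str, str]] = field(default_factory=list)
    done: bool = False


def run_agent(task: str,
              tools: Dict[str, Tool],
              choose_action: Callable[[AgentState], Tuple[str, str]],
              max_steps: int = 10) -> AgentState:
    """Single multi-step run: pick a tool, call it, record the result, repeat."""
    state = AgentState(task=task)
    for _ in range(max_steps):
        tool_name, arg = choose_action(state)
        if tool_name == "finish":
            state.done = True
            break
        result = tools[tool_name](arg)
        state.history.append((tool_name, arg, result))
    return state


def make_demo():
    """Build a toy repo, hypothetical tools, and a scripted policy."""
    repo = {"calc.py": "def add(a, b): return a - b"}  # deliberately buggy

    def explore(query: str) -> str:
        return ", ".join(name for name, src in repo.items() if query in src)

    def deep_analysis(question: str) -> str:
        # Stand-in for a reasoning-model call; returns a canned suggestion.
        return "replace 'a - b' with 'a + b'"

    def patch(instruction: str) -> str:
        # Hard-coded fix for the demo; ignores the instruction text.
        repo["calc.py"] = repo["calc.py"].replace("a - b", "a + b")
        return "patched"

    def run_tests(_: str) -> str:
        namespace: dict = {}
        exec(repo["calc.py"], namespace)
        return "pass" if namespace["add"](2, 3) == 5 else "fail"

    tools = {"explore": explore, "deep_analysis": deep_analysis,
             "patch": patch, "run_tests": run_tests}

    def choose_action(state: AgentState) -> Tuple[str, str]:
        # Stand-in for the orchestrator model: decide from the last result.
        if not state.history:
            return ("explore", "def add")
        last_tool, _, last_result = state.history[-1]
        if last_tool == "explore":
            return ("run_tests", "")
        if last_tool == "run_tests":
            if last_result == "pass":
                return ("finish", "")
            return ("deep_analysis", "why does add fail?")
        if last_tool == "deep_analysis":
            return ("patch", last_result)
        return ("run_tests", "")  # after a patch, re-test

    return tools, choose_action


if __name__ == "__main__":
    tools, choose_action = make_demo()
    final = run_agent("fix add()", tools, choose_action)
    for tool_name, _, result in final.history:
        print(tool_name, "->", result)
```

The point of the sketch is only the control flow: the next tool is chosen from what the previous step returned rather than from a fixed script, and a failing test feeds back into analysis and another patch before the run finishes.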

You can read the technical details of our SWE-bench approach here: https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-sour...

Questions are welcome! You can also try Refact.ai Agent in VS Code and JetBrains: https://linktr.ee/refactai
