Open-Source Refact.ai Agent is #1 on SWE-bench Lite With a 59.7% Score

https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-source-refact-ai/
3•kate_at_refact•11mo ago

Comments

kate_at_refact•11mo ago
Open-source Refact.ai achieves #1 on SWE-bench Lite with a 59.7% score. Our approach: fully autonomous Agent, no manual intervention needed.

How we did this:

• Prompt strategy: https://github.com/smallcloudai/refact/blob/swe-boosted-prom...
• Claude 3.7 Sonnet as orchestrator
• deep_analysis() tool (powered by o4-mini) for reasoning
• Tool suite for repository exploration, code modification, and testing. Used dynamically based on task needs
• One correct solution through iteration!

Autonomy = our core strength.

Refact.ai Agent completes the entire dev workflow independently: plans, executes, tests, self-corrects, and delivers a production-ready result. For each task, it made one multi-step run to generate a single correct solution, creating custom strategies rather than following rigid scripts.

You can read the technical details of our SWE-bench approach here: https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-sour...

Your questions are welcome! You're also welcome to try Refact.ai Agent in VS Code and JetBrains: https://linktr.ee/refactai

Over 4,732 Messages, He Fell in Love with an AI Chatbot. Now He's Dead

https://www.wsj.com/tech/ai/google-gemini-jonathan-gavalas-death-07351ab2
1•Brajeshwar•2m ago•0 comments

Firms Promised High-Tech Ransomware Solutions; They Usually Just Pay Hackers (2019)

https://features.propublica.org/ransomware/ransomware-attack-data-recovery-firms-paying-hackers/
1•bookofjoe•7m ago•0 comments

Show HN: SpecSource – AI That Writes Linear Specs from Sentry, GitHub, & Slack

https://www.specsource.ai
1•bring-shrubbery•7m ago•0 comments

The philosophy of great customer service (2014)

https://sive.rs/cs
2•Michelangelo11•8m ago•0 comments

Touch some grass or how to survive during AI times

https://www.mcbaguetti.xyz/touchgrass.html
1•mcbaguetti•8m ago•0 comments

OpenAI's vision for AI economy: public wealth funds, robot taxes, 4-day workweek

https://techcrunch.com/2026/04/06/openais-vision-for-the-ai-economy-public-wealth-funds-robot-tax...
1•genphy1976•10m ago•0 comments

Show HN: Real-Time OLAP Infrastructure

https://modolap.com
2•ronfriedhaber•11m ago•0 comments

Why weekends are under threat

https://thehustle.co/originals/why-weekends-are-under-threat
13•Anon84•18m ago•1 comment

Glass – a replay-first bounded investigation surface for runtime activity

https://github.com/StealthEyeLLC/glass
1•stealtheyellc•18m ago•0 comments

Agentsview: A local-first desktop and web app for browsing AI agent sessions

https://github.com/wesm/agentsview
1•ayhanfuat•18m ago•0 comments

Stonks-CLI – track your investment portfolio from the terminal

https://github.com/igoropaniuk/stonks-cli
1•friedchocolate•18m ago•0 comments

We have a 99% email reputation. Gmail disagrees

https://blogfontawesome.wpcomstaging.com/we-have-a-99-email-reputation-gmail-disagrees/
2•em-bee•19m ago•0 comments

Troubleshooting Email Delivery to Microsoft Users

https://rozumem.xyz/posts/14
1•rozumem•21m ago•0 comments

Pact: Protocol for Agent Coordination and Transfer (Draft RFC)

https://github.com/noahfavreau/pact-protocol/blob/main/pact-spec/docs/PACT-RFC-001.md
1•legax•22m ago•0 comments

Show HN: Chunk – macOS menu bar time-blocking app with Claude AI integration

https://www.chunkapp.net
2•dudleyspence•23m ago•0 comments

Show HN: OIFI Databook: An intelligence database of Iran's power structure

https://databook.oifi.org
1•ksajadi•27m ago•0 comments

One Vendor. 80% of Dutch Hospitals. Ransomware

https://stateofsurveillance.org/news/chipsoft-ransomware-dutch-hospitals-80-percent-patient-recor...
2•Ey7NFZ3P0nzAe•27m ago•0 comments

Windows debloating tools are basically useless

https://www.pcmag.com/explainers/i-tested-4-windows-debloating-tools-spoiler-theyre-basically-use...
1•dryadin•27m ago•0 comments

What if local control can help build housing?

https://www.noahpinion.blog/p/what-if-local-control-can-actually
1•firasd•28m ago•0 comments

Coders and testers help me out

https://github.com/andlind/almondsrc
1•andreasli72•29m ago•1 comment

Tell HN: docker pull fails in Spain due to football Cloudflare block

4•littlecranky67•33m ago•0 comments

Why AI Sucks at Front End

https://nerdy.dev/why-ai-sucks-at-front-end
1•tobr•36m ago•0 comments

Playing a nomic could be used to build small directly-democratic organizations

https://democranomic.neocities.org/
1•cobber2005•37m ago•1 comment

You Don't Need Claude Code

https://tildehacker.com/you-dont-need-claude-code
3•tildehacker•39m ago•1 comment

Editorial India B788 Ahmedabad 2025-06-12 lost height shortly after takeoff

https://avherald.com/h?article=52b0a800/0000
2•hugh-avherald•40m ago•0 comments

Please Review It (Vital Weave)

https://hackathoncbc.streamlit.app/
1•vignesh_146•40m ago•2 comments

Bring Back Idiomatic Design

https://essays.johnloeber.com/p/4-bring-back-idiomatic-design
3•phil294•40m ago•0 comments

Apple, Still

https://taoofmac.com/space/blog/2026/04/12/1330
2•rcarmo•43m ago•0 comments

BrickFormer Source Code Released

https://github.com/loryruta/brickformer
2•loryruta•50m ago•0 comments

LLM Wiki Skill: Build a Second Brain with Claude Code and Obsidian

https://medium.com/@alirezarezvani/llm-wiki-skill-build-a-second-brain-with-claude-code-and-obsid...
2•jungard•55m ago•0 comments