frontpage.

Open-Source Refact.ai Agent is #1 on SWE-bench Lite With a 59.7% Score

https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-source-refact-ai/
3•kate_at_refact•9mo ago

Comments

kate_at_refact•9mo ago
Open-source Refact.ai achieves #1 on SWE-bench Lite with a 59.7% score. Our approach: a fully autonomous Agent, with no manual intervention needed.

How we did this:

• Prompt strategy: https://github.com/smallcloudai/refact/blob/swe-boosted-prom...
• Claude 3.7 Sonnet as orchestrator
• deep_analysis() tool (powered by o4-mini) for reasoning
• Tool suite for repository exploration, code modification, and testing. Used dynamically based on task needs
• One correct solution through iteration!

Autonomy = our core strength.

Refact.ai Agent completes the entire dev workflow independently: plans, executes, tests, self-corrects, and delivers a production-ready result. For each task, it made one multi-step run to generate a single correct solution, creating custom strategies rather than following rigid scripts.
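To make the plan, execute, test, self-correct loop concrete, here is a minimal toy sketch in Python. It is not Refact.ai's implementation: `AgentRun`, `toy_policy`, the tool functions, and the state keys are all hypothetical names invented for illustration, the orchestrator is replaced by a hand-written rule, and `deep_analysis` here is a stub that calls no model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Tool = Callable[[dict], str]       # hypothetical: a tool updates state, returns an observation
History = List[Tuple[str, str]]    # (tool name, observation) pairs


@dataclass
class AgentRun:
    tools: Dict[str, Tool]
    choose_next: Callable[[dict, History], str]  # stand-in for the orchestrator model
    max_steps: int = 10
    history: History = field(default_factory=list)

    def run(self, state: dict) -> dict:
        # One multi-step run: pick a tool, apply it, record the observation.
        for _ in range(self.max_steps):
            name = self.choose_next(state, self.history)
            if name == "finish":
                break
            self.history.append((name, self.tools[name](state)))
        return state


def explore(state: dict) -> str:
    state["located"] = "add" in state["source"]
    return "found add()" if state["located"] else "nothing found"


def deep_analysis(state: dict) -> str:
    # Stub for a reasoning step; a real agent would query a model here.
    state["plan"] = "use + instead of - in add()"
    return state["plan"]


def edit(state: dict) -> str:
    state["source"] = state["source"].replace("a - b", "a + b")
    return "patched"


def run_tests(state: dict) -> str:
    namespace: dict = {}
    exec(state["source"], namespace)  # toy only; never exec untrusted code
    state["tests_pass"] = namespace["add"](2, 3) == 5
    return "pass" if state["tests_pass"] else "fail"


TOOLS: Dict[str, Tool] = {
    "explore": explore,
    "deep_analysis": deep_analysis,
    "edit": edit,
    "run_tests": run_tests,
}


def toy_policy(state: dict, history: History) -> str:
    # Hand-written rule standing in for the orchestrator's tool choice.
    if state["tests_pass"]:
        return "finish"
    if not state["located"]:
        return "explore"
    if state["plan"] is None:
        return "deep_analysis"
    if history and history[-1][0] == "edit":
        return "run_tests"
    return "edit"  # after a failed test run, try another edit


def make_state() -> dict:
    return {
        "source": "def add(a, b):\n    return a - b\n",  # deliberately buggy
        "located": False,
        "plan": None,
        "tests_pass": False,
    }


if __name__ == "__main__":
    agent = AgentRun(tools=TOOLS, choose_next=toy_policy)
    final = agent.run(make_state())
    print([name for name, _ in agent.history], final["tests_pass"])
```

The point of the sketch is only the control flow: tools are chosen step by step from the current state rather than from a fixed script, and the run ends once the tests pass or the step budget runs out.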

You can read the technical details of our SWE-bench approach here: https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-sour...

Your questions are welcome! You're also welcome to try Refact.ai Agent in VS Code and JetBrains: https://linktr.ee/refactai

Partner at Day One Ventures ranks intelligence by how Jewish someone is

https://twitter.com/amasad/status/2018175092648472913
1•sosomoxie•43s ago•0 comments

Former Windows 8 boss recruited Epstein to help negotiate his Microsoft exit

https://www.theverge.com/report/872073/steven-sinofsky-jeffrey-epstein-files-emails-microsoft-exi...
1•cebert•1m ago•0 comments

Modders Resurrect Half-Dead RTX 5070 Ti, Break Benchmark Record

https://www.pcmag.com/news/modders-resurrect-half-dead-rtx-5070-ti-break-benchmark-record?test_uu...
1•geox•4m ago•0 comments

Show HN: S3ui – Lightweight UI for Browsing Amazon S3

https://github.com/justinGrosvenor/s3ui
1•justingrosvenor•6m ago•0 comments

ReKindle

https://rekindle.ink/
1•todsacerdoti•6m ago•0 comments

Firefox Getting New Controls to Turn Off AI Features

https://www.macrumors.com/2026/02/02/firefox-ai-toggle/
7•stalfosknight•8m ago•0 comments

Glossopetrae: Procedural Xenolinguistic Engine for AI

https://github.com/elder-plinius/GLOSSOPETRAE
1•SeriousM•8m ago•1 comments

Show HN: TPA – A Zero-Trust Protocol for Sovereign Governance

1•CuriousP•9m ago•0 comments

EasyClaw – lightweight GUI installer for OpenClaw

https://easyclaw.com
1•svemyh•11m ago•2 comments

Does AI have human-level intelligence? (Nature Comment)

https://www.nature.com/articles/d41586-026-00285-6
1•dryarzeg•12m ago•0 comments

Ask HN: What weird or scrappy things did you do to get your first users?

4•preston-kwei•12m ago•0 comments

DuckDB Developer Meeting 1

https://duckdb.org/events/2026/01/30/duckdb-developer-meeting-1/
1•kermatt•14m ago•0 comments

The Local Weather

https://joeyh.name/blog/entry/the_local_weather/
1•pabs3•15m ago•0 comments

List of Fallacies

https://en.wikipedia.org/wiki/List_of_fallacies
1•basilikum•17m ago•1 comments

NNSA conducts aerial radiation surveys over San Francisco ahead of Super Bowl

https://www.energy.gov/nnsa/articles/nnsa-conduct-aerial-radiation-assessment-surveys-over-san-fr...
2•wilson090•18m ago•1 comments

Nvidia and Oracle are sending similar warning signs about the AI trade

https://www.morningstar.com/news/marketwatch/2026020291/nvidia-and-oracle-are-sending-similar-war...
1•zerosizedweasle•18m ago•0 comments

Show HN: Nono – Kernel-enforced sandboxing for AI agents

https://github.com/lukehinds/nono
1•decodebytes•19m ago•0 comments

How to Measure Social Media Marketing Performance

https://www.scoopanalytics.com/blog/how-to-measure-social-media-marketing-performance
1•nathansmithsco•21m ago•1 comments

Now anyone can tap Ring doorbells to search for lost dogs

https://www.theverge.com/tech/871916/search-party-non-ring-owners-neighbors-app
1•cdrnsf•22m ago•0 comments

French IT group Capgemini to sell US subsidiary linked to ICE after outcry

https://www.france24.com/en/france/20260201-french-it-group-capgemini-to-sell-us-subsidiary-linke...
2•ohjeez•22m ago•0 comments

Notepad++ hijacking blamed on Chinese Lotus Blossom

https://www.theregister.com/2026/02/02/notepad_hijacking_lotus_blossom/
1•maguszin•23m ago•0 comments

Grounding LLMs with Recursive Code Execution

https://yogthos.net/posts/2026-01-12-recursive-language-model.html
1•PaulHoule•24m ago•0 comments

Show HN: I'm writing a scalable alternative to gource with diff animation modes

https://github.com/navid-m/rush
1•death_eternal•25m ago•0 comments

Step 3.5 Flash LLM model, agentic coding ~18x faster than GLM 4.7 / Kimi K2.5

https://huggingface.co/stepfun-ai/Step-3.5-Flash
1•skhameneh•25m ago•1 comments

Newsgrouper

https://newsgrouper.org/tops
2•DyslexicAtheist•25m ago•0 comments

How to Measure Content Performance

https://www.scoopanalytics.com/blog/how-to-measure-content-performance
1•andrewsimone•25m ago•1 comments

Safeclaw the safe OpenClaw alternative with no language model and no APIs

https://github.com/princezuda/safeclaw
2•safeclaw•26m ago•1 comments

Open Sourced US Tax Calculation Library

https://github.com/tax-logic-core/tax-logic-core
3•tax-logic•27m ago•1 comments

Why Track Business Metrics

https://www.scoopanalytics.com/blog/why-track-business-metrics
1•emilyrhodes•28m ago•0 comments

Show HN: aTerm – a terminal workspace built for AI coding workflows

https://github.com/saadnvd1/aTerm
1•saadn92•31m ago•0 comments