Open-Source Refact.ai Agent is #1 on SWE-bench Lite With a 59.7% Score

https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-source-refact-ai/
3•kate_at_refact•8mo ago

Comments

kate_at_refact•8mo ago
Open-source Refact.ai achieves #1 on SWE-bench Lite with a 59.7% score. Our approach: fully autonomous Agent, no manual intervention needed.

How we did this:

• Prompt strategy: https://github.com/smallcloudai/refact/blob/swe-boosted-prom...
• Claude 3.7 Sonnet as orchestrator
• deep_analysis() tool (powered by o4-mini) for reasoning
• Tool suite for repository exploration, code modification, and testing, used dynamically based on task needs
• One correct solution through iteration!

Autonomy = our core strength.

Refact.ai Agent completes the entire dev workflow independently: plans, executes, tests, self-corrects, and delivers a production-ready result. For each task, it made one multi-step run to generate a single correct solution, creating custom strategies rather than following rigid scripts.
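To make the workflow above a bit more concrete, here is a rough sketch of an orchestrator loop that picks tools dynamically and iterates until it commits to a single solution. This is not the Refact.ai implementation: the tool names other than deep_analysis, the `run_agent` function, and the scripted policy that stands in for the orchestrator model are all assumptions made up for illustration, and no real model or repository is involved.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A tool takes a text argument and returns a text observation.
Tool = Callable[[str], str]


@dataclass
class AgentRun:
    """State of one multi-step run (hypothetical structure)."""
    task: str
    history: List[Tuple[str, str, str]] = field(default_factory=list)  # (tool, arg, result)
    patch: str = ""


def run_agent(
    task: str,
    tools: Dict[str, Tool],
    choose_action: Callable[[AgentRun], Tuple[str, str]],
    max_steps: int = 10,
) -> AgentRun:
    """Loop: ask the orchestrator for the next action, run the tool, record the result."""
    run = AgentRun(task)
    for _ in range(max_steps):
        name, arg = choose_action(run)
        if name == "finish":
            # The orchestrator commits to a single final solution.
            run.patch = arg
            break
        tool = tools.get(name)
        result = tool(arg) if tool else f"unknown tool: {name}"
        run.history.append((name, arg, result))
    return run


# Stub tools standing in for repository exploration, reasoning, editing, and testing.
tools: Dict[str, Tool] = {
    "search_repo": lambda q: f"files matching '{q}'",
    "deep_analysis": lambda q: f"analysis of '{q}'",
    "edit_file": lambda change: f"applied '{change}'",
    "run_tests": lambda _: "all tests passed",
}


def scripted_policy(run: AgentRun) -> Tuple[str, str]:
    """Stand-in for the orchestrator model: a fixed plan instead of an LLM call."""
    plan = [
        ("search_repo", "failing test"),
        ("deep_analysis", "root cause of the failure"),
        ("edit_file", "fix off-by-one"),
        ("run_tests", ""),
    ]
    if len(run.history) < len(plan):
        return plan[len(run.history)]
    return ("finish", "patch: fix off-by-one")


run = run_agent("fix the failing test", tools, scripted_policy)
print([step[0] for step in run.history])
print(run.patch)
```

In a real agent the policy would be a model call that reads the history and decides what to do next, which is where planning and self-correction would happen; the loop itself stays this simple.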

You can read the technical details of our SWE-bench approach here: https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-sour...

Your questions are welcome! You're also welcome to try Refact.ai Agent in VS Code and JetBrains: https://linktr.ee/refactai

I Stopped Using Nbdev

https://hamel.dev/blog/posts/ai-stack/
1•enzojean•18s ago•0 comments

Norm-Preserving Biprojected Abliteration

https://huggingface.co/blog/grimjim/norm-preserving-biprojected-abliteration
1•lobo_tuerto•2m ago•0 comments

Why everyone suddenly wants Greenland (2024) [video]

https://youtu.be/sxRdKRORYoA
1•teleforce•2m ago•0 comments

Show HN: Shrp – Free AI writing tools, no signup required

https://shrp.app
1•digi_wares•3m ago•0 comments

Run Maestro Mobile App Tests on Physical iOS Devices

https://testingbot.com/blog/maestro-physical-device-testing
1•defied•3m ago•0 comments

Cosmos is an operating system development kit

https://www.gocosmos.org/
1•doener•4m ago•0 comments

Why Observability Requires a Distributed Column Store (2021)

https://www.honeycomb.io/blog/why-observability-requires-distributed-column-store
1•tosh•4m ago•0 comments

Show HN: Afelyon – AI agent that turns Jira tickets into GitHub PRs

https://afelyon.com
1•AbduNebu•5m ago•0 comments

Letter from a Birmingham Jail [King, Jr.] (1963)

https://www.africa.upenn.edu/Articles_Gen/Letter_Birmingham.html
10•hn_acker•6m ago•0 comments

Show HN: I built a tool to make 15-minute AI videos with character consistency

https://longstories.ai
2•rboriol•6m ago•0 comments

Show HN: Researching politics with Claude Code and 55 years of UN speeches

https://un.koenvangilst.nl/research
1•vnglst•6m ago•0 comments

Giving University Exams in the Age of Chatbots

https://ploum.net/2026-01-19-exam-with-chatbots.html
2•zdw•7m ago•0 comments

Moldable – Claude Cowork for the rest of us, local apps, private

https://moldable.sh
1•iwasrobbed•8m ago•0 comments

RAM Coffers – I built conditional memory for LLMs 27 days before DeepSeek's Engram

https://github.com/Scottcjn/ram-coffers
1•AutoJanitor•8m ago•1 comment

Problem Hunt – Product hunt but for problems

https://problemhunt.xyz/
1•RJagiasi•9m ago•0 comments

Valentino, 'The Last Emperor' of High Fashion, Dies at 93

https://www.wsj.com/style/fashion/valentino-the-last-emperor-of-high-fashion-dies-at-93-6335ee24
2•fortran77•9m ago•2 comments

Lichens Are Wild

https://youtu.be/Tc0nILyks-U?si=BczdHxOonGGOX41E
3•brudgers•10m ago•0 comments

A grounded take on agentic coding for production environments

https://iximiuz.com/en/posts/grounded-take-on-agentic-coding/
1•iforapsy•10m ago•0 comments

NetAlertx – network visibility and continuous asset discovery

https://netalertx.com
2•kristianpaul•12m ago•0 comments

ChatGPT breaks if you ask it about a Spanish verb tense

https://chatgpt.com/share/696e8126-8678-8003-9688-68582af65113
2•seagram•13m ago•0 comments

Ask HN: What should I do with my old laptop in 2026?

2•nanfinitum•13m ago•2 comments

Tesla to restart work on Dojo Supercomputer

https://www.engadget.com/ai/musk-claims-tesla-will-restart-work-on-its-dojo-supercomputer-1731278...
3•nish__•13m ago•0 comments

A Startup Failure That Looked Fine Until It Didn't

https://substack.com/home/post/p-185097529
1•josh_carterPDX•13m ago•0 comments

Chris Messina: Code as Commodity

https://tessl.io/blog/code-as-commodity/
1•nadis•14m ago•0 comments

Musk shocks with $10M donation in Ky. Senate race

https://www.axios.com/2026/01/19/elon-musk-10-million-campaign-donation-kentucky
3•srameshc•16m ago•0 comments

Show HN: Movieagent.io – An agent for movie recommendations (with couple mode)

https://movieagent.io
3•roknovosel•16m ago•0 comments

Signal-Based Adaptive Orchestration: When to Use One AI vs. Many

https://www.blundergoat.com/articles/sbao-5-week-to-5-hours
1•blundergoat•17m ago•1 comment

Show HN: ColmapView is a browser-based COLMAP viewer

https://colmapview.github.io/
1•yxl448•17m ago•0 comments

Macron to urge EU to use trade 'bazooka' in response to Trump's tariffs

https://www.politico.eu/article/macron-to-urge-eu-to-use-trade-bazooka-in-response-to-trumps-tari...
3•tosti•18m ago•2 comments

The Unpredicted vs. the Over-Expected

https://kevinkelly.substack.com/p/the-unpredicted-vs-the-over-expected
1•thm•22m ago•0 comments