Open-Source Refact.ai Agent is #1 on SWE-bench Lite With a 59.7% Score

https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-source-refact-ai/
3•kate_at_refact•1y ago

Comments

kate_at_refact•1y ago
Open-source Refact.ai achieves #1 on SWE-bench Lite with a 59.7% score. Our approach: fully autonomous Agent, no manual intervention needed.

How we did this:

• Prompt strategy: https://github.com/smallcloudai/refact/blob/swe-boosted-prom...
• Claude 3.7 Sonnet as orchestrator
• deep_analysis() tool (powered by o4-mini) for reasoning
• Tool suite for repository exploration, code modification, and testing. Used dynamically based on task needs
• One correct solution through iteration!

Autonomy = our core strength.

Refact.ai Agent completes the entire dev workflow independently: plans, executes, tests, self-corrects, and delivers a production-ready result. For each task, it made one multi-step run to generate a single correct solution, creating custom strategies rather than following rigid scripts.
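
To make the plan, execute, test, self-correct loop concrete, here is a minimal Python sketch. It is not Refact.ai's implementation: `AgentRun`, `choose_action`, and the toy `explore`/`edit`/`test` tools are hypothetical names invented for illustration, and a scripted policy stands in for the orchestrator model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical types; none of these names come from the Refact.ai codebase.
Action = Tuple[str, str]  # (tool name, argument)


@dataclass
class AgentRun:
    tools: Dict[str, Callable[[str], str]]
    choose_action: Callable[[List[str]], Action]  # stand-in for the orchestrator model
    max_steps: int = 10
    history: List[str] = field(default_factory=list)

    def run(self, task: str) -> bool:
        """Single multi-step run: pick a tool, observe the result, repeat."""
        self.history.append(f"task: {task}")
        for _ in range(self.max_steps):
            name, arg = self.choose_action(self.history)
            if name == "finish":
                return True
            result = self.tools[name](arg)
            self.history.append(f"{name}({arg}) -> {result}")
        return False  # step budget exhausted without a passing solution


def demo() -> Tuple[bool, List[str]]:
    repo = {"fixed": False}  # toy stand-in for a repository with one bug

    def explore(path: str) -> str:
        return "found suspect function"

    def edit(path: str) -> str:
        repo["fixed"] = True
        return "patch applied"

    def test(target: str) -> str:
        return "pass" if repo["fixed"] else "fail"

    def policy(history: List[str]) -> Action:
        # Scripted decisions based only on the latest observation.
        last = history[-1]
        if last.startswith("task:"):
            return ("test", "all")      # reproduce the failure first
        if last.endswith("-> fail"):
            return ("explore", "src")   # self-correct: go look for the cause
        if last.startswith("explore"):
            return ("edit", "src")
        if last.startswith("edit"):
            return ("test", "all")      # verify the patch
        return ("finish", "")           # tests passed

    agent = AgentRun(
        tools={"explore": explore, "edit": edit, "test": test},
        choose_action=policy,
    )
    ok = agent.run("fix failing test")
    return ok, agent.history


if __name__ == "__main__":
    ok, history = demo()
    print(ok)
    print("\n".join(history))
```

In a real agent the policy would be a model call that sees the full history, and the tools would act on an actual repository; the sketch only shows the control flow of choosing tools dynamically until the tests pass.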

You can read the technical details of our SWE-bench approach here: https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-sour...

Questions are welcome! You can also try Refact.ai Agent in VS Code and JetBrains: https://linktr.ee/refactai
