Show HN: Loki Mode hit 99.67% SWE-Bench – MAF built a SaaS overnight

https://github.com/asklokesh/claudeskill-loki-mode
2•slogansand•1mo ago
Last month I shared Loki Mode here. Since then, benchmarks came back.

SWE-Bench: 99.67% (299/300 problems)
HumanEval: 98.78% Pass@1 (162/164)
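The percentages follow directly from the pass counts quoted above. A minimal Python check, where the dictionary layout and the `pass_rate` helper are illustrative names and not part of Loki Mode:

```python
# Recompute the reported pass rates from the raw counts quoted in the post.
results = {
    "SWE-Bench": (299, 300),   # (problems passed, problems attempted)
    "HumanEval": (162, 164),
}


def pass_rate(passed: int, total: int) -> float:
    """Return the pass rate as a percentage rounded to two decimals."""
    return round(100 * passed / total, 2)


rates = {name: pass_rate(passed, total) for name, (passed, total) in results.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.2f}%")
```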

For context, most single-agent systems hit 30-50%. The best proprietary ones hover around 70-80%.

The difference is architecture. 37 specialized agent types: 34 spread across 6 swarms (engineering, ops, business, data, product, growth), plus 3 dedicated reviewers. Parallel 3-reviewer code review. Feedback loops that actually learn.

To stress test it, I pointed it at a blank folder and said "build a ServiceNow replacement." It ran for 19 hours and built FireLater - complete ticket management, workflows, CMDB, knowledge base, self-service portal. I wrote zero lines of code.

New in this version:
- Kanban board to visualize agent actions in real-time
- Perpetual improvement via self-healing feedback loops
- Smarter swarm coordination

Still open source. MIT license. Still not selling anything.

Loki Mode: https://github.com/asklokesh/claudeskill-loki-mode
FireLater (built by Loki Mode): https://github.com/asklokesh/FireLater

Happy to answer questions about the architecture or benchmarks.

Comments

slogansand•1mo ago
Author here. Quick context on the benchmarks:

We used the RARV (Retrieve, Analyze, Reason, Validate) pattern with multi-agent collaboration. Each problem is worked by specialized agents, reviewed by 3 parallel reviewers (code, business logic, security), and merged only after consensus.
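The review gate described above can be sketched roughly as follows. This is not Loki Mode's code: the three reviewer functions are hypothetical placeholders with toy rules, and "consensus" is assumed here to mean that all three reviewers approve.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Hypothetical reviewers: each takes a patch and returns True to approve.
Reviewer = Callable[[str], bool]


def review_code(patch: str) -> bool:
    # Toy rule: reject patches that still contain TODO markers.
    return "TODO" not in patch


def review_business_logic(patch: str) -> bool:
    # Toy rule: reject empty patches.
    return bool(patch.strip())


def review_security(patch: str) -> bool:
    # Toy rule: reject patches that call eval().
    return "eval(" not in patch


REVIEWERS: Dict[str, Reviewer] = {
    "code": review_code,
    "business_logic": review_business_logic,
    "security": review_security,
}


def run_reviews(patch: str) -> Dict[str, bool]:
    """Run all reviewers in parallel and collect one verdict per reviewer."""
    with ThreadPoolExecutor(max_workers=len(REVIEWERS)) as pool:
        futures = {name: pool.submit(fn, patch) for name, fn in REVIEWERS.items()}
        return {name: future.result() for name, future in futures.items()}


def can_merge(patch: str) -> bool:
    """Assumed consensus rule: merge only if every reviewer approves."""
    return all(run_reviews(patch).values())
```

A unanimous rule is the strictest reading of "consensus"; a majority vote would only change the last line of `can_merge`.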

The 99.67% isn't cherry-picked. It's a full run against the standard SWE-Bench dataset. Happy to share methodology if anyone wants to reproduce.

slogansand•1mo ago
On the swarm architecture for those curious:

Engineering (8 types): frontend, backend, database, mobile, API, QA, perf, infra
Operations (8 types): devops, SRE, security, monitoring, incident, release, cost, compliance
Business (8 types): marketing, sales, finance, legal, support, HR, investor, partnerships
Data (3 types): ML, data eng, analytics
Product (3 types): PM, design, tech writer
Growth (4 types): growth hacker, community, success, lifecycle
Review (3 types): code, business, security

Agents don't step on each other. Frontend agent never thinks about database schemas. QA agent never writes deployment scripts. Domain isolation is key.
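One way to picture the roster and the isolation rule is a registry keyed by (swarm, role), where an agent refuses any task addressed to someone else. The sketch below copies the role names from the list above; the `Task`, `Agent`, and `dispatch` names are hypothetical and say nothing about how Loki Mode is actually implemented.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Roster copied from the list above; the dict layout itself is an assumption.
SWARMS: Dict[str, List[str]] = {
    "engineering": ["frontend", "backend", "database", "mobile", "API", "QA", "perf", "infra"],
    "operations": ["devops", "SRE", "security", "monitoring", "incident", "release", "cost", "compliance"],
    "business": ["marketing", "sales", "finance", "legal", "support", "HR", "investor", "partnerships"],
    "data": ["ML", "data eng", "analytics"],
    "product": ["PM", "design", "tech writer"],
    "growth": ["growth hacker", "community", "success", "lifecycle"],
    "review": ["code", "business", "security"],
}


@dataclass(frozen=True)
class Task:
    swarm: str
    role: str
    description: str


@dataclass(frozen=True)
class Agent:
    swarm: str
    role: str

    def handle(self, task: Task) -> str:
        # Domain isolation: refuse anything addressed to another agent.
        if (task.swarm, task.role) != (self.swarm, self.role):
            raise PermissionError(
                f"{self.swarm}/{self.role} cannot take {task.swarm}/{task.role} work"
            )
        return f"[{self.swarm}/{self.role}] {task.description}"


# Keyed by (swarm, role) because "security" appears in two groups.
AGENTS: Dict[Tuple[str, str], Agent] = {
    (swarm, role): Agent(swarm, role)
    for swarm, roles in SWARMS.items()
    for role in roles
}


def dispatch(task: Task) -> str:
    """Route a task to the single agent that owns its domain."""
    return AGENTS[(task.swarm, task.role)].handle(task)
```

The registry holds 37 entries, matching the total in the post, and handing a database task directly to the frontend agent raises `PermissionError` instead of being silently accepted.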

slogansand•1mo ago
For the skeptics (fair): FireLater repo has full git history. You can see the commits. No human intervention in the implementation phase.

I reviewed outputs and approved deployments. But architecture decisions, code, tests, docs - all Loki Mode.

It's not perfect. Some rough edges. But it works and enterprises can self-host it today.

slogansand•1mo ago
vs single-agent coding assistants: They tap out around 50% on SWE-Bench. No specialization. No parallel review. No self-healing.

vs other multi-agent frameworks: Most focus on chat or simple task delegation. Loki Mode runs full SDLC - from PRD to deployed product with monitoring and business ops.

vs hiring a team: Obviously humans are better for ambiguous problems. But for well-defined PRDs, this removes the "I'll get to it this weekend" bottleneck.

slogansand•1mo ago
Last time someone raised concerns about web crawling for competitive research. Valid point.

New version has configurable research modes. You can disable external crawling entirely and run fully offline if needed. Feedback heard.
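A configurable research mode could look something like the sketch below. Every name in it (`ResearchMode`, `ResearchConfig`, `research`) is hypothetical; the real setting names live in the Loki Mode repo and are not shown in this thread. Offline is assumed as the default purely for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class ResearchMode(Enum):
    ONLINE = "online"    # external crawling permitted
    OFFLINE = "offline"  # no network access, local material only


@dataclass(frozen=True)
class ResearchConfig:
    # Assumption: default to offline so crawling is strictly opt-in.
    mode: ResearchMode = ResearchMode.OFFLINE


def fetch_allowed(config: ResearchConfig) -> bool:
    """External requests are permitted only in ONLINE mode."""
    return config.mode is ResearchMode.ONLINE


def research(topic: str, config: ResearchConfig, local_notes: Dict[str, str]) -> str:
    """Answer from local notes when offline; crawling is left out of this sketch."""
    if fetch_allowed(config):
        raise NotImplementedError("external crawling is out of scope for this sketch")
    return local_notes.get(topic, "")
```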