
Backlog.md – Markdown‑native Task Manager and Kanban visualizer for any Git repo

https://github.com/MrLesk/Backlog.md
254•mrlesk•7mo ago

Comments

mrlesk•7mo ago
I threw Claude Code at an existing codebase a few months back and quickly quit— untangling its output was slower than writing from scratch. The fix turned out to be process, not model horsepower.

Iteration timeline

==================

• 50 % task success - added README.md + CLAUDE.md so the model knew the project.

• 75 % - wrote one markdown file per task; Codex plans, Claude codes.

• 95 %+ - built Backlog.md, a CLI that turns a high-level spec into those task files automatically (yes, using Claude/Codex to build the tool).

Three-step loop that works for me:

1. Generate tasks - Codex / Claude Opus → self-review.

2. Generate plan - same agent, “plan” mode → tweak if needed.

3. Implement - Claude Sonnet / Codex → review & merge.

For simple features I can even run this from my phone: ChatGPT app (Codex) → GitHub app → ChatGPT app → GitHub merge.

Repo: https://github.com/MrLesk/Backlog.md

Would love feedback and happy to answer questions!

mitjam•7mo ago
Really love this.

Would love to see an actual end to end example video of you creating, planning, and implementing a task using your preferred models and apps.

mrlesk•7mo ago
Will definitely do. I am also planning to run a benchmark with various models to see which one is more effective at building a full product starting from a PRD and using backlog for managing tasks
bazooka5798•7mo ago
I'd love to see OpenRouter connectivity to try non-Claude models for some of the planning parts of the cycle.
westurner•7mo ago
Is there an established benchmark for building a full product?

- SWE-bench leaderboard: https://www.swebench.com/

- Which metrics for e.g. "SWE-Lancer: a benchmark of freelance software engineering tasks from Upwork"? https://news.ycombinator.com/item?id=43101314

- MetaGPT, MGX: https://github.com/FoundationAgents/MetaGPT :

> Software Company as Multi-Agent System

> MetaGPT takes a one line requirement as input and outputs user stories / competitive analysis / requirements / data structures / APIs / documents, etc. Internally, MetaGPT includes product managers / architects / project managers / engineers. It provides the entire process of a software company along with carefully orchestrated SOPs.

- Mutation-Guided LLM-based Test Generation: https://news.ycombinator.com/item?id=42953885

- https://news.ycombinator.com/item?id=41333249 :

- codefuse-ai/Awesome-Code-LLM > Analysis of AI-Generated Code, Benchmarks: https://github.com/codefuse-ai/Awesome-Code-LLM :

> 8.2 Benchmarks: Integrated Benchmarks, Evaluation Metrics, Program Synthesis, Visually Grounded Program, Synthesis, Code Reasoning and QA, Text-to-SQL, Code Translation, Program Repair, Code Summarization, Defect/Vulnerability Detection, Code Retrieval, Type Inference, Commit Message Generation, Repo-Level Coding

- underlines/awesome-ml/tools.md > Benchmarking: https://github.com/underlines/awesome-ml/blob/master/llm-too...

- formal methods workflows, coverage-guided fuzzing: https://news.ycombinator.com/item?id=40884466

- "Large Language Models Based Fuzzing Techniques: A Survey" (2024) https://arxiv.org/abs/2402.00350

Leave_OAI_Alone•7mo ago
You have compiled an interesting list of benchmarks and adjacent research. The implicit question is whether an established benchmark for building a full product exists.

After reviewing all this, what is your actual conclusion, or are you asking? Is the takeaway that a comprehensive benchmark exists and we should be using it, or is the takeaway that the problem space is too multifaceted for any single benchmark to be meaningful?

westurner•7mo ago
The market - actual customers - is probably the best benchmark for a product.

But then outstanding liabilities due to code quality and technical debt aren't costed in by the market.

There are already code quality metrics.

SAST and DAST tools can score or fix code, as part of a LLM-driven development loop.

Formal verification is maybe the best code quality metric.

Is there more than Product-Market fit and infosec liabilities?

unshavedyak•7mo ago
Would love more detail on your integration with Claude. Are you telling Claude to use backlog to plan X task? Feels like some MCP integration or something might make it feel more native?

Though I've not had much luck in getting Claude to natively use MCPs, so maybe that's off base heh.

mrlesk•7mo ago
No mcp, just custom instructions.

When you initialize backlog in a folder it asks you if you want to set up agent’s instructions like CLAUDE.md. It is important to say yes here so that Claude knows how to use Backlog.md.

Afterwards you can just write something like: Claude please have a look at the @prd.md file and use ultrathink to create relevant tasks to implement it. Make sure you correctly identify dependencies between tasks and use sub tasks when necessary.

Or you can just paste your feature request directly without using extra files.

Feels a bit like magic

jwpapi•7mo ago
How can I change from Gemini to Claude?

Also, I’m not fully sure about your setup. From my fresh point of view, the next step would be to set up agents that check my GitHub repo for backlog tasks and open pull requests for them. If I write a good description, and ideally tests, I can improve the results.

That way agents could check your backlog and prepare the work.

I work with aider every day and I’m quite fast at completing tasks; the next limitation is latency and some back and forth. I have some dead time in between. I can definitely define tasks faster than I can work through them one-on-one with an AI.

Yeah, if you could share a bit more about how you do this with Claude we would all be thankful. Also, I haven’t seen anywhere to sponsor/tip you; would love to!

thelittleone•7mo ago
I've had the same experience. Taskmaster-ai was pretty good, but sometimes the agent ignored it as the project grew larger (can probably prevent that now using Claude Code hooks).

Trying this project today looks nice. I see you have sub-tasks. Any thoughts on a 'dependency' relation? I.e., don't do X if it is dependent on task A which is not complete.

FYI, there is a 404 in AGENTS.md, GEMINI.md, etc.: they point to a nonexistent README.md.

mrlesk•7mo ago
Yep. Dependencies are supported via the --dep parameter.

Will check the 404 issues. Thanks for reporting it

jwpapi•7mo ago
Hey man amazing work! You’re a legend
beef_rendang•7mo ago
>ChatGPT app (Codex) → GitHub app → ChatGPT app → GitHub merge

I look forward to a future where we are reduced to rubberstamping fully-agentic-generated code on our glass slates for $0.01 eurodollars a PR.

knownhoot•6mo ago
why codex for planning?
bearjaws•7mo ago
Like the idea of a self hosted kanban in git, one item you should do in your repo is add the installation instructions to the readme :)

I see it's a TS app so I am sure the bun bundle is the install, but always good to include in your 5 min intro.

mrlesk•7mo ago
You’re absolutely right

Joking aside, there is an npm/bun install -g backlog.md at the top, but I can add an extra one in the 5 min intro.

I am using Bun’s new fullstack single file builds. I’m really impressed by how easy it was to set up everything.

rumblefrog•7mo ago
Is there an alternative that integrates with a Jira instance?

Many of my tasks already exists in forms of a Jira ticket, would be interesting to prompt it to take over a specific ticket & update its ticket progress as well.

mrlesk•7mo ago
For that kind of task I would go with Taskmaster AI. It has MCP integration and could probably connect with Jira.

Backlog is more for smaller projects where you wouldn’t normally have a project management tool

adobbs•7mo ago
Brilliant! Thank you for sharing.

Had similar success with making some more markdown files to help guide the agent but never would have thought of something this useful.

Will try your workflow and backlog on a build this week.

mrlesk•7mo ago
Backlog.md is still a bit rough around the edges but is definitely proof that markdown files and AI agents work really well together. I started working on it exactly a month ago.
QRY•7mo ago
Ooh, definitely trying this out! I ended up homebrewing a whole context maintenance ritual, but that was a pain to get an AI agent to consistently apply, so it spun out into building a whole project management... thing.

This looks much more thought out, thanks for sharing!

crashabr•7mo ago
Really nice! Have you thought of interfacing with the todo.txt ecosystem?
mrlesk•7mo ago
Didn’t know this tool. Thanks for sharing
TimMeade•7mo ago
That looks fascinating. I will certainly be testing it in the morning! Thanks!
mrlesk•7mo ago
Thanks. Let me know if you have any feedback
ttoinou•7mo ago
Seems like a great idea. How would that work with multiple branches? One task might be implemented in a different branch, and we might want a global overview in the main branch of all the tasks being coded.

  All data is saved under backlog folder as human‑readable Markdown with the following format task-<task-id> - <task-title>.md (e.g. task-12 - Fix typo.md).

If every "task" is one .md file: I believe AI agents have trouble editing big files. They can't easily append text to a big file because of the context window, so we have to force a workaround that launches a command line to append text instead of editing the file. This means the tasks have to remain small, or we have to avoid putting too much information in each task.
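
A minimal sketch of reading that filename format back into an id and a title, in case it helps anyone scripting against the backlog folder. It only assumes the `task-<task-id> - <task-title>.md` pattern quoted above; the dotted sub-task ids (e.g. `3.1`) and both helper names are my own assumptions, not something taken from the tool.

```python
import re
from typing import NamedTuple, Optional

# Pattern for "task-<task-id> - <task-title>.md", as quoted above.
# Assumption: ids are digits, optionally dotted for sub-tasks (e.g. "3.1").
TASK_FILE_RE = re.compile(r"^task-(?P<id>\d+(?:\.\d+)*) - (?P<title>.+)\.md$")


class TaskFile(NamedTuple):
    task_id: str
    title: str


def parse_task_filename(name: str) -> Optional[TaskFile]:
    """Return the id and title encoded in a task filename, or None if it does not match."""
    match = TASK_FILE_RE.match(name)
    if match is None:
        return None
    return TaskFile(match.group("id"), match.group("title"))


def task_filename(task_id: str, title: str) -> str:
    """Build a filename in the same format (hypothetical helper)."""
    return f"task-{task_id} - {title}.md"
```

With this, `parse_task_filename("task-12 - Fix typo.md")` gives `TaskFile(task_id="12", title="Fix typo")`, and anything that does not follow the pattern gives `None`.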
mrlesk•7mo ago
1) How will it work with multiple branches? Simple: using git :) Git lets you fetch specific files from other branches, including remote ones, without checking out those branches.

The state is always up to date no matter if you are running backlog.md from main branch or a feature branch.

It works well when there are not many branches but I need to check if I can improve the performance when there are lots of branches.
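
For anyone curious what "fetching files from other branches without checking them out" can look like, here is a rough sketch using two standard git commands, `git show <ref>:<path>` and `git ls-tree`. This is an illustration only: I am not claiming it is how Backlog.md does it internally, and the function names and the example path are hypothetical.

```python
import subprocess
from typing import List


def git_show_args(ref: str, path: str) -> List[str]:
    """Build `git show <ref>:<path>`, which prints a file as it exists on <ref>."""
    return ["git", "show", f"{ref}:{path}"]


def git_list_args(ref: str, directory: str) -> List[str]:
    """Build a `git ls-tree` command listing file names under a directory on <ref>."""
    return ["git", "ls-tree", "-r", "--name-only", ref, directory]


def read_file_from_ref(ref: str, path: str, repo_dir: str = ".") -> str:
    """Read one file from another branch without touching the working tree."""
    result = subprocess.run(
        git_show_args(ref, path),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the ref or path does not exist
    )
    return result.stdout
```

Calling `read_file_from_ref("origin/feature", "backlog/task-12 - Fix typo.md")` inside a repository would return that file's contents from the remote-tracking branch, provided the branch has been fetched. One process is spawned per file, which fits the remark above that performance may need attention when there are many branches.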

ttoinou•7mo ago
Nice, so there could be some kind of git kung fu command line to help with that. Maybe we could also have a separate folder using git worktree to post all the information in one branch. That'd duplicate all files though.

Another idea is to use git notes

mrlesk•7mo ago
2) AI Agents have issues editing larger files.

Correct. One of the instructions that ships with backlog.md is to make the tasks “as big as they would fit in a pr”. I know this is very subjective but Claude really gets much better because of this.

https://github.com/MrLesk/Backlog.md/blob/main/src/guideline...

You will notice yourself that smaller atomic tasks are the only way for the moment to achieve a high success rate.

tptacek•7mo ago
This is a good idea. But the screenshots you have show lots of tasks in a project; how are you dispatching tasks (once planned) to an agent, and how are agents navigating the large amount of markdown task content you're producing without blowing out their context budget?
mrlesk•7mo ago
For task dispatch I just ask Claude: please take over task 123.

Because of the embedded custom instructions Claude knows exactly how to proceed.

Since I never create too big tasks, what blows most context are actually the docs and the decisions markdown files.

jedimastert•7mo ago
Can we change the title to include that this is a tool for AI? I thought it was just gonna be a visualizer.

The tagline from the repo seems fine: "A tool for managing project collaboration between humans and AI Agents in a git ecosystem"

d1sxeyes•7mo ago
You can use this perfectly fine without AI agents; it just so happens to produce output which is easily ingestible by LLMs. It also has drag and drop visualisation and simple syntax for creating and tracking tasks in your codebase.
dayvough•7mo ago
All I'm wondering is how did you secure the .md TLD?
mrlesk•7mo ago
Hehe. It was a veerry lucky situation.

I sent a message to someone telling them that I was working on backlog.md and it turned the name into a link automatically.

I wanted to remove the link, clicked on it accidentally, and discovered that not only was there nothing on that domain, it was not even registered yet. I got the domain a few mins later :)

jasir•7mo ago
It's the ccTLD for Moldova. It seems to have limited registrar availability[0], but there's no residency/association restriction like some countries impose, so anyone can get one. I've seen Markdown-related projects use it here and there, like obsidian.md.

[0] https://tld-list.com/tld/md

dist-epoch•7mo ago
https://obsidian.md
kurtis_reed•7mo ago
Part of this confusing trend of naming projects like files
mrlesk•7mo ago
It is so confusing that I had to add custom instructions for Codex telling him that Backlog.md is the project folder and not a file. He was wasting a few mins trying to cat Backlog.md instead of cd.
danpalmer•7mo ago
I built myself a tool that does something quite similar. It's a single no-dependency Python script that parses "tasks.md" in the root of the repo which contains a Markdown table of tasks, then has basic support for BLOCKED/READY/DONE/CANCELLED, dependencies, titles, tags, etc.

For a project that is just for me, it's exactly what I need – dependency tracking and not much more, stored offline with the code. Almost all of the code for it was written by Gemini.
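
A rough sketch of the idea, for illustration: parse a Markdown table of tasks and report an open task as BLOCKED when any of its dependencies is not DONE. The column names (`id`, `title`, `status`, `deps`), the sample rows, and the function names are all made up here and are not the actual script or its table layout.

```python
from typing import Dict, List

# Hypothetical tasks.md contents; the columns are assumptions for this sketch.
TASKS_MD = """\
| id | title       | status    | deps |
|----|-------------|-----------|------|
| 1  | Set up repo | DONE      |      |
| 2  | Add parser  | READY     | 1    |
| 3  | Add CLI     | READY     | 2    |
| 4  | Old idea    | CANCELLED |      |
"""


def parse_table(text: str) -> List[Dict[str, str]]:
    """Parse a pipe-delimited Markdown table into one dict per data row."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.splitlines()
        if line.strip().startswith("|")
    ]
    header, body = rows[0], rows[2:]  # rows[1] is the |---| separator line
    return [dict(zip(header, row)) for row in body]


def effective_status(tasks: List[Dict[str, str]]) -> Dict[str, str]:
    """Map task id to status, reporting BLOCKED for open tasks with unfinished deps."""
    by_id = {task["id"]: task for task in tasks}
    result: Dict[str, str] = {}
    for task in tasks:
        status = task["status"]
        deps = [d.strip() for d in task["deps"].split(",") if d.strip()]
        # An unknown dependency id counts as unfinished, so it also blocks.
        unfinished = any(by_id.get(d, {}).get("status") != "DONE" for d in deps)
        if status not in ("DONE", "CANCELLED") and unfinished:
            status = "BLOCKED"
        result[task["id"]] = status
    return result
```

On the sample table, task 2 stays READY because task 1 is DONE, while task 3 is reported as BLOCKED until task 2 is finished.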

mrlesk•7mo ago
Yep. I’m happy to hear that more and more people are converging towards a very similar process as this ends up being the most productive.
jwpapi•7mo ago
With aider you can run a second instance with --watch-files; if you put // #AI in your tasks it will be added to the chat, and with // !AI the AI will then respond

so you can do

`backlog task create "Feature" --plan "1. Research\n2. Suggest Implementation// #AI AI!"` (yes weird order with the !)

and in the background aider will propose solutions.

I’m not sure how this compares to Claude Code or Codex, but it’s LLM-flexible. Downside is it doesn’t create a pull request. So it’s more helpful for local code.

I would probably add some Readme.md files to the --watch-files session, and I think you need to click [D]on’t ask again once so it won’t keep asking you to add files.

oc1•7mo ago
Ah, this is what the future after Jira looks like.
totaa•7mo ago
Would love to be able to use this UI, connected to an external source like Linear.
cloudking•7mo ago
This is a neat implementation, personally I use https://www.task-master.dev
urlwolf•7mo ago
How is this different from taskwarrior? I feel the use cases overlap (which is a good thing, as the taskwarrior rewrite in Rust is a mess).
mrlesk•7mo ago
Never heard of taskwarrior. I will check it out. Thanks for sharing
nzach•7mo ago
Is there a proper way to use this project without committing files to git? I just want to try it out in a project I'm working on, but don't want to put it in the history.

What I did is to add the backlog folder into the .gitignore file, but after every command I get a lengthy error about a git command error.

And even if I were to add these files to my repository, I would want to add them manually.

totallykvothe•7mo ago
Subrepo?
slig•7mo ago
There's a GH issue tracking the auto committing, and the author says they'll solve this today.
jwpapi•7mo ago
see my comment regarding using it with aider
mrlesk•7mo ago
I made it configurable and autoCommit is false by default
slig•7mo ago
Thank you so much!
bityard•7mo ago
Whatever is going on in that GIF must be very impressive, but it goes by so fast it's impossible to tell for sure.
JimDabell•7mo ago
> Markdown-native tasks -- manage every issue as a plain .md file

> Rich query commands -- view, list, filter, or archive tasks with ease

If these things appeal to you and you haven’t already looked at it, the GitHub CLI tool gh is very useful. For instance:

    gh repo clone MrLesk/Backlog.md
    cd Backlog.md
    gh issue view 140
    gh issue view 140 --json body --template "{{.body}}"
— https://cli.github.com

You can do things like fork repos, open pull requests from your current branch, etc.

deafpolygon•7mo ago
What exactly is Markdown-native?
theshrike79•7mo ago
Markdown is used for data storage and transport?
jprokay13•7mo ago
Neat! I am going to check this out. I recently built an MCP system similar to this called Nonlinear (so clever) that uses SQLite for storage that lives outside the repo. Honestly though, in repo is the better option.
eadmund•7mo ago
This seems tailor-made for Org mode! Seeing all those ‘To do’ and ‘Done’ reminds me of it. Did you consider building atop Org mode?
theshrike79•7mo ago
Ooh, the idea of having a CLI for the LLM to mark off tasks without having to pollute context by reading the file over and over again is cool!
gekpp•6mo ago
Thanks for backlog.md. We love it for our project. The problem we face now is that we have two separate repos for BE (Golang) and FE (Next.js), and we want to have one backlog for both projects. But the backlog resides inside a git repo.

Can you propose a neat solution to this problem?