I need AI that scans every PR and issue and de-dupes

https://twitter.com/steipete/status/2023057089346580828
11•vibeprofessor•3h ago

Comments

ranger_danger•1h ago
> Worked all day yesterday and got like 600 commits in. It was 2700; now it's over 3100.

Why? There's no reason you need to actually handle that many in a day, right? Pace yourself.

slowcache•1h ago
The reviews must be heavily AI assisted in order to get that sort of volume in.

Either way, it doesn't surprise me that this number is so high. Productivity chasing is the name of the game for AI, regardless of how sustainable or helpful the extra work actually is.

CodingJeebus•1h ago
> How's no startup working on this?

Because there's no money in trying to filter out noise that costs next to nothing to generate. It's like asking why no startup is trying to bring forum moderation to the masses.

ranger_danger•42m ago
I think anti-spam providers might disagree with that take.
pavel_lishin•41m ago
> Because there's no money in trying to filter out noise that costs next to nothing to generate.

Not yet, but when there's so much more noise than signal, it'll become valuable.

ltbarcly3•1h ago
I mean, you can just do this with Claude Code or opencode. I suggest opencode and Gemini Pro, since it has a nice big context window. If you are trying to do something like this on the website version of the models, just forget it and stop using those; they are like toys compared to the CLI tools.

Step 1: have it sum up every issue and PR in about 100 words. You can have it use subagents working on subsets of the tickets so it doesn't take forever.

Step 1a: concatenate all the summary files to one big file.

Step 2: have it check pairs that look like duplicates based on the summaries. You may have to force it to read the entire file; for whatever reason, models are trained to avoid just reading stuff into their context and will try grep, writing scripts, and whatever else.

Step 3: repeat the above until it stops finding dupes.
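
As a rough illustration of the mechanical parts of Steps 1 and 1a, here is a minimal Python sketch. It assumes each ticket summary has already been written to its own `.txt` file by whatever agent you use; the helper names `chunk` and `concatenate_summaries` and the file layout are hypothetical, not something described in the thread.

```python
from pathlib import Path


def chunk(items, size):
    """Split a list into consecutive subsets of at most `size` items,
    e.g. one subset per subagent (Step 1)."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def concatenate_summaries(summary_dir: Path, out_path: Path) -> int:
    """Join every per-ticket summary file into one big file (Step 1a).

    Assumes one `<ticket-id>.txt` file per ticket in `summary_dir`.
    Returns the number of summaries written.
    """
    # Collect the inputs before opening the output, in a stable order.
    paths = sorted(summary_dir.glob("*.txt"))
    with out_path.open("w", encoding="utf-8") as out:
        for p in paths:
            body = p.read_text(encoding="utf-8").strip()
            # A heading per ticket keeps the IDs visible to the model.
            out.write(f"## {p.stem}\n{body}\n\n")
    return len(paths)
```

Calling `chunk(ticket_ids, 50)` would give each subagent 50 tickets; the subset size is a guess to tune, not a recommendation from the comment.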

I think this will probably take about 4 hours? 2 hours to get the process working and 2 hours of looping it.

If you don't think the above will work well, please just move along; don't bother arguing with me, because I've done tasks like this over and over and it works great.

Ways to get better results in general:

- Start by having it write a script to dump all the relevant information you will need up front. It's much faster at reading files than at making MCP calls. It's also less likely to pretend to read files and just assume it didn't find anything (happens more than you think).

- Break the problem down into clear steps for the model; don't just give it a vague project. Just paste the steps above and it should work fine.

- Check what it is doing. Don't assume that because it says it read a file, it actually read it. It will very often read the first 1000 bytes, skip the rest, and then assume it read everything. In fact, ChatGPT will complain that the input is truncated when it is the one that chose to read only the first part.
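
One way to act on the last point is to compare what the model reports against facts it could only know after reading the whole file. The sketch below is an assumption about how that check might look, not something the commenter describes: you ask the model to report the line count and the final line, then compare them locally. The names `fingerprint` and `read_looks_complete` are hypothetical.

```python
import hashlib
from pathlib import Path


def fingerprint(path: Path) -> dict:
    """Compute facts about a file that require reading all of it."""
    data = path.read_bytes()
    lines = data.decode("utf-8", errors="replace").splitlines()
    return {
        "bytes": len(data),
        "lines": len(lines),
        "last_line": lines[-1] if lines else "",
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def read_looks_complete(path: Path, reported_lines: int,
                        reported_last_line: str) -> bool:
    """Return True if the model's reported line count and last line
    match the file on disk. A mismatch suggests a partial read."""
    fp = fingerprint(path)
    return (fp["lines"] == reported_lines
            and fp["last_line"].strip() == reported_last_line.strip())
```

A match does not prove the model used everything it read, but a mismatch is a cheap signal that it stopped early.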

ferngodfather•1h ago
I asked Copilot (work) to do this with a sheet and the summary it gave each time was so generic I couldn't tell one ticket from another. Feeding it tickets individually was fine, but in a spreadsheet it just seemed to forget.

Would be interested to learn how we can get true foreach loops.

tayo42•1h ago
For 3k issues it's 3000x3000 checks to find duplicates? Can you cache similarity?
7e•1h ago
Nearest neighbor embedding search.
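
To make the numbers in this exchange concrete: comparing every unordered pair of n tickets takes n(n-1)/2 checks, which is 4,498,500 for 3,000 tickets, about half of the 3000x3000 figure. With embeddings, each ticket is embedded once and the vector is cached, so only the cheap vector comparisons scale with the number of pairs. The Python sketch below uses a word-count vector as a stand-in for a real embedding model (`toy_embed` is hypothetical and much cruder than an actual embedding), and a brute-force search where a real system would likely use an approximate nearest-neighbor index.

```python
import math
import re
from collections import Counter


def pair_count(n: int) -> int:
    """Number of unordered pairs among n tickets."""
    return n * (n - 1) // 2


def toy_embed(text: str) -> Counter:
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_neighbor(i: int, vectors: list) -> tuple:
    """Brute-force search: index and score of the vector most similar
    to vectors[i], excluding itself."""
    best = max((j for j in range(len(vectors)) if j != i),
               key=lambda j: cosine(vectors[i], vectors[j]))
    return best, cosine(vectors[i], vectors[best])


# Embed each ticket once and keep the vectors; this list is the cache.
tickets = [
    "app crashes when saving file",
    "dark mode request for settings page",
    "crash on saving a file",
]
cache = [toy_embed(t) for t in tickets]
```

Here `nearest_neighbor(0, cache)` points ticket 0 at ticket 2, the other crash report. A new ticket only needs to be embedded once and compared against the cached vectors, rather than triggering a full re-scan.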
cadamsdotcom•58m ago
No one else has done it and code is easier than ever to create. This tool needs to be built by the person closest to the problem.

Ask your agent for ways to do this using code, not more AI.

It might propose - and build! - an embeddings based system and scraper for your issues & PRs. Using that will burn zero tokens and you can iterate on it as you think of improvements.

tantalor•55m ago
Hire a staff
akmarinov•25m ago
He already has dozens of Codex and Claude Code accounts
thatjoeoverthr•44m ago
It's surprisingly difficult, and the "obvious" techniques (just do embeddings) don't really work. I wrote about it and did benchmarks here: https://joecooper.me/blog/redundancy/
forty•43m ago
It's not clear to me: is he asking us to build this, or is he using Twitter to ask his clawd bot to do it?
forty•41m ago
Or more meta: is this message from the bot itself, controlling his Twitter, which got fed up because it's also merging the MRs?
DangitBobby•28m ago
People aren't even good at this task if Stack Overflow is any indication.

Magnus Carlsen Wins the Freestyle (Chess960) World Championship

https://www.fide.com/magnus-carlsen-wins-2026-fide-freestyle-world-championship/
30•prophylaxis•1h ago•6 comments

I’m joining OpenAI

https://steipete.me/posts/2026/openclaw
277•mfiguiere•1h ago•191 comments

LT6502: A 6502-based homebrew laptop

https://github.com/TechPaula/LT6502
272•classichasclass•6h ago•102 comments

GNU Pies – Program Invocation and Execution Supervisor

https://www.gnu.org.ua/software/pies/
42•smartmic•2h ago•33 comments

Audio is the one area small labs are winning

https://www.amplifypartners.com/blog-posts/arming-the-rebels-with-gpus-gradium-kyutai-and-audio-ai
47•rocauc•2d ago•5 comments

Radio host David Greene says Google's NotebookLM tool stole his voice

https://www.washingtonpost.com/technology/2026/02/15/david-greene-google-ai-podcast/
49•mikhael•5h ago•38 comments

Modern CSS Code Snippets: Stop writing CSS like it's 2015

https://modern-css.com
145•eustoria•5h ago•52 comments

I fixed Windows native development

https://marler8997.github.io/blog/fixed-windows/
621•deevus•12h ago•304 comments

EU bans the destruction of unsold apparel, clothing, accessories and footwear

https://environment.ec.europa.eu/news/new-eu-rules-stop-destruction-unsold-clothes-and-shoes-2026...
694•giuliomagnifico•6h ago•461 comments

Pocketblue – Fedora Atomic for mobile devices

https://github.com/pocketblue/pocketblue
31•nikodunk•6h ago•4 comments

I Gave Claude Access to My Pen Plotter

https://harmonique.one/posts/i-gave-claude-access-to-my-pen-plotter
35•futurecat•2d ago•12 comments

Show HN: VOOG – Moog-style polyphonic synthesizer in Python with tkinter GUI

https://github.com/gpasquero/voog
51•gpasquero•3h ago•4 comments

Show HN: Microgpt is a GPT you can visualize in the browser

https://microgpt.boratto.ca
70•b44•4h ago•5 comments

Towards Autonomous Mathematics Research

https://arxiv.org/abs/2602.10177
69•gmays•4h ago•32 comments

Real-time PathTracing with global illumination in WebGL

https://erichlof.github.io/THREE.js-PathTracing-Renderer/
106•tobr•3d ago•10 comments

Gwtar: A static efficient single-file HTML format

https://gwern.net/gwtar
157•theblazehen•7h ago•55 comments

Show HN: Klaw.sh – Kubernetes for AI agents

https://github.com/klawsh/klaw.sh
9•eftalyurtseven•6h ago•0 comments

Continuous batching from first principles (2025)

https://huggingface.co/blog/continuous_batching
7•jxmorris12•39m ago•1 comments

Show HN: Pangolin: Open-source identity-based VPN (Twingate/Zscaler alternative)

https://github.com/fosrl/pangolin
15•miloschwartz•12h ago•6 comments

Show HN: Knock-Knock.net – Visualizing the bots knocking on my server's door

https://knock-knock.net
75•djkurlander•6h ago•24 comments

Two different tricks for fast LLM inference

https://www.seangoedecke.com/fast-llm-inference/
153•swah•13h ago•63 comments

Show HN: Deadlog – almost drop-in mutex for debugging Go deadlocks

https://github.com/stevenctl/deadlog
22•dirteater_•5d ago•1 comments

Show HN: DSCI – Dead Simple CI

https://github.com/melezhik/DSCI
13•melezhik•6h ago•4 comments

Hideki Sato, designer of all Sega's consoles, has died

https://www.videogameschronicle.com/news/hideki-sato-designer-of-segas-consoles-dies-age-75/
298•magoghm•7h ago•30 comments

Oat – Ultra-lightweight, zero dependency, semantic HTML, CSS, JS UI library

https://oat.ink/
438•twapi•15h ago•118 comments

Amazon's Ring and Google's Nest reveal the severity of U.S. surveillance state

https://greenwald.substack.com/p/amazons-ring-and-googles-nest-unwittingly
639•mikece•10h ago•455 comments

Editor's Note: Retraction of article containing fabricated quotations

https://arstechnica.com/staff/2026/02/editors-note-retraction-of-article-containing-fabricated-qu...
107•bikenaga•4h ago•90 comments

Sony Jumbotron Image Control System (1998) [pdf]

https://pro.sony/s3/cms-static-content/operation-manual/3864848111.pdf
24•xattt•3d ago•10 comments

How Is Data Stored?

https://www.makingsoftware.com/chapters/how-is-data-stored
143•tzury•5d ago•14 comments

Show HN: Lightwave – Real-time notes app, 3.5 years of hand-rolled JavaScript

22•jv22222•2h ago•23 comments