How LLM agents solve the table merging problem

https://futuresearch.ai/deep-merge-tutorial/
17•ddp26•2h ago

Comments

mckennameyer•2h ago
Interesting approach with the cascade. How do you decide when to escalate from fuzzy matching to an LLM?
parad0x0n•2h ago
Fuzzy matching only makes sense if you expect the two columns to hold more or less the same data; otherwise you can skip that step.

Then you have to pick a threshold: if the similarity of two strings is above that threshold, it's a match; otherwise it's not. The threshold should be high to prevent false positives. The LLM takes care of the non-matches.
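
A minimal sketch of that threshold step, assuming Python's standard-library `difflib` as the similarity measure and a hypothetical cutoff of 0.9 (neither comes from the article; `cascade_match` is an illustrative name). Anything below the cutoff is set aside for the LLM step rather than guessed at:

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def cascade_match(left, right, threshold=0.9):
    """Fuzzy-match each value in `left` to its best candidate in `right`.

    Returns (matches, unmatched). Only pairs at or above `threshold` are
    accepted; everything else is left for a later, more expensive step.
    The 0.9 default is an assumption, not a recommended value.
    """
    matches, unmatched = {}, []
    for name in left:
        best = max(right, key=lambda cand: similarity(name, cand), default=None)
        if best is not None and similarity(name, best) >= threshold:
            matches[name] = best
        else:
            unmatched.append(name)  # candidates for the LLM step
    return matches, unmatched


matches, unmatched = cascade_match(
    ["Acme Corp", "Globex Inc."],
    ["ACME Corp", "Initech"],
)
```

Here `"Acme Corp"` matches `"ACME Corp"` after case folding, while `"Globex Inc."` has no close candidate and ends up in `unmatched`.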

jackfranklyn•1h ago
Been working on this exact problem in the financial/accounting space - matching bank statement rows to accounting records. Real-world messiness makes it interesting:

The fuzzy threshold question is tricky because false positives are worse than false negatives. A user seeing a wrong match erodes trust fast. We ended up with a tiered approach: high-confidence matches go through automatically, medium-confidence gets surfaced for human review, low-confidence stays unmatched rather than guessing.
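
Roughly, the tiered routing could look like the sketch below. The cutoffs 0.95 and 0.75 and the tier names are made up for illustration; in practice they would be tuned against real data:

```python
def route(score: float, auto: float = 0.95, review: float = 0.75) -> str:
    """Route a match by its confidence score in [0, 1].

    `auto` and `review` are hypothetical cutoffs: at or above `auto` the
    match is accepted automatically, between the two it goes to a human,
    and below `review` the row stays unmatched instead of guessing.
    """
    if score >= auto:
        return "auto_match"
    if score >= review:
        return "human_review"
    return "unmatched"


decisions = [route(s) for s in (0.99, 0.80, 0.40)]
```

Keeping the low tier as "unmatched" reflects the point above: a missing match costs less trust than a wrong one.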

One thing we found: the hardest cases aren't the ones where strings are slightly different - they're the ones where the same transaction appears with completely different descriptions on each side. "PAYPAL *ACME" vs "Invoice 1234 - Acme Ltd". No amount of fuzzy matching helps there. That's where learning from historical patterns (how did the user match these before?) beats trying to infer semantic similarity from scratch every time.
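
One way to picture "learning from historical patterns" is a lookup keyed on a normalized bank description that remembers which counterparty the user picked before. This is only a sketch under assumptions: `MatchHistory`, `normalize`, and the rule of dropping digits and punctuation are hypothetical, not the system described above.

```python
import re
from collections import Counter, defaultdict
from typing import Optional


def normalize(description: str) -> str:
    """Lowercase and keep letters only, so recurring descriptions collide.

    Dropping digits and punctuation is an assumption made for this sketch.
    """
    letters_only = re.sub(r"[^a-z]+", " ", description.lower())
    return " ".join(letters_only.split())


class MatchHistory:
    """Remembers which counterparty users chose for each description."""

    def __init__(self) -> None:
        self._seen = defaultdict(Counter)

    def record(self, bank_description: str, counterparty: str) -> None:
        """Store one user-confirmed match."""
        self._seen[normalize(bank_description)][counterparty] += 1

    def suggest(self, bank_description: str) -> Optional[str]:
        """Return the most frequently confirmed counterparty, if any."""
        counts = self._seen.get(normalize(bank_description))
        if not counts:
            return None
        return counts.most_common(1)[0][0]


history = MatchHistory()
history.record("PAYPAL *ACME", "Acme Ltd")
history.record("PAYPAL *ACME 0042", "Acme Ltd")
suggestion = history.suggest("paypal *acme")
```

After two confirmed matches, a new "PAYPAL *ACME" row is suggested as "Acme Ltd" without any string similarity between the two sides.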

ddp26•1h ago
Yep! We have lots of examples like that, where two vendors or two customers have completely non-matching names. With LLMs and LLM web agents, you can also associate things that are not the same entity.

One example we have is merging a table of companies with a table of company websites. You get things like "Acme Corp" matching "my-logicistics.com" that no LLM has memorized, so you have to look them up on the web. ReAct web agents work really well here, but they can be very expensive, so it's all about doing this cost-efficiently.

GPTZero finds 100 new hallucinations in NeurIPS 2025 accepted papers

https://gptzero.me/news/neurips/
584•segmenta•6h ago•296 comments

Show HN: isometric.nyc – giant isometric pixel art map of NYC

https://cannoneyed.com/isometric-nyc/
397•cannoneyed•5h ago•118 comments

Qwen3-TTS family is now open sourced: Voice design, clone, and generation

https://qwen.ai/blog?id=qwen3tts-0115
373•Palmik•8h ago•108 comments

CSS Optical Illusions

https://alvaromontoro.com/blog/68091/css-optical-illusions
99•ulrischa•4h ago•10 comments

Compiling Scheme to WebAssembly

https://eli.thegreenplace.net/2026/compiling-scheme-to-webassembly/
30•chmaynard•4d ago•6 comments

Recent discoveries on the acquisition of the highest levels of human performance

https://www.science.org/doi/abs/10.1126/science.adt7790
60•colincooke•3h ago•24 comments

'Active' sitting is better for brain health: review of studies

https://www.sciencealert.com/not-all-sitting-is-equal-one-type-was-just-linked-to-better-brain-he...
30•mikhael•2h ago•13 comments

Tree-sitter vs. Language Servers

https://lambdaland.org/posts/2026-01-21_tree-sitter_vs_lsp/
180•ashton314•7h ago•49 comments

Why does SSH send 100 packets per keystroke?

https://eieio.games/blog/ssh-sends-100-packets-per-keystroke/
152•eieio•2h ago•110 comments

Launch HN: Constellation Space (YC W26) – AI for satellite mission assurance

27•kmajid•4h ago•5 comments

Show HN: First Claude Code client for Ollama local models

https://github.com/21st-dev/1code
17•SerafimKorablev•4h ago•8 comments

Reverse engineering Lyft Bikes for fun (and profit?)

https://ilanbigio.com/blog/lyft-bikes.html
30•ibigio•5h ago•7 comments

Keeping 20k GPUs healthy

https://modal.com/blog/gpu-health
53•jxmorris12•4d ago•16 comments

Mote: An Interactive Ecosystem Simulation [video]

https://www.youtube.com/watch?v=Hju0H3NHxVI
44•evakhoury•23h ago•3 comments

Your app subscription is now my weekend project

https://rselbach.com/your-sub-is-now-my-weekend-project
94•robteix•3d ago•100 comments

AnswerThis (YC F25) Is Hiring

https://www.ycombinator.com/companies/answerthis/jobs/r5VHmSC-ai-agent-orchestration
1•ayush4921•4h ago

I was banned from Claude for scaffolding a Claude.md file?

https://hugodaniel.com/posts/claude-code-banned-me/
232•hugodan•3h ago•178 comments

My first year in sales as technical founder

https://www.fabiandietrich.com/blog/first-year-in-sales.html
11•f3b5•5d ago•2 comments

Design Thinking Books (2024)

https://www.designorate.com/design-thinking-books/
255•rrm1977•10h ago•118 comments

It looks like the status/need-triage label was removed

https://github.com/google-gemini/gemini-cli/issues/16728
253•nickswalker•5h ago•62 comments

A Year of 3D Printing

https://brookehatton.com/blog/making/a-year-of-3d-printing/
58•nindalf•5d ago•62 comments

Extracting a UART Password via SPI Flash Instruction Tracing

https://zuernerd.github.io/blog/2026/01/07/switch-password.html
3•Eduard•22m ago•0 comments

Vulnerable WhisperPair Devices – Hijack Bluetooth Accessories Using Fast Pair

https://whisperpair.eu/vulnerable-devices
13•gnabgib•4d ago•4 comments

Show HN: Text-to-video model from scratch (2 brothers, 2 years, 2B params)

https://huggingface.co/collections/Linum-AI/linum-v2-2b-text-to-video
21•schopra909•5h ago•7 comments

Show HN: BrowserOS – "Claude Cowork" in the browser

https://github.com/browseros-ai/BrowserOS
32•felarof•5h ago•13 comments

Show HN: CLI for working with Apple Core ML models

https://github.com/schappim/coreml-cli
15•schappim•1h ago•0 comments

TTY and Buffering

https://mattrighetti.com/2026/01/12/tty-and-buffering
31•mattrighetti•5d ago•5 comments

Show HN: Synesthesia, make noise music with a colorpicker

https://visualnoise.ca
20•tevans3•16h ago•8 comments

ISO PDF spec is getting Brotli – ~20 % smaller documents with no quality loss

https://pdfa.org/want-to-make-your-pdfs-20-smaller-for-free/
137•whizzx•11h ago•83 comments

Skill.md: An open standard for agent skills

https://www.mintlify.com/blog/skill-md
30•skeptrune•3h ago•6 comments