Ask LLM Agents to Classify Problems Before Starting

https://futuresearch.ai/merge-cardinality/

6•ddp26•2h ago

Comments

guerython•2h ago

Love the idea. In our merge worker we run a quick cardinality scan before anything else: the left-unique ratio, duplicate counts on both sides, and even a crude heuristic along the lines of 'if every right row has a unique ID but left rows repeat, it's many-to-one'. The result goes into the prompt as a hard constraint, and we bail out to web search if the stats clash with the agent's decision. The clash queue plus a quick search run drops our false positives from ~10% into the low single digits while the pipeline stays cheap. Do you ever reuse a stored classification so that a second merge between the same sources skips the extra gate?
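
For anyone who wants to try the scan, here is a minimal sketch of the kind of check described above, not the actual worker code. It assumes both sides have already been reduced to plain lists of join keys; the names `key_stats`, `classify_cardinality`, and `clashes` are made up for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class KeyStats:
    rows: int
    unique_ratio: float   # distinct keys / rows; 1.0 means no repeats
    duplicated_keys: int  # distinct keys that occur more than once


def key_stats(keys):
    """Summarize how often join keys repeat on one side of the merge."""
    counts = Counter(keys)
    rows = len(keys)
    return KeyStats(
        rows=rows,
        unique_ratio=len(counts) / rows if rows else 1.0,
        duplicated_keys=sum(1 for c in counts.values() if c > 1),
    )


def classify_cardinality(left_keys, right_keys):
    """Crude heuristic: label the merge by which sides have repeated keys."""
    left_unique = key_stats(left_keys).duplicated_keys == 0
    right_unique = key_stats(right_keys).duplicated_keys == 0
    if left_unique and right_unique:
        return "one-to-one"
    if right_unique:
        # Every right row has a unique ID but left rows repeat.
        return "many-to-one"
    if left_unique:
        return "one-to-many"
    return "many-to-many"


def clashes(stats_label, agent_label):
    """True when the agent's decision disagrees with the scan."""
    return stats_label != agent_label


label = classify_cardinality(["a", "a", "b"], ["a", "b"])
needs_review = clashes(label, "one-to-one")  # hypothetical agent answer
```

In this toy case `label` comes out as many-to-one, so an agent that answered one-to-one would land in the clash queue. Real data would also need key normalization (case, whitespace, nulls) before counting, which this sketch skips.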

parad0x0n•1h ago

Storing the classification definitely makes sense! We reuse the classification across the different merge attempts within the same run, but we don't store it, because we mostly work with different data every time.
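
Roughly, the per-run reuse can be pictured like the sketch below. This is an illustration under assumptions, not our actual code: `RunClassificationCache`, the source identifiers, and the toy classifier are all hypothetical, and nothing is written to disk, so the cache disappears when the run ends.

```python
class RunClassificationCache:
    """Remember one classification per (left, right) source pair for a single run."""

    def __init__(self, classify):
        self._classify = classify  # any callable (left_keys, right_keys) -> label
        self._labels = {}          # in-memory only; not persisted between runs

    def get(self, left_id, right_id, left_keys, right_keys):
        pair = (left_id, right_id)
        if pair not in self._labels:
            # First merge attempt for this pair: pay for the classification once.
            self._labels[pair] = self._classify(left_keys, right_keys)
        return self._labels[pair]


calls = []


def toy_classify(left_keys, right_keys):
    """Stand-in for the real (LLM-backed) classifier; records each call."""
    calls.append((tuple(left_keys), tuple(right_keys)))
    return "many-to-one" if len(set(left_keys)) < len(left_keys) else "one-to-one"


cache = RunClassificationCache(toy_classify)
first = cache.get("orders.csv", "customers.csv", ["a", "a", "b"], ["a", "b"])
second = cache.get("orders.csv", "customers.csv", ["a", "a", "b"], ["a", "b"])
```

The second `get` returns the stored label without calling the classifier again. Persisting `_labels` across runs would only pay off if the same source pairs came back, which is rarely the case for us.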