A large test run would fail, the agent would get a huge wall of output, and much of its effort would then go into working out whether the failures were mostly repeats of the same blocker or several different problems.
Sift is a CLI that sits between a command and the agent. Instead of forwarding raw output directly, it tries to group repeated failures into root-cause buckets and return a short diagnosis with an anchor and a next step.
The main idea is simple. If 125 tests fail for one reason, the agent should pay for that reason once.
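To make the bucketing idea concrete, here is a minimal sketch in Python. It is not Sift's actual implementation: the `signature` and `bucket_failures` names, the normalization rules, and the sample messages are all assumptions chosen for illustration. The point is only that failures whose messages differ in incidental details (numbers, addresses) collapse into one bucket.

```python
import re
from collections import defaultdict


def signature(message: str) -> str:
    """Normalize a failure message so repeats of one cause look identical.

    Hypothetical rules: replace hex addresses and numbers with placeholders.
    """
    sig = re.sub(r"0x[0-9a-fA-F]+", "<hex>", message)  # memory addresses
    sig = re.sub(r"\d+", "<n>", sig)                   # line numbers, ports, ids
    return sig.strip()


def bucket_failures(failures):
    """Group (test_name, message) pairs by normalized signature.

    Returns a list of (signature, [test names]) sorted largest bucket first.
    """
    buckets = defaultdict(list)
    for test_name, message in failures:
        buckets[signature(message)].append(test_name)
    return sorted(buckets.items(), key=lambda kv: len(kv[1]), reverse=True)


# Made-up sample data: two tests share a cause, one is a separate problem.
failures = [
    ("test_a", "ConnectionRefusedError: port 5432"),
    ("test_b", "ConnectionRefusedError: port 5433"),
    ("test_c", "AssertionError: expected 3 got 4"),
]

for sig, tests in bucket_failures(failures):
    print(f"{len(tests)} test(s): {sig}")
```

With buckets like these, the agent reads one line per distinct cause rather than one block per failing test.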
Sift tries local heuristics first and only escalates if it cannot explain the output confidently. The raw output is still available as a fallback.
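The heuristic-first flow can be sketched roughly as follows. This is an illustration under stated assumptions, not how Sift is built: the `Diagnosis` type, the `missing_module` heuristic, the `escalate` callback, and the 0.8 confidence threshold are all hypothetical, and what escalation actually does is left as a placeholder.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Diagnosis:
    summary: str
    confidence: float  # 0.0 to 1.0
    source: str        # "heuristic" or "escalated"


Heuristic = Callable[[str], Optional[Diagnosis]]


def missing_module(output: str) -> Optional[Diagnosis]:
    """Hypothetical local heuristic: recognize a missing Python module."""
    match = re.search(r"ModuleNotFoundError: No module named '([^']+)'", output)
    if match is None:
        return None
    return Diagnosis(f"missing module {match.group(1)}", 0.9, "heuristic")


def diagnose(
    output: str,
    heuristics: Sequence[Heuristic],
    escalate: Callable[[str], Diagnosis],
    threshold: float = 0.8,  # assumed cutoff for "confident enough"
) -> Diagnosis:
    """Return the best local diagnosis, or escalate if none is confident."""
    best: Optional[Diagnosis] = None
    for heuristic in heuristics:
        result = heuristic(output)
        if result is not None and (best is None or result.confidence > best.confidence):
            best = result
    if best is not None and best.confidence >= threshold:
        return best
    return escalate(output)


def placeholder_escalate(output: str) -> Diagnosis:
    """Stand-in for whatever the more expensive path is."""
    return Diagnosis("no confident local explanation", 0.0, "escalated")


print(diagnose("E   ModuleNotFoundError: No module named 'requests'",
               [missing_module], placeholder_escalate))
```

The design choice this illustrates is that the cheap path runs first and the expensive path is reached only when no heuristic clears the threshold.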
So far Sift has been most useful on noisy test runs, especially pytest, vitest, and jest, but I have also been using the same idea on typecheck, lint, and build failures, as well as audits and diffs.