Ask HN: How do we benchmark data extraction, analysis, and subsequent enrichment?
1•srijansriv•2h ago
I have been working on a system that has to (1) extract data from a bunch of structured and unstructured sources, (2) analyse the relevance and correctness of that data (I'm sorry this part is vague), and (3) use the data obtained to fill forms and create write-ups. I do not want to mindlessly use any LLM. I want to improve each of those (broadly, three) steps with proof. How do we do that? If there are any benchmarks for these steps, which systems are leading them?
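For the extraction step, the kind of "proof" I mean could look something like field-level precision and recall against a small hand-labeled gold set. This is only a minimal sketch: it assumes exact match after light normalization is an acceptable definition of "correct", and `score_extraction` and `FieldScore` are hypothetical names for illustration, not part of any existing benchmark or library.

```python
from dataclasses import dataclass


@dataclass
class FieldScore:
    precision: float  # share of extracted fields that match the gold record
    recall: float     # share of gold fields that were extracted correctly
    f1: float         # harmonic mean of precision and recall


def score_extraction(predicted: dict, gold: dict) -> FieldScore:
    """Score one extracted record against a hand-labeled gold record.

    Assumption: a field is correct only if its normalized value equals
    the gold value for the same key (exact match, case-insensitive).
    """
    def norm(value) -> str:
        return str(value).strip().lower()

    # Ignore fields that are explicitly empty on either side.
    pred = {k: norm(v) for k, v in predicted.items() if v is not None}
    ref = {k: norm(v) for k, v in gold.items() if v is not None}

    correct = sum(1 for k, v in pred.items() if ref.get(k) == v)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(ref) if ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return FieldScore(precision, recall, f1)


# Made-up example record, for illustration only.
predicted = {"name": "Acme Corp ", "founded": 1999, "city": "Paris"}
gold = {"name": "acme corp", "founded": 1999, "city": "Berlin", "ceo": "J. Doe"}
score = score_extraction(predicted, gold)
print(score)
```

Averaging these scores over a fixed gold set before and after each change would give one comparable number for the extraction step. The relevance/correctness and form-filling steps would presumably need their own gold sets and metrics.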