I’ve been working on a tool called DatumInt that inspects data files or data snippets for subtle issues that often slip past basic parsing and only cause problems downstream.
This came from repeatedly seeing files that:
- parsed successfully
- passed schema checks
- but still caused issues later (empty required fields, weird placeholder values, encoding/whitespace problems, type inconsistencies, etc.)
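To make that concrete, here's a rough sketch of the kind of column-level checks involved. This is my own illustration, not DatumInt's actual code, and the `inspect_column` function and `PLACEHOLDERS` set are hypothetical names:

```python
# Illustrative only: a few checks that catch "parses fine but breaks
# downstream" issues in a single column of string values.

PLACEHOLDERS = {"n/a", "na", "none", "null", "-", "?", "tbd"}

def inspect_column(values):
    """Return a list of (issue, value) findings for one column."""
    findings = []
    types_seen = set()
    for v in values:
        if v == "":
            findings.append(("empty", v))
            continue
        if v != v.strip():
            findings.append(("leading/trailing whitespace", v))
        if v.strip().lower() in PLACEHOLDERS:
            findings.append(("placeholder value", v))
        # crude type inference, used to flag mixed-type columns
        try:
            float(v)
            types_seen.add("number")
        except ValueError:
            types_seen.add("text")
    if len(types_seen) > 1:
        findings.append(("mixed types", sorted(types_seen)))
    return findings
```

For example, a column like `["1", "2", " 3", "N/A", ""]` would pass most parsers but still surface four distinct findings here.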
Right now it’s a simple web app:
- upload a data file or paste data
- run an inspection
- get a structured report of potential data quality issues
This is very early and focused on small-to-medium files. It’s not meant to replace full data quality frameworks or observability tools; it’s more of a fast “sanity check” at the file boundary.
I’m mainly trying to learn:
- what kinds of file-level issues people actually run into
- whether this kind of inspection is useful in real workflows
If you try it, I’d really appreciate any honest feedback or cases it misses.