Hey HN, as a former data analyst, I’ve been tooling around trying to get agents to do my old job. The result is this system, which gets you maybe 80% of the way there. I think it's a good data point for what the current frontier models are capable of and where they are still lacking (in this case, hypothesis generation and general data intuition).
Some initial learnings:
- Generating web app-based reports goes much better if there are explicit templates/pre-defined components for the model to use.
- Claude can “heal” broken charts if you give it access to chart images and run a separate QA loop.
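To make the chart-healing point a bit more concrete, here is a minimal sketch of what a separate QA loop could look like. This is not the actual implementation: `heal_chart`, `Review`, and the `render`, `review`, and `repair` callables are all hypothetical names. In a real setup `review` would send the rendered image to a vision-capable model and `repair` would ask the model to rewrite the chart spec, but here they are plain functions so the control flow is visible.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Review:
    """Verdict from the QA step (hypothetical shape)."""
    ok: bool
    feedback: str = ""


def heal_chart(
    spec: dict,
    render: Callable[[dict], bytes],      # chart spec -> image bytes
    review: Callable[[bytes], Review],    # image bytes -> QA verdict
    repair: Callable[[dict, str], dict],  # (spec, feedback) -> new spec
    max_rounds: int = 3,
) -> Tuple[dict, bool]:
    """Render the chart, review the image, and repair the spec until the
    review passes or `max_rounds` repairs have been attempted."""
    for attempt in range(max_rounds + 1):
        verdict = review(render(spec))
        if verdict.ok:
            return spec, True
        if attempt < max_rounds:
            spec = repair(spec, verdict.feedback)
    return spec, False


# Toy stand-ins for the model calls (assumptions, for illustration only).
def fake_render(spec: dict) -> bytes:
    return b"titled" if spec.get("title") else b"untitled"


def fake_review(image: bytes) -> Review:
    return Review(ok=image == b"titled", feedback="chart has no title")


def fake_repair(spec: dict, feedback: str) -> dict:
    return {**spec, "title": "Revenue by month"}


if __name__ == "__main__":
    healed, ok = heal_chart({"type": "bar"}, fake_render, fake_review, fake_repair)
    print(ok, healed)
```

Keeping the reviewer separate from the step that generated the chart is the point of the sketch: the QA step only sees the rendered image, so it judges what a reader would actually see rather than what the spec was supposed to produce.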
Would love either feedback from the community or to hear from others who have tried similar things!