I’m open-sourcing an autonomous financial research and analysis agent. The agent can search SEC filings, extract key financials, and build financial models.
It scores 80% on the public Finance Agent validation set with GPT-5, compared with the top result of 55% listed on the benchmark’s website for the private validation set.
But the public validation set has mistakes: in quite a few cases, the “ground truth” answers in the benchmark are wrong. I’ve documented each case with citations directly to SEC EDGAR here (https://github.com/lucasastorian/intellifin-agent/blob/main/...).
Accuracy with GPT-5 jumps to 92% once we fix those mistakes in the eval.
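To make the rescoring concrete, here is a minimal sketch of how accuracy changes when wrong reference answers are corrected. Everything in it is an assumption for illustration: the question IDs, the answers, the helper names `accuracy` and `apply_corrections`, and the exact-match comparison are all hypothetical, and the repo’s actual grading may work differently (for example, with a rubric or an LLM judge).

```python
def accuracy(predictions, ground_truth):
    """Fraction of questions where the agent's answer equals the reference."""
    correct = sum(predictions[q] == ground_truth[q] for q in ground_truth)
    return correct / len(ground_truth)


def apply_corrections(ground_truth, corrections):
    """Return a copy of the references with the documented fixes applied."""
    fixed = dict(ground_truth)
    fixed.update(corrections)
    return fixed


# Toy data, invented for illustration only (not taken from the benchmark).
ground_truth = {"q1": "100", "q2": "200", "q3": "300", "q4": "400"}
predictions = {"q1": "100", "q2": "250", "q3": "300", "q4": "410"}

# Hypothetical fix: the reference answer for q2 was wrong in the original set.
corrections = {"q2": "250"}

before = accuracy(predictions, ground_truth)
after = accuracy(predictions, apply_corrections(ground_truth, corrections))
print(f"before: {before:.0%}, after: {after:.0%}")  # before: 50%, after: 75%
```

The point is that the agent’s answers stay the same; only the references they are graded against change.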
You can clone the repo and rerun the benchmark yourself.
The next step is to turn this into an open-source “Cursor for Finance,” with a proper UI for equities research and financial modeling.
Feedback & questions are welcome.