Can LLMs Do Accounting?

https://accounting.penrose.com/

6•yunyu•6mo ago

Comments

yunyu•6mo ago

LLMs are on the verge of replacing data scientists and investment bankers. But can they perform simple accounting tasks for a real business?

We built AccountingBench, a test where LLMs must "close the books" for a real SaaS business using 1 year of Stripe/Ramp/Rippling/Mercury data.

Claude 4 and Grok 4 start strong, landing within 1% of human CPA baselines in month 1.

But as time progresses, all models inevitably accumulate compounding errors and exhibit erratic behavior, causing significant deviations.

That said, the early accuracy here is promising. With targeted post-training, models may be able to replace humans for this kind of work.

simmerup•6mo ago

Accounting isn't really the type of thing that can accept errors though, is it?

Like it needs to be 0% error rate

yunyu•6mo ago

Some level of error is tolerable and inevitable. But accountants need to be able to correct for errors once they build up.

bell-cot•6mo ago

Given their inclination to fabricate user-pleasing answers...could I let an LLM do my tax returns?

yunyu•6mo ago

No comment. The good news is that accounting and taxes are verifiable, so in principle it is possible to use RL to train models to do them correctly.
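
A minimal sketch of what "verifiable" could mean here, assuming a simplified double-entry ledger: each journal entry's debits must equal its credits, which a program can check mechanically and, in principle, turn into a reward signal. The `Line` type, the account names, and the `reward` function are hypothetical illustrations, not part of AccountingBench or any real RL setup.

```python
from dataclasses import dataclass
from decimal import Decimal

ZERO = Decimal("0")

@dataclass(frozen=True)
class Line:
    """One line of a journal entry (hypothetical, simplified)."""
    account: str
    debit: Decimal = ZERO
    credit: Decimal = ZERO

def is_balanced(entry: list[Line]) -> bool:
    """Double-entry invariant: total debits equal total credits."""
    total_debit = sum((line.debit for line in entry), ZERO)
    total_credit = sum((line.credit for line in entry), ZERO)
    return total_debit == total_credit

def reward(entries: list[list[Line]]) -> float:
    """Fraction of entries that balance (assumed reward shape)."""
    if not entries:
        return 0.0
    return sum(is_balanced(e) for e in entries) / len(entries)

# Illustrative entries, not real data.
good = [Line("Cash", debit=Decimal("100.00")),
        Line("Revenue", credit=Decimal("100.00"))]
bad = [Line("Cash", debit=Decimal("100.00")),
       Line("Revenue", credit=Decimal("99.99"))]

print(reward([good, bad]))  # 0.5
```

A balance check like this only catches one class of mistake. An entry booked to the wrong account still balances, so a real verifier would presumably also need to compare against reconciled statements.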

mmarian•6mo ago

I was just thinking of that earlier today, really cool!

AlSweigart•6mo ago

LLMs are really not good at following specific processes like math. They operate off vibes.

Ask Claude to multiply two ten-digit numbers. It gets the first one or two digits correct, and then makes up the rest.

ChatGPT used to have the same problem, but now it writes a program to perform the math for it.
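
As a rough illustration of why handing the math to a program helps: Python integers are arbitrary precision, so the product of two ten-digit numbers is exact. The operands and the `exact_product` helper below are arbitrary examples, not taken from any real ChatGPT session.

```python
def exact_product(a: int, b: int) -> int:
    """Multiply two integers exactly; Python ints never overflow."""
    return a * b

# Two arbitrary ten-digit operands (illustrative values only).
a = 1234567891
b = 9876543219

product = exact_product(a, b)

# Cross-check by dividing back: the quotient must be b with no remainder.
quotient, remainder = divmod(product, a)
assert (quotient, remainder) == (b, 0)

print(product)
```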

yunyu•6mo ago

This was true up until they started training them using Reinforcement Learning from Verifier Feedback (starting with o1). By sticking a calculator in the training loop, they seem to have gotten out of the arithmetic error regime. That said, the ChatGPT default is GPT-4o, which is still susceptible to these issues.