KPMG wrote 100-page prompt to build agentic TaxBot

https://www.theregister.com/2025/08/20/kpmg_giant_prompt_tax_agent/

11•ofrzeta•6h ago

Comments

ofrzeta•6h ago

"It is very efficient," Munnelly told the Forrester conference. "It does what our team used to do in about two weeks, in a day. It will strip through our documents and the legislation and produce a 25-page document for a client as a first draft.

"That speed is important," he added. "If we have a client who is about to do a merger, and they want to understand the tax implications, getting that knowledge in a day is much more important than getting it in two weeks' time."

---

I really wonder what the foundation for their confidence in LLMs is. If you have ever used ChatGPT you will be highly skeptical that the output is correct. If it's code, you can at least compile, typecheck, run it, to verify it to some extent. How do you do that with a 25 page report?

defrost•6h ago

> How do you do that with a 25 page report?

Like any technical 25 page report it'll be ballpark with reality, shorter to read and grasp than crawling through a wall of document filled boxes, and passed to other people to 'verify' / offer their opinions on.

Once contracts are in place with millions of dollars in play (or tens of millions, or billions) there will be clauses addressing responsibility and recompense should key parts of the reports upon which an agreement is based prove to be false.

The world runs on technical reports that aren't perfect, but "near enough"; errors are assumed and a frequency of deliberate malfeasance (knowingly lying, misleading, faking results) can be estimated.

Part of my career consisted of producing summaries of two to three thousand documents a day from stock markets about the globe, documents that ranged from three lines announcing a change on a board, a table disclosing a change in holdings by largest investors, etc. to large (hundred+ page) quarterly and annual reports, to small book economic feasibility reports with wads of raw data, interpretation, proposed plans, costings, timelines, etc.

> It will strip through our documents and the legislation and produce a 25-page document for a client as a first draft.

is the key point here, it's a rapid first draft of the major dot points seen to be most important for <whatever>. It is intended to be crawled through with a finer comb and a keen eye before contracts are signed based on a separate framing of <deal>.

The big change here is that an AI churns out a draft faster; the quality of the document will be as suspect as a human-created, non-AI first draft .. untrusted.

ofrzeta•5h ago

Untrusted ... but does it have any value at all when you can't be sure that a lot of it is hallucinated? After all, LLMs are not very good with numbers.

defrost•5h ago

You're correct that I can't be sure as I don't work at KPMG and haven't had any contact with their piles of documents, existing practices, or TaxBot summaries.

What I do know as a fact is that KPMG are self reporting satisfaction with their in house work on putting such a thing together.

The 'proof' will be the next five years of application to corporate clients.

> After all, LLMs are not very good with numbers.

The assumption, always, should be that neither are interns.

Hence why draft summaries should be reviewed and sanity checked by senior experienced people.

I would assume (based on my prior work summarizing large volumes of data for mineral and energy resources domain) that any report produced would have references back to source documents and pages making the task of cross checking the product simple and relatively straightforward.

Neywiny•55m ago

I think the concern goes beyond what it gathered: there's a lot of skepticism over it missing something. The same way so many AI tools just ignore commands, imagine it just ignoring a few sentences. Maybe like:

> We'll sell you our company for $100. But, you have to do a hand-stand and spin around 5 times.

If the AI only puts the first sentence in the summary, you could see how it'd be a bad day for the client. Any human would go "huh that's weird, I'll make sure that's noted in the summary" but in my experience, AIs just don't have that feeling.

defrost•22m ago

What's being ignored, it seems, is this is explicitly an in-house tool for a first draft summary to be reviewed by an in-house accountant prior to a final presentation to a client.

> imagine it just ignoring a few sentences.

Sure. Just like the risk that every similar draft report prepared by a human intern | associate | junior already carries, today and in the past.

One would hope that as a company at risk of litigation and carrying the can for bad advice that an AI reduced draft such as this would be proof read by a senior expert in house who would trace back every "We'll sell you our company for $100." to the _original_ context via an embedded hyperlink in the draft.

It's certainly the way in which things were done when generating summaries of tens of thousands of documents for mineral and energy clients looking to invest at least $50 million in advancing projects for return.

Neywiny•19m ago

You've missed my point. I don't think any human who has a job at a law firm would ignore a sentence like that. I think any AI I've used has ignored explicit instructions of moderate severity. I'm not worried it'll hallucinate things into existence, I'm worried it'll ignore them out. Can't summarize without throwing away words. I don't trust it to choose the right ones.

SvenL•5h ago

I wonder the same. I mean, if it is produced in 1 day but I need 2 weeks to verify it, I don’t gain much. Sure I can ask it to quote and link the sources, but still. I remember this case of the Machine Learning book from Springer press where the author used an LLM and it was only revealed when someone tried to look up the quoted sources - they didn’t exist, they were made up.

yobbo•4h ago

It might also be their relative confidence in people vs LLMs for this sort of task. People could be worse when the task itself is trivial but the volume is intangible for a single human.

immibis•1h ago

The secret is that nobody both reads the report and wants it to be factual.

bashtoni•3h ago

If it really is a single 100 page prompt then it will be even less reliable than a KPMG audit.

(See https://www.theguardian.com/business/2023/oct/12/kpmg-fined-... or https://pcaobus.org/news-events/news-releases/news-release-d... or https://www.sec.gov/newsroom/press-releases/2017-142 or any of a myriad of other cases)

roxolotl•32m ago

> Munnelly said KPMG built the agent by writing a 100-page prompt it fed into Workbench. The Register asked for details of the prompt and Munnelly said a substantial team worked on it for months, and the resulting agent asks for four or five inputs before it starts working on tax advice, then asks a human for direction before generating a document.

> Only tax agents can use the tool, because its output is not suitable for people without deep tax expertise.

Ok cool so they write a giant piece of software to assist in highly specialized tasks. Would love to know what the LLM adds. Maybe just parsing?
