In our experience, anything beyond very constrained function calling opens the door to explainability problems. We moved away from "based on the embeddings of this P&L, you should do X" towards "I called a function to generate your P&L, which is in this table; based on this you could consider applying these actions".
It's a loss in terms of semantics (the embeddings could pack more granular P&L observations over time), but a big win in terms of explainability. I see other finance AIs, such as SAP Joule, also going in the same direction.
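To make the pattern concrete, here's a minimal sketch of constrained function calling, assuming an OpenAI-style tools schema; the function `generate_pnl`, its parameters, and the whitelist dispatch are hypothetical illustrations, not the poster's actual pipeline:

```python
# Minimal sketch of constrained function calling (hypothetical names/schema).
# Instead of advising from embeddings, the model may only call a vetted
# function whose tabular output is shown to the user alongside the advice.

import json

# The only tool the model is allowed to call: a P&L report generator.
PNL_TOOL = {
    "type": "function",
    "function": {
        "name": "generate_pnl",  # hypothetical name
        "description": "Generate a P&L table for a book over a date range.",
        "parameters": {
            "type": "object",
            "properties": {
                "book_id": {"type": "string"},
                "start_date": {"type": "string", "format": "date"},
                "end_date": {"type": "string", "format": "date"},
            },
            "required": ["book_id", "start_date", "end_date"],
        },
    },
}

def generate_pnl(book_id: str, start_date: str, end_date: str) -> list[dict]:
    """Deterministic, auditable P&L computation (stubbed here)."""
    return [{"date": start_date, "book": book_id, "pnl": 0.0}]

def dispatch(tool_call: dict) -> str:
    """Route a model-emitted tool call to the real implementation.
    Rejecting anything outside the whitelist is what keeps the system
    explainable: every number shown traces back to a function run."""
    allowed = {"generate_pnl": generate_pnl}
    name = tool_call["name"]
    if name not in allowed:
        raise ValueError(f"model requested non-whitelisted function {name!r}")
    args = json.loads(tool_call["arguments"])
    return json.dumps(allowed[name](**args))
```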
Interpretability can mean several things. Are you familiar with things like this? https://distill.pub/2018/building-blocks/
Monosemantic behavior is key in our research.
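For context, a common route to more monosemantic features is training a sparse autoencoder on a model's activations (in the spirit of the dictionary-learning work that followed the Distill piece above). A generic sketch, not the poster's method; all dimensions and names are illustrative:

```python
# Generic sparse-autoencoder sketch for extracting (more) monosemantic
# features from model activations; sizes are illustrative only.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int = 512, d_feat: int = 4096):
        super().__init__()
        self.enc = nn.Linear(d_model, d_feat)  # overcomplete feature basis
        self.dec = nn.Linear(d_feat, d_model)

    def forward(self, acts: torch.Tensor):
        feats = torch.relu(self.enc(acts))  # non-negative feature activations
        recon = self.dec(feats)
        return recon, feats

sae = SparseAutoencoder()
acts = torch.randn(64, 512)  # stand-in for residual-stream activations
recon, feats = sae(acts)
# Reconstruction loss plus an L1 penalty pushes each feature to fire
# sparsely, which is what nudges features towards single meanings.
loss = ((recon - acts) ** 2).mean() + 1e-3 * feats.abs().mean()
loss.backward()
```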
I imagine these metrics would be good to include in the MI, but are you confident that the methods being proposed are adequate to convince regulators on both sides of the Atlantic?
The industry at the moment mostly uses closed-source vendor models that are very hard to validate or interpret. We are pushing to move onto models with open-source weights, where we can apply our interpretability methods.
Current validation approaches are still very behavioral in nature, and we want to move them into the mechanistic-interpretation world.
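To illustrate the contrast: a behavioral check scores the model as a black box on inputs and outputs, while a mechanistic check looks at internal representations. A toy sketch under assumed inputs (cached activations and concept labels are hypothetical), using a linear probe as a simple stand-in for deeper mechanistic methods:

```python
# Toy contrast between behavioral and mechanistic validation
# (model, data, and layer choices are hypothetical).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def behavioral_check(model_fn, cases) -> float:
    """Black-box validation: fraction of test cases where the
    model's output matches the expected answer."""
    return float(np.mean([model_fn(x) == y for x, y in cases]))

def mechanistic_probe(activations: np.ndarray, labels: np.ndarray) -> float:
    """White-box validation: if a linear probe on a layer's activations
    recovers the concept on held-out data, you can argue the model
    represents it internally, not just that its outputs look right."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        activations, labels, test_size=0.25, random_state=0
    )
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return probe.score(X_te, y_te)
```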