Exploring the Limits of Large Language Models as Quant Traders

https://nof1.ai/blog/TechPost1
38•rzk•1h ago

Comments

kqr•1h ago
Super interesting! You can click the "live" link in the header to see how they performed over time. The (geometric) average result at the end seems to be that the LLMs are down 35 % from their initial capital – and they got there in just 96 model-days. That's a daily return of about -0.45 %, or a yearly return of roughly -81 %, i.e. practically wiping out the starting capital.
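For reference, the back-of-the-envelope arithmetic (a Python sketch using the approximate -35 % and 96 model-day figures read off the live page, so treat the outputs as approximate too):

```python
# Implied per-day and annualised returns, assuming a ~35% loss over
# 96 model-days (both inputs are approximate figures from the live page).
total_return = -0.35
model_days = 96

r_daily = (1 + total_return) ** (1 / model_days) - 1   # geometric per-day return
r_yearly = (1 + r_daily) ** 365 - 1                     # compounded over a year

print(f"daily:  {r_daily:.2%}")   # about -0.45%
print(f"yearly: {r_yearly:.2%}")  # about -81%
```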

Although I lack the maths to determine it numerically (depends on volatility etc.), it looks to me as though all six are overbetting and would be ruined in the long run. It would have been interesting to compare against a constant fraction portfolio that maintains 1/6 in each asset, as closely as possible while optimising for fees.
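A naive version of that benchmark could look like the sketch below. Daily rebalancing and a flat proportional fee are my assumptions; a fee-aware version would rebalance less often or only when weights drift past a band.

```python
import numpy as np

def constant_fraction_equity(prices: np.ndarray, fee: float = 0.001) -> np.ndarray:
    """Equity curve of a portfolio rebalanced daily to equal weights.

    prices: (days, n_assets) array of daily closes.
    fee:    assumed proportional cost on the value traded at each rebalance.
    """
    days, n_assets = prices.shape
    weight = 1.0 / n_assets                      # 1/6 for six assets
    equity = np.empty(days)
    equity[0] = 1.0
    holdings = equity[0] * weight / prices[0]    # units held of each asset
    for t in range(1, days):
        value = float(holdings @ prices[t])      # mark to market
        target = value * weight / prices[t]      # units needed for equal weights
        value -= fee * float(np.abs(target - holdings) @ prices[t])
        holdings = value * weight / prices[t]    # rebalance on the post-fee value
        equity[t] = value
    return equity
```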

> difficulty executing against self-authored plans as state evolves

This is indeed also what I've found trying to make LLMs play text adventures. Even when given a fair bit of help in the prompt, they lose track of the overall goal and find some niche corner to explore very patiently, but ultimately fruitlessly.

XenophileJKO•59m ago
I don't think betting on crypto is really playing to the strengths of the models. I think giving them news feeds and setting them loose on some section of the S&P 500 would be a better evaluation.
jwpapi•57m ago
Isn’t that what Renaissance Technologies does?
ezekiel68•55m ago
You don't actually need nanosecond latency to trade effectively in futures markets, but it does help to be able to evaluate and make decisions in the single-digit-millisecond range. Almost no generative model can perform inference within that latency threshold.

A threshold in the single-digit-millisecond range allows rapid detection of price reversals (signaling the need to exit a position with minimal loss) in even the most liquid of real futures contracts (not counting rare "flash crash" events).

vita7777777•20m ago
This is true for some classes of strategies. At the same time there are strategies that can be profitable on longer timeframes. The two worlds are not mutually exclusive.
rob_c•10m ago
Yes, but LLMs can barely cope with following the steps of a complex software tutorial in order. Why would you reasonably expect them, unprompted, to understand time well enough to trade and turn a profit?
graemep•13m ago
From the article:

> The models engage in mid-to-low frequency trading (MLFT), where decisions are spaced by minutes to a few hours, not microseconds. In stark contrast to high-frequency trading, MLFT gets us closer to the question we care about: can a model make good choices with a reasonable amount of time and information?

bluecalm•51m ago
> LLMs are achieving technical mastery in problem-solving domains on the order of Chess and Go, solving algorithmic puzzles and math proofs competitively in contests such as the ICPC and IMO.

I don't think LLMs are anywhere close to "mastery" in chess or go. Maybe a nitpick, but the point is that an NN created to be good at trading is likely to outperform LLMs at this task the same way NNs created specifically to be good at board games vastly outperform LLMs at those games.

lukan•14m ago
"Maybe a nitpick but the point is that a NN created to be good at trading is likely to outperform LLMs at this task the same way way NNs created specifically to be good at board games vastly outperform LLMs at those games."

Disagree. Go and chess are games with very limited rules. Successful trading, on the other hand, is not so much an arbitrary numbers game as it is about analyzing events in the news happening right now. Agentic LLMs that do this and buy and sell accordingly might succeed here.

(Not what they did here, though:

"For the first season, they are not given news or access to the leading “narratives” of the market.")

Havoc•34m ago
Are language models really the best choice for this?

Seems to me that the outcome would be near-random because they are so poorly suited, which might manifest as:

> We also found that the models were highly sensitive to seemingly trivial prompt changes

baq•31m ago
they're tools. treat them as tools.

since they're so general, you need to explore if and how you can use them in your domain. guessing 'they're poorly suited' is just that, guessing. in particular:

> We also found that the models were highly sensitive to seemingly trivial prompt changes

this is all but obvious to anyone who has seriously looked at deploying these; that's why there are some very successful startups in the evals space.

rob_c•12m ago
> guessing 'they're poorly suited' is just that, guessing

I have a really nice bridge to sell you...

This "failure" is just a grab at trying to look "cool" and "innovative" I'd bet. Anyone with a modicum of understanding of the tooling (or hell experience they've been around for a few years now, enough for people to build a feeling for this), knows that this it's not a task for a pre-trained general LLM.

reedf1•29m ago
you simply will lose trading directly with an llm. mapping the dislocation by estimating the percentage of llm trading bots is useful though.
vita7777777•21m ago
This is very thoughtful and interesting. It's worth noting that this is just a start, and in future iterations they're planning to give the LLMs much more to work with (e.g. news feeds). It's somewhat predictable that LLMs did poorly with only quantitative data (prices), but I'm very curious to see how they perform once they can read the news and Twitter sentiment.
rob_c•15m ago
Not only can I guarantee the models are bad with numbers; unless it's a highly tuned and modified version, they're also too slow for this arena. Stick to attention transformers in better model designs, which have much lower latencies than pre-trained LLMs...
Lapsa•9m ago
I would argue that sentiment classification is where LLMs perform best. Folks are already using them for precisely that purpose, and have even built a public index out of it.
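For illustration, an index of that kind can be as simple as averaging per-headline labels. In the sketch below, `classify` is a placeholder for whatever LLM call is used, and the label set and scoring are assumptions rather than how any particular public index works:

```python
from statistics import mean
from typing import Callable

def sentiment_index(headlines: list[str],
                    classify: Callable[[str], str]) -> float:
    """Map each headline to a score in [-1, 1] and average the batch.

    `classify` stands in for an LLM call that returns "bullish",
    "bearish" or "neutral" for a single headline; unknown labels count as 0.
    """
    scores = {"bullish": 1.0, "neutral": 0.0, "bearish": -1.0}
    return mean(scores.get(classify(h), 0.0) for h in headlines)
```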
callamdelaney•20m ago
The limits of LLMs for systematic trading were and are extremely obvious to anybody with a basic understanding of either field. You may as well be flipping a coin.
rob_c•14m ago
At least a coin is faster and more reliable.
aswegs8•17m ago
Given that LLMs can't even finish Pokemon Red, why would you expect them to be able to trade futures?
wild_pointer•2m ago
Hey! That wasn't easy!

Cloudflare outage on November 18, 2025 post mortem

https://blog.cloudflare.com/18-november-2025-outage/
990•eastdakota•9h ago•526 comments

Exploring the Limits of Large Language Models as Quant Traders

https://nof1.ai/blog/TechPost1
38•rzk•1h ago•21 comments

Gemini 3

https://blog.google/products/gemini/gemini-3/
1411•preek•18h ago•870 comments

What nicotine does to your brain

https://economist.com/science-and-technology/2025/09/12/what-nicotine-does-to-your-brain
23•runeks•2h ago•22 comments

Google Antigravity

https://antigravity.google/
876•Fysi•17h ago•853 comments

Show HN: Browser-based interactive 3D Three-Body problem simulator

https://trisolarchaos.com/?pr=O_8(0.6)&n=3&s=5.0&so=0.00&im=rk4&dt=1.00e-4&rt=1.0e-6&at=1.0e-8&bs...
119•jgchaos•18h ago•41 comments

Even Realities Smart Glasses: G2

https://www.evenrealities.com/smart-glasses
5•gessha•5d ago•1 comment

Pebble, Rebble, and a path forward

https://ericmigi.com/blog/pebble-rebble-and-a-path-forward/
392•phoronixrly•16h ago•192 comments

I made a down detector for down detector

https://downdetectorsdowndetector.com
108•gusowen•9h ago•27 comments

Blender 5.0

https://www.blender.org/download/releases/5-0/
749•FrostKiwi•11h ago•229 comments

I wrote a Pong game in a 512-byte boot sector

https://akshatjoshi.com/i-wrote-a-pong-game-in-a-512-byte-boot-sector/
36•akshat666•4d ago•3 comments

Ultima VII Revisited

https://github.com/ViridianGames/U7Revisited
77•erickhill•1w ago•9 comments

Mojo-V: Secret Computation for RISC-V

https://github.com/toddmaustin/mojo-v
24•fork-bomber•6d ago•6 comments

Bluetooth Channel Sounding: The Next Leap in Bluetooth Innovation

https://www.embedded.com/bluetooth-channel-sounding-the-next-leap-in-bluetooth-innovation?_gl=1*8...
41•JoachimS•5d ago•15 comments

Gemini 3 Pro Model Card [pdf]

https://storage.googleapis.com/deepmind-media/Model-Cards/Gemini-3-Pro-Model-Card.pdf
224•virgildotcodes•22h ago•318 comments

The code and open-source tools I used to produce a science fiction anthology

https://compellingsciencefiction.com/posts/the-code-and-open-source-tools-i-used-to-produce-a-sci...
146•mojoe•17h ago•19 comments

Cloudflare Global Network experiencing issues

https://www.cloudflarestatus.com/incidents/8gmgl950y3h7
2367•imdsm•21h ago•1608 comments

Strace-macOS: A clone of the strace command for macOS

https://github.com/Mic92/strace-macos
39•signa11•8h ago•4 comments

OrthoRoute – GPU-accelerated autorouting for KiCad

https://bbenchoff.github.io/pages/OrthoRoute.html
166•wanderingjew•14h ago•18 comments

I am stepping down as the CEO of Mastodon

https://blog.joinmastodon.org/2025/11/my-next-chapter-with-mastodon/
469•Tomte•15h ago•314 comments

A down detector for down detector's down detector

https://downdetectorsdowndetectorsdowndetector.com/
114•SeanAnderson•2h ago•33 comments

Google boss says AI investment boom has 'elements of irrationality'

https://www.bbc.com/news/articles/cwy7vrd8k4eo
241•jillesvangurp•1d ago•450 comments

I just want working RCS messaging

https://wt.gd/i-just-want-my-rcs-messaging-to-work
101•joecool1029•7h ago•88 comments

Show HN: RowboatX – open-source Claude Code for everyday automations

https://github.com/rowboatlabs/rowboat
85•segmenta•14h ago•18 comments

Solving a million-step LLM task with zero errors

https://arxiv.org/abs/2511.09030
178•Anon84•17h ago•54 comments

GitHub: Git operation failures

https://www.githubstatus.com/incidents/5q7nmlxz30sk
363•wilhelmklopp•12h ago•296 comments

What I learned about creativity from a man painting on a treadmill (2024)

https://quinnmaclay.com/texts/lets-paint
49•8organicbits•4d ago•14 comments

Rebecca Heineman – from homelessness to porting Doom (2022)

https://corecursive.com/doomed-to-fail-with-burger-becky/
202•birdculture•10h ago•30 comments

Bild AI (YC W25) is hiring – Make housing affordable

https://www.ycombinator.com/companies/bild-ai/jobs/m2ilR5L-founding-engineer-applied-ai
1•rooppal•11h ago

Short Little Difficult Books

https://countercraft.substack.com/p/short-little-difficult-books
178•crescit_eundo•19h ago•95 comments