
Show HN: I gave 11 LLMs my trading strategy to see which one profits

https://daytradingbench.com
1•porttipasi•1h ago

Comments

porttipasi•1h ago
I've been day trading indices (DAX, Nasdaq) profitably throughout 2025 using a specific personal strategy. At some point I got curious: if I gave this exact strategy to different LLMs as a prompt, which one would execute it best?

I couldn't find a benchmark that tested this. The academic ones focus on stock portfolios with daily rebalancing. Nothing tested LLMs on fast-paced index day trading where you need to read price action and make quick directional calls based on a defined strategy.

So I built DayTradingBench. The core idea is simple: every model receives the exact same prompt containing my trading strategy and the exact same live market data. The only variable is the model itself. This way I'm measuring pure decision-making capability — not prompt engineering.

How it works:

- 11 LLMs trade autonomously during live market hours (08:00–21:00 UTC, Mon–Fri)
- Every 15 minutes each model gets a market snapshot and must return a structured JSON decision (LONG, SHORT, or HOLD) with stop loss, take profit, confidence, and risk percentage
- All start with the same $100k virtual balance
- Positions auto-close on SL/TP hits (checked every 10 seconds) or at session end
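To make the decision format concrete, here's a rough sketch of how one model reply could be validated. The field names and ranges are my illustration of the idea, not DayTradingBench's actual schema:

```python
import json

# Illustrative decision schema: action plus risk parameters.
REQUIRED_FIELDS = {"action", "stop_loss", "take_profit", "confidence", "risk_pct"}
VALID_ACTIONS = {"LONG", "SHORT", "HOLD"}

def parse_decision(raw: str) -> dict:
    """Parse one model reply and reject anything outside the schema."""
    decision = json.loads(raw)
    missing = REQUIRED_FIELDS - decision.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if decision["action"] not in VALID_ACTIONS:
        raise ValueError(f"invalid action: {decision['action']}")
    if not 0.0 <= decision["confidence"] <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    return decision

reply = ('{"action": "LONG", "stop_loss": 23900.0, "take_profit": 24150.0, '
         '"confidence": 0.7, "risk_pct": 1.0}')
print(parse_decision(reply)["action"])  # LONG
```

Forcing a strict structure like this is what makes the runs comparable: a model that can't produce valid JSON simply holds that round.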

There are two input modes: text mode (structured OHLCV data) and vision mode (candlestick chart images sent to the model). Same strategy prompt, different way of presenting the market data. This lets me compare whether models trade better reading numbers or reading charts.
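For a sense of what text mode input could look like, here's an illustrative way to render OHLCV candles as a plain-text snapshot. The column layout and field names are assumptions, not the benchmark's exact format; vision mode would instead attach a rendered candlestick chart image:

```python
# Illustrative "text mode" snapshot: recent OHLCV candles as aligned columns
# that go straight into the model prompt.
def ohlcv_to_text(candles):
    header = f"{'time':<6} {'open':>9} {'high':>9} {'low':>9} {'close':>9} {'vol':>6}"
    rows = [
        f"{c['t']:<6} {c['o']:>9} {c['h']:>9} {c['l']:>9} {c['c']:>9} {c['v']:>6}"
        for c in candles
    ]
    return "\n".join([header] + rows)

candles = [
    {"t": "09:00", "o": 23950.0, "h": 23990.5, "l": 23940.0, "c": 23980.5, "v": 1200},
    {"t": "09:15", "o": 23980.5, "h": 24010.0, "l": 23975.0, "c": 24002.0, "v": 980},
]
print(ohlcv_to_text(candles))
```

Same underlying data either way; the comparison is purely about which representation the model reads better.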

The performance gap between models is much larger than I expected, even though they all receive identical instructions.

Built this as a solo dev. The system runs fully autonomously 24/5 — I mostly just watch the results now.

Would love to hear what HN thinks, especially if you've experimented with LLMs for trading.

smeeth•1h ago
Just reading your description, it sounds like there are two variables:

1. Prompt adherence: how well the models follow your stated strategy

2. Decision quality: how well models do on judgment calls that aren’t explicitly in the strategy

Candidly, since you haven’t shared the strategy, there’s no way for me to evaluate either (1) or (2). A model’s performance could be coming from the quality of your strategy, the model itself, or an interaction between the two, and I can’t disentangle that from what you’ve provided.

So as presented, the benchmark is basically useless to me for evaluating models (not because it’s pointless overall, but because I can’t tell what it’s actually measuring without seeing the strategy).

porttipasi•1h ago
That's a fair point. You're right that without seeing the strategy, you can't fully disentangle what drives the differences.

But the strategy itself isn't really the point. Since every model gets the exact same prompt and the exact same market data, the only variable is the model. So relative performance differences are real regardless of what the strategy contains. If Model A consistently outperforms Model B under identical conditions, that tells you something meaningful about the model.

And honestly, that blend of prompt adherence and decision quality is how people actually use LLMs in practice. You give it instructions and context, and you care about the result.

You're right though that the strategy being private limits what outsiders can evaluate. It's something I'm thinking about.

smeeth•1h ago
> Model A consistently outperforms Model B under identical conditions, that tells you something meaningful about the model.

Not really! Sorry to harp on this, but there are two ways one model could outperform another:

1) It adheres to your strategy better

2) It improvises

If the prompt was "maximize money, here's inspiration," improvising is fine. If the prompt was "implement the strategy," improvising is failure.

Right now you have a leaderboard; you don’t yet have a benchmark, because you can’t tell whether high P&L reflects correctness.

porttipasi•40m ago
To be more specific: the prompt defines a trading philosophy and tells models what to look for in the charts. But the actual read and the decision is entirely on the model. Using your framing — it's closer to "here's inspiration, now maximize money" than "implement this exact strategy." Which means improvisation within that framework is exactly what's being measured.

But yeah, it's closer to a leaderboard right now.

DivingForGold•1h ago
Get rid of your nag screen for one

porttipasi•1h ago
fair point, I'll look into showing it only to EU visitors.