I’m Nazim, founder of Koinju.io, and I wanted to share an exploratory option we opened very recently: providing access to our database, which contains all our cryptocurrency market data, via SQL. Our REST API covers direct retrieval, but we're thinking more and more that SQL access for analytical work over a unified crypto market data layer could be worth something precisely because of LLMs.
This was partly triggered by a recent essay by Didier Lopes, CEO of OpenBB, on financial firms owning the infrastructure where financial work happens (https://www.linkedin.com/pulse/how-did-we-end-up-here-didier-rodrigues-lopes-hgeqe/ ), especially the runtime where workflows execute and AI inference happens.
Most data APIs were designed for software that already knows what it wants: call an endpoint, get JSON, parse it, compute somewhere else. That model worked great and still works great. But I'm not sure it maps well to LLM-driven workflows, especially with big data.
A language model can call APIs and read JSON, or write Python to do so (Claude Code can force JSON output). But that does not mean the model is efficient at ingesting, reshaping, joining, aggregating, validating, or reasoning over large structured datasets through tokenized rows. At small scale, everything fits within the context limit. At large scale, it becomes unwieldy, and small details can disappear silently, as if they were outliers...
So the thesis we are testing is: for big datasets, the AI-facing primitive should switch from "return JSON" to "execute a bounded, inspectable operation over the dataset", something you can plan, replay, and trace precisely. In that setup, the LLM takes on the role of a planner/controller. It should be able to inspect schemas, understand constraints, express an operation, check limits or even ASTs, run the computation through an execution layer, and then reason over a compact, typed result.
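To make that concrete, here is a rough sketch of what that loop could look like in plain SQL, assuming a Postgres-style engine; the trades table and its columns are made up for illustration, not our actual schema:

    -- 1. inspect the schema before planning anything
    SELECT column_name, data_type
    FROM information_schema.columns
    WHERE table_name = 'trades';

    -- 2. dry run / cost preview: estimate the work without executing it
    EXPLAIN
    SELECT symbol, count(*) AS trade_count
    FROM trades
    WHERE traded_at >= now() - interval '7 days'
    GROUP BY symbol;

    -- 3. execute the bounded operation, get back a compact, typed result
    SELECT symbol, count(*) AS trade_count
    FROM trades
    WHERE traded_at >= now() - interval '7 days'
    GROUP BY symbol
    ORDER BY trade_count DESC
    LIMIT 20;

The point is that each step is explicit, cheap to log, and checkable before anything heavy runs.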
So SQL is our current attempt at that layer.
This is really not new :-) and not magically "AI-native" either. But it is explicit, inspectable, composable, and executable close to the data. REST still makes sense for simple retrieval. But for analytical questions over large market datasets, JSON pagination feels like the wrong unit of work.
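A small example of what I mean by unit of work, using the same illustrative trades table as above: answering "what was the daily VWAP of BTC-USDT per exchange over the last year" via paginated JSON means shipping millions of raw trades to the model's side, while the equivalent query returns a few hundred aggregated rows:

    -- daily volume-weighted average price per exchange, computed where the data lives
    SELECT
        exchange,
        date_trunc('day', traded_at) AS day,
        sum(price * base_volume) / nullif(sum(base_volume), 0) AS vwap
    FROM trades
    WHERE symbol = 'BTC-USDT'
      AND traded_at >= now() - interval '1 year'
    GROUP BY exchange, day
    ORDER BY exchange, day;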
And there is also a governance question here: in the financial sector, many firms do not want their entire workflow to move into a vendor's black-box interface. That seems right. Internal context, permissions, model policy, audit logs, and decision workflows should probably live in the firm's environment, of course. But that does not necessarily mean every external dataset should be copied locally before any question can be asked.
Maybe the better boundary is:
- the firm owns the workflow and the inference runtime,
- the data provider exposes a controlled execution surface,
- the LLM issues bounded operations,
- the query engine performs the actual computation,
- the result comes back to the model.
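On the provider side, a minimal version of that "controlled execution surface", still assuming a Postgres-style engine with an illustrative market_data schema, could be little more than a read-only role with hard resource bounds (real deployments obviously need per-customer quotas and audit logging on top):

    -- read-only role scoped to the market data schema
    CREATE ROLE agent_readonly NOLOGIN;
    GRANT USAGE ON SCHEMA market_data TO agent_readonly;
    GRANT SELECT ON ALL TABLES IN SCHEMA market_data TO agent_readonly;

    -- bound what any single operation is allowed to cost
    ALTER ROLE agent_readonly SET statement_timeout = '30s';
    ALTER ROLE agent_readonly SET work_mem = '64MB';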
I'm interested in any feedback from people working on things like this: market data, quant research, analytics... The questions I'm trying to answer:
- What is the right interface today for an LLM working with big data?
- Should the model operate on raw data, JSON, schemas, SQL, typed tools, semantic layers, or something else?
- Where should the boundary be between the customer-owned runtime and provider-side data execution?
- How should query limits, cost previews, dry runs, permissions, and audit logs work when the caller might be an agent?
I’m not looking only for validation. If the answer is “don’t invent a new AI category; just provide clean data, stable schemas, SQL, docs, and predictable limits”, that would also be useful.