JSON River – Parse JSON incrementally as it streams in

52•rickcarlino•5d ago

Comments

magicalhippo•5d ago

I wrote a more traditional JSON parser for my microcontooller project. You could iterate over elements and it would return "needs more data" if it was unable to proceed. You could then call it again after fetching more. Then just simple state machines to consume the objects.

The benefit with that was that you didn't need the memory to store the deserialized JSON object in memory.

This seems to be more oriented towards interactivity, which is an interesting use-case I hadn't thought about.

rickcarlino•5d ago

I found this because I am interested in streaming responses that populate a user interface quickly, or use spinners if it is loading still

alganet•1h ago

Interesting approach.

I would expect an object JSON stream to be more like a SAX parser though. It's familiar, fast and simple.

Any thougts on not chosing the SAX approach?

benatkin•1h ago

I think this is a lot like etree in python's streaming approach for XML, but with a simpler API, and incremental text parsing. With etree in python, you can access the incomplete tree data and not have to worry about events. So it's missing the SAX API part of a SAX approach, but is built like some real world libraries that use the SAX approach, which end up having a hybrid of events and trees.

alganet•1h ago

It seems to be convenient for some cases. A large object with many keys, for example.

I don't see it as particularly convenient if I want to stream a large array of small independent objects and read each one of them once, then discard it. The incremental parsed array would get bigger and bigger, eventually containing all the objects I wanted to discard. I would also need to move my array pointer to the last element at each increment.

jq and JSON.sh have similar incremental "mini-object-before-complete" approaches to parsing JSON. However, they do include some tools to shape those mini-objects (pruning, selecting, and so on). Also, they're tuned for pipes (new line is the event), which caters to shell and text-processing tools. I wonder what would be the analogue for that in a higher language.

benatkin•14m ago

This is more versatile than it seems at first glance. Under invariants, it shows that you have arrays/objects only being mutated, so you have stable references. You could use a WeakSet to observe new children of an item coming in. You also may not even need manage this directly - you could debounce and just re-render a UI component by returning a modified virtual DOM. Or if you had a visualization in d3, it would automatically notice which ones are new.

rictic•51m ago

SAX is often better if you don't need the full final result, especially if you can throw away most of the data after it's been processed. The nice part about this API is that you just get a DeepPartial<FinalResult> so the code to handle a partial result is basically the same as the code to handle the final result.

codesnik•1h ago

I can't imagine a usecase. Ok, you receive incremental updates, which could be useful, but how to find out that json object is actually received in full already?

philipallstar•1h ago

When its closing brace or square bracket appears.

EDIT: this is totally wrong and the question is right.

rising-sky•1h ago

Actually, not quite how this works. You always get valid JSON, as in this sequence from the readme:

```json {"name": "Al"} {"name": "Ale"} ```

So the braces are always closed

Supermancho•1h ago

When you want to pull multi-gig JSON files and not wait for the full file before processing is where I first used this.

rictic•53m ago

Funnily enough, this was one of the first users of jsonriver at google. A team needed to parse more JSON than most JS VMs will allow you to fit into a single string, so they had no choice but to use a streaming parser.

florians•1h ago

Noteworthy: Contributions by Claude

rictic•42m ago

Is true. I wrote a ton of tests, testing just about everything I can think of, including using a reverse parser I wrote to exhaustively generate the simplest 65k json values, ensuring that it succeeds with the same values and fails on the same cases as JSON.parse.

Then added benchmarks and started doing optimization, getting it ~10x faster than my initial naive implementation. Then I threw agents at it, and between Claude, Gemini, and Codex we were able to make it an additional 2x faster.

seanalltogether•1h ago

Maybe I'm wrong but it seems like you would only want to parse partial values for objects and arrays, but not strings or numbers. Objects and arrays can be unbounded so it makes sense to process what you can, when you can, whereas a string or number usually is not.

everforward•1h ago

It could be useful if you're doing something with the string that operates sequentially anyways (i.e. block-by-block AES, or SHA sums).

I _think_ the intended use of this is for people with bad internet connections so your UI can show data that's already been received without waiting for a full response. I.e. if their connection is 1KB/s and you send an 8KB JSON blob that's mostly a single text field, you can show them the first kilobyte after a second rather than waiting 8 seconds to get the whole blob.

At first I thought maybe it was for handling gigantic JSON blobs that you don't want to entirely load into memory, but the API looks like it still loads the whole thing into memory.

AaronFriel•1h ago

If you're generating long reports, code, etc. with an LLM, partial strings matter quite a lot for user experience.

rictic•1h ago

Numbers, booleans, and nulls are atomic with jsonriver, you get them all at once only when they're complete.

For my use case I wanted streaming parse of strings, I was rendering JSON produced by an LLM, for incrementally rendering a UI, and some of the strings were long enough (descriptions) that it was nice to see them render incrementally.

holdenc137•1h ago

I don't get it (and I'd call this cumulative not incremental)

Why not at least wait until the key is complete - what's the use in a partial key?

simonw•1h ago

If you're building a UI that renders output from a streaming LLM you might get back something which looks like this:

  {"role": "assistant", "text": "Here's that Python code you aske

Incomplete parsing with incomplete strings is still useful in order to render that to your end user while it's still streaming in.

cozzyd•1h ago

incomplete strings could be fun in certain cases

{"cleanup_cmd":"rm -rf /home/foo/.tmp" }

rictic•1h ago

Yeah, another fun one is string enums. Could tread "DeleteIfEmpty" as "Delete".

Waterluvian•38m ago

I imagine if you reason about incomplete strings as a sort of “unparsed data” where you might store or transport or render it raw (like a string version of printing response.data instead of response.json()), but not act on it (compare, concat, etc), it’s a reasonably safe model?

I’m imagining it in my mental model as being typed “unknown”. Anything that prevents accidental use as if it were a whole string… I imagine a more complex type with an “isComplete” flag of sorts would be more powerful but a bit of a blunderbuss.

trevor-e•4m ago

In this example the value is incomplete, not the key.

rictic•55m ago

Cumulative is a good term too. I come from the browser world where it's typically called incremental parsing, e.g. when web browsers parse and render HTML as it streams in over the wire. I was doing the same thing with JSON from LLMs.

simonw•1h ago

If anyone needs to do this in Python I've had success with both ijson and jiter - notes here: https://til.simonwillison.net/json/ijson-stream and https://simonwillison.net/2024/Sep/22/jiter/

syx•1h ago

For those wondering about the use case, this is very useful when enabling streaming for structured output in LLM responses, such as JSON responses. For my local Raspberry Pi agent I needed something performant, I've been using streaming-json-js [1], but development appears to have been a bit dormant over the past year. I'll definitely take a look at your jsonriver and see how it compares!

[1] https://github.com/karminski/streaming-json-js

cjonas•1h ago

Particularly for REACT style agents that use a "final" tool call to end the run.

rokkamokka•18m ago

For LLMs I recommend just doing NDJSON, that is, newline delimited json. It's much simpler to implement

rictic•7m ago

Do any LLMs support constrained generation of newline delimited json? Or have you found that they're generally reliable enough that you don't need to do constrained sampling?

AaronFriel•1h ago

Oh, this is quite similar to an online parser I'd written a few years ago[1]. I have some worked examples on how to use it with the now-standard Chat Completions API for LLMs to stream and filter structured outputs (aka JSON). This is the underlying technology for a "Copilot" or "AI" application I worked on in my last role.

Like yours, I'm sure, these incremental or online parser libraries are orders of magnitude faster[2] than alternatives for parsing LLM tool calls for the very simple reason that alternative approaches repeatedly parse the entire concatenated response, which requires buffering the entire payload, repeatedly allocating new objects, and for an N token response, you parse the first token N times! All of the "industry standard" approaches here are quadratic, which is going to scale quite poorly as LLMs generate larger and larger responses to meet application needs, and users want low latency outputs.

One of the most useful features of this approach is filtering LLM tool calls on the server and passing through a subset of the parse events to the client. This makes it relatively easy to put moderation, metadata capture, and other requirements in a single tool call, while still providing low latency streaming UI. It also avoids the problem with many moderation APIs where for cost or speed reasons, one might delegate to a smaller, cheaper model to generate output in a side-channel of the normal output stream. This not only doesn't scale, but it also means the more powerful model is unaware of these requirements, or you end up with a "flash of unapproved content" due to moderation delays, etc.

I found that it was extremely helpful to work at the level of parse events, but recognize that building partial values is also important, so I'm working on something similar in Rust[3], but taking a more holistic view and building more of an "AI SDK" akin to Vercel's, but written in Rust.

[1] https://github.com/aaronfriel/fn-stream

[2] https://github.com/vercel/ai/pull/1883

[3] https://github.com/aaronfriel/jsonmodem

(These are my own opinions, not those of my employer, etc. etc.)

zahlman•1h ago

> If you gave this to jsonriver one byte at a time it would yield this sequence of values:

Does it create a new value each time, or just mutate the existing one and keep yielding it?

rictic•58m ago

It mutates the existing value and yields it again (unless the toplevel value is a string, because strings are immutable in JS).

mattvr•1h ago

You could also use JSON Merge Patch (RFC 7396) for a similar use case.

(The downside of JSON Merge Patch is it doesn't support concatenating string values, so you must send a value like `{"msg": "Hello World"}` as one message, you can't join `{"msg": "Hello"}` with `{"msg": " World")`.)

[1] https://github.com/pierreinglebert/json-merge-patch

jauntywundrkind•33m ago

It's no longer active, but Oboe.js did great stuff for a decade+ in this field! It has some very nice APIs for consuming. https://github.com/jimhigson/oboe.js/

It's less about incrementally parsing objects, and more about picking paths and shapes out from a feed. If you're doing something like array/newline delimited json, it's a great tool for reading things out as they arrive. Also great for example for feed parsing.

rictic•29m ago

Hi HN! Didn't expect this to be on the front page today! I should really release all the optimizations that've been landing lately, the version on github is about twice as fast as what's released on npm.

I wrote it when I was doing prototyping on doing streaming rendering of UIs defined by JSON generated by LLMs. Using constrained generation you can essentially hand the model a JSON serializable type, and it will always give you back a value that obeys that type, but the big models are slow enough that incremental rendering makes a big difference in the UX.

I'm pretty proud of the testing that's gone into this project. It's fairly exhaustively tested. If you can find a value that it parses differently than JSON.parse, or a place where it disobeys the 5+1 invariants documented in the README I'd be impressed (and thankful!).

This API, where you get a series of partial values, is designed to be easy to render with any of the `UI = f(state)` libraries like React or Lit, though you may need to short circuit some memoization or early exiting since whenever possible jsonriver will mutate existing values rather than creating new ones.

carterschonwald•27m ago

Oh fun, I wrote a similar library in 2015 for Haskell. There is an annoying gotcha to deal with: there are sequences of valid characters that can be parsed incorrectly if you’re doing incremental chunks, namely if “0.0” is split across two input chunks you can get a token stream with two valid float literals rather than 1! Namely “0” and “.0”, which is just a really annoying wart of json float syntax.

rictic•21m ago

Yeah, getting numbers correct was one of the trickier wrinkles in the project. https://github.com/rictic/jsonriver/blob/5515be978bb564e9bdc...

tracnar•19m ago

Don't you need to wait for some kind of delimiter (like ",", "]", "}", newline, EOF) before parsing something else than a string?

yonatan8070•15m ago

An "off the top of my head" solution to this would be not to yield tokens until a terminating character (comma, \n, }).

NanoChat – The best ChatGPT that $100 can buy

Show HN: SQLite Online – 11 years of solo development, 11K daily users

Environment variables are a legacy mess: Let's dive deep into them

Spotlight on pdfly, the Swiss Army knife for PDF files

From Millions to Billions

More random home lab things I've recently learned

JSON River – Parse JSON incrementally as it streams in

Optery (YC W22) – Hiring Tech Lead with Node.js Experience (U.S. & Latin America)

American solar farms

The Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel 2025

Smartphones and being present

MPTCP for Linux

AI and the Future of American Politics

Control your Canon Camera wirelessly

CRDT and SQLite: Local-First Value Synchronization

A16Z-backed data firms Fivetran, dbt Labs to merge in all-stock deal

Ofcom fines 4chan £20K and counting for violating UK's Online Safety Act

Matrices can be your Friends

Android's sideloading limits are its most anti-consumer move yet

Putting a dumb weather station on the internet

Two Paths to Memory Safety: CHERI and OMA

LaTeXpOsEd: A Systematic Analysis of Information Leakage in Preprint Archives

Clockss: Digital preservation services run by academic publishers and libraries

Jeep software update bricks vehicles, leaves owners stranded

Ask HN: What are you working on? (October 2025)

Roger Dean – His legendary artwork in gaming history (Psygnosis)

Tauri binding for Python through Pyo3

Some graphene firms have reaped its potential but others are struggling

Making regular GPS ultra-precise

MicroPythonOS – An Android-like OS for microcontrollers

NanoChat – The best ChatGPT that $100 can buy

Show HN: SQLite Online – 11 years of solo development, 11K daily users

Environment variables are a legacy mess: Let's dive deep into them

Spotlight on pdfly, the Swiss Army knife for PDF files

From Millions to Billions

More random home lab things I've recently learned

JSON River – Parse JSON incrementally as it streams in

Optery (YC W22) – Hiring Tech Lead with Node.js Experience (U.S. & Latin America)

American solar farms

The Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel 2025

Smartphones and being present

MPTCP for Linux

AI and the Future of American Politics

Control your Canon Camera wirelessly

CRDT and SQLite: Local-First Value Synchronization

A16Z-backed data firms Fivetran, dbt Labs to merge in all-stock deal

Ofcom fines 4chan £20K and counting for violating UK's Online Safety Act

Matrices can be your Friends

Android's sideloading limits are its most anti-consumer move yet

Putting a dumb weather station on the internet

Two Paths to Memory Safety: CHERI and OMA

LaTeXpOsEd: A Systematic Analysis of Information Leakage in Preprint Archives

Clockss: Digital preservation services run by academic publishers and libraries

Jeep software update bricks vehicles, leaves owners stranded

Ask HN: What are you working on? (October 2025)

Roger Dean – His legendary artwork in gaming history (Psygnosis)

Tauri binding for Python through Pyo3

Some graphene firms have reaped its potential but others are struggling

Making regular GPS ultra-precise

MicroPythonOS – An Android-like OS for microcontrollers

JSON River – Parse JSON incrementally as it streams in

Comments