The problem: JSON wastes tokens. Curly braces, quotes, colons, commas - all eat into your context window.
ISON uses tabular patterns that LLMs already understand from training data:
JSON (87 tokens):

```json
{
  "users": [
    {"id": 1, "name": "Alice", "email": "alice@example.com"},
    {"id": 2, "name": "Bob", "email": "bob@example.com"}
  ]
}
```

ISON (34 tokens):

```
table.users
id:int name:string email
1 Alice alice@example.com
2 Bob bob@example.com
```
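To make the structure concrete, here is a minimal hand-rolled reader for the snippet above. This is a sketch based only on the example shown, not the ison-py API; it assumes whitespace-separated fields with no embedded spaces, and the real spec's quoting and escaping rules may differ.

```python
def parse_ison_table(text: str) -> dict:
    """Parse a single ISON table into {name: [row_dict, ...]}."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    name = lines[0].removeprefix("table.")          # "table.users" -> "users"
    headers, casts = [], []
    for col in lines[1].split():
        col_name, _, col_type = col.partition(":")  # "id:int" -> ("id", "int")
        headers.append(col_name)
        casts.append(int if col_type == "int" else str)
    rows = [
        {h: cast(v) for h, cast, v in zip(headers, casts, ln.split())}
        for ln in lines[2:]
    ]
    return {name: rows}

sample = """\
table.users
id:int name:string email
1 Alice alice@example.com
2 Bob bob@example.com
"""
print(parse_ison_table(sample))
# {'users': [{'id': 1, 'name': 'Alice', 'email': 'alice@example.com'},
#            {'id': 2, 'name': 'Bob', 'email': 'bob@example.com'}]}
```

The `:int` annotation is what lets a reader recover typed values that JSON would otherwise carry via literal syntax.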
Features:

- 30-70% token reduction
- Type annotations
- References between tables
- Schema validation (ISONantic)
- Streaming format (ISONL)
Implementations: Python, JavaScript, TypeScript, Rust, C++. 9 packages, 171+ tests passing.
```
pip install ison-py    # Parser
pip install isonantic  # Validation & schemas
```

```
npm install ison-parser   # JavaScript
npm install ison-ts       # TypeScript with full types
npm install isonantic-ts  # Validation & schemas
```

```toml
[dependencies]
ison-rs = "1.0"
isonantic-rs = "1.0"  # Validation & schemas
```
Looking for feedback on the format design.
dtagames•1mo ago
Any tokens you saved will be lost 3x over in that process, as well as introducing confusing new context information that's unrelated to your app.
maheshvaikri99•1mo ago
ISON isn't inventing new syntax. It's CSV/TSV with a header - which LLMs have seen billions of times. The table format:
```
table.users
id name email
1 Alice alice@example.com
```
...is structurally identical to markdown tables and CSVs that dominate training corpora.
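The CSV claim is easy to check: strip the `table.users` line and the body parses with Python's standard csv module as a space-delimited file (a sketch; real ISON may have its own escaping rules).

```python
import csv
import io

# Table body from above, minus the "table.users" name line.
body = "id name email\n1 Alice alice@example.com"

# DictReader treats the first row as the header, just like ISON does.
rows = list(csv.DictReader(io.StringIO(body), delimiter=" "))
print(rows)
# [{'id': '1', 'name': 'Alice', 'email': 'alice@example.com'}]
```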
On the "3x translation overhead" - ISON isn't meant for LLM-to-code interfaces where you need JSON for an API call. It's for context stuffing: RAG results, memory retrieval, multi-agent state passing.
If I'm injecting 50 user records into context for an LLM to reason over, I never convert back to JSON. The LLM reads ISON directly, reasons over it, and responds.
The benchmark: same data, same prompt, same task. ISON uses fewer tokens and gets equivalent accuracy. Happy to share the test cases if you want to verify.
dtagames•1mo ago
If your real data is in JSON (and in JS/TS apps it always is at runtime, since only JSON objects exist in that language), it makes no sense to ever convert it, period.
Besides, the corporate-report-style CSVs that appear in training materials don't have data shapes anything like JSON, or even like most business software. You're crippling an established and useful data carrier in order to save pennies on tokens. Tokens are getting cheaper, so it's the wrong optimization.
maheshvaikri99•1mo ago
ISON isn't meant to replace JSON in your application. Your JS/TS code still uses JSON objects internally. ISON is specifically for the LLM context window.
The flow: App (JSON) → serialize to ISON → inject into prompt → LLM reasons → response → your app
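The serialize-and-inject step in that flow can be sketched in a few lines. `to_ison` here is a hypothetical helper, not the ison-py API; it assumes flat records whose values contain no spaces, and it omits the `:int`-style type annotations for brevity.

```python
def to_ison(name: str, records: list[dict]) -> str:
    """Serialize a flat list of dicts into an ISON-style table (sketch)."""
    headers = list(records[0])
    lines = [f"table.{name}", " ".join(headers)]
    lines += [" ".join(str(r[h]) for h in headers) for r in records]
    return "\n".join(lines)

# In-app data stays as ordinary objects...
users = [
    {"id": 1, "name": "Alice", "email": "alice@example.com"},
    {"id": 2, "name": "Bob", "email": "bob@example.com"},
]

# ...and only the prompt gets the compact tabular form.
prompt = (
    "Here are the user records:\n\n"
    f"{to_ison('users', users)}\n\n"
    "Which user has id 2?"
)
print(prompt)
```

The app never round-trips the model's answer back through ISON; the format exists only on the way into the context window.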
You're right that nesting is lost. But for LLM reasoning, flat structures often work better. LLMs struggle with deeply nested JSON - they lose track of parent-child relationships 4+ levels deep.
On "tokens are getting cheaper": True for API costs. But context windows are still limited. When you're stuffing RAG results, memory, agent state, and user history into 128K tokens, every byte matters. It's not about saving money - it's about fitting more context.
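A rough illustration of the fitting-more-context point, using character counts as a crude proxy for tokens (exact ratios depend on the tokenizer, but the direction is the same because JSON repeats every key per record):

```python
import json

# 50 synthetic flat records, the kind of thing you'd stuff into context.
users = [
    {"id": i, "name": f"user{i}", "email": f"user{i}@example.com"}
    for i in range(50)
]

as_json = json.dumps({"users": users})

# ISON-style table: keys appear once in the header instead of once per record.
header = "id name email"
rows = "\n".join(f'{u["id"]} {u["name"]} {u["email"]}' for u in users)
as_ison = f"table.users\n{header}\n{rows}"

print(len(as_json), len(as_ison))  # the ISON side is markedly smaller
```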
On "wrong optimization": I ran the benchmark. Same data, same task. ISON: 88.3% accuracy. JSON: 84.7%. The LLM actually performed better with the tabular format, not just "equivalent for fewer tokens."
## BENCHMARK STATS

```
TOKEN EFFICIENCY:
  ISON: 3,550 tokens
  JSON: 12,668 tokens

LLM ACCURACY (300 questions):
  ISON: 265/300 (88.3%)
  JSON: 254/300 (84.7%)

EFFICIENCY (accuracy per 1K tokens):
  ISON: 24.88
  JSON: 6.68
```

ISON is 272.3% more efficient than JSON by that measure.
But I hear you - if your data is deeply nested and that nesting carries semantic meaning the LLM needs, JSON might be the right choice. ISON works best for relational/tabular data going into context.