Are there reasons to go with Zod over ArkType?
I recently put together a deck for this after some investigation: https://docs.google.com/presentation/d/1fToIKvR7dyvQS1AAtp4Y...
The zod schema becomes the source of truth.
In addition to using OpenAPI, I generate TS interfaces from my data classes in a Gradle task.
Most Go, Java, and Python APIs are practically all Swagger based.
There is the old TypeScript type emitter, reflect-metadata, which would provide some type information at runtime, but it targets very old decorator and metadata specifications, not the current version. I don't super know how accurate or complete its model really is, how closely it writes down what TypeScript knows versus how much it defines its own model to export to. https://www.npmjs.com/package/reflect-metadata
We are maybe at the cusp of some unifying, some coming together, albeit not via typescript at this time, still as a separate layer. The Standard Schema project has support from I dare say most of the top validation libraries. But extending this to API definitions, ORM tools is in extremely early stages. https://github.com/standard-schema/standard-schema?tab=readm...
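To make the Standard Schema idea a bit more concrete, here is a minimal sketch of a hand-written validator that exposes the shared `~standard` property, plus a consumer that relies on nothing else. The interface is a simplified local copy written for illustration, not an import from the real package, and `nonEmptyString` and `validateSync` are hypothetical names.

```typescript
// Simplified local copy of the Standard Schema v1 shape (an assumption for
// illustration; the spec is the authoritative definition).
type Issue = { message: string };
type Result<T> = { value: T; issues?: undefined } | { issues: ReadonlyArray<Issue> };

interface StandardSchemaLike<T> {
  readonly "~standard": {
    readonly version: 1;
    readonly vendor: string;
    readonly validate: (value: unknown) => Result<T> | Promise<Result<T>>;
  };
}

// A hand-rolled schema; any library exposing the same property would fit.
const nonEmptyString: StandardSchemaLike<string> = {
  "~standard": {
    version: 1,
    vendor: "example",
    validate: (value) =>
      typeof value === "string" && value.length > 0
        ? { value }
        : { issues: [{ message: "expected a non-empty string" }] },
  },
};

// Library-agnostic consumer: it only touches the shared property.
function validateSync<T>(schema: StandardSchemaLike<T>, input: unknown): T {
  const result = schema["~standard"].validate(input);
  if (result instanceof Promise) {
    throw new TypeError("async validation is not handled in this sketch");
  }
  if (result.issues) {
    throw new Error(result.issues.map((issue) => issue.message).join("; "));
  }
  return result.value;
}
```

The point is that `validateSync` never needs to know which validation library produced the schema.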
The only time status quo is fine is when it's for clients and employers who just want stuff done and don't care how.
In that context, all these unfortunate layers of complexity at least mean more billable hours.
For better or worse, the web only adds; it doesn't change or remove APIs.
Many of the same people who complain about how complicated modern web dev is would also shudder at the suggestion to just replace all those things with TypeScript (and Zod, if that's your TS schema definition and validation library of choice).
const cached = new Map<string, Function>()
export function as<T>(o: any, path: string, as: (o: any) => T | undefined) {
  try {
    let fn = cached.get(path)
    if (!fn) cached.set(path, fn = new Function('o', `return o.${path}`))
    const v = fn(o)
    return as(v) as T | undefined
  }
  catch (e) {
    return undefined
  }
}
as.number = (o: any) => (typeof o === 'number' ? o : undefined)
as.string = (o: any) => (typeof o === 'string' ? o : undefined)
as.boolean = (o: any) => (typeof o === 'boolean' ? o : undefined)
as.numbers = (len = 0) => (o: any) => (o instanceof Array && o.length >= len && o.every(c => typeof c === 'number') ? o : undefined)
as.strings = (len = 0) => (o: any) => (o instanceof Array && o.length >= len && o.every(c => typeof c === 'string') ? o : undefined)
const size = as(usrConfig, 'sys.size', as.numbers(2))
const fontpath = as(usrConfig, 'sys.font', as.string)
And yes I'm aware this doesn't have a huge API surface. That's the whole point. If I already have a JSON object, I can reach into it and get either what I ask for or nothing. In many real world cases, this is enough.
let someArray = [1, 2, , 4];
let numbers = as.numbers()(someArray);
console.log(numbers === someArray); // => true, because every() skips the hole
for (let number of numbers) {
  // This should be safe because I know everything in the array is a number, right?
  console.log(number.toFixed(2)); // => TypeError on the hole, which iterates as undefined
}
I mean, it's probably fine if you're only ever getting your data from JSON.parse(). But I would hesitate to use this in production.
1. Validate your Buffer/Uint8Array for valid size/encoding/etc first
2. Parse it to an object via JSON.parse or whatever your use-case needs
3. Reach into it with this function to get data if it matches the type you need
This code only deals with #3 and makes a few assumptions about #2 (e.g. no undefined, typical JSON-like object, etc).
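For what it's worth, the three steps can be sketched without `new Function`, so the path is treated as data instead of being evaluated as code. This is only a rough illustration under my own assumptions: `reach`, `asNumbers`, and `parseConfig` are hypothetical names, and the size limit stands in for whatever step 1 really needs.

```typescript
// Step 3 helper: walk a dotted path one key at a time.
function reach(root: unknown, path: string): unknown {
  let current: unknown = root;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Array check that also rejects holes: Array.from turns holes into
// `undefined` entries, which `every` then visits.
function asNumbers(value: unknown, minLength = 0): number[] | undefined {
  if (!Array.isArray(value) || value.length < minLength) return undefined;
  return Array.from(value).every((item) => typeof item === "number")
    ? (value as number[])
    : undefined;
}

// Steps 1 and 2, reduced to a length limit and JSON.parse for illustration.
function parseConfig(raw: string, maxLength = 10_000): unknown {
  if (raw.length > maxLength) return undefined;
  try {
    return JSON.parse(raw);
  } catch {
    return undefined;
  }
}

const userConfig = parseConfig('{"sys": {"size": [800, 600], "font": 42}}');
const windowSize = asNumbers(reach(userConfig, "sys.size"), 2); // [800, 600]
const fontValue = reach(userConfig, "sys.font"); // 42, which a string check would reject
```

It still returns either what you asked for or `undefined`, so the tiny API surface is preserved.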
So 90% of use cases?
https://ajv.js.org is one such JSON Schema library. How does zod compare to this?
Thanks, this was the missing piece for me. I'd been thinking about using Zod for an old client's Node codebase, but I'd only need it to validate the shape of json objects received by http endpoints. Zod was looking like overkill, but I know it's popular so I wasn't sure what I was missing.
One key difference is preprocessing/refine. With Zod, you can provide a callback before running validation, which is super useful and can't be represented in JSON. This comes in handy more often than you'd think, e.g. converting MM/DD/YYYY to DD/MM/YYYY before validating as a date.
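A rough sketch of the preprocess-then-validate idea in plain TypeScript, using the date example. This is not Zod's API; `Parser`, `preprocess`, `usToDayFirst`, and `dayFirstDate` are hypothetical names used only to show why a callback step can't live in a JSON document.

```typescript
type Parser<T> = (input: unknown) => T | undefined;

// Run `transform` first, then hand its output to `validate`.
function preprocess<T>(transform: (input: unknown) => unknown, validate: Parser<T>): Parser<T> {
  return (input) => validate(transform(input));
}

// Rewrite "MM/DD/YYYY" as "DD/MM/YYYY"; leave anything else untouched.
function usToDayFirst(input: unknown): unknown {
  if (typeof input !== "string") return input;
  const match = /^(\d{2})\/(\d{2})\/(\d{4})$/.exec(input);
  return match ? `${match[2]}/${match[1]}/${match[3]}` : input;
}

// Accept only real calendar dates written as DD/MM/YYYY.
const dayFirstDate: Parser<Date> = (input) => {
  if (typeof input !== "string") return undefined;
  const match = /^(\d{2})\/(\d{2})\/(\d{4})$/.exec(input);
  if (!match) return undefined;
  const day = Number(match[1]);
  const month = Number(match[2]);
  const year = Number(match[3]);
  const date = new Date(Date.UTC(year, month - 1, day));
  // Date.UTC rolls invalid parts over, so compare against the inputs.
  const valid =
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day;
  return valid ? date : undefined;
};

// Accepts US-style input by converting it before validation.
const usDate = preprocess(usToDayFirst, dayFirstDate);
```

The transform is an ordinary function, which is exactly the part a JSON Schema file has no way to express.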
And you can use Ajv or its own schema validation system (which is much faster, but only sync, whereas Ajv has asynchronous validation options in case you want to check against a database or something).
Say I have a type returned by the server that might have more sophisticated types than the server API can represent. For instance, api/:postid/author returns a User, but it could either be a normal User or an anonymous User, in which case fields like `username`, `location`, etc come back null. So in this case I might want to use a discriminated union to represent my User object. And other objects coming back from other endpoints might also need some type alterations done to them as well. For instance, a User might sometimes have Post[] on them, and if the Post is from a moderator, it might have special attributes, etc - another discriminated union.
In the past, I've written functions like normalizeUser() and normalizePost() to solve this, but this quickly becomes really messy. Since different endpoints return different subsets of the User/Post model, I would end up writing like 5 different versions of normalizePost for each endpoint, which seems like a mess.
How do people solve this problem?
E.g.
if (user.user_type === 'authenticated') {
// do something with user.name because the type system knows we have that now
}
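Spelled out a little more, a sketch of that narrowing might look like the following. The field names other than `user_type` and `name` are guesses based on the question, and `toUser` is a hypothetical helper that converts the nullable wire format into the union once, at the boundary.

```typescript
type AuthenticatedUser = { user_type: "authenticated"; name: string; location: string };
type AnonymousUser = { user_type: "anonymous" };
type User = AuthenticatedUser | AnonymousUser;

function displayName(user: User): string {
  if (user.user_type === "authenticated") {
    return user.name; // narrowed to AuthenticatedUser here
  }
  return "Anonymous";
}

// Assumed wire format: anonymous users come back with null fields.
type WireUser = { username: string | null; location: string | null };

function toUser(wire: WireUser): User {
  return wire.username !== null && wire.location !== null
    ? { user_type: "authenticated", name: wire.username, location: wire.location }
    : { user_type: "anonymous" };
}
```

After `toUser`, the rest of the code never has to check for null again; it only checks the tag.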
Without fullstack TS this could look something like: (for a Python backend) Pydantic models+union for the various shapes of `User`, and then OpenAPI/GraphQL schema generation+codegen for the TS client.
Let's say I have api/profile (which has `.posts`) and api/user-directory (which does not). I define User as a discriminated union - one has type `user-with-posts` and one has type `user-no-posts`. OK, good so far. But now say I have a photo gallery, which returns photos on user. Now I have to add that into my discriminated union type, e.g. `user-with-posts-and-photos`, `user-with-posts-but-not-photos`, `user-with-no-posts-but-yes-photos`... and it gets worse if I also have another page with DMs...
Typescript playground: https://www.typescriptlang.org/play/?#code/C4TwDgpgBACg9gZ2F...
So {user: {...}, photos: [...]}, not {user: {..., photos: [...]}}.
Alternatively define separate schemas for each endpoint that extend the base User schema. But I'd prefer having the same structure everywhere as much as possible.
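Both options can be sketched with plain types. The field names here are made up for illustration; the idea is that each endpoint names only the pieces it returns, so there is no `user-with-posts-but-not-photos` style explosion.

```typescript
type User = { id: string; username: string };
type Post = { id: string; content: string };
type Photo = { id: string; url: string };

// Option 1: related data sits beside the user, not inside it.
type ProfileResponse = { user: User; posts: Post[] };
type DirectoryResponse = { user: User };
type GalleryResponse = { user: User; photos: Photo[] };

// Option 2: per-endpoint types that extend the base User.
type UserWithPosts = User & { posts: Post[] };

// Code that only needs the user accepts any of the responses.
function greet(response: { user: User }): string {
  return `Hello, ${response.user.username}`;
}

const gallery: GalleryResponse = {
  user: { id: "u1", username: "ada" },
  photos: [{ id: "p1", url: "https://example.com/p1.jpg" }],
};
const greeting = greet(gallery); // "Hello, ada"
```

Adding a DMs page then means adding one more response type, not doubling the number of union members.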
So on the /posts page the client asks for `{ user: { id, posts: { id, content }[] } }`, and gets a generated properly-typed function for making the query.
const MyResult = z.discriminatedUnion("status", [
  z.object({ status: z.literal("success"), data: z.string() }),
  z.object({ status: z.literal("failed"), error: z.string() }),
]);
You can define passthrough behavior if there are a bunch of special attributes for a moderator but you don't want to list/check them all.
With different methods that have different schemas: if they share part of the same schema with alterations, you can define an object for the shared part and create objects that contain the shared object schema with additional fields.
If you have a lot of different possibilities, it will be messy, but it sounds like it already is for you, so validators could still at least validate the messiness.
Here is some pseudocode.
Person = { name: string, height: number }
Animal = {name: string, capability: string}
A = { post: object, methodType: string, person: Person }
ModeratorA = { post: object, moderatorField1: string, moderatorField2: string, person: Person }
UnionA = A | ModeratorA (There's probably a better way of defining A and ModeratorA to share the shared fields)
B = { post: object, animal: Animal }
endpoint person parses UnionA
endpoint animal parses B
You don't put all of your types in one big Union.
That's not surprising, "this endpoint will only return moderator user objects" is a bit of knowledge that has to be represented in code somehow.
TS libraries for GQL queries can dynamically infer the response shape from the shape of the selected fields.
Such that the query
query {
user {
username
posts {
text
}
}
}
Would be type Response = {
user: {
username: string
posts: Array<{
text: string
}>
}
}
And a query with just { user { username } } would have the posts property omitted entirely.
We only use Zod to validate forms, so I keep thinking "how does this matter?" Are people maybe using it to validate high throughput API input messages or something like that, where performance may matter more?
[1] https://www.reddit.com/r/typescript/comments/1i3ogwi/announc...
I've always been worried about how overly clever the approach is, does it have problems?
These announcements could be a valuable touchpoint for you to reach a whole new audience, but I can't remember a single one that starts with something like "exciting new release of NAME, the X that does Y for users of Z. Check out the project home page at https:// for more."
Quite often, the release announcement is a dead end that can't even take me to the project! In this case, the only link is a tiny octocat in the lower left-hand corner, AFAICS.
In my experience, large React projects often depend on a multitude of libraries, and when each one rolls out substantial changes—sometimes with barely any documentation—it can quickly become overwhelming. This is honestly one of my least favorite aspects of working with JavaScript. It just feels like a constant uphill battle to keep everything in sync and functioning smoothly.
Or just use an LLM.
I'm confident about this assessment because I maintain a large-ish piece of software and perennially have to decipher user reports of hallucinated LLM syntax for new features.
"It didn't solve my problems"
"You're the problem!"
He said users sent reports with hallucinated syntax, he wasn't even the one who used LLMs.
consumers uninterested in the 'mini' edition don't have to bother with that part.
but, the benefits of the 'mini' edition are so drastic for tree-shaking that it was driving development of alternatives - zod had to either do it (and benefit), or deprecate.
It's honestly been a nightmare, and I wish I had just built in Django instead. The Tailwind 3 -> 4 migration was probably among the most painful, which I was not expecting.
When you have something like SDL which is at its third version at 27 years old, I'm very doubtful about the culture of the NPM/JS world. A closer example is jQuery which is also in its third version at 18 years old.
It's a constantly improving and experimental domain - not just on the web but also the desktop and mobile environment.
Next.js is a good example since it (and its competitors) are a natural iteration on top of SPAs where you want the first request to the server to also inline the initial state of the SPA to remove the double-request problem.
But you pay a huge price trying to live on the edge like that, and when you run into issues, it doesn't make much sense to call that the state of web development since you could have used boring server tech or even a boring React SPA which also hasn't changed much in a decade.
npm is an absolute disaster of a dependency management system. Peer dependencies are so broken that they had to make v4 pretend it's v3.
Edit: Thinking about it, that's the origin story of JavaScript as well, so rather fitting.
They could at least also publish it as a major version without the legacy layer
EDIT: I've just seen the reason described here: https://github.com/colinhacks/zod/issues/4371 TLDR: He doesn't want to trigger a "version bump avalanche" across the ecosystem. (Which I believe, wouldn't happen as they could still backport fixes and support the v3 for a time, as they do it right now)
I'm not sure this is the right conclusion here. I think zod v4 is being included within v3 so consumers can migrate over incrementally. I.e., refactor all usages one by one to `import ... from 'zod/v4'`, and once that's done, upgrade to v4 entirely.
It seems like Zod is a library that provides schema to your (domain) objects in JS/TS projects, so if you're all-in with this library, it's probably a base-layer for a ton of stuff in the entire codebase.
Now imagine that the codebase is worked on by multiple parties, in different areas, and you're dealing with 100K-1M lines of code. Realistically, you can't coordinate such a huge change with just one "Migrated to Zod 4" commit and call it a day, and you most likely want to do it in pieces.
I'm not convinced publishing V4 as version 3 on npm is the best way of achieving this, but to make that process easier seems to be the goal of that as far as I understand.
One of the strange things from the NPM/JS culture is the focus to write everything in one language leading to everyone forgetting about modularization and principles like SOLID. We link everything together with strange mechanisms for "DX", which really means "make it easy to write the first version, maintenance be damned".
You don't get anything remotely like this in any other backend frontend combo.
In simple applications, you can get away with defining a single schema for all, but it helps with keeping in mind that you may have to design a better data shape for some logic.
If the teams working on the two are different, or even if the expertise level is uneven, something like a typed serialization library is a great boon.
At work, I maintain a Haskell-like programming language which spits out JSON representations of charts over OLAP queries. I’m the only one who knows the language extensively, but everyone is expected to do work with the JSON I push. If I serialize something incorrectly, or if someone else mistypes the frontend JSON response definition, we’re in for a world of pain.
Needless to say, I’ll be adding something like Zod to make that situation safer for everyone.
You don't even need that, I use typed serialization on both sides when talking to myself. How else do I guarantee the shape of what I send and receive? I want my codebase to scream at me if I ever mess it up.
You can import aliases.
If you have a code that needs to use two versions of axios, or zod, or whatever...
"zod4": "npm:zod@4.0.0"
Edit: reading the rationale, it's about peer dependencies rather than direct dependencies. I am still a little confused.
That said, you do see a cultural difference in node-land vs. many other ecosystems where library maintainers are much quicker to go for the new major version vs. iterating on the existing version and maintaining backward compatibility. I think that's what people are mostly really referring to when they complain about node/npm.
Webpack is a good example—what's it on now, version 5? If it was a Go project, it would probably still be on v1 or maybe v2. While the API arguably does get 'better' with each major version, in practice a lot of the changes are cosmetic and unnecessary, and that's what frustrates people I think. Very few people really care how nice the API of their bundler is. They just want it to work, and to receive security updates etc. Any time you spend on upgrading and converting APIs is basically a waste—it's not adding value for users of your software.
When it's time to go for a refactoring, the trade-off between costs and returns is worth it, as you can go for years between those huge refactors.
Ultimately you're right that npm doesn't work well to manage the situation Zod finds itself in. But Zod is subject to a bunch of constraints that virtually no other libraries are subject to. There are dozens or hundreds of libraries that directly import interfaces/classes from "zod" and use them in their own public-facing API.
Since these libraries are directly coupled to Zod, they would need to publish a new major version whenever Zod does. That's ultimately reasonable in isolation, but in Zod's case it would trigger a "version avalanche" that would just be painful for everyone involved. Selfishly, I suspect it would result in a huge swath of the ecosystem pinning on v3 forever.
The approach I ended up using is analogous to what Golang does. In essence a given package never publishes new breaking versions: they just add a new subpath when a new breaking release is made. In the TypeScript ecosystem, this means libraries can configure a single peer dependency on zod@^3.25.0 and support both versions simultaneously by importing what they need from "zod/v3" and "zod/v4". It provides a nice opt-in incremental upgrade path for end-users of Zod too.
Let's say a library is trying to implement an `acceptSchema` function that can accept `Zod3Type | Zod4Type`. For starters: those two interfaces won't both be available if Zod 3 and Zod 4 are in separate packages. So that's already a non-starter. And differentiating them at runtime requires knowing which package is installed, which is impossible in the general case (mostly because frontend bundlers generally have no affordance for optional peer dependencies).
I describe this in more detail here: https://x.com/colinhacks/status/1922101292849410256
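To illustrate the shape of such an `acceptSchema` function once both type families ship in one package, here is a self-contained sketch. Everything in it is a stand-in: `V3SchemaLike` and `V4SchemaLike` are hypothetical local interfaces rather than Zod's real types, and the marker-property check assumes Zod 4 schemas carry a `_zod` property that Zod 3 schemas lack, which is worth verifying against Zod's own guidance for library authors.

```typescript
// Hypothetical stand-ins for the two schema families. A real library would
// import the actual types from the "zod/v3" and "zod/v4" subpaths instead.
interface V3SchemaLike {
  _def: unknown;
  parse(input: unknown): unknown;
}
interface V4SchemaLike {
  _zod: unknown;
  parse(input: unknown): unknown;
}
type AnySchema = V3SchemaLike | V4SchemaLike;

// Assumption: a marker property distinguishes the generations at runtime.
function generationOf(schema: AnySchema): "v3" | "v4" {
  return "_zod" in schema ? "v4" : "v3";
}

// One entry point that accepts either generation and reports which it saw.
function acceptSchema(
  schema: AnySchema,
  input: unknown,
): { generation: "v3" | "v4"; value: unknown } {
  return { generation: generationOf(schema), value: schema.parse(input) };
}

// Fake schemas standing in for ones a user would pass in.
const fakeV3: V3SchemaLike = { _def: {}, parse: (input) => input };
const fakeV4: V4SchemaLike = { _zod: {}, parse: (input) => String(input) };
```

Because both interfaces are visible in the same build, the union type and the runtime branch can coexist, which is the part that separate packages would not allow.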
- Zero external dependencies
- Works in Node.js and all modern browsers
- Tiny: 2kb core bundle (gzipped)
Also, the 2kb bundle is just for the zod/v4-mini package; the full zod/v4 package is quite large.
Last I looked, the nice thing about TypeBox was that it _was_ JSON Schema, just typed, which was nice for interoperability.
I tested early versions of zod v4 and liked the new API, but was very concerned about what will be the migration path. I was even going to suggest to publish under a new package name.
But the author's approach is ingenious. It allows for someone like me to start adopting v4 immediately without waiting for every dependency to update.
Well done!
Regarding the versioning: I wrote a fairly detailed writeup here[0] for those who are interested in the reasons for this approach.
Ultimately npm is not designed to handle the situation Zod finds itself in. Zod is subject to a bunch of constraints that virtually no other libraries are subject to. Namely, there are dozens or hundreds of libraries that directly import interfaces/classes from Zod and use them in their own public-facing API.
Since these libraries are directly coupled to Zod, they would need to publish a new major version whenever Zod does. That's ultimately reasonable in isolation, but in Zod's case it would trigger a "version avalanche" that would just be painful for everyone involved. Selfishly, I suspect it would result in a huge swath of the ecosystem pinning on v3 forever.
The approach I ended up using is analogous to what Golang does. In essence a given package never publishes new breaking versions: they just add a new subpath when a new breaking release is made. In the TypeScript ecosystem, this means libraries can configure a single peer dependency on zod@^3.25.0 and support both versions simultaneously by importing what they need from "zod/v3" and "zod/v4". It provides a nice opt-in incremental upgrade path for end-users of Zod too.
Everyone old in Python ecosystem remembers the Python 2/3 migration madness.
As a convenience, and mostly to avoid typos in form names, I use my own version of https://github.com/raflymln/zod-key-parser. I've been surprised something like this hasn't been implemented directly in the library.
Curious if you think this is out of scope for Zod or just something you haven't gotten around to implementing?
(Here are discussions around it: https://github.com/colinhacks/zod/discussions/2134)
I like your library! Put in a PR to add it to the ecosystem page :)
To be clear: this isn't my library. This is just something I found while trying to solve the FormData issue. Props go to https://github.com/raflymln who created it.
Does this not sound insane?
---
I've been using the alpha versions of Zod for months, I just want to edit package.json and upgrade. But now I need to shotgun across git history instead.
Colin, I appreciate your project immensely. As a point of feedback, you made this ^ much harder than you had to. (Perhaps publish a 4.x along with the 3.x stuff?)
That being said, I fully understand the reasons for the somewhat odd versioning given your special situation, but still, I wish there were a 4.0.0 package for folks like us who simply don't need to worry or bother about zod version clashes in transitive dependencies because those don't exist (or at least I think so; npm ls zod only returns our own dependency on zod). If I understood correctly, we'll need to adapt the import to "zod/v4", which will be an incredibly noisy change and will probably cause quite a few headaches when IDEs auto-import from 'zod' and such, which we then need to catch with linting rules.
But that's probably a small gripe for what sounds overall like a very promising upgrade - many thanks for your work once again!
Is fixing .optional() in TS[0] part of the 9/10 top-issues fixed? This has been my biggest pain point with Zod... but still Zod is so good I still choose to just deal with it :) Thanks for an amazing part of the ecosystem.
Now, that's effectively what 3.25 is. But there are some problematic extras... the semantics of the semantic versioning is muddled. That means confusion, which means some people will spin their wheels after not initially grokking the situation correctly. Also, there seems to be an implication there could be a strong deprecation of v3 APIs coming. That is, you have to wonder how long the window for incremental migration will remain open.
To be frank, I wouldn't touch zod on any project I hoped will be around for a while. Predicting the future is an uncertain business, but we can look at the past. In a few years, I think it's reasonable to guess zod v4 will be getting the same treatment zod v3 is getting now.
Not that I think you ought to do anything different. I probably wouldn't want to maintain some old API I came up with years ago indefinitely either (not without a decent support contract, that is).
Is there a comparison guide? Never heard of this before, but I used io-ts and ajv.