Show HN: I built a zero-browser, pure-JS typesetting engine for bit-perfect PDFs

45•cosmiciron•18h ago

Hi HN, I'm a film director by trade, and I prefer writing my stories in plain text rather than using clunky screenplay software. Standard markup like Fountain doesn't work for me because I write in mixed languages, so I use Markdown with a custom syntax I invented to resemble standard screenplay structures.

This workflow is great until I need to actually generate an industry-standard screenplay PDF. I got tired of manually copying and pasting my text back into the clunky software just to export it, so I decided to write a script to automate the process. That's when I hit a wall.

I tried using React-pdf and other high-level libraries, but they failed me on two fronts: true multilingual text shaping, and complex contextual pagination. Specifically, the strict screenplay requirement to automatically inject (MORE) at the bottom of a page and (CONT'D) at the top of the next page when a character's dialogue is split across a page break.

You can't really do that elegantly when the layout engine is a black box. So, I bypassed them and built my own typesetting engine from scratch.

VMPrint is a deterministic, zero-browser layout VM written in pure TypeScript. It abandons the DOM entirely. It loads OpenType fonts, runs grapheme-accurate text segmentation (Intl.Segmenter), calculates interval-arithmetic spatial boundaries for text wrapping, and outputs a flat array of absolute coordinates.
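To make the grapheme-segmentation step concrete, here is a minimal sketch using the standard `Intl.Segmenter` API. The helper names `graphemes` and `wrapByGraphemes` are hypothetical and are not taken from VMPrint. The wrapping here counts clusters per line as a stand-in for real font metrics.

```typescript
// One shared segmenter; "grapheme" granularity yields user-perceived characters.
const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });

// Split text into grapheme clusters (hypothetical helper, not VMPrint's API).
function graphemes(text: string): string[] {
  return Array.from(segmenter.segment(text), (s) => s.segment);
}

// Naive wrap: break only between clusters, never inside one.
// A real engine would sum glyph advance widths instead of counting clusters.
function wrapByGraphemes(text: string, maxPerLine: number): string[] {
  const clusters = graphemes(text);
  const lines: string[] = [];
  for (let i = 0; i < clusters.length; i += maxPerLine) {
    lines.push(clusters.slice(i, i + maxPerLine).join(""));
  }
  return lines;
}

// "e" followed by U+0301 (combining acute) is two code points but one cluster,
// so a line break can never separate the accent from its base letter.
const sample = "e\u0301xy";
console.log(graphemes(sample).length, wrapByGraphemes(sample, 2));
```

Segmenting first means every later stage (measuring, wrapping, hyphenating) works on units that are safe to split between.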

Some stats:

Zero dependencies on Node.js APIs or the DOM (runs in Cloudflare Workers, Lambda, browser).

88 KiB core packed.

Performance: On a Snapdragon Elite ARM chip, the engine's "God Fixture" (8 pages of mixed CJK, Arabic RTL, drop caps, and multi-page spanning tables) completes layout and rendering in ~28ms.

The repo also includes draft2final, the CLI tool I built to convert Markdown into publication-grade PDFs (including the screenplay flavor) using this engine.

This is my first open-source launch. The manuscript is still waiting, but the engine shipped instead. I’d love to hear your thoughts, answer any questions about the math or the architecture, and see if anyone else finds this useful!

--- A note on AI usage: To be fully transparent about how this was built, I engineered the core concept (an all-flat, morphable box-based system inspired by game engines, applied to page layouts), the interval-arithmetic math, the grapheme segmentation, and the layout logic entirely by hand. I did use AI as a coding assistant at the functional level, but the overall software architecture, component structures, and APIs were meticulously designed by me.

For a little background: I’ve been a professional systems engineer since 1992. I’ve worked as a senior system architect for several Fortune 500 companies and currently serve as Chief Scientist at a major telecom infrastructure provider. I also created one of the world's first real-time video encoding technologies for low-power mobile phones (in the pre-smartphone era). I'm no stranger to deep tech, and a deterministic layout VM is exactly the kind of strict, math-heavy system that simply cannot be effectively constructed with a few lines of AI prompts.

Comments

flexagoon•2h ago

Looks interesting, but the "Why Not Just Use" section in the readme is definitely missing Typst. Would be interesting to know how they compare, since Typst is the obvious choice for typesetting nowadays, rather than LaTeX.

cosmiciron•1h ago

That is a very fair point, and I will absolutely add Typst to the "Why not just use..." section of the README! Typst is a phenomenal piece of software, but it operates in a very different architectural space than VMPrint.

The core differences come down to the runtime environment and the integration paradigm:

1. Edge-Native vs. WASM: Typst is written in Rust. To run it in a serverless environment, you have to ship and instantiate a WebAssembly binary. In strict Edge environments (like Cloudflare Workers or Vercel Edge) where bundle sizes and cold starts are heavily penalized, WASM can be a bottleneck. VMPrint is an 88 KiB pure-JS engine. It drops natively into any V8/Edge runtime with zero WASM bridge, allowing you to synchronously generate and stream deterministic PDFs directly from the edge in milliseconds.

2. Programmatic AST vs. Custom Markup: Typst is a markup compiler—you write in its .typ language. VMPrint is a lower-level layout VM. It doesn't parse markup; it consumes a flat JSON instruction stream (an AST) and spits out absolute X/Y coordinates. It's designed for developers who want to build their own custom document generators programmatically, rather than writing a document by hand.

3. Mid-Flight Pagination Control: My specific pain point was screenplays. I needed strict contextual rules, like automatically injecting (MORE) at the bottom of a page and (CONT'D) at the top of the next if a dialogue block splits across a page break. Achieving that kind of hyper-specific programmatic intervention is tough in a closed compiler. With VMPrint's two-stage pipeline, you have absolute access to the layout tree in native JS to manipulate it before it ever hits the renderer.

In short: if you want a beautiful, incredibly fast LaTeX replacement to write documents, use Typst. If you are a JS developer who needs to build a custom document pipeline and wants a lightweight, native layout VM to run the heavy math at the edge, that's VMPrint.
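The (MORE)/(CONT'D) rule from point 3 can be sketched as a small pure function. This is an illustration under assumptions, not VMPrint's actual pipeline: the `DialogueBlock` and `PageFragment` types, the `splitDialogue` name, and the line-counting model (one line for the character cue, one for the `(MORE)` marker) are all hypothetical.

```typescript
interface DialogueBlock { character: string; lines: string[]; }
interface PageFragment { cue: string; lines: string[]; }

// Split a dialogue block across pages, adding (MORE) and (CONT'D) markers.
// linesLeftOnPage: free lines on the current page; linesPerPage: lines on a fresh page.
function splitDialogue(
  block: DialogueBlock,
  linesLeftOnPage: number,
  linesPerPage: number,
): PageFragment[] {
  const fragments: PageFragment[] = [];
  let remaining = block.lines.slice();
  let capacity = linesLeftOnPage;
  let first = true;

  while (remaining.length > 0) {
    const cue = first ? block.character : `${block.character} (CONT'D)`;
    const bodyRoom = capacity - 1; // one line is spent on the cue
    if (remaining.length <= bodyRoom) {
      fragments.push({ cue, lines: remaining });
      break;
    }
    const take = bodyRoom - 1; // reserve one more line for (MORE)
    if (take < 1) {
      // No room for cue + one line + (MORE): push the whole block to a fresh page.
      if (capacity === linesPerPage) throw new Error("page too short for dialogue");
      capacity = linesPerPage;
      continue;
    }
    fragments.push({ cue, lines: [...remaining.slice(0, take), "(MORE)"] });
    remaining = remaining.slice(take);
    capacity = linesPerPage;
    first = false;
  }
  return fragments;
}

const parts = splitDialogue(
  { character: "ANNA", lines: ["l1", "l2", "l3", "l4", "l5"] },
  4,
  10,
);
console.log(parts);
```

The point of the sketch is that the markers depend on where the break lands, so the rule has to run with knowledge of the remaining page space rather than as a preprocessing step on the text.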

koterpillar•2h ago

Are Unicode combining characters (dotted circles) visible on the screenshot by design?

cosmiciron•1h ago

Incredible eye. You are absolutely right, and that is actually an artifact of the engine's current architecture!

That screenshot includes the Hindi word 'देवनागरी' (Devanagari) and some Arabic text with diacritics. Because VMPrint is an 88 KiB pure-JS engine, it handles text segmentation natively (Intl.Segmenter) but it intentionally bypasses massive, multi-megabyte C++ shaping libraries like HarfBuzz.

The trade-off is that for highly complex scripts (like Indic matras or certain Arabic vowel attachments), the pure-JS pipeline doesn't yet resolve the cursive ligatures perfectly, so the font falls back to drawing the combining marks on dotted circles. It mathematically calculates the bounding boxes correctly, but the visual glyph substitution isn't fused. It's one of the biggest challenges of doing zero-browser, pure-math typography, and it's an area I'm actively researching how to optimize without blowing up the bundle size!

speajus•1h ago

Oh man, I just wrote one of these browserless Markdown-to-PDF tools a few days ago: [https://github.com/speajus/markdown-to-pdf.git](https://speajus.github.io/markdown-to-pdf). Thanks for publishing. I didn't need anything this exacting. Anyway, nice work; excited to look deeper.

luaybs•1h ago

Every single screenshot of Arabic in the README is malformed, the letters are squished together and not connected.

cosmiciron•36m ago

Good eye, you are right. A few others noticed this as well. It is a known trade-off, given that the engine is pure JS and only 80K. The project is still at a very early stage, and I'm definitely keeping my eyes open for solutions.

codegladiator•1h ago

The Devanagari in the screenshot is rendered incorrectly.

Also, can you share the names of some films you have worked on as a director?

cosmiciron•40m ago

You are absolutely right about the Devanagari. That is a known trade-off at the moment. Because the core engine is strictly constrained to 88 KiB of pure JavaScript, it intentionally bypasses massive C++ shaping libraries like HarfBuzz. I haven't yet found a way to process complex text layout (CTL) and fuse those ligatures purely in JS without completely blowing up the bundle size. It's a very early implementation, but finding a micro-footprint solution for that is on the ROADMAP!

Though, to be fair, for my original need—generating industry-standard screenplays from Markdown—the engine is already total overkill. LOL.

As for the film: my feature premiered in January 2024 in China under the title 《天降大任》. It was originally developed in Los Angeles as an English-language project called Chosen. I actually put down my programmer's hat and worked on that film for over ten years!

raphlinus•1h ago

Unfortunately, your complex script shaping for Arabic and Devanagari is wrong. The Arabic is missing the joining (all forms are isolated), and the Devanagari doesn't have the vowels combining (so you see those dotted circles).

To fix this you'll need HarfBuzz or something similar. Taking a quick look at the code, it seems like you're just doing a glyph at a time through the cmap. That, uh, won't do.

cosmiciron•51m ago

You are completely right on all fronts. Thank you for taking a look at the code!

You hit the exact architectural bottleneck. Right now, the engine uses Intl.Segmenter to find the grapheme boundaries, but then it just does a direct cmap lookup to get the advance widths. It currently lacks a parser for the OpenType GSUB (Glyph Substitution) and GPOS (Glyph Positioning) tables, which is why Arabic defaults to isolated forms and Indic matras don't fuse.
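To illustrate why a per-character cmap lookup leaves Arabic in isolated forms, here is a simplified sketch of the contextual analysis a shaper performs before applying GSUB substitutions. It is not VMPrint code: the `joiningForms` function and the tiny joining table are assumptions for illustration, covering only a few letters and ignoring diacritics, ligatures such as lam-alef, and the font's actual GSUB lookups.

```typescript
// D = dual-joining (joins on both sides); R = right-joining (joins only to the preceding letter).
type JoiningType = "D" | "R";
// OpenType feature tags for the four positional forms.
type Form = "isol" | "init" | "medi" | "fina";

// Tiny hand-written subset of Unicode joining types (illustrative only).
const JOINING = new Map<string, JoiningType>([
  ["\u0628", "D"], // beh
  ["\u062A", "D"], // teh
  ["\u0643", "D"], // kaf
  ["\u0644", "D"], // lam
  ["\u0645", "D"], // meem
  ["\u0627", "R"], // alef
  ["\u062F", "R"], // dal
  ["\u0631", "R"], // reh
  ["\u0648", "R"], // waw
]);

// For each letter (in logical order), pick the positional form from its neighbours.
function joiningForms(word: string): Form[] {
  const letters = Array.from(word);
  return letters.map((ch, i) => {
    const self = JOINING.get(ch);
    const prev = JOINING.get(letters[i - 1] ?? "");
    const next = JOINING.get(letters[i + 1] ?? "");
    // A letter connects backwards only if the previous letter can join forwards.
    const joinsPrev = self !== undefined && prev === "D";
    // Only dual-joining letters connect forwards, and only to a joining letter.
    const joinsNext = self === "D" && next !== undefined;
    if (joinsPrev && joinsNext) return "medi";
    if (joinsPrev) return "fina";
    if (joinsNext) return "init";
    return "isol";
  });
}

// kaf, teh, alef, beh (the word "kitab")
console.log(joiningForms("\u0643\u062A\u0627\u0628"));
```

A shaper then uses each form to select the matching substitution in the font's GSUB table; skipping this step is equivalent to treating every letter as `isol`.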

The standard advice is exactly what you suggested: "just drop in HarfBuzz." But that creates an existential problem for this specific project. HarfBuzz is a massive C++ library. To run it in an Edge worker or pure V8 environment, I'd have to ship a WebAssembly binary that is often upwards of 1MB. That entirely defeats the purpose of building an 88 KiB, pure-JS, zero-dependency layout VM.

Doing complex text layout (CTL) and shaping purely in JavaScript without exploding the bundle size is essentially the final boss of this project. The roadmap is to either implement a highly tree-shakeable, pure-JS parser for the most critical GSUB/GPOS rules, or find a way to pre-compile shaping instructions.

For right now, it's a known trade-off: lightning-fast, edge-native pure JS layout, at the cost of failing on complex cursive ligatures. If you know of any micro-footprint pure-JS shaping libraries that don't rely on WASM, I am all ears!

Yiin•33m ago

Not sure what's the point of it being so fast and so small if it's also wrong.

cosmiciron•28m ago

And what's the point of being right when it's slow and bloated? Come on, it works for a lot of use cases, and it doesn't work for some. And it's still evolving.

LastTrain•1h ago

Define "I"

cosmiciron•31m ago

As someone who has spent decades in the lucid dreaming community researching consciousness (and created the SSILD technique), I could give you a very long, highly philosophical answer about the nature of the self, LOL.

But in the context of this repository: 'I' is the carbon-based entity that designed the layout architecture and takes full responsibility for the bugs.

TimTheTinker•35m ago

> If you generate PDFs with headless browsers or HTML-to-PDF tools, you've accepted a compromise: heavy dependencies, memory leaks, and "approximate" layout that shifts across environments

Absolutely not true with Prince[0]. It's an HTML/CSS-based typesetter built by the creator of CSS (Håkon Wium Lie [1]) that is lightweight, cross-platform, requires no dependencies, has no memory leaks, is 100% consistent in its output, is fully compliant with the relevant standards, and has a lot of really great print-oriented features (like using CSS to control things like page headers/footers, numbering, etc.). Prince has been used to typeset a lot of different print output types, from posters to books to scientific papers. It's even a viable alternative to LaTeX. I've used it in the past, and can attest that it is outstanding.

[0] https://www.princexml.com/

[1] https://en.wikipedia.org/wiki/H%C3%A5kon_Wium_Lie

cosmiciron•4m ago

Thanks for the correction. I'm actually not familiar with Prince, so I really can't tell.

To be clear, VMPrint isn't meant to compete with established engines like that. It’s just a genuinely helpful tool I built from scratch for the specific tasks I needed to accomplish because I couldn't find an alternative.

Prince looks powerful, but I have a feeling it probably wouldn't have been the right fit for my use case anyway.

LastTrain•32m ago

So this is what it has come to? AI bots writing code and fake origin stories of said code, AI bots commenting on it, and other bots responding? This is front page content now? HN: please require all AI generated content to be flagged as such. Ban offenders. This just blows.

wmf•11m ago

I think I'm more tired of AInvestigations than of AI now.

sriram_malhar•30m ago

Love it, love it! Thanks for sharing.

cosmiciron•25m ago

Thank you for the kind words. I hope it becomes a useful part of your toolkit. It certainly is for my need to generate screenplays.

samlinnfer•29m ago

>ai description

>ai code

>ai comments

tills13•3m ago

Low key the comments and replies here look like ai too.

nodoodles•26m ago

Curious, but off-topic: are others also immediately suspicious of the content and quality because the readme is so obviously AI-written? How do you distinguish genuinely useful contributions in the sea of slop?

armanidev•5m ago

Cool project! I built something similar — an AI-powered "Roast My Website" tool that analyzes any website's design and gives a brutally honest review. It's free and runs on n8n webhooks: https://crazeemedia.app.n8n.cloud/webhook/roast-my-website

Always fun to see what people are building with AI.
