I was taking a Signals and Systems course and filling notebooks with Laplace transforms and long derivations. Before finals I tried digitizing them so I could search my notes.
Everything failed.
Most OCR tools can recognize the characters, but they destroy the structure that makes math readable:
- aligned equations lose alignment
- multi-step derivations collapse into paragraphs
- numbered problems merge together
- tables flatten into plain text
So I built *Axiom*.
Instead of focusing only on transcription accuracy, it focuses on *preserving mathematical structure*.
Upload a photo of handwritten STEM notes and it returns structured Markdown with real LaTeX — keeping aligned equations, derivation steps, and problem blocks intact.
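For example, a partial-fractions derivation (this sample is made up, not actual app output) comes back as Markdown like:

```markdown
## Problem 3: Inverse Laplace Transform

$$
\begin{aligned}
F(s) &= \frac{1}{s(s+2)} \\
     &= \frac{1/2}{s} - \frac{1/2}{s+2} \\
f(t) &= \tfrac{1}{2}\left(1 - e^{-2t}\right)u(t)
\end{aligned}
$$
```

so the three steps stay visually aligned at the `=` signs instead of collapsing into a paragraph.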
Under the hood it’s basically:
image → vision model → structured Markdown + LaTeX → KaTeX render
Most of the work ended up being in *layout preservation*, not OCR.
https://www.useaxiomnotes.com/app
Happy to answer questions.
mrajatnath•2h ago
A few technical details about how this works.
Stack:
- Next.js
- Tailwind
- KaTeX for rendering
- Supabase storage
- deployed on Vercel
The pipeline is roughly:
image → vision model → Markdown + LaTeX → custom renderer
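The middle step can be sketched roughly like this (a minimal sketch, not Axiom's actual code; `transcribeNotes`, `VisionModel`, and the stand-in prompt are all made up for illustration):

```typescript
// Any vision-capable model endpoint, abstracted as a function so it can be swapped/stubbed.
type VisionModel = (imageBase64: string, systemPrompt: string) => Promise<string>;

// Stand-in for the real (~300-line) structure-preserving prompt.
const SYSTEM_PROMPT = "Transcribe literally. Preserve alignment, tables, and problem numbering.";

// image → vision model → Markdown + LaTeX string, ready for the KaTeX renderer.
async function transcribeNotes(
  imageBase64: string,
  model: VisionModel,
): Promise<string> {
  const raw = await model(imageBase64, SYSTEM_PROMPT);
  // Some models wrap their answer in a stray ```markdown fence; strip it.
  return raw
    .replace(/^```(?:markdown)?\n?/, "")
    .replace(/\n?```$/, "")
    .trim();
}
```

The model is injected as a parameter mainly so the cleanup logic can be tested without a network call.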
The tricky part isn’t OCR itself — it's preserving structure.
Examples:
• consecutive equations with aligned `=` signs need to become a single `align` block
• handwritten tables must be reconstructed from vertical alignment patterns
• numbered problems must stay separate instead of merging
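The first case can be handled in post-processing. A sketch (not Axiom's actual code; the function and its heuristics are invented here) that merges consecutive single-line display equations into one `aligned` block:

```typescript
// Merge runs of consecutive display-math lines that contain "=" into a single
// aligned block, so KaTeX lines up their equals signs.
function mergeAlignedEquations(lines: string[]): string[] {
  const out: string[] = [];
  let run: string[] = [];

  const flush = () => {
    if (run.length > 1) {
      // Anchor alignment at the first "=" of each equation.
      const body = run.map((eq) => eq.replace("=", "&=")).join(" \\\\\n");
      out.push("$$", "\\begin{aligned}", body, "\\end{aligned}", "$$");
    } else if (run.length === 1) {
      out.push("$$", run[0], "$$"); // lone equation: leave it as-is
    }
    run = [];
  };

  for (const line of lines) {
    const m = line.match(/^\$\$(.+)\$\$$/); // single-line display math
    if (m && m[1].includes("=")) {
      run.push(m[1].trim());
    } else {
      flush();
      out.push(line);
    }
  }
  flush();
  return out;
}
```

The real version would need to be smarter about equations that already contain `&` or multiple `=` signs, but the shape of the problem is the same.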
The system prompt ended up being ~300 lines mostly consisting of *negative constraints* like:
- don't simplify math
- don't merge derivation steps
- don't reorder columns
Without those rules the model constantly tries to "improve" the notes.
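In code, that section of the prompt ends up looking something like this (an illustrative excerpt I wrote for this comment, not the real prompt):

```typescript
// Excerpt of structure-preserving rules, phrased as flat negative constraints.
// The real prompt is ~300 lines; these five are representative, not verbatim.
const TRANSCRIPTION_RULES = `
You are transcribing handwritten STEM notes into Markdown with LaTeX.
- Do NOT simplify, evaluate, or "correct" any mathematics.
- Do NOT merge consecutive derivation steps into one equation.
- Do NOT reorder table columns or rows.
- Do NOT paraphrase prose; transcribe it verbatim.
- If a symbol is illegible, emit \\text{(illegible)} rather than guessing.
`.trim();
```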
One surprising lesson: prompt engineering for OCR is very different from chat prompts — you want the model to be extremely literal.
Still working on better handling for diagrams and messy annotations.
Curious if anyone here has worked on *math layout detection or document AI*.