frontpage.

Made with ♥ by @iamnishanth

Open Source @Github


Show HN: Gettorr – Stream magnet links in the browser via WebRTC (no install)

https://gettorr.com/
1•BenaouidateMed•1m ago•0 comments

Statin drugs safer than previously thought

https://www.semafor.com/article/02/06/2026/statin-drugs-safer-than-previously-thought
1•stareatgoats•3m ago•0 comments

Handy when you just want to distract yourself for a moment

https://d6.h5go.life/
1•TrendSpotterPro•4m ago•0 comments

More States Are Taking Aim at a Controversial Early Reading Method

https://www.edweek.org/teaching-learning/more-states-are-taking-aim-at-a-controversial-early-read...
1•lelanthran•6m ago•0 comments

AI will not save developer productivity

https://www.infoworld.com/article/4125409/ai-will-not-save-developer-productivity.html
1•indentit•11m ago•0 comments

How I do and don't use agents

https://twitter.com/jessfraz/status/2019975917863661760
1•tosh•17m ago•0 comments

BTDUex Safe? The Back End Withdrawal Anomalies

1•aoijfoqfw•19m ago•0 comments

Show HN: Compile-Time Vibe Coding

https://github.com/Michael-JB/vibecode
3•michaelchicory•22m ago•1 comments

Show HN: Ensemble – macOS App to Manage Claude Code Skills, MCPs, and Claude.md

https://github.com/O0000-code/Ensemble
1•IO0oI•25m ago•1 comments

PR to support XMPP channels in OpenClaw

https://github.com/openclaw/openclaw/pull/9741
1•mickael•26m ago•0 comments

Twenty: A Modern Alternative to Salesforce

https://github.com/twentyhq/twenty
1•tosh•27m ago•0 comments

Raspberry Pi: More memory-driven price rises

https://www.raspberrypi.com/news/more-memory-driven-price-rises/
1•calcifer•33m ago•0 comments

Level Up Your Gaming

https://d4.h5go.life/
1•LinkLens•37m ago•1 comments

Di.day is a movement to encourage people to ditch Big Tech

https://itsfoss.com/news/di-day-celebration/
3•MilnerRoute•38m ago•0 comments

Show HN: AI generated personal affirmations playing when your phone is locked

https://MyAffirmations.Guru
4•alaserm•39m ago•3 comments

Show HN: GTM MCP Server- Let AI Manage Your Google Tag Manager Containers

https://github.com/paolobietolini/gtm-mcp-server
1•paolobietolini•40m ago•0 comments

Launch of X (Twitter) API Pay-per-Use Pricing

https://devcommunity.x.com/t/announcing-the-launch-of-x-api-pay-per-use-pricing/256476
1•thinkingemote•40m ago•0 comments

Facebook seemingly randomly bans tons of users

https://old.reddit.com/r/facebookdisabledme/
1•dirteater_•42m ago•1 comments

Global Bird Count Event

https://www.birdcount.org/
1•downboots•42m ago•0 comments

What Is Ruliology?

https://writings.stephenwolfram.com/2026/01/what-is-ruliology/
2•soheilpro•44m ago•0 comments

Jon Stewart – One of My Favorite People – What Now? with Trevor Noah Podcast [video]

https://www.youtube.com/watch?v=44uC12g9ZVk
2•consumer451•46m ago•0 comments

P2P crypto exchange development company

1•sonniya•1h ago•0 comments

Vocal Guide – belt sing without killing yourself

https://jesperordrup.github.io/vocal-guide/
2•jesperordrup•1h ago•0 comments

Write for Your Readers Even If They Are Agents

https://commonsware.com/blog/2026/02/06/write-for-your-readers-even-if-they-are-agents.html
1•ingve•1h ago•0 comments

Knowledge-Creating LLMs

https://tecunningham.github.io/posts/2026-01-29-knowledge-creating-llms.html
1•salkahfi•1h ago•0 comments

Maple Mono: Smooth your coding flow

https://font.subf.dev/en/
1•signa11•1h ago•0 comments

Sid Meier's System for Real-Time Music Composition and Synthesis

https://patents.google.com/patent/US5496962A/en
1•GaryBluto•1h ago•1 comments

Show HN: Slop News – HN front page now, but it's all slop

https://dosaygo-studio.github.io/hn-front-page-2035/slop-news
7•keepamovin•1h ago•1 comments

Show HN: Empusa – Visual debugger to catch and resume AI agent retry loops

https://github.com/justin55afdfdsf5ds45f4ds5f45ds4/EmpusaAI
1•justinlord•1h ago•0 comments

Show HN: Bitcoin wallet on NXP SE050 secure element, Tor-only open source

https://github.com/0xdeadbeefnetwork/sigil-web
2•sickthecat•1h ago•1 comments

The writing is on the wall for handwriting recognition

https://newsletter.dancohen.org/archive/the-writing-is-on-the-wall-for-handwriting-recognition/
189•speckx•2mo ago

Comments

coolness•2mo ago
Great post and amazing progress in this field! However, I have to wonder if some of these letters were part of the training data for Gemini, since they are well-known and someone has probably already done the painstaking work of transcribing them...
suddenlybananas•2mo ago
Shhhhh no one cares about data contamination anymore.
spwa4•2mo ago
Then write something down yourself and upload a picture to gemini.google.com or ChatGPT. Hell, combine it: make yourself a quick math test, print it, solve it with a pen, and ask these models to correct it.

They're very good at it.

timdiggerm•2mo ago
For that to be relevant to this post, they would need to write with secretary hand.
suddenlybananas•2mo ago
I don't know how to write like a 19th century mathematician, nor anyone earlier. I'm not sure OCR on Carolingian minuscule has been solved, let alone more ancient styles like Roman cursive or, god forbid, things like cuneiform. Especially since the corpora for these styles are so small, dataset contamination /is/ a major issue!
MrSkelter•2mo ago
I have a personal corpus of letters between my grandparents in WW2, my grandfather fighting in Europe and my grandmother in England. The ability of Claude and ChatGPT to transcribe them is extremely impressive, though I haven't worked on them in months, so this reflects older models. At that time, neither system could properly order the pages, and ChatGPT would sometimes skip a paragraph.
vertnerd•2mo ago
I've also been working on half a dozen crates of old family letters. ChatGPT does well with them and is especially good at summarizing the letters. Unfortunately, all the output still has to be verified because it hallucinates words and phrases and drops lines here and there. So at this point, I still transcribe them by hand, because the verification process is actually more tiresome than just typing them up in the first place. Maybe I should just have ChatGPT verify MY transcriptions instead.
embedding-shape•2mo ago
It helps when you can see the confidence of each token, which downloadable weights usually give you. Then, whenever your software detects a low-confidence token, run over that section multiple times to generate alternatives, and either go with the highest-confidence one or manually review the suggestions. Easier than having to manually transcribe those parts, at least.
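The retry loop described here can be sketched in a few lines. This is a toy illustration, not any particular tool's API: the model call is abstracted as a hypothetical `sample_fn` that returns one (token, confidence) pair per token, and the threshold and retry count are made-up values.

```python
# Sketch of a confidence-gated retry loop, assuming the model exposes
# per-token confidences (as downloadable weights typically allow).

def transcribe_with_review(sample_fn, threshold=0.8, retries=3):
    """Accept high-confidence tokens; re-sample the rest.

    Tokens at or above `threshold` are kept as-is. For the rest, the
    model is re-run `retries` times and the highest-confidence
    alternative wins (in practice you might queue these for manual
    review instead of auto-accepting).
    """
    first_pass = sample_fn()
    out = []
    for i, (token, conf) in enumerate(first_pass):
        best = (token, conf)
        if conf < threshold:
            for _ in range(retries):
                alt = sample_fn()[i]  # alternative reading of the same token
                if alt[1] > best[1]:
                    best = alt
        out.append(best[0])
    return out
```

In practice you would crop the image to the low-confidence region before re-running, rather than re-transcribing the whole page.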
seidleroni•2mo ago
Is there any way to do this with the frontier LLMs?
red75prime•2mo ago
Ask them to mark low confidence words.
seidleroni•2mo ago
interesting... I'll give that a shot
akoboldfrying•2mo ago
Do they actually have access to that info "in-band"? I would guess not. OTOH it should be straightforward for the LLM program to report this -- someone else commented that you can do this when running your own LLM locally, but I guess commercial providers have incentives not to make this info available.
red75prime•2mo ago
Naturally, their "confidence" is represented as activations in layers close to output, so they might be able to use it. Research ([0], [1], [2], [3]) shows that results of prompting LLMs to express their confidence correlate with their accuracy. The models tend to be overconfident, but in my anecdotal experience the latest models are passably good at judging their own confidence.

[0] https://ieeexplore.ieee.org/abstract/document/10832237

[1] https://arxiv.org/abs/2412.14737

[2] https://arxiv.org/abs/2509.25532

[3] https://arxiv.org/abs/2510.10913

criemen•2mo ago
It used to be that the answer was logprobs, but it seems that is no longer available.
SoftTalker•2mo ago
Always seemed strange to me that personal correspondence between two now-dead people is interesting. But I guess that is just my point of view. You could say the same thing about reading fiction, I guess.
suddenlybananas•2mo ago
Why on earth wouldn't it be interesting? Do you only care about your own life?
dmd•2mo ago
Possibly, but given it can also read my handwriting (which is much, MUCH worse than Boole's) with better accuracy than any human I've shown it to, that's probably not the explanation.
lccerina•2mo ago
Most likely, and probably inferring the structure from texts with "similar" writing forms. Tried with my handwriting (in Italian) and the performance wasn't that stellar. More annoyingly, it is still an LLM and not a "pure" OCR, so some sentences were partially rephrased with different words than the ones in the text. This is especially problematic if these models are to be used to transcribe historical documents.
embedding-shape•2mo ago
> Tried with my handwriting (in italian) and the performance wasn't that stellar.

Same here, for diaries/journals written in mixed Swedish/English/Spanish and with absolutely terrible hand-writing.

I'd love for the day where the writing is on the wall for handwriting recognition, which is something I bet on when I started with my journals, but seems that day has yet to come. I'm eager to get there though so I can archive all of it!

GaggiX•2mo ago
Are you sure you used the Gemini 3.0 Pro model? Maybe try increasing the media resolution in AI Studio if the text is small.
butlike•2mo ago
So it doesn't work is what you're saying, right?
pbronez•2mo ago
"it is still a LLM and not a "pure" OCR"

When does a character model become a language model?

If you're looking at block text with no connections between letter forms, each character mostly stands on its own. Except capital letters are much more likely at the beginning of a word or sentence than elsewhere, so you probably get a performance boost if you incorporate that.

Now we're considering two-character chunks. Cursive script connects the letterforms, and the connection changes based on both the source and target. We can definitely get a performance boost from looking at those.

Hmm you know these two-letter groupings aren't random. "ng" is much more likely if we just saw an "i". Maybe we need to take that into account.

Hmm actually whole words are related to each other! I can make a pretty good guess at what word that four-letter-wide smudge is if I can figure out the word before and after...

and now it's an LLM.
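The step from raw character scores to context can be shown with a toy decoder: combine the image evidence with a bigram prior, naive-Bayes style. Every number here is invented purely for illustration.

```python
# Toy illustration: per-character OCR scores alone can't resolve a
# smudged letter, but a bigram language prior can. All numbers invented.

# OCR's raw guesses for the smudged character after "i" in "ki_g":
ocr_scores = {"n": 0.30, "u": 0.40, "a": 0.30}

# Bigram prior: how likely each character is to follow "i" (made up):
bigram_after_i = {"n": 0.30, "u": 0.02, "a": 0.05}

def decode(ocr_scores, prior):
    # Naive Bayes style combination: image evidence times language prior.
    return max(ocr_scores, key=lambda c: ocr_scores[c] * prior.get(c, 1e-6))

print(decode(ocr_scores, bigram_after_i))  # prints "n": context flips the raw "u"
```

Extend the prior from bigrams to whole words, then to surrounding words, and you are walking the exact path described above.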

iamflimflam1•2mo ago
If I went back in time to the 90s when I was doing my PhD I would absolutely blow my mind with how well handwriting OCR works now.
th0ma5•2mo ago
My question for OCR automation is always: which digits within the numbers being read are allowed to be incorrect?
pjmlp•2mo ago
Maybe for English; for the other human languages I use, it is still kind of hit and miss, just like speech recognition. Even with English, it suffices to have an accent that is off from the standard TV one.
NitpickLawyer•2mo ago
ee lay vhen!
TonyTrapp•2mo ago
They don't do Scottish accents!
pjmlp•2mo ago
Indeed. Has it improved at all in 14 years?

https://www.youtube.com/watch?v=BOUTfUmI8vs

macleginn•2mo ago
As always, this depends on the amount of training data available. Japanese is another success story: https://digitalorientalist.com/2020/02/18/cursive-japanese-a...
pjmlp•2mo ago
Interesting, thanks for sharing.
williamjsdavis•2mo ago
Agree here. I've had successes with 18th century Dutch, but again quite a few failures and mistakes
cubefox•2mo ago
Is "it" Gemini 3 Pro?
__alexs•2mo ago
Call me when it can do Russian Cursive.
decimalenough•2mo ago
Seems to do an OK job:

https://g.co/gemini/share/e173d18d1d80

This is a random image from Twitter with no transcript or English translation provided, so it's not going to be in the training data.

shatsky•2mo ago
No, the transcription has little to do with the written text; it guessed a few words here and there, but not even the general topic. It's a doctor's note about a patient visit, beginning with "Прием: состояние удовл., t*, но кашель" ("patient visit: condition is OK, t (temperature normal?), but coughing"). But unreadable doctors' handwriting is a meme...
GaggiX•2mo ago
That's Gemini 2.5 Flash btw

The result from Gemini 3 Pro using the default media resolution (the medium one): "(Заголовок / Header): Арсеньев (Фамилия / Surname - likely "Arsenyev")

    Состояние удовл-

    t N, кожные

    покровы чистые,

    [л/у не увел.]

    В зеве умерен. [умеренная]

    гипер. [гиперемия]

    В легких дыха-

    ние жесткое, хрипов

    нет. Тоны серд-

    [ца] [ритм]ичные.

    Живот мяг-

    кий, б/б [безболезненный].

    мочеисп. [мочеиспускание] своб. [свободное]

    Ds: ОРЗ [или ОРВИ]" and with the translation: "Arsenyev
Condition satisfactory. Temp normal, skin coverings [skin] are clean, lymph nodes not enlarged. In the throat [pharynx], moderate hyperemia [redness]. In the lungs, breathing is rigid [hard], no rales [crackles/wheezing]. Heart tones are rhythmic. Abdomen is soft, painless. Urination is free [unhindered]. Diagnosis: ARD (Acute Respiratory Disease)."
__alexs•2mo ago
Ok fine I'm impressed
red75prime•2mo ago
My first language is Russian. I can't fully understand this dreaded "doctor's cursive", but I can see that some parts of Gemini's text are probably wrong.

It's most likely "но кашель сохр-ся лающий" ("but a barking cough is still present"), not "кожные покровы чистые" ("the skin is clean"). The diagnosis is probably wrong too. Judging by the symptoms it should be "ОРЗ", but I have no idea what's actually written there.

Still, it's very, very impressive.

myth_drannon•2mo ago
This is a historical church document from the 19th century, and Gemini got the common words right but completely hallucinated the names of the village and the people.

https://gemini.google.com/share/f98de1d5ac55

myth_drannon•2mo ago
Right, it can do modern writing, but feed it anything older than a century (church records and censuses) and it produces garbage. Yandex Archives figured that out and has single-digit CER, but they have the resources to collect immense training data. I'm slowly building a dataset for fine-tuning a TrOCR model, and the best it can do is 18% CER... which is sort of readable.
coredog64•2mo ago
How do you do, fellow TrOCR fine-tuner?

I'm using TrOCR because it's a smaller model that I can fine tune on a consumer card, but the age of the model and resources certainly make it a challenge. The official notebook for fine tuning hasn't been updated in years and has several errors due to the march of progress in the primary packages.

myth_drannon•2mo ago
I think I based my notebook on the official example but yes at some point new versions of the libraries completely broke it. I had to pin the versions for it to work again.

This one works; you can check the versions: https://pastebin.com/QPjGHN8j
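For anyone hitting the same breakage: pinning comes down to installing exact versions instead of whatever is latest. The version numbers below are placeholders for illustration, not the actual pins from the notebook.

```shell
# Pin the library versions the notebook was written against, so future
# releases can't silently break it. (Versions shown are placeholders;
# take the real pins from the working notebook.)
pip install "transformers==4.30.2" "datasets==2.13.1" "torch==2.0.1"
```

Committing a `requirements.txt` with these pins alongside the notebook makes the setup reproducible.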

tigerlily•2mo ago
Surely the true prize is to be able to ditch computers altogether and just write with pencil on paper.
sph•2mo ago
I am writing on paper with the hope that one day I can digitize everything painlessly with 99.99% accuracy.
layer8•2mo ago
Keyboards are faster.
DarkNova6•2mo ago
> Here’s Transkribus’s best guess at George’s letter to Maryann, above:

Transkribus has a new model architecture around the corner, and the results look impressive. Not only for trivial cases like plain text, but also for table structures and layout.

Best of all, you can train it on your own corpus of text to support obscure languages and handwriting systems.

Really looking forward to it.

macleginn•2mo ago
I became convinced of this after the release of KuroNet: https://arxiv.org/pdf/1910.09433 (High-quality OCR of Japanese manuscripts, which look almost impossible to read.)
nikanj•2mo ago
The writing is on the wall for handwriting. Zoomers use speech recognition or touchscreen keyboards, millennials use keyboards. Boomers use pens
lccerina•2mo ago
I invoke the Lindy effect. Handwriting survived printed characters, typewriters, and the last 50-70 years of computers and keyboards; it will survive this too.
djmips•2mo ago
I love how you fit right into the current meme that Gen-X never gets mentioned.
shaftway•2mo ago
While I agree (and strongly identify with, and like this position), one could amend the original to be "Gen-X uses pens and print, Boomers use pens and cursive"
sph•2mo ago
Silly comment. Handwriting is proven to correlate with much better memory retention, which ultimately means a much greater degree of association with existing memories and the creation of novel ideas.

"The comparison between handwriting and typing reveals important differences in their neural and cognitive impacts. Handwriting activates a broader network of brain regions involved in motor, sensory, and cognitive processing, contributing to deeper learning, enhanced memory retention, and more effective engagement with written material. Typing, while more efficient and automated, engages fewer neural circuits, resulting in more passive cognitive engagement. These findings suggest that despite the advantages of typing in terms of speed and convenience, handwriting remains an important tool for learning and memory retention, particularly in educational contexts."

https://pmc.ncbi.nlm.nih.gov/articles/PMC11943480/

You are literally handicapping yourself by not thinking with pen and paper, or keeping paper notes.

The future is handwriting with painless digitization for searchability, until we invent a better input device for text that leverages our motor-memory facilities in the brain.

debazel•2mo ago
This paper just says that handwriting requires more cognitive load?

Which is exactly my experience with handwriting through my school years. When handwriting notes during lectures, all my focus goes to jotting down words, and it becomes impossible to actually focus on the meaning behind them.

SoftTalker•2mo ago
The actual research doesn't back up your personal experience.
nikanj•2mo ago
Handwriting might be very beneficial, but so are frequent social visits, and those are going away too: a 2025 human spends far less time with friends than a 1975 human did. We are not rational actors, and good habits die easily.
lifestyleguru•2mo ago
It feels unbelievable that in Europe the literacy rate could once have been 10% or lower. Then I look at documents even as young as 150 years... Fraktur, blackletter, elaborate handwriting. I guess I'm illiterate now.

Hopefully next generations will feel the same about legal contracts, law in general, and Java code bases. They're incomprehensible not because of fonts but because of unfathomable complexity.

sph•2mo ago
Which Europe and which century do you live in where literacy rate is below 10%?
lifestyleguru•2mo ago
Speaking about the past centuries.
tpm•2mo ago
You can learn fraktur or blackletter in a day and cyrillic in a few days, if you already know the latin alphabet.
lifestyleguru•2mo ago
> learn fraktur or blackletter in a day and cyrillic in a few days

Not a chance, sorry.

tpm•2mo ago
Why? The former are just different typefaces (I learned to read them by myself when I was 10 while looking at our old books) and the latter I sort of picked up while travelling through Serbia and Bulgaria (I don't speak the languages).
sph•2mo ago
Any self-hosted open source solution? I would like to digitize my paper notebooks but I do not want to use anything proprietary or that uses external services. What is the state of the art on the FOSS side?

Ideally something that I can train with my own handwriting. I had a look at Tesseract, wondering if there’s anything better out there.

vintermann•2mo ago
Regular handwriting there are many.

For historical handwriting, Gemini 3 is the only one that gave a decent result on 19th-century minutes from a town court in Northern Norway (Danish Gothic handwriting with bleed-through). I'm not 100% sure it's correct, but that's because it's so dang hard to read it to verify. At least I can see it gets many names, dates and locations right.

I've been waiting a long time for this.

sph•2mo ago
> Regular handwriting there are many.

Please share. I am out of the loop and my searches have not pointed me to the state of the art, which has seen major steps forward in the past 3 or 4 years but most of it seems to be closed or attached to larger AI products.

Is it even still called OCR?

fragmede•2mo ago
Totally not what you asked, but making an OCR model is a learning exercise for AI research students. Using the Kaggle-hosted dataset https://www.kaggle.com/datasets/landlord/handwriting-recogni... and a tutorial, eg https://pyimagesearch.com/2020/08/17/ocr-with-keras-tensorfl... you can follow along and train your own OCR model!
driscoll42•2mo ago
The best open source OCR model for handwriting in my experience is surya-v2 or nougat; it really depends on the docs which is better. Each got about 90% accuracy (cosine similarity) in my tests. I have not tried DeepSeek-OCR, but mean to at some point.
embedding-shape•2mo ago
Try various downloadable weights that have vision support; they're all good at different examples. Running multiple models and then something to aggregate and pick the right output usually does the trick. Some recent ones to keep in the list: ministral-3-14b-reasoning, qwen3-vl-30b, magistral-small-2509, gemma-3-27b

Personally I found magistral-small-2509 to be the most accurate overall, but it completely fails on some samples, while qwen3-vl-30b doesn't struggle at all with those same samples. So it seems the training data is really uneven depending on what exactly you're trying to OCR.

And the trade-off, of course, is that these are LLMs, so they're not exactly lightweight or fast on consumer hardware, but at least the multi-model approach greatly increases accuracy.
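The aggregation step can be as simple as a per-line majority vote across the models' transcripts. A minimal sketch, assuming each model's output is already a plain string (the function name and fallback rule are my own invention, not any tool's API):

```python
from collections import Counter

def aggregate_transcripts(transcripts):
    """Combine line-by-line transcripts from several vision models.

    Majority vote per line; on total disagreement, keep the first
    model's reading (in practice, flag that line for manual review).
    """
    merged = []
    for lines in zip(*(t.splitlines() for t in transcripts)):
        winner, count = Counter(lines).most_common(1)[0]
        merged.append(winner if count > 1 else lines[0])
    return "\n".join(merged)
```

A real pipeline would align lines first (models don't always agree on line breaks), but the voting idea is the same.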

girvo•2mo ago
> "transmitted": In the second line of the body, the word "transmitted" is crossed out in the original text

Am I nuts or is this wrong, not “perfect”?

It doesn’t look crossed out at all to me in the image, just some bleeding?

Still very impressive, of course

williamscales•2mo ago
I agree, I noticed the same thing. To my eye it appears smudged.
benterix•2mo ago
> If AI can diminish some of the monotony of research, perhaps we can spend more time thinking, writing, playing piano, and taking walks — with other people.

Whenever any progress is made, this is the logical conclusion. And yet those who decide how your time is used have an opposing view.

ulbu•2mo ago
maybe akin to how faster computers bred programs that are slower than before.
volemo•2mo ago
Better “thinking” computers will breed worse thinking people, huh?
palmotea•2mo ago
> Better “thinking” computers will breed worse thinking people, huh?

I actually think that will be the case. We're designing society for the technology, not the technology for the people in it. The human brain wasn't built to fit whatever gap is left by AI, regardless of how many words the technologists spew to claim otherwise.

For instance: AI is already undermining education by enabling mental laziness in students (why learn the material when ChatGPT can do your homework for you?). The current argument seems to be that AI will replace entry-level roles but leave space for experienced and skilled people (while blocking the path to get there). Some of the things LLMs do a mediocre but often acceptable job at are exactly the things one needs to do to build and hone higher-level skills.

spiritplumber•2mo ago
Dr. Walter Gibbs: Won't that be grand? Computers and the programs will start thinking and the people will stop.
themaninthedark•2mo ago
Look at GPS and then "self-driving" cars.

With GPS we have seen people confidently drive past road closed signs and around barriers off bridges.

With self-driving technology, we have seen them defeat safeguards so they can sit in the back while the car accelerates up to 70 in a subdivision.

f3b5•2mo ago
Socrates allegedly opposed writing because he felt it would make people lazy, reducing their ability to memorize things. If it weren't for his disciple Plato, who wrote down his words, none of his philosophy would have survived.

So I'm not completely disagreeing with you, but I also am not too pessimistic, either. We will adapt, and benefit through the adoption of AI, even though some things will probably be lost, too.

volemo•2mo ago
> We will adapt, and benefit through the adoption of AI, even though some things will probably be lost, too.

“What doesn’t kill you, makes you stronger”. We will adapt and benefit, or we will not — time will tell.

ulbu•2mo ago
that, and instead of productivity increases reducing people's need to work, what might (I think, will) happen is that we will actually have to work more, for worse results and lower incomes, for the whims of the executive class and the increased energy requirements of LLMs. compound this with control over channels of communication (google, facebook, xitter) and means of production (microsoft, amazon), plus the social-emotional manipulation of LLMs, and we have a real "winner" of a technology.

I do not think the executive class is actually interested in the power of AI to increase productivity, but rather in its power to increase reliance.

notimetorelax•2mo ago
I feel that we’re reaching a limit to our context switching. Any further process improvements or optimizations will be bottlenecked on humans. And I don’t think AI will help here as jobs will account for that and we’ll have to do context switching on even broader and more complex scopes.
seethishat•2mo ago
I think the limit has been exceeded. That's the primary reason everything sort of sucks now. There is no time to slow down and do things right (or better).

IMO, cyber security, for example, will have to become a government mandate with real penalties for non-compliance (like seat belts in cars were mandated) in order to force organizations to slow down, and make sure systems are built carefully and as correctly as possible to protect data.

This is in conflict with the hurtling pace of garbage in/garbage out AI generated stuff we see today.

hiAndrewQuinn•2mo ago
Here in the EU cybersecurity is actually being regulated, with heavy fines to come (15 million euros or 2.5% of global turnover!), if it wasn't already. Look up the CRA and the NIS2.

Things may well reach a point where, elsewhere in the world, finding out that some software is for sale in the European Union is itself a marker of quality, and therefore justifies some premium.

stockresearcher•2mo ago
These are good developments, but it remains to be seen how much of an impact they will have. Software developers will have to follow a bunch of "best practices", but there is no requirement that they be good at them. There are no fines for producing insecure software, only fines for not following the rules.

Software providers are also likely to specify narrow "fit for purpose" statements and short(ish) support windows. If costs go up too much, people will use "inappropriate" and/or EOL software because the "right thing" is too expensive.

To be clear, this is a step in the right direction, but it is not a panacea.

ecshafer•2mo ago
If there is one job at a university and ten researchers applying, and one uses this improvement in research speed to do more research while the other nine take the chance to play more piano and take more walks, then most likely that one will get the job. This competitive nature is what has driven society forward, rather than keeping us just above subsistence agriculture.
swatcoder•2mo ago
Not really. It's a very recent fad to treat "research" as some kind of mechanical factory process that need simply optimize units of research per unit time.

When you sit down to think about it, what does it really even mean to do "more research"? What concrete phenomenon are you observing to decide what that is?

Across the journey from "subsistence agriculture", there have been countless approaches to nurturing innovation and discovery, but flattening it into an abstract game measured by papers published and citations received is extremely novel, and so far it seems to correlate more with waste and noise than with discovery. Science and research are not in a healthy period these days, and the model you describe, and seem to take for granted or may even be celebrating, plays a big role in why.

pixl97•2mo ago
Eh, the more problem space you explore, the more energy is required to explore it. Looking at the last 300 years and saying "look at all the low-hanging fruit we picked" doesn't describe where we are now.
throwup238•2mo ago
> This competitive nature is what has driven society forward and not kept us at just above subsistence agriculture.

The UN estimates that around 500 million households or 2 billion people are still subsistence farmers. In 2025.

Fat lot of good competition has done them, especially when they don’t have enough surplus to participate in a market economy to begin with.

pixl97•2mo ago
I mean, given our population increase over the last 100 years, these numbers show a massive decrease in poverty, with sub-Saharan Africa holding the highest remaining concentration of poverty.
robotresearcher•2mo ago
Their children are far more likely to survive childhood than at any time in history.
palmotea•2mo ago
> Whenever any progress is made, this is the logical conclusion. And yet, those who decide about how your time is being used, have an opposing view.

Exactly. Some people forget we live in a capitalist society, which does not prioritize or support the contentment of the masses. We exist to work for the owners or starve, they're not going to pay us to enjoy ourselves.

BJones12•2mo ago
Between 1965 and 1995 the average American gained about 6 hours per week of leisure time. They then used most of the additional free time to watch TV.
financetechbro•2mo ago
What’s the problem with that? People are free to use their leisure time how they see fit.
dugidugout•2mo ago
They were likely pushing back on the original comment, suggesting that it isn't solely:

> ...those who decide about how your time is being used...

which stops individuals from:

> [spending] more time thinking, writing, playing piano, and taking walks — with other people.

Which it seems you would agree with. I don't see where they asserted whether this was a problem to address.

robocat•2mo ago
Source? Surely depends on the population chosen e.g. does average American include retirees?
missedthecue•2mo ago
Pretty sure the figure he's quoting is average hours working. bls.gov tracks this.

So no, no retirees or students or unemployed or disabled in that figure.

nitwit005•2mo ago
It does include people who would like to work more hours though. One of the trends has been people increasingly struggling to get enough hours.
jgeada•2mo ago
And what's happened since 1995 (30 years ago!) ?

Because all the trends seem to indicate that to make a living people are working longer hours, holding multiple concurrent jobs (eg https://gameofjobs.org/are-americans-now-more-likely-than-ev...), and holding off retirement.

themaninthedark•2mo ago
We started offshoring manufacturing and growing the service economy?

Now the service economy is turning into the sharing economy; I think the only thing we are sharing is the greater profits, and they are taking the lion's share.

zkmon•2mo ago
It's painful to see that the beautiful handwriting of the past is now pretty much extinct. For me, a person's handwriting says a lot about them, not just their mind but their physical state as well.
SoftTalker•2mo ago
Penmanship used to be a topic of instruction in school. It hasn't been for quite a long time. Even in the 1970s when we still had to write in cursive, we didn't spend time learning to make it elegant, with a lot of flourishes. We just learned a very plain standard form of writing. By high school everyone had diverged a bit with their own personal style but nothing like the writing of the 19th century.

I once visited a high school where they had a wall of signatures from every graduating senior going back to the 1920s or so. The "personality" evident in the signatures showed a steady decline, from very stylish in the oldest ones to mostly just poorly printed names in the 2020s.

ferguess_k•2mo ago
Don't worry, handwriting itself has diminished over the decades since the introduction of computers and especially smartphones.

Ah, maybe I'll pick up Qin seal when I retire, if I retire.

tokai•2mo ago
Anyone knows how the models do on Russian cursive?
euroderf•2mo ago
Ultimate Test
zelphirkalt•2mo ago
Quite certain my doctor can still produce writing that the models don't stand a chance of recognizing.
stronglikedan•2mo ago
I'm just excited that I may finally be able to decipher my meeting notes from yesterday!
RationPhantoms•2mo ago
Anecdata inbound but my PCP, thankfully, used Nuance's speech-to-text platform remarkably well for adding his own commentary on things. It was a refreshing thing to see and I hope my clinicians use it.
nottorp•2mo ago
I thought handwriting recognition is on the wall because no one knows how to write cursive any more
taeric•2mo ago
I confess this largely surprises me, for reasons that I think should not surprise me. I would expect current AI to be largely guessing at what some writing says based on expectations from other things it has managed to "read". As such, I would think it would not be much better at handwriting than any other tool.

Yet it occurs to me that "guess and check" is exactly what I'm doing when trying to read my 6-year-old's writing. Often I do a pass to detect the main sounds, but then I start thinking about what was currently on his mind and see if I can make a match. Not surprisingly, I often can.

shevy-java•2mo ago
I can't recognize my handwriting anymore. :(
RationPhantoms•2mo ago
Wouldn't it be easier to train a VLM on the handwriting style of the historical person in question? An agent graphologist, if you will. Surely there is a lot of pattern matching in the way things are written.

Then again, getting this result from a heavily-generalized SOTA model is pretty incredible too.

supersrdjan•2mo ago
Now, onto the next frontier: handwriting recognition for shorthand. Let's start from Orthic :)
canopi•2mo ago
If that isn't proof of the 10/90 rule in machine learning: the last 10% of accuracy is harder than the first 90% (and that goes recursively).

We almost solved OCR 20 years ago. Then we spent 20 years on the last percentage. We see the same in self-driving cars.