
Nanonets-OCR-s – OCR model that transforms documents into structured markdown

https://huggingface.co/nanonets/Nanonets-OCR-s
74•PixelPanda•6h ago

Comments

PixelPanda•6h ago
Full disclosure: I work at Nanonets

Excited to share Nanonets-OCR-s, a powerful and lightweight (3B) VLM that converts documents into clean, structured Markdown. The model is trained to understand document structure and content context (tables, equations, images, plots, watermarks, checkboxes, etc.).

Key features:

LaTeX Equation Recognition: Converts inline and block-level math into properly formatted LaTeX, distinguishing between $...$ and $$...$$.

Image Descriptions for LLMs: Describes embedded images using structured <img> tags. Handles logos, charts, plots, and so on.

Signature Detection & Isolation: Finds and tags signatures in scanned documents, outputting them in <signature> blocks.

Watermark Extraction: Extracts watermark text and stores it within a <watermark> tag for traceability.

Smart Checkbox & Radio Button Handling: Converts checkboxes to Unicode symbols like ☐, ☑, and ☒ for reliable parsing in downstream apps.

Complex Table Extraction: Handles multi-row/column tables, preserving structure and outputting both Markdown and HTML formats.
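For downstream use, the tagged blocks described above can be pulled out of the Markdown with a few regular expressions. This is only a minimal sketch: the sample text, the function names, and the exact output layout are assumptions made for illustration, and the three checkbox symbols are assumed to be ☐, ☑, and ☒. Only the tag names come from the feature list.

```python
import re

# Hypothetical model output. The tag names follow the feature list above;
# the surrounding layout is an assumption, not real model output.
SAMPLE = """# Invoice

<watermark>CONFIDENTIAL</watermark>

Paid: ☑  Overdue: ☐

<img>Company logo with a blue panda</img>

<signature>J. Doe</signature>
"""

# Matches <img>...</img>, <signature>...</signature>, <watermark>...</watermark>.
TAG_RE = re.compile(r"<(img|signature|watermark)>(.*?)</\1>", re.DOTALL)


def extract_tags(markdown: str) -> dict[str, list[str]]:
    """Collect the text inside each tagged block, grouped by tag name."""
    found: dict[str, list[str]] = {"img": [], "signature": [], "watermark": []}
    for tag, body in TAG_RE.findall(markdown):
        found[tag].append(body.strip())
    return found


def strip_tags(markdown: str) -> str:
    """Remove the tagged blocks, leaving only the plain Markdown."""
    return TAG_RE.sub("", markdown)


def count_checkboxes(markdown: str) -> dict[str, int]:
    """Count each of the assumed checkbox symbols."""
    return {symbol: markdown.count(symbol) for symbol in "☐☑☒"}


if __name__ == "__main__":
    print(extract_tags(SAMPLE))
    print(count_checkboxes(SAMPLE))
```

A regex is enough here because the tags are flat; if real output ever nests tags or leaves them unclosed, a proper parser would be the safer choice.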

Huggingface / GitHub / Try it out: https://huggingface.co/nanonets/Nanonets-OCR-s

Try it with Docext in Colab: https://github.com/NanoNets/docext/blob/main/PDF2MD_README.m...

mvac•2h ago
Correct link for Docext: https://github.com/NanoNets/docext/blob/main/PDF2MD_README.m...
silversmith•6h ago
I'm curious, how does it do with non-English texts? It's my understanding that LLM-based OCR solutions fall way behind traditional ones once you introduce other languages.
wickedsight•5h ago
Understanding or experience?

Because my experience is not at all like that. If I use both Google Translate and ChatGPT on an image, ChatGPT is pretty much always better. It can even translate Japanese hand written menus quite well. With the added benefit of it being able to add context and explain what the dishes are.

silversmith•4h ago
I'm passively interested in small, local LLM OCR, due to a couple of ideas kicking around between my ears. Tried some a while ago, but most of my recent knowledge is second-hand. Waiting for someone to exclaim "hey this works now!" before committing more time :)

With the big commercial offerings like ChatGPT I'd fully expect them to work fine, due to the absolutely massive horsepower in use.

raus22•6h ago
With models like these, when multilingual support is not mentioned, they will perform really badly on real-life non-English PDFs.
souvik3333•6h ago
The model was primarily trained on English documents, which is why English is listed as the main language. However, the training data did include a smaller proportion of Chinese and various European languages. Additionally, the base model (Qwen-2.5-VL-3B) is multilingual. Someone on Reddit mentioned it worked on Chinese: https://www.reddit.com/r/LocalLLaMA/comments/1l9p54x/comment...
progval•6h ago
It's not open-source (nor open-weight): https://huggingface.co/nanonets/Nanonets-OCR-s/discussions/2
souvik3333•6h ago
Hi, author of the model here. It is an open-weight model; you can download it here: https://huggingface.co/nanonets/Nanonets-OCR-s
gardnr•5h ago
Interestingly, another OCR model based on Qwen2.5-VL-3B just dropped, and it is also published under Apache 2. It's right next to Nanonets-OCR-s on the HF "Trending" list.

https://huggingface.co/echo840/MonkeyOCR/blob/main/Recogniti...

tensor•6h ago
There are no benchmarks or accuracy measures on a held-out set?
souvik3333•6h ago
Hi, author of the model here.

We have a benchmark for evaluating VLMs on document understanding tasks: https://idp-leaderboard.org/ . Unfortunately, it does not include image-to-Markdown as a task. The problem with evaluating image-to-Markdown is that the output can still be correct even if two blocks appear in a different order. E.g., if seller info and buyer info sit side by side in the image, one model may extract the seller info first and another the buyer info first. Both models are correct, but if you do fuzzy matching against the ground truth, one will score higher than the other depending on which order the ground truth uses.

Normally, a company will train and test on datasets annotated with the same ordering convention (either left block first or right block first), so all other models can get a low score on that benchmark simply because they were trained on the opposite order of annotations.
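The ordering problem described above can be illustrated with a small sketch. It is not the scoring used by any benchmark mentioned in this thread: the block texts and function names are made up, and `difflib.SequenceMatcher` from the Python standard library stands in for whatever fuzzy matcher an evaluation would really use.

```python
from difflib import SequenceMatcher
from itertools import permutations


def similarity(a: str, b: str) -> float:
    """Fuzzy similarity in [0, 1]; 1.0 means the strings are identical."""
    return SequenceMatcher(None, a, b).ratio()


def order_sensitive_score(pred_blocks: list[str], truth_blocks: list[str]) -> float:
    """Compare the blocks as one concatenated string, so order matters."""
    return similarity("\n".join(pred_blocks), "\n".join(truth_blocks))


def order_invariant_score(pred_blocks: list[str], truth_blocks: list[str]) -> float:
    """Take the best score over every ordering of the predicted blocks.

    Brute force: the cost grows factorially, so this only suits a few blocks.
    """
    return max(
        order_sensitive_score(list(ordering), truth_blocks)
        for ordering in permutations(pred_blocks)
    )


# Hypothetical ground truth annotated "seller first".
truth = ["Seller: Acme Corp, 1 Main St", "Buyer: Jane Doe, 9 Oak Ave"]
model_a = list(truth)            # extracts the seller block first
model_b = list(reversed(truth))  # extracts the buyer block first

if __name__ == "__main__":
    print(order_sensitive_score(model_a, truth))
    print(order_sensitive_score(model_b, truth))
    print(order_invariant_score(model_b, truth))
```

Both hypothetical models extract the same text, yet only the order-invariant score treats them equally. For real pages with many blocks, an assignment-based matching would be needed in place of the brute-force permutation search.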

Eisenstein•4h ago
How does it do with handwriting?
souvik3333•3h ago
We have not trained explicitly on handwriting datasets (completely handwritten documents). However, the training data includes a lot of forms with handwriting, so do try it on your files. There is a Hugging Face demo where you can test quickly: https://huggingface.co/spaces/Souvik3333/Nanonets-ocr-s

We are currently working on creating completely handwritten document datasets for our next model release.

Eisenstein•2h ago
Document:

* https://imgur.com/cAtM8Qn

Result:

* https://imgur.com/ElUlZys

Perhaps it needed more than 1K tokens? But it took about an hour (number 28 in queue) to generate that and I didn't feel like trying again.

How many tokens does it usually take to represent a page of text with 554 characters?

souvik3333•2h ago
Hey, the reason for the long processing time is that lots of people are using it, probably with larger documents. I tested your file locally, and it seems to be working correctly: https://ibb.co/C36RRjYs

Regarding the token limit, it depends on the text. We are using the qwen-2.5-vl tokenizer in case you are interested in reading about it.

You can run it very easily in a Colab notebook. This should be faster than the demo: https://github.com/NanoNets/docext/blob/main/PDF2MD_README.m...

There are incorrect words in the extraction, so I would suggest waiting for the handwritten text model's release.

mvac•2h ago
How does it compare to Datalab/Marker https://github.com/datalab-to/marker ? We evaluated many PDF->MD converters and this one performed the best, though it is not perfect.
ks2048•15m ago
It’s a shame all these models target markdown and not something with more structure and a specification. There are different flavors of Markdown and limited support for footnotes, references, figures, etc.