How we made our OCR code more accurate

https://pieces.app/blog/how-we-made-our-optical-character-recognition-ocr-code-more-accurate
58•thunderbong•1mo ago

Comments

camtarn•1mo ago
Neat article, but I feel like I have no idea why they're doing this! Is transcribing code from images really such a big use case?
lelag•1mo ago
Maybe they want to compile the Apollo Guidance Computer source code...

https://www.softwareheritage.org/wp-content/uploads/2019/07/...

ivanjermakov•1mo ago
If it's not a joke, I think it was already digitized: https://github.com/chrislgarry/Apollo-11
dewey•1mo ago
> To best support software engineers when they want to transcribe code from images, we fine-tuned our pre-processing pipeline to screenshots of code in IDEs, terminals, and online resources like YouTube videos and blog posts.

Even with these examples that seems like a very narrow use case.

FloatArtifact•1mo ago
From an accessibility standpoint, yes: it lets you pattern match where you are in an IDE without using an accessibility API.
SloopJon•1mo ago
The product appears to be similar to Microsoft's embattled Recall feature. In order to remember your digital life it takes frequent screenshots.
gosub100•1mo ago
I guess it would be excellent to evade security monitors to take unauthorized copies of your employer's codebase.
EvanAnderson•1mo ago
It worries me that stuff like that becoming easier will lead to wacky data pipelines being normalized (pulling display output off systems and "scraping" it to get data, of dubious quality, versus just building a proper interface). The kind of crowd that likes "low code" tools like MSFT's "Power Automate" is going to love to make Rube Goldberg nightmares out of tools like this.

It fills me with a deep sadness that we created deterministic machines then, through laziness, exploit every opportunity to "contaminate" them with sloppy practices that make them produce output with the same fuzzy inaccuracy as human brains.

Old man yells at neural networks take: We're entering a "The Machine Stops" era where nobody is going to know how to formulate basic algorithms.

"We need to add some numbers. Let's point a camera at the input, OCR it, then feed it to an LLM that 'knows math'. Then we don't have to figure out an algorithm to add numbers."

I wish compute "cost" more so people would be forced to actually make efficient use of hardware. Sadly, I think it'll take mass societal and infrastructure collapse for that to happen. Until it does, though, let the excess compute flow freely!

jocoda•1mo ago
Asimov - The Feeling of Power.
abc-1•1mo ago
Anything that mentions tesseract is about 10 years out of date at this point.
amelius•1mo ago
Well, at least I can apt-get install tesseract.

That doesn't hold for any of the GPU-based solutions, last time I checked.

booder1•1mo ago
5.5.0 was released in November last year. Still a very active project as far as I can tell, and it runs on CPU. Even compared to the best open source GPU option it is still pretty good. VLMs work very differently and don't work as well for everything. Why is it out of date?
cbsmith•1mo ago
I don't know that that is true: https://researchify.io/blog/comparing-pytesseract-paddleocr-...

Using Surya gets you significantly better results and makes almost all the work detailed in the article largely unnecessary.

booder1•1mo ago
Surya weights for the models are licensed cc-by-nc-sa-4.0 so not free for commercial usage. Also, as far as I know, the training data is 100% unavailable. Given they use well trained, but standard models, it isn't really open source and barely, maybe, open weight. I kinda hate how their repo says gpl cause that is only true for the inference code. The training code is closed source.
cbsmith•1mo ago
I did not know that the training code is closed source. That is troubling.
krapht•1mo ago
I just built a pipeline with tesseract last year. What's better that is open source and runnable locally?

VLM hallucination is a blocker for my use case.

stavros•1mo ago
How is a hallucination worse than a Tesseract error?
jgalt212•1mo ago
Hallucinations are hard to detect unless you are a subject-matter expert. I don't have direct experience with Tesseract error detection.
krapht•1mo ago
Because the VLM doesn't know it hallucinated. When you get a Tesseract error you can flag the OCR job for manual review.
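As a rough illustration of the flagging idea above: Tesseract reports a per-word confidence, so a job can be routed to manual review when too many words score low. This is a minimal sketch, not anyone's production pipeline. It assumes the input has the shape returned by `pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)` (parallel `text` and `conf` lists, with `conf` of -1 on rows that carry no word). The function names, the 60.0 threshold, and the 10% ratio are hypothetical choices for the example.

```python
from typing import Dict, List, Sequence


def low_confidence_words(data: Dict[str, Sequence], threshold: float = 60.0) -> List[str]:
    """Return recognized words whose confidence is below `threshold`.

    `data` is assumed to look like pytesseract's image_to_data DICT output:
    parallel "text" and "conf" sequences, conf == -1 for non-word rows.
    """
    flagged = []
    for text, conf in zip(data["text"], data["conf"]):
        score = float(conf)  # conf may arrive as int, float, or str
        if score < 0 or not str(text).strip():
            continue  # layout rows (blocks, lines) carry no word
        if score < threshold:
            flagged.append(text)
    return flagged


def needs_manual_review(
    data: Dict[str, Sequence],
    threshold: float = 60.0,
    max_flagged_ratio: float = 0.1,
) -> bool:
    """Flag an OCR job when too large a share of its words are low confidence."""
    words = [
        t
        for t, c in zip(data["text"], data["conf"])
        if float(c) >= 0 and str(t).strip()
    ]
    if not words:
        return True  # nothing recognized at all: send it to a human
    flagged = low_confidence_words(data, threshold)
    return len(flagged) / len(words) > max_flagged_ratio
```

With pytesseract and Pillow installed, the input would come from something like `pytesseract.image_to_data(Image.open(path), output_type=pytesseract.Output.DICT)`; the thresholds would need tuning against your own documents.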
amelius•1mo ago
It could hallucinate obscene language, something which is less likely with classic OCR.
gessha•1mo ago
The latter is more likely to get debugged.
criddell•1mo ago
If you are stuck with open source, then your options are limited.

Otherwise I'd say just use your operating system's OCR API. Both Windows and macOS have excellent APIs for this.

fxtentacle•1mo ago
Quite simply, you’re completely wrong. Modern tesseract versions include a modern LSTM AI. It can very affordably be deployed on CPU, yet its performance is competitive with much more expensive large GPU-based models. Especially if you handle a high volume of scans, chances are that tesseract will have the best bang per buck.
nicman23•1mo ago
I remember that you could not train it yourself on a font like you could in older versions. Is that still the case?
ianhawes•1mo ago
My company probably spent close to 6 figures overall creating Tesseract 5 custom models for various languages. Surya beats them all and is open source (and quite a bit faster).
booder1•1mo ago
Surya weights for the models are licensed cc-by-nc-sa-4.0. They have an exception for small companies. If your company is not small you either need to pay them or use them illegally.

Their training code and data is closed source. They are barely open weight and only inference is open source.

bluelightning2k•1mo ago
I can't say I've ever wanted to transcribe code from an image. That seems super niche.

Perhaps the specific idea is to harvest coding textbooks as training data for LLMs?

eurekin•1mo ago
I'm guessing to automatically scrape videos for future training rounds.
blharr•1mo ago
Eh, imagine poor documentation where people take screenshots of steps and don't write them out.

I can also imagine plenty of YouTube tutorials that type the code live... seems fairly useful

cAtte_•1mo ago
Pieces is (correction: used to be, prior to the AI slopification) an app for storing code snippets. so i think you can imagine the general idea of, e.g., "cool API usage example from a YouTube video, let me screenshot it!"
potato-peeler•1mo ago
> can't say I've ever wanted to transcribe code from an image. That seems super niche.

This is a nightmare for endpoint protection. Imagine rogue employees snapping pics of your proprietary codebase and then using this to reassemble it.

vaxman•1mo ago
Tesseract OCR was created by digital (DEC) in 19_8_5 (yes, 40 not four YEARs ago). Now go back and read the article and ROFL with me.
Onavo•1mo ago
The original Tesseract OCR has no neural nets. It bears little resemblance to the modern version.
vaxman•1mo ago
It's still 40.

Why not use Ollama-OCR?

krapht•1mo ago
Because I benchmarked both on my dataset and found that Tesseract was better for my use-case?
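For anyone wanting to run that kind of comparison on their own data, one common metric is character error rate (CER): the edit distance between the OCR output and a hand-checked reference, divided by the reference length. The sketch below is only an assumption about how such a benchmark could be scored, not the method used above. The function names are made up for the example, and the Levenshtein implementation is plain Python so it runs without extra packages.

```python
from typing import Iterable, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(
                min(
                    prev[j] + 1,         # delete ca
                    curr[j - 1] + 1,     # insert cb
                    prev[j - 1] + cost,  # substitute or match
                )
            )
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalized by the length of the reference text."""
    if not reference:
        return 0.0 if not hypothesis else 1.0
    return edit_distance(reference, hypothesis) / len(reference)


def mean_cer(pairs: Iterable[Tuple[str, str]]) -> float:
    """Average CER over (reference, hypothesis) pairs; 0.0 for an empty set."""
    scores = [character_error_rate(ref, hyp) for ref, hyp in pairs]
    return sum(scores) / len(scores) if scores else 0.0
```

Running `mean_cer` once per engine over the same reference set gives a single number to compare. Note that CER does not distinguish a plausible-looking hallucination from an obvious misread, so it is worth eyeballing the worst-scoring samples too.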
rafram•1mo ago
I’ve tested a bunch of vision models on particularly difficult documents (handwritten in a German script that’s no longer used), and I have yet to be impressed. They’re good at BSing to the point that you almost think they nailed it, until you realize that it’s mostly/all made-up text that doesn’t appear in the document.
yjftsjthsd-h•1mo ago
> It's still 40.

Is it, though? If the important parts of the code are new, does it matter that other parts are older or derived from older code? (Of course, I think this whole line of thought is pointless; what matters is not age, but how well it works, and tesseract generally does seem to work.)

vaxman•1mo ago
Yeah it is, it does (especially with OOP) and "ABBYY" kicked Tesseract's arse a long time ago anyway.

Maybe try OpenAI GPT-4o or Google's Document AI https://cloud.google.com/document-ai

ivanjermakov•1mo ago
What is this argument? Much software we use today was created in the 80s.
vaxman•1mo ago
Not the actual implementations heh ...I heard even Linus has dropped support for the 486. Even the infra is finally giving way...did you see the NVLINK SPINE announcement a few days ago? It's going to be deployed in Stargate UAE that was announced Thursday.
rafram•1mo ago
Unix was created in _1971_ and here we are still running processes and shells like it’s the 70s. Why not just have an LLM dream up the output?
vaxman•1mo ago
No son, Linux is not a version of Unix any more than MINIX is.

NeXTStep was real UNIX, but macOS is not.

BTW, I was taught to program in C by one of the original core Unix team members and I worked for DEC long before I could have discussed TesseractOCR with people who didn't. Keep those ignorant downvotes comin'

sushid•1mo ago
Making OCR more accurate for regular text (e.g. data extraction from documents) would be useful; not sure how useful code transcription is
bobosha•1mo ago
Has anyone tried feeding the admittedly noisy OCR-ed text, at a document level, to an LLM to make sense of it? Presumably some of the less capable ones should be quite affordable and accurate at scale as well.
lesuorac•1mo ago
OCR is the biggest XY problem.

Stop accepting PDFs and force things to use APIs ...

MoonGhost•1mo ago
Even a small upscale model trained on text should do better than a big generic one.