
How we made our OCR code more accurate

https://pieces.app/blog/how-we-made-our-optical-character-recognition-ocr-code-more-accurate
58•thunderbong•8mo ago

Comments

camtarn•8mo ago
Neat article, but I feel like I have no idea why they're doing this! Is transcribing code from images really such a big use case?
lelag•8mo ago
Maybe they want to compile the Apollo Guidance Computer source code...

https://www.softwareheritage.org/wp-content/uploads/2019/07/...

ivanjermakov•8mo ago
If it's not a joke, I think it was already digitized: https://github.com/chrislgarry/Apollo-11
dewey•8mo ago
> To best support software engineers when they want to transcribe code from images, we fine-tuned our pre-processing pipeline to screenshots of code in IDEs, terminals, and online resources like YouTube videos and blog posts.

Even with these examples that seems like a very narrow use case.

FloatArtifact•8mo ago
From an accessibility standpoint, yes. It lets you pattern-match where you are in an IDE without using an accessibility API.
SloopJon•8mo ago
The product appears to be similar to Microsoft's embattled Recall feature. In order to remember your digital life it takes frequent screenshots.
gosub100•8mo ago
I guess it would be excellent for evading security monitors to take unauthorized copies of your employer's codebase.
EvanAnderson•8mo ago
It worries me that stuff like that becoming easier will lead to wacky data pipelines being normalized (pulling display output off systems and "scraping" it to get data, of dubious quality, versus just building a proper interface). The kind of crowd that likes "low code" tools like MSFT's "Power Automate" is going to love to make Rube Goldberg nightmares out of tools like this.

It fills me with a deep sadness that we created deterministic machines and then, through laziness, exploit every opportunity to "contaminate" them with sloppy practices that make them produce output with the same fuzzy inaccuracy as human brains.

Old man yells at neural networks take: We're entering a "The Machine Stops" era where nobody is going to know how to formulate basic algorithms.

"We need to add some numbers. Let's point a camera at the input, OCR it, then feed it to an LLM that 'knows math'. Then we don't have to figure out an algorithm to add numbers."

I wish compute "cost" more so people would be forced to actually make efficient use of hardware. Sadly, I think it'll take mass societal and infrastructure collapse for that to happen. Until it does, though, let the excess compute flow freely!

jocoda•8mo ago
Asimov, "The Feeling of Power."
abc-1•8mo ago
Anything that mentions tesseract is about 10 years out of date at this point.
amelius•8mo ago
Well, at least I can apt-get install tesseract.

That doesn't hold for any of the GPU-based solutions, last time I checked.

booder1•8mo ago
5.5.0 released November last year. Still a very active project as far as I can tell and runs on CPU. Even compared to best open source GPU option it is still pretty good. VLMs work very differently and don't work as well for everything. Why is it out of date?
cbsmith•8mo ago
I don't know that that is true: https://researchify.io/blog/comparing-pytesseract-paddleocr-...

Using Surya gets you significantly better results and makes almost all the work detailed in the article largely unnecessary.

booder1•8mo ago
Surya weights for the models are licensed cc-by-nc-sa-4.0, so not free for commercial usage. Also, as far as I know, the training data is 100% unavailable. Given they use well-trained, but standard, models, it isn't really open source and barely, maybe, open weight. I kinda hate how their repo says GPL, because that is only true for the inference code. The training code is closed source.
cbsmith•8mo ago
I did not know that the training code is closed source. That is troubling.
krapht•8mo ago
I just built a pipeline with tesseract last year. What's better that is open source and runnable locally?

VLM hallucination is a blocker for my use case.

stavros•8mo ago
How is a hallucination worse than a Tesseract error?
jgalt212•8mo ago
Hallucinations are hard to detect unless you are a subject-matter expert. I don't have direct experience with Tesseract error detection.
krapht•8mo ago
Because the VLM doesn't know it hallucinated. When you get a Tesseract error you can flag the OCR job for manual review.
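
The flag-for-review idea can be sketched roughly as below. It assumes per-word confidences in the dictionary shape that pytesseract's `image_to_data(img, output_type=pytesseract.Output.DICT)` returns (parallel `text` and `conf` lists, with a `conf` of -1 for non-word layout boxes). The threshold of 60, the `max_flagged` policy, and both helper names are hypothetical illustrations, not anything stated in the thread.

```python
def low_confidence_words(data, threshold=60.0):
    """Return (word, confidence) pairs whose confidence is below `threshold`.

    `data` is assumed to hold parallel "text" and "conf" lists, the shape
    produced by pytesseract.image_to_data(..., output_type=Output.DICT).
    """
    flagged = []
    for word, conf in zip(data["text"], data["conf"]):
        conf = float(conf)  # conf may arrive as int, float, or string
        if conf < 0 or not word.strip():
            continue  # -1 marks layout boxes (page, block, line), not words
        if conf < threshold:
            flagged.append((word, conf))
    return flagged


def needs_manual_review(data, threshold=60.0, max_flagged=0):
    """Hypothetical policy: route the job to a human if too many words are shaky."""
    return len(low_confidence_words(data, threshold)) > max_flagged


# Hand-written sample in the assumed shape; a real run would get this from OCR.
sample = {
    "text": ["", "def", "ma1n", "():"],
    "conf": [-1, 96, 41, 88],
}
print(low_confidence_words(sample))  # [('ma1n', 41.0)]
print(needs_manual_review(sample))   # True
```

A low confidence score is only a signal that a word deserves a second look; a sensible threshold would have to be tuned on your own scans.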
amelius•8mo ago
It could hallucinate obscene language, something which is less likely with classic OCR.
gessha•8mo ago
The latter is more likely to get debugged.
criddell•8mo ago
If you are stuck with open source, then your options are limited.

Otherwise I'd say just use your operating system's OCR API. Both Windows and macOS have excellent APIs for this.

fxtentacle•8mo ago
Quite simply, you’re completely wrong. Modern tesseract versions include a modern LSTM AI. It can very affordably be deployed on CPU, yet its performance is competitive with much more expensive large GPU-based models. Especially if you handle a high volume of scans, chances are that tesseract will have the best bang per buck.
nicman23•8mo ago
I remember that you could not train it yourself on a font like you could in older versions. Is that still the case?
ianhawes•8mo ago
My company probably spent close to 6 figures overall creating Tesseract 5 custom models for various languages. Surya beats them all and is open source (and quite a bit faster).
booder1•8mo ago
Surya weights for the models are licensed cc-by-nc-sa-4.0. They have an exception for small companies. If your company is not small, you either need to pay them or use them illegally.

Their training code and data are closed source. They are barely open weight, and only inference is open source.

bluelightning2k•8mo ago
I can't say I've ever wanted to transcribe code from an image. That seems super niche.

Perhaps the specific idea is to harvest coding textbooks as training data for LLMs?

eurekin•8mo ago
I'm guessing to automatically scrape videos for future training rounds.
blharr•8mo ago
Eh, imagine poor documentation where people take screenshots of steps and don't write them out.

I can also imagine plenty of YouTube tutorials that type the code live... seems fairly useful

cAtte_•8mo ago
Pieces is (correction: used to be, prior to the AI slopification) an app for storing code snippets. so i think you can imagine the general idea of, e.g., "cool API usage example from a YouTube video, let me screenshot it!"
potato-peeler•8mo ago
> can't say I've ever wanted to transcribe code from an image. That seems super niche.

This is a nightmare for endpoint protection. Imagine rogue employees snapping pics of your proprietary codebase and then using this to reassemble it.

vaxman•8mo ago
Tesseract OCR was created by Digital (DEC) in 1985 (yes, 40, not four, YEARS ago). Now go back and read the article and ROFL with me.
Onavo•8mo ago
The original Tesseract OCR has no neural nets. It bears little resemblance to the modern version.
vaxman•8mo ago
It's still 40.

Why not use Ollama-OCR?

krapht•8mo ago
Because I benchmarked both on my dataset and found that Tesseract was better for my use-case?
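
One common way to run that kind of comparison is character error rate (CER): the edit distance between the OCR output and a hand-checked transcript, divided by the transcript length. The sketch below is an assumption about how such a benchmark could be scored, not a description of the commenter's actual method; the function names and the sample strings are made up for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance normalized by reference length; lower is better."""
    if not reference:
        return 0.0 if not hypothesis else 1.0
    return edit_distance(reference, hypothesis) / len(reference)


# Made-up example: ground truth vs. a typical OCR misread ("rn" -> "m", "1" -> "l").
truth = "return x + 1"
ocr_output = "retum x + l"
print(character_error_rate(truth, ocr_output))  # 0.25
```

Averaging this score per engine over the same set of labeled images gives a like-for-like number, which matters for code in particular because a single wrong character can change what a line does.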
rafram•8mo ago
I’ve tested a bunch of vision models on particularly difficult documents (handwritten in a German script that’s no longer used), and I have yet to be impressed. They’re good at BSing to the point that you almost think they nailed it, until you realize that it’s mostly/all made-up text that doesn’t appear in the document.
yjftsjthsd-h•8mo ago
> It's still 40.

Is it, though? If the important parts of the code are new, does it matter that other parts are older or derived from older code? (Of course, I think this whole line of thought is pointless; what matters is not age, but how well it works, and tesseract generally does seem to work.)

vaxman•8mo ago
Yeah it is, it does (especially with OOP) and "ABBYY" kicked Tesseract's arse a long time ago anyway.

Maybe try OpenAI GPT-4o or Google's Document AI https://cloud.google.com/document-ai

ivanjermakov•8mo ago
What is this argument? Much software we use today was created in the 80s.
vaxman•8mo ago
Not the actual implementations heh ...I heard even Linus has dropped support for the 486. Even the infra is finally giving way...did you see the NVLINK SPINE announcement a few days ago? It's going to be deployed in Stargate UAE that was announced Thursday.
rafram•8mo ago
Unix was created in 1971 and here we are still running processes and shells like it’s the 70s. Why not just have an LLM dream up the output?
vaxman•8mo ago
No, son, Linux is not a version of Unix any more than MINIX is.

NeXTStep was real UNIX, but macOS is not.

BTW, I was taught to program in C by one of the original core Unix team members and I worked for DEC long before I could have discussed TesseractOCR with people who didn't. Keep those ignorant downvotes comin'

sushid•8mo ago
Making OCR more accurate for regular text (e.g. data extraction from documents) would be useful; not sure how useful code transcription is
bobosha•8mo ago
Has anyone tried feeding the admittedly noisy OCR-ed text, at a document level, to an LLM to make sense of it? Presumably some of the less capable ones should be quite affordable and accurate at scale as well.
lesuorac•8mo ago
OCR is the biggest XY problem.

Stop accepting PDFs and force things to use APIs ...

MoonGhost•8mo ago
Even a small upscaling model trained on text should do better than a big generic one.