ClawPDF – Open-Source Virtual/Network PDF Printer with OCR and Image Support

https://github.com/clawsoftware/clawPDF
192•miles•2mo ago

Comments

criddell•2mo ago
Why use Tesseract for this? Windows' built-in OCR is so much better in my experience.
Oras•2mo ago
Yeah, tesseract has lots of issues especially identifying tables
skeeter2020•2mo ago
I suspect because of the vintage of this project. This is built on .net Framework 4.x, hence windows only.

edit: and goes deep into COM for device interfaces. Wow! blast from the past.

wolfi1•2mo ago
.Net Framework is mostly a wrapper for COM
PeterStuer•2mo ago
That's a bit of a stretch. Yes, .Net was MS's next gen of component tech following (D)COM, but it grew way past that from the start.
jeroenhd•2mo ago
Microsoft's OCR engine supports Windows 10.0.10240.0 and up. This project intends to support Windows 7 and up.

In theory you could maintain code paths for both, offering a slimmer package for Windows 10+, but that'd also cost more time and effort to maintain.

Also, not many people know Windows comes with an OCR API. It's extremely underused in my opinion.
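The two-code-path idea above could be sketched roughly like this. It is only an illustration, not how clawPDF works: the backend names and the `pick_ocr_backend` helper are hypothetical, and the only fact taken from the comment is the 10.0.10240 minimum build.

```python
from typing import Tuple

# First Windows build that ships the built-in OCR engine, per the comment above.
MIN_WINDOWS_OCR_BUILD = (10, 0, 10240)


def pick_ocr_backend(version: Tuple[int, int, int]) -> str:
    """Return a (hypothetical) backend name for a (major, minor, build) tuple.

    Tuples compare element by element, so (6, 1, 7601) (Windows 7 SP1)
    sorts below (10, 0, 10240) and falls back to Tesseract.
    """
    if version >= MIN_WINDOWS_OCR_BUILD:
        return "windows_ocr"
    return "tesseract"
```

On a Windows machine the tuple could be taken from `sys.getwindowsversion()` (its `major`, `minor`, and `build` fields). The maintenance cost mentioned above is everything behind those two names, which this sketch leaves out.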

atmanactive•2mo ago
Windows OCR is used by PowerToys.

https://github.com/microsoft/PowerToys

hoistbypetard•2mo ago
That looks really useful.

But, also, wow! Windows-only and AGPLv3 is not a combination I think I've ever seen before.

sirjaz•2mo ago
We need more things like this. I know people don't like Windows Server because it is not open source, but it is simple to use and get up and running. Also, user management is easy.
yndoendo•2mo ago
I don't like Microsoft products, such as Windows, because I have used them throughout the years and found all the edge cases where they don't hold up. Windows is too fragile with its kludge of internal designs. A corrupt registry or WMI repository breaks systems with ease. This has nothing to do with Open Source.

OSes that use plain text configuration files are easy to resurrect. Windows is fixed by reinstalling the OS. Linux and BSD are fixed by editing a config file or reinstalling a single corrupt application / library.

Example of bad versus good design is DirectX shader compilation. Windows can only perform this while the game is running. Linux with WINE can perform this without the game running. Windows will have bad FPS during the first run / scene with many games because of this.

PS. The Windows print system is really bad in industrial environments because it does not follow label markup language standards. A number of label DSLs have a print quantity setting to save memory. Want 1000 copies printed? Send one print job with print quantity set to 1000. Windows instead spools up 1000 copies of the label and sends each to the printer. This eats up the memory on printers in no time. It also breaks the ability to clear the print queue just on the printer. Extra steps are required: the Windows print job has to be canceled and then the printer's queue cleared. Otherwise the printer will receive the next 990 of the 1000 print job.
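To make the print-quantity point concrete, here is a minimal sketch that builds a single job asking the printer for N copies. It assumes ZPL as the label language (the comment does not name one), where `^PQ` sets the quantity; the `zpl_label` helper, the field position, and the font size are made up for illustration.

```python
def zpl_label(text: str, quantity: int = 1) -> str:
    """Build one ZPL job that asks the printer itself to produce `quantity` labels.

    Assumption: the target printer speaks ZPL. `text` is inserted as-is, so it
    must not contain the ZPL control characters '^' or '~'.
    """
    if quantity < 1:
        raise ValueError("quantity must be at least 1")
    return "\n".join(
        [
            "^XA",                              # start of label format
            f"^FO50,50^A0N,40,40^FD{text}^FS",  # one text field at (50, 50)
            f"^PQ{quantity}",                   # print quantity, handled on the printer
            "^XZ",                              # end of label format
        ]
    )
```

`zpl_label("PART 123", 1000)` yields a job with one `^XA`...`^XZ` pair and `^PQ1000`, so the printer holds one label definition rather than 1000 spooled copies.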

Tika2234•1mo ago
Short answer is you are not familiar with Windows but quite good with Linux. Hence the "not like" part. Plenty of Windows developers I know (that is way more than Linux developers, statistically) love Windows. The apps they designed and built are simply way better or even nonexistent on Linux. The same reason applies to them too: they don't know Linux and are near God-level tier with Windows, from MFC to assembly.
yndoendo•1mo ago
Assumptions .... I was an IT/Network Consultant for a number of years before going to product development. Started with DOS on 5 1/2 dual floppy and then Win 3.1.

An example of bad API design by Microsoft that got pushed into production is `GetPrivateProfileString`[0]. This function returns a single key value from an INI file. Each call will 1) open the file, 2) search the file for the key, 3) close the file. A better design would abstract the file so it is opened and closed only once, no matter how many key values must be read from it. It is like reading one BYTE of an IC at a time instead of batching the process.
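The "open once, read many keys" design could look something like the sketch below. It uses Python's standard `configparser` rather than any Win32 API, and the `load_ini` helper is a name invented here for illustration.

```python
import configparser
from typing import Dict


def load_ini(path: str) -> Dict[str, Dict[str, str]]:
    """Open and parse the INI file exactly once, returning {section: {key: value}}.

    Later lookups are plain dictionary reads and never touch the file again.
    Note that configparser lowercases key names by default.
    """
    parser = configparser.ConfigParser()
    with open(path, encoding="utf-8") as handle:  # the single open/close
        parser.read_file(handle)
    return {section: dict(parser[section]) for section in parser.sections()}
```

After `settings = load_ini("app.ini")`, reading ten keys costs ten dictionary lookups instead of ten open/search/close cycles. The trade-off is that changes made to the file afterwards are not seen until it is loaded again.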

NTFS cannot even free master file table space. Creating a lot of small files makes it expand, and it never shrinks.

Windows does not properly handle STDIN and STDOUT. Because DOS is an application rather than a SHELL, a person must compile an application flagged as either GUI or CMD. That is also bad design, because a command line application must be redesigned and recompiled as a GUI to hide the DOS console from showing when it runs, and that breaks all STDIN and STDOUT logging methods.

Microsoft still does not have proper offline updating. For some reason they falsely believe that everyone connects their computer to the Internet. There are lots of air-gapped machines in the automation industry. That is a big reason to move a product's host OS to BSD or Linux.

It is not fun trying to fix a corrupt registry or WMI repository. Even Microsoft sent out a Windows update to stop auto-backup of the registry because their low-end Surface laptops didn't have the hard-drive space to store them.

[0] https://learn.microsoft.com/en-us/windows/win32/api/winbase/...

sowbug•2mo ago
OT: someone please make a RPi image that "prints" a page to an eink display. I want to duct-tape an RPi Zero and a rechargeable battery to the back of a display, then be able to print recipes to it while cooking. Other people might print board-game rules or speech notes while rehearsing -- anything that you'd typically print and then throw away after brief usage.

I know I could make a PDF, sideload it to a Kindle, etc. Too many steps. I just want the display to appear as a printer on my phone.

IlikeKitties•2mo ago
Sounds pretty vibe codable, why don't you try it yourself?
xrendan•2mo ago
I have some really old code that pretty much does this, I'll see if I can find it.
xrendan•2mo ago
Ugh, I don't have it. It was from before I used git.

Basically to do this you have a cups server that exposes itself as a network printer that prints to a specified PDF directory and then you have a program watching that directory for new files and if there's a new one it opens up whatever pdf viewer you want in full screen.

Set up a shared PDF printer: https://askubuntu.com/questions/1310867/how-to-set-up-shared...
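The "program watching that directory" half could be sketched like this. It is not the lost code, just a minimal polling version using only the standard library; `new_pdfs`, `watch`, and the viewer command are hypothetical names, and it assumes the CUPS PDF printer drops finished `.pdf` files into one directory.

```python
import subprocess
import time
from pathlib import Path
from typing import List, Set


def new_pdfs(directory: Path, seen: Set[Path]) -> List[Path]:
    """Return PDFs in `directory` not seen before (oldest first) and record them."""
    fresh = sorted(
        (p for p in directory.glob("*.pdf") if p not in seen),
        key=lambda p: p.stat().st_mtime,
    )
    seen.update(fresh)
    return fresh


def watch(directory: Path, viewer_cmd: List[str], interval: float = 1.0) -> None:
    """Poll forever and launch the viewer on each newly printed PDF."""
    seen: Set[Path] = set(directory.glob("*.pdf"))  # ignore files already present
    while True:
        for pdf in new_pdfs(directory, seen):
            # Caveat: the file may still be being written when it first appears.
            subprocess.Popen(viewer_cmd + [str(pdf)])
        time.sleep(interval)
```

It would be started with something like `watch(Path("/path/to/pdf-output"), ["my-viewer", "--fullscreen"])`, where both the directory and the viewer command are placeholders for whatever your setup uses. Polling keeps the sketch dependency-free; an inotify-based watcher would react faster.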

navane•2mo ago
I always wanted to tackle this use case with receipt printers, those thermal narrow paper roll ones. But those things are freaking expensive!
colechristensen•2mo ago
Restaurants are going out of business all the time, there's your source
literalAardvark•2mo ago
Thermal paper has some pretty horrible effects on your health, I'd avoid that.
whartung•2mo ago
Just curious if the folks at CVS chart particularly high on these horrible effects, considering the no doubt thousands of feet of receipts they handle each day there.

For those unaware, at the CVS Pharmacy if you walk in and buy so much as a pack of gum, you're likely to walk out with at least 3 feet of receipt. They use them to tack on ads and coupons.

literalAardvark•1mo ago
Probably, idk if there are such specific studies.

The thermal sensitive layer contains very large amounts of BPA in a dusty form that will easily contaminate your hands.

BPA is a major endocrine disruptor. They might say BPA-free, which would be technically correct, but that just means they'll use a near identical BPA variant that isn't proven to be an endocrine disruptor yet.

Handle with care, wash your hands, don't put them in the kitchen.

turtlebits•2mo ago
You could use the "share" sheet on your phone to send to an RPI over BT via obexpushd, then process it on device -> eink display
kittikitti•2mo ago
This is an incredible idea! I really like it because it sounds so obvious after being exposed to it but I never thought of it before! I wonder what other ways we could integrate GPTs, LLMs, and other AI into the simple "Print" functionality across all our devices.
mathfailure•2mo ago
For Windows only.

Abandonware.

npodbielski•2mo ago
Looks like it is .NET Framework, so there is a possibility to port it to .NET Core and possibly use it via a dll or .so as a library inside another Linux desktop framework (or in something more portable like Flutter).
cryptonector•1mo ago
Could get ported.
johnea•2mo ago
Just another poster child of windoze suk.

Of course, CUPS[1]-based printing has had built-in print to PDF for years...

[1] Common Unix Printing System

tonyedgecombe•1mo ago
Windows has had a built in PDF driver for a long time as well.
