ClawPDF – Open-Source Virtual/Network PDF Printer with OCR and Image Support

https://github.com/clawsoftware/clawPDF
192•miles•10mo ago

Comments

criddell•10mo ago
Why use Tesseract for this? Windows' built-in OCR is so much better in my experience.
Oras•10mo ago
Yeah, Tesseract has lots of issues, especially identifying tables.
skeeter2020•10mo ago
I suspect it's because of the vintage of this project. It is built on .NET Framework 4.x, hence Windows only.

edit: and goes deep into COM for device interfaces. Wow! blast from the past.

wolfi1•10mo ago
.NET Framework is mostly a wrapper for COM.
PeterStuer•10mo ago
That's a bit of a stretch. Yes, .NET was MS's next gen of component tech following (D)COM, but it grew way past that from the start.
jeroenhd•10mo ago
Microsoft's OCR engine supports Windows 10.0.10240.0 and up. This project intends to support Windows 7 and up.

In theory you could maintain code paths for both, offering a slimmer package for Windows 10+, but that'd also cost more time and effort to maintain.

Also, not many people know Windows comes with an OCR API. It's extremely underused in my opinion.

atmanactive•10mo ago
Windows OCR is used by PowerToys.

https://github.com/microsoft/PowerToys

hoistbypetard•10mo ago
That looks really useful.

But, also, wow! Windows-only and AGPLv3 is not a combination I think I've ever seen before.

sirjaz•10mo ago
We need more things like this. I know people don't like Windows Server because it is not open source, but it is simple to use and get up and running. Also, user management is easy.
yndoendo•10mo ago
I don't like Microsoft products, such as Windows, because I have used them throughout the years and found all the edge cases where they don't hold up. Windows is too fragile, with its kludge of internal designs. A corrupt registry or WMI repository breaks systems with ease. This has nothing to do with open source.

OSes that use plain-text configuration files are easy to resurrect. Windows is fixed by reinstalling the OS. Linux and BSD are fixed by editing a config file or reinstalling a single corrupt application or library.

An example of bad versus good design is DirectX shader compilation. Windows can only perform this while the game is running. Linux with WINE can perform this without the game running. Because of this, Windows will have bad FPS during the first run or scene in many games.

PS. The Windows print system is really bad in industrial environments because it does not follow label markup language standards. A number of label DSLs have a print quantity setting to save memory: if you want 1000 copies printed, you send one print job with the print quantity set to 1000. Windows instead spools up 1000 copies of the label and sends each one to the printer. This eats up the printer's memory in no time. It also breaks the ability to clear the print queue on the printer alone: the Windows print job has to be canceled first, and then the printer's queue cleared. Otherwise the printer will receive the remaining 990 labels of the 1000-label print job.
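
The print-quantity point can be sketched in Python. This is only an illustration and assumes ZPL as the label language (the comment does not name one), where `^PQ` sets the print quantity. The helper names `label_job` and `spooled_jobs` and the label text are hypothetical.

```python
# Assumption: ZPL is used as the example label language; the comment
# above does not say which label DSL it has in mind.

def label_job(text, quantity):
    """Build one ZPL job that asks the printer itself to repeat the label."""
    return (
        "^XA"                  # start of label format
        "^FO50,50^A0N,40,40"   # field origin, then scalable font
        f"^FD{text}^FS"        # field data, then field separator
        f"^PQ{quantity}"       # print quantity, handled by the printer
        "^XZ"                  # end of label format
    )


def spooled_jobs(text, quantity):
    """The behaviour described above: one separate job per copy."""
    return [label_job(text, 1) for _ in range(quantity)]


single = label_job("PART 1234", 1000)
spooled = spooled_jobs("PART 1234", 1000)

# One short job versus a thousand jobs queued for the printer.
print(len(single), sum(len(job) for job in spooled))
```

With the quantity inside the job, cancelling that single job is enough to stop the run, which matches the queue-clearing complaint.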

Tika2234•10mo ago
Short answer: you are not familiar with Windows but quite good with Linux, hence the "not like" part. Plenty of Windows developers I know (statistically, far more than Linux developers) love Windows. The apps they designed and built are simply way better on Windows, or do not even exist on Linux. The same reason applies to them too: they don't know Linux but are near God-tier with Windows, from MFC to assembly.
yndoendo•10mo ago
Assumptions .... I was an IT/Network Consultant for a number of years before going to product development. Started with DOS on 5 1/2 dual floppy and then Win 3.1.

An example of a bad API design by Microsoft that got pushed into production is `GetPrivateProfileString`[0]. This function returns a single key value from an INI file. Each call will 1) open the file, 2) search the file for the key, and 3) close the file. A better design would be to abstract the file so it is opened and closed only once, no matter how many key values must be read from it. It is like reading one BYTE of an IC at a time instead of batching the process.
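
A minimal sketch of the "open once, read many keys" design, using Python's standard `configparser` rather than the Win32 API. The sample INI contents and the `read_settings` helper are hypothetical and only serve as illustration.

```python
import configparser
import os
import tempfile

# Write a small sample INI file (hypothetical contents, illustration only).
with tempfile.NamedTemporaryFile("w", suffix=".ini", delete=False) as f:
    f.write("[printer]\nname = clawPDF\ndpi = 300\n")
    ini_path = f.name


def read_settings(path):
    """Open and parse the file once; return {section: {key: value}}."""
    parser = configparser.ConfigParser()
    with open(path, encoding="utf-8") as fh:
        parser.read_file(fh)
    return {section: dict(parser[section]) for section in parser.sections()}


settings = read_settings(ini_path)
os.remove(ini_path)

# Every later lookup touches only memory, not the file.
print(settings["printer"]["name"], settings["printer"]["dpi"])
```

The trade-off is that the in-memory copy goes stale if another process edits the file, so a long-running program would need to decide when to reload.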

NTFS cannot even free master file table space. Creating a lot of small files makes it expand, and it never shrinks.

Windows does not properly handle STDIN and STDOUT. Because DOS is an application rather than a shell, a program must be compiled with either the GUI or the CMD flag. That is also bad design: a command-line application must be redesigned and recompiled as a GUI application to hide the DOS console when it runs, which breaks all STDIN and STDOUT logging methods.

Microsoft still does not have proper offline updating. For some reason they falsely believe that everyone connects their computer to the Internet. There are a lot of air-gapped machines in the automation industry. This is a big reason to move a product's host OS to BSD or Linux.

It is not fun trying to fix a corrupt registry or WMI repository. Even Microsoft sent out a Windows update to stop auto-backup of the registry because their low-end Surface laptops didn't have the hard-drive space to store them.

[0] https://learn.microsoft.com/en-us/windows/win32/api/winbase/...

sowbug•10mo ago
OT: someone please make a RPi image that "prints" a page to an eink display. I want to duct-tape an RPi Zero and a rechargeable battery to the back of a display, then be able to print recipes to it while cooking. Other people might print board-game rules or speech notes while rehearsing -- anything that you'd typically print and then throw away after brief usage.

I know I could make a PDF, sideload it to a Kindle, etc. Too many steps. I just want the display to appear as a printer on my phone.

IlikeKitties•10mo ago
Sounds pretty vibe codable, why don't you try it yourself?
xrendan•10mo ago
I have some really old code that pretty much does this, I'll see if I can find it.
xrendan•10mo ago
Ugh, I don't have it. It was from before I used git.

Basically, to do this you run a CUPS server that exposes itself as a network printer and prints to a specified PDF directory. Then you have a program watching that directory for new files; when one appears, it opens it full screen in whatever PDF viewer you want.

Set up a shared PDF printer: https://askubuntu.com/questions/1310867/how-to-set-up-shared...
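
The directory-watching half of that setup could look roughly like the sketch below. It is not the lost original code: the function names `new_pdfs` and `watch`, the polling approach, and the `open_viewer` callback are all assumptions made for illustration, and the CUPS side is left out entirely.

```python
import os
import tempfile
import time


def new_pdfs(directory, seen):
    """Return PDF paths in `directory` not yet in `seen`, and record them."""
    found = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if name.lower().endswith(".pdf") and path not in seen:
            seen.add(path)
            found.append(path)
    return found


def watch(directory, open_viewer, poll_seconds=1.0):
    """Poll forever, handing each new PDF to `open_viewer` (any callable)."""
    seen = set()
    while True:
        for path in new_pdfs(directory, seen):
            open_viewer(path)
        time.sleep(poll_seconds)


# Small self-check with a temporary directory instead of a real spool folder.
with tempfile.TemporaryDirectory() as spool:
    seen = set()
    open(os.path.join(spool, "recipe.pdf"), "wb").close()
    first = new_pdfs(spool, seen)
    second = new_pdfs(spool, seen)
    print(len(first), len(second))
```

In real use, `open_viewer` would launch the PDF viewer or e-ink renderer of choice. Plain polling can pick up a file that is still being written, so a real version would probably wait until the file size stops changing.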

navane•10mo ago
I always wanted to tackle this use case with receipt printers, those narrow thermal paper roll ones. But those things are freaking expensive!
colechristensen•10mo ago
Restaurants are going out of business all the time, there's your source
literalAardvark•10mo ago
Thermal paper has some pretty horrible effects on your health, I'd avoid that.
whartung•10mo ago
Just curious if the folks at CVS chart particularly high on these horrible effects, considering the no doubt thousands of feet of receipts they handle each day there.

For those unaware, at the CVS Pharmacy if you walk in and buy so much as a pack of gum, you're likely to walk out with at least 3 feet of receipt. They use them to tack on ads and coupons.

literalAardvark•10mo ago
Probably, idk if there are such specific studies.

The thermal sensitive layer contains very large amounts of BPA in a dusty form that will easily contaminate your hands.

BPA is a major endocrine disruptor. They might say BPA-free, which would be technically correct, but that just means they'll use a near identical BPA variant that isn't proven to be an endocrine disruptor yet.

Handle with care, wash your hands, don't put them in the kitchen.

turtlebits•10mo ago
You could use the "share" sheet on your phone to send to an RPI over BT via obexpushd, then process it on device -> eink display
kittikitti•10mo ago
This is an incredible idea! I really like it because it sounds so obvious after being exposed to it, but I never thought of it before! I wonder what other ways we could integrate GPTs, LLMs, and other AI into the simple "Print" functionality across all our devices.
mathfailure•10mo ago
For Windows only.

Abandonware.

npodbielski•10mo ago
Looks like it is .NET Framework, so there is a possibility of porting it to .NET Core and possibly using it as a library (via .dll or .so) inside another Linux desktop framework, or in something more portable like Flutter.
cryptonector•10mo ago
Could get ported.
johnea•10mo ago
Just another poster child of windoze suk.

Of course, CUPS[1]-based printing has had built-in print to PDF for years...

[1] Common Unix Printing System

tonyedgecombe•10mo ago
Windows has had a built in PDF driver for a long time as well.
