frontpage.
newsnewestaskshowjobs

Open Source @Github

fp.

Open in hackernews

A PDF that changes based on who is reading

https://sgaud.com/texts/pdf
33•SarthakGaud•1h ago

Comments

jheimark•1h ago
This looks really interesting. Optimizing for humans vs. agents feels like the new wave of Desktop vs. Mobile (where mobile won) - agents are going to win even faster.

Where is the repo? It's mentioned but I can't find it.

jheimark•1h ago
is it this one? https://github.com/iminoaru/adaptivepdf
gpvos•50m ago
Looks like it, the author's name matches.
gpvos•1h ago
I would suggest changing the title to the actual title of the article: Adaptive PDFs.

Assuming the program works, the PDF will not actually look different to me than to anyone else looking at it, so there is nothing that "changes based on who is reading". It is just that text extraction, a wholly different (and much fuzzier) process than viewing the PDF, and something that the same person can do, will now return structured (Markdown) text. (One might say the PDF changes based on how you are reading it.) A great idea, IMHO.

dredmorbius•24m ago
Email the mods: <https://news.ycombinator.com/item?id=40493683>.

hn@ycombinator.com

mc32•17m ago
Having slightly different versions would certainly be a help in identifying leakers of certain kinds of documents to increase the odds of identifying leakers. That would be of interest to some kinds of organizations or departments within organizations.
gnunicorn•53m ago
Just because everything is a potential threat vector now: doesn't this also mean you could easily put AI specific malicious instructions into the PDF that the regular human would never notice?

Like the "white text between the lines that only appears when copy-pasted"-hack that some professors have been doing in their exercises to their students to include pink elephants in the output and stuff. But worse. Just thinking of a electricity bill pdf you provide as proof of address to some company that uses an LLM to extraxt that address and pre-process that doc. But instead we can command it to do something else that a regular human wouldn't even ever notice...

Just a thought

jexp•50m ago
Shouldn’t it be possible since forever to put machine readable source information into PDF metadata. It’s more a problem of the tools and programs generating the PDFs.

We spend millions turning structured information into PDFs and billions to extract the same data from a printer rendering language

neonmagenta•30m ago
Exactly. But we have no real coordination or uniform application in how we're creating PDFs across all these programs so we always end up with a fun mix of what will and wont be static, scalable, searchable
vjvjvjvjghv•12m ago
[delayed]
iLoveOncall•44m ago
I'd be more interested in the contrary. A PDF that ensures it's only readable by humans.

I guess the exact same technique can actually be used.

vjvjvjvjghv•11m ago
[delayed]
al_hag•25m ago
In the US, publicly funded organizations are required to code their PDF with semantic structure to support machine access by screen readers and other assistive technologies [1], [2].

Given the low adherence to accessibility standards e.g. in academic publishing [3], LLM parsing needs creating a commercial incentive for comparable structured access would be marvelous.

[1] https://www.section508.gov/create/pdfs/common-tags-and-usage...

[2] https://pdfa.org/resource/tagged-pdf-best-practice-guide-syn...

[3] https://arxiv.org/html/2410.03022v1

Xotic007•20m ago
Cool but it's relying on every extractor honoring that replacement-text property which you said yourself is hit or miss. So it's clean markdown until someone runs it through a tool that ignores it and quietly gets the messy version and has no idea that happened.
Tomte•16m ago
> LaTeX, Chrome's print-to-PDF, most export tools don't produce tags

LaTeX is actually one of the best ways to create tagged PDF: https://latex3.github.io/tagging-project/tagging-status/ and https://www.overleaf.com/learn/latex/An_introduction_to_tagg...

fsckboy•5m ago
>This didn't matter when humans were the only readers. But now most PDFs end up in an LLM.

but it did matter, a lot. the PDF format was originally proprietary and was designed to be proprietary and to disallow casual text extraction. I just didn't like the way you glossed over that, "it was OK that people for over 30 years were not given any way for the information they were given to be unshackled, but now it matters because our AI overlords were prefer that so we must change things!"

Are Americans Too Old?

https://www.newyorker.com/culture/open-questions/are-americans-too-old
1•littlexsparkee•26s ago•0 comments

New GreatXML Exploit Bypasses Windows BitLocker via Recovery Partition XML Files

https://github.com/MSNightmare/GreatXML
2•uukelele•1m ago•0 comments

SealedKeys – Zero-knowledge team password manager with post-quantum crypto

https://sealedkeys.com
1•michaelgartlan•1m ago•0 comments

Shares in Elon Musk's SpaceX surge after biggest IPO

https://www.reuters.com/world/us/live-elon-musks-spacex-set-stock-market-trading-after-worlds-big...
1•layer8•1m ago•0 comments

Ten Years of Having a Personal Website

https://blog.greenpants.net/ten-years-of-personal-websites-from-student-to-senior/
1•Greenpants•2m ago•0 comments

System Call Stack Alignment

https://www.humprog.org/~stephen/blog/2026/06/12/#system-calling-alignment
2•matt_d•6m ago•0 comments

Run local agentic AI on the Mac using MLX (WWDC 2026) [video]

https://developer.apple.com/videos/play/wwdc2026/232/
2•sebiw•7m ago•0 comments

Joan Didion: Staking Out California (1979)

https://www.nytimes.com/1979/06/10/books/didion-calif.html
1•jxmorris12•8m ago•0 comments

Four ways AI is making your life more expensive

https://www.washingtonpost.com/technology/2026/06/06/inflation-is-being-driven-up-by-huge-investm...
1•1vuio0pswjnm7•9m ago•0 comments

No updates on Cryptome since June 2025?

https://cryptome.org/
3•joering2•11m ago•3 comments

Factorio: Flip, Flow, and Fresh Paint

https://factorio.com/blog/post/fff-442
1•ibobev•12m ago•0 comments

I Am Not a Reverse Centaur

https://blog.miguelgrinberg.com/post/i-am-not-a-reverse-centaur
2•ibobev•13m ago•0 comments

"Don't You Just Upload It to ChatGPT?"

https://correresmidestino.com/dont-you-just-upload-it-to-chatgpt/
3•speckx•13m ago•1 comments

Tesla Full Self Driving uses bicycle lane in official Denmark approval video

https://politiken.dk/danmark/forbrug/biler/art10875514/Allerede-12-sekunder-inde-i-PR-videoen-beg...
3•Veserv•16m ago•0 comments

Judge Punishes 4 Lawyers After Catching Both Sides Using A.I. In Lawsuit

https://www.nytimes.com/2026/06/09/us/ai-lawyers-sanctioned-mississippi.html
4•thm•16m ago•0 comments

Great Reshuffling of the Agentic Era: The 6 Career Archetypes

https://aidoses.substack.com/p/great-reshuffling-of-the-agentic
1•ryanrad•17m ago•0 comments

Treating pancreatic tumours may have revealed cancer's master switch

https://www.economist.com/science-and-technology/2026/06/12/treating-pancreatic-tumours-may-have-...
1•rndsignals•17m ago•0 comments

The Agentic Payments Map

https://www.fintechbrainfood.com/p/the-agentic-payments-map
1•AnhTho_FR•19m ago•0 comments

Hacker News actively blocking SpaceX IPO submissions

https://news.ycombinator.com
2•curldevnull•19m ago•1 comments

The machine can't be held accountable. You still can

https://pragmaticbuilder.substack.com/p/the-machine-cant-be-held-accountable
2•msolujic•20m ago•0 comments

How's Your Attention Span?

https://alessandracodinha.substack.com/p/hows-your-attention-span
2•gmays•20m ago•0 comments

Show HN: For the messy stage of research, built the Cognir Research Ontology

https://cognir-research.netlify.app/docs
1•sailpvp998•22m ago•0 comments

Uruky: Kagi alternative, EU-based private search engine

https://yeechie.nl/uruky-kagi-alternative-eu-based-private-search-engine
2•speckx•22m ago•0 comments

Ask HN: Influence of Legend of Zelda in Backrooms Movie?

1•dieselgate•22m ago•0 comments

Oracle Shares Tumble Amid Pricey Data-Center Build-Out

https://www.wsj.com/business/earnings/oracle-reports-higher-profit-on-surging-cloud-revenue-5f7d25eb
2•1vuio0pswjnm7•23m ago•0 comments

Reckless: Competitive chess engine written in Rust

https://github.com/codedeliveryservice/Reckless
1•dpcx•23m ago•0 comments

Fail loudly: a plea to stop hiding bugs

https://alejo.ch/3he
1•afc•24m ago•1 comments

Chili peppers of the world: cultivars, species, and heat

https://www.notesfromtheroad.com/desertmexico/chili-peppers.html
4•fanf2•24m ago•0 comments

Show HN: Sketchlog – 100M events compressed to 93 KB using streaming sketches

1•BALAVIGNESH321•25m ago•0 comments

Department of War Publishes Third Release of UAP Files on War.gov/UFO

https://www.war.gov/News/Releases/Release/Article/4515408/department-of-war-publishes-third-relea...
1•bookofjoe•29m ago•0 comments