https://old.reddit.com/r/law/comments/1ptlms6/some_epstein_f...
https://krassencast.com/p/breaking-we-just-unredacted-the-ep...
We Just Unredacted the Epstein Files
https://news.ycombinator.com/item?id=46364121
I tried to ascertain, but am not certain, this is the original blog source. Maybe they made some prior X posts.
PDF is an absurdly complex file format. It's part of the reason there is no single "good" PDF reader, just a lot of mediocre PDF readers that are all terrible in their own way. Which is a topic for another day.
There are several ways to remove data in a PDF:
- Remove the data. This is much harder than it sounds. Many PDF tools won't let you change the content of a PDF, not because it isn't possible, but because you'll likely massively screw up the formatting, and the tools don't want to deal with that.
- Replace the data. This is what all the "blackout" tools do: find "A" and replace it with "🮋". This is effective and doesn't break formatting since it's a 1-to-1 replacement. The problem with "replacing" is that not every PDF tool works the same way, and some, instead, just change the foreground and background color to black; it looks nearly the same, but the power of copy-and-paste still functions.
- Then you have the computer illiterate, who think changing the foreground and background color to black is good enough anyway.
Not kidding - it's a ~~~billion dollar market haha
Make an MVP/Show HN :-)
Should take... a weekend tops? ;) PDF is crazy and scary
As far as I understand it, at its core, pdf is just a stream of instructions that is continually modifying the document. You can insert a thousand objects before you start the next word in a paragraph. And this is just the most basic stuff. Anything on a page can be anywhere in the stream. I don't know if you can go back and edit previous pages, you might have a shot at least trying to understand one page at a time.
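To make the "stream of instructions" idea concrete, here is a minimal Python sketch that tokenizes a hand-written, heavily simplified page content stream and lists its operators in drawing order. The stream contents and the helper names (`tokenize`, `operators`) are assumptions made up for illustration; real PDFs usually compress their streams and use many more operators, hex strings, and nested parentheses that this sketch does not handle.

```python
import re

# Hand-written, simplified content stream (an assumption for illustration).
# PDF uses postfix notation: operands first, then the operator.
CONTENT = b"""
BT
/F1 12 Tf
72 720 Td
(Hello) Tj
ET
0 0 0 rg
100 700 50 10 re
f
BT
/F1 12 Tf
130 720 Td
(world) Tj
ET
"""

# A string literal without nested parentheses, a /Name, or any other bare token.
TOKEN = re.compile(rb"\((?:\\.|[^\\()])*\)|/[^\s()/]+|[^\s()/]+")


def tokenize(stream: bytes) -> list[bytes]:
    """Split the stream into string, name, number, and operator tokens."""
    return TOKEN.findall(stream)


def operators(stream: bytes) -> list[str]:
    """Return only the operator names, in the order they are executed."""
    ops = []
    for tok in tokenize(stream):
        if tok.startswith(b"(") or tok.startswith(b"/"):
            continue  # string or name operand
        text = tok.decode("ascii")
        try:
            float(text)  # numeric operand
        except ValueError:
            ops.append(text)
    return ops


print(operators(CONTENT))
```

Note how the rectangle (`rg`, `re`, `f`) sits between two text blocks: nothing forces the words of a sentence to be adjacent in the stream.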
Did you know you can have embedded XML in PDFs? You can have a paper form with all the data filled in and include an XML version of that for any computer systems that would like an easier way to read it.
> - Remove the data. This is much harder than it sounds. Many PDF tools won't let you change the content of a PDF, not because it isn't possible, but because you'll likely massively screw up the formatting, and the tools don't want to deal with that.
Compared to other formats this is actually relatively easy in a PDF since the way the text drawing operators work they don't influence the state for arbitrary other content. A lot of positioning in a PDF is absolute (or relative to an explicitly defined matrix which has hardcoded values). Usually this makes editing a PDF harder (since when changing text the related text does not adapt automatically), but when removing data it makes it much easier since you can mostly just delete it without affecting anything else. (There are exceptions for text immediately after the removed data, but that's limited and relatively easy to control.)
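A rough sketch of that point, in Python: if every text run carries its own absolute `Tm` matrix, emptying one show-text operand leaves the positions of the other runs untouched. The stream below is hand-written and uncompressed, and `remove_text` is a hypothetical helper, not any tool's real API. Real files often split text across several operators or encode it with font-specific character codes, so a plain byte match like this would not be enough there.

```python
import re

# Simplified, uncompressed content stream (illustrative assumption).
# Each run is placed with its own absolute text matrix (Tm).
PAGE = (
    b"BT /F1 12 Tf 1 0 0 1 72 720 Tm (Name: ) Tj ET\n"
    b"BT /F1 12 Tf 1 0 0 1 110 720 Tm (Jane Doe) Tj ET\n"
    b"BT /F1 12 Tf 1 0 0 1 72 700 Tm (Date: 2019) Tj ET\n"
)


def remove_text(stream: bytes, secret: bytes) -> bytes:
    """Empty the operand of every Tj that shows exactly `secret`."""
    pattern = re.compile(rb"\(" + re.escape(secret) + rb"\)(\s*Tj)")
    return pattern.sub(rb"()\1", stream)


cleaned = remove_text(PAGE, b"Jane Doe")
print(cleaned.decode("ascii"))
```

The bytes for the removed run are gone from the stream, while the `Tm` lines for "Name: " and "Date: 2019" are byte-for-byte identical to the original.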
> - Replace the data. This is what all the "blackout" tools do: find "A" and replace it with "🮋". This is effective and doesn't break formatting since it's a 1-to-1 replacement.
That's actually rather tricky in PDFs since they usually contain embedded subset fonts, and these usually do not have "🮋" as part of the subset. Doing this would also break the layout, since "🮋" has a different width than most letters in a typical font, so it would not lead to fewer formatting issues than the previous option. The exception is stretching the "🮋" for each letter to match the original glyph's dimensions, but then the stretched characters allow the text to be recovered.
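Why widths leak information can be sketched in a few lines of Python. Everything here is made up for illustration: the `WIDTHS` table is not taken from any real font, and the candidate names and the 0.5 pt tolerance are arbitrary assumptions. The idea is only that, in a proportional font, the total width of a blacked-out run narrows down which words could be underneath.

```python
# Hypothetical glyph advance widths in 1/1000 em (not from a real font).
WIDTHS = {
    "a": 500, "b": 550, "c": 450, "e": 450, "i": 250,
    "l": 250, "m": 800, "o": 550, "r": 350, "y": 500,
}


def text_width(word: str, font_size: float) -> float:
    """Width of `word` in points for the hypothetical font above."""
    return sum(WIDTHS[ch] for ch in word) * font_size / 1000


def candidates_for_box(box_width: float, names: list[str],
                       font_size: float = 12, tolerance: float = 0.5) -> list[str]:
    """Return the names whose rendered width matches the redaction box."""
    return [n for n in names if abs(text_width(n, font_size) - box_width) <= tolerance]


names = ["alice", "bob", "mallory"]
for n in names:
    print(n, text_width(n, 12))

# A box 22.8 pt wide at 12 pt only fits one of the three candidates.
print(candidates_for_box(22.8, names))
```

With a short list of plausible names, even a plain box sized to the hidden text can be enough to pick one out, which is why careful redaction tools avoid preserving per-glyph widths.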
> The problem with "replacing" is that not every PDF tool works the same way, and some, instead, just change the foreground and background color to black; it looks nearly the same, but the power of copy-and-paste still functions.
PDF does not have a concept of a background color. If it looks like a background color in PDF, you have a rectangle drawn in one color and something in the foreground color in front of it. What you usually see in badly redacted PDF files is exactly this, but in the opposite order: someone just draws a black box on top of the characters. You could argue that this is smarter since it would still work even if someone changed the colors, but of course, PDF is a vector format. If you just add a rectangle, someone else can remove it again. (And copy & paste doesn't care about your rectangle either.)
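Here is a small Python sketch of that failure mode, assuming a hand-written, uncompressed content stream where the text is drawn first and a filled black rectangle is painted over it afterwards. The name in the stream is invented, and `extract_text` and `strip_rectangles` are hypothetical helpers that only understand this simplified syntax; they are not a substitute for a real PDF library.

```python
import re

# Simplified content stream (illustrative assumption): text first,
# then a black filled rectangle painted over the same area.
REDACTED = (
    b"BT /F1 12 Tf 72 720 Td (Witness: John Smith) Tj ET\n"
    b"0 0 0 rg\n"
    b"120 716 70 14 re\n"
    b"f\n"
)

# "(...) Tj" with no nested parentheses.
STRING = re.compile(rb"\(((?:\\.|[^\\()])*)\)\s*Tj")
# "x y w h re f": define a rectangle path, then fill it.
RECT_FILL = re.compile(rb"[-\d.]+\s+[-\d.]+\s+[-\d.]+\s+[-\d.]+\s+re\s+f\b")


def extract_text(stream: bytes) -> list[str]:
    """What copy-and-paste effectively sees: text operands, ignoring graphics."""
    return [m.decode("latin-1") for m in STRING.findall(stream)]


def strip_rectangles(stream: bytes) -> bytes:
    """Undo the 'redaction' by deleting the filled rectangles."""
    return RECT_FILL.sub(b"", stream)


print(extract_text(REDACTED))
print(strip_rectangles(REDACTED).decode("ascii"))
```

The extraction never looks at the rectangle at all, and deleting the `re`/`f` pair gives back the page as it was before the box was added. A real redaction has to remove the text operand itself.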
Anyway, if you click on a "redaction", you're clicking on the box and can't select the text underneath, but if you just highlight the text around it, you can copy all the original text.
It's a bizarre oversight.
The only safe way for journalists is to paraphrase what the document said and to say "an unnamed source claims that ..." and to guarantee with your reputation, and the reputation of your publisher, that you are being faithful to what the original source said. For even better results, combine multiple sources.
Unfortunately paraphrasing things and taking editorial responsibility have both been deprecated in favour of rereleasing press releases in the house style, so it's difficult to get the actual journalism these days.
You can't possibly know that!
(Sorry, watching Grinch, Jim Carrey spoke through me).
The names of the powerful people involved were NOT supposed to be censored. All of those names except Bill Clinton's were redacted, to protect Trump and everybody else involved in the scandal except Bill Clinton. But especially to protect Trump.
- Paul Manafort court filing (U.S., 2019) Manafort’s lawyers filed a PDF where the “redacted” parts were basically black highlighting/boxes over live text. Reporters could recover the hidden text (e.g., via copy/paste).
- TSA “Standard Operating Procedures” manual (U.S., 2009) A publicly posted TSA screening document used black rectangles that did not remove the underlying text; the concealed content could be extracted. This led to extensive discussion and an Inspector General review.
- UK Ministry of Defence submarine security document (UK, 2011) A MoD report had “redacted” sections that could be revealed by copying/pasting the “blacked out” text—because the text was still present, just visually obscured.
- Apple v. Samsung ruling (U.S., 2011) A federal judge’s opinion attempted to redact passages, but the content was still recoverable due to the way the PDF was formatted; copying text out revealed the “redacted” parts.
- Associated Press + Facebook valuation estimate in court transcript (U.S., 2009) The AP reported it could read “redacted” portions of a court transcript by cut-and-paste (classic overlay-style failure). Secondary coverage notes the mechanism explicitly.
A broader “history of failures” compilation (multiple orgs / years) The PDF Association collected multiple incidents (including several above) and describes the common failure mode: black shapes drawn over text without deleting/sanitizing the underlying content. https://pdfa.org/wp-content/uploads/2020/06/High-Security-PD...
What happens in a court case when this occurs? Does the receiving party get to review and use the redacted information (assuming it’s not gagged by other means) or do they have to immediately report the error and clean room it?
Edit: after reading up on this it looks like attorneys have strict ethical standards to not use the information (for what little that may be worth), but the Associated Press was a third party who unredacted public court documents in a separate Facebook case.
I know and am friends with a lot of lawyers. They're pretty ruthless when it comes to this kind of thing.
Legally, I would think both parties get copies of everything. I don't know if that was the case here.
https://www.justice.gov/multimedia/Court Records/Matter of the Estate of Jeffrey E. Epstein, Deceased, No. ST-21-RV-00005 (V.I. Super. Ct. 2021)/2022.03.17-1 Exhibit 1.pdf
And yes, I've heard of Hanlon's Razor haha
The CIA, for example, is entirely above the law.
It's exactly equivalent to a dictatorship by the head of the CIA, unless the CIA is effectively answerable to some other authority despite not being answerable to the law, and then it is equivalent to a dictatorship by that higher authority.
I believe Trump will manufacture a crisis before he's out of office in a bid to maintain control. I believe he will have learned from Bush Jr. that a simple war isn't good enough, and it needs to be a genuine emergency.
I believe he'll do whatever he can to make that happen. Native born terrorist, or war with a close country, or absolutely over the top financial crash. Something awful that lets him invoke some obscure rule that lets him stay in power with congressional approval - he'll just skip the congressional approval part like he already does.
Wasn't too hard to put together a quick graph of the past decade for the U.S. using the World Press Freedom Index (relative ranking and score) - an annual ranking of 180 countries published by Reporters Without Borders that measures the level of press freedom.
The fact is that the US has taken leaps and bounds toward being a dictatorship in the last decade, and if we don't stop this trend we will be a dictatorship. The only thing saying "we're not a dictatorship" does to the conversation is minimize the very real danger we're in. At least saying "we're a dictatorship" communicates the danger and urgency of the situation.
I do agree that a more nuanced conversation would be more honest, but it's quite difficult to foster nuanced conversations, and I don't think your comment is fostering a more nuanced conversation.
It's not like a few more stories of Trump raping $whomever are going to move the needle at all, especially with how the media is on board with burying negative coverage of the regime.
Also if you're wondering how this activity isn't some kind of abuse of government resources, keep in mind that thanks to the Supreme Council's embrace of the Unitary Executive Theory (ie Sparkling Autocracy), covering up evidence about Donald Trump raping under-aged sex trafficking victims is now an official priority of the United States Government.
For context, lawyers deal with this all the time. In discovery, there is an extensive document ("doc") review process to determine if documents are responsive or non-responsive. For example, let's say I subpoenaed all communication between Bob and Alice between 1 Jan 2019 and 1 Jan 2020 in relation to the purchase of ABC Inc as part of litigation. Every email would be reviewed and if it's relevant to the subpoena, it's marked as responsive, given an identifier and handed over to the other side. Non-responsive communication might not be eg attorney-client communications.
It can go further and parts of documents can be viewed as non-responsive and otherwise be blacked out eg the minutes of a meeting that discussed 4 topics and only 1 of them was about the company purchase. That may be commercially sensitive and beyond the scope of the subpoena.
Every such redaction and exclusion has to be logged and a reason given for it being non-responsive where a judge can review that and decide if the reason is good or not, should it ever be an issue. Can lawyers find something damaging and not want to hand it over and just mark it non-responsive? Technically, yes. Kind of. It's a good way to get disbarred or even jailed.
My point with this is that lawyers, which the Department of Justice is full of, are no strangers to this process so should be able to do it adequately. If they reveal something damaging to their client this way, they themselves can get sued for whatever the damages are. So it's something they're careful about, for good reason.
So in my opinion, it's unlikely that this is an act of resistance. Lawyers won't generally commit overt illegal acts, particularly when the only incentive is keeping their job and the downside is losing their career. It could happen.
What I suspect is happening is that all the good lawyers simply aren't engaging in this redaction process because they know better, so the DoJ had to wheel out some bad and/or unethical ones who would.
What they're doing is in blatant violation of the law passed last month, and good lawyers know it.
There's a lot of this going on at the DoJ currently. Take the recent political prosecutions of James Comey, Letitia James, etc. No good prosecutor is putting their name to those indictments, so the administration was forced to bring in incompetent stooges who would. This included former Trump personal attorneys who got improperly appointed as US Attorneys. This got the Comey indictment thrown out.
The law that Ro Khanna and Thomas Massie co-sponsored was sweeping and clear about what needs to be released. The DoJ is trying to protect both members of the administration and powerful people, some of whom are likely big donors and/or foreign government officials or even heads of state.
That's also why this process is so slow I imagine. There are only so many ethically compromised lackeys they can find.
Two things come to mind:
* Some things Indyke did fall outside the scope of lawyer-client privilege. It would be bad for certain people to get him on a stand and force him to spill the beans. He was never interviewed re: Epstein [1]
* He's a very talented lawyer, insofar as a competent lawyer with, at least, extreme discretion, is talented.
[1] https://www.finance.senate.gov/imo/media/doc/letter_to_doj-f...
Every slide towards authoritarianism is gradual, there is no announcement.
montroser•16h ago