Relatedly, there's an extremely good online archive of important past cases, but because it disallows crawlers in robots.txt (https://www.bailii.org/robots.txt), not many people know about it. Personally I would prefer it if all reporting on legal cases linked to the official transcript, but seemingly none of the parties involved finds it in their interest to make that work.
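For anyone wondering how a blanket crawler ban looks from the crawler's side, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt text and the example.org URL below are made up for illustration; they are not a copy of the real file linked above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that bans every crawler from every path.
# This is an assumed example, NOT the contents of the real file.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical case URL on a placeholder domain.
blocked = is_allowed(
    SAMPLE_ROBOTS_TXT, "ExampleBot", "https://example.org/cases/2020/1.html"
)
print("crawler allowed:", blocked)
```

A well-behaved search engine that sees rules like these will not index the pages, which is why such an archive stays hard to discover.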
https://x.com/CPhilpOfficial/status/2021295301017923762
https://xcancel.com/CPhilpOfficial/status/202129530101792376...
This kind of logic does more disservice than people realize. You can combat bigotry towards immigrants (issue #1) without covering up for criminal immigrants (issue #2) out of fear of increasing issue #1 among the natives. Covering up only brings more resentment and bigotry.
Or it should be sealed for X years and then public record. Where X might be 1 in cases where you don't want to hurt an ongoing investigation, or 100 if it's someone's private affairs.
Nothing that goes through the courts should be sealed forever.
We should give up on the idea of databases which are 'open' to the public, but you have to pay to access, reproduction isn't allowed, records cost pounds per page, and bulk scraping is denied. That isn't open.
How about rate limited?
If load on the server is a concern, make the whole database available as a torrent. People who run scrapers tend to prefer that anyway.
This isn't someone's hobby project run from a $5 VPS - they can afford to serve 10k qps of readonly data if needed, and it would cost far less than the salary of 1 staff member.
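To make the "rate limited" suggestion above concrete, here is a minimal token-bucket sketch. It is one common way to cap per-client request rates, not a description of how any court service actually does it; the class name, parameters, and numbers are all hypothetical.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(
        self,
        rate: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.clock = clock          # injectable so the bucket can be tested
        self.tokens = capacity      # start with a full bucket
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if the request may proceed."""
        now = self.clock()
        # Refill for the time elapsed since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Hypothetical policy: 2 requests per second with a burst of 2.
fake_time = [0.0]
bucket = TokenBucket(rate=2.0, capacity=2.0, clock=lambda: fake_time[0])
first_burst = [bucket.allow() for _ in range(3)]   # third request is rejected
fake_time[0] = 0.5                                  # half a second passes
after_wait = bucket.allow()                         # one token has refilled
```

A server would keep one bucket per client (for example per API key) and answer rejected requests with HTTP 429, so humans browsing are unaffected while bulk scrapers are slowed rather than banned.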
I’d then ask OpenAI to be open too since open is open.
It's like having to search through sand: it's bad enough when you can use a sieve, but then they tell you that you can only use your bare hands, and your search efforts are made useless.
This is not a new tactic btw and pretty relevant to recent events...
Free to ingest and make someone's crimes a permanent part of AI datasets, resulting in forever-convictions? No thanks.
AI firms have shown themselves to be playing fast and loose with copyrighted works. A teenager shouldn't have their permanent AI profile become "shoplifter" because they committed a crime at 15 that would otherwise have been expunged after a few years.
But why shouldn't a 19 year old shoplifter have that on their public record? Would you prevent newspapers from reporting on it, or stop users posting about it on public forums?
Yes
That is the entire point of having courts, since the time of Hammurabi. Otherwise it's back to the clan system, where justice is made by avenging blood.
Making and using any "profiles" of people is an entirely different thing than having court rulings accessible to the public.
Instead, we should make it illegal to discriminate based on criminal conviction history. Just like it is currently illegal to discriminate based on race or religion. That data should not be illegal to know, but illegal to use to make most decisions relating to that person.
If a conviction is something minor enough that might be expungable, it should be private until that time comes. If the convicted person hasn't met the conditions for expungement, make it part of the public record, otherwise delete all history of it.
Then there are cases like Japan, where not only companies, but also landlords, will make people answer a question like: "have you ever been part of an anti-social organization or committed a crime?" If you don't answer truthfully, that is a legal reason to reject you. If you answer truthfully, then you will never get a job (or housing) again.
Of course, there is a whole world outside of the United States and Japan. But these are the two countries I have experience dealing with.
However, there are also jobs which legally require enhanced vetting checks.
The idea that society is required to forget crime is pretty toxic honestly.
1000x this. It’s one thing to have a felony for manslaughter. It’s another to have a felony for drug possession. In either case, if enough time has passed, and they have shown that they are reformed (long employment, life events, etc) then I think it should be removed from consideration. Not expunged or removed from record, just removed from any decision making. The timeline for this can be based on severity with things like rape and murder never expiring from consideration.
There needs to be a statute of limitations just like there is for reporting the crimes.
What I’m saying is, if you were stupid after your 18th birthday and caught a charge peeing on a cop car while publicly intoxicated, I don’t think that should be a factor when you're 45 and applying for a job after going to college, having a family, having a 20 year career, etc.
Or humans should be allowed to access the public record for free, with fees charged to scrapers.
They have the ability to seal documents until set dates and to deal with digital archival and retrieval.
I suspect some of this is that it's a complete shit show and they want to bury it quickly, or avoid having to pay up for an expensive vendor migration.
It's not about any post-case information.
England has a genuinely independent judiciary. Judges and court staff do not usually attempt to hide from journalists stuff that journalists ought to be investigating. On the other hand, if it's something like an inquest into the death of a well-known person which would only attract the worst kind of journalist they sometimes do quite a good job of scheduling the "public" hearing in such a way that only family members find out about it in time.
A world government could perhaps make lots of legal records public while making it illegal for journalists to use that material for entertainment purposes, but we don't have a world government. If the authorities in one country were to provide easy access to all the details of every rape and murder in that country, then so-called "tech" companies in another country would use that data for entertainment purposes. I'm not sure what to do about that, apart, obviously, from establishing a world government (which arguably we need anyway in order to handle pollution and other things that are a "tragedy of the commons", but I don't see it happening any time soon).
The counter claim by the government is that this isn't "the source of truth" being deleted but rather a subset presented more accessibly by a third party (CourtsDesk) which has allegedly breached privacy rules and the service agreement by passing sensitive info to an AI service.
Coverage of the "urgent question" in parliament on the subject here:
House of Commons, Courtsdesk Data Platform Urgent Question
Then they start jailing people for posts.
Then they get rid of juries.
Then they get rid of public records.
What are they trying to hide?
In other countries, interference with the right to a fair trial would have led to widespread protest. We don't hold our government to account, and we reap the consequences of that.
Obviously the government Ministry of Justice cannot make other parts of government more popular in a way that appeases political opponents, so the logical solution is to clamp down on open justice.
Though I'm not sure stopping this service achieves that.
Also - even in the case that somebody is found guilty - there is a fundamental principle that such convictions have a lifetime, after which they stop showing up on police searches etc.
If some third party (not applicable in this case) holds all court cases forever in a searchable format, it fundamentally breaches this right to be forgotten.
"The government has cited a significant data protection breach as the reason for its decision - an issue it clearly has a duty to take seriously."
https://www.nuj.org.uk/resource/nuj-responds-to-order-for-th...
They don't have a budget for that. And besides, it might be an externalized service, because self hosting is so 90s.
ETA: They didn't ship data off to e.g. ChatGPT. They hired a subcontractor to build them a secure AI service.
Details in this comment:
https://news.ycombinator.com/item?id=47035141
leading to this:
https://endaleahy.substack.com/p/what-the-minister-said
The government is behaving disgracefully.
Traced what? Innuendo is not a substitute for information.
https://www.tremark.co.uk/moj-orders-deletion-of-courtsdesk-...
They raise the interesting point that "publicly available" doesn't necessarily mean it's free to store/process etc:
> One important distinction is that “publicly available” does not automatically mean “free to collect, combine, republish and retain indefinitely” in a searchable archive. Court lists and registers can include personal data, and compliance concerns often turn on how that information is processed at scale: who can access it, how long it is kept, whether it is shared onward, and what safeguards exist to reduce the risk of harm, especially in sensitive matters.
> ... the agreement restricts the supply of court data to news agencies and journalists only.
> However, a cursory review of the Courtsdesk website indicates that this same data is also being supplied to other third parties — including members of @InvestigatorsUK — who pay a fee for access.
> Those users could, in turn, deploy the information in live or prospective legal proceedings, something the agreement expressly prohibits.
> HMCTS acted to protect sensitive data after CourtsDesk sent information to a third-party AI company.
(statement from the UK Ministry of Justice on Twitter; CourtsDesk had run the database)
but it's unclear how much this was an excuse to remove transparency and how much it is actually related to worry about how AI could misuse this information
you _really_ shouldn't be allowed to train on information without having a copyright license explicitly allowing it
"publicly available" isn't the same as "anyone can do whatever they want with it", just anyone can read it/use it for research