Cases aren't ordered randomly. Obvious cases are scheduled at the end of a session, right before breaks.
From the paper:
“we find that the LLM adheres to the legally correct outcome significantly more often than human judges”
That presupposes that a “legally correct” outcome exists.
The Common Law, which is the foundation of federal law and the law of 49 of the 50 states, is a “bottom-up” legal system.
Legal principles flow from the specific to the general. That is, judges decide specific cases on the merits of each individual case, and general principles are derived from many specific examples.
This is different from the Civil Law used in most of Europe, which is top-down. Rulings in specific cases are derived from statutory principles.
In the US system, there isn’t really a “correct legal outcome”.
Common Law relies heavily on “jurisprudence”. That is, we have a system that defers to the opinions of “important people”.
So, there isn’t a “correct” legal outcome.
Remember the article that described LLMs as lossy compression and warned that if LLM output dominated the training set, it would lead to accumulated lossiness? Like a jpeg of a jpeg.
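The degradation is easy to demo. Here's a toy Python sketch of the “jpeg of a jpeg” effect, assuming Pillow is installed and some photo.jpg sits on disk; the quality schedule and generation count are arbitrary choices of mine, not anything from the article:

    # Re-encode an image through lossy JPEG compression repeatedly and
    # watch the drift from the original accumulate.
    import io
    from PIL import Image, ImageChops, ImageStat

    img = Image.open("photo.jpg").convert("RGB")
    original = img.copy()

    for generation in range(1, 51):
        buf = io.BytesIO()
        # Nudging the quality each pass keeps the encoder from settling
        # into a fixed point, so every generation discards a bit more.
        img.save(buf, format="JPEG", quality=70 + generation % 7)
        buf.seek(0)
        img = Image.open(buf).convert("RGB")
        if generation % 10 == 0:
            drift = ImageStat.Stat(ImageChops.difference(original, img)).mean
            print(f"gen {generation}: mean per-channel drift = {drift}")

The analogy to training: each model generation is a lossy re-encode of the previous one's output.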
I am comforted that folks still are trying to separate right from wrong. Maybe it’s that effort and intention that is the thread of legitimacy our courts dangle from.
Until this administration forces OpenAI to comply with secret government LLM training protocols, that is...
To be clear, federal judges do have their paychecks signed by the federal government, but they are lifetime appointees and their pay can never be withheld or reduced. You would need to design an equivalent system of independence.
The problem with an AI is similar: what built-in biases does it have? Even if it were simply trained on the entire legal history, that would bias it towards historical norms.
I feel like this is a really poor take on what justice really is. The law itself can be unjust. Empowering a seemingly “unbiased” machine with biased data, or even just assuming that justice can be obtained from a “justice machine”, is deeply flawed.
Whether you like it or not, the law is about making a persuasive argument and is inherently subject to our biases. It's a human abstraction that allows us to have some structure and rules in how we go about things. It's not something that is inherently fair or just.
Also, I find the entire premise of this study ludicrous. The common law of the US is based on case law. The statement in the abstract that “Consistent with our prior work, we find that the LLM adheres to the legally correct outcome significantly more often than human judges. In fact, the LLM makes no errors at all,” is pretentious applesauce. It is offensive that this argument is being made seriously.
Multiple US legal doctrines that are now accepted, and that form the basis of how the Constitution is interpreted, were just made up out of thin air; the LLMs are now consuming them to form the basis of their decisions.
How do we even begin to establish that? This isn't a simple “more accidents” or “fewer accidents” question; it's about the vague notion of “justice”, which varies from person to person, much less from case to case.
hah. Sure.
> Subjects were told that they were a judge who sat in a certain jurisdiction (either Wyoming or South Dakota), and asked to apply the forum state’s choice of law rule to determine whether Kansas or Nebraska law should apply to a tort case involving an automobile accident that took place in either Kansas or Nebraska.
Oh. So it "made no errors at all" with respect to one very small aspect of a very contrived case.
Hand it conflicting laws. Pit it against federal and state disagreements. Let's bring in some complicated Fourth Amendment issues.
"no errors."
That's the Chicago school for you. Nothing but low hanging fruit.
I'm not expressing an opinion on when or how AI should contribute to legal proceedings. I certainly believe that judges need to respond both to the law and to the specific nuances that the law can never code for.
As mentioned elsewhere in the thread, judges focus their efforts on thorny questions of law that don't have clear yes or no answers (they still have clerks prepare memos on these questions, but that's where they do their own reasoning versus just spot checking the technical analysis). That's where the insight and judgement of the human expert comes into play.
The title of the paper is "Silicon Formalism: Rules, Standards, and Judge AI"
When they say “legally correct”, they are clear that they mean under a surface-level, formal reading of the law. They are using it to characterize the way judges vs. GPT-5 treat legal decisions, and they leave it as an open question which is better.
The conclusion of the paper is "Whatever may explain such behavior in judges and some LLMs, however, certainly does not apply to GPT-5 and Gemini 3 Pro. Across all conditions, regardless of doctrinal flexibility, both models followed the law without fail. To the extent that LLMs are evolving over time, the direction is clear: error-free allegiance to formalism rather than the humans’ sometimes-bumbling discretion that smooths away the sharper edges of the law. And does that mean that LLMs are becoming better than human judges or worse?"
But yeah AI slop and all that...
It responds: “Since it’s only 100 meters away (about a 1-minute walk), I’d suggest walking — unless there’s a specific reason not to. Here’s a quick breakdown: ...”
While Claude gets it: “Drive it — you're going there to wash the car anyway, so it needs to make the trip regardless.”
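The thread doesn't give the exact prompt, so this reconstruction is a guess, but the comparison is easy to rerun with the official openai and anthropic Python clients and API keys in your environment:

    # Ask both vendors the same walk-vs-drive question. The prompt
    # wording and model names are my assumptions, not the originals.
    from openai import OpenAI
    from anthropic import Anthropic

    prompt = ("The car wash is 100 meters from my house and my car is "
              "dirty. Should I walk or drive there?")

    gpt = OpenAI().chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    print("GPT:", gpt.choices[0].message.content)

    claude = Anthropic().messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=300,
        messages=[{"role": "user", "content": prompt}],
    )
    print("Claude:", claude.content[0].text)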
Idk I'd rather have a human judge I think.
Digging a bit deeper, the actual paper seems to agree: "For the sake of consistency, we define an “error” in the same way that Klerman and Spamann do in their original paper: a departure from the law. Such departures, however, may not always reflect true lawlessness. In particular, when the applicable doctrine is a standard, judges may be exercising the discretion the standard affords to reach a decision different from what a surface-level reading of the doctrine would suggest"
These were technical rulings on matters of jurisdiction, not subjective judgments on fairness.
"The consistency in legal compliance from GPT, irrespective of the selected forum, differs significantly from judges, who were more likely to follow the law under the rule than the standard (though not at a statistically significant level). The judges’ behavior in this experiment is consistent with the conventional wisdom that judges are generally more restrained by rules than they are by standards. Even when judges benefit from rules, however, they make errors while GPT does not.
I don't trust AI in its current form to make that sort of distinction. And sure, you can say the laws should be written better, but as long as the laws are written by humans, that will simply not be the case.
I don't see how an AI / LLM can cope with this correctly.
https://en.wikipedia.org/wiki/COMPAS_(software)
So yes, a judge can let a stupid teenager off on charges of child-porn selfies. But without the resources, they are more likely to be told by a public defender to cop to a plea.
And laws with ridiculous outcomes like that are not always accidental. Often they are deliberate choices made by lawmakers to enact an agenda they cannot achieve by direct means. In the case of making children culpable for child porn of themselves, the laws might come about because the direct abstinence legislation they wanted could not be passed, so they needed other means to scare horny teens.
You could have a team of agents exchange views, and maybe the protocol would even allow for settling cases automatically. The more agents you have, the more nuance you capture.
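For concreteness, a minimal sketch of what such a protocol could look like, with query_model as a hypothetical stand-in for a real LLM call and a simple majority vote as the settlement rule:

    import random
    from collections import Counter

    def query_model(agent_id: int, case_summary: str) -> str:
        # Placeholder for a real LLM call; each "agent" here returns a
        # canned ruling so the protocol can be run end to end.
        return random.choice(["Kansas law applies", "Nebraska law applies"])

    def settle(case_summary: str, n_agents: int = 5) -> str:
        rulings = [query_model(i, case_summary) for i in range(n_agents)]
        ruling, votes = Counter(rulings).most_common(1)[0]
        # Settle automatically only on a clear majority; otherwise escalate.
        return ruling if votes > n_agents // 2 else "no consensus: escalate to a human"

    print(settle("Tort case: automobile accident near the Kansas/Nebraska border"))

More agents means more samples of the model's “opinion”, but also more ways for the tally to split.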
In both cases, lawmakers must adapt the law to reflect what people think is "just". That's why there is jury duty in some countries: to involve people in the ruling, so they see that it's just.
Agree 100%. This is also the only form of argument in favor of capital punishment that has ever made me stop and think about my stance. I.e. we have capital punishment because without it we may get vigilante justice that is much worse.
Now, whether that's how it would actually play out is a different discussion, but it did make me stop and think for a moment about the purpose of a justice system.
(I mean - people get killed in prison sometimes, I suppose, but it’s not really like vigilante justice on the streets is causing a breakdown in society in Australia, say…)
Sentencing is a different thing.