Slightly tangential question - they said they've protected the public test set with a strong copyleft license to prevent training private models on it.
Does that actually work? Hasn't AI training so far simply ignored all license and copyright restrictions?
I definitely trust the totally private dataset more.
stephendause•52m ago
This is a key question in my opinion. It's one of the things that make benchmarking the SWE capabilities of LLMs difficult. It's usually impossible to know whether the LLM has seen a problem before, and coming up with new, representative problem sets is time-consuming.
CuriouslyC•24m ago
You can just fuzz names and switch to a whitespace-compact representation.
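A minimal sketch of what that could look like in Python (the helper names and the toy snippet are mine, not from any benchmark): rewrite non-keyword identifiers to opaque tokens and collapse whitespace, so verbatim-memorized problem text no longer matches. A real pipeline would need to preserve semantics (skip builtins and imported names, keep indentation), which this doesn't attempt.

    import io, keyword, re, tokenize

    def fuzz_names(source: str) -> str:
        # Rename every non-keyword identifier to an opaque token (fz_0, fz_1, ...).
        # Deliberately rough: it also renames builtins and attribute names, which
        # defeats verbatim recall but would break executable code.
        mapping = {}
        out = []
        for tok_type, tok_str, *_ in tokenize.generate_tokens(io.StringIO(source).readline):
            if tok_type == tokenize.NAME and not keyword.iskeyword(tok_str):
                mapping.setdefault(tok_str, f"fz_{len(mapping)}")
                tok_str = mapping[tok_str]
            out.append((tok_type, tok_str))
        return tokenize.untokenize(out)

    def compact_whitespace(source: str) -> str:
        # Collapse runs of spaces/tabs and drop blank lines. Indentation-sensitive
        # code would need a gentler pass; this just shows the representation shift.
        lines = (re.sub(r"[ \t]+", " ", ln).strip() for ln in source.splitlines())
        return "\n".join(ln for ln in lines if ln)

    snippet = "def total_price(items):\n    return sum(i.cost for i in items)\n"
    print(compact_whitespace(fuzz_names(snippet)))

Whether surface-level rewrites like this are enough is debatable, since a model may still recognize the underlying problem structure rather than the exact text.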
stri8ed•47m ago
Not a chance. Even if American companies did abide by it, there's no reason Chinese companies would. And good luck definitively proving that a model was trained on it.
candiddevmike•43m ago
Sir, we've already ingested 503,377 copyleft licensed codebases, I don't think the training set can take anymore!
WhitneyLand•39m ago
Recently it was pointed out that models were sometimes cheating on SWE-Bench Verified by scanning parts of the repo that weren't meant to be visible.
siliconc0w•1h ago