It also:
* Bakes in the assumption that there are no internal mechanisms to be discovered ("Each environment is a mixture of multivariate Gaussian distributions")
* Ignores the possibility that their model of falsification is inadequate (they just test more near points with high error).
* Does a lot of "hopeful naming", which makes the results easy to misinterpret as saying more about like-named things in the real world than they actually do.
They are analyzing a toy model of science. The details are in figure 1. They have a search space that has a few Gaussians like
f(x,y,z) = A0 * exp(-(x-x0)^2-(y-y0)^2-(z-z0)^2) + A1 * exp(-(x-x1)^2-(y-y1)^2-(z-z1)^2)
but maybe in more than 3 dimensions and maybe with more than 2 Gaussians.
They want the agents to find all of the Gaussians.
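To make the formula concrete, here is a minimal Python sketch of that kind of landscape. It assumes isotropic Gaussians of unit width, exactly as in my formula, which may not match the paper's setup. The names `landscape`, `centers` and `amplitudes`, and the example values, are mine.

```python
import math

def landscape(point, centers, amplitudes):
    """Sum of unit-width isotropic Gaussians, one per (center, amplitude) pair.

    Works in any number of dimensions, as long as every center has the
    same length as the point.
    """
    total = 0.0
    for center, amplitude in zip(centers, amplitudes):
        squared_distance = sum((p - c) ** 2 for p, c in zip(point, center))
        total += amplitude * math.exp(-squared_distance)
    return total

# Hypothetical example: two Gaussians in 3 dimensions.
centers = [(0.0, 0.0, 0.0), (3.0, 1.0, -2.0)]
amplitudes = [1.0, 0.5]

value_at_first_peak = landscape(centers[0], centers, amplitudes)
value_far_away = landscape((50.0, 50.0, 50.0), centers, amplitudes)
```

Far from every center the value is essentially zero, so an agent that lands there learns almost nothing about where the peaks are.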
It's somewhat similar to a maximization problem, which is easier. There are many strategies for that, from gradient ascent to random sampling to a million other variants. I like simulated annealing.
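For reference, this is roughly what I mean by simulated annealing, as a bare-bones Python sketch for the maximization version of the problem. The two-peak `toy_landscape`, the geometric cooling schedule and all the parameter values are my assumptions for illustration, not anything from the paper, and a real implementation would need tuning.

```python
import math
import random

def toy_landscape(point):
    # Hypothetical two-peak landscape in 3 dimensions (unit-width Gaussians).
    peaks = [((0.0, 0.0, 0.0), 1.0), ((3.0, 1.0, -2.0), 0.5)]
    return sum(
        amplitude * math.exp(-sum((p - c) ** 2 for p, c in zip(point, center)))
        for center, amplitude in peaks
    )

def simulated_annealing(f, start, steps=20000, step_size=0.3,
                        t_start=1.0, t_end=1e-3, seed=0):
    """Maximize f with random moves; worse moves are accepted with
    probability exp(delta / temperature), which shrinks as we cool."""
    rng = random.Random(seed)
    current = list(start)
    current_value = f(current)
    best, best_value = list(current), current_value
    for i in range(steps):
        # Geometric cooling from t_start down towards t_end.
        temperature = t_start * (t_end / t_start) ** (i / steps)
        candidate = [x + rng.gauss(0.0, step_size) for x in current]
        candidate_value = f(candidate)
        delta = candidate_value - current_value
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_value = candidate, candidate_value
            if current_value > best_value:
                best, best_value = list(current), current_value
    return best, best_value

start = (1.0, 1.0, 1.0)
best_point, best_value = simulated_annealing(toy_landscape, start)
```

Note that this only chases one maximum per run. Finding all the Gaussians means restarting from many places, which is one reason the full mapping problem is harder than maximization.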
They claim that the best method is random sampling, but that only works when the search space is small. It breaks quite fast for high dimensional problems, unless the Gaussians are so big that they cover most of the space, and perhaps I'm being too optimistic. Add noise and overlapping Gaussians and the problem gets super hard.
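A back-of-the-envelope way to see how fast it breaks: treat "hitting a Gaussian" as landing inside a ball of some radius around its center, and compute what fraction of a box that ball fills. This is my own simplification (a hard ball instead of a smooth bump, fully inside the box), and `ball_fraction` and the numbers are hypothetical.

```python
import math

def ball_fraction(dim, radius, side):
    """Fraction of a dim-dimensional cube of the given side that is filled
    by a ball of the given radius (assumed to fit entirely inside the cube).

    Uses V = pi^(dim/2) / Gamma(dim/2 + 1) * radius^dim, computed in logs
    so that the tiny values in high dimensions stay readable.
    """
    log_ball = ((dim / 2) * math.log(math.pi)
                - math.lgamma(dim / 2 + 1)
                + dim * math.log(radius))
    log_cube = dim * math.log(side)
    return math.exp(log_ball - log_cube)

# Same ball radius and box side, increasing dimension.
for dim in (3, 10, 54):
    fraction = ball_fraction(dim, radius=1.0, side=10.0)
    print(dim, fraction, "expected uniform samples per hit:", 1.0 / fraction)
```

With uniform random sampling, the expected number of samples before the first hit is one over that fraction, so it grows extremely fast with the number of dimensions.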
Let's get to a realistic example: all the molecules with 6 Carbons and 12 Hydrogens. Let's try to find all of them and their stable 3D configurations. This is chemistry from the first year of university, perhaps earlier, not cutting edge science.
You have 18 atoms, so 18 * 3 = 54 dimensions, and the surface of -energy has a lot of mountain ranges and nasty stuff, most of them very sharp. Let's try to find the local maxima of -energy, which is much easier than the full map. These are the stable molecules, which (usually) have names.
* There is a cyclic one with 6 Carbons, where each Carbon has 2 Hydrogens: https://en.wikipedia.org/wiki/Cyclohexane Note that it actually has two different 3D variants.
* There is one with a cycle of 5 Carbons and 1 Carbon attached to the cycle: https://en.wikipedia.org/wiki/Methylcyclopentane
* There are variants with shorter cycles, but I'm not sure how stable they are and Wikipedia has no page for them.
* There are also 3 linear versions, where the 6 Carbons form a wavy line and there is a double bond in one of the steps: https://en.wikipedia.org/wiki/1-Hexene I'm not sure why the other two versions have no page in Wikipedia. I think they should be stable, but sometimes it's not a local maximum, or the local maximum is too shallow and the double bond jumps and the Hydrogens reorganize.
* And there may be other nasty stuff, take a look at the complete list https://en.wikipedia.org/wiki/C6H12.
And don't try to make the complete list of molecules that also include a few Nitrogens, because the number of molecules explodes exponentially.
So this random sampling method they propose does not even work for an elementary Chemistry problem.
https://www.wired.com/2010/06/ff-sergeys-search/
He went backwards and started with just collecting an absurd amount of data. Later while talking to a researcher he could confirm years of research with a "simple" search in his database.
selridge•1h ago
MarkusQ•1h ago
(This problem is not just limited to social scientists. I think you could, for example, construct a plausible objection to dark matter as an "explanation" that just "saves appearances" on the same basis.)
selridge•32m ago
What’s interesting about this paper is the suggestion that perhaps empiricism could do with a soft blur.
One might even invoke KJ Healy’s “Fuck Nuance” here as well.