https://cdn.aaai.org/IAAI/2004/IAAI04-019.pdf
It has 490 citations.
DARPA has a whole program named after it: https://www.darpa.mil/research/programs/explainable-artifici...
The real question is whether we can get some insight as to how exactly it's able to do this. For convolutional neural networks it turns out that you can isolate and study the behavior of individual circuits and try to understand what "traditional image processing" function they perform, and that gives some decent intuition: https://distill.pub/2020/circuits/ - CNNs become less mysterious when you decompose them into edge detectors, curve detectors, shape classifiers, etc.
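To make the "edge detector" framing concrete, here's a toy sketch: a hand-written Sobel-style kernel is roughly what the first-layer circuits in the Distill article resemble. The 5x6 test image and the pure-Python convolution are my own illustration, not code from the article.

```python
def convolve2d(image, kernel):
    """Valid-mode 2D convolution (no padding), pure Python."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out

# Sobel kernel that responds to vertical edges (dark-to-bright, left to right).
sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

# Synthetic image: dark left half, bright right half -> one vertical edge.
image = [[0, 0, 0, 1, 1, 1] for _ in range(5)]

response = convolve2d(image, sobel_x)
# The filter fires only in the windows straddling the edge; each output
# row comes out as [0, 4, 4, 0].
```

A learned first-layer CNN filter isn't literally Sobel, but visualizing its weights often reveals the same oriented-edge structure.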
For LLMs it's a bit harder, but Anthropic did some research in this vein.
Is this even something that's possible with current tech? Like, surely cats have some facial features that can be used to uniquely identify them? It would be cool to have a global database of all cats that users would be able to match their photos against. Imagine taking a picture of a cat you see on the street, and it immediately tells you the owner's details and whether it's missing.
[1]: https://tanelpoder.com/posts/catbench-vector-search-query-th...
Maybe not with you ;)
Tricks include facial alignment + cropping and very strong constraints on orientation to make sure you have a good frontal image (apps will give users photo alignment markers). Otherwise it's a standard visual search: run a face-extraction model to get the crop, warp it to standard key points, compute an embedding for the crop, store that in a database, and do a nearest-neighbour lookup.
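The lookup step at the end can be sketched in a few lines. This is a toy illustration, not any startup's pipeline: the 3-dimensional vectors and cat names are made up, and a real face-crop model would emit embeddings with hundreds of dimensions.

```python
import math

def normalise(v):
    """Scale a vector to unit length so dot product = cosine similarity."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def nearest(query, database):
    """Return the (name, similarity) pair with the highest cosine similarity."""
    q = normalise(query)
    best_name, best_sim = None, -1.0
    for name, emb in database.items():
        sim = sum(a * b for a, b in zip(q, normalise(emb)))
        if sim > best_sim:
            best_name, best_sim = name, sim
    return best_name, best_sim

# Hypothetical stored embeddings, one per enrolled cat.
database = {
    "whiskers": [0.9, 0.1, 0.0],
    "mittens":  [0.1, 0.8, 0.2],
    "tom":      [0.0, 0.2, 0.9],
}

# A query crop whose embedding lands near "whiskers".
name, sim = nearest([0.85, 0.15, 0.05], database)
```

At scale you'd swap the linear scan for an approximate nearest-neighbour index, but the matching logic is the same.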
There are a few startups doing this. Also look at PetFace which was a benchmark released a year or so ago. Not a huge amount of work in this area compared to humans, but it's of interest to people like cattle farmers as well.
Impressed that it can do as well as it does, I just find that amusing.
Anyway, it’s 40 years later and I just read this article and said, “Oh! Now I get it.” A little too late for Dr. Hippe’s class.
Seems extremely prescient…
My favorite work on digging into the models to explain this is Golden Gate Claude [0]. Basically, the folks at Anthropic went digging into the many-layer, many-parameter model and found the features associated with the Golden Gate Bridge. Dialing that feature up to 11 made Claude bring up the bridge in response to literally everything.
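The "dialing it up" step amounts to adding a scaled feature direction to a hidden activation. A minimal sketch, assuming toy 4-dimensional values; in the actual work the direction comes from a sparse-autoencoder feature, not anything hand-picked like this:

```python
def steer(activation, feature_direction, strength):
    """Shift the activation along the feature direction, elementwise."""
    return [a + strength * d for a, d in zip(activation, feature_direction)]

# Invented toy values for illustration only.
hidden = [0.2, -0.5, 1.0, 0.3]
golden_gate_dir = [1.0, 0.0, 0.5, -0.2]  # hypothetical feature direction

boosted = steer(hidden, golden_gate_dir, strength=11.0)  # "up to 11"
```

Applied at every forward pass, a shift like this biases the model toward whatever concept the feature encodes.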
I'm super curious to see how much of this "intuitive" model of neural networks can be backed out effectively, and what that does to how we use it.
cwmoore•5h ago
I'm struck that classification tasks can so snappily render clear categories out of such fuzziness.