The Symbol Grounding Problem
Think your AI understands the meanings of words? Or understands anything at all? Guess again. It probably doesn’t!
There’s a big issue inherent in trying to make artificial minds that understand like a human does. It’s called the Symbol Grounding Problem (S. Harnad, “The Symbol Grounding Problem,” Physica D 42, 1990, pp. 335–346).
TLDR: How can understanding in an AI be made intrinsic to the system, rather than just parasitic on the meanings in the minds of the developers / trainers?
If you think about the AI systems that are around today, is there a point where you can disconnect the human engineers providing meaning during development or training and say: alright, here this AI actually generates meaning that it knows, with no causal dependencies on data and operations provided by humans?
But where do understanding and meaning ultimately come from? Meaning is a human thing (maybe for other animals too, but let’s focus on humans). You might say it’s a human activity. And one major gateway into how meaning works inside the mind is through language and the other signs that we routinely create and interact with.
Semiotics
opening a door requires us to approach it with ‘preconceptions’: we need to recognize that it belongs to the linguistic category of ‘doors’ (based on what these are for and how they work). It also requires us to relate it to the social system, for instance with reference to whether you own it, whether you are entitled to open it, and so on. Such prior knowledge forms part of a complex interpretive framework that ‘mediates’ the experience and routinely guides our expectations and behaviour (usually beyond our conscious awareness). (Chandler, D. Semiotics: The Basics, 3rd ed. Routledge, 2017.)
Let’s say you’re driving along and see a road sign indicating speed bumps. You slow your car down and make your way over the speed bumps. Simple, right?
But a lot just went on. You saw the physical metal road sign (hopefully, for the sake of your car). And you somehow interpreted the paint on the sign as a symbol, which in turn triggered something in your mind to react and slow down.
There was some kind of understanding there. There was meaning. You might say, well, that’s just a knee-jerk trained response to the stimulus of the sign. Alright, then let’s make sure by adding that you recognize what the sign refers to and you actively search for visual confirmation of the speed bumps. Maybe you even discuss your distaste for speed bumps with a passenger.
The process of understanding this is based on a lot of structures already formed in your mind that let you know what it is. The meaning is built up from smaller and more primitive mental structures, at least that’s my conjecture. And those structures link through space and time to more bodily and body-environment primitives.
Argument: A road sign that says “speed bump” refers to physical speed bumps. So isn’t that meaning embedded right there? Speed bumps exist; they are right there. The sign is just a helper; the bumps are there even if nobody reads the sign, right?
No, it would be magic if, say, a blank-slate human (like Mr. Bean when he drops in during the opening titles of his show) could be presented, with no prior experience, with a road sign and somehow get the link to this other object he’s never known: the speed bump. There’s an interpretation happening. Signs do not in themselves transmit messages.
In all of these interactions you’ll notice that human minds are part of the process, and in a nontrivial way.
So we’ve got signs going on “out there.” But what if we follow this network of signs—or symbols—all the way “in there,” inside people’s minds?
It seems there has to be some kind of semantic network that has a path down into the mind. Or perhaps you might think of it as through the mind.
Semantics
Understanding, like seeing, is grasping this in relation to that. (Sousanis, N. Unflattening. Harvard University Press, 2015.)
You perceive an object (the sign about speed bumps) and interpret it as a message. This message is then linked to concepts, memories, something that we might say is “in” the mind. Maybe it’s representational, maybe it’s not. But somehow it triggers a concept of what speed bumps are, which in turn probably depends on your entire life history and evolution of your species.
Well that escalated quickly.
There’s a deepness to the mind, you might say. A mind most likely has semantics because of a built-up history of connections that boil down to more and more primitive ones, ones that are, or once were, very bodily.
And these probably span a gradient and/or combination of mental parts: some learned during your development as a baby, some learned as a child and adult, and some that are instinctual. This has been called the altricial-precocial spectrum (Sloman, A. & Chappell, J., “The Altricial-Precocial Spectrum for Robots,” Proceedings IJCAI’05, pp. 1187–1192, Edinburgh, 2005).
And in turn you, as a part of this environment, can then go on to find the actual speed bump objects and recognize them as the ones the sign was referring to.
If you imagine this as a network of networks, just this “simple” speed bump example results in information flowing through a whole lot of nodes and making multiple cycles through an environment-body-mind system. “Nodes” is an abstraction I’m using so I can attempt to explain this, but it might also be useful for Strong AI architectures.
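To make the “nodes” abstraction a little more concrete, here is a minimal toy sketch in Python. Everything in it is hypothetical and made up for illustration: the node names, the “sign:/concept:/memory:/primitive:” naming convention, and the idea of representing dependencies as a plain dictionary. It is not a proposed architecture, just one way to picture a sign linking down toward bodily primitives.

# Toy "network of networks": each node maps to the more primitive
# structures it depends on (all names are hypothetical illustrations).
concept_graph = {
    "sign:speed_bump_sign":   ["concept:speed_bump", "concept:road_rules"],
    "concept:speed_bump":     ["memory:car_jolt", "concept:bump"],
    "concept:road_rules":     ["memory:driving_lessons"],
    "concept:bump":           ["primitive:touch", "primitive:proprioception"],
    "memory:car_jolt":        ["primitive:touch", "primitive:vestibular"],
    "memory:driving_lessons": ["primitive:vision", "primitive:hearing"],
    # Bodily / environmental primitives have no further dependencies here.
    "primitive:touch": [],
    "primitive:proprioception": [],
    "primitive:vestibular": [],
    "primitive:vision": [],
    "primitive:hearing": [],
}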
Anyway, a critical part of this whole multi-media organic network and information flow is: a concept has to hit “ground” during the process for understanding to happen in the human mind.
The speed bump concept hit ground as the information flowed through the mind, probably multiple times, and triggered structures, possibly arranged in a complicated network, that had developed during the human’s experience and were in turn based on evolution, the environment, and the interactions of a lifetime.
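Continuing the toy sketch above (and still purely hypothetical), one crude way to picture “hitting ground” is as a reachability question: starting from the sign node, can the flow of information reach any bodily primitive at all? The helper below, hits_ground, is my own illustrative name, not anything from Harnad.

from collections import deque

def hits_ground(graph, start):
    # Toy check: does a symbol/concept node eventually reach a node
    # tagged as a bodily/environmental primitive in the sketch above?
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        deps = graph.get(node, [])
        if not deps and node.startswith("primitive:"):
            return True  # reached a bodily primitive: the concept "hit ground"
        for dep in deps:
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return False

print(hits_ground(concept_graph, "sign:speed_bump_sign"))  # True in this toy graph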
“Symbol Grounding” is perhaps an unfortunate name, because the “symbol” at any stage may not be a symbol in any normal way. But it’s a way to start describing a system. And it might be an ok abstraction.
In computing we use abstractions all the time, but underneath there can be something totally different. E.g., a “symbol” at one plane of abstraction might be several layers of increasingly distributed nodes and agents and whatnot. And a digital symbol in a computer could, if you go far enough down, be based on an analog signal from something in the real world. Networking and the Internet itself are classic examples of layered protocols, where things that seem reliable and usable at the top become increasingly primitive and unreliable as you go down.
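As a hedged illustration of that layering (a toy, not how any particular chip or protocol actually works): the tidy “symbol” at the top can be nothing more than a threshold applied to noisy analog measurements a couple of layers down. The function names here are made up for the example.

import random

def analog_voltage():
    # Bottom layer: a noisy analog value "from the real world" (simulated here).
    return random.uniform(0.0, 5.0)

def digital_bit(v, threshold=2.5):
    # Next layer up: the messy analog value becomes a clean-looking bit.
    return 1 if v >= threshold else 0

def symbol(bits):
    # Top layer: a handful of bits gets treated as a tidy, reliable "symbol".
    return "HIGH" if sum(bits) > len(bits) / 2 else "LOW"

bits = [digital_bit(analog_voltage()) for _ in range(8)]
print(symbol(bits))  # the "symbol" sits several layers above anything analog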
Avoiding the Problem
What is at issue here is really quite simple, although it is easily overlooked in the layers and layers of overinterpretation that are characteristic of a condition I’ve dubbed getting lost in the “hermeneutic hall of mirrors,” which is the illusion one creates by first projecting an interpretation onto something (say, a cup of tea-leaves or a dream) and then, when challenged to justify that interpretation, merely reading off more and more of it, as if it were answerable only to itself. (Harnad, S. (1990). “Lost in the Hermeneutic Hall of Mirrors.” Invited commentary on: Michael Dyer, “Minds, Machines, Searle and Harnad.” Journal of Experimental and Theoretical Artificial Intelligence 2: 321–327.)
Some might think that enough auto-generated architecture, an essentially statistical approach, will capture something in reality in the same way that human minds effectively do, and will therefore have real understanding. E.g., with a big enough disembodied, non-embedded deep learning / transformer architecture trained the right way. I am highly skeptical, at least of that as an approach on its own.
Aaron Sloman, who’s one of the few hardcore philosophers who are also hardcore AI researchers, and who I had the pleasure of meeting and discussing some issues with many years ago, has thrown a possible monkey wrench into the debate: he’s said that Symbol Grounding is just a rehash of the philosophical concept of “concept empiricism,” which has been refuted and refuted and refuted until it has died many times (A. Sloman, “Symbol Grounding Is Not a Serious Problem. Theory Tethering Is,” IEEE AMD Newsletter, April 2010).
But it seems to me that is just a different way of defining Symbol Grounding. The big examples in his arguments are abstract concepts like space and time and scientific entities like electrons and theories of things you’ve never encountered. How can humans have grounding in experience of those concepts?
Sloman offers “Theory Tethering” as an alternative. But my interpretation of “Symbol Grounding” was always of tethering. Of course there’s no direct way to link abstract concepts. That’s why the notion of “metaphors we live by” (G. Lakoff and M. Johnson, Metaphors We Live By, 2nd ed. University of Chicago Press, 2003) is so powerful. And also, when you start out as a baby, you’re not magically creating a grounding for concepts with your ontogenetic (lifetime) mechanisms from nothing—you get to start out with something that’s bodily and instinctive, a result of evo-devo (Carroll, S.B. (2005). Endless Forms Most Beautiful: The New Science of Evo Devo and the Making of the Animal Kingdom).
I guess I always thought of it as “tethering,” at least in my own way. Grounding isn’t a single hop in a network of relations—it could take any number of hops, and that, I think, is basically what tethering means. And that allows one to create theories, or even purely imaginary concepts, that are still grounded.
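In the same toy notation as the earlier sketch (again, every node name here is an invented illustration), an abstract concept like “electron” wouldn’t link to a bodily primitive in one hop; it would be tethered through theory and metaphor nodes, and the same reachability check from before still succeeds.

# Hypothetical multi-hop tethering: an abstract concept reaches ground
# only indirectly, through theory and metaphor nodes.
concept_graph.update({
    "concept:electron":             ["theory:atomic_model", "metaphor:tiny_ball"],
    "theory:atomic_model":          ["concept:electron_experiments"],
    "concept:electron_experiments": ["primitive:vision"],
    "metaphor:tiny_ball":           ["concept:ball"],
    "concept:ball":                 ["primitive:touch", "primitive:vision"],
})

print(hits_ground(concept_graph, "concept:electron"))  # True: grounded via many hops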
OK, let’s say we tackle it head-on in AI. What are the mental primitives I mentioned? What are the gritty mechanisms of meaning? More about those in future essays…