Turned up to eleven: Fair and Balanced

Wednesday, May 15, 2002

Neural Nets, Consciousness, Intelligence; The Finale

This will probably be the last post on the topic of consciousness and modeling intelligence and brain function; I am driving people away in droves, I suspect, and the blog is just not conducive to the kind of in-depth mathematics that is necessary to flesh this out. I know that will sound like a cop-out, and if anyone has good suggestions on how to add mathematical symbology to a blog post, I am all ears. More to the point, however, I would like to get back to general interest essays, and a bit of first person reporting next week from the American Society for Microbiology general meeting. It is held Sun-Wed, in Salt Lake City, and I will be there. I will try to get some info on new developments in Microbiology, and report some of the stuff that will interest a general audience on this page. Stay tuned.

When I left you, I had started to describe, in very general terms, a way to represent the brain (really the entire CNS) as a connected graph, and mentioned the computational schema known as an artificial neural network. My last substantive notion was the idea of weighted inputs, W(ai), meaning the weight of the input into neuron i from neuron a. This is perhaps the most important concept in neurology, and probably the integral notion that will (eventually) describe consciousness. It is most emphatically not my idea!! The weighting function has a long history in artificial neural network research, and is born of the original research done long ago in neurophysiology establishing the notion of stimulatory and inhibitory neurotransmitters. In order to describe a system like this, however, we need to step back a bit, and decide what we are trying to describe.

The major philosophical point about consciousness is essentially materialism (I am not sure of the exact term used) vs. dualism (the Cartesian theatre of the mind). Dualism is appealing because we as humans naturally think of ourselves as a single, conscious actor in events, and when we look into the function of our own conscious mind, we instinctively envision a homunculus, a little puppeteer inside us that is pulling the strings based on our sensory inputs to make us act. The little puppeteer is our "mind", or our "conscience", or our "will". All of these concepts, crucial to the development of human thought, are rather useless when it comes to describing the actual mechanisms of thought. Of course, the sophisticated modern scientist wouldn't be so silly, right? Not true. The idea of a prime controller inside the brain is so strong that we have real trouble abandoning it (much of this is inspired by the writing of Douglas Hofstadter, Roger Penrose, and to a lesser extent, Daniel Dennett). The clear outcome of any serious thought put to such an idea, however, is that at some point the emperor has no clothes, i.e., there must be some point where you have to actually determine a physical mechanism of conscious thought, unless you believe that it is extracorporeal (spirit, or mind). I personally subscribe to the notion that consciousness is an emergent property of the neural network, one that is not "found" in one particular spot of the brain, but rather exists as a diffuse property of the system, if you will. This will inform the final description of my preliminary model.

To return to the question at hand, we have outlined a very sketchy model of the interconnections of the brain. Of course, there are billions of them, and actually identifying what is connected to what would take a very, very long time. So, can we make some simplifying assumptions?

Well, we already made one, dividing the neuronal population into input (I), output (O), and intermediate (M) neurons. Let's return to that structure for a moment. A neural network with only inputs and outputs is called a perceptron, and is in itself a powerful computational tool. Adding even a single layer of intermediate "neurons" (called a hidden layer) to the architecture drastically improves this power. Neural networks are "trained" on a set of solutions. In other words, a set of inputs is given to the network, and the output is compared to the "correct" answer. The connections within the network are then strengthened or weakened in order to change the output. The input is then repeated, and the new output assessed. I don't work in this field (although I am fascinated by it, my computer programming skills are, shall we say, subpar) but I gather that there are certain heuristics that can be used to govern the "error functions", to direct the network toward solutions. By using a "training set" of inputs and outputs, you can tailor the network to identify a given set of properties in unknown data. There is a risk of "overtraining" (overfitting), however, which will make a neural network unable to reliably identify things outside of the training set. Perhaps people in the field will disagree, but I sense that all of this is very "black box", i.e. the reasons behind the way some of this is done are fundamentally unknown, but it works.
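To make the train, compare, and adjust loop a little more concrete, here is a minimal sketch in Python. It uses the classic perceptron learning rule, which is only one of the heuristics alluded to above and is an assumption on my part, not something the discussion depends on. The names (predict, train, AND_SET), the learning rate of 1, and the choice of logical AND as the "correct answer" are all made up for illustration.

```python
# Sketch: a perceptron (inputs wired straight to one output, no hidden layer)
# trained on logical AND with the classic perceptron learning rule.
# All names and numbers here are illustrative assumptions.

def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs plus the bias is positive."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0

def train(training_set, learning_rate=1, max_epochs=100):
    """Repeat the training set, nudging weights whenever the output is wrong."""
    n_inputs = len(training_set[0][0])
    weights = [0] * n_inputs
    bias = 0
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in training_set:
            error = target - predict(weights, bias, inputs)  # -1, 0, or +1
            if error != 0:
                mistakes += 1
                # Strengthen or weaken each active connection.
                for i, x in enumerate(inputs):
                    weights[i] += learning_rate * error * x
                bias += learning_rate * error
        if mistakes == 0:  # every training example is now answered correctly
            break
    return weights, bias

# Training set: (inputs, "correct" answer) pairs for logical AND.
AND_SET = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train(AND_SET)
outputs = [predict(weights, bias, inputs) for inputs, _ in AND_SET]
print(outputs)  # [0, 0, 0, 1]
```

AND works here because a single line can separate its "on" case from its "off" cases; a function like XOR cannot be learned without a hidden layer, which is one concrete sense in which adding intermediate neurons improves the power of the network.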

It strikes me (as well as many others, I am sure), that there is a very strong parallel between the way that humans learn and the training of an artificial neural network. There is a training set, given during childhood (don't touch the hot stove, learn the alphabet, numbers, falling and hitting your head hurts, etc), and learned responses. Strictly speaking, however, there is no reliable set of solutions. To be sure, there are some universals (everyone, just about, learns some form of speech, just not the same speech), but there is incredible variation in what might be in that set of training data. Let's take a closer look at what this might mean.

As we said before, there is a set of inputs I(a...i...n), a set of outputs O(a...j...m), and a set of intermediate connections M(a...k...p). If we look at the signal into input neuron I(i), it travels into a set of neurons M(a, b, ...k) and O(a, b, ...j) in one step (i.e. with no intervening nodes). For argument's sake, let's assume that the time of transmission is the same for every connection, so that time step one is signal processing in the next "level". This assumption is not that bad, because diffusion across the synapse is probably the rate-limiting step in most brain function, rather than transit down the axon. If we "integrate" over the set of input neurons (I put that in quotes because it is not really mathematical integration, because we don't have a continuous variable, but it is conceptually similar), we can see that after one time step, there will be a large set of outputs O(a, b, ...j(tot)) that have been activated, and there is a residual signal in the "hidden" layer. That residual signal in the hidden layers is, IMHO, the absolute key to understanding conscious thought.

So, mathematically, can we represent that? Well, the only representation that I can come up with is a massive matrix that contains in it the weighting of every connection, and a massive vector containing every neuron's state at time t. Suppose the vector has dimension n (in other words, there are n components to it; a 3 dimensional vector in Cartesian coordinates has components x, y, z), where n is the total number of neurons. An n x n matrix can be constructed, in which the ith row contains all of the connections from the various neurons into neuron i. The row might look like this:

[a b c 0 0 0 d e...0 0 f]

In this instance, a-f are the weights of each connection, and their column position represents what neuron they come from.
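For readers who find code easier than notation, the row convention can be sketched in Python like this. The five-neuron network, its connection list, and the weight values are entirely hypothetical; the only thing carried over from the description above is that row i holds the weights of the connections into neuron i, and the column says which neuron each one comes from.

```python
# Sketch: build an n x n weight matrix from a list of connections.
# Hypothetical toy network of 5 neurons, numbered 0..4. Each connection is
# (source, target, weight); positive weights stand in for stimulatory
# connections and negative weights for inhibitory ones.
CONNECTIONS = [
    (0, 2, 0.8),
    (1, 2, -0.5),
    (2, 3, 1.0),
    (2, 4, 0.6),
    (3, 2, 0.4),  # a feedback connection back into neuron 2
]

def build_weight_matrix(n_neurons, connections):
    """Row = neuron receiving the signal, column = neuron sending it."""
    matrix = [[0.0] * n_neurons for _ in range(n_neurons)]
    for source, target, weight in connections:
        matrix[target][source] = weight
    return matrix

W = build_weight_matrix(5, CONNECTIONS)
print(W[2])  # everything feeding into neuron 2: [0.8, -0.5, 0.0, 0.4, 0.0]
```

Most entries are zero because most neurons are not directly connected to each other, which is why the example row in the text is mostly zeros as well.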

At any given point in time, some subset of all neurons in the brain are firing, delivering a signal to the neurons connected to them. This can be represented by a binary vector of dimension n, where n is the number of neurons. Since we now have a vector and a matrix, we can do some (conceptually) straightforward matrix algebra to determine the state at time t+1. When multiplying a matrix by a vector, the result is a vector in which the i_th entry is the i_th row of the matrix multiplied, entry by entry, with the input vector and summed. For every neuron in the brain, we can therefore compute an input value, which corresponds to the appropriate row in the weighting matrix times the neuron state vector. From here, we just need to use a thresholding algorithm to determine whether each neuron is on or off in the next time step. In other words, we set some arbitrary threshold, x, for the input value, and then create a new state vector by the following algorithm:

If input value(i) > x, state(i) = 1; otherwise, state(i) = 0.

I have obviously drastically simplified the nature of stimulus/response, but the point is that, with the exception of the very first signal into the newborn brain, there is a substantial hidden input into every response. We can conceptualize this by thinking about how our prior experience shapes our reactions. When you first woke up on 9/11, what was your reaction? If you were an Israeli citizen, would it be different? I know that I thought it must be a tragic, scary accident, and it was only as the morning progressed that it dawned on me that someone could do that on purpose. This is a dramatic example of how prior knowledge (or lack thereof) can shape perception, but less dramatic examples go all the way down the line. It gets even more interesting (and confusing), as neurobiologists and psychologists attempt to fool us, and show us something about how our brains work by use of sensory illusions (the most common examples are visual).
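The update rule (weight matrix times state vector, then threshold) and the idea of a hidden input can be sketched together in a few lines of Python. This is only a toy under stated assumptions: the three-neuron network, its weights, and the threshold of 1.5 are invented for illustration, and a single shared threshold is used for every neuron, as in the description above.

```python
# Sketch: one time step of the thresholded update, state(t) -> state(t+1).
# Row i of the weight matrix holds the weights of connections INTO neuron i.

def step(weights, state, threshold):
    """Compute each neuron's input value, then switch it on or off."""
    new_state = []
    for row in weights:
        input_value = sum(w * s for w, s in zip(row, state))
        new_state.append(1 if input_value > threshold else 0)
    return new_state

# Hypothetical network: neuron 0 = input, neuron 1 = hidden, neuron 2 = output.
W = [
    [0, 0, 0],  # nothing inside the network feeds the input neuron
    [0, 2, 0],  # the hidden neuron re-excites itself (a residual signal)
    [1, 1, 0],  # the output neuron listens to both input and hidden neurons
]
THRESHOLD = 1.5  # arbitrary; the output fires only if both sources are active

# The same stimulus arrives in both cases; only the prior hidden state differs.
naive = step(W, [1, 0, 0], THRESHOLD)   # no residual activity in the hidden layer
primed = step(W, [1, 1, 0], THRESHOLD)  # hidden neuron already active

print(naive)   # [0, 0, 0] -> the stimulus alone produces no response
print(primed)  # [0, 1, 1] -> same stimulus, different response
```

The stimulus is identical in the two runs, yet the output neuron fires only when the hidden neuron was already active. That is the toy version of prior experience shaping the response.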

Now, I have suggested two vector/matrix algebra models to "explain" conscious thought/intelligence, but I haven't given 1) any method for solving either set of equations, or 2) any way to connect the two. I will leave that as an exercise for the reader (just kidding!). I actually don't claim to have any solution, but I do think that one way to approach the problem of understanding intelligence/consciousness is to use these types of representations (I don't claim mine are flawless, I have no doubt that smarter people than me will create better ones) to try and bridge the gap. In essence, this is the question of reductionism v. holism, i.e. we have a view of how neurons work and are connected, and a view of how people act and think, and we can only solve the mystery by bringing those two into harmony.

Some might ask, if any are still interested, what does this have to do with Godless Capitalist's original assertion of racial (genetic) components of intelligence? Well, maybe nothing. I suspect, however, that the current view of intelligence is informed by our fundamental belief in the unity, or "thingness" of consciousness. In other words, we intuitively, reflexively think of "the mind" as a single entity, a thing that exists, is identifiable and quantifiable. However, it seems to me that "the mind" is a distributed emergent quality of our CNS, and as such not clearly identifiable as resident in some particular part of the brain. This is a big problem with "heritability" of intelligence discussions (there are others, which I elaborated on before). Mostly, though, I spent a couple of years in grad school thinking about this rather than working on my thesis, and I always wanted to know what people thought, so here is your chance!