Turned up to eleven: Fair and Balanced
Monday, May 06, 2002
"Godless Capitalist" is approaching the problem from the viewpoint of classical genetics, which infers allele frequencies from phenotype, based largely on breeding programs in animals (crosses, backcrosses, etc.) and twin studies in humans. These approaches rely on complicated and valuable statistical methods to draw conclusions. In the comments I was pretty harsh on factor analysis, probably unfairly so. But there is an essential assumption in all statistical analysis, which is the assumption of independent variables. By this I do not mean that the variables are all assumed to be independent. There is a voluminous, well-described literature on Bayesian conditional probability, which allows us to calculate the probability of an event occurring given another event or set of events (spoken as "probability of A given B", and written P(A|B)). For a short, thorough explanation see this site.
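For readers who want to see P(A|B) as arithmetic, here is a minimal Python sketch. The function names and the numbers are made up for illustration; they do not come from any genetic data set or from the site linked above.

```python
def conditional(p_a_and_b: float, p_b: float) -> float:
    """P(A|B) = P(A and B) / P(B); undefined when P(B) is zero."""
    if p_b <= 0.0:
        raise ValueError("P(B) must be positive")
    return p_a_and_b / p_b


def bayes(p_b_given_a: float, p_a: float, p_b_given_not_a: float) -> float:
    """P(A|B) via Bayes' theorem, expanding P(B) over A and not-A."""
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1.0 - p_a)
    return conditional(p_b_given_a * p_a, p_b)


# Hypothetical numbers: A is rare (1%), B is seen in 90% of A cases
# and in 5% of non-A cases.
print(bayes(p_b_given_a=0.9, p_a=0.01, p_b_given_not_a=0.05))
```

Note that the whole calculation treats P(A), P(B|A) and P(B|not A) as fixed numbers.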
Bayesian inference requires assumptions about the independence of the variables that the hypothesis variable is dependent on and, importantly for statistical analysis, the constant probability of dependent variables. A simple example may suffice: suppose that a gene (or a set of genes) confers the ability to play the violin. A number of environmental factors will play into whether this phenotype is displayed, including nutrition, disease, poverty (can you afford a violin, or lessons?), and others, all of which can be controlled for, to some extent (the difficulty here translates into epidemiology as well). However, what if the child has a hereditary disease, such as cystic fibrosis (CF)? In this case, the child with the potential to be a great violinist will never have a chance to display it. This too can be controlled for, but we get into some problems as we strive to increase our precision by including more and more of these factors in our analysis, due to our ignorance of the genetic mechanisms of disease. A gene has been identified for CF, and for several other severe congenital diseases, but the linkage of those genes to any gene or genes for a complex trait is unknown. We therefore cannot make a good assumption about whether any of these myriad genetic diseases, which may affect the display of an unrelated complex phenotype, is statistically or physically independent of that phenotype. This may not seem like such a big problem, because these diseases are, individually, (thankfully) rare. But in the aggregate, you can start to wonder what the limits of precision on any statistical inference method for complex traits might be.
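To get a feel for the "individually rare, but in the aggregate" point, here is a rough Python sketch. The frequencies are hypothetical placeholders, not real disease incidences, and the calculation assumes the conditions occur independently of one another, which is exactly the kind of assumption questioned above.

```python
import math


def p_at_least_one(frequencies):
    """Probability of having at least one condition, ASSUMING independence.

    Computed as 1 minus the probability of having none of them.
    """
    return 1.0 - math.prod(1.0 - p for p in frequencies)


# Hypothetical: 1000 distinct conditions, each affecting 1 person in 10,000.
hypothetical_frequencies = [1e-4] * 1000
print(p_at_least_one(hypothetical_frequencies))
```

With these made-up inputs the aggregate comes out somewhere between 9% and 10%, even though each condition alone is at 0.01%. If the conditions are not independent of each other or of the trait being studied, this simple product no longer applies.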
This is not to disparage the fantastic work of classical geneticists from Mendel to the present (I don't keep up with classical geneticists!). Many great insights have been made using this method, but the difficulty of inference seems to me to go up exponentially with genotype complexity. In other words, the more genes that are involved in determining (partially!) a phenotype, the harder it gets to understand by looking at the emerging phenotypes. Not only that, but it gets much harder with only small increases in genetic complexity (i.e. 3-6 genes, perhaps). Here is where I think people with my mindset come into the mix. I am, cough cough, a molecular geneticist (and sometimes a protein biochemist, and sometimes a microbial physiologist, and sometimes a biochemical engineer), and as such, I tend to think directly about the expression of genes to make proteins and biochemical networks. As our knowledge of human genetics explodes in the next twenty years or so, and our ability to make "transgenic" animals and human cells (not transgenic people, for a while...) increases to include the insertion of whole biochemical enzyme cascades and loops, our understanding of the molecular genetics of complex phenotypes will explode as well. I think that will shed incredible light on the questions at hand, and resolve much of this debate over genetic bases for nebulous traits such as intelligence, musical talent, creativity, and so many other things that make us human beings. My prediction (humbly made!) is that our understanding of the molecular basis for these traits will show that they are "emergent properties" that are not directly coded for by any distinct set of genes, but emerge in a (mathematically) complex way from the genetically determined and environmentally shaped development of the human brain.
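Going back to the claim that inference gets much harder with only a few more genes, one simple way to see the growth is to count genotype classes. The sketch below assumes each gene has just two alleles (labelled A and a here, purely for illustration) in a diploid organism, so there are three genotypes per locus and 3^n combinations across n loci. It is only one way of counting, and it says nothing about environment or gene interactions.

```python
from itertools import product

# Assumption: two alleles per locus, so three diploid genotypes per locus.
GENOTYPES_PER_LOCUS = ("AA", "Aa", "aa")


def multilocus_genotypes(n_loci):
    """Enumerate every combination of single-locus genotypes over n_loci loci."""
    return list(product(GENOTYPES_PER_LOCUS, repeat=n_loci))


for n in (1, 3, 6):
    print(n, "loci:", len(multilocus_genotypes(n)), "genotype classes")
```

Under these assumptions, going from 3 genes to 6 takes the count from 27 to 729 classes, each of which has to be told apart by looking at phenotypes alone.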
In my next post, I will go back to my nascent "model", and start from the other end of the spectrum, working up from the bottom, so to speak, incorporating the concepts of a connected graph and a neural network to develop some mathematics to describe the function of the brain at the most basic level. In the final act, I will try to suggest some ways to connect the "top down" and "bottom up" models. Before going too much further, I should note some people whose work has deeply affected my thoughts on this, and although I feel as if this is original work on my part, I must acknowledge their impact: Roger Penrose, Doug Hofstadter, and Dan Dennett in particular have written fantastic books on the topics of Consciousness and Artificial Intelligence that both got me interested in the topic and helped me formulate some hypotheses about it. Although I don't necessarily agree with all or any of them about every aspect of Intelligence or Consciousness, they have been formative influences on me (and millions of others).