Is Algorithmic Specified Complexity Useless for Analyzing Evolution?
Eric Holloway has been asserting that William Dembski's arguments about conservation of Complex Specified Information are valid, that they have never been validly refuted. These include Dembski's original (2002) argument about CSI in his book No Free Lunch: Why Specified Complexity Cannot Be Purchased without Intelligence, Dembski's (2005) revision of his original argument as well as Dembski and Marks's (2009) argument involving "active information". Holloway has also made this sweeping claim at the blog The Skeptical Zone, where he has debated his critics (here, here, here, and here). In addition he has made this broad claim about lack of any sensible criticism of CSI arguments at the site Mind Matters, sponsored by the Discovery Institute's Walter Bradley Center. His arguments have been commented on favorably in posts at the antievolution site Uncommon Descent (here and here) where Holloway has agreed with their characterization of what he has accomplished.
When questioned on his logic, Holloway does not actually defend Dembski's original version of CSI, or Dembski's modified version of CSI. Instead he points to the concept of Algorithmic Specified Complexity, arguing that this is the essential part of the proof that Complex Specified Information is conserved, and hence that observation of this version of CSI implies Design.
But no one has asked whether it makes sense to use ASC in arguments about a law preventing gain of information in evolution. Holloway's argument that theorems about ASC apply also to CSI leaves one with the impression that those theorems somehow prove that, as Dembski (2002) said, "complex specified information cannot be purchased without intelligence".
Let me say flatly: I don't think that it does make sense. And if I am right about that, we can ignore most of the debate about conservation of ASC as irrelevant to arguments about whether normal evolutionary processes can accumulate information in the genome.
Why do I think that ASC is irrelevant to establishing conservation of CSI in all these arguments? Let me explain.
Algorithmic Specified Complexity
Algorithmic Specified Complexity (ASC) is a use of Kolmogorov/Chaitin/Solomonoff (KCS) Complexity, a measure of how short a computer program can be and still compute a given binary string (a binary number). By a simple counting argument, those authors were able to show that binary strings that could be computed by short computer programs were rare, and that binary strings that had no such simple description were common. They argued that it was those, the binary strings that could not be described by short computer programs, that could be regarded as "random".
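The counting argument can be checked with a few lines of arithmetic. The sketch below is only illustrative: it assumes that programs are themselves binary strings, and the function name `max_fraction_compressible` is hypothetical. There are at most 2^(n-k+1) - 1 binary programs of length n - k or less, and each outputs at most one string, so no more than that many n-bit strings can be computed by a program at least k bits shorter than themselves.

```python
from fractions import Fraction


def max_fraction_compressible(n: int, k: int) -> Fraction:
    """Upper bound on the fraction of n-bit strings that have a program
    of length at most n - k (assumes programs are binary strings)."""
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    # Number of binary strings of length 0, 1, ..., n - k.
    programs = 2 ** (n - k + 1) - 1
    # Each program outputs at most one string, and there are 2**n strings.
    return Fraction(min(programs, 2 ** n), 2 ** n)


if __name__ == "__main__":
    for k in (1, 10, 20):
        print(k, float(max_fraction_compressible(100, k)))
```

For n = 100 and k = 10 the bound is below 2^-9, so fewer than one string in 512 can be shortened by ten or more bits, whatever programming language is used.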
It is possible to prove theorems, in effect about conservation of the shortness of the program. A very rough handwavy version of those conservation arguments is just to note that if a binary string can be computed by a short computer program, and if we then make a transformation of the binary string, the new string can be computed by the original short program followed by another short one that carries out the transformation.
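In symbols, the rough argument above is usually written as an inequality on KCS complexity. The notation is an assumption introduced here for illustration: K(x) is the length of the shortest program that outputs x, K(f) is the length of the shortest program that carries out the transformation f, and O(1) is a constant that depends only on the programming language.

```latex
K\bigl(f(x)\bigr) \;\le\; K(x) + K(f) + O(1)
```

This only bounds the program length for the transformed string. It is the handwavy version, not the specific conservation statement for ASC whose proof is disputed.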
ASC reflects shortness of the computer program. In simple cases, the ASC of a binary string is its "randomness deficiency": its length, n, less the length of the shortest program that gives it as its output. That means that a genome (or binary string) with a large amount of ASC must be a long string that is computed by a short program. To get a moderate amount of ASC, one could have a long string computed by a medium-length program, or a medium-length string computed by a short program. Randomness deficiency was invented by information theory researcher Leonid Levin and is discussed by him in a 1984 paper (here). Definitions and explanations of ASC will be found in the papers by Ewert, Marks, and Dembski (2013), and Ewert, Dembski and Marks (2014).
Nemati and Holloway have recently published a scientific paper at the Discovery Institute's house journal BIO-Complexity, presenting a proof of conservation of ASC. There has been discussion at The Skeptical Zone of the technical issues with ASC: is it conserved or is it not? In particular, Tom English (here and here) has presented detailed mathematical arguments at The Skeptical Zone showing simple cases that are counterexamples to the claims by Nemati and Holloway, and has identified errors in their proof. See also the comments by English in the discussion on those posts.
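To get a feel for randomness deficiency, here is a minimal sketch that uses a general-purpose compressor as a stand-in for "the shortest program". This is an assumption, not the definition: KCS complexity is uncomputable, and zlib's output length is only an upper bound on it (ignoring the fixed cost of the decompressor), so the number returned is a rough lower bound on the deficiency. The function name `deficiency_estimate` is hypothetical.

```python
import os
import zlib


def deficiency_estimate(data: bytes) -> int:
    """Rough estimate, in bits, of randomness deficiency:
    raw length minus zlib-compressed length.
    zlib is only a proxy for the shortest program (an assumption)."""
    raw_bits = 8 * len(data)
    compressed_bits = 8 * len(zlib.compress(data, 9))
    return raw_bits - compressed_bits


# A long, highly repetitive string: long string, short description.
repetitive = b"ACGT" * 2500
# A string with no short description (with overwhelming probability).
random_like = os.urandom(10_000)

if __name__ == "__main__":
    print("repetitive :", deficiency_estimate(repetitive))
    print("random-like:", deficiency_estimate(random_like))
```

The repetitive string scores high and the random one scores at or below zero. Note which way the measure runs: the simpler the description, the higher the score.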
But the real question is not whether the randomness deficiency is conserved, or whether the shortness of the program is conserved, but whether that implies that an evolutionary process in a population of genomes is thereby somehow constrained. Do the theorems about ASC somehow show us that ordinary evolutionary processes cannot achieve high levels of adaptation?
Pooish Puzzlements
At this stage in the argument, I confess myself puzzled. I hope that the readers here will help me out with this. I confess myself very fallible on these subjects. As Pooh said:
When you are a Bear of Very Little Brain, and you Think of Things, you find sometimes that a Thing which seemed very Thingish inside you is quite different when it gets out into the open and has other people looking at it.
A. A. Milne, The House at Pooh Corner, 1928
Bumbling around in Pooish fashion, I have trouble seeing how ASC has any connection to limits on evolution. Why use ASC as something that indicates design? There is a puzzle here. Is simply-described structure somehow difficult for an evolving system to achieve? Is it somehow desirable? Is evolution succeeding in achieving adaptations, in increasing fitness, only to the extent that it brings about organisms that are simply describable? Or to the extent that it brings about organisms that are not simply describable? I don’t think that Holloway has at all made this clear.
For that matter, there is another Pooish muddle in my brain. What is it that the computation is computing? A binary string representing the genotype? Or a binary string representing the phenotype? All of this is left distressingly unclear in the ASC arguments, however meticulously the conservation of the ASC quantity may (or may not) be proven.
A Pooish Brainwave
Well perhaps, I thought, there really is a rationale for the ASC criterion. Perhaps we are talking about the complexity of living organisms, and the conservation of ASC shows that it is very difficult for ordinary evolutionary processes to achieve genotypes or phenotypes that are complex. This seemed like a promising direction for exploration, until I realized that it is backwards. Backwards because ASC does not increase with complexity, it increases with greater simplicity of description. Far from arguing that complexity cannot be achieved by natural evolutionary processes, the arguments that high ASC is difficult to achieve seem instead to be trying to show that biological systems cannot achieve simplicity.
Another Issue
Specified-complexity arguments about evolution have another problem. Whether they are for ASC or for the earlier criterion CSI, the change of the genotype under normal evolutionary processes is modeled by a function applied to the genotype. This might be a description of what mutation does to a genome, if we allow random functions. But it is not a good description of natural selection. In natural selection, a population of individuals of different genotypes survives and reproduces, and those individuals with higher fitness are proportionately more likely to survive and reproduce. It is not a matter of applying some arbitrary function to a single genotype, but of using the fitnesses of more than one genotype to choose among the results of changes of genotype. Thus modeling biological evolution by functions applied to individual genotypes is a totally inadequate way of describing evolution. And it is fitness, not simplicity of description or complexity of description, that is critical. Natural selection cannot work in a population that always contains only one individual. To model the effect of natural selection, one must have genetic variation in a population of more than one individual.
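The contrast can be made concrete with the simplest textbook model of selection. The sketch below assumes a very large haploid population with two genotypes, A and B, discrete generations, and constant fitnesses; the function names and the fitness values 1.1 and 1.0 are hypothetical choices for illustration. The update depends on the fitnesses of both genotypes and on their frequencies, which is exactly what a function applied to one genotype cannot express.

```python
def next_frequency(p: float, w_a: float, w_b: float) -> float:
    """One generation of haploid selection.
    p is the frequency of genotype A; w_a and w_b are the fitnesses
    of A and B. Assumes an effectively infinite population."""
    mean_fitness = p * w_a + (1.0 - p) * w_b
    return p * w_a / mean_fitness


def trajectory(p0: float, w_a: float, w_b: float, generations: int) -> list:
    """Frequencies of A over time, starting from p0."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], w_a, w_b))
    return freqs


if __name__ == "__main__":
    # A starts rare with a 10% fitness advantage (illustrative numbers).
    print(trajectory(0.01, 1.1, 1.0, 200)[-1])
    # With no genetic variation (p = 0 or p = 1) nothing changes.
    print(next_frequency(1.0, 1.1, 1.0), next_frequency(0.0, 1.1, 1.0))
```

When the population contains only one genotype, the frequency stays where it is no matter what the fitnesses are. Selection has something to act on only when variation is present.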
CSI and conservation arguments
In the case where we do not use ASC, but use Complex Specified Information, the Specified Information (SI) quantity is intrinsically meaningful. Whether or not it is conserved, at least the relevance of the quantity is easy to establish. In William Dembski's original argument that the presence of CSI indicates Design (2002) the specification is defined on a scale that is basically fitness. Dembski (2002, p. 148) notes that
The specification of organisms can be cashed out in any number of ways. Arno Wouters cashes it out globally in terms of the viability of whole organisms. Michael Behe cashes it out in terms of the minimal function of biochemical systems. Darwinist Richard Dawkins cashes out biological specification in terms of the reproduction of genes. Thus in The Blind Watchmaker Dawkins writes "Complicated things have some quality, specifiable in advance, that is highly unlikely to have been acquired by random chance alone. In the case of living things, the quality that is specified in advance is ... the ability to propagate genes in reproduction."
The scale on which SI is defined is basically either a scale of fitnesses of genotypes, or a closely-related one that is a component of fitness such as viability. It is a quantity that may or may not be difficult to increase, but there is little doubt that increasing it is desirable, and that genotypes that have higher values on those specification scales will make a larger contribution to the gene pool of the next generation.
If a conservation law can be established that shows that a population cannot end up in a state of high fitness without already having started with fitness at least that high, it will have shown that there is a barrier to achieving that state by normal evolutionary processes such as natural selection. But, alas for William Dembski and company, the conservation of CSI is basically unprovable in the form that would be needed to show that (for an accessible argument, see my article on that). I have also shown in a straightforward population genetics calculation at The Skeptical Zone that natural selection can increase Specified Information, with no barrier in that simple case to making the SI high enough to be Complex Specified Information.
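To see what an SI scale based on fitness looks like, here is a toy sketch. It is not the calculation referred to above, and every modeling choice in it is an assumption made for illustration: a genome of L two-allele sites, fitness that increases with the number k of favorable alleles, and a reference distribution in which all 2^L genotypes are equally probable. SI is then taken as minus the base-2 logarithm of the probability that a random genotype is at least as fit. The function name `specified_information` is hypothetical.

```python
import math


def specified_information(num_sites: int, k: int) -> float:
    """SI in bits for a genotype with k favorable alleles out of num_sites,
    assuming fitness rises with k and all genotypes are equally probable
    under the reference distribution (toy assumptions)."""
    if not 0 <= k <= num_sites:
        raise ValueError("need 0 <= k <= num_sites")
    # Number of genotypes with at least k favorable alleles.
    at_least_as_fit = sum(math.comb(num_sites, j)
                          for j in range(k, num_sites + 1))
    # -log2(at_least_as_fit / 2**num_sites), using exact integers.
    return num_sites - math.log2(at_least_as_fit)


if __name__ == "__main__":
    for k in (300, 400, 500, 600):
        print(k, round(specified_information(600, k), 1))
```

In this toy model each step up the fitness scale raises SI, and the fittest genotype of a 600-site genome carries 600 bits. That is above the roughly 500 bits that corresponds to Dembski's universal probability bound of 10^-150.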
"Complexity"
The use of the word "Complex" in both concepts is confusing. It was actually first associated with Specified Information by Leslie Orgel, who invented Specified Information. In his case, and in subsequent uses, high complexity means that the organism is not a simple crystal with repeating structure, but is more complicated. He uses the length of a description of the organism to make this distinction. But his approach is not like the KCS complexity measure -- he does not discuss how long a description would be needed for a random bit string. In the CSI measure, a genotype has "complex" specified information when its fitness is sufficiently high, whether or not that is associated with complicated phenotypes or with a long genome. The name Algorithmic Specified Complexity is even more confusing. ASC is a number that is high when the bit string that describes the organism is long, but can be computed by a relatively short program. I think. Or perhaps when the bit string which encodes the genome of the organism is long, but can be computed by a relatively short program. If anything is needed, it is a careful discussion of how ASC relates to the length of the genome, the length of a description of the phenotype, to fitness, and to achieving a complicated structure.
I suggest that no connection of ASC to fitness is possible. Whether or not I am right about that, the matter needs to be addressed before anyone can say that there is conservation of ASC, and that this shows that there is some limit to the ability of evolutionary processes to achieve adaptation or to increase the fitness of organisms.
Conclusions
1. There is no known correlation between fitness, or any other measure of degree of adaptation, and the simplicity with which we can describe an organism's genotype or phenotype.
2. A proof that the high levels of fitness that we see in living organisms cannot be achieved by evolutionary processes such as natural selection would be a major refutation of modern evolutionary biology. William Dembski's Law of Conservation of Complex Specified Information attempted such a proof, but this proof fails, and Dembski's LCCSI is no longer discussed by proponents of Intelligent Design, except in occasional mistaken assertions that such a law has been proven in a form that shows that high levels of fitness cannot be achieved from lower ones.
3. By contrast, for the ASC measure it is claimed that proofs exist which, in effect, constrain how large a "randomness deficiency" can be achieved, that is, how simple an algorithm can be achieved. Those purported proofs are disputed, with counterexamples provided by Tom English.
4. Nevertheless, it has become common to argue that the alleged conservation of ASC shows that there are limits on what evolution can do. Holloway, Uncommon Descent, and the Discovery Institute's website Mind Matters have all bought into this.
5. However, the connection to evolution is lacking. There is actually no explanation as to what the short computer program is computing. Is the issue how simple an algorithm is needed to compute a binary string which represents the genome? It is not hard to imagine a binary string which has one pair of bits for each base in the genome. Is it that binary string that is being computed? Or is the issue how simple an algorithm is needed to compute a detailed description of the individual's phenotype? This uncertainty has not been addressed at all in the ASC arguments about evolving systems, rendering those arguments even more meaningless.
6. If, as I argue, there is no correlation between high ASC and high fitness, then natural selection will not tend to bring about high values of ASC (hence simpler descriptions of whatever-it-is that is being described), because there will be no fitness reward for doing so. Observing organisms that are well-adapted, we can be reasonably sure that they have high Specified Information, where the specification is fitness. But we have no reason to believe that they have high ASC. In finding that they have high fitness, that they are in some sense well-adapted, we have not observed anything that is relevant to how simple or how non-simple are any descriptions of their genotypes or phenotypes.
7. We may conclude that even if ASC of organisms could somehow be defined, and even if some limit on its change could somehow be proven, the non-increase of ASC would not establish any limits on what natural selection can do to improve fitness.
Thanks to Tom English for helpful comments on an earlier draft of this post.