William Paley Institute for Intelligent Design

Intelligent Design as a Theory of Information

by William A. Dembski
Department of Philosophy
University of Notre Dame
Notre Dame, Indiana, USA


Information

In his book Steps Towards Life, Manfred Eigen (1992, p. 12) summarizes the task
of origins-of-life research as follows: "Our task is to find an algorithm, a
natural law that leads to the origin of information." This summary of
origins-of-life research is at once insightful and misguided. It is insightful
because it correctly isolates the central problem facing origins-of-life
research, to wit, the origin of information. At the same time, it is misguided
because it prescribes an unworkable solution for this problem, to wit,
algorithms and natural laws. Algorithms and natural laws are utterly incapable
of producing information. Indeed, it is an oxymoron to attribute the origin of
information to algorithms and natural laws: information is inaccessible from
algorithms and natural laws. Eigen is working on the right problem, but looking
to the wrong solution. Eigen's insight is to see that the origin of information
constitutes the central problem facing origins-of-life research; Eigen's mistake
is to think that algorithms and natural laws constitute the solution. In this
paper I shall examine Eigen's insight and correct Eigen's mistake. To examine
Eigen's insight, I shall explicate the concept of information and connect it to
biological reality. To correct Eigen's mistake, I shall introduce intelligent
causation and show why it, and not algorithms or natural laws, provides the right
way to account for the origin of information.

Let us then begin with information. What is information? The fundamental
intuition underlying information is not, as is commonly thought, the
transmission of signals across a communication channel, but rather, the ruling
out of possibilities. To be sure, when signals are transmitted across a
communication channel, invariably a set of possibilities is ruled out, namely,
those signals which were not transmitted. But to acquire information remains
fundamentally a matter of ruling out possibilities, whether these possibilities
comprise signals across a communication channel or take some other form. As
Robert Stalnaker (1984, p. 85) puts it, "To understand the information conveyed
in a communication is to know what possibilities would be excluded by its
truth." Information in the first instance presupposes not some medium of
communication, but contingency. For there to be information, there must be a
multiplicity of distinct possibilities any one of which might happen. When one
of these possibilities does happen and the others are ruled out, information
becomes actualized. Indeed, information in its most general sense can be defined
as the actualization of one possibility to the exclusion of others.

Complex Information

This definition of information is highly abstract and by itself of little use to
biology and science more generally. To render information a useful concept for
science we need to do two things: first, provide a means for measuring
information; second, introduce a crucial distinction: the distinction between
specified and unspecified information. First, let us consider how to measure
information. In measuring information it is not enough to count the number of
possibilities that were ruled out, and offer this number as the relevant measure
of information. The problem is that this simple enumeration of excluded
possibilities tells us nothing about how the possibilities under consideration
were individuated in the first place. Consider, for instance, the following
individuation of poker hands:

(i) A royal flush.

(ii) Everything else.

To learn that something other than a royal flush was dealt (i.e., possibility
(ii)) is clearly to acquire less information than to learn that a royal flush
was dealt (i.e., possibility (i)). Yet if our measure of information is simply
an enumeration of excluded possibilities, then the same numerical value must be
assigned in both instances since in both instances a single possibility is
excluded.

It follows, therefore, that how we measure information needs to be independent
of whatever procedure is used to individuate the possibilities under
consideration. And the way to do this is not simply to count possibilities, but
to assign probabilities to these possibilities. For a thoroughly shuffled deck
of cards, the probability of being dealt a royal flush (i.e., possibility (i))
is approximately .000002 whereas the probability of being dealt anything other
than a royal flush (i.e., possibility (ii)) is approximately .999998.
Probabilities by themselves, however, are not information measures. Although
probabilities properly distinguish possibilities according to the information
they contain, nonetheless probabilities remain an inconvenient way of measuring
information. There are two reasons for this. First, the scaling and
directionality of the numbers assigned by probabilities need to be
recalibrated. We are clearly acquiring more information when we learn someone
was dealt a royal flush than when we learn someone wasn't dealt a royal flush.
And yet the probability of being dealt a royal flush (i.e., .000002) is
minuscule compared to the probability of being dealt something other than a
royal flush (i.e., .999998). Smaller probabilities signify more information, not
less.

The second reason probabilities are inconvenient for measuring information is
that they are multiplicative rather than additive. If I learn that Alice
obtained a royal flush playing poker at Caesars Palace and that Bob obtained a
royal flush playing poker at the Mirage, the probability that both Alice and Bob
were dealt royal flushes is the product of the individual probabilities.
Nonetheless, it is convenient for information to be measured additively so that
the measure of information assigned to Alice and Bob jointly being dealt royal
flushes equals the measure of information assigned to Alice being dealt a royal
flush plus the measure of information assigned to Bob being dealt a royal flush.

Now there is an obvious way of transforming probabilities which circumvents both
these difficulties, and that is to apply a negative logarithm to the
probabilities. Applying a negative logarithm assigns the more information to the
less probability and, because the logarithm of a product is the sum of the
logarithms, transforms multiplicative probability measures into additive
information measures. What's more, in deference to communication theorists it is
customary to use the logarithm to the base 2. The rationale for this choice of
logarithmic base is as follows. The most convenient way for communication
theorists to measure information is in bits. Any message sent across a
communication channel can be viewed as a string of 0s and 1s. For instance,
the ASCII code, as commonly stored, uses strings of eight 0s and 1s (seven code
bits padded to an eight-bit byte) to represent the characters on a typewriter,
with whole words and sentences in turn represented as strings of
such character strings. In like manner all communication may be reduced to the
transmission of sequences of 0s and 1s. Given this reduction, the obvious way
for communication theorists to measure information is in number of bits
transmitted across a communication channel. And since the negative logarithm to
the base 2 of a probability corresponds to the average number of bits needed to
identify an event of that probability, the logarithm to the base 2 is the
canonical logarithm for communication theorists.
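
The transformation just described can be checked with a short calculation. What follows is a minimal Python sketch, not part of the original argument: the function name information_bits is an illustrative choice, the count of four royal flushes among all five-card hands is standard combinatorics, and the deals to Alice and Bob are assumed to be independent.

```python
import math

# Number of distinct five-card hands from a 52-card deck.
TOTAL_HANDS = math.comb(52, 5)
# One royal flush per suit.
ROYAL_FLUSHES = 4

def information_bits(probability: float) -> float:
    """Information measure: the negative base-2 logarithm of a probability."""
    return -math.log2(probability)

p_royal = ROYAL_FLUSHES / TOTAL_HANDS   # possibility (i)
p_other = 1 - p_royal                   # possibility (ii)

bits_royal = information_bits(p_royal)
bits_other = information_bits(p_other)

# Assumption: Alice's and Bob's deals are independent, so the joint
# probability is the product of the individual probabilities.
bits_both = information_bits(p_royal * p_royal)

print(f"royal flush:       {bits_royal:.2f} bits")
print(f"anything else:     {bits_other:.8f} bits")
print(f"two royal flushes: {bits_both:.2f} bits")
```

Under these assumptions a royal flush carries roughly 19.3 bits, learning that some other hand was dealt carries only a tiny fraction of a bit, and the joint event carries the sum of the two individual measures.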

We may now summarize how information is measured as follows. Given a collection
of possibilities, and probabilities assigned to those possibilities, the measure
of information inherent in one of those possibilities is the negative logarithm
to the base 2 of the probability of that possibility. This is harder to say than
to conceive, and is really quite straightforward. One terminological point,
however, is worth making. As a purely formal object, the information measure I
have just described is a complexity measure (cf. Dembski, 1996, ch. 4). It is
therefore appropriate to speak of the complexity of information and say that
the complexity of information increases as the associated information measure
increases (and, correspondingly, as the associated probability measure
decreases). This notion of complexity is important to biology since it is not
just the origin of information that stands in question, but the origin of
complex information.

Complex Specified Information

With a means of measuring information in hand, we turn now to the distinction
between specified and unspecified information. This is a vast and complicated
topic whose full elucidation is beyond the scope of this paper, requiring the
formulation of a substantial technical apparatus involving both probability and
complexity theory. All the painstaking details about specification may be found
in my monograph The Design Inference, which I expect to have published next
year. Nonetheless, in what follows I shall try to make this distinction
intelligible, and offer some hints on how to make it rigorous.

For an intuitive grasp of the difference between specified and unspecified
information, consider the following example. Suppose an archer stands 50 meters
from a large blank wall with bow and arrow in hand. The wall, let us say, is
sufficiently large that the archer cannot help but hit it. Consider now two
alternative scenarios. In the first scenario the archer simply shoots at the
wall. In the second scenario the archer first paints a target on the wall, and
then shoots at the wall, squarely hitting the target in the bull's-eye. Let us
suppose that in both scenarios the precise place on the wall where the arrow
lands is identical. In both scenarios the arrow might have landed anywhere on
the wall. What's more, any place where it might land is highly improbable. It
follows that in both scenarios highly complex information is actualized. Yet the
conclusions we draw from each scenario are very different. In the first scenario
we can conclude absolutely nothing about the archer's ability as an archer. On
the other hand, in the second scenario we have evidence of the archer's skill.

The obvious difference between these two scenarios is of course that in the
first the information follows no pattern whereas in the second it does. Now the
information that tends to interest us as rational inquirers is not the
actualization of arbitrary possibilities corresponding to no patterns, but
rather the actualization of circumscribed possibilities corresponding to
patterns. And indeed, when we speak of information in common parlance, we
typically do not mean the actualization of an arbitrary possibility so much as
the actualization of a possibility that corresponds to a pattern. In fact, in
the study of information, patterns assume so great a significance that the
patterns themselves become identified as information. The patterns that
represent information (be they linguistic, pictorial, or mathematical) are in
common parlance what we mean by information. Yet in the service of clarity it is
useful to distinguish information qua the actualization of a possibility from
its representation qua some pattern.

All the same, information that corresponds to a pattern still isn't quite enough
to constitute specified information. The problem is that patterns can be
concocted after the fact so that instead of helping us make sense of
information, they are merely read off already actualized information. To see
this, consider a third scenario in which an archer shoots at a wall. As before,
we suppose the archer stands 50 meters from a large blank wall with bow and
arrow in hand, the wall being so large that the archer cannot help but hit it.
And as in the first scenario, the archer shoots at the wall while it is still
blank. But this time suppose that after having shot the arrow, and finding the
arrow stuck in the wall, the archer paints a target around the arrow so that the
arrow sticks squarely in the bull's-eye. Let us further suppose that the precise
place on the wall where the arrow lands in this scenario is identical with where
it landed in the first two scenarios. Since any place where the arrow might land
is highly improbable, in this as in the other scenarios highly complex
information has been actualized. What's more, since the information corresponds
to a pattern, we can even say that in this third scenario highly complex
patterned information has been actualized. Nevertheless, we would be wrong to
say that highly complex specified information has been actualized. Of the three
scenarios, only the information in the second scenario is specified. In that
scenario, by first painting the target and then shooting the arrow, the pattern
is given independently of the information. On the other hand, in this, the third
scenario, by first shooting the arrow and then painting the target around it,
the pattern is merely read off the information.

Specified information is always patterned information, but patterned information
is not always specified information. For specified information not just any
pattern will do. We may therefore distinguish between good patterns and bad
patterns. The good patterns will henceforth be called specifications.
Specifications are the independently given patterns that are not simply read off
information. By contrast, the bad patterns will be called fabrications.
Fabrications are the post hoc patterns that are simply read off information.

Unlike specifications, fabrications are wholly uninformative. We are no better
off with a fabrication than without one. This is clear from comparing the first
and third scenarios. Whether an arrow lands on a blank wall and the wall stays
blank (as in the first scenario), or an arrow lands on a blank wall and a target
is then painted around the arrow (as in the third scenario), any conclusions we
draw about the arrow's flight remain the same. In either case chance is as good
an explanation as any for the arrow's flight. The fact that the target fixes a
pattern in the third scenario makes no difference since this pattern is
constructed only after the arrow has flown and landed. Only when the pattern qua
target is given in advance of the arrow being shot does a hypothesis other than
chance come into play. Thus only in the second scenario does it make sense to
ask whether we are dealing with a skilled archer. Only in the second scenario
does the pattern constitute a specification. In the third scenario the pattern
constitutes a mere fabrication.

The distinction between specified and unspecified information may now be defined
as follows: the actualization of a possibility (i.e., information) is specified
if, independently of the possibility's actualization, the possibility is
identifiable via a pattern. If not, then the information is unspecified. Note
that this definition implies an asymmetry between specified and unspecified
information: specified information cannot become unspecified information, though
unspecified information may become specified information. Unspecified
information need not remain unspecified, but may become specified as our
background knowledge increases. For instance, a cryptographic transmission whose
cryptosystem we have yet to break will constitute unspecified information. Yet
as soon as we break the cryptosystem, the cryptographic transmission becomes
specified information.

What is it for a possibility to be identifiable via an independently given
pattern? A full exposition of specification requires a detailed answer to this
question. Unfortunately, such an exposition is beyond the scope of this paper.
The key conceptual difficulty here is to characterize the independence condition
that obtains between patterns and information. This independence condition in
turn decomposes into two conditions: (1) a condition of stochastic conditional
independence between the information in question and certain relevant background
knowledge; and (2) a tractability condition whereby the pattern in question is
constructible via the aforementioned background knowledge. Although these
conditions make good intuitive sense, they are not easily formalized. For the
details refer to my monograph The Design Inference.

If formalizing what it means for a pattern to be given independently of a
possibility is difficult, determining in practice whether a pattern is given
independently of a possibility is much easier. If the pattern is given prior to
the possibility being actualized (as in the second scenario above, where the
target was painted before the arrow was shot), then the pattern is automatically
independent of the possibility, and we are dealing with specified information.
Patterns given prior to the actualization of a possibility are just the
rejection regions of statistics. There is a well-established statistical theory
that describes such patterns and their use in probabilistic reasoning. These are
clearly specifications since having been given prior to the actualization of
some possibility, they have already been identified, and thus are identifiable
independently of the possibility being actualized.

Many of the interesting cases of specified information, however, are those in
which the pattern is given after a possibility has been actualized. This is
certainly the case with the origin of life: life originates first and only
afterwards do pattern-forming rational agents (like ourselves) enter the scene.
It remains the case, however, that a pattern corresponding to a possibility,
though formulated after the possibility has been actualized, can constitute a
specification. Certainly this was not the case in the third scenario above where
the target was painted around the arrow only after it hit the wall. But consider
the following example. Alice and Bob are celebrating their fiftieth wedding
anniversary. Their six children all show up bearing presents. Each present is
part of a matching set of china. There is no duplication of presents, and
together the presents form a complete set of china. Suppose Alice and Bob were
satisfied with their old set of china, and had no inkling prior to opening their
presents that they might expect a new set of china. Alice and Bob are therefore
without a relevant pattern whither to refer their presents prior to actually
receiving the presents from their children. Nevertheless, the pattern they
explicitly formulate only after receiving the presents could be formed
independently of receiving the presents (after all, their colluding children
formed just such a pattern prior to delivering their presents; so too, the china
manufacturer formed this pattern to construct the china in the first place).
This pattern therefore constitutes a specification.

But what about the origin of life? Is the origin of life specified? If so, to
what patterns does life correspond, and how are these patterns given
independently of life's origin? As was just pointed out, pattern-forming
rational agents like ourselves don't enter the scene till after life originates.
Nonetheless, there are functional patterns to which life corresponds, and which
are given independently of the actual living systems. An organism is a
functional system comprising many functional subsystems. The functionality of
organisms can be cashed out in any number of ways. Arno Wouters (1995) cashes it
out globally in terms of viability of whole organisms. Michael Behe (1996)
cashes it out in terms of the irreducible complexity and minimal function of
biochemical systems. Even the staunch Darwinist Richard Dawkins will admit that
life is specified functionally, cashing out the functionality of organisms in
terms of genetic reproduction. Thus Dawkins (1987, p. 9) will write:
"Complicated things have some quality, specifiable in advance, that is highly
unlikely to have been acquired by random chance alone. In the case of living
things, the quality that is specified in advance is ... the ability to
propagate genes in reproduction."

Life is specified. Life is also complex. The origin of life is the origin of
complex specified information. This then, suitably reformulated, is Manfred
Eigen's problem: how to explain the origin of complex specified information.
Complex specified information, or CSI for short, is what all the fuss over
information has been about in recent years, not just in biology, but within
science more generally. It is CSI that the various anthropic principles are
trying to explain when they account for the fine-tuning of the universe (cf.
Barrow and Tipler, 1986). It is CSI that David Bohm's quantum potentials are
extracting when they scour the microworld for what Bohm calls "active
information" (cf. Bohm, 1993, pp. 35-38). It is CSI that enables Maxwell's demon
to outsmart a thermodynamic system tending towards thermal equilibrium (cf.
Landauer, 1991, p. 26). It is CSI that David Chalmers posits in attempting to
explain human consciousness (cf. Chalmers, 1996, ch. 8). It is CSI that the
mathematician Keith Devlin (1991, p. 1) intends when he writes: "That there is
such a thing as information cannot be disputed.... After all, our very lives
depend upon it, upon its gathering, storage, manipulation, transmission,
security, and so on. Huge amounts of money change hands in exchange for
information. People talk about it all the time. Lives are lost in its pursuit.
Vast commercial empires are created in order to manufacture equipment to handle
it. Surely then it is there."

Nor is CSI confined to the domain of science. CSI is indispensable in our
everyday lives. The 16-digit number on your VISA card is an example of CSI. The
complexity of this number ensures that a would-be thief cannot randomly pick a
number and have it turn out to be a valid VISA card number. What's more, the
specification of this number ensures that it is your number, and not anyone
else's. Even your phone number constitutes CSI. As with the VISA card number,
the complexity ensures that this number won't be dialed randomly (at least not
too often), and the specification that this number is yours and yours only. All
the numbers on our bills, credit slips, and purchase orders represent CSI. CSI
makes the world go round. Consequently, CSI is also a rife field for
criminality. CSI is what motivated the villainous Michael Douglas character in
the movie Wall Street to lie, cheat, and steal. CSI's total and absolute control
was the objective of the monomaniacal Ben Kingsley character in the movie
Sneakers. CSI is the artifact of interest in most techno-thrillers. Ours is an
information age, and the information that excites us most is CSI.

The Law of Conservation of Information

With this characterization of CSI in hand, I want now to return to Manfred
Eigen's central problem: the origin of CSI. Where does CSI come from, and where
is CSI incapable of coming from? According to Eigen, CSI comes from algorithms
and natural laws. To recall Eigen's dictum: "Our task is to find an algorithm, a
natural law that leads to the origin of [complex specified] information." The
only question for Eigen is which algorithms and natural laws explain the origin
of CSI. The logically prior question of whether algorithms and natural laws are
even in principle capable of explaining the origin of CSI is not one he properly
considers. And yet it is a question whose answer vitiates Eigen's entire
project. Algorithms and natural laws are in principle incapable of explaining
the origin of information. To be sure, algorithms and natural laws can explain
the flow of information. Indeed, algorithms and natural laws are ideally suited
for transmitting already existing information. What they cannot do, however, is
originate information.

The easiest way to see this is mathematically. From a mathematical point of view
algorithms and natural laws are just functions, that is, relations between two
sets that associate with every member of one set (called the domain) one, and
only one, member of the other set (called the range). As such, the functional
relationship is fully deterministic: given an element in the domain, the
function determines a unique element in the range. For algorithms the domain
comprises the various possible input data, and the range the various possible
output data. For natural laws the domain comprises the various possible initial
and boundary conditions, and the range the various possible states at subsequent
times t. Now suppose we had some CSI j, and a function (qua algorithm or natural
law) f that, to quote Eigen again, led to the origin of the [complex specified]
information j. This would mean that some element in the domain of f, call it i,
when acted on by f, yielded the output j. But this hardly explains the origin of
the information j. One problem has been solved by creating another, for now the
origin of i must be explained.

Worse yet, the newly created problem is no easier than the one we started with.
Functional relationships at best preserve what information is already there, or
else degrade it, but they can never add to it. Thus however much information
resides in j will be contained in any i that via the function f maps onto j.
What's more, if j is specified, then the inverse image of j under the function f will
also be specified (in particular, since i maps onto j via f, i is in this
inverse image). In short, if j constitutes complex specified information and f
is a function that maps i onto j, then i constitutes specified information at
least as complex as j. Thus instead of explaining the origin of CSI, algorithms
and natural laws shift the problem elsewhere to a place where the origin of CSI
will be at least as difficult to explain.
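
One way to make the complexity half of this claim concrete is to let the probability of an output be the total probability of its inverse image. The Python sketch below works under that assumption, which is a reading of the argument rather than something the text states; the two-dice domain and the names bits, input_prob, and output_prob are hypothetical choices made for illustration.

```python
import math
from collections import defaultdict
from fractions import Fraction

def bits(p: Fraction) -> float:
    """Information measure: the negative base-2 logarithm of a probability."""
    return -math.log2(float(p))

# Hypothetical toy domain: the 36 ordered rolls of two fair dice.
input_prob = {(a, b): Fraction(1, 36) for a in range(1, 7) for b in range(1, 7)}

def f(roll):
    """A deterministic function on the domain: the sum of the two dice."""
    return roll[0] + roll[1]

# Assumption: the probability of an output j is the total probability of
# all inputs i that f maps onto j (the inverse image of j).
output_prob = defaultdict(Fraction)
for i, p in input_prob.items():
    output_prob[f(i)] += p

# For every input i, compare its information with that of its output f(i).
never_less = all(
    bits(p) >= bits(output_prob[f(i)]) for i, p in input_prob.items()
)

print(never_less)
print(f"input (3, 4): {bits(input_prob[(3, 4)]):.2f} bits")
print(f"output 7:     {bits(output_prob[7]):.2f} bits")
```

In this toy case every input carries at least as many bits as the output it maps onto: the roll (3, 4) comes to about 5.17 bits, while the sum 7 comes to about 2.58 bits.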

It is vital to realize that functions can only make the information problem
worse. Suppose, for instance, you look at the U.S. Statistical Abstract and find
that the average income of a U.S. citizen is so-much-and-so-much. How did this
item of information originate? Well, the census bureau had to contact all the
U.S. citizens, record their individual incomes, add the incomes all together,
and divide by the number of U.S. citizens. To take an average is thus to apply a
function: given the input data (all the individual U.S. incomes), the output data
is uniquely determined. But more so, to take an average is also to compress
data. The information inherent in the record of all individual incomes far
exceeds the information inherent in the corresponding average. Taking an average
is a standard statistical technique for compressing data. In an information age
we are inundated with information. Thus frequently when we look at information,
we look at information whose complexity, as a service to the information
seeker, has already been drastically compressed.
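
The point that averaging discards detail can be seen in a few lines. This is a minimal Python sketch with invented figures (they are not census data): three different hypothetical income records all produce the same average, so the average alone cannot tell us which record produced it.

```python
from statistics import mean

# Hypothetical income records; the numbers are invented for illustration.
records_a = [20_000, 40_000, 60_000]
records_b = [10_000, 50_000, 60_000]
records_c = [40_000, 40_000, 40_000]

# Taking the average is a function: each record determines exactly one output.
averages = [mean(r) for r in (records_a, records_b, records_c)]

# The function is many-to-one: distinct inputs collapse onto a single output.
distinct_records = len({tuple(r) for r in (records_a, records_b, records_c)})
distinct_averages = len(set(averages))

print(averages)
print(distinct_records, distinct_averages)
```

Three distinct inputs yield a single output, which is the sense in which the average is a compression of the underlying data.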

There is one subtlety we need now to consider, and it is the one which not just
Manfred Eigen, but also Ilya Prigogine, Stuart Kauffman, and indeed the entire
Santa Fe Institute group is pinning their hopes on. I have just argued that when
a function acts to yield information, what the function acts upon has at least
as much information as what the function yields. This argument, however, treats
functions as mere conduits of information, and does not take seriously the
possibility that functions might actually add information. I gave the example of
taking an average whereby data is compressed and information is lost. But
consider the function which maps library call numbers to their corresponding
books. Clearly, there is less information in the call numbers than in the books.
Thus here we have a function that is adding information. What's more, it is
adding information because the information is embedded in the function itself.

Although this observation seems to undermine my previous argument, in fact it
leaves the previous argument virtually unchanged. The point is that instead of
the function f now merely serving as a conduit taking information i and yielding
information j, the information in f must now itself be taken into account. The
way to do this is to employ the universal composition function U, which to an
ordered information-function pair (i,f) assigns the information obtained by
applying f to i, in this case j. Thus U(i,f) = f(i) = j. Now unlike f, which may
well incorporate information, U, the universal composition function,
incorporates no information of its own, but serves merely as a conduit for
information. By simply taking ordered pairs, and treating the second element as
a function applied to the first, U introduces no information of its own. U adds
no information. Note that in the case of algorithms U is just a universal Turing
machine. The form of my original argument is therefore unchanged: the
information j arises by applying U (cf. f in the original argument) to the
information (i,f) (cf. i in the original argument). Just as in computer science
the distinction between data and programs is not hard and fast, so the
distinction between functions and information is not hard and fast. We can thus
treat the ordered pair (i,f) as information which via the universal composition
function U yields the information j. And now it is clear that information inherent
in (i,f) exceeds that in j. Like a bulge under a rug, the information problem
can be shifted around, but it does not go away.
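
The universal composition function is simple enough to write out directly. The following Python sketch is illustrative only: the catalog, its call numbers, and its placeholder texts are hypothetical stand-ins, chosen to show that whatever the output contains beyond the input is carried by the function f, while U itself contributes nothing but the act of application.

```python
from typing import Callable, Dict, TypeVar

I = TypeVar("I")
J = TypeVar("J")

def U(i: I, f: Callable[[I], J]) -> J:
    """Universal composition: apply the second element of the pair to the first."""
    return f(i)

# Hypothetical catalog. The book contents live here, inside the data that
# the function f carries with it, not in the call numbers.
CATALOG: Dict[str, str] = {
    "CALL-001": "placeholder text standing in for the full contents of one book",
    "CALL-002": "placeholder text standing in for the full contents of another book",
}

def f(call_number: str) -> str:
    """Map a library call number to its corresponding book."""
    return CATALOG[call_number]

i = "CALL-001"
j = U(i, f)

print(len(i), len(j))
```

Here j is longer than i, but only because f was handed the catalog in advance; replacing f with a function that carries no catalog leaves U unchanged and the extra content gone.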

This argument, by employing the universal composition function, is perfectly
general. In particular, it answers the attempt by complexity-theorists to
account for the origin of information in terms of dynamical systems (for popular
accounts of this enterprise see Levy 1992 and Waldrop 1992).
Complexity-theorists, especially the Santa Fe Institute group, continue to hope
that information can be gotten on the cheap. Look at all those amazing fractal
patterns, we are told. The incredibly intricate Mandelbrot set is generated by
so modest a complex function as h(z) = z^2 + c. To state the matter in this way,
however, is misleading. The function h(z) = z^2 + c is simple enough, and even
simpler to write down. And granted, it is the crucial element in constructing a
graphic depiction of the Mandelbrot set. But that is the point: It is the
graphic depiction of the Mandelbrot set that has to be explained, not its
existence as an abstract mathematical object. And this graphic depiction has to
be constructed.

Pixels on a computer screen have to be assigned coordinates representing complex
numbers. The function h(z) = z^2 + c has to be iterated with respect to those
coordinates. The trajectory of those iterations needs to be tracked to see if
the trajectory stays locally bounded or heads off towards infinity. Given these
trajectories, a color has to be assigned to the pixel, black if the trajectory
stays locally bounded, white if it heads off to infinity. All of this must be
programmed. All of this is information far exceeding the information inherent in
simply writing down h(z) = z^2 + c. The function h(z) = z^2 + c is never the
function that produces the pretty graphic depictions of the Mandelbrot set we
see in books on fractals. Any function that produces a graphic depiction of the
Mandelbrot set will be a complicated algorithm employing a complicated set of
input data. Any such algorithm f applied to a data set i can be conjoined as an
ordered pair (i,f), and then evaluated by the universal composition function U
to produce a graphic depiction of the Mandelbrot set j. But by itself the
function h(z) = z^2 + c is too information-poor to produce this graphic depiction
of the Mandelbrot set j. Once we examine the precise informational antecedents
to j, the illusion that we have generated information for nothing disappears.
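
The steps just listed (assigning coordinates, iterating, tracking the trajectory, coloring) can be sketched in a few lines of Python. This is a rough character-based rendering rather than any particular published program. The grid size, the viewing window, the iteration cap of 50, the use of '#' for black and a space for white, and the starting value z = 0 are all choices made for this illustration; the escape test |z| > 2 is the standard one, and a finite iteration cap only approximates boundedness.

```python
def escapes(c: complex, max_iter: int = 50, bound: float = 2.0) -> bool:
    """Iterate z -> z^2 + c from z = 0 and report whether |z| exceeds the bound."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > bound:
            return True
    return False

def render(width: int = 60, height: int = 24,
           re_min: float = -2.0, re_max: float = 1.0,
           im_min: float = -1.2, im_max: float = 1.2) -> list:
    """Assign each grid cell a complex coordinate and a character 'color'."""
    rows = []
    for row in range(height):
        # Map the row index to an imaginary part, top of the grid first.
        im = im_max - (im_max - im_min) * row / (height - 1)
        line = ""
        for col in range(width):
            # Map the column index to a real part, left to right.
            re = re_min + (re_max - re_min) * col / (width - 1)
            # '#' (black) if the trajectory stays bounded, ' ' (white) if not.
            line += " " if escapes(complex(re, im)) else "#"
        rows.append(line)
    return rows

picture = render()
print("\n".join(picture))
```

Even this crude version needs a coordinate mapping, an iteration loop, a stopping rule, and a coloring rule in addition to the formula itself.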

The origin of CSI simply cannot be explained as the output of a function, be it
an algorithm, a natural law, or whatever. The root problem here is that
functions are deterministic, and thus cannot yield contingency. Recall that
information becomes realizable only when there obtains a multiplicity of
distinct possibilities, any one of which might actually happen. The problem
with functions is
that they invariably yield only a single live possibility. Take a computer
algorithm that performs addition. Let us say the algorithm has a correctness
proof, so that it performs its additions correctly. Given the input data 2+2,
can the algorithm output anything other than 4? Algorithms, and functions more
generally, are wholly deterministic. They allow for no contingency, and thus can
generate no information. At best functions can shift information around, or lose
it, as when data gets compressed. What they cannot do is produce contingency.
And without contingency they cannot generate information.

If not by means of functions, how then does contingency arise? Two, and only
two, answers are possible here. Either the contingency is a blind, purposeless
contingency, which is chance; or it is a guided, purposeful contingency, which is
intelligent causation. We shall return to intelligent causation in due course,
but for now let us examine whether chance is capable of generating CSI. First
notice that pure chance, entirely unsupplemented and left to its own devices, is
incapable of generating CSI. Chance can generate complex unspecified
information, and chance can generate non-complex specified information. What
chance cannot generate is information that is jointly complex and specified.

To see this, consider again our archer friend who fires arrows at a large blank
wall. The archer, even if driven purely by chance, is perfectly capable of
generating complex unspecified information: the precise place where the arrow
hits a large blank wall signifies a highly improbable unspecified event,
instancing complex unspecified information (recall that high probability
corresponds to low complexity whereas low probability, i.e., high
improbability, corresponds to high complexity). Alternatively, if a target is
painted on the wall, but the target is so large that the bull's-eye takes up
half the area of the wall, then the archer, even if driven purely by chance,
will be quite likely to hit the bull's-eye, thereby generating non-complex
specified information: hitting the bull's-eye signifies a specified
high-probability event, instancing non-complex specified information. What an
archer driven purely by chance cannot do is, having painted a minuscule target
on the wall, hit the bull's-eye, thereby generating information that is both
complex and specified: hitting the bull's-eye of a minuscule target signifies a
highly improbable specified event, instancing complex specified information, or CSI.

But can't someone simply by chance let fly an arrow and hit a bull's-eye? Not if
the target is sufficiently small. At some point the improbabilities become too
vast and the specifications too tight for chance to be taken seriously. Just
where this point is first reached can be debated, but that there is a
probabilistic cut-off beyond which chance becomes an unacceptable explanation is
beyond doubt. The universe will experience heat death before random typing at a
keyboard produces a Shakespearean sonnet. The French mathematician Emile Borel
(1962, p. 28) proposed 10^-50 as a universal probability bound below which chance
could definitely be precluded, i.e., any specified event as improbable as this
could not be attributed to chance. Borel based his universal probability bound
on cosmological considerations, taking into account the opportunities to repeat
and observe events through the history and expanse of the universe. Borel's
10^-50 probability bound translates into 166 bits of information. I have proposed
a more stringent universal probability bound of 10^-150 based on the number of
elementary particles in the universe, the Planck time, and the duration of the
universe until its heat death (see Dembski, 1996, ch. 6). A probability bound of
10^-150 translates into 500 bits of information. The bound I propose is more
securely justified than Borel's. Given a universal probability bound of 10^-150
we therefore refuse to attribute to chance specified information with a
complexity of 500 or more bits. I have yet to encounter CSI with a complexity
greater than the 500 bits for which chance is an adequate explanation.
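The conversion from a probability bound to bits is just the negative base-2 logarithm of the probability. A minimal sketch in Python follows; the helper name bits_for_bound is hypothetical, and it works from the exponent rather than the probability itself simply to avoid floating-point underflow.

```python
import math


def bits_for_bound(exponent):
    """Return -log2(10**(-exponent)), i.e. the information in bits for a bound of 10^-exponent."""
    return exponent * math.log2(10)


borel_bits = bits_for_bound(50)      # about 166.1 bits
proposed_bits = bits_for_bound(150)  # about 498.3 bits, rounded in the text to 500

print(round(borel_bits, 1), round(proposed_bits, 1))
```

The 500-bit figure is thus a round number slightly above the exact value for 10^-150.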

Biologists by and large do not dispute this claim. Most are agreed that pure
chance (the Epicurean hypothesis, as Hume called it) is not an adequate explanation
for CSI. Jacques Monod (1972) is one of the few exceptions, arguing that the
origin of life, though vastly improbable, can nonetheless be attributed to
chance because of a selection effect. Just as the winner of a lottery is shocked
at winning, so we are shocked to have evolved. But the lottery was bound to have
a winner, and so too something was bound to have evolved. Something vastly
improbable was bound to happen, and so, the fact that it happened to us (i.e.,
that we were selected, hence the name selection effect) does not preclude chance.
This is Monod's argument and it is fallacious. It has been refuted by the
philosophers John Earman, William Craig, and Richard Swinburne. It has also been
refuted by the biologists Wolfgang Stegmüller, Bernd-Olaf Küppers, and Hubert
Yockey. Swinburne's refutation is perhaps the most memorable (Swinburne, 1979,
p. 138):

Suppose that a madman kidnaps a victim and shuts him in a room with a
card-shuffling machine. The machine shuffles ten packs of cards simultaneously
and then draws a card from each pack and exhibits simultaneously the ten cards.
The kidnapper tells the victim that he will shortly set the machine to work and
it will exhibit its first draw, but that unless the draw consists of an ace of
hearts from each pack, the machine will simultaneously set off an explosion
which will kill the victim, in consequence of which he will not see which cards
the machine drew. The machine is then set to work, and to the amazement and
relief of the victim the machine exhibits an ace of hearts drawn from each pack.
The victim thinks that this extraordinary fact needs an explanation in terms of
the machine having been rigged in some way. But the kidnapper, who now
reappears, casts doubt on this suggestion. "It is hardly surprising," he says,
"that the machine [drew] only aces of hearts. You could not possibly see
anything else. For you would not be here to see anything at all, if any other
cards had been drawn." But of course the victim is right and the kidnapper is
wrong. There is indeed something extraordinary in need of explanation in ten
aces of hearts being drawn. The fact that this peculiar order is a necessary
condition of the draw being perceived at all makes what is perceived no less
extraordinary and in need of explanation.

Selection effects do nothing to render chance an adequate explanation of complex
specified information. For a detailed treatment of selection effects and their
failure to account for CSI, see Dembski (1996, sec. 6.3).

Most biologists then reject pure chance as an adequate explanation of CSI. The
problem here is not simply one of faulty statistical reasoning. Besides flying
in the face of every canon of sound statistical reasoning, pure chance is
scientifically unsatisfying as an explanation of CSI. To explain CSI in terms of
pure chance is no more instructive than pleading ignorance or proclaiming CSI a
mystery. It is one thing to explain the occurrence of heads on a coin toss by
appealing to chance. It is quite another, as Küppers (1990, p. 59) points out,
to follow Monod and take the view that "the specific sequence of the nucleotides
in the DNA molecule of the first organism came about by a purely random process
in the early history of the earth." CSI cries out for explanation, and pure
chance won't do it. Richard Dawkins (1987, pp. 139, 145-146) makes this point
eloquently:

We can accept a certain amount of luck in our explanations, but not too
much.... In our theory of how we came to exist, we are allowed to postulate a
certain ration of luck. This ration has, as its upper limit, the number of
eligible planets in the universe.... We [therefore] have at our disposal, if
we want to use it, odds of 1 in 100 billion billion as an upper limit (or 1 in
however many available planets we think there are) to spend in our theory of the
origin of life. This is the maximum amount of luck we are allowed to postulate
in our theory. Suppose we want to suggest, for instance, that life began when
both DNA and its protein-based replication machinery spontaneously chanced to
come into existence. We can allow ourselves the luxury of such an extravagant
theory, provided that the odds against this coincidence occurring on a planet do
not exceed 100 billion billion to one.

Dawkins is right. We can allow our scientific theorizing only so much luck.
After that we degenerate into handwaving and mystery. A probability bound of
10^-150, or a corresponding complexity bound of 500 bits of information, sets a
conservative limit on the amount of luck we can allow ourselves (certainly more
conservative than the one Dawkins was just now alluding to).

We may summarize our findings up to this point as follows: (1) Chance generates
contingency, but not complex specified information. (2) Functions (e.g.,
algorithms and natural laws) generate neither contingency, nor information, much
less complex specified information. (3) At best functions transmit already
present information. Given these three findings, it seems intuitively obvious
that no chance-function combination is going to generate information either.
After all, functions transmit what they are given, and whatever chance gives a
function is not complex specified information. Ergo, chance and functions
working in tandem cannot generate information. This intuition is of course
exactly right, and I shall provide a theoretical justification for it
momentarily. Nevertheless, the sense that functions can sift chance and thereby
generate CSI is deep-seated in the scientific community. Trial and error is the
basis for all sorts of probabilistic algorithms (e.g., genetic algorithms), and
what is trial and error but the sifting of chance by means of a function? What's
more, the very Darwinian mechanism of mutation and natural selection is a
chance-function combination, in which the variability of the organism provides
the chance component, and selection pressure from the environment provides the
function component.

The theoretical justification for the inability of chance and functions working
in tandem to generate information is virtually the same as the theoretical
justification given earlier for the inability of functions by themselves to
generate information. Instead of considering a deterministic function f(i) in
one variable, we now consider an indeterministic function f(i,w) in two
variables where the first variable signifies the object on which the function
acts, and the second the randomizing component. We then define the universal
composition function U which inputs the object-chance-function ordered triple
(i,w,f) and outputs f(i,w) = j, i.e., U(i,w,f) = f(i,w) = j. As in the
deterministic case, the universal composition function U incorporates no
information of its own, but serves merely as a conduit for information. U adds
no information. The formalism just described for combining chance and functions
is perfectly general, and accommodates everything from Darwin's
mutation-selection mechanism to the probabilistic algorithms of computer science
(genetic algorithms being a case in point).

Now suppose we had some CSI j, and an indeterministic function (i.e.,
chance-function combination) f that, to quote Eigen again, led to the origin of
the CSI j. The origin of the CSI j can then be broken into two stages. In the
first stage, a chance outcome w occurs. Once w occurs and is fixed, the function
f becomes deterministic, i.e., f becomes the function in one variable f(., w) =
f_w(.), w now being treated as a fixed parameter of the function f. In the second
stage, the parameterized deterministic function f_w(.) gets applied to some
element in its domain, call it i, yielding the item of interest, the CSI j. From
this it is clear that neither of these stages can generate CSI. The first stage
involves only chance, and therefore, as was argued earlier, cannot generate CSI.
The second stage involves no chance, but only a deterministic function, and
therefore, as was argued earlier, cannot generate CSI either. Thus at no point
in the transition from w to f_w(.) to f_w(i) = j is CSI created. Whatever CSI is
inherent in j is therefore already inherent in the indeterministic function f
together with the nonrandom element in the domain of f, namely, i. This argument
is valid and holds universally. Just as chance or functions left to themselves
individually cannot purchase CSI, so too their joint action cannot purchase CSI
either.
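The two-stage decomposition can be sketched in a few lines of Python. The particular function f below is an arbitrary toy chosen only for illustration, as are the seed and the input 42; nothing about them comes from the text. What the sketch shows is the formal step only: once the chance outcome w is drawn and fixed, f(., w) is an ordinary deterministic function f_w, and U merely applies it.

```python
import random
from functools import partial


def f(i, w):
    """A toy indeterministic function of an object i and a chance component w (assumed for illustration)."""
    return (31 * i + w) % 1000


def U(i, w, f):
    """Universal composition function: apply f to (i, w) and add nothing of its own."""
    return f(i, w)


rng = random.Random(0)    # seeded so that the sketch is repeatable
w = rng.randrange(1000)   # stage 1: a chance outcome occurs and is then held fixed
f_w = partial(f, w=w)     # f(., w) = f_w(.), now a deterministic function of i alone
i = 42                    # the nonrandom element in the domain of f
j = f_w(i)                # stage 2: the deterministic function is applied to i

print(j == U(i, w, f))    # the two-stage route and U agree
print(f_w(i) == f_w(i))   # with w fixed, repeated application gives the same j
```

The sketch illustrates the formalism, not the conservation claim itself, which rests on the argument given in the text.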

This result, that neither chance nor functions nor some combination of the two
can generate CSI, I call the Law of Conservation of Information, or LCI for
short. Though formulated at a high level of mathematical abstraction, LCI has
many profound implications for science. Among its immediate corollaries are the
following: (1) The CSI within a system closed to outside information always
remains constant or decreases. (2) If CSI increases within a system, then CSI
was added exogenously. (3) CSI cannot be generated spontaneously, originate
endogenously, or organize itself. (4) To explain the CSI within a system is to
appeal to a system whose CSI is equal or greater in complexity still (in
particular, reductive explanations of CSI are never adequate).


Applying the Theory to Evolutionary Biology

Up to this point I have sketched a theory of complex specified information, and
concluded with a general law characterizing the origin of complex specified
information, to wit, the Law of Conservation of Information. I want next to
apply this theory to evolutionary biology. Before doing so, however, it will be
convenient to provide a synonym for the term "function." As I have used the
term, "function" signifies a certain law-like mathematical relation between two
sets. In the sequel it will therefore be convenient to use the word "law" to
signify functions. If we do this, the Law of Conservation of Information has the
following perspicuous formulation: Neither law nor chance nor some combination
of the two can generate complex specified information. The reference to
functions was useful so long as their mathematical properties were being
explicitly cited. But continued reference to them, especially when juxtaposed
with chance, will henceforth tend to obscure rather than clarify. Thus in
particular we shall refer to deterministic laws as functions of the form f(i) =
j and indeterministic laws as chance-function combinations of the form f(i,w) =
j with random component w.

In applying the theory of information here developed to evolutionary biology,
let us begin by noting that nothing in this theory so far undermines the
naturalistic accounts of evolution currently in vogue. All that has been shown
so far is that CSI is not a free lunch in the sense that law and chance together
cannot generate CSI. But law and chance can take already existing CSI and shift
it around. And there is nothing to prevent CSI from being abundant in the
universe, and thus to prevent law and chance from expressing CSI in the origin
and development of biological systems. With Hubert Yockey (1992, p. 335) we
could therefore say that CSI, and by implication life, is axiomatic, and leave
it at that. Like the principle of rationality which according to the ancient
Stoics pervaded the universe, we could simply treat CSI as a given.

Although this move might be philosophically justified, it remains scientifically
unsatisfying. As scientists we want to know how the CSI which supposedly is so
abundant in the universe got itself into the organisms we see around us. In
reference to the origin of life, we want to know the informational pathway that
takes the CSI inherent in a lifeless universe, and translates it into a
protobiont. In reference to the development of life, we want to know the
informational pathway that takes the CSI inherent in an already existing
organism plus its environment, and translates this CSI into an organism of still
greater complexity. Even if the origin of CSI admits no scientific explanation,
its flow surely does. How then does CSI flow into and out of biological systems?

The answer to this question, at least in broad terms, is clear: The CSI inherent
in an organism consists of the CSI acquired at birth together with whatever CSI
it acquires during the course of its life. The CSI acquired at birth derives
from inheritance with modification (i.e., the CSI acquired at birth is inherited
from the parent(s) and consists of the CSI inherent in the parent(s) as modified
by chance). The CSI acquired after birth consists of selection (i.e., the
environmental pressure that selects some organisms to reproduce and eliminates
others before they can reproduce) along with infusion (i.e., the direct
introduction of novel information from outside the organism). The Darwinian
mechanism admits selection and inheritance with modification, but proscribes
infusion. The Lamarckian mechanism, on the other hand, focuses mainly on
infusion. Certainly infusion as Lamarck conceived it has largely been
discredited. Nevertheless, there is good scientific evidence for non-Lamarckian
infusion wherein organic informational structures belonging to one organism are
assimilated by another. For instance, it is well-established that bacteria
exchange plasmids as a way of developing antibiotic resistance
(cf. Amábile-Cuevas et al., 1995, p. 324). On the other hand, Lynn Margulis's
idea of symbiosis, where organisms co-opt and assimilate other organisms to form
still more complex organisms, remains speculative (cf. Margulis, 1993).

Inheritance with modification, selection, and infusion: these three account for
the CSI inherent in biological systems. Together they comprise all the sources
of CSI in biology. I want therefore to examine more closely the respective roles
of these three sources in contributing to the CSI of an organism. First consider
inheritance with modification (alternatively, inheritance and mutation).
Inheritance is merely a conduit for already existing information and
modification is merely chance operating on the information passing through this
conduit. It follows that by itself inheritance with modification is incapable of
explaining the increased complexity of CSI that organisms have exhibited in the
course of natural history. Inheritance with modification needs therefore to be
supplemented.

The most obvious candidate here, of course, is selection. Selection presupposes
inheritance with modification, but instead of merely shifting around already
existing information, selection also introduces new information. By seizing on
advantageous modifications, selection is able to introduce new information into
a population. The majority view in biology, known as the neo-Darwinian
synthesis, is that selection and inheritance with modification together are
adequate to account for all the CSI inherent in organisms. As a parsimonious
account of the origin and development of life, this view has much to commend it.
Unfortunately, this view places undue restrictions on biological information
flow, restrictions which biological systems seem routinely to violate. The
problem is that selection and inheritance with modification can only yield very
gradual increases in the informational complexity of organisms, whereas many of
the increases in the informational complexity of organisms are abrupt and large.

This point deserves careful attention. Suppose that an organism in reproducing
generates N offspring, and that of these N offspring M (1 ≤ M ≤ N) succeed in
reproducing. The amount of information introduced through selection is then
-log2(M/N). Let me stress that this formula is not a case of misplaced
mathematical exactness. This formula holds universally and is non-mysterious.
Take a simple non-biological example. If I am sitting at a radio transmitter,
and can transmit only zeros and ones, then every time I transmit a zero or one,
I choose between two possibilities, selecting precisely one of them. Here N
equals 2 and M equals 1. The information -log2(M/N) thus equals -log2(1/2) = 1,
i.e., 1 bit of information is introduced every time I transmit a zero or one.
This is of course as things should be. Now this example from communication
theory is mathematically isomorphic to the case of cell-division where only one
of the daughter cells goes on to reproduce. On the other hand, if both daughter
cells go on to reproduce, then N equals M equals 2, and thus -log2(M/N) =
-log2(2/2) = 0, indicating that selection, by failing to eliminate any
possibility, failed also to introduce new information. To take another example,
imagine you are typing at a keyboard consisting of the twenty-six capital Roman
letters. Thus every time you type a key you select one of twenty-six letters.
Here N equals 26 and M equals 1. The information -log2(M/N) thus equals
-log2(1/26) = 4.7, i.e., 4.7 bits of information are introduced every time you
type a key. Or consider a dog breeder who from a given litter of seven Boston
terriers selects two for reproduction. The dog breeder thus introduces
-log2(2/7) = 1.8 bits of information into those Boston terriers selected for
reproduction. (In the formula -log2(M/N) and throughout these examples I have
assumed a uniform probability distribution. This simplifying assumption,
however, only strengthens our case: since uniform probability distributions
maximize entropy, on average the information introduced through selection will
in fact fall below -log2(M/N).)
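The four worked examples in this paragraph can be checked with a short Python sketch. The function name selection_bits is hypothetical; it computes log2(N/M), which is the same quantity as the negative base-2 logarithm of M/N, under the uniform-distribution assumption stated above.

```python
import math


def selection_bits(m, n):
    """Information in bits when m of n equally likely possibilities are selected."""
    if not 1 <= m <= n:
        raise ValueError("require 1 <= m <= n")
    return math.log2(n / m)   # equal to -log2(m / n)


print(selection_bits(1, 2))             # radio transmitter, or one daughter cell of two: 1.0
print(selection_bits(2, 2))             # both daughter cells reproduce: 0.0
print(round(selection_bits(1, 26), 1))  # one key out of twenty-six: 4.7
print(round(selection_bits(2, 7), 1))   # two terriers out of a litter of seven: 1.8
```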

It's therefore clear that selection among the offspring of an organism can at
most introduce a few bits of information. Cell division, the preeminent form of
reproduction, and the only one prior to multi-cellular life, introduces at most
one bit of information. Even if an organism can produce 10^30 gametes (so many
gametes that their biomass would equal that of the earth), and each of these
became a mature organism, and then only one of these mature organisms were
selected for further reproduction, the total number of bits of information
introduced through selection would in this instance be -log2(1/10^30) ≈ 100. A
hundred bits of information is far less information than is contained in an
average protein.

From these observations it is clear that selection can accumulate a lot of
information over successive generations. As is noted in Joklik and Willett's
(1976, p. 78) microbiology text, "Within a short period, often as short as 20
minutes, a bacterium can create a complete duplicate of itself, which in turn is
capable of duplicating." Over a billion years, at one bit of information introduced
every twenty minutes, selection could in principle produce 26 trillion bits of
information, certainly enough to handle any conceivable genome. Nonetheless,
from these observations it is equally clear that selection can only produce a
very limited amount of information at any one generation. 100 bits is certainly
too generous. The most fecund breeders with which I am familiar are certain fish
whose spawn include a hundred million eggs. A realistic upper limit on the
amount of biological information introduced by selection is therefore around 30
bits. For many organisms it is far less. Mammals, for instance, have an upper
limit of about 5 bits of information per generation through selection.
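The figures in the last two paragraphs follow from the same logarithm plus a count of generations. The Python sketch below reproduces them; the year length of 365.25 days is an assumption made here for the arithmetic, and the variable names are hypothetical.

```python
import math

# Per-generation ceilings: one survivor selected out of N equally likely offspring.
bits_1e30_gametes = 30 * math.log2(10)   # N = 10^30: about 99.7 bits, i.e. roughly 100
bits_fish_spawn = 8 * math.log2(10)      # N = 10^8 eggs: about 26.6 bits, i.e. under 30
mammal_litter = 2 ** 5                   # 5 bits corresponds to selecting 1 of 32 offspring

# Cumulative total: one bit every twenty minutes for a billion years.
generations_per_year = 365.25 * 24 * 3   # three twenty-minute generations per hour
total_bits = 1e9 * generations_per_year  # about 2.6e13, i.e. 26 trillion bits

print(round(bits_1e30_gametes, 1), round(bits_fish_spawn, 1), mammal_litter)
print(f"{total_bits:.2e}")
```

The contrast drawn in the text is between the large cumulative total and the small per-generation ceiling.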

The preceding analysis gives new urgency to Darwin's (1859, p. 189) famous
challenge: "If it could be demonstrated that any complex organ existed, which
could not possibly have been formed by numerous, successive, slight
modifications, my theory would absolutely break down." In information-theoretic
terms, this is to say that if informational jumps of considerably more than
thirty bits are required in any one generation, then some means of producing
information other than selection must be sought. Have such informational jumps
been discovered? Darwin and his disciples believe in the infinite plasticity of
organisms to change gradually from one form into another. This belief, however,
no longer seems justified.

Perhaps the clearest examples of informational jumps that exceed the power of
selection occur in biochemistry. Michael Behe (1996) and Siegfried Scherer
(1983) have both examined biochemical systems which if produced by selection
need to be produced in a single generation, but whose information requirements
exceed what selection can deliver in a single generation. The key feature of
these biochemical systems is one Behe calls irreducible complexity. A system is
irreducibly complex if it consists of several interrelated components the
removal of any one of which leads to the complete loss of function of the
system. As an example of irreducible complexity, Behe (1996, p. 43) offers a
mousetrap. A mousetrap consists of a platform, a hammer, a spring, a catch, and
a holding bar. Remove any one of these five components, and it is impossible to
construct a functional mousetrap. Irreducible complexity needs to be contrasted
with reducible complexity. A system is reducibly complex if it contains a
dispensable component, i.e., a component which can be removed without destroying
functionality. An example of a reducibly complex system is a pocket watch. The
glass face that covers and protects the dial is not necessary for the watch to
keep time. It can be removed without destroying the watch's function (function
may be diminished, but it is not lost).

Besides being contrasted with reducible complexity, irreducible complexity needs
also to be contrasted with cumulative complexity. A system is cumulatively
complex if the components of the system can be arranged sequentially so that the
successive removal of components never leads to the complete loss of function.
An example of a cumulatively complex system is a city. It is possible
successively to remove people and services from a city until one is down to a
tiny village, all without losing the cohesiveness of the community, which in
this case constitutes functionality. Note, however, that the order in which
people and services are removed is important. To remove as the first thing the
police and courts from a large city would result in chaos. Observe that it is
possible to define cumulative complexity recursively in terms of reducible
complexity: A system is cumulatively complex if it is reducibly complex, and if
after the removal of some component from the system, the system is again
cumulatively complex. It follows that cumulatively complex systems are always
reducibly complex. The converse, however, is not the case. Reducibly complex
systems may contain an irreducibly complex core, and thus fail to be
cumulatively complex. For instance, a pocket watch, though reducibly complex,
contains certain ineliminable components without which the watch cannot
function, e.g., hour and minute hands, certain gears and springs, and a base to
keep all these elements together. Such ineliminable components form the
irreducible core of the pocket watch.
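The three definitions can be stated as predicates over a set of components and a test for whether a given set still functions. The Python sketch below is one way to do this, under loudly stated assumptions: the component lists and the three works functions are toy models invented for illustration, and the recursion is given a base case (a functioning single-component system, the "tiny village") that the text leaves implicit.

```python
from typing import Callable, FrozenSet

System = FrozenSet[str]
Works = Callable[[System], bool]


def is_reducible(system: System, works: Works) -> bool:
    """Functions, and at least one component can be removed without losing function."""
    return works(system) and any(works(system - {c}) for c in system)


def is_irreducible(system: System, works: Works) -> bool:
    """Functions, but removing any one component destroys function."""
    return works(system) and not any(works(system - {c}) for c in system)


def is_cumulative(system: System, works: Works) -> bool:
    """Recursive definition from the text, with an assumed single-component base case."""
    if not works(system):
        return False
    if len(system) == 1:
        return True
    return any(works(system - {c}) and is_cumulative(system - {c}, works)
               for c in system)


# Toy models (assumptions for illustration only).
MOUSETRAP = frozenset({"platform", "hammer", "spring", "catch", "holding bar"})
WATCH_CORE = frozenset({"hands", "gears", "springs", "base"})
WATCH = WATCH_CORE | {"glass face"}
CITY = frozenset({"residents", "shops", "transit", "police", "courts"})


def mousetrap_works(s: System) -> bool:
    return MOUSETRAP <= s


def watch_works(s: System) -> bool:
    return WATCH_CORE <= s


def city_works(s: System) -> bool:
    # Anything larger than a two-component village needs police and courts.
    return "residents" in s and (len(s) <= 2 or {"police", "courts"} <= s)


print(is_irreducible(MOUSETRAP, mousetrap_works))
print(is_reducible(WATCH, watch_works), is_cumulative(WATCH, watch_works))
print(is_cumulative(CITY, city_works))
```

In this toy model the watch is reducible but not cumulative because of its irreducible core, and the city is cumulative only along some removal orders, since removing the police first leaves a set that no longer functions. The assumed base case is the one place where the sketch departs from the text: a one-component system counts as cumulative here without being reducible.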

Given these types of complexity (irreducible, reducible, and cumulative), it is
clear that selection can account for cumulative complexity. The gradual accrual
of information via selection mirrors the retention of function as components are
removed in cumulative complexity. Selection has no problem producing cumulative
complexity. But what about irreducible complexity? Can selection produce
irreducible complexity? Certainly if selection acts with reference to a goal, it
can produce an irreducibly complex system. Take Michael Behe's mousetrap, for
instance. Given the goal of constructing a mousetrap, one can specify a
goal-directed selection process that in turn selects a platform, a hammer, a
spring, a catch, and a holding bar, and at the end puts all these components
together to form a functional mousetrap. Given a pre-specified goal, selection
has no difficulty producing irreducibly complex systems.

But the selection that operates in biology is Darwinian natural selection. And
this form of selection operates without goals, has neither plan nor purpose, and
is wholly undirected (cf. Miller and Levine, 1993, p. 658). The great appeal of
Darwin's selection mechanism was precisely that it would eliminate teleology
from biology. Yet by making selection an undirected process, Darwin drastically
abridged the type of complexity biological systems could manifest. Henceforth
biological systems could manifest only cumulative complexity, not irreducible
complexity. Why is this? As Behe (1996, p. 39) explains, "An irreducibly complex
system cannot be produced ... by slight, successive modifications of a
precursor system, because any precursor to an irreducibly complex system that is
missing a part is by definition nonfunctional.... Since natural selection can
only choose systems that are already working, then if a biological system cannot
be produced gradually it would have to arise as an integrated unit, in one fell
swoop, for natural selection to have anything to act on."

Recall that for the complex specified information inherent in organisms, what
specifies this information is functionality. The organism as a whole, as well as
its various subsystems, is specified in virtue of the respective functions these
systems perform. For irreducibly complex systems, however, function is attained
only when all components of a system are in place. Moreover, natural selection,
insofar as it introduces complex specified information into organisms, must
select for function. It follows that natural selection, if it is going to
produce an irreducibly complex system, has to produce it all at once or not at
all. Of course, this would not be a problem if the amount of information natural
selection can produce in a single generation matches or exceeds the amount of
information inherent in the irreducibly complex systems of biology. But nothing
like this is the case. Whereas natural selection at its very best can introduce
about 30 bits of information per generation, the irreducibly complex biochemical
systems Michael Behe considers in Darwin's Black Box contain several orders of
magnitude more information. These irreducibly complex biochemical systems, like
the bacterial flagellum, are protein machines consisting of numerous distinct
proteins, each indispensable for the function of the machine (hence the
irreducible complexity), and where each individual protein in the machine
requires more bits of information than natural selection can conceivably produce
in a single generation.

The irreducible complexity of biochemical systems counts decisively against the
joint action of selection and inheritance with modification to account for the
CSI in biological systems. Because irreducible complexity occurs at the
biochemical level, there is no lower level of biological analysis to which the
irreducible complexity of biochemical systems might be referred, and at which a
Darwinian analysis in terms of selection and inheritance with modification might
still hope for success. Undergirding biochemistry is ordinary chemistry and
physics, neither of which can account for biological information (cf. Yockey,
1992). Also, whether a biochemical system is irreducibly complex is a fully
empirical question: Individually knock out each protein constituting an
irreducibly complex biochemical system, and determine whether function is lost.
If so, we are dealing with an irreducibly complex system. Mutagenesis
experiments of this sort are routine in biochemistry.

If the joint action of selection and inheritance with modification is unable to
account for the CSI in biological systems (and specifically for the irreducible
complexity of certain biochemical systems like the bacterial flagellum), there
remains but one source for the CSI in biological systems, namely, infusion, the
direct introduction of novel information from outside the biological system. In
principle there is nothing problematic or controversial about infusion. To
innovate a given informational structure an organism has informational needs,
and these needs can be supplied from outside the organism, either through
selection pressures (and therefore indirectly), or by the insertion of
ready-to-go information into the organism (and therefore directly). The latter
is of course infusion.

Although at this level of generality infusion is unproblematic, it quickly
becomes problematic once we start tracing backwards the informational pathways
of infused information. Consider for instance what is perhaps the best
scientifically confirmed instance of infusion in biology, namely, plasmid
exchange among bacteria to develop antibiotic resistance (cf. Amábile-Cuevas et
al., 1995, p. 324). Plasmids are small circular pieces of DNA that can easily be
exchanged among bacteria of the same species, and are capable of conferring
antibiotic resistance. When one bacterium releases a plasmid and another absorbs
it, information is infused from one into the other. By itself this is
unproblematic. Problems begin, however, when we ask, "Where did the bacterium
that released the plasmid in turn derive it?" There is a regress here, and this
regress always terminates in something non-organismal. We can't just keep
explaining plasmid infusion into a bacterium by plasmid release from another
bacterium; eventually, as we trace the informational pathway back, we must tell a
different kind of story. If, for instance, the plasmid is cumulatively complex,
then it could have arisen through selection and inheritance with modification.
But if on the other hand it is irreducibly complex, whence could it have arisen?


It will be helpful here to distinguish between symbiotic and abiotic infusion,
and correspondingly between endogenous and exogenous information. Symbiotic
infusion is the infusion of information from one organism to another; abiotic
infusion is the infusion of information not derived from any organism.
Correspondingly, endogenous information comprises symbiotically infused
information (and thus information already present within biology); exogenous
information comprises abiotically infused information (and thus information
external to biology). Now regardless of whether plasmids are irreducibly complex
or have an irreducibly complex core (the analysis to determine the nature of the
complexity of plasmids has to my knowledge not yet been performed), the fact
remains that there exist irreducibly complex biochemical systems. What's more,
even though symbiotic infusion may explain how a particular instance of an
irreducibly complex biochemical system came to exist in a given organism, it
cannot explain how such a system arose in the first place. Because organisms
have a finite trajectory back in time, symbiotic infusion must ultimately give
way to abiotic infusion, and endogenous information must ultimately derive from
exogenous information.


Reconceptualizing Evolutionary Biology

The abiotic infusion of exogenous information is the great mystery confronting
modern evolutionary biology. It is Manfred Eigen's mystery with which we began
this paper. Why is it a mystery? Not because the abiotic infusion of exogenous
information is inherently spooky or unscientific, but rather because
evolutionary biology has failed to grasp the centrality of information to its
task. The task of evolutionary biology is to explain the origin and development
of life. The key feature of life is the presence of complex specified
information (CSI). Caught up in the Darwinian mechanism of selection and
inheritance with modification, evolutionary biology has failed to appreciate the
informational hurdles organisms need to jump in the course of natural history.
To jump those hurdles, organisms require information. What's more, a significant
part of that information is exogenous and must originally have been infused
abiotically.

In this section I want briefly to consider what evolutionary biology would look
like if information were taken as its central and unifying concept. First off,
let's be clear that the Darwinian mechanism of selection and inheritance with
modification will continue to occupy a significant place in evolutionary theory.
Nevertheless, its complete and utter dominance in evolutionary theory, the
inflated view that selection and inheritance with modification together account
for the full diversity of life, will have to be relinquished. As a mechanism for
conserving, adapting, and honing already
existing biological structures, the Darwinian mechanism is ideally suited. But
as a mechanism for innovating irreducibly complex biological structures, it
utterly lacks the informational resources. As for symbiotic infusion, its role
within an information-theoretic framework must always remain quite limited, for
even though it can account for how organisms trade already existing biological
information, it can never get at the root question of how that biological
information came to exist in the first place.

Not surprisingly, therefore, the key task an information-theoretic approach to
evolutionary biology faces is to make sense of abiotically infused CSI.
Abiotically infused CSI is information exogenous to an organism, but which
nonetheless gets transmitted to and assimilated by the organism. Two obvious
questions now arise: (1) What is the mode of transmission of abiotically infused
CSI into the organism? and (2) Where is this information prior to being
transmitted? If this information is clearly represented in some empirically
accessible non-biological physical system, and if there is a clear informational
pathway from this system to the organism, and if this informational pathway can
be shown suitable for transmitting this information to the organism so that the
organism properly assimilates it, only then will these two questions receive an
empirically adequate naturalistic answer. But note that this naturalistic
answer, far from eliminating the information question, simply pushes it one step
further back, for how did the CSI that was abiotically infused into an organism
first get into a non-organism? Because of the Law of Conservation of
Information, whenever we inquire into the source of some information, we never
resolve the information problem, but only intensify it. This is not to say that
such inquiries are unilluminating (contra Dawkins, 1987, pp. 11-13; and Dennett,
1995, p. 153, who think that the only valid explanations in evolutionary biology
are reductive, explaining the more complex in terms of the simpler). We learn an
important fact about a pencil when we learn a certain pencil-making machine made
it. Nonetheless, the information in the pencil-making machine exceeds the
information in the pencil. The Law of Conservation of Information guarantees
that as we trace informational pathways backwards, we have more information to
explain than we started with.

Where then do the informational pathways of life terminate as we trace them
backwards? The possibilities are limited. One possibility is that we get
nowhere, unable even to begin tracing backwards the information in a biological
system. Thus we may discover an irreducibly complex biological system, but be
unable to trace it back to any abiotic source of exogenous information (this is
by far the most common case in biology; see Behe, ch. 8). Another possibility is
that we can trace the information in a biological system back to an abiotic
source of exogenous information, but then can't trace it back any further.
Graham Cairns-Smith (1985; 1986), for instance, has a clay-template theory for
the origin of life in which self-replicating clays form templates for
carbon-based life. The Cairns-Smith theory is clearly an abiotic infusion
theory, with exogenous information represented in (abiotic) clays providing
templates for carbon-based life. What the Cairns-Smith theory does not consider
is how the exogenous information that was transmitted to carbon-based life from
clay templates got into those clay templates in the first place. Needless to
say, the Cairns-Smith theory is highly speculative. Still another possibility is
that we can trace the information in a biological system all the way back to the
initial conditions of the big bang (cf. Corey, 1994). Though this approach
appeals to our naturalistic sensibilities, it remains scientifically sterile
until a definite informational pathway can be traced back to the big bang.
Finally, there is the creationist alternative which traces the information in a
biological system to the direct intervention of God. Though this approach
appeals to our theistic longings, it remains scientifically sterile until an
in-principle argument is offered showing that information inherent in a
biological system could not have been contained in any non-biological physical
precursor. And even then it's not clear what sort of God one infers.

In tracing back the informational pathways of life, evolutionary biology does
well to avoid speculation, and follow only those informational pathways that can
be rigorously traced. To take an analogy, I can rigorously trace the
informational pathway issuing in my copy of King Lear through the various extant
editions of the play spanning the last four centuries. On the other hand, I
cannot rigorously trace the informational pathway issuing in an isolated
first-century papyrus fragment. Any story behind this fragment is lost and cannot be
reconstructed. Alternatively, any relevant informational pathways are blocked
and cannot be rigorously traced. In a similar vein, evolutionary biology may
progress to the point where it can rigorously trace an informational pathway
back to an abiotic source of exogenous information. On the other hand, it may
remain stuck on a given irreducibly complex biological structure, and never be
able rigorously to trace it back to an abiotic source of exogenous information.

In fine, I propose to reconceptualize evolutionary biology in
information-theoretic terms. An evolutionary biology thoroughly cognizant of
information theory is one whose chief task is to trace informational pathways.
In tracing these informational pathways, evolutionary biology must place a
premium on rigor. Detailed informational pathways need to be explicitly
exhibited. Moreover, unlike the nebulous informational pathways sketched by
Stuart Kauffman and his associates at the Santa Fe Institute, informational
pathways need to conform to biological reality, and not to the virtual reality
residing in a computer (cf. Kauffman, 1995). Finally, empirical evidence, and not
metaphysical prejudice or aesthetic preference, must decide whether an
informational pathway exists at all. For instance, the Darwinian preference to
cash out taxonomy in terms of genealogy must not be taken as evidence for common
descent. To establish common descent requires showing that certain informational
pathways connect all organisms. Many of the low-level facts of current
evolutionary biology will stay put. What's more, information theory is
sufficiently flexible to accommodate the mechanisms of evolutionary change
proposed to date. Nonetheless, their adequacy will have to be evaluated in terms
of the information-theoretic constraints to which they are subject. Thus for
instance, the Darwinian mechanism can be formulated in information-theoretic
terms, but the claim that this mechanism can account for the full diversity of
life must be rejected given its inability to produce irreducibly complex
systems. Many old questions will remain. Many new questions will arise. But some
old questions will have to be discarded. In particular, all reductionist
attempts to explain information in terms of something other than information
will have to be discarded.

Intelligent Design

Up to this point I have developed a theoretical apparatus for understanding
information, I have critiqued the main naturalistic attempts to account for
biological information, and reconceptualized evolutionary biology in
information-theoretic terms. One question, however, remains unanswered, to wit,
Whence the origin of complex specified information in biology? Tracing
informational pathways back to abiotic sources of exogenous information is as
far back as the information trail goes within the framework so far developed.
But again, all we've really done is push the information problem back, shift its
focus, and exchange one information problem for another. To be sure, this need
not be a vain exercise. Plasmid exchange, though it represents no more than a
shifting around of pre-existing biological information, still gives us tremendous
insight into antibiotic resistance. Nonetheless, all such exercises get us no
closer to the origin of information.

In what remains of this paper I want to argue that intelligent causation, or
equivalently design, properly accounts for the origin of complex specified
information. My argument focuses on the nature of intelligent causation, and
specifically, on what it is about intelligent causes that makes them detectable.
To see why CSI is a reliable indicator of design, we need to probe the nature of
intelligent causation. The principal characteristic of intelligent causation is
choice. Whenever an intelligent cause acts, it chooses from a range of competing
possibilities. This is true not just of humans, but of animals as well as of
extra-terrestrial intelligences. A rat navigating a maze must choose whether to
go right or left at various points in the maze. When NASA's SETI researchers
attempt to discover intelligence in the extra-terrestrial radio transmissions
they are monitoring, they assume an extra-terrestrial intelligence could have
chosen any number of possible radio transmissions, and then attempt to match the
transmissions they observe with certain patterns as opposed to others. Whenever
a human being utters meaningful speech, a choice is made from a range of
possible sound-combinations that might have been uttered. Intelligent causation
always entails discrimination, choosing certain things, ruling out others.

Given this characterization of intelligent causes, the next question is how to
recognize their operation. Intelligent causes act by making a choice. How do we
know when an intelligent cause has so acted? A bottle of ink spills accidentally
onto a sheet of paper; someone takes a fountain pen and writes a message on a
sheet of paper. In both instances ink is applied to paper. In both instances one
among an almost infinite set of possibilities is realized. In both instances a
choice is made: one possibility is selected and the rest are ruled out. Yet in
one instance we infer design, in the other we don't. What is the relevant
difference? Not only do we need to observe that a choice has been made, but we
ourselves need also to be able to specify that choice. It's not enough that one
possibility has been chosen and others have been ruled out. We ourselves need to
be able to make the same choice. Wittgenstein (1980, p. 1e) illustrated this
point as follows: "We tend to take the speech of a Chinese for inarticulate
gurgling. Someone who understands Chinese will recognize language in what he
hears. Similarly I often cannot discern the humanity in man."

In hearing a Chinese utterance, someone who understands Chinese not only
recognizes that a choice was made from the range of all possible utterances, but
is also able to specify the utterance that was made as coherent Chinese speech.
Contrast this with someone who does not understand Chinese. In hearing a Chinese
utterance, someone who does not understand Chinese also recognizes that a choice
was made from the range of all possible utterances, but this time, lacking the
ability to understand Chinese, is unable to specify the utterance as
coherent speech. To someone who does not understand Chinese, the utterance is
gibberish. To be sure, uttering gibberish always constitutes a choice from the
range of all possible utterances. Nonetheless, gibberish corresponds to nothing
we can understand in any language, and so cannot be specified. As a result,
gibberish is never taken for intelligent communication, but always for what
Wittgenstein calls "inarticulate gurgling."

This choosing of one among several competing possibilities, ruling out the rest,
and specifying the one that was chosen encapsulates how we recognize intelligent
causes, or equivalently, how we detect design. Psychologists who study animal
learning and behavior have known this all along. For these psychologists (known
as learning theorists), learning is discrimination (cf. Mazur, 1990; Schwartz,
1984). To learn a task an animal must acquire the ability to choose behaviors
suitable for the task as well as the ability to rule out behaviors unsuitable
for the task. Moreover, for a psychologist to recognize that an animal has
learned a task, it is necessary not only to observe the animal making the
appropriate discrimination, but also to specify this discrimination.

Thus to recognize whether a rat has successfully learned how to traverse a maze,
a psychologist must first specify which sequence of right and left turns
conducts the rat out of the maze. No doubt, a rat randomly wandering a maze also
discriminates a sequence of right and left turns. But by randomly wandering the
maze, the rat gives no indication that it can discriminate the appropriate
sequence of right and left turns for exiting the maze. Consequently, the
psychologist studying the rat will have no reason to think the rat has learned
how to traverse the maze. Only if the rat executes the sequence of right and
left turns specified by the psychologist will the psychologist recognize that
the rat has learned how to traverse the maze. Now it is precisely the learned
behaviors we regard as intelligent in animals. Hence it is no surprise that the
same scheme for recognizing animal learning recurs for recognizing intelligent
causes generally, to wit: choosing one among several competing possibilities,
ruling out the others, and specifying the one chosen.

Now this general scheme for recognizing intelligent causes coincides precisely
with how we recognize complex specified information: First of all, the basic
precondition for information to exist must be established, to wit, contingency.
Thus one must establish that any one of a multiplicity of distinct possibilities
might actually obtain. Next, one must establish that the possibility chosen
after the others were ruled out was also specified. So far the match between
this general scheme for recognizing intelligent causation and how we recognize
complex specified information is exact. Only one loose end remains: complexity.
Although complexity is essential to CSI (corresponding to the first letter in
the acronym), its role in this general scheme for recognizing intelligent
causation is not immediately obvious. In this scheme a choice is made among
several competing possibilities, the rest are ruled out, and the possibility
chosen is specified. Where in this scheme does complexity figure in?

The answer is that it is there implicitly. To see this, consider again a rat
traversing a maze, but now take a very simple maze in which two right turns
conduct the rat out of the maze. How will a psychologist studying the rat
determine whether it has learned to exit the maze? Just putting the rat in the
maze will not be enough. Because the maze is so simple, the rat could by chance
just happen to take two right turns, and thereby exit the maze. The psychologist
will therefore be uncertain whether the rat actually learned to exit this maze,
or whether the rat just got lucky. But contrast this now with a complicated maze
in which a rat must take just the right sequence of left and right turns to exit
the maze. Suppose the rat must take one hundred appropriate right and left
turns, and that any mistake will prevent the rat from exiting the maze. A
psychologist who sees the rat take no erroneous turns and in short order exit
the maze will be convinced that the rat has indeed learned how to exit the maze,
and that this was not dumb luck. With the simple maze there is a substantial
probability that the rat will exit the maze by chance; with the complicated maze
this is exceedingly improbable. And improbability is precisely what we mean by
complexity.
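
The contrast between the two mazes can be put in numbers. The sketch below is only an illustration under assumptions that the text does not state: every junction offers exactly two options, a rat moving by chance picks each with equal probability, and the turns are independent. It also assumes that complexity in bits is the negative base-2 logarithm of the probability. The function names are hypothetical.

```python
import math


def chance_probability(num_turns: int, options_per_turn: int = 2) -> float:
    """Probability of producing one specified turn sequence by blind chance.

    Assumes independent junctions with equally likely options (an
    idealization; a real rat need not behave this way).
    """
    return (1.0 / options_per_turn) ** num_turns


def complexity_bits(probability: float) -> float:
    """Complexity as improbability, in bits: -log2(p)."""
    return -math.log2(probability)


simple_maze = chance_probability(2)     # two right turns
hard_maze = chance_probability(100)     # one hundred specified turns

print(simple_maze, complexity_bits(simple_maze))  # 0.25 and 2 bits
print(hard_maze, complexity_bits(hard_maze))      # about 7.9e-31 and 100 bits
```

On these assumptions the simple maze is exited by chance one time in four, whereas the hundred-turn maze is exited by chance with a probability of roughly one in 10^30, which is why the psychologist attributes only the latter success to learning.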

This argument for showing that CSI is a reliable indicator of design may now be
summarized as follows: CSI is a reliable indicator of design because its
recognition coincides with how we recognize intelligent causation generally. In
general, to recognize intelligent causation we must observe a choice among
competing possibilities, note which possibilities were not chosen, and then be
able to specify the possibility that was chosen. What's more, the competing
possibilities that were ruled out must be live possibilities, and sufficiently
numerous so that specifying the possibility that was chosen cannot be attributed
to chance. In terms of probability, this just means that the possibility that
was specified has small probability. In terms of complexity, this just means
that the possibility that was specified has high complexity. All the elements in
this general scheme for recognizing intelligent causation (i.e., choosing,
ruling out, and specifying) find their counterpart in complex specified
information (CSI). It follows that CSI pinpoints precisely what we need to be
looking for when we detect design.

As a postscript, let me call the reader's attention to the etymology of the word
"intelligent." The word "intelligent" derives from two Latin words, the
preposition inter, meaning "between," and the verb lego, meaning "to choose or
select." Thus according to its etymology, intelligence consists in choosing
between. It follows that the etymology of the word "intelligent" parallels the
formal analysis of intelligent causation just given. "Intelligent Design" is
therefore a thoroughly apt phrase, signifying that design is inferred precisely
because an intelligent cause has done what only an intelligent cause can do, to
wit, make a choice.

References

Amábile-Cuevas, Carlos F., Maura Cárdenas-García, and Mauricio Ludgar. 1995.
Antibiotic Resistance. American Scientist, 83: 320-329.

Barrow, John D. and Frank J. Tipler. 1986. The Anthropic Cosmological Principle.
Oxford: Oxford University Press.

Behe, Michael. 1996. Darwin's Black Box: The Biochemical Challenge to Evolution.
New York: The Free Press.

Bohm, David. 1993. The Undivided Universe: An Ontological Interpretation of
Quantum Theory. London: Routledge.

Borel, Emile. 1962. Probabilities and Life, translated by M. Baudin. New York:
Dover.

Cairns-Smith, Alexander G. 1985. Seven Clues to the Origin of Life. Cambridge:
Cambridge University Press.

Cairns-Smith, Alexander G. and H. Hartman, eds. 1986. Clay Minerals and the
Origin of Life. Cambridge: Cambridge University Press.

Chalmers, David J. 1996. The Conscious Mind: In Search of a Fundamental Theory.
New York: Oxford University Press.

Corey, Michael A. 1994. Back to Darwin: The Scientific Case for Deistic
Evolution. Lanham, Maryland: University Press of America.

Darwin, Charles. 1859. On the Origin of Species, facsimile first edition.
Cambridge, Mass.: Harvard University Press, 1964.

Dawkins, Richard. 1987. The Blind Watchmaker. New York: Norton.

Dembski, William A. 1996. The Design Inference: Eliminating Chance through Small
Probabilities. Doctoral Dissertation, University of Illinois at Chicago.

Dennett, Daniel C. 1995. Darwin's Dangerous Idea: Evolution and the Meanings of
Life. New York: Simon & Schuster.

Devlin, Keith J. 1991. Logic and Information. New York: Cambridge University
Press.

Eigen, Manfred. 1992. Steps Towards Life: A Perspective on Evolution, translated
by Paul Woolley. Oxford: Oxford University Press.

Joklik, Wolfgang K. and Hilda P. Willett, eds. 1976. Zinsser Microbiology, 16th
ed. New York: Appleton-Century-Crofts.

Kauffman, Stuart. 1995. At Home in the Universe. Oxford: Oxford University
Press.

Küppers, Bernd-Olaf. 1990. Information and the Origin of Life. Cambridge, Mass.:
MIT Press.

Landauer, Rolf. 1991. Information is Physical. Physics Today, May: 23-29.

Levy, Steven. 1992. Artificial Life: The Quest for a New Creation. New York:
Pantheon.

Margulis, Lynn. 1993. Symbiosis in Cell Evolution: Microbial Communities in the
Archean and Proterozoic Eons, 2nd ed. New York: Freeman.

Mazur, James E. 1990. Learning and Behavior, 2nd edition. Englewood Cliffs,
N.J.: Prentice Hall.

Miller, Kenneth R. and Joseph Levine. 1993. Biology. Englewood Cliffs, N.J.:
Prentice-Hall.

Monod, Jacques. 1972. Chance and Necessity. New York: Vintage.

Scherer, Siegfried. 1983. Basic Functional States in the Evolution of
Light-driven Cyclic Electron Transport. Journal of Theoretical Biology, 104:
289-299.

Schwartz, Barry. 1984. Psychology of Learning and Behavior, 2nd edition. New
York: Norton.

Stalnaker, Robert. 1984. Inquiry. Cambridge, Mass.: MIT Press.

Swinburne, Richard. 1979. The Existence of God. Oxford: Oxford University Press.

Waldrop, M. Mitchell. 1992. Complexity: The Emerging Science at the Edge of
Order and Chaos. New York: Simon & Schuster.

Wittgenstein, Ludwig. 1980. Culture and Value, edited by G. H. von Wright,
translated by P. Winch. Chicago: University of Chicago Press.

Wouters, Arno. 1995. Viability Explanation. Biology and Philosophy, 10: 435-457.

Yockey, Hubert P. 1992. Information Theory and Molecular Biology. Cambridge:
Cambridge University Press.







