
New Proteins


New proteins – The driving force of evolution?
One of the foundational postulates of modern neo-Darwinian theory is that the basic raw materials of evolution are new proteins, which are generated more or less at random by mutations induced by DNA miscopying, recombination or radiation. Genes which confer an advantageous phenotype are subsequently selected, and hence new genes, encoding new proteins, are added incrementally. It is important to understand that a DNA sequence cannot, with foresight, direct the changes that would be necessary for a gene to acquire survival value. Natural selection may only determine any utility or improvement in retrospect -- in other words, natural selection can determine the survival of the fittest, but not the arrival of the fittest.

What is the likelihood of obtaining a useful macromolecule, assuming that we have the mechanisms in place for generating them efficiently through mutations of existing genetic material? Can a quantitative estimate be placed on this probability? Mutations, of course, act on nucleotides in nucleic acids, which are subsequently transcribed and translated into proteins. But let us consider instead the random ordering of amino acids to form proteins. This is not only easier to formulate, but also errs on the side of overestimating the probability of obtaining a useful polypeptide by chance alone.


New Proteins – The final death blow for evolution
What is the probability that a specific new protein of modest length could be produced by random mutation? Consider the cytochrome c molecule. Cytochrome c is a short protein of around 104 amino acids. It is present in most cells and is involved in the production of ATP, the cell's energy-carrying molecule. Because it is found in both prokaryotes and eukaryotes, it is often thought to have evolved very early on, before the existence of the last universal common ancestor (LUCA).

But how likely is it that a protein of this size could have arisen by random ordering of amino acids, as it would have needed to before natural selection could have kicked in? This can be calculated by estimating the number of potential options there are for a protein of this size, and then comparing this with how many sequences might have been tried out in the course of evolution.

As mentioned, the number of amino acids in the cytochrome c protein is about 104. There are 20 different types of amino acids. This means that the number of possible protein molecules is $20^{104}$, or about $2 \times 10^{135}$ (2 followed by 135 zeros!). To put this into perspective, consider that the number of atoms in the universe is considered to be $10^{80}$. This implies that the number of possibilities for even a short protein just 100 amino acids long exceeds the number of atoms in the universe by a factor of $10^{50}$.
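The arithmetic in this paragraph can be checked with a short Python sketch. It works in base-10 logarithms so the numbers stay manageable. The figures (20 amino acid types, chain lengths of 104 and 100, and 10^80 atoms) are the ones quoted in the text; the function and variable names are illustrative only.

```python
import math

# Figures taken from the text; the names are illustrative.
AMINO_ACID_TYPES = 20
CYTOCHROME_C_LENGTH = 104
LOG10_ATOMS_IN_UNIVERSE = 80  # the text's figure of 10^80 atoms


def log10_sequences(length: int, alphabet: int = AMINO_ACID_TYPES) -> float:
    """Return log10 of the number of distinct chains of the given length."""
    return length * math.log10(alphabet)


def sci_notation(log10_value: float) -> str:
    """Format 10**log10_value as 'm x 10^e' without building a huge float."""
    exponent = math.floor(log10_value)
    mantissa = 10 ** (log10_value - exponent)
    return f"{mantissa:.1f} x 10^{exponent}"


log10_full = log10_sequences(CYTOCHROME_C_LENGTH)  # chains of 104 residues
log10_short = log10_sequences(100)                 # chains of 100 residues

print("Sequences of length 104:", sci_notation(log10_full))
print("Sequences of length 100:", sci_notation(log10_short))
print("Excess over atoms (length 100):",
      sci_notation(log10_short - LOG10_ATOMS_IN_UNIVERSE))
```

The 104-residue count comes out on the order of 10^135, and the 100-residue count exceeds the quoted number of atoms by roughly 50 orders of magnitude, in line with the figures above.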

Let us now place an extremely generous estimate on how many alternatives might have been tried out. Consider that the entire universe was made up solely of amino acids that we could use to make proteins. A protein a mere 100 amino acids long would contain at least 1000 atoms. Thus, at any one time we could generate something of the order of $10^{80}/10^{3} = 10^{77}$ such proteins.

Now let us consider that the time it would take to produce a protein 100 amino acids long would be about half a minute (peptide bonds are formed at a rate of three to five per second). It is not necessary, however, to synthesise each new polypeptide from scratch. A new amino acid sequence can be generated simply by changing a bond or two (e.g. by substituting one amino acid for a different one). It may be reasonable to suggest, then, that we could try a whole new batch of $10^{77}$ proteins every second. This, of course, is a deliberately generous estimate, and ignores the more realistic delay in transcribing and translating a gene.

How long have we got? Even taking the age of the universe as, say, 15 billion years, we have a total of about $10^{18}$ seconds to play with (15 billion years is roughly $5 \times 10^{17}$ seconds, rounded up here to the next power of ten). What is the total number of proteins that could be generated in this time? If we multiply $10^{77}$ by $10^{18}$, we can estimate the upper limit to be $10^{95}$.
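The upper bound on the number of trials can be reproduced in a few lines of Python. This is a minimal sketch using the text's own assumptions (10^80 atoms, about 1000 atoms per protein, one new batch per second, a 15-billion-year-old universe). The variable names and the 365.25-day year are assumptions made for illustration.

```python
import math

# Assumptions quoted in the text, expressed as base-10 exponents.
LOG10_ATOMS_IN_UNIVERSE = 80   # 10^80 atoms
LOG10_ATOMS_PER_PROTEIN = 3    # at least 1000 atoms per 100-residue protein
UNIVERSE_AGE_YEARS = 15e9      # the text's round figure

# Assumption for illustration: a Julian year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Proteins that could exist at any one time: 10^80 / 10^3.
log10_batch = LOG10_ATOMS_IN_UNIVERSE - LOG10_ATOMS_PER_PROTEIN

# Seconds available, rounded up to the next power of ten as in the text.
age_seconds = UNIVERSE_AGE_YEARS * SECONDS_PER_YEAR
log10_age = math.log10(age_seconds)
log10_age_rounded = math.ceil(log10_age)

# One new batch per second for the whole age of the universe.
log10_total = log10_batch + log10_age_rounded

print(f"Batch size:       10^{log10_batch}")
print(f"Seconds elapsed:  {age_seconds:.2e} (rounded up to 10^{log10_age_rounded})")
print(f"Total trials:     10^{log10_total}")
```

Rounding the elapsed time up to a full power of ten slightly inflates the count, which keeps the estimate on the generous side.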


New Proteins – Hopelessly improbable
What are the implications of this calculation concerning the sheer improbability of the production of new proteins? Even if we could employ the total material resources of the universe, and cycle them not only continually since the beginning of the universe, but also as fast as is rationally conceivable, the proportion of the possible combinations that might be produced would have been only $10^{95}/10^{135} = 1/10^{40}$, or $10^{-40}$. Thus, if we were attempting to produce a specific 104 amino acid-long protein at random, our chance of success would be of the order of 1 in $10^{40}$, even generously assuming all of the above.
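The final ratio can be checked the same way. The sketch below takes the 10^95 trial count and the 20-letter, 104-residue sequence space from the text and computes the fraction in log space. The names are illustrative, and the calculation adds no assumptions beyond those already stated.

```python
import math

# Inputs quoted in the text.
LOG10_TRIALS = 95          # upper bound on sequences tried, 10^95
AMINO_ACID_TYPES = 20
CYTOCHROME_C_LENGTH = 104

# log10 of the number of possible 104-residue sequences (20^104).
log10_space = CYTOCHROME_C_LENGTH * math.log10(AMINO_ACID_TYPES)

# Fraction of sequence space sampled = trials / possibilities.
log10_fraction = LOG10_TRIALS - log10_space

exponent = math.floor(log10_fraction)
mantissa = 10 ** (log10_fraction - exponent)
print(f"Fraction of sequence space sampled: {mantissa:.1f} x 10^{exponent}")
```

Using the exact value of 20^104 rather than the rounded 10^135 gives a fraction of about 5 x 10^-41, which matches the "order of 1 in 10^40" figure quoted above.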

What can we conclude from all of this? We cannot depend on random mutations to produce specific proteins. The probabilistic resources are, frankly, not at our disposal. It is no longer tenable to hide behind billions of years of geological history, not even given the argument that life could have evolved on an astronomical number of other planets. With the probability of producing even one specific 104-amino-acid-long novel protein so slim, let alone the myriad accompanying proteins which are engaged in highly specific interrelatedness, evolution is dead in the water.


Copyright © 2002-2021 AllAboutScience.org, All Rights Reserved