The Genetic Code: How We Know That Life On Earth Shares A Common Origin
The fault with bacteria from Mars is not in our stars, but in ourselves.
Do aliens live among us? This question may sound fanciful — more befitting of a conspiracy theorist than a scientist. Yet in recent decades we have discovered forms of microscopic life so bizarre that the notion that they share a common origin with us may feel just as farfetched.
Pyrolobus fumarii, for example, seems more suited to hell than to Earth. It is so adapted to extreme heat that it would freeze in a hot cup of coffee, thriving instead in water heated to over 100 degrees Celsius by hydrothermal vents which belch out sulfurous black smoke in the depths of the ocean. It is not alone in its peculiarity. Picrophilus oshimae was discovered in highly acidic volcanic vents and soil. It can live in an environment as acidic as battery acid, but could not grow in a pool of acid rain because it would not be acidic enough! Indeed, acidophiles like this have even made habitats out of the acid mine drainage from disused coal and metal mines.
Perhaps most remarkable of all is Deinococcus radiodurans. It can withstand a dose of nuclear radiation a thousand times greater than that which could kill a human, allowing it to prosper in nuclear waste. In fact, it is so hardy that it can even survive exposure for years in outer space. This suggests an ability to survive an interplanetary journey that has inspired some to speculate that it really could be an alien. Perhaps, the argument goes, the ancestor of Deinococcus radiodurans really evolved on Mars before being transported to Earth on a meteorite. That would certainly explain its strangeness, and in particular why it seems adapted to high-radiation environments that do not occur naturally on Earth.
![](https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47ace296-2ce3-44e8-a056-86565b74fc0d_532x612.jpeg)
Yet the vast majority of scientists would confidently dismiss this idea. Indeed, we are confident that Deinococcus radiodurans, Pyrolobus fumarii and Picrophilus oshimae, along with all other known lifeforms, share a common ancestral origin with us. Far from being aliens, they are our long-lost cousins. To understand why, we need look not out to space, but deep within ourselves to uncover the little-known languages of life.
The Languages of Life
Like all complex machinery, life needs an instruction manual. This manual is the genome: the collection of all of an organism’s genes. You were conceived as a single cell with your complete genome, but little else. Your genes detailed the incredibly intricate series of steps necessary to transform you into a human baby over a period of nine months. Since then, your genes have provided instructions for how to maintain your body and how your body should respond to every situation that has arisen in your life. Without these instructions, life would be impossible.
Specifically, genes contain instructions for producing proteins. Proteins are incredibly diverse and complex molecules that perform every bodily function you can imagine, from digesting food and fighting viruses to seeing, moving and learning. Each different gene instructs the body to produce a different protein. Your genome is the collection of instructions for all of the different proteins necessary to make and operate your body. Life can therefore be understood to consist of two fundamental components: genes store the information necessary to create proteins, and these proteins perform the processes necessary for life.
Like any information, the instructions of your genes are written in a particular language. Genes are physically manifested in DNA (or its close relative RNA), which has at its centre a sequence of chemical bases, each chosen from four options: adenine (A), thymine (T), cytosine (C) or guanine (G). The language of DNA is therefore written in an alphabet of four letters (A, T, C and G), and genes consist of long sequences of these letters. Meanwhile, proteins consist of sequences of a different type of chemical: amino acids. There are twenty amino acids used in proteins. Proteins can therefore be understood to be written in a different language from genes, one with an alphabet of twenty letters corresponding to the twenty amino acids.
![](https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55903a46-ff15-432c-a548-d1364d52ab34_640x675.png)
Life therefore uses two languages — the four-letter language of DNA and the twenty-letter language of proteins. In order to make use of the instructions in genes, the information must be translated from the language of DNA to the language of proteins. This is done using the genetic code.
The Genetic Code
The genetic code is the DNA-to-protein language dictionary used by all life. Since there are more amino acids than DNA bases, it is not possible to have a simple translation from one DNA base to one amino acid. Instead, an amino acid is associated with each possible sequence of three DNA bases (called a codon). For example, the codon ACG in DNA translates to the amino acid threonine, while the codon AAC translates to the amino acid asparagine.1 In this way, a gene can be translated into a protein by translating one codon at a time into its corresponding amino acid.
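As a rough illustration of codon-by-codon translation, here is a minimal Python sketch. The table is deliberately partial: it holds only a handful of codons from the standard genetic code (the ones named in this article, plus the usual start codon ATG and the three stop codons), and the names `CODON_TABLE`, `translate` and `shortest_codon_length` are made up for this example. The last function checks the counting argument: with four bases, one or two letters cannot cover twenty amino acids plus a stop signal, but three can.

```python
# Partial codon table (assumption: a tiny subset of the 64-entry standard code,
# chosen only to illustrate the idea).
CODON_TABLE = {
    "ATG": "Met",   # methionine, the usual start codon
    "ACG": "Thr",   # threonine
    "AAC": "Asn",   # asparagine
    "CGC": "Arg",   # arginine
    "CGA": "Arg",   # a second codon for the same amino acid
    "TAA": "Stop",
    "TAG": "Stop",
    "TGA": "Stop",
}


def translate(dna):
    """Translate a DNA string into a list of amino acids, one codon at a time."""
    protein = []
    # Step through complete codons only; trailing bases are ignored.
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE[codon]  # KeyError if the codon is not in this partial table
        if amino_acid == "Stop":
            break  # a stop codon marks the end of the gene
        protein.append(amino_acid)
    return protein


def shortest_codon_length(n_bases=4, n_meanings=21):
    """Smallest codon length whose possible sequences cover every meaning."""
    length = 1
    while n_bases ** length < n_meanings:
        length += 1
    return length


print(translate("ATGACGAACCGCTAA"))  # ['Met', 'Thr', 'Asn', 'Arg']
print(shortest_codon_length())       # 3
```

A dictionary is a natural fit here because the genetic code really is a lookup table: each codon has exactly one meaning, even though several codons may share that meaning.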
Like human languages, the genetic code is largely arbitrary. Nothing about the sounds or letters of the word dog naturally ties it to a canine rather than a feline; it just so happens that this is what dog means in the English language. Because of this, there is an enormous number of alternative languages that would be just as sensible to use as English. This means that it would be an incredible coincidence for the English language to independently evolve twice.
Similarly, there is nothing about the chemistry of the codon ACG that dictates that it should translate to threonine instead of asparagine (or any other amino acid). It just so happens that this is how it is in the genetic code used by life on Earth. There are over a billion trillion trillion trillion trillion trillion trillion trillion different choices of how to translate DNA codons to amino acids, of which an immense number would function just as well as the actual genetic code.2 So, just as with the English language, it would be an utterly implausible coincidence for the same genetic code to have independently evolved twice.
Yet, all studied life, including Deinococcus radiodurans, has been found to use essentially the same genetic code.3 If Deinococcus radiodurans truly is a Martian that evolved independently of us, there is no plausible explanation for why this should be. It would be like encountering a Martian who spoke English. The only sensible conclusion is that the genetic code was inherited, by both Deinococcus radiodurans and us, from a long-lost common ancestor that all life on Earth shares.
Extraordinary Evidence
By studying the genetic code, we can infer facts about the distant past and the nature of life that would otherwise be utterly unknowable. Specifically, billions of years ago, somewhere in the primordial oceans, lived a single-celled organism we call LUCA (for Last Universal Common Ancestor). LUCA had genes encoded in DNA, which were translated into proteins using the same genetic code that is being used in each cell of your body right now. At some point, LUCA split into two daughter cells, as cells naturally do. One of these daughter cells would go on to be our billion (or more) times great-grandmother, while the other would be the ancestor of bacteria like Deinococcus radiodurans. Out of that divide has grown the one ultimate family tree, on which every organism ever to live on Earth can be found — the one true tree of life. If it is a disappointment to learn that aliens do not live among us, we can surely find consolation in contemplating the majesty of this reality. In the words of Charles Darwin, “there is a grandeur in this view of life.”
![](https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2550c6b-d267-4b9a-9751-b017f8a775a6_1296x852.png)
That we can learn so much about the distant past and deepest questions of biology from the genetic code is extraordinary and may seem counterintuitive. We intuitively prefer direct evidence, such as fossils or direct observations, over evidence that requires interpretation and inference. But this is a mistake. Deep discoveries in science have come from the ability of scientists to deduce extraordinary conclusions from subtle evidence. Comparing the weight of oxide powders allowed John Dalton to deduce the existence of atoms long before they had ever been seen. Observations that electricity could move compass needles and that magnets could generate electricity allowed James Clerk Maxwell to solve the ancient mystery of the nature of light and infer the existence of radio waves that had not yet been detected. Slight differences in the colour of light from distant stars allowed Georges Lemaître to infer that the universe had a beginning that we can never directly observe. That we can conclude that all life shares a common ancestor from the indirect evidence of the genetic code is just another example of the remarkable resourcefulness of science.
Ultimately, the power of evidence comes not from its directness, but from its ability to discriminate between different theories. If one theory requires a wildly improbable coincidence to explain an observation, while an alternative theory offers a straightforward explanation, then that observation is very compelling evidence for the latter theory. For this reason, we do not need to look to outer space to refute the idea that aliens live among us; we can find the evidence we need inside our own cells. As Shakespeare no doubt would have said: The fault with bacteria from Mars is not in our stars, but in ourselves.
In addition to the twenty amino acids, codons can also translate to a stop signal which is used to mark the end of a gene. Multiple codons also translate to the same amino acid (or stop); for example, CGC and CGA both translate to the amino acid arginine. However, since each codon only translates to one amino acid, genes can still be uniquely translated into proteins.
The number of possibilities is 64!/43! (allocating one distinct codon to each of the twenty amino acids and to stop) times 21^43 (for the remaining 43 codons, each of which can be translated to any of the 21 options: an amino acid or stop). This comes to about 1.5 × 10^93.
The genetic code is not entirely arbitrary. It is structured so that similar codons often translate to the same amino acid, which reduces the likelihood of mutations in DNA changing the translated protein. It also tends to associate more codons with simpler amino acids that are more commonly used, again to reduce the likelihood of damage from mutations. One study suggested that the code is one in a million, in the sense that only one in a million of the alternative translations are as advantageous as the genetic code. However, this still leaves about 1.5 × 10^87 possibilities that work just as well as the actual genetic code, far too many for the same code to have arisen twice by coincidence.
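The arithmetic in the two notes above can be checked with exact integers. The Python sketch below simply reproduces the counting formula as stated (64!/43! times 21^43) and then takes one millionth of it; it makes no attempt to refine the formula itself, and all variable names are invented for this example.

```python
from math import factorial

N_CODONS = 64    # 4 bases ** 3 positions
N_MEANINGS = 21  # twenty amino acids plus stop

# Give each of the 21 meanings its own codon: 64 * 63 * ... * 44 = 64!/43!
assigned = factorial(N_CODONS) // factorial(N_CODONS - N_MEANINGS)

# Each of the remaining 43 codons may take any of the 21 meanings.
free = N_MEANINGS ** (N_CODONS - N_MEANINGS)

total_codes = assigned * free

# "One in a million" of those codes being as good as the real one.
as_good = total_codes // 10**6

# A billion trillion trillion ... (seven trillions) = 10^9 * (10^12)^7 = 10^93
billion_times_seven_trillions = 10**9 * (10**12) ** 7

print(f"possible codes:       {total_codes:.1e}")
print(f"as good as the real:  {as_good:.1e}")
print("more than a billion trillion^7:", total_codes > billion_times_seven_trillions)
```

Python integers have arbitrary precision, so the factorials and powers are computed exactly and only rounded when printed.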
The genetic code is not absolutely the same in all life. However, the variations that exist each have only minor changes from the one standard code. These can be thought of as analogous to dialectal variations of a language. Australian and British English are not identical, yet they are so similar that it is implausible they evolved independently by chance. In the same way, all variations on the genetic code are so minor that it is implausible that different organisms evolved their codes independently.