Proceeding from a concept called the proto-metabolism hypothesis, geneticists at University College London (UCL) propose a new framework for the origin of the genetic code in protocells growing by CO2 fixation.
The genetic code consists of the 64 possible sequences of three nucleotides, known as codons, found in the RNA and DNA helices inside the cells of all living organisms.
Each triplet codon is recognized by a transfer RNA (tRNA) molecule that carries a specific amino acid. In the ribosome, the amino acids delivered in this way are joined in the order set by the codons, resulting in protein formation.
The basic working of the amino acid/protein formation process has been known for some time. However, how the genetic code, present in all living things, might have evolved from thermodynamically driven chemical processes has remained a mystery.
In an article to appear in the journal Biochimica et Biophysica Acta, Nov. 1, authors Stuart Harrison, Raquel Nunes Palmeira, Aaron Halpern and Nick Lane propose a hypothesis that explains how each of the three nucleotides of the triplet codons, still present in life, originated in the metabolic necessities of early proto-life forms.
Genetic code
Each codon triplet is a sequence of three nucleobases drawn from the four that make up RNA: adenine, cytosine, guanine and uracil. Expressed as the letters A, C, G and U, there are 64 possible three-letter sequences. The genetic code, however, specifies only the 20 standard amino acids used to build proteins, plus stop signals, so most amino acids are encoded by more than one codon.
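The count of 64 follows from choosing one of four bases at each of three positions (4 × 4 × 4). A minimal sketch of that enumeration is given here; the names `BASES` and `all_codons` are illustrative and do not come from the paper.

```python
from itertools import product

BASES = "UCAG"  # RNA bases: uracil, cytosine, adenine, guanine

def all_codons(bases=BASES):
    """Return every three-letter sequence over the given bases."""
    return ["".join(triplet) for triplet in product(bases, repeat=3)]

codons = all_codons()
print(len(codons))  # 4 * 4 * 4 = 64
```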
Proceeding from the hypothesis that the codons represent the preserved core of the early metabolic process of protocells, the authors discover a new interpretation of the significance of the three positions on the codon triplet.
First codon position and possible early environment
The authors show that the first position on the codon corresponds to the evolutionary distance from carbon dioxide fixation, the process by which autotrophic organisms, such as today’s photosynthetic plants, build their body structure using the available carbon dioxide and hydrogen in their environment.
The amino acids encoded by the nucleotide guanosine (G) turn out to be the closest to CO2 fixation. When the letter G appears in the first position of the codon, five of the 20 standard amino acids can be synthesized and incorporated into the organism’s proteins.
The nucleotide adenosine (A) in the first position can encode another six standard amino acids, using carbon dioxide fixation from the environment.
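The grouping of amino acids by first codon letter can be checked against the standard codon table. The sketch below assumes the standard genetic code in its usual compact form (bases in UCAG order, one-letter amino acid codes, `*` for stop). The helper name `amino_acids_by_first_base` is hypothetical, and this is a plain tabulation, not the authors' analysis.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
# One-letter amino acid codes; '*' marks a stop codon.
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO)
}

def amino_acids_by_first_base(base):
    """Amino acids (stops excluded) with at least one codon starting with `base`."""
    return sorted(
        {aa for codon, aa in CODON_TABLE.items() if codon[0] == base and aa != "*"}
    )

# G-first codons: alanine, aspartate, glutamate, glycine, valine
print(amino_acids_by_first_base("G"))  # ['A', 'D', 'E', 'G', 'V']
```

Note that some amino acids, such as serine and arginine, have codons beginning with more than one letter, so a simple tabulation like this can count them under several first bases.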
These two nucleotides, guanosine and adenosine, are known as purines for their molecular structure, which is built on fused hexagonal and pentagonal rings made up of carbon, hydrogen and nitrogen. The other two nucleotides in RNA, cytidine and uridine, are built on a single ring and are known as pyrimidines.
Examining the energetics of the chemical pathways required for early CO2 fixation, the authors suggest that “far-from-equilibrium environments such as alkaline hydrothermal vents” in the ocean could drive the spontaneous growth of protocells from available hydrogen and carbon dioxide.
Hydrophobicity and second codon position
In the authors’ scheme, the second letter in the codon corresponds to the hydrophobicity of the amino acid encoded.
Hydrophobicity is the tendency of a molecule to be repelled by water. Combining multiple scales of amino acid hydrophobicity, the authors find an association, though not as robust as that for the first codon position, with the presence of uridine (U) or cytidine (C) in the second position of the codon.
However, given the weaker association, the authors say, “Whether hydrophobicity itself or some related property such as partition energy is reflected in the genetic code is therefore unclear. But it is nonetheless sufficient to explain some codon assignments.”
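To illustrate the kind of association described in this section, the sketch below averages a single hydrophobicity scale over the amino acids sharing each second codon letter. It rests on assumptions: it uses the Kyte-Doolittle hydropathy scale as a stand-in, whereas the authors combined multiple scales, and the helper `mean_hydropathy_by_second_base` is hypothetical. It is not a reproduction of their analysis.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks a stop codon.
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO)
}

# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
# Assumption: one scale only; the paper combines several.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mean_hydropathy_by_second_base():
    """Mean hydropathy of the distinct amino acids for each second codon letter."""
    means = {}
    for base in BASES:
        acids = {
            aa for codon, aa in CODON_TABLE.items()
            if codon[1] == base and aa != "*"
        }
        means[base] = sum(KYTE_DOOLITTLE[aa] for aa in acids) / len(acids)
    return means

scores = mean_hydropathy_by_second_base()
for base, score in scores.items():
    print(base, round(score, 2))
```

On this one scale, amino acids with U in the second position score as hydrophobic on average, while those with A in the second position score as hydrophilic.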
Third codon position and amino acid length
Examination of the third codon position focused on the question of codon redundancy. Sometimes different combinations of nucleotides in the codon triplet can specify the same amino acid, such as both GAA and GAG for glutamic acid. These are known as redundant, or degenerate, codons.
More often than not, the difference between two redundant codons occurs in the second or the third codon position (as in the example of GAA and GAG).
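Redundancy can be read directly from the standard codon table by collecting all codons that map to the same amino acid and noting where they differ. The following is a minimal sketch under the assumption of the standard genetic code. The helpers `synonymous_codons` and `differing_positions` are hypothetical names introduced for illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks a stop codon.
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO)
}

def synonymous_codons(amino_acid):
    """All codons that specify the given one-letter amino acid code."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]

def differing_positions(codons):
    """Codon positions (1-based) at which the given codons are not all identical."""
    return {i + 1 for i in range(3) if len({codon[i] for codon in codons}) > 1}

glu = synonymous_codons("E")      # glutamic acid
print(glu)                        # ['GAA', 'GAG']
print(differing_positions(glu))   # {3}
```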
Looking at the non-redundant codons, the authors found the identity of the nucleotide at the third position corresponds to the length of the amino acid chain it encodes.
Summing up their work, the authors note, “The assumption of a spontaneous protometabolism in growing protocells therefore makes sense of the code within the codons, and simultaneously offers a framework that enables the transition from deterministic chemistry to genetic information at the origin of life.”
Stuart Harrison, Raquel Nunes Palmeira, Aaron Halpern, Nick Lane, "A biophysical basis for the emergence of the genetic code in protocells," Biochimica et Biophysica Acta (BBA) - Bioenergetics, Volume 1863, Issue 8, 2022.