Belova59/Pixabay

Princeton biologist: 'De novo' genes a proven possibility


Marjorie Hecht
Jun 13, 2022

How is it possible for a random sequence to produce a functional gene?

This is a question that Caroline Weisman, a genomic biologist and Lewis-Sigler scholar at Princeton University, explores in a review article published April 22 in the Journal of Molecular Evolution.

A "de novo" gene is one that arises from previously non-genic sequence rather than from an existing gene. As a result, it is found only in a particular clade or species and has no obvious homolog outside it.

Weisman summarizes what's known about the molecular functions and the origins of several known de novo genes. She then speculates "on what these examples may tell us about how de novo genes manage to emerge despite what seems like enormous opposing odds."

For many years the birth of a de novo gene was considered impossible. The Nobel Prize-winning French biologist François Jacob wrote in an often-quoted 1977 article in Science that "the probability that a functional protein would appear de novo by random association of amino acids is practically zero."

Since we now know that de novo genes exist, Weisman asks, why are Jacob's arguments wrong?

She gives three possibilities.

First, "An appreciable fraction of all possible sequences would produce a biological effect beneficial to the organism," she said. "Second, de novo genes emerge from sequences that are, compared to a truly random sample, somehow enriched for beneficial biological effects. And third, the number of sequences tested by evolution is sufficiently high that it successfully samples the very small fraction of sequence space that has beneficial biological effects."
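The three possibilities can be read as adjusting two quantities: the fraction of candidate sequences that have a beneficial effect, and the number of sequences evolution gets to try. The Python sketch below is an illustration only, not a model taken from Weisman's review. It assumes every trial is independent, and the function name and all numbers are hypothetical placeholders chosen to show the shape of the argument.

```python
import math


def prob_at_least_one_hit(fraction_beneficial: float, trials: float) -> float:
    """Chance that at least one of `trials` independent random sequences is beneficial.

    Computes 1 - (1 - f)^N, using log1p/expm1 so that very small fractions
    and very large trial counts do not lose precision.
    """
    if fraction_beneficial <= 0 or trials <= 0:
        return 0.0
    if fraction_beneficial >= 1:
        return 1.0
    return -math.expm1(trials * math.log1p(-fraction_beneficial))


if __name__ == "__main__":
    # Hypothetical values, not estimates from the literature.
    rare = 1e-12          # Jacob-style assumption: function is extremely sparse
    enriched = 1e-6       # possibilities 1 and 2: function is more common
    few_trials = 1e6
    many_trials = 1e13    # possibility 3: evolution samples a vast number of sequences

    print(prob_at_least_one_hit(rare, few_trials))
    print(prob_at_least_one_hit(enriched, few_trials))
    print(prob_at_least_one_hit(rare, many_trials))
```

Raising the fraction corresponds to the first two possibilities (for all sequences, or only for the sequences de novo genes actually arise from). Raising the number of trials corresponds to the third.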

A review of de novo genes 

To answer her question, Weisman reviews de novo genes that are "strongly supported" and whose biological effects are "directly characterized." In other words, she considers de novo genes for which there is evidence of a biological effect when the gene is knocked out or knocked down.

Weisman presents case studies of several known protein-coding de novo genes. These include the northern gadid AFGP, an antifreeze glycoprotein essential for arctic codfish (gadids) to survive their cold environment. 

Other de novo genes include Saccharomyces cerevisiae MDF1, Saccharomyces cerevisiae BSC4, Homo sapiens PBOV1, Homo sapiens NCYM and Homo sapiens MYEOV.

Weisman gives evidence of the biological roles of these genes, many of which involve promoting cell proliferation in cancer. MYEOV, for example, is linked to colorectal cancer and pancreatic cancer.

She also includes case studies of two de novo RNA genes, Homo sapiens ELFN1-AS1 and Mus musculus Poldi.

Beating the odds

In the second part of her review, Weisman speculates how these de novo genes "avoid Jacob's conundrum of improbability." She notes that this is speculative because her sample group is too small to ensure that the similarities she finds are not due to chance.

Weisman argues there are many "trials" for de novo gene birth, and that the evidence points to the existence of a "vast number of sequences tested during evolution."

She states that "basic structural properties are easy to come by," such as alpha helices, beta sheets or globularity, and that many of the de novo genes she describes share them. Some biological effects, she notes, require only small regions of Watson-Crick complementarity (the base pairing between antiparallel strands that forms a double helix and underlies DNA replication) and are therefore common in sequence space. (Sequence space represents all possible sequences for a gene, protein or genome.)
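To make "sequence space" and "small regions of complementarity" concrete, here is a minimal Python sketch. It is not from Weisman's review: the function names are hypothetical, the sequences are made-up toy strings, and it considers only perfect A-T and G-C pairing over a fixed window of length k.

```python
from typing import List

# Watson-Crick partners for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def sequence_space_size(length: int, alphabet_size: int) -> int:
    """Number of distinct sequences of a given length (4 letters for DNA, 20 for protein)."""
    return alphabet_size ** length


def reverse_complement(seq: str) -> str:
    """Sequence of the antiparallel strand that pairs perfectly with `seq`."""
    return seq.translate(COMPLEMENT)[::-1]


def complementary_sites(candidate: str, target: str, k: int) -> List[int]:
    """Start positions in `candidate` of k-base windows that can pair with `target`.

    A window can pair with the target if its reverse complement occurs
    somewhere in the target.
    """
    target_kmers = {target[i:i + k] for i in range(len(target) - k + 1)}
    return [
        i
        for i in range(len(candidate) - k + 1)
        if reverse_complement(candidate[i:i + k]) in target_kmers
    ]


if __name__ == "__main__":
    print(sequence_space_size(300, 4))    # all DNA sequences 300 bases long
    print(sequence_space_size(100, 20))   # all proteins 100 residues long
    print(complementary_sites("TTAAACCCGG", "GGGTTT", 6))
```

The full space grows exponentially with length, but a requirement that involves only a short window is met by a large share of longer sequences, which is the sense in which such effects are "common in sequence space."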

Weisman describes how four of the genes she reviews overlap with more conserved genes, and this allows them to overcome hurdles to expression by co-opting the features that drive expression of the conserved gene. (A conserved gene is one that has remained essentially unchanged throughout evolution.)

Weisman also notes that the noncoding function of some DNA "lowers the barrier" for de novo genes to achieve coding expression and function.

She further speculates that a couple of the genes she reviews, MYEOV and NCYM, have similar functions to RNA and might be able to share features. She also suggests that some interactions might involve "fuzzy binding," which is easier to achieve than the usual concept of binding.

Another mechanism that could ease the birth of de novo genes is what Weisman calls a "freeloader function": a function that modulates an existing function simply by binding. She proposes that "freeloading makes function vastly more common in sequence space than we have imagined," which would contradict Jacob's premise that functional sequences are vanishingly rare.

The future

Weisman concludes by stressing that she presented only information that she considered to be of "high confidence." She expects that new insights are forthcoming from ongoing work. "...[E]xperimental characterization of de novo genes lags behind other approaches," she notes, and "more focus here strikes me as essential."

She writes, "If our examples are representative, enriched as they are for freeloader functions, reprisals of existing roles or reactivation of existing pathways, one might conclude, ironically, that, at least at the molecular level, de novo genes represent the opposite of novelty. We might now pause to re-evaluate the widespread assumption that new genes are the best candidates for what lies beneath new features and should take care not to bias our efforts by assuming that this must be so." 

---

C. Weisman, The Origins and Functions of De Novo Genes: Against All Odds?, Journal of Molecular Evolution, April 22, 2022.

DOI: https://doi.org/10.1007/s00239-022-10055-3

