Scientists have been working to identify all human genes since the initial genome draft in 2001. While progress has been made in identifying protein-coding genes (now fewer than 20,000), there has been a surge in reported non-coding RNA genes.
According to a report, researchers from several universities recently provided an update on the gene cataloging effort that began with the Human Genome Project. The project was initiated with two objectives: analyzing the structure of human DNA and determining the location of all human genes. Both goals were challenging, and the second is complicated by the difficulty of identifying genes and their regulatory mechanisms. Together, the themes of the study address the complex and evolving nature of human gene annotation and highlight the need for improved methods, standards, and technologies to achieve a more comprehensive understanding of the human genome.
According to the report, the view of genes has shifted from one in which each gene primarily encodes a single protein-coding transcript to one that acknowledges the influence of alternative transcripts, non-coding RNA elements, and post-transcriptional processing in human biology.
The research highlights challenges in completing human gene annotation, particularly in identifying protein-coding genes, splice variants, pseudogenes, and non-coding RNA genes. These challenges are attributed to difficulties in accurately characterizing complex isoforms and a lack of functional understanding of many non-coding RNAs.
Proper gene annotation is crucial for diagnosing and treating genetic diseases. Flaws in gene annotations can lead to errors in clinical diagnoses, emphasizing the importance of a universal annotation standard.
The transition from GRCh37 to GRCh38 and the emergence of additional reference genomes for diverse human populations have introduced challenges in achieving consistent gene annotation across multiple reference genomes.
The report mentions technologies such as long-read sequencing, proteomics, and capture sequencing that hold promise for addressing the challenges in gene annotation and for identifying lowly expressed transcripts.
The understanding of human genes continues to evolve, with the number of protein-coding genes stabilizing around 19,500. The catalog of non-coding RNA genes is still expanding, and future studies aim to refine this catalog.
The human gene catalog is not one-size-fits-all: genetic diversity among individuals means that some people may have more or fewer copies of certain genes. Surveying the diversity of the human population is therefore crucial for a comprehensive view of gene content.
Amaral, Paulo, et al., The status of the human gene catalogue