Researchers from the University of Virginia (UVA) have developed a mathematical tool that can filter out noise and bias from large sets of data about the building blocks of our chromosomes, which could advance genomics and disease research.
According to a release by UVA, the team, led by computational biologist Dr. Chongzhi Zang, created the tool to help scientists distinguish signal from noise when investigating the genetic causes of diseases such as cancer.
The tool is expected to accelerate the development of new treatments and improve cancer diagnosis by making it easier for doctors to identify cancerous cells. It is designed for data from a cutting-edge technology called "single-cell ATAC-seq," which generates vast amounts of data that can contain substantial noise and bias, making analysis confusing and time-consuming for scientists.
“Using the traditional way of analyzing the data, you might see some patterns that look like real signals of a particular chromatin state, but they are actually fake due to the bias of the experimental technology itself. Such fake signals can confuse scientists,” said Zang, a computational biologist with UVA’s Center for Public Health Genomics and UVA Health Cancer Center.
“We developed a model to better capture and filter out such fake signals so that the real needle we are looking for can more easily stand out of the hay,” he added.
Zang's tool uses a model from number theory and cryptology, called "simplex encoding," which codes DNA sequences into simpler mathematical forms. That allows researchers to compare different forms to detect noise and bias that cannot be identified using traditional methods.
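To give a rough sense of what coding DNA into a simpler mathematical form can look like, here is a minimal Python sketch. It is not the UVA team's implementation. It assumes one common convention for a simplex code, in which each of the four nucleotides is placed at a corner of a regular tetrahedron centered at the origin; the specific coordinates and the function names `simplex_encode` and `squared_distance` are illustrative assumptions, not details from the release.

```python
# Illustrative assumption: each nucleotide maps to one vertex of a regular
# tetrahedron centered at the origin. This assignment is hypothetical and is
# not taken from the UVA release.
VERTICES = {
    "A": (1, -1, -1),
    "C": (-1, 1, -1),
    "G": (-1, -1, 1),
    "T": (1, 1, 1),
}


def simplex_encode(sequence):
    """Turn a DNA string into a flat list of numbers, three per nucleotide."""
    encoded = []
    for base in sequence.upper():
        if base not in VERTICES:
            raise ValueError(f"unsupported nucleotide: {base!r}")
        encoded.extend(VERTICES[base])
    return encoded


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))
```

Under this assumed layout, every pair of nucleotides sits at the same distance from every other pair, so no base is treated as numerically "closer" to another. For example, `simplex_encode("AC")` yields six numbers, the coordinates for A followed by those for C, which a statistical model could then use as inputs when estimating sequence bias.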
Dr. Shengen Shawn Hu, a research scientist in Zang's lab and the lead author of the work, said the simplex encoding model provides an accurate estimate of sequence biases because of its mathematical properties. The researchers tested the tool and found that it significantly improved the analysis of complex single-cell data used to characterize different cell types.
Zang added that the tool could benefit the biomedical research community in studying chromatin biology and genomics and could eventually aid disease research. He hopes other researchers will use the tool to make important scientific discoveries of their own, the release noted.
The findings were published in the scientific journal Nature Communications, and the team included Hu, Liu, Li, Ma, Guertin, Meyer, Deng, Zhang and Zang.
The work was supported by the National Institutes of Health (NIH), the National Science Foundation, the University of Pittsburgh Center for Research Computing, UVA Cancer Center and the NIH's National Cancer Institute.