

Improved Genomics Through Machine Learning


Technology Benefits Genomic Analysis

Machine learning allows computers to learn informative associations in genomics data without being explicitly programmed with what to look for. These algorithms find patterns and critical information, weighing the importance of different parts of the data and accounting for bias. The result is improved genomics through machine learning.

“This technology exists because our ability to create data is outstripping our ability to analyze it,” says Phil Davis, senior scientist in MRIGlobal’s Applied Biology and Bioinformatics group.

Our work expands the use of computational methods like machine learning to improve our understanding of hidden patterns in large and complex data sets, such as those produced by genomics and next-generation sequencing. The analysis this makes possible could lead to major breakthroughs in disease research. According to Davis, the machine learning software ecosystem is mature, but gaps remain that prevent these tools from being applied easily to genomics data.

“A lot of infrastructure to do this is established, so we’re now adapting creative ways to manage, analyze, and utilize it. Over the last four years, we have tried to identify some well-trodden paths in machine learning and understand how to adapt this to a new context,” Davis said.

Technological innovations are also improving the speed and quality of genomic sequencing and reducing its cost, advancing the overall progress of genome research.

Leveraging bioinformatics to improve genome research

Bioinformatics blends computer science, statistics, and biology. Bioinformatics researchers play important roles in pharmaceutical research, personalized medicine, preventative medicine, gene therapy, and many other fields across biological science and technology.

“It is an extraordinary time to be working in bioinformatics method development,” said Joe Russell, Ph.D., MRIGlobal Principal Scientist and Group Lead for the Applied Biology & Bioinformatics team. “The confluence of exponentially increasing genomics data across many different domains and emerging machine learning approaches to scale traditional statistical techniques in novel ways has produced extremely fertile ground for creative applications.”

Researchers at MRIGlobal have published a set of novel feature-engineering algorithms to identify genomic motifs that are predictive of dangerous properties in RNA viruses, such as human pathogenicity and high transmissibility. This significantly improves our ability to characterize the functional threat profile of a novel virus encountered through surveillance efforts, and it provides promising targets for vaccines and therapeutics.

How can genomics be applied?

Improved genomics through machine learning is benefiting researchers in various healthcare fields. It has driven major advances on long-standing challenges such as protein structure prediction. However, when these tools are applied in a research setting, understanding how the model learned its solution is often more important than its accuracy.

Because of this, MRIGlobal researchers have focused on developing these feature-engineering techniques and on using simple, easy-to-interpret models. “This closely integrates with what our colleagues at the lab bench are doing,” Davis said. “We are running experiments on the data. This often starts with a hypothesis about why we think the features we are looking at should be related to the phenomena we are studying. What we might learn from these experiments should be able to plug in immediately to what is happening in the lab.”
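To make the idea of interpretable sequence features concrete, here is a minimal sketch in Python. It is not MRIGlobal's published method. It assumes a much simpler, generic stand-in: counting short subsequences (k-mers) and giving each one a log-odds weight that compares two labeled groups of sequences. The function names, the choice of k = 3, the pseudocount, and the toy sequences are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log


def kmer_counts(seq, k=3):
    """Count overlapping k-mers in a nucleotide sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_log_odds(positive, negative, k=3, pseudocount=1.0):
    """Return a log-odds weight per k-mer: positive group vs. negative group.

    A weight above zero means the k-mer is relatively more frequent in the
    positive group; below zero means more frequent in the negative group.
    """
    pos, neg = Counter(), Counter()
    for s in positive:
        pos.update(kmer_counts(s, k))
    for s in negative:
        neg.update(kmer_counts(s, k))
    vocab = set(pos) | set(neg)
    # The pseudocount keeps k-mers absent from one group from producing log(0).
    pos_total = sum(pos.values()) + pseudocount * len(vocab)
    neg_total = sum(neg.values()) + pseudocount * len(vocab)
    return {
        kmer: log((pos[kmer] + pseudocount) / pos_total)
        - log((neg[kmer] + pseudocount) / neg_total)
        for kmer in vocab
    }


def score(seq, weights, k=3):
    """Sum the weights of the k-mers in seq; unseen k-mers contribute zero."""
    return sum(weights.get(kmer, 0.0) * n
               for kmer, n in kmer_counts(seq, k).items())


# Made-up toy sequences, not real viral genomes.
positive = ["AUGGCCAUGGCC", "GGCCAUGGCCAU"]
negative = ["AUAUAUAUAUAU", "UAUAUAUAUAUA"]

weights = kmer_log_odds(positive, negative)
# The most strongly weighted k-mers can be read off directly.
top_kmers = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:3]
```

Because every feature is a named k-mer with a single weight, a researcher can inspect which subsequences drive a score and compare them against a hypothesis, which is the kind of interpretability described above. Real work would need far more data, careful validation, and biologically motivated features.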

The real potential of these machine learning methods is identifying new or understudied biology that could yield completely novel approaches to drug development or biosurveillance.


Contact MRIGlobal to further understand our work in genomics and bioinformatics. We work with our clients to transform data into the next generation of biosurveillance, disease diagnostics, and pharmaceutical development. 
