An Exponentially Increasing Set of Genomics Research Data
A 1 followed by 18 zeros: that is the number of bytes in an exabyte, which equals one billion gigabytes. Twenty years after the human genome was first sequenced, an extraordinary amount of genomics research data has been generated, and more is produced every day. In fact, it is estimated that between 2 and 40 exabytes of data will be produced in the next decade. Some researchers have suggested that if the current growth trend in genomics data generation continues, genomics will soon generate more data than applications such as social media, earth sciences, and astronomy. As this data grows exponentially in volume and becomes more interdisciplinary, so do its value and the challenges associated with managing and analyzing it.
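The unit arithmetic behind these figures can be checked with a short sketch. It assumes decimal (SI) units, where a gigabyte is 10^9 bytes and an exabyte is 10^18 bytes; the constant and function names are hypothetical and chosen only for illustration.

```python
# Decimal (SI) storage units, assumed here for illustration.
BYTES_PER_GIGABYTE = 10**9
BYTES_PER_EXABYTE = 10**18  # a 1 followed by 18 zeros


def exabytes_to_gigabytes(exabytes: int) -> int:
    """Convert a whole number of exabytes to gigabytes."""
    return exabytes * BYTES_PER_EXABYTE // BYTES_PER_GIGABYTE


# One exabyte is one billion gigabytes.
print(exabytes_to_gigabytes(1))

# The estimated range for the next decade, expressed in gigabytes.
for estimate in (2, 40):
    print(estimate, "EB =", exabytes_to_gigabytes(estimate), "GB")
```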
Solving Problems by Understanding Genomes
Genome sequencing is a laboratory method used to understand the genetic makeup of an organism, such as a virus. This type of data can help us understand how a disease works, how to identify it in a human or in the field, and how to best attack it. Understanding genomes has the potential to solve some of the most daunting challenges faced around the world.
At MRIGlobal, our commitment to solving these challenges has driven our investment in genomics research and innovation. In fact, there are discoveries waiting to be made right now in genomic data that is already available.
“It could be a client’s data, it could be publicly available data, or public data a client wants to ask specific questions about,” says Phil Davis, Senior Scientist in MRIGlobal’s Applied Biology and Bioinformatics group. “We can dive into this data using a variety of analytical tools and harness its potential.”
Next-Generation Sequencing
Through genomics, we can either characterize what is in a sample or perform predictive analysis. With an understanding of appropriate data science techniques, we can take a virus genome and predict something about the genotype from its sequence alone.
“You can make an analogy between a sentence and a genome sequence. It can be looked at as a collection of words. Our work is to make those words, or genomes, make sense,” Davis said.
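One common way to act on this analogy in bioinformatics is to split a sequence into short overlapping "words" called k-mers and count them. The sketch below is only an illustration of that general idea, not a description of MRIGlobal's methods; the function name `kmer_counts`, the choice of k = 3, and the toy sequence are all assumptions made for the example.

```python
from collections import Counter


def kmer_counts(sequence: str, k: int = 3) -> Counter:
    """Count overlapping k-mers ("words") in a nucleotide sequence."""
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        # Skip windows that contain ambiguous bases such as N.
        if set(word) <= set("ACGT"):
            counts[word] += 1
    return counts


# Hypothetical toy sequence, not real viral data.
toy_sequence = "ATGCGATGA"
counts = kmer_counts(toy_sequence, k=3)
print(counts["ATG"])  # the word ATG appears twice in the toy sequence
print(sorted(counts))  # the distinct words found
```

Word counts like these can serve as numeric features for downstream analysis, in the same way word frequencies are used to analyze text.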
MRIGlobal genomics experts like Davis work with colleagues and collaborators who perform next-generation sequencing, a technology used to determine the order of nucleotides in entire genomes or in targeted regions of DNA or RNA.
Genomic sequencing has led to the discovery of genes associated with diseases and the identification of new variants tied to phenotypes. Through comprehensive DNA exploration, researchers have created maps of gene variations related to health, disease, and drug response.
“We’re constantly thinking of new ways to discover hidden information that is encoded in the sequences of these viral genomes,” said Davis. “We’ve made a lot of progress, but I think we’ve just scratched the surface of what’s possible.”
The future of genomics research may rely on machine learning, which can help researchers make sense of large, complex data sets like those genomics produces.
Getting Started at MRIGlobal
Contact MRIGlobal to learn more about our work in genomics and bioinformatics. We work with our clients to transform data into the next generation of biosurveillance, disease diagnostics, and pharmaceutical development.