Making More Sense of Microbial Communities – Los Alamos Reporter

Understanding the larger picture of a sample’s origin is one of the goals of the National Microbiome Data Collaborative. Photo courtesy NMDC

LANL PRESS RELEASE

A review article in the journal Frontiers in Bioinformatics highlights the challenges faced and the accomplishments of the National Microbiome Data Collaborative (NMDC) in improving the ability to find and access comparable multiomics data. Multiomics data for microbial communities includes characterization of all DNA, RNA, proteins and metabolites in a system. These help describe the genetic content of a microbiome as well as the nuances of how and when genetic material is used, in particular, which transcripts and proteins respond to which stimuli and which metabolites are consumed and made. Apply them to whole microbial communities and you get the multiomics of the microbiome.

The world’s microbiomes are still largely unknown frontiers: soil, ocean, and human body microbes play important roles in their respective ecosystems, but scientists still only “know” a few of them. By pooling quality and comparable data into the NMDC, scientists hope to begin to elucidate some of these unknowns and hypothesize their functions.

“This availability of data could allow for ever larger research projects,” said NMDC co-principal investigator Patrick Chain. “Scientists looking to more comprehensively compare microbiome data, or compare their own omics data to similar studies in comparable settings, would now be able to find these samples using the NMDC. They could then use the bioinformatics workflows of the NMDC to compare their own data to those processed identically within the NMDC.”

The NMDC was created in 2019 by the Department of Energy to ensure that data is findable, accessible, interoperable and reusable. As reliable data was lacking, the NMDC set out to develop standards for cataloguing, processing, analysis and documentation. This includes bioinformatics workflows for processing data and terminology to be used for sample metadata or to document the use of the workflows themselves. Metadata, for example, is particularly useful for giving context to a sample.

“While current public repositories contain many datasets from samples taken from water sources, we need to know more,” said Los Alamos computational biologist Bin Hu. “Was it seawater? How deep was the sample taken in the water column? What time of day was it collected? What temperature?” This sample information provides more detail and nuance that will help us better understand the results.

NMDC’s major partners (including Lawrence Berkeley Laboratory and Pacific Northwest National Laboratory) have already populated the catalog with hundreds of high-quality samples for use by the wider scientific community, but the ultimate goal is to allow community members to contribute their own data (combined with metadata and using NMDC standardized workflows to generate results) to further enrich the catalog.

Hu, Chain and their colleagues at the Los Alamos, Berkeley, and Pacific Northwest labs led NMDC’s efforts in developing standardized workflows and data processing. The Los Alamos team has also revamped its award-winning online bioinformatics software package EDGE into an NMDC microbiome data-friendly version (NMDC-EDGE) and has been instrumental in developing training and outreach materials to help community users.

2022-03-03

