by Elena Biagi – UNIBO

Microbiome research, i.e. the characterization of the complex microbial ecosystems that populate every environmental niche on our planet, including ourselves, has a relatively long history, especially with regard to the microbes inhabiting the human body. In recent years, the field has seen a dramatic increase in scale and scope thanks to advances in DNA-sequencing technologies and computational methods. DNA sequencing is becoming more accessible every day, as prices fall and results are obtained faster. This “democratization” of access to the most up-to-date technologies has allowed many international and interdisciplinary research programmes to start worldwide, aimed at exploring microbial diversity in a large variety of ecosystems, including both natural and human-related environments. This massive effort by the scientific community will produce huge repositories of data and metadata, made publicly available to serve as a reference for ongoing and future studies. It will hand an enormous “catalogue of microbial life” to future generations, allowing us to observe how microbes and microbiomes change in time and space, and how this affects our lives and the environment in which we live. Any project that aims to contribute to this common purpose needs to carefully consider that choices made at every step of the research, from study design to the final analyses, can affect the results and, most importantly, their comparability with results obtained by other research programmes. Indeed, the datasets generated by a single project should always be seen as part of a larger whole, built by the progressive addition of data from many projects to the public repositories.

With this growing awareness, many scientists are calling for shared, agreed-upon protocols that allow data generated by different studies to be analysed together.

To date, microbiome data have been obtained predominantly by three molecular methodologies: (i) 16S rRNA gene sequencing, which portrays the microbial membership of the ecosystem, (ii) metagenomics, which provides a view of the ecosystem’s functional potential, and (iii) metatranscriptomics, which describes the gene expression profile of the whole microbial community. For each of these methods, many steps along the microbiome analysis can affect the comparability of results, starting from the very beginning: the study design, the sample collection strategy, and the collection of metadata. The sampling strategy (longitudinal and/or single-point), the collection and analysis of replicates, and the size of each sample are all parameters affecting the precision, accuracy, representativeness, and reproducibility of the resulting dataset. The accurate collection of metadata is crucial in all fields of microbiome research, not only in clinical comparative studies, both to allow correct interpretation when the dataset is subsequently reused and to make it possible to harmonize the dataset with previous ones.
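
As a small illustration of why complete metadata matter for later harmonization, the sketch below checks a list of sample records for missing fields before a dataset is shared. The field names in REQUIRED_FIELDS and the example records are hypothetical and chosen only for illustration; a real project would define its own list, ideally following a community checklist.

```python
# Hypothetical minimal set of metadata fields (an assumption for this sketch;
# each project would define its own list).
REQUIRED_FIELDS = {"sample_id", "collection_date", "sample_type", "sampling_site"}


def missing_metadata(records):
    """Return {sample label: sorted list of missing or empty fields}."""
    problems = {}
    for index, record in enumerate(records):
        # A field counts as missing if it is absent, None, or blank.
        missing = sorted(
            field
            for field in REQUIRED_FIELDS
            if not str(record.get(field) or "").strip()
        )
        if missing:
            label = record.get("sample_id") or f"record {index}"
            problems[label] = missing
    return problems


# Invented example records, for illustration only.
samples = [
    {"sample_id": "S1", "collection_date": "2020-03-01",
     "sample_type": "soil", "sampling_site": "farm A"},
    {"sample_id": "S2", "collection_date": "", "sample_type": "water"},
]

print(missing_metadata(samples))
```

Running such a check at collection time, rather than at publication time, makes it more likely that gaps can still be filled by the people who took the samples.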

Another key step that can influence results is the method used to extract the DNA, or the RNA, from the original sample, since the bacteria in a complex community belong to many different species, and some cell types can resist common mechanical or chemical lysis methods. The diverse nature of the matrices analysed in microbiome explorations (water, soil, particulate filtered from air, plant compartments, animal tissues and fluids, environmental swabs, human biopsies, feces, etc.) makes nucleic acid extraction challenging to standardize. Nevertheless, it has been repeatedly stated that the same extraction protocol should be applied consistently to all samples of a similar nature, taking into account the published work of large international consortia (Human Microbiome Project, Earth Microbiome Project, Tara Oceans and so on).

After purified DNA has been obtained, all samples should be processed consistently, according to the chosen analysis approach and the available sequencing technology. The latter, i.e. the sequencing platform used, has a confirmed effect on the comparability of results among different studies, although the most up-to-date computational tools can help compensate for this effect. 16S rRNA gene sequencing is the analytical approach most affected by differences in the adopted procedures. This method involves the PCR amplification of a small portion of the target gene, which is present in all bacteria but whose sequence can discriminate among different taxa. Depending on the gene region selected for sequencing, some bacterial groups can disappear from the final results because they are not amplified in the PCR. For this reason, it is crucial to compare only studies that have been carried out using the same 16S rRNA gene region as a sequencing target.
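
The rule of comparing only studies that target the same gene region can be sketched in a few lines. The example below groups studies by the 16S rRNA region they report and keeps only the groups with more than one study. The study names and region labels are invented for illustration, and the label normalization is a simplifying assumption rather than a standard procedure.

```python
from collections import defaultdict


def group_by_region(studies):
    """Group study names by their (normalized) 16S rRNA target region."""
    groups = defaultdict(list)
    for study in studies:
        # Normalize labels such as "v3-v4" or "V3 - V4" to "V3-V4".
        region = study["region"].strip().upper().replace(" ", "")
        groups[region].append(study["name"])
    return dict(groups)


# Hypothetical studies, for illustration only.
studies = [
    {"name": "study_A", "region": "V3-V4"},
    {"name": "study_B", "region": "V4"},
    {"name": "study_C", "region": "v3-v4"},
]

# Keep only the regions shared by at least two studies.
comparable = {
    region: names
    for region, names in group_by_region(studies).items()
    if len(names) > 1
}

print(comparable)
```

In this toy case only study_A and study_C would be co-analysed; study_B targets a different region and would need to be handled separately.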

CIRCLES aims to contribute to the current state of the art in microbiome research by providing a comprehensive analysis of the microbiomes across all actors (plants and animals, final products and intermediates, surrounding and working environments, and human workers) involved in different food systems relevant to the EU food market. The definition of standard operating procedures for sampling, storage, processing, and analysis, carried out alongside the complex and multifaceted study design, has been a crucial phase of the project. Its aim is to provide accurate and comparable results that can robustly contribute to the study of the involvement of microbiomes in the safety, security and sustainability of modern food systems.