µSTASIS - Assessment of human microbiota stability across longitudinal samples
Institute for Research in Biomedicine (IRB) Barcelona
Microbiome research is advancing at an increasing pace, driven mainly by continuous methodological improvements that provide finer detail in taxonomic and functional surveys. This rapid progress is reflected in the diversity of cutting-edge microbiome-based therapeutics, such as fecal microbiota transplantation, prebiotics and engineered symbiotic bacteria. Over the years, microbiome data captured as snapshots in time have uncovered the interindividual variability of the human microbiota, while changes within an individual's microbiota are emerging as predictors of clinical outcomes and as a basis for disease forecasting. Novel methods are therefore needed to evaluate longitudinal data robustly: what matters is not only the number of surveys based on high-dimensional sequencing data, but also the use of appropriate methodologies to interpret them correctly. To assess short-term to long-term intraindividual changes in microbial communities, I (together with Francisco J. Santonja and Alfonso Benítez-Páez) developed µSTASIS, a multifunction R package that assesses the temporal stability of an individual's microbiota by means of a contextualized, intuitive and validated metric (mS), which is independent of both beta diversity distance methods and correlation coefficients. It relies on iterative partitioned clustering to stress paired samples from multiple individuals. Specifically, the Hartigan-Wong k-means algorithm is run as many times as possible to obtain the frequency with which samples from the same individual are clustered together. Moreover, the approach respects the principles of compositional data analysis and therefore deals properly with data that lie in the Aitchison simplex. µSTASIS was first released on CRAN under the GPL-3 license.
It was later published in Briefings in Bioinformatics (bbac055), where we reported that the metric fills a gap in microbiome research regarding stability in a temporal framework when data are treated as compositional. We also demonstrated its utility by assessing gut microbiota stability in three previously published, independent, longitudinal data sets. In addition, the project was selected for the first cycle of the Bioconductor New Developer Mentorship (under the supervision of Dario Strbenac and Davide Risso), through which I am implementing additional functions, improving some theoretical aspects of the main algorithm and preparing complete documentation so that the package can soon be moved from CRAN to Bioconductor.
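
The clustering idea described above can be illustrated with a minimal Python sketch. It is not the µSTASIS implementation, and it rests on several assumptions that do not come from the package: all function names are hypothetical, the centred log-ratio (CLR) transform with a pseudocount is assumed as the compositional preprocessing step, a plain Lloyd's k-means stands in for the Hartigan-Wong variant, and k is assumed to run from 2 up to the number of individuals. For each individual, the score is the fraction of clustering runs in which both of their samples fall in the same cluster.

```python
import math
import random


def clr(composition, pseudocount=0.5):
    """Centred log-ratio transform of one sample (assumed preprocessing)."""
    logs = [math.log(x + pseudocount) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_labels(points, k, rng, max_iter=100):
    """Lloyd's k-means (a stand-in for Hartigan-Wong); one label per point."""
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def stability_scores(samples_t1, samples_t2, seed=0):
    """Fraction of k-means runs in which each individual's paired samples co-cluster.

    samples_t1 and samples_t2 map individual -> composition at two time points.
    The range of k (2 .. number of individuals) is an assumption of this sketch.
    """
    individuals = sorted(samples_t1)
    n = len(individuals)
    points = [clr(samples_t1[i]) for i in individuals]
    points += [clr(samples_t2[i]) for i in individuals]
    rng = random.Random(seed)
    together = {ind: 0 for ind in individuals}
    ks = range(2, n + 1)
    for k in ks:
        labels = kmeans_labels(points, k, rng)
        for idx, ind in enumerate(individuals):
            if labels[idx] == labels[idx + n]:
                together[ind] += 1
    return {ind: together[ind] / len(ks) for ind in individuals}


# Toy counts for three individuals (made-up numbers, for illustration only).
t1 = {"A": [80, 15, 5], "B": [10, 70, 20], "C": [5, 10, 85]}
t2 = {"A": [80, 15, 5], "B": [12, 68, 20], "C": [60, 30, 10]}
scores = stability_scores(t1, t2)
```

In this toy example, individual A has identical samples at both time points, so the pair can never be separated and its score is 1. How the scores of the other individuals behave depends on the data and on the random initialisation.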