Machine learning benchmarks

Microbiome benchmarks for machine learning from the Vangay et al. MLRepo paper are available here.

Daily diet-microbiome variation study

Processed shotgun metagenomics data, dietary intake data, and all metadata from the Johnson et al. 2019 Cell Host and Microbe paper, “Daily Sampling Reveals Personalized Diet-Microbiome Associations”, are available here. The related FoodTree tree and code are here.

Immigration Microbiome Project

Processed 16S microbiome data, shotgun metagenomics data, dietary intake data, and all metadata from the Vangay et al. 2018 Cell paper, “U.S. Immigration Westernizes the Human Microbiome”, are available here.

FEMS Benchmarks

Raw and processed benchmark data, including metadata, from the Knights et al. 2011 FEMS paper, “Supervised Classification of Human Microbiota”, are available here.


Data used in the SourceTracker publication, Knights et al., Nature Methods 2011, “Bayesian Community-Wide Culture-Independent Microbial Source Tracking”, are available here.


Final data used in Koren et al. 2013, “A Guide to Enterotypes across the Human Body: Meta-Analysis of Microbial Community Structures in Human Microbiome Datasets”, are here.

Host Genetics and the Microbiome in IBD

Final data analyzed in Knights et al. 2014, “Complex Host Genetics Influence the Microbiome in Inflammatory Bowel Disease” are here.

Publication data used in Montassier et al. 2016, “Pretreatment Gut Microbiome Predicts Chemotherapy-related Infection” are here.

Captivity Humanizes the Primate Microbiome

Species and lifestyle metadata and OTU tables used in Clayton et al. 2016, “Captivity Humanizes the Primate Microbiome”, are here. Raw sequencing data are available at EBI under project PRJEB11414.