Comparative analysis of the human microbiome from four different regions of China and machine learning-based geographical inference

Bibliographic Details
Main Authors: Yinlei Lei, Min Li, Han Zhang, Yu Deng, Xinyu Dong, Pengyu Chen, Ye Li, Suhua Zhang, Chengtao Li, Shouyu Wang, Ruiyang Tao
Format: Article
Language: English
Published: American Society for Microbiology 2025-01-01
Series: mSphere
Online Access: https://journals.asm.org/doi/10.1128/msphere.00672-24
collection DOAJ
description ABSTRACT The human microbiome, the community of microorganisms that reside on and inside the human body, is critically important for health and disease. However, it is influenced by various factors and may vary among individuals residing in distinct geographic regions. In this study, 220 samples, consisting of sterile swabs from palmar skin and the oral and nasal cavities, were collected from Chinese Han individuals living in Shanghai, Chifeng, Kunming, and Urumqi, representing the east, northeast, southwest, and northwest of China. The full-length 16S rRNA gene of the microbiota in each sample was sequenced on the PacBio single-molecule real-time sequencing platform, and the sequences were then clustered into operational taxonomic units (OTUs). The analysis revealed significant differences in microbial communities among the four regions. Cutibacterium was the most abundant genus in palmar samples from Shanghai and Kunming, Psychrobacter in Chifeng samples, and Psychrobacillus in Urumqi samples. Additionally, Streptococcus and Staphylococcus were the dominant genera in the oral and nasal cavities. Individuals from the four regions could be distinguished and predicted with a model built using the random forest algorithm, with the palmar microbiota predicting region better than the oral and nasal microbiota. Prediction accuracy using the hypervariable regions (V3-V4 and V4-V5) was comparable to that using the full-length 16S rRNA gene. Overall, our study highlights the distinctiveness of the human microbiome in individuals living in these four regions. Furthermore, the microbiome can serve as a biomarker for geographic origin inference, which has immense application value in forensic science.
IMPORTANCE Microbial communities in human hosts play a significant role in health and disease, varying in species, quantity, and composition due to factors such as gender, ethnicity, health status, lifestyle, and living environment. The characteristics of microbial composition at various body sites of individuals from different regions remain largely unexplored. This study used single-molecule real-time sequencing to detect the full-length 16S rRNA gene of bacteria residing in the palmar skin and the oral and nasal cavities of Han individuals from four regions in China. The composition and structure of the bacteria at these three body sites were well characterized and found to differ regionally. The results elucidate the differences in bacterial communities colonizing these body sites across different regions and reveal the influence of geographical factors on human bacteria. These findings not only contribute to a deeper understanding of the diversity and geographical distribution of human bacteria but also enrich the microbiome data of the Asian population for further studies.
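As a rough illustration of the kind of summary the abstract reports (the most abundant genus per region), the sketch below converts read counts to relative abundances and averages them by region. The counts are hypothetical toy values, not data from the study, and the function names `relative_abundance` and `dominant_genus_by_region` are assumptions made for this example; the authors' actual pipeline is not described in this record.

```python
from collections import defaultdict

# Hypothetical toy read counts per sample; NOT data from the study.
# Each entry is (region, {genus: read_count}).
samples = [
    ("Shanghai", {"Cutibacterium": 60, "Staphylococcus": 30, "Psychrobacter": 10}),
    ("Shanghai", {"Cutibacterium": 50, "Staphylococcus": 40, "Psychrobacter": 10}),
    ("Chifeng", {"Cutibacterium": 20, "Staphylococcus": 20, "Psychrobacter": 60}),
    ("Chifeng", {"Cutibacterium": 30, "Staphylococcus": 10, "Psychrobacter": 60}),
]


def relative_abundance(counts):
    """Convert raw read counts for one sample into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {genus: 0.0 for genus in counts}
    return {genus: n / total for genus, n in counts.items()}


def dominant_genus_by_region(samples):
    """Return {region: (genus, mean relative abundance)} for the top genus."""
    sums = defaultdict(lambda: defaultdict(float))
    n_samples = defaultdict(int)
    for region, counts in samples:
        n_samples[region] += 1
        for genus, frac in relative_abundance(counts).items():
            sums[region][genus] += frac
    result = {}
    for region, genus_sums in sums.items():
        # Mean relative abundance of each genus across the region's samples.
        means = {g: s / n_samples[region] for g, s in genus_sums.items()}
        top = max(means, key=means.get)
        result[region] = (top, means[top])
    return result


dominant = dominant_genus_by_region(samples)
for region, (genus, mean_frac) in dominant.items():
    print(f"{region}: {genus} ({mean_frac:.2f})")
```

Averaging per-sample fractions, rather than pooling raw reads, keeps deeply sequenced samples from dominating the regional summary; a per-sample table of such fractions is also the usual input shape for a classifier such as a random forest.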
id doaj-art-d160caec080b42f1a79e46a26c2d44a5
institution Kabale University
issn 2379-5042
record_updated 2025-01-28T14:00:56Z
volume 10
issue 1
doi 10.1128/msphere.00672-24
author_affiliation (listed in author order)
Yinlei Lei: Shanghai Key Laboratory of Forensic Medicine, Shanghai Forensic Service Platform, Academy of Forensic Sciences, Key Laboratory of Forensic Science, Ministry of Justice, Shanghai, China
Min Li: School of Clinical and Basic Medical Sciences, Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan, China
Han Zhang: Institute of Forensic Science, Fudan University, Shanghai, China
Yu Deng: Shanghai Key Laboratory of Forensic Medicine, Shanghai Forensic Service Platform, Academy of Forensic Sciences, Key Laboratory of Forensic Science, Ministry of Justice, Shanghai, China
Xinyu Dong: Minhang Branch of Shanghai Public Security Bureau, Shanghai, China
Pengyu Chen: Department of Forensic Medicine, Zunyi Medical University, Zunyi, China
Ye Li: Department of Forensic Medicine, School of Basic Medical Sciences, Xinjiang Medical University, Urumqi, China
Suhua Zhang: Institute of Forensic Science, Fudan University, Shanghai, China
Chengtao Li: Institute of Forensic Science, Fudan University, Shanghai, China
Shouyu Wang: Department of Forensic Medicine, Shanghai Medical College, Fudan University, Shanghai, China
Ruiyang Tao: Shanghai Key Laboratory of Forensic Medicine, Shanghai Forensic Service Platform, Academy of Forensic Sciences, Key Laboratory of Forensic Science, Ministry of Justice, Shanghai, China
topic human microbiome
16S rRNA
geographic regions
machine learning