Objective: This study aimed to analyze the bacteria in dental caries and establish an optimized dental caries diagnosis model based on 16S ribosomal RNA (rRNA) data of the oral flora.

Methods: We searched public microbiome databases, including NCBI, MG-RAST, EMBL-EBI, and QIITA, and collected data from studies of the human oral microbiome worldwide. Samples in the caries dataset (1,703) were compared with healthy samples (20,540) using the microbial search engine (MSE) to obtain the microbiome novelty score (MNS), and a caries diagnosis model was constructed based on this index. Nonparametric multivariate ANOVA was used to analyze and compare the impact of different host factors on the oral flora MNS, and the model was optimized by controlling for the related factors. Finally, the performance of the model was evaluated by receiver operating characteristic (ROC) curve analysis.

Results: 1) The distribution of the oral microbiota differed markedly among people with different oral health statuses, and the species richness and species diversity indices decreased. 2) On the caries dataset, the area under the ROC curve (AUC) was 0.67. 3) Among the five host factors (caries status, country, age, decayed, missing, and filled teeth (DMFT) index, and sampling site), sampling site displayed the strongest effect on the MNS of samples (P=0.001). 4) After controlling for host factors, the AUC of the model was 0.87, 0.74, and 0.74 in high-caries, medium-caries, and low-caries samples from Chinese children, respectively, and 0.75 in mixed dental plaque samples.

Conclusion: The model based on the analysis of 16S rRNA data of the oral flora had good diagnostic efficiency.
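The ROC evaluation described in the Methods can be illustrated with a minimal sketch. It assumes that each sample has a single MNS value and that a higher MNS indicates caries; both the function name `roc_auc` and the example values are hypothetical and are not taken from the study, and the MSE search that produces the MNS is not reproduced here. The AUC is computed as the fraction of (caries, healthy) sample pairs in which the caries sample has the higher score, with ties counted as half.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores: one novelty score per sample (assumed: higher means more caries-like)
    labels: 1 for caries, 0 for healthy
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample from each class")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # caries sample ranked above healthy sample
            elif p == n:
                wins += 0.5      # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Made-up scores for illustration only; these are not data from the study.
mns = [0.31, 0.12, 0.27, 0.08, 0.19, 0.10]
is_caries = [1, 0, 1, 0, 0, 1]

auc = roc_auc(mns, is_caries)
print(f"AUC = {auc:.2f}")
```

Controlling for a host factor, as in Result 4, would correspond to calling such a function separately on each stratum (for example, one sampling site at a time) rather than on the pooled dataset.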