TY - JOUR
T1 - Optimal 16S rRNA gene amplicon sequencing analysis for oral microbiota to avoid the potential bias introduced by trimming length, primer, and database
AU - Nagai, Takahiko
AU - Shiba, Takahiko
AU - Komatsu, Keiji
AU - Watanabe, Takayasu
AU - Nemoto, Takashi
AU - Maekawa, Shogo
AU - Kobayashi, Ryota
AU - Matsumura, Shunsuke
AU - Ohsugi, Yujin
AU - Katagiri, Sayaka
AU - Takeuchi, Yasuo
AU - Iwata, Takanori
N1 - Publisher Copyright: Copyright © 2024 Nagai et al.
PY - 2024/12
Y1 - 2024/12
N2 - 16S rRNA gene amplicon sequencing analysis is used to investigate bacterial communities; however, the estimated bacterial composition can differ from the original due to experimental and analytical biases. Therefore, this study determines the optimal conditions to minimize the potential biases in 16S rRNA gene amplicon sequencing analysis using different DNA samples, trimming lengths, primers, and databases. For the mock1 community, comprising 15 bacteria from various environments and analyzed with 250 and 300 bp paired-end (PE) sequencing, the highest similarity to the theoretical values was obtained with Greengenes2 and the V3 region at the genus level and the V5–V6 region at the species level. In the 300 bp PE sequencing analysis of the mock2 community, comprising six major oral bacteria, the data using the V3–V4, V4, and V5–V6 regions with SILVA, Greengenes2, and the Human Oral Microbiome Database (HOMD) showed the highest similarity to the theoretical values at the genus level. At the species level, the data using the V3–V4 and V4 regions with Greengenes2 and the data using the V1–V2 region with HOMD showed the closest agreement with the theoretical values. In the species-level analysis of the dental calculus samples with 300 bp PE, the Shannon index was higher with the V1–V2 region and HOMD than with the other combinations. Our results suggest that the optimal conditions for oral microbiome analysis are 300 bp PE combined with either the V3–V4 or V4 region and Greengenes2, or the V1–V2 region and HOMD.
AB - 16S rRNA gene amplicon sequencing analysis is used to investigate bacterial communities; however, the estimated bacterial composition can differ from the original due to experimental and analytical biases. Therefore, this study determines the optimal conditions to minimize the potential biases in 16S rRNA gene amplicon sequencing analysis using different DNA samples, trimming lengths, primers, and databases. For the mock1 community, comprising 15 bacteria from various environments and analyzed with 250 and 300 bp paired-end (PE) sequencing, the highest similarity to the theoretical values was obtained with Greengenes2 and the V3 region at the genus level and the V5–V6 region at the species level. In the 300 bp PE sequencing analysis of the mock2 community, comprising six major oral bacteria, the data using the V3–V4, V4, and V5–V6 regions with SILVA, Greengenes2, and the Human Oral Microbiome Database (HOMD) showed the highest similarity to the theoretical values at the genus level. At the species level, the data using the V3–V4 and V4 regions with Greengenes2 and the data using the V1–V2 region with HOMD showed the closest agreement with the theoretical values. In the species-level analysis of the dental calculus samples with 300 bp PE, the Shannon index was higher with the V1–V2 region and HOMD than with the other combinations. Our results suggest that the optimal conditions for oral microbiome analysis are 300 bp PE combined with either the V3–V4 or V4 region and Greengenes2, or the V1–V2 region and HOMD.
KW - 16S rRNA gene sequencing
KW - dental calculus
KW - Greengenes2
KW - human oral microbiome database
KW - oral microbiota
UR - http://www.scopus.com/inward/record.url?scp=85211570202&partnerID=8YFLogxK
U2 - 10.1128/spectrum.03512-23
DO - 10.1128/spectrum.03512-23
M3 - Article
C2 - 39436127
AN - SCOPUS:85211570202
SN - 2165-0497
VL - 12
JO - Microbiology Spectrum
JF - Microbiology Spectrum
IS - 12
ER -