Mohamed Omar Moawad Fares

BENCHMARKING SEVERAL CLUSTERING AND DENOISING APPROACHES FOR OTUs/ASVs INFERENCE FROM AMPLICON BASED SEQUENCING / Mohamed Omar Moawad Fares - 2023 - 74 p. ill. 21 cm.

Supervisor: Mohamed El-Helw

Thesis (M.A.)—Nile University, Egypt, 2023.

"Includes bibliographical references"

Contents:
Dedication
Acknowledgments
List of Tables
List of Figures
Abstract
Introduction
Background
Methods
Results
Discussion
References

Abstract:
Amplicon sequencing is an indispensable tool for microbiome studies,
used to unravel the taxonomic composition and relative abundance of
microbial communities. Yet several artifacts, including sequencing errors,
are introduced at different processing steps, necessitating the use of
computational methods to eliminate them. Distance-based
clustering into operational taxonomic units (OTUs) and denoising of
sequence reads into Amplicon Sequence Variants (ASVs) are the two main
approaches to handling this issue. Varying experimental setups and complex
pipeline parameters have hindered unbiased comparisons between
different approaches, resulting in divergent findings across separate
studies. In this study, we aimed to conduct a comprehensive benchmarking
analysis via an unbiased head-to-head comparison of eight different
clustering and denoising algorithms on a collection of mock communities
from the Mockrobiota database. Using unified preprocessing steps for
quality filtering and chimera removal, a fair comparison between DADA2,
Deblur, MED, UNOISE3, UPARSE, DGC, Average neighborhood and
Opticlust was conducted. DADA2 and UPARSE were the most efficient
algorithms, producing comparable results in terms of overall error rate,
percentage of exact matches to the mock reference and percentage of
taxonomic over-splitting and over-merging. These results suggest that,
given the same quality preprocessing, sequence abundance filtering and
chimera detection parameters, OTU clustering and ASV denoising
produce comparable results, with minor approach-dependent differences.
Keywords: Amplicon Sequence Analysis, Denoising, Clustering,
OTU, ASV, Benchmarking


Text in English, abstracts in English and Arabic


Informatics-IFM


Dissertation, Academic

610