TY - BOOK
AU - Mohamed Omar Moawad Fares
TI - BENCHMARKING SEVERAL CLUSTERING AND DENOISING APPROACHES FOR OTUs/ASVs INFERENCE FROM AMPLICON BASED SEQUENCING
U1 - 610
PY - 2023///
KW - Informatics-IFM
KW - NULIB
KW - Dissertation, Academic
N1 - Thesis (M.A.)--Nile University, Egypt, 2023; "Includes bibliographical references"; Contents: Dedication (iii); Acknowledgments (iv); List of Tables (vi); List of Figures (vii); Abstract (viii); Introduction (1); Background (4); Methods (18); Results (32); Discussion (44); References (53)
N2 - Abstract: Amplicon sequencing is an indispensable tool in microbiome studies for unraveling the taxonomic composition and relative abundance of microbial communities.
Yet several artifacts, including sequencing errors, are introduced at different processing steps, necessitating computational methods to eliminate them. Distance-based clustering into operational taxonomic units (OTUs) and denoising of sequence reads into amplicon sequence variants (ASVs) are the two main approaches to this problem. Varying experimental setups and complex pipeline parameters have hindered unbiased comparisons between the approaches, resulting in divergent findings across separate studies. In this study, we aimed to conduct a comprehensive benchmarking analysis via an unbiased head-to-head comparison of eight clustering and denoising algorithms, using a collection of mock communities from the Mockrobiota database. Using unified preprocessing steps for quality filtering and chimera removal, we conducted a fair comparison between DADA2, Deblur, MED, UNOISE3, UPARSE, DGC, Average Neighborhood, and OptiClust. DADA2 and UPARSE were the most efficient algorithms, producing comparable results in terms of overall error rate, percentage of exact matches to the mock reference, and percentage of taxonomic over-splitting and over-merging. These results suggest that, given the same quality preprocessing, sequence abundance filtering, and chimera detection parameters, OTU clustering and ASV denoising produce comparable results with minor approach-dependent traits. Keywords: Amplicon Sequence Analysis, Denoising, Clustering, OTU, ASV, Benchmarking
ER -