000 04028nam a22002537a 4500
008 201210b2023 a|||f bm|| 00| 0 eng d
040 _aEG-CaNU
_cEG-CaNU
041 0 _aeng
_beng
_bara
082 _a610
100 0 _aMohamed Omar Moawad Fares
_93141
245 1 _aBENCHMARKING SEVERAL CLUSTERING AND DENOISING APPROACHES FOR OTUs/ASVs INFERENCE FROM AMPLICON BASED SEQUENCING/
_cMohamed Omar Moawad Fares
260 _c2023
300 _a74 p.
_bill.
_c21 cm.
500 _3Supervisor: Mohamed El-Helw
502 _aThesis (M.A.)--Nile University, Egypt, 2023.
504 _a"Includes bibliographical references"
505 0 _aContents: Dedication, iii -- Acknowledgments, iv -- List of Tables, vi -- List of Figures, vii -- Abstract, viii -- Introduction, 1 -- Background, 4 -- Methods, 18 -- Results, 32 -- Discussion, 44 -- References, 53
520 3 _aAbstract: Amplicon sequencing is an indispensable tool for microbiome studies, used to unravel the taxonomic composition and relative abundance of microbial communities. Yet several artifacts, including sequencing errors, are introduced at different processing steps, necessitating computational methods to eliminate them. Distance-based clustering into operational taxonomic units (OTUs) and denoising of sequence reads into amplicon sequence variants (ASVs) are the two main approaches to handling this issue. Varying experimental setups and complex pipeline parameters have hindered unbiased comparisons between the approaches, resulting in divergent findings across separate studies. In this study, we aimed to conduct a comprehensive benchmarking analysis via an unbiased head-to-head comparison of eight clustering and denoising algorithms, using a collection of mock communities from the Mockrobiota database. Using unified preprocessing steps for quality filtering and chimera removal, we conducted a fair comparison between DADA2, Deblur, MED, UNOISE3, UPARSE, DGC, Average neighborhood and OptiClust. DADA2 and UPARSE were the most efficient algorithms, producing comparable results in terms of overall error rate, percentage of exact matches to the mock reference, and percentage of taxonomic over-splitting and over-merging. These results suggest that, given the same quality preprocessing, sequence abundance filtering and chimera detection parameters, OTU clustering and ASV denoising produce comparable results with minor approach-dependent traits. Keywords: Amplicon Sequence Analysis, Denoising, Clustering, OTU, ASV, Benchmarking
546 _aText in English, abstracts in English and Arabic
650 4 _aInformatics-IFM
_9266
655 7 _2NULIB
_aDissertation, Academic
_9187
690 _aInformatics-IFM
_9266
942 _2ddc
_cTH
999 _c10258
_d10258