Identifying Biomarkers Affecting Response to Immune Checkpoint Blockers in Patients with Melanoma Using Supervised Machine Learning Techniques / Hagar Elsayed Abdelhamid Elshora
Material type: Text
Language: English
Summary language: English, Arabic
Publication details: 2023
Description: 115 p. ill. 21 cm
DDC classification: 610
| Item type | Current library | Call number | Status | Date due | Barcode |
|---|---|---|---|---|---|
| Thesis | Main library | 610/ H.E.I/ 2023 | Not for loan | | |
Supervisor:
Sahar Fawzi
Thesis (M.A.), Nile University, Egypt, 2023.
"Includes bibliographical references"
Contents:
Abstract
Dedication
Acknowledgments
List of Tables
List of Figures
Chapters:
1. Introduction
1.1 Motivation
1.2 Problem Statement
1.3 Thesis Outline and Summary of Contributions
2. Literature Review
2.1 Introduction
2.2 Design of Immune system
2.3 Cancer and its therapeutic approaches
2.4 Role of Therapeutic Agents in Alleviating Cancer Mortality
2.4.1 Breakthrough of Immunotherapy Applications for Cancer
2.5 Cancer types treated with Immune checkpoint blockers
2.6 Cancer immunosurveillance and immunoediting
2.6.1 Cancer Immunosurveillance
2.6.2 Cancer Immunoediting
2.7 Role of anti-CTLA-4 and anti-PD-1 antibodies in the management of Melanoma
2.7.1 Anti-CTLA-4
2.7.2 Anti-PD-1
2.7.3 Combination therapy
2.8 Melanoma
2.8.1 Mechanism of action
2.8.2 The genetics of melanoma risk phenotypes
2.8.3 Clinical Efficacy in Malignant Melanoma
2.8.4 Role of immunological elements in determining immune reactions to ICB
2.9 RNA-seq sequencing technology and its applications in Bioinformatics
2.9.1 RNA-seq Sequencing Technology
2.9.2 Applications of RNA-seq Sequencing
2.9.3 Challenges and Future Directions
2.10 Single-cell RNA-seq sequencing and its applications in Bioinformatics
2.10.1 Single-cell Sequencing Technology
2.10.2 Applications of Single-cell Sequencing
2.10.3 Challenges and Future Directions
2.11 Machine Learning Approaches
2.11.1 Significance of Machine Learning in Bioinformatics
2.11.2 Machine learning classifiers
2.11.3 Applications of Machine Learning in Bioinformatics
2.11.4 Challenges and Future Directions
3. Hypothesis and Objectives
3.1 Hypothesis
3.2 Objectives
4. Methodology
4.1 Data Acquisition
4.2 RNA-Seq Data Analysis
4.2.1 Quality Check of the Raw reads
4.2.2 Mapping reads to reference transcriptome
4.2.3 Differential Gene Expression Analysis
4.2.4 Pathway Enrichment and Druggability Analysis
4.3 Single cell RNA-seq Analysis
4.3.1 Data Source
4.3.2 Filtering Low Quality Cells
4.3.3 Feature Selection and Identification of highly variable features
4.3.4 Linear Dimensionality Reduction
4.3.5 Non-Linear Dimensionality Reduction
4.3.6 Cluster Marker Analysis
4.3.7 Cell Type Annotation
4.3.8 Differential Gene Expression analysis
4.3.9 Cell-cell communication analysis
4.4 Maintaining the Machine Learning Model of the data
4.4.1 Data input
4.4.2 Feature Engineering & Feature Selection
4.4.3 Cross Validation
4.4.4 Model Development
4.4.5 Evaluation metrics
5. Results and Discussions
5.1 RNA-Seq data analysis Results
5.1.1 Quality checking of the raw reads
5.1.2 Reads Mapping to reference transcriptome
5.1.3 Differential Expression Analysis
5.1.4 Functional Gene Enrichment Analysis
5.1.5 Pathway Enrichment Analysis of DEGs
5.1.6 GSEA Enrichment Analysis
5.1.7 Disease Gene Association Analysis
5.2 Single Cell RNA-Seq data analysis
5.2.1 Filtering Low Quality Cells
5.2.2 Feature Selection and Identification of highly variable features
5.2.3 Linear Dimensionality Reduction
5.2.4 Non-Linear Dimensionality Reduction
5.2.5 Cluster Marker Analysis
5.2.6 Cell Type Annotation
5.2.7 Enrichment of Cell States in Responders Vs Non-responders
5.2.8 Inference of Cell-Cell communication using CellChat
5.2.9 Identification of context-specific signaling pathways
5.3 Supervised Predictive Model
5.3.1 Preprocessing Pipeline
5.3.2 Feature Selection
5.3.3 Cross Validation
5.3.4 Classification Metrics
5.3.5 Model training and parameter tuning
5.3.6 Model validation and independent testing
5.3.7 Comparison with Related Work
6. Conclusions and Future work
6.1 Conclusion
6.2 Future Endeavors
7. Appendix
8. Supplementary Figures
Bibliography
Abstract:
Background:
Anti-cytotoxic T-lymphocyte-associated protein 4 (anti-CTLA-4) and anti-programmed death 1 (anti-PD-1) are antibodies that block immune checkpoint proteins and have been approved by the FDA for the treatment of cancers such as melanoma, renal carcinoma, and non-small cell lung cancer. Immunotherapy targets checkpoints such as CTLA-4 and PD-1 to stimulate patients' immune systems to detect and kill cancer cells while sparing normal cells. Despite the high response rate in melanoma patients, the majority of patients are resistant to treatment, which is a major cause of death. Fully understanding and controlling the use of immunotherapy therefore requires identifying the factors that drive or prevent response to immune checkpoint treatment. This study first aimed to identify biomarkers driving the response to immune checkpoint blockers (ICBs) in melanoma patients, based on gene expression levels from the analysis of both bulk RNA-seq and single-cell RNA-seq data. Second, it aimed to develop a reliable model using supervised machine learning (ML) approaches that can predict response to ICB therapy.
Methods:
Two datasets were investigated. A bulk RNA-seq dataset of melanoma patients treated with anti-PD-1 was retrieved from the Sequence Read Archive (PRJNA356761), while a single-cell dataset of melanoma patients treated with anti-PD-1, anti-CTLA-4, or the combination was retrieved from the Gene Expression Omnibus (GSE120575). For the bulk RNA-seq dataset, reads were quality-checked, then mapped to the human reference (HG) transcriptome with the kallisto aligner to quantify gene expression, followed by differential gene expression analysis using DESeq2. KEGG pathways and GO functional categories were selected for gene set enrichment analysis using the DAVID tool and the GSEA package in R, along with gprofiler2 and enrichR. For the single-cell dataset, preprocessing included quality checking, followed by feature selection of the highly variable genes, linear dimensionality reduction by PCA, clustering of cells, and visualisation using UMAP. Cell type annotation was performed manually and backed up with the automatic approach of SingleR. Differential gene expression analysis was conducted using the Wilcoxon test, followed by cell-cell communication analysis with CellChat. Both datasets were then used to construct a reliable model for ICB response prediction. Different feature engineering approaches were applied (DEGs, immune genes, cell type signatures, the IFNG signature, and the top 1000 variable features), and two types of cross-validation were used (LOOCV and k-fold). Multiple classifiers were used: random forest (RF), support vector machine (SVM), logistic regression, XGBoost, and K-nearest neighbor (KNN) for the RNA cohort, and AdaBoost, RF, decision tree, and KNN for the single-cell cohort.
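For readers unfamiliar with the two cross-validation schemes named above, the following is a minimal illustrative sketch in plain Python. It is not the thesis's code: the function names `k_fold_indices` and `loocv_indices` are hypothetical, the folds are contiguous and unshuffled for simplicity, and the record does not state the value of k or the tooling the author used.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous, unshuffled folds.

    Hypothetical helper for illustration only; fold sizes differ by at
    most one sample when n_samples is not divisible by k.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample.
        size = base + (1 if fold < extra else 0)
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if not start <= i < stop]
        yield train, test
        start = stop


def loocv_indices(n_samples):
    """Leave-one-out CV is k-fold CV with k equal to the number of samples."""
    return k_fold_indices(n_samples, n_samples)


if __name__ == "__main__":
    # Toy example with 5 samples: each sample is held out exactly once.
    for train, test in loocv_indices(5):
        print("train:", train, "test:", test)
```

Leave-one-out uses every sample as a single-item test set, which suits small patient cohorts but costs one model fit per sample; k-fold trades some of that thoroughness for fewer fits.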
Results:
Single-cell analysis identified enrichment of cell states such as CD4 and CD8 naive-like cells in responders and enrichment of CD8 T regulatory and CD8 T exhausted cells in non-responders, implying their association with response to immune checkpoint therapy. Differential gene expression and enrichment analysis linked regulation of T cell activation, T cell receptor signaling pathways, and leukocyte cell-cell adhesion with response to immune checkpoint blockade therapy, whereas the adaptive immune response based on somatic recombination of immune receptors built from immunoglobulin superfamily domains, as well as complement activation, were among the mechanisms of resistance. Multiple machine learning classifiers were used to train a model for predicting response to immune checkpoint therapy in melanoma patients. AdaBoost showed the highest performance, with an F1 score of 86% and an AUC of 95%, using DEGs as the selected features in the single-cell dataset. SVM showed the highest F1 score, 72.5%, with IFNG pathway genes as the selected features, compared to other classifiers.
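As a reminder of what the two reported metrics measure, here is a small plain-Python sketch of the F1 score and the ROC AUC for a binary responder (1) versus non-responder (0) task. The function names and the toy labels and scores are made up for illustration and are not data or code from the thesis.

```python
def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy, made-up labels and classifier outputs (not thesis data).
    y_true = [1, 1, 0, 0, 1]
    y_pred = [1, 0, 0, 1, 1]
    scores = [0.9, 0.4, 0.3, 0.6, 0.8]
    print("F1:", round(f1_score(y_true, y_pred), 3))
    print("AUC:", round(roc_auc(y_true, scores), 3))
```

F1 depends on the hard predictions at a chosen threshold, while AUC depends only on how the scores rank responders against non-responders, which is why the two numbers can differ noticeably for the same model.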
Conclusion:
The thesis findings suggest that enrichment of CD4+ and CD8+ T cells is correlated with response to immune checkpoint blockers in melanoma patients. In addition, a reliable model was developed using AdaBoost with the single-cell (SC) DEGs as selected features to predict response to ICB, so that therapy can be provided to those who will benefit most from it. Moreover, the different layers of transcriptomic data together provide both a broad, large-scale overview and an in-depth view of the states of individual cells, leading to a better understanding of the mechanism underlying response to ICBs and thus to better prediction of response to ICB therapy.
Keywords
Immunotherapy, Single cell RNAseq Analysis, Machine learning, Transcriptomics, Immune checkpoint blockers, Bulk RNAseq analysis, Melanoma, Cancer, IFNG Signature, ICB Response, cell-cell communication.
Text in English, abstracts in English and Arabic