000 04430nam a22002537a 4500
008 201210b2022 a|||f bm|| 00| 0 eng d
040 _aEG-CaNU
_cEG-CaNU
041 0 _aeng
_beng
082 _a610
100 0 _aMina Adly Awad
_91853
245 1 _aSupervised and Unsupervised Hybrid Models/
_cMina Adly Awad
260 _c2022
300 _a91 p.
_bill.
_c21 cm.
500 _3Supervisor: Dr. Nashwa Abdelbaki
502 _aThesis (M.A.)--Nile University, Egypt, 2022.
504 _aIncludes bibliographical references.
505 0 _aContents: CERTIFICATION OF APPROVAL DEDICATION ACKNOWLEDGEMENTS TABLE OF CONTENTS LIST OF FIGURES LIST OF TABLES PUBLICATIONS ACRONYMS ABSTRACT 1. INTRODUCTION 1.1. MACHINE LEARNING APPLICATION IN INDUSTRIAL USE CASES 1.2. MACHINE LEARNING APPLICATIONS IN TELECOM INDUSTRY 1.3. CHALLENGING DATASET FOR SUPERVISED METHODS 1.4. NON-LINEAR SYSTEMS CHALLENGES 2. RESEARCH HISTORICAL OVERVIEW 2.1. UNSUPERVISED MODELS AS AN ENHANCEMENT TO SPECIFIC SUPERVISED MODEL 2.2. UNSUPERVISED MODELS FOR APPLICATION-ORIENTED REQUIREMENTS 2.3. UNSUPERVISED WITH SUPERVISED MODELS IN THE TELECOM INDUSTRY 3. PROPOSED MODEL ARCHITECTURE 3.1. CLASSICAL METHODS USED UNSUPERVISED MODELS WITH SUPERVISED MODELS 3.1.1. ELBOW METHOD 3.1.2. DUNN INDEX 3.2. PROPOSED MODEL ARCHITECTURE 3.3. TESTING AND VALIDATION FLOW FOR NON-SCENE DATA 3.4. MODEL CHALLENGES 4. MOBILE NETWORK KPIS AND DEFINITIONS 4.1. USED DATASETS OVERVIEW 4.2. MOBILE NETWORK ARCHITECTURE 4.3. DATASET FEATURES 5. DATA INSIGHTS AND VISUALIZATION 5.1. FIRST DATASET VISUALIZATION 5.2. SECOND DATASET VISUALIZATION 6. USED PLATFORM AND ASSESSMENT CRITERIA 6.1. BISECTING K-MEANS ALGORITHM 6.2. K-MEANS ALGORITHM 6.3. ASSESSMENT METHOD 7. EXPERIMENTS AND RESULTS 7.1. EXPERIMENTS 7.2. RESULTS 8. CONCLUSION & FUTURE WORK References
520 3 _aAbstract: Industrial operations aim to use automation and Machine Learning (ML) models to achieve better operation quality. Datasets provided by industry in real-life use cases have a challenging design. Industrial datasets are commonly large and have more than a single signature. ML models search for the dataset signature and phenomena to train on. Linear and non-linear models provide a solution for datasets with a single or a limited number of phenomena. A dataset with more than one phenomenon or signature trend is a noisy dataset for such models. In practical cases, ML models are trained on the major trend in the dataset. The other trends are considered noise and ignored. This is a challenge for many industrial applications, and one that we aim to solve. In this thesis, we assumed that if we split the dataset into a number of sub-datasets, a separate model can be trained for each sub-dataset. This can decrease the overall error compared with a single model applied to the main dataset. This splitting can be done visually or using unsupervised learning models. Researchers who applied unsupervised learning before supervised learning models followed a common flow for this process. They commonly started by solving the unsupervised learning problem. This is done by searching for the best number of clusters using methods like the elbow method. This provides better results when compared with a supervised model alone. However, in our research, we proved that other possibilities lead to better results. In this thesis, we provided a new method for applying unsupervised models before supervised models. This method avoids going through the classical methods for validating the number of clusters. It provides better results when compared with both techniques: a supervised model alone or the classical clustering method. Mobile networks have use cases that match the dataset signature we mentioned. Data is generated from different network nodes.
Each node has its own behavior based on the nature of the carried traffic and the covered area. This by nature generates a dataset with multiple signatures. We used one of these datasets to test our model. We applied a classical regression model to this dataset, as well as a classical method that applies clustering and then regression. Both results were compared with our model.
546 _aText in English, abstracts in English.
650 4 _aInformatics
655 7 _2NULIB
_aDissertation, Academic
_9187
690 _aInformatics
_91856
942 _2ddc
_cTH
999 _c9776
_d9776