Supervised and Unsupervised Hybrid Models / Mina Adly Awad

Material type: Text
Language: English
Summary language: English
Publication details: 2022
Description: 91 p. ill. 21 cm
DDC classification: 610

Supervisor:
Dr. Nashwa Abdelbaki

Thesis (M.A.), Nile University, Egypt, 2022.

"Includes bibliographical references"

Contents:
CERTIFICATION OF APPROVAL
DEDICATION
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
PUBLICATIONS
ACRONYMS
ABSTRACT
1. INTRODUCTION
1.1. MACHINE LEARNING APPLICATION IN INDUSTRIAL USE CASES
1.2. MACHINE LEARNING APPLICATIONS IN TELECOM INDUSTRY
1.3. CHALLENGING DATASET FOR SUPERVISED METHODS
1.4. NON-LINEAR SYSTEMS CHALLENGES
2. RESEARCH HISTORICAL OVERVIEW
2.1. UNSUPERVISED MODELS AS AN ENHANCEMENT TO SPECIFIC SUPERVISED MODEL
2.2. UNSUPERVISED MODELS FOR APPLICATION-ORIENTED REQUIREMENTS
2.3. UNSUPERVISED WITH SUPERVISED MODELS IN THE TELECOM INDUSTRY
3. PROPOSED MODEL ARCHITECTURE
3.1. CLASSICAL METHODS USED UNSUPERVISED MODELS WITH SUPERVISED MODELS
3.1.1. ELBOW METHOD
3.1.2. DUNN INDEX
3.2. PROPOSED MODEL ARCHITECTURE
3.3. TESTING AND VALIDATION FLOW FOR NON-SCENE DATA
3.4. MODEL CHALLENGES
4. MOBILE NETWORK KPIS AND DEFINITIONS
4.1. USED DATASETS OVERVIEW
4.2. MOBILE NETWORK ARCHITECTURE
4.3. DATASET FEATURES
5. DATA INSIGHTS AND VISUALIZATION
5.1. FIRST DATASET VISUALIZATION
5.2. SECOND DATASET VISUALIZATION
6. USED PLATFORM AND ASSESSMENT CRITERIA
6.1. BISECTING K-MEANS ALGORITHM
6.2. K-MEANS ALGORITHM
6.3. ASSESSMENT METHOD
7. EXPERIMENTS AND RESULTS
7.1. EXPERIMENTS
7.2. RESULTS
8. CONCLUSION & FUTURE WORK
References

Abstract:
Industrial operations use automation and Machine Learning (ML) models to improve operational quality. Datasets provided by industry in real-life use cases are challenging: they are commonly large and carry more than a single signature. ML models search for the signature and phenomena in a dataset and train on them. Linear and non-linear models work well for datasets with a single or a limited number of phenomena. A dataset with more than one phenomenon or signature trend appears noisy to such a model. In practice, ML models are trained on the major trend in the dataset; the other trends are treated as noise and ignored. This is a challenge for many industrial applications, and it is the problem we aim to solve.
In this thesis, we assumed that if the dataset is split into a number of sub-datasets, a separate model can be trained for each sub-dataset. This can decrease the overall error compared with a single model applied to the main dataset. The split can be done visually or with unsupervised learning models. Researchers who applied unsupervised learning before supervised learning models followed a common flow for this process. They commonly started by solving the unsupervised learning problem, searching for the best number of clusters with methods such as the elbow method. This provides better results than a supervised model alone. However, in our research, we showed that other possibilities lead to better results.
In this thesis, we provided a new method that applies unsupervised models before supervised models. This method avoids the classical methods for validating the number of clusters. It provides better results than both alternatives: a supervised model alone and the classical clustering method.
Mobile networks have use cases that match the dataset signature described above. Data is generated by different network nodes. Each node has its own behavior, based on the nature of the carried traffic and the covered area. This by nature generates a dataset with multiple signatures. We used one of these datasets to test our model. We applied a classical regression model to this dataset, as well as the classical approach of clustering followed by regression. Both results were compared with our model.
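The general idea of splitting a dataset and training one model per sub-dataset can be illustrated with a minimal sketch. This is not the thesis's method: the data is synthetic, the number of clusters is fixed at 2 by assumption, and the names (node_feature, kmeans_1d, fit_line, hybrid_mse) are hypothetical. It only shows why a single regression struggles when a dataset carries two signatures, and how clustering first can reduce the error.

```python
# Illustrative sketch only: synthetic data, fixed k, tiny 1-D k-means.
# None of the names or values below come from the thesis.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def mse(xs, ys, model):
    """Mean squared error of a (slope, intercept) model on the given points."""
    slope, intercept = model
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def kmeans_1d(values, k, iters=20):
    """1-D k-means with deterministic, evenly spaced initial centroids."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / max(k - 1, 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels


# Synthetic dataset with two "signatures" (assumed for illustration):
# each row is (node_feature, x, y); the two groups follow different lines.
rows = []
for i in range(10):
    x = float(i)
    rows.append((0.10 + 0.01 * i, x, 2.0 * x + 1.0))
    rows.append((0.90 - 0.01 * i, x, -3.0 * x + 50.0))

node_feature = [r[0] for r in rows]
xs = [r[1] for r in rows]
ys = [r[2] for r in rows]

# Baseline: one regression model over the whole dataset.
single_mse = mse(xs, ys, fit_line(xs, ys))

# Hybrid: cluster on the node feature, then fit one regression per cluster.
labels = kmeans_1d(node_feature, k=2)
squared_error = 0.0
for c in set(labels):
    cx = [x for x, lab in zip(xs, labels) if lab == c]
    cy = [y for y, lab in zip(ys, labels) if lab == c]
    squared_error += mse(cx, cy, fit_line(cx, cy)) * len(cx)
hybrid_mse = squared_error / len(xs)

print(f"single model MSE: {single_mse:.3f}")
print(f"per-cluster MSE:  {hybrid_mse:.3f}")
```

On this toy data the single model averages the two trends and fits neither, while the per-cluster models each recover one trend. Real network data would need more features, a proper clustering library, and held-out validation.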

Text in English, abstracts in English.
