MoArLex: An Arabic Sentiment Lexicon Built Through Automatic Lexicon Expansion / Mohab Youssef Shawky
Material type: Text
Language: English
Summary language: English
Publication details: 2019
Description: 88 p. ill. 21 cm
DDC classification: 610
| Item type | Current library | Call number | Status | Date due | Barcode |
|---|---|---|---|---|---|
| Thesis | Main library | 610 / M.Y.M / 2019 | Not For Loan | | |
Supervisor: Mohamed A. El-Helw
Thesis (M.A.)--Nile University, Egypt, 2019.
"Includes bibliographical references"
Contents:
CHAPTER I Introduction
1.1 Problem Definition
1.2 Objectives
1.3 Motivation
1.4 Thesis organization
1.5 Publication related to this work
CHAPTER II Literature Review
2.1 The beginning of Arabic Sentiment Lexicons
2.2 Current alternative Arabic lexicons
2.2.1 Arabic sentiment analysis: Lexicon-based and Corpus-based [27]
2.2.2 Automatic expandable large-scale sentiment lexicon of Modern Standard Arabic and Colloquial [36]
2.2.3 SANA: A Large-Scale Multi-Genre, Multi-Dialect Lexicon for Arabic Subjectivity and Sentiment Analysis [18]
2.2.4 A Proposed Sentiment Analysis Tool for Modern Arabic Using Human-Based Computing [40]
2.2.5 Mining Arabic Business Reviews [41]
2.2.6 Idioms-Proverbs Lexicon for Modern Standard Arabic and Colloquial Sentiment Analysis [44]
2.2.7 BiSAL - A bilingual sentiment analysis lexicon to analyse Dark Web forums for cybersecurity [46]
2.2.8 NileULex: A Phrase and Word Level Sentiment Lexicon for Egyptian and Modern Standard Arabic [10]
2.2.9 Sentiment Lexicons for Arabic Social Media [16]
2.2.9 A web-based tool for Arabic sentiment analysis [58]
2.2.10 Lexicon-Based Sentiment Analysis of Arabic Tweets [65]
2.2.11 Improving Sentiment Analysis in Arabic Using Word Representation [67]
2.2 Summary of Lexicon Expansion Works
2.3 Conclusion
CHAPTER III The Proposed System: Arabic Sentiment Lexicon Construction and Sentiment Analysis Tool
3.1 Methodology
3.1.1 Lexicon Expansion
3.1.1.1 Generation of candidate terms
3.1.1.2 Filtering of candidate terms
3.1.1.3 Polarity determination
3.1.1.4 Time taken for building the lexicon
3.1.1.5 Hardware & software used for building lexicon
3.2 Main features of the presented lexicon
3.3 A Simple Sentiment Analysis Tool for Lexicon Evaluation
CHAPTER IV Evaluation
Introduction
4.1 Manual Evaluation of the Lexicon
4.2 Comparison between MoArLex and other lexicons
4.3 Supervised Learning Experiment
4.4 Evaluating the Lexicon through Sentiment Analysis
4.5 Conclusion
CHAPTER V Conclusion and Future Work
5.1 Conclusion
5.2 Future Work
References
Abstract:
Research on sentiment analysis has received considerable attention over the last decade, especially after the sharp increase in social media usage. Social networks such as Facebook and Twitter generate a very large amount of data every day, with posts covering topics that range from sports and products to politics and current events. Because this data is created by users from all over the world, it is multilingual in nature. Arabic is one of the important languages recently targeted by many sentiment analysis efforts. However, Arabic is considered under-resourced in terms of lexicons and datasets when compared to English.
This work presents a novel technique for automatically expanding an Arabic sentiment lexicon using word embeddings. Its main aim is to build a high-coverage Arabic lexicon automatically, and its main objective is to use that lexicon in a sentiment analysis task. Moreover, the proposed system is designed to overcome the low accuracy of other automatically expanded Arabic lexicons.
The quality of the automatically added terms was evaluated in multiple ways, all of which showed that lexicon entries added using the presented method are more accurate than sentiment lexicon entries obtained using machine learning or distant supervision methods.
Text in English, abstracts in English.