Confidence aware incremental learning approach for named entity linking / Talaat Maher Talaat Mohamed Khalil

By:

Talaat Maher Talaat Mohamed Khalil

Material type: Text

Language: English
Summary language: English
Publication details: 2016
Description: 73 p. ill. 21 cm
Subject(s):

Genre/Form:

Dissertation, Academic

DDC classification:




Holdings
Item type	Current library	Call number	Status	Date due	Barcode
Thesis	Main library	610/ T.K.C 2016	Not for loan

Supervisor: Samhaa El-Beltagy

Thesis (M.A.)—Nile University, Egypt, 2016 .

"Includes bibliographical references"

Contents:
1 INTRODUCTION
1.1 TASK DESCRIPTION
1.2 NEL APPLICATIONS
1.3 NEL RESOURCES
1.3.1 Knowledge Bases
1.3.2 Datasets
1.4 CHALLENGES AND CONTRIBUTIONS
1.5 OUTLINE OF THE THESIS
2 STATE OF THE ART
2.1 EARLY NAMED ENTITY LINKING SYSTEMS
2.1.1 Non-Global disambiguation approaches
2.1.2 Global disambiguation approaches
2.2 STATE OF THE ART SYSTEMS
2.2.1 Graphical model approaches
2.2.1.1 Graph greedy optimization
2.2.1.2 PageRank
2.2.1.3 Markov models
2.2.2 Ranking approaches
2.2.2.1 Heuristics approaches
2.2.2.2 Machine learning approaches
2.2.2.2.1 SVM-based
2.2.2.2.2 Neural Networks
2.2.2.2.3 Ensemble trees
2.3 SUMMARY
3 NAMED ENTITY LINKING: THE PROPOSED MODEL
3.1 CANDIDATE GENERATION
3.2 CANDIDATE DISAMBIGUATION
3.2.1 NEL Features
3.2.1.1 Popularity Features
3.2.1.2 String Similarity Features
3.2.1.3 Entity Class Feature
3.2.1.4 Context Similarity Features
3.2.1.4.1 Building the Context Modelling Resources
3.2.1.4.1.1 Data Preparation and Pre-processing
3.2.1.4.1.2 Word and Entity Vectors Training
3.2.2 Learning to Rank
3.2.2.1 Gradient Boosted Regression Trees Classifier (GBRT)
3.2.2.2 Training Setup
3.2.2.2.1 Data Preparation and Training
3.2.2.2.2 System Parameters Tuning
3.2.2.3 Two Step Testing
4 EXPERIMENTAL RESULTS AND ANALYSIS
4.1 DATASETS AND METRICS
4.1.1 Datasets
4.1.2 NEL Evaluation Metrics
4.2 NEL SYSTEM RESULTS
4.2.1 Basic Configuration Results
4.2.2 Skip-Gram Parameters Effect
4.2.2.1 Wider Context
4.2.2.2 Higher Vector Dimensionality
4.2.3 Average Classifier Results
4.2.4 State Of The Art Comparison
4.3 INCREMENTAL LEARNING
4.3.1 Incremental learning setup
4.3.1.1 Incremental token vectors learning
4.3.1.2 NEL ranker training and testing setup
4.3.2 Incremental learning results
5 CONFIDENCE SCORING
5.1 CONFIDENCE SCORER
5.2 CONFIDENCE OUTPUT ANALYSIS
6 CONCLUSION AND FUTURE WORK
7 REFERENCES
8 APPENDICES
APPENDIX 1 ADDITIONAL EXPERIMENTAL RESULTS
APPENDIX 2 HIERARCHICAL SOFTMAX

Abstract:
Named Entity Linking is the task of disambiguating entities in natural language text by linking them to their relevant entries in a knowledge base. State-of-the-art systems lack two features that could be important in an industrial setting. First, they do not provide a confidence value associated with the output links. Second, their ability to cope with the daily incremental increase of entities has not yet been tested. The main contribution of the presented work is a system that tackles both problems while providing performance comparable to the state of the art.
Following recent state-of-the-art methods, we developed a ranking approach for the Named Entity Linking task. In addition to entity popularity, string similarity, and Named Entity Recognition based features, we incorporated features that capture the similarity between the candidate entities and the input text. These features were derived from entity and word token vectors, which in turn were trained using the “Skip-Gram” model on the English Wikipedia. Our results are comparable to state-of-the-art methods, with micro and macro linking accuracies of 89.5% and 90.3%, respectively, on the “AIDA” test set. Furthermore, our incremental learning approach proved effective, achieving micro and macro accuracies of 84.8% and 84%, respectively, using less than 15% of the training data. The results also revealed the critical importance of the features derived from token vectors in such incremental learning scenarios.
Our experimental results also showed that the system can provide results comparable to the full system using only language-independent tools and resources, making it portable to any language available in Wikipedia.
Moreover, a confidence scoring approach was applied: a logistic classifier gives a confidence value for the first-ranked entity given the ranking scores of a predetermined number of the best-ranked candidate entities. Analysis of the scorer's precision, recall, and F-score curves on a test set shows that it successfully captures the confidence values.
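
Note: the confidence scoring step described in the abstract could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation. The function names, the choice of k = 3, the padding rule for mentions with fewer than k candidates, the plain gradient-descent training loop, and the toy ranking scores are all hypothetical. The only idea taken from the abstract is that a logistic classifier maps the ranking scores of the best-ranked candidates to a confidence value for the first-ranked entity.

```python
import math

def top_k_features(scores, k=3):
    """Keep the k highest ranking scores, highest first.
    Assumption: shorter candidate lists are padded with their lowest score."""
    ordered = sorted(scores, reverse=True)[:k]
    pad = ordered[-1] if ordered else 0.0
    return ordered + [pad] * (k - len(ordered))

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression weights with batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def confidence(w, b, scores, k=3):
    """Estimated probability that the first-ranked candidate is the correct link."""
    x = top_k_features(scores, k)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Invented toy data: (top-3 ranking scores, 1 if the top candidate was correct).
TOY_TRAIN = [
    ([0.90, 0.10, 0.05], 1), ([0.80, 0.20, 0.10], 1), ([0.95, 0.30, 0.10], 1),
    ([0.50, 0.48, 0.45], 0), ([0.40, 0.39, 0.30], 0), ([0.55, 0.50, 0.20], 0),
]

w, b = train_logistic([f for f, _ in TOY_TRAIN], [l for _, l in TOY_TRAIN])
print(confidence(w, b, [0.9, 0.1]))         # top candidate far ahead of the rest
print(confidence(w, b, [0.5, 0.49, 0.4]))   # near tie among candidates
```

On this toy data, a mention whose top candidate is well separated from the runners-up receives a higher confidence than a near tie. A downstream application could threshold the value to trade precision against recall, which is the kind of analysis the abstract reports.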

Text in English, abstracts in English.
