Text Auto-Tagging Using Wikipedia / (Record no. 8843)

MARC details
000 -LEADER
fixed length control field	06642nam a22002537a 4500
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field	210112b2018 a|||f mb|| 00| 0 eng d
040 ## - CATALOGING SOURCE
Original cataloging agency	EG-CaNU
Transcribing agency	EG-CaNU
041 0# - Language Code
Language code of text	eng
Language code of abstract	eng
082 ## - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number	610
100 0# - MAIN ENTRY--PERSONAL NAME
Personal name	Shaimaa Abdelber Shamseldin Ali
245 1# - TITLE STATEMENT
Title	Text Auto-Tagging Using Wikipedia /
Statement of responsibility, etc.	Shaimaa Abdelber Shamseldin Ali
260 ## - PUBLICATION, DISTRIBUTION, ETC.
Date of publication, distribution, etc.	2018
300 ## - PHYSICAL DESCRIPTION
Extent	95 p.
Other physical details	ill.
Dimensions	21 cm.
500 ## - GENERAL NOTE
Materials specified	Supervisor: Samhaa El-Beltagy
502 ## - Dissertation Note
Dissertation type	Thesis (M.A.)--Nile University, Egypt, 2018.
504 ## - Bibliography
Bibliography	"Includes bibliographical references"
505 0# - Contents
Formatted contents note	Contents:
Chapter 1: Introduction
1.1 Motivation
1.2 Problem definition
1.3 Contributions
1.4 Thesis outline
Chapter 2: Background
2.1 Wikipedia
2.2 Text mining
2.2.1 Text mining application
2.2.2 Text mining pre-processing
2.2.3 Information Retrieval (IR)
2.2.4 Word Sense Disambiguation (WSD)
2.3 Measuring semantic relatedness
2.3.1 Cosine Similarity
2.3.2 The Jaccard Coefficient
2.3.3 Milne and Witten's Wikipedia Link-based Measure (WLM)
2.4 Information retrieval evaluation measures
Chapter 3: Related Work
3.1 Wikify! Linking Documents to Encyclopedia knowledge
3.2 Learning to Link with Wikipedia
3.3 Fast and accurate annotation of short text with Wikipedia pages
3.3.1 Information Stored
3.3.2 Algorithm Applied
3.4 A model for Auto-Tagging of Research Papers based on Keyphrase Extraction Methods
Chapter 4: Design and Implementation
4.1 Design objective
4.2 The proposed approach
4.2.1 Phase 1: Building the concept dictionary
4.2.1.1 Extract needed information from Wikipedia and carry out processing on it
4.2.1.2 Perform entry filtration
4.2.1.3 Measure semantic relatedness
4.2.1.4 Build an inverted index of dictionary entries
4.2.1.5 Perform entry partitioning
4.2.2 Phase 2: Tagging input text
Chapter 5: Evaluation
5.1 Building the evaluation dataset
5.2 Result
5.3 Conclusion
Chapter 6: Conclusion and Future Work
6.1 Summary and Conclusion
6.2 Future Work
List of Abbreviations
References
520 3# - Abstract
Abstract	Because of the large amounts of unstructured text data generated on the Internet, text mining is believed to have high potential for significant developments. An important goal of text mining is to sift through large volumes of text to extract patterns and models that can then be incorporated in intelligent applications, such as automatic text categorizers and named entity recognizers. This dissertation proposes an efficient method for automatically annotating Arabic news stories with tags using Wikipedia. The idea of the system is to use Wikipedia article names, properties, and redirects to build a pool of meaningful tags. Sophisticated and efficient matching methods are then used to detect text fragments in input news stories that correspond to entries in the constructed tag pool. Generated tags represent real-life entities or concepts such as the names of popular places, known organizations, celebrities, etc. These tags can be used indirectly by a news site for indexing, clustering, classification, or statistics generation, or directly to give a news reader an overview of a news story's contents. Evaluation of the system has shown that the tags it generates are better than those generated by MSN Arabic news.
546 ## - Language Note
Language Note	Text in English, abstracts in English.
650 #4 - Subject
Subject	Informatics-IFM
655 #7 - Index Term-Genre/Form
Source of term	NULIB
focus term	Dissertation, Academic
690 ## - Subject
School	Informatics-IFM
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Source of classification or shelving scheme	Dewey Decimal Classification
Koha item type	Thesis

Holdings
Withdrawn status	Lost status	Source of classification or shelving scheme	Damaged status	Not for loan	Home library	Current library	Date acquired	Total Checkouts	Full call number	Date last seen	Price effective from	Koha item type
		Dewey Decimal Classification		Not For Loan	Main library	Main library	01/12/2021		610 / S.A.T / 2018	01/12/2021	01/12/2021	Thesis