Incorporating Perfect Hashing in the Enhanced Suffix Array to Improve Exact Pattern Matching / (Record no. 8782)

MARC details
000 -LEADER
fixed length control field 05317nam a22002537a 4500
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field 210111b2009 a|||f mb|| 00| 0 eng d
040 ## - CATALOGING SOURCE
Original cataloging agency EG-CaNU
Transcribing agency EG-CaNU
041 0# - Language Code
Language code of text eng
Language code of abstract eng
082 ## - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number 627
100 0# - MAIN ENTRY--PERSONAL NAME
Personal name Ayat Ahmed Reda Dawood Hatem
245 1# - TITLE STATEMENT
Title Incorporating Perfect Hashing in the Enhanced Suffix Array to Improve Exact Pattern Matching /
Statement of responsibility, etc. Ayat Ahmed Reda Dawood Hatem
260 ## - PUBLICATION, DISTRIBUTION, ETC.
Date of publication, distribution, etc. 2009
300 ## - PHYSICAL DESCRIPTION
Extent 75 p.
Other physical details ill.
Dimensions 21 cm.
500 ## - GENERAL NOTE
Materials specified Supervisor: Mohamed Abouelhoda
502 ## - Dissertation Note
Dissertation type Thesis (M.A.)—Nile University, Egypt, 2009 .
504 ## - Bibliography
Bibliography "Includes bibliographical references"
505 0# - Contents
Formatted contents note Contents:<br/>Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1<br/>1.1 Exact Pattern Matching Problem . . . . . . . . . . . . . . . . . . . 1<br/>1.2 Our Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3<br/>1.3 Thesis Organization . . . . . . . . . . . . . . . . . . . . . . . . . . 3<br/>2. Exact Pattern Matching Over Indexing data structures . . . . . . . . . . 4<br/>2.1 Basic Notions and Definitions . . . . . . . . . . . . . . . . . . . . . 4<br/>2.2 Suffix Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4<br/>2.2.1 Basic Definitions . . . . . . . . . . . . . . . . . . . . . . . . 4<br/>2.2.2 Exact Pattern Matching over The Suffix Tree . . . . . . . . 6<br/>2.2.3 Drawbacks of The Suffix Tree . . . . . . . . . . . . . . . . . 6<br/>2.3 Suffix Array . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7<br/>2.3.1 Basic Definitions . . . . . . . . . . . . . . . . . . . . . . . . 7<br/>2.3.2 Exact Pattern Matching Over The Suffix Array . . . . . . . 8<br/>2.3.3 Drawbacks of The Suffix Array . . . . . . . . . . . . . . . . 10<br/>2.4 The Enhanced Suffix Array . . . .<br/>2.4.1 Basic Definitions . . . . . . . . . . . . . . . . . . . . . . . . 11<br/>2.4.2 Exact Pattern Matching Over The Enhanced Suffix Array . 13<br/>3. Minimal Perfect Hash Functions . . . . . . . . . . . . . . . . . . . . . . . 21<br/>3.1 Minimal perfect hash functions (MPHF) . . . . . . . . . . . . . . . 21<br/>3.2 Methods for Constructing Perfect Hashing . . . . . . . . . . . . . . 22<br/>3.2.1 Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . 22<br/>3.2.2 Reducing the search space . . . . . . . . . . . . . . . . . . . 24<br/>3.2.3 Probabilistic perfect hashing . . . . . . . . . . . . . . . . . 26<br/>3.3 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . 27<br/>3.4 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . 28<br/>4. Exact Pattern Matching in O(m) . . . . . . . . . . . . . . . . . . . . . . 32<br/>4.1 Proposed Searching Method . . . . . . . . . . . . . . . . . . . . . . 32<br/>4.2 Implementation Details . . . . . . . . . . . . . . . . . . . . . . . . 36<br/>4.2.1 Lcp-interval tree pruning . . . . . . . . . . . . . . . . . . . 37<br/>4.2.2 Replacing the home table with the index tree . . . . . . . . 38<br/>4.2.3 Binary search instead of sequential search . . . . . . . . . . 40<br/>4.2.4 Lcp-table in Manber Myer’s . . . . . . . . . . . . . . . . . . 40<br/>4.2.5 Other variations . . . . . . . . . . . . . . . . . . . . . . . . 40<br/>5. Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42<br/>5.1 Algorithm variations . . . . . . . . . . . . . . . . . . . . . . . . . . 44<br/>5.2 Running Time for The Different Algorithms . . . . . . . . . . . . . 46<br/>5.2.1 Enumerative Search . . . . . . . . . . . . . . . . . . . . . . 49<br/>5.2.2 Counting Search . . . . . . . . . . . . . . . . . . . . . . . . 54<br/>6. Conclusions and Future Work . . . . .
520 3# - Abstract
Abstract Abstract:<br/>The exact pattern matching problem is the problem of finding the exact match<br/>for a pattern in a text. This problem is a basic problem in many applications in<br/>computer science. Many algorithms recently presented to solve this problem are<br/>based on indexing data structures. A major indexing data structure used by these<br/>algorithms is the enhanced suffix array, which is basically the suffix array enhanced<br/>with extra tables that give further information. The best theoretical time bound<br/>based on this data structure is O(mlog ||), where m is the pattern length and ||<br/>is the alphabet length. Although this complexity is theoretically considered the best<br/>achieved, it could not practically outperform the performance of the algorithm of<br/>Manber and Myers based only on the suffix array that takes O(mlog n), where n is<br/>the length of the text. In fact, Manber and Myers based algorithm is faster 1.5 up to<br/>6.5 times compared to the other algorithms.<br/>In this thesis, we present a modification to the enhanced suffix array which enables<br/>us to achieve O(m) time to find the pattern, i.e., independent of . The basic idea<br/>behind this improvement is the extensive use of the minimal perfect hashing technique,<br/>by which a static set of n elements can be accessed in O(1) time while maintaining<br/>a linear space. Furthermore, we explored different implementation tricks to speed up<br/>our searching algorithm. Interestingly, our algorithm became faster than Manber and<br/>Myers’ algorithm for certain alphabet size and range of query lengths
546 ## - Language Note
Language Note Text in English, abstracts in English.
650 #4 - Subject
Subject Software Engineering
655 #7 - Index Term-Genre/Form
Source of term NULIB
focus term Dissertation, Academic
690 ## - Subject
School Software Engineering
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Source of classification or shelving scheme Dewey Decimal Classification
Koha item type Thesis
650 #4 - Subject
-- 211
655 #7 - Index Term-Genre/Form
-- 187
690 ## - Subject
-- 211
Holdings
Withdrawn status Lost status Source of classification or shelving scheme Damaged status Not for loan Home library Current library Date acquired Total Checkouts Full call number Date last seen Price effective from Koha item type
    Dewey Decimal Classification   Not For Loan Main library Main library 01/11/2021   627/ A.H.I 2009 01/11/2021 01/11/2021 Thesis