MARC details
| 000 -LEADER |
| fixed length control field |
05317nam a22002537a 4500 |
| 008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION |
| fixed length control field |
210111b2009 a|||f mb|| 00| 0 eng d |
| 040 ## - CATALOGING SOURCE |
| Original cataloging agency |
EG-CaNU |
| Transcribing agency |
EG-CaNU |
| 041 0# - Language Code |
| Language code of text |
eng |
| Language code of abstract |
eng |
| 082 ## - DEWEY DECIMAL CLASSIFICATION NUMBER |
| Classification number |
627 |
| 100 0# - MAIN ENTRY--PERSONAL NAME |
| Personal name |
Ayat Ahmed Reda Dawood Hatem |
| 245 1# - TITLE STATEMENT |
| Title |
Incorporating Perfect Hashing in the Enhanced Suffix Array to Improve Exact Pattern Matching / |
| Statement of responsibility, etc. |
Ayat Ahmed Reda Dawood Hatem |
| 260 ## - PUBLICATION, DISTRIBUTION, ETC. |
| Date of publication, distribution, etc. |
2009 |
| 300 ## - PHYSICAL DESCRIPTION |
| Extent |
75 p. |
| Other physical details |
ill. |
| Dimensions |
21 cm. |
| 500 ## - GENERAL NOTE |
| Materials specified |
Supervisor: Mohamed Abouelhoda |
| 502 ## - Dissertation Note |
| Dissertation type |
Thesis (M.A.)—Nile University, Egypt, 2009 . |
| 504 ## - Bibliography |
| Bibliography |
"Includes bibliographical references" |
| 505 0# - Contents |
| Formatted contents note |
Contents:<br/>Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1<br/>1.1 Exact Pattern Matching Problem . . . . . . . . . . . . . . . . . . . 1<br/>1.2 Our Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3<br/>1.3 Thesis Organization . . . . . . . . . . . . . . . . . . . . . . . . . . 3<br/>2. Exact Pattern Matching Over Indexing data structures . . . . . . . . . . 4<br/>2.1 Basic Notions and Definitions . . . . . . . . . . . . . . . . . . . . . 4<br/>2.2 Suffix Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4<br/>2.2.1 Basic Definitions . . . . . . . . . . . . . . . . . . . . . . . . 4<br/>2.2.2 Exact Pattern Matching over The Suffix Tree . . . . . . . . 6<br/>2.2.3 Drawbacks of The Suffix Tree . . . . . . . . . . . . . . . . . 6<br/>2.3 Suffix Array . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7<br/>2.3.1 Basic Definitions . . . . . . . . . . . . . . . . . . . . . . . . 7<br/>2.3.2 Exact Pattern Matching Over The Suffix Array . . . . . . . 8<br/>2.3.3 Drawbacks of The Suffix Array . . . . . . . . . . . . . . . . 10<br/>2.4 The Enhanced Suffix Array . . . .<br/>2.4.1 Basic Definitions . . . . . . . . . . . . . . . . . . . . . . . . 11<br/>2.4.2 Exact Pattern Matching Over The Enhanced Suffix Array . 13<br/>3. Minimal Perfect Hash Functions . . . . . . . . . . . . . . . . . . . . . . . 21<br/>3.1 Minimal perfect hash functions (MPHF) . . . . . . . . . . . . . . . 21<br/>3.2 Methods for Constructing Perfect Hashing . . . . . . . . . . . . . . 22<br/>3.2.1 Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . 22<br/>3.2.2 Reducing the search space . . . . . . . . . . . . . . . . . . . 24<br/>3.2.3 Probabilistic perfect hashing . . . . . . . . . . . . . . . . . 26<br/>3.3 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . 27<br/>3.4 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . 28<br/>4. Exact Pattern Matching in O(m) . . . . . . . . . . . . . . . . . . . . . . 32<br/>4.1 Proposed Searching Method . . . . . . . . . . . . . . . . . . . . . . 32<br/>4.2 Implementation Details . . . . . . . . . . . . . . . . . . . . . . . . 36<br/>4.2.1 Lcp-interval tree pruning . . . . . . . . . . . . . . . . . . . 37<br/>4.2.2 Replacing the home table with the index tree . . . . . . . . 38<br/>4.2.3 Binary search instead of sequential search . . . . . . . . . . 40<br/>4.2.4 Lcp-table in Manber Myer’s . . . . . . . . . . . . . . . . . . 40<br/>4.2.5 Other variations . . . . . . . . . . . . . . . . . . . . . . . . 40<br/>5. Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42<br/>5.1 Algorithm variations . . . . . . . . . . . . . . . . . . . . . . . . . . 44<br/>5.2 Running Time for The Different Algorithms . . . . . . . . . . . . . 46<br/>5.2.1 Enumerative Search . . . . . . . . . . . . . . . . . . . . . . 49<br/>5.2.2 Counting Search . . . . . . . . . . . . . . . . . . . . . . . . 54<br/>6. Conclusions and Future Work . . . . . |
| 520 3# - Abstract |
| Abstract |
Abstract:<br/>The exact pattern matching problem is the problem of finding the exact match<br/>for a pattern in a text. This problem is a basic problem in many applications in<br/>computer science. Many algorithms recently presented to solve this problem are<br/>based on indexing data structures. A major indexing data structure used by these<br/>algorithms is the enhanced suffix array, which is basically the suffix array enhanced<br/>with extra tables that give further information. The best theoretical time bound<br/>based on this data structure is O(mlog ||), where m is the pattern length and ||<br/>is the alphabet length. Although this complexity is theoretically considered the best<br/>achieved, it could not practically outperform the performance of the algorithm of<br/>Manber and Myers based only on the suffix array that takes O(mlog n), where n is<br/>the length of the text. In fact, Manber and Myers based algorithm is faster 1.5 up to<br/>6.5 times compared to the other algorithms.<br/>In this thesis, we present a modification to the enhanced suffix array which enables<br/>us to achieve O(m) time to find the pattern, i.e., independent of . The basic idea<br/>behind this improvement is the extensive use of the minimal perfect hashing technique,<br/>by which a static set of n elements can be accessed in O(1) time while maintaining<br/>a linear space. Furthermore, we explored different implementation tricks to speed up<br/>our searching algorithm. Interestingly, our algorithm became faster than Manber and<br/>Myers’ algorithm for certain alphabet size and range of query lengths |
| 546 ## - Language Note |
| Language Note |
Text in English, abstracts in English. |
| 650 #4 - Subject |
| Subject |
Software Engineering |
| 655 #7 - Index Term-Genre/Form |
| Source of term |
NULIB |
| focus term |
Dissertation, Academic |
| 690 ## - Subject |
| School |
Software Engineering |
| 942 ## - ADDED ENTRY ELEMENTS (KOHA) |
| Source of classification or shelving scheme |
Dewey Decimal Classification |
| Koha item type |
Thesis |
| 650 #4 - Subject |
| -- |
211 |
| 655 #7 - Index Term-Genre/Form |
| -- |
187 |
| 690 ## - Subject |
| -- |
211 |