Sequence to Sequence Learning for Unconstrained Scene Text Recognition / (Record no. 9065)
| 000 - LEADER | |
|---|---|
| fixed length control field | 10257nam a22002537a 4500 |
| 008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION | |
| fixed length control field | 210830s2015 ||||f mb|| 00| 0 eng d |
| 040 ## - CATALOGING SOURCE | |
| Original cataloging agency | EG-CaNU |
| Transcribing agency | EG-CaNU |
| 041 0# - Language Code | |
| Language code of text | eng |
| Language code of abstract | eng |
| 082 ## - DEWEY DECIMAL CLASSIFICATION NUMBER | |
| Classification number | 610 |
| 100 0# - MAIN ENTRY--PERSONAL NAME | |
| Personal name | Ahmed Mamdouh Abd el-kariem |
| 245 1# - TITLE STATEMENT | |
| Title | Sequence to Sequence Learning for Unconstrained Scene Text Recognition / |
| Statement of responsibility, etc. | Ahmed Mamdouh Abd el-kariem |
| 260 ## - PUBLICATION, DISTRIBUTION, ETC. | |
| Date of publication, distribution, etc. | 2015 |
| 300 ## - PHYSICAL DESCRIPTION | |
| Extent | 49 p. |
| Other physical details | ill. |
| Dimensions | 21 cm. |
| 500 ## - GENERAL NOTE | |
| General note | Supervisor: Mohamed A. El-Helw |
| 502 ## - Dissertation Note | |
| Dissertation type | Thesis (M.A.) -- Nile University, Egypt, 2015. |
| 504 ## - Bibliography | |
| Bibliography | "Includes bibliographical references" |
| 505 0# - Contents | |
| Formatted contents note | Contents:<br/>Abstract<br/>Keywords<br/>List of Figures<br/>Introduction<br/>1.1 Motivation<br/>1.2 Objectives<br/>Background<br/>2.1 Convolutional Neural Networks (CNNs)<br/>2.2 Convolutional Neural Networks<br/>2.2.1 Convolutional Layer<br/>2.2.2 Pooling Layer<br/>2.3 Logistic Regression - Softmax<br/>2.3.1 Binary Classification<br/>2.3.2 Multiclass Classification<br/>2.4 Long Short-Term Memory - LSTM<br/>2.4.1 Recurrent Neural Networks<br/>2.4.2 Constant Error Carrousels<br/>State-of-the-Art Approaches<br/>3.1 Lexicon-Based CNN Model<br/>3.2 Character Sequence Encoding<br/>3.3 N-gram Encoding<br/>3.4 The Joint Model<br/>3.5 Sequence-to-Sequence Learning with Neural Networks<br/>Sequence-to-Sequence Learning for Unconstrained Scene Text Recognition<br/>4.1 Arbitrary Length Sequence to Sequence Modeling<br/>4.2 Training<br/>4.3 Lasagne<br/>Experiments<br/>5.1 Extending CNN Model with LSTM for error correction<br/>5.2 LSTM Architecture Experiments<br/>5.2.1 The models' architecture<br/>5.3 Extending CNN Model with optimized LSTM Architecture for error correction<br/>5.4 Generalisation Experiment<br/>Traffic sense<br/>6.1 Extracting suitable corner points and their trajectories<br/>6.2 Initial Clustering<br/>6.3 Adaptive Background Construction<br/>6.4 Extracting bounding boxes around vehicles<br/>6.5 Experimental results and analysis<br/>Conclusion<br/>7.1 Achieved results<br/>7.2 Future development<br/>Appendix A<br/>Appendix B<br/>Datasets<br/>References |
| 520 3# - Abstract | |
| Abstract | In this work we present a state-of-the-art approach for unconstrained natural scene text recognition. We propose a cascade approach that incorporates a convolutional neural network (CNN) architecture followed by a long short-term memory (LSTM) model. The CNN learns visual features for the characters and uses them with a softmax layer to detect sequences of characters. While the CNN gives very good recognition results, it does not model the relations between characters, and hence gives rise to false positives and false negatives: it confuses characters due to visual similarities, such as “g” and “9”, or confuses background patches with characters, either removing existing characters or adding non-existing ones. To alleviate these problems we leverage recent developments in LSTM architectures to encode contextual information. We show that the LSTM can dramatically reduce such errors and achieve state-of-the-art accuracy in the task of unconstrained natural scene text recognition. Moreover, we manually remove all occurrences of the words that exist in the test set from our training set to test whether our approach will generalize to unseen data. We use the ICDAR 13 test set for evaluation and compare the results with the state-of-the-art approaches [11, 18]. We also present an algorithm for traffic monitoring.<br/>Keywords<br/>CNN: Convolutional Neural Networks<br/>LSTM: Long Short-Term Memory<br/>SVM: Support Vector Machines<br/>HOG: Histogram of Oriented Gradients<br/>ICDAR: International Conference on Document Analysis and Recognition<br/>RNN: Recurrent Neural Networks<br/>BPTT: Backpropagation Through Time<br/>FNN: Feedforward Neural Networks<br/>BLSTML: Bidirectional Long Short-Term Memory Layer<br/>OCR: Optical Character Recognition<br/>SIFT: Scale-Invariant Feature Transform<br/>JOINT-CNN: A model that joins the character sequence encoding model with the n-gram model<br/>JOINT-LSTM: A model that joins the output of our proposed model with the n-gram model |
| 546 ## - Language Note | |
| Language Note | Text in English, abstracts in English. |
| 650 #4 - Subject | |
| Subject | Informatics-IFM |
| 655 #7 - Index Term-Genre/Form | |
| Source of term | NULIB |
| Focus term | Dissertation, Academic |
| 690 ## - Subject | |
| School | Informatics-IFM |
| 942 ## - ADDED ENTRY ELEMENTS (KOHA) | |
| Source of classification or shelving scheme | Dewey Decimal Classification |
| Koha item type | Thesis |
| Withdrawn status | Lost status | Source of classification or shelving scheme | Damaged status | Not for loan | Home library | Current library | Date acquired | Total Checkouts | Full call number | Date last seen | Price effective from | Koha item type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | Dewey Decimal Classification | | | Main library | Main library | 08/30/2021 | | 610/ A.M.S 2015 | 08/30/2021 | 08/30/2021 | Thesis |