Sequence to Sequence Learning for Unconstrained Scene Text Recognition / (Record no. 9065)

MARC details
000 -LEADER
fixed length control field 10257nam a22002537a 4500
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field 210830s2015 ||||f mb|| 00| 0 eng d
040 ## - CATALOGING SOURCE
Original cataloging agency EG-CaNU
Transcribing agency EG-CaNU
041 0# - Language Code
Language code of text eng
Language code of abstract eng
082 ## - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number 610
100 0# - MAIN ENTRY--PERSONAL NAME
Personal name Ahmed Mamdouh Abd el-kariem 
245 1# - TITLE STATEMENT
Title Sequence to Sequence Learning for Unconstrained Scene Text Recognition /
Statement of responsibility, etc. Ahmed Mamdouh Abd el-kariem 
260 ## - PUBLICATION, DISTRIBUTION, ETC.
Date of publication, distribution, etc. 2015
300 ## - PHYSICAL DESCRIPTION
Extent 49 p.
Other physical details ill.
Dimensions 21 cm.
500 ## - GENERAL NOTE
General note Supervisor: Mohamed A. El-Helw
502 ## - Dissertation Note
Dissertation type Thesis (M.A.)--Nile University, Egypt, 2015.
504 ## - Bibliography
Bibliography "Includes bibliographical references"
505 0# - Contents
Formatted contents note Contents:
Acknowledgements iv
Abstract vi
Keywords vi
List of Figures vii
Introduction 1
1.1 Motivation 1
1.2 Objectives 1
Background 3
2.1 Convolutional Neural Networks (CNNs) 4
2.2 Convolutional Neural Networks 4
2.2.1 Convolutional Layer 5
2.2.2 Pooling Layer 5
2.3 Logistic Regression - Softmax 6
2.3.1 Binary Classification 7
2.3.2 Multiclass Classification 7
2.4 Long Short-Term Memory - LSTM 10
2.4.1 Recurrent Neural Networks 10
2.4.2 Constant Error Carrousels 12
State-of-the-Art Approaches 14
3.1 Lexicon-Based CNN Model 14
3.2 Character Sequence Encoding 15
3.3 N-gram Encoding 15
3.4 The Joint Model 16
3.5 Sequence-to-Sequence Learning with Neural Networks 17
Sequence-to-Sequence Learning for Unconstrained Scene Text Recognition 18
4.1 Arbitrary Length Sequence to Sequence Modeling 18
4.2 Training 19
4.3 Lasagne 20
Experiments 21
5.1 Extending CNN Model with LSTM for error correction 21
5.2 LSTM Architecture Experiments 23
5.2.1 The models' architecture 23
5.3 Extending CNN Model with optimized LSTM Architecture for error correction 28
5.4 Generalisation Experiment 31
Traffic sense 33
6.1 Extracting suitable corner points and their trajectories 33
6.2 Initial Clustering 34
6.3 Adaptive Background Construction 34
6.4 Extracting bounding boxes around vehicles 35
6.5 Experimental results and analysis 35
Conclusion 37
7.1 Achieved results 37
7.2 Future development 37
Appendix A 38
Appendix B 39
Datasets 39
References 41
520 3# - Abstract
Abstract In this work we present a state-of-the-art approach for unconstrained natural scene text recognition. We propose a cascade approach that incorporates a convolutional neural network (CNN) architecture followed by a long short-term memory (LSTM) model. The CNN learns visual features for the characters and uses them with a softmax layer to detect a sequence of characters. While the CNN gives very good recognition results, it does not model the relations between characters, and hence gives rise to false positive and false negative cases (confusing characters due to visual similarities, like "g" and "9", or confusing background patches with characters, either removing existing characters or adding non-existing ones). To alleviate these problems, we leverage recent developments in LSTM architectures to encode contextual information. We show that the LSTM can dramatically reduce such errors and achieve state-of-the-art accuracy in the task of unconstrained natural scene text recognition. Moreover, we manually remove all occurrences of the words that exist in the test set from our training set, to test whether our approach will generalize to unseen data. We use the ICDAR 13 test set for evaluation and compare the results with the state-of-the-art approaches [11, 18]. We also present an algorithm for traffic monitoring.
Keywords
CNN: Convolutional Neural Network
LSTM: Long Short-Term Memory
SVM: Support Vector Machine
HOG: Histogram of Oriented Gradients
ICDAR: International Conference on Document Analysis and Recognition
RNN: Recurrent Neural Network
BPTT: Backpropagation Through Time
FNN: Feedforward Neural Network
BLSTML: Bidirectional Long Short-Term Memory Layer
OCR: Optical Character Recognition
SIFT: Scale-Invariant Feature Transform
JOINT-CNN: A model that joins the character sequence encoding model with the n-gram model
JOINT-LSTM: A model that joins the output of our proposed model with the n-gram model
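The cascade the abstract describes (a CNN emitting per-position character scores through a softmax layer, with an LSTM on top adding the inter-character context the CNN alone lacks) can be illustrated in a few lines. The following is a minimal sketch in modern PyTorch, not the thesis implementation (which used Lasagne); every layer size, the 37-way class count, and all names are assumptions made for illustration only.

# Illustrative sketch (not the thesis code): a CNN produces per-timestep
# character features from a word image; an LSTM then models dependencies
# between characters to correct visually ambiguous predictions ("g" vs "9").
import torch
import torch.nn as nn

NUM_CLASSES = 37  # 26 letters + 10 digits + a no-character class -- an assumption

class CNNLSTMRecognizer(nn.Module):
    def __init__(self, num_classes=NUM_CLASSES, hidden=256):
        super().__init__()
        # Small convolutional feature extractor (architecture is hypothetical).
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # The LSTM consumes the CNN's column features left to right, encoding
        # contextual information between neighboring character positions.
        self.lstm = nn.LSTM(input_size=128 * 8, hidden_size=hidden,
                            batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, num_classes)  # softmax over characters

    def forward(self, images):                # images: (B, 1, 32, W) grayscale words
        f = self.cnn(images)                  # (B, 128, 8, W/4)
        f = f.permute(0, 3, 1, 2).flatten(2)  # (B, W/4, 128*8): one step per column
        out, _ = self.lstm(f)                 # contextualized per-step features
        return self.classifier(out)           # logits; softmax applied in the loss

model = CNNLSTMRecognizer()
logits = model(torch.randn(2, 1, 32, 128))    # two dummy 32x128 word images
print(logits.shape)                           # torch.Size([2, 32, 37])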
546 ## - Language Note
Language Note Text in English, abstract in English.
650 #4 - Subject
Subject Informatics-IFM
655 #7 - Index Term-Genre/Form
Source of term NULIB
Focus term Dissertation, Academic
690 ## - Subject
School Informatics-IFM
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Source of classification or shelving scheme Dewey Decimal Classification
Koha item type Thesis
Holdings
Source of classification or shelving scheme: Dewey Decimal Classification
Home library: Main library
Current library: Main library
Date acquired: 08/30/2021
Full call number: 610/ A.M.S 2015
Date last seen: 08/30/2021
Price effective from: 08/30/2021
Koha item type: Thesis