Cite Article

Intelligent Deep Learning Empowered Text Detection Model from Natural Scene Images

BibTeX

@article{IJASEIT15771,
   author = {S. Kiruthika Devi and Subalalitha CN},
   title = {Intelligent Deep Learning Empowered Text Detection Model from Natural Scene Images},
   journal = {International Journal on Advanced Science, Engineering and Information Technology},
   volume = {12},
   number = {3},
   year = {2022},
   pages = {1263--1268},
   keywords = {Deep learning; natural scene images; text detection; text recognition; COCO dataset; CRNN model; CTC loss.},
   abstract = {The scene Text Recognition process has become a hot research topic and a challenging task owing to the complicated background, varying light intensities, colors, font styles, and sizes. Text extraction from natural scene images encompasses two main processes: text detection and text recognition. The latest advancements in Machine Learning (ML) and Deep Learning (DL) concepts can effectually automate the text detection and recognition process by training the model properly. In this view, this paper presents an Automated DL empowered Text Detection model from Natural Scene Images (ADLTD-NSI). The ADLTD-NSI technique includes two important processes: text detection and text recognition. Firstly, a single shot detector (SSD) with Inception-v2 as a baseline model is employed for text detection, an object detector based on the VGG-16 framework for feature map extraction followed by six convolution layers. Secondly, Convolutional Recurrent Neural Network (CRNN) technique is utilized for the text recognition process. Besides, the recurrent layers in the CRNN model utilize long short-term memory (LSTM) for encoding the sequence of feature vectors. Lastly, Connectionist Temporal Classification (CTC) loss is applied to predict text labels equivalent to the sequences from the recurrent layers. A wide range of experiments was carried out on benchmark COCO datasets, and the results are examined in several aspects. The experimental outcomes showcased the better performance of the ADLTD-NSI technique over the other compared methods with a maximum accuracy of 96.78%.},
   issn = {2088-5334},
   publisher = {INSIGHT - Indonesian Society for Knowledge and Human Development},
   url = {http://ijaseit.insightsociety.org/index.php?option=com_content&view=article&id=9&Itemid=1&article_id=15771},
   doi = {10.18517/ijaseit.12.3.15771}
}

EndNote

%0 Journal Article
%A Devi, S. Kiruthika
%A CN, Subalalitha
%D 2022
%T Intelligent Deep Learning Empowered Text Detection Model from Natural Scene Images
%P 1263-1268
%! Intelligent Deep Learning Empowered Text Detection Model from Natural Scene Images
%K Deep learning; natural scene images; text detection; text recognition; COCO dataset; CRNN model; CTC loss.
%X The scene Text Recognition process has become a hot research topic and a challenging task owing to the complicated background, varying light intensities, colors, font styles, and sizes. Text extraction from natural scene images encompasses two main processes: text detection and text recognition. The latest advancements in Machine Learning (ML) and Deep Learning (DL) concepts can effectually automate the text detection and recognition process by training the model properly. In this view, this paper presents an Automated DL empowered Text Detection model from Natural Scene Images (ADLTD-NSI). The ADLTD-NSI technique includes two important processes: text detection and text recognition. Firstly, a single shot detector (SSD) with Inception-v2 as a baseline model is employed for text detection, an object detector based on the VGG-16 framework for feature map extraction followed by six convolution layers. Secondly, Convolutional Recurrent Neural Network (CRNN) technique is utilized for the text recognition process. Besides, the recurrent layers in the CRNN model utilize long short-term memory (LSTM) for encoding the sequence of feature vectors. Lastly, Connectionist Temporal Classification (CTC) loss is applied to predict text labels equivalent to the sequences from the recurrent layers. A wide range of experiments was carried out on benchmark COCO datasets, and the results are examined in several aspects. The experimental outcomes showcased the better performance of the ADLTD-NSI technique over the other compared methods with a maximum accuracy of 96.78%.
%U http://ijaseit.insightsociety.org/index.php?option=com_content&view=article&id=9&Itemid=1&article_id=15771
%R doi:10.18517/ijaseit.12.3.15771
%J International Journal on Advanced Science, Engineering and Information Technology
%V 12
%N 3
%@ 2088-5334

IEEE

S. Kiruthika Devi and Subalalitha CN, "Intelligent Deep Learning Empowered Text Detection Model from Natural Scene Images," International Journal on Advanced Science, Engineering and Information Technology, vol. 12, no. 3, pp. 1263-1268, 2022. [Online]. Available: http://dx.doi.org/10.18517/ijaseit.12.3.15771.

RefMan/ProCite (RIS)

TY  - JOUR
AU  - Devi, S. Kiruthika
AU  - CN, Subalalitha
PY  - 2022
TI  - Intelligent Deep Learning Empowered Text Detection Model from Natural Scene Images
JF  - International Journal on Advanced Science, Engineering and Information Technology
VL  - 12
IS  - 3
Y2  - 2022
SP  - 1263
EP  - 1268
SN  - 2088-5334
PB  - INSIGHT - Indonesian Society for Knowledge and Human Development
KW  - Deep learning; natural scene images; text detection; text recognition; COCO dataset; CRNN model; CTC loss.
N2  - The scene Text Recognition process has become a hot research topic and a challenging task owing to the complicated background, varying light intensities, colors, font styles, and sizes. Text extraction from natural scene images encompasses two main processes: text detection and text recognition. The latest advancements in Machine Learning (ML) and Deep Learning (DL) concepts can effectually automate the text detection and recognition process by training the model properly. In this view, this paper presents an Automated DL empowered Text Detection model from Natural Scene Images (ADLTD-NSI). The ADLTD-NSI technique includes two important processes: text detection and text recognition. Firstly, a single shot detector (SSD) with Inception-v2 as a baseline model is employed for text detection, an object detector based on the VGG-16 framework for feature map extraction followed by six convolution layers. Secondly, Convolutional Recurrent Neural Network (CRNN) technique is utilized for the text recognition process. Besides, the recurrent layers in the CRNN model utilize long short-term memory (LSTM) for encoding the sequence of feature vectors. Lastly, Connectionist Temporal Classification (CTC) loss is applied to predict text labels equivalent to the sequences from the recurrent layers. A wide range of experiments was carried out on benchmark COCO datasets, and the results are examined in several aspects. The experimental outcomes showcased the better performance of the ADLTD-NSI technique over the other compared methods with a maximum accuracy of 96.78%.
UR  - http://ijaseit.insightsociety.org/index.php?option=com_content&view=article&id=9&Itemid=1&article_id=15771
DO  - 10.18517/ijaseit.12.3.15771
ER  - 

RefWorks

RT Journal Article
ID 15771
A1 Devi, S. Kiruthika
A1 CN, Subalalitha
T1 Intelligent Deep Learning Empowered Text Detection Model from Natural Scene Images
JF International Journal on Advanced Science, Engineering and Information Technology
VO 12
IS 3
YR 2022
SP 1263
OP 1268
SN 2088-5334
PB INSIGHT - Indonesian Society for Knowledge and Human Development
K1 Deep learning; natural scene images; text detection; text recognition; COCO dataset; CRNN model; CTC loss.
AB The scene Text Recognition process has become a hot research topic and a challenging task owing to the complicated background, varying light intensities, colors, font styles, and sizes. Text extraction from natural scene images encompasses two main processes: text detection and text recognition. The latest advancements in Machine Learning (ML) and Deep Learning (DL) concepts can effectually automate the text detection and recognition process by training the model properly. In this view, this paper presents an Automated DL empowered Text Detection model from Natural Scene Images (ADLTD-NSI). The ADLTD-NSI technique includes two important processes: text detection and text recognition. Firstly, a single shot detector (SSD) with Inception-v2 as a baseline model is employed for text detection, an object detector based on the VGG-16 framework for feature map extraction followed by six convolution layers. Secondly, Convolutional Recurrent Neural Network (CRNN) technique is utilized for the text recognition process. Besides, the recurrent layers in the CRNN model utilize long short-term memory (LSTM) for encoding the sequence of feature vectors. Lastly, Connectionist Temporal Classification (CTC) loss is applied to predict text labels equivalent to the sequences from the recurrent layers. A wide range of experiments was carried out on benchmark COCO datasets, and the results are examined in several aspects. The experimental outcomes showcased the better performance of the ADLTD-NSI technique over the other compared methods with a maximum accuracy of 96.78%.
LK http://ijaseit.insightsociety.org/index.php?option=com_content&view=article&id=9&Itemid=1&article_id=15771
DO 10.18517/ijaseit.12.3.15771