Automatic Detection of Shadda in Modern Standard Arabic Continuous Speech

Ammar Al-Sabri (1), Afzan Adam (2), Fadhilah Rosdi (3)
How to cite (IJASEIT):
Al-Sabri, Ammar, et al. “Automatic Detection of Shadda in Modern Standard Arabic Continuous Speech”. International Journal on Advanced Science, Engineering and Information Technology, vol. 8, no. 4-2, Sept. 2018, pp. 1810-9, doi:10.18517/ijaseit.8.4-2.6813.
The presence of the diacritic Shadda in Arabic continuous speech can reduce the accuracy of automatic Word Boundary Detection (WBD), because one word may be wrongly detected as two. This in turn affects the accuracy of Automatic Speech Recognition (ASR) systems that rely on WBD. Shadda is one of the essential characteristics of the Arabic language and represents consonant doubling. This paper proposes a method for the automatic detection of Shadda in Modern Standard Arabic (MSA) continuous speech, with the aim of improving the accuracy of WBD. Three prosodic features, namely Short Time Energy (STE), Fundamental Frequency, and Intensity, were investigated for their ability to detect the Shadda pattern in continuous MSA speech. We analyzed the features by implementing a separate algorithm for each one to detect the Shadda pattern automatically. In addition, a new method that combines STE and Intensity was introduced. The dataset is a 1-hour collection of TV broadcast news from the Aljazeera Arabic TV channel for 2018 - broadcasters. We found that the Shadda pattern is very similar to unvoiced regions of speech, which poses a major challenge for improving WBD using Shadda. Results showed that detecting Shadda using Short Time Energy and Intensity outperforms the Fundamental Frequency with 55% accuracy. Intensity achieved 71.5% accuracy. The combination of Intensity and STE achieved good results, with 67.15% accuracy, and the number of false positives was also reduced compared to Intensity alone.
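The abstract does not spell out the detection algorithms, so the following is only a minimal sketch of how a combined STE and Intensity detector might look. It assumes that a Shadda candidate appears as a short run of low-energy frames flanked by louder speech; that assumption, along with every function name, threshold, and frame size below, is hypothetical and not taken from the paper.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into frames of frame_len; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Short Time Energy: sum of squared amplitudes in one frame."""
    return sum(x * x for x in frame)

def intensity_db(frame, eps=1e-12):
    """Mean power of the frame in decibels (eps avoids taking log of zero)."""
    return 10.0 * math.log10(short_time_energy(frame) / len(frame) + eps)

def low_runs(flags, min_len, max_len):
    """Return (start, end) frame indices of runs of True whose length lies in
    [min_len, max_len] and that do not touch the first or last frame."""
    runs, start = [], None
    for i, flag in enumerate(flags + [False]):  # sentinel closes a trailing run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            length = i - start
            if start > 0 and i < len(flags) and min_len <= length <= max_len:
                runs.append((start, i - 1))
            start = None
    return runs

def detect_candidates(samples, frame_len=4, hop=4,
                      ste_thresh=0.1, db_thresh=-20.0,
                      min_len=1, max_len=3):
    """Flag frames where BOTH features are low (hypothetical thresholds),
    then keep only short dips surrounded by louder frames."""
    frames = frame_signal(samples, frame_len, hop)
    flags = [short_time_energy(f) < ste_thresh and intensity_db(f) < db_thresh
             for f in frames]
    return low_runs(flags, min_len, max_len)

# Toy signal: loud segment, brief quiet closure, loud segment.
loud = [1.0, -1.0] * 4
quiet = [0.01, -0.01] * 4
toy = loud + quiet + loud
print(detect_candidates(toy))
```

Requiring both features to be low is one plausible way to reduce false positives relative to a single feature, which is the effect the abstract reports for the combination. Because the Shadda pattern resembles unvoiced regions of speech, a threshold-based sketch like this would also flag many unvoiced segments.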

