WORKSHOP AT INTERSPEECH 2019, GRAZ, AUSTRIA
PLURICENTRIC LANGUAGES IN SPEECH TECHNOLOGY
PLACE/DATE: Graz, Austria, September 14, 2019
ORGANIZERS: Rudolf Muhr, Graz, Austria; Sarmad Hussain, Lahore, Pakistan; Tania Habib, Lahore, Pakistan; Barbara Schuppler, Graz, Austria
CALL FOR PAPERS:
DESCRIPTION AND OBJECTIVES OF THE WORKSHOP:
1. Pluricentric languages (PLCLs) are a common type among the languages of the world. At present, 43 languages have been identified as belonging to this category (see www.pluricentriclanguages.org), among them English, Spanish, Portuguese, Bengali, Hindi and Urdu. These languages are used in at least two nations, where they have an official function and form national varieties of their own with specific linguistic and pragmatic features. In addition to the variation at the level of national standard varieties, there is also so-called “second level variation” at the regional and local level. It is often used in diglossic speech situations, where code switching is a salient feature and two or more varieties are used within the same utterance.
2. The amount of linguistic variation in pluricentric languages is considerable and poses a challenge for speech recognition in particular and human language technology in general.
3. The motivation for the special session is the observation that pluricentric languages have not been dealt with sufficiently. This is particularly the case with the so-called “non-dominant varieties”, which often suffer from a lack of documentation and treatment in speech technology (for details see www.pluricentriclanguages.org). The special session will therefore focus on these varieties, as they share many features with endangered languages.
THE ORGANIZERS WELCOME PAPERS THAT DEAL WITH:
- Speech recognition and development of language resources for under-resourced pluricentric languages and varieties of languages, in particular the so-called non-dominant varieties. Examples include, among others, Scots, Saami, Karelian Finnish, Ruthenian, Kashubian, Tadczik and Frisian, as well as diverse American and African languages: Aymara, Bambara, Fulfulde, Lingala, Malinke, Soninke, Tuareg, Xhosa etc.
- Language and speech resources development (parallel corpora, pronunciation databases, tagging etc.) especially for non-dominant varieties.
- Speech technologies such as speech recognition, text-to-speech and speech-to-speech for the national varieties of pluricentric languages, both at the level of standard varieties and at the level of so-called “informal speech”.
- Empirical studies on the phonetics and phonology of national varieties of different pluricentric languages.
- Speech and language technologies that are able to cope with the variation in pluricentric languages, particularly with respect to non-dominant varieties and under-resourced languages.
- Speech and language processing for code-switched speech in national varieties of pluricentric languages.
LENGTH OF PRESENTATIONS: 15 minutes presentation plus 5 minutes for discussion.