Statistical machine translation proposal for Uzbek to English

Authors

  • Alisher Shakirovich Ismailov, Andijan Machine Building Institute
  • Gulshoda Shamsiyeva, National University of Uzbekistan
  • Nilufar Abdurakhmonova, Tashkent State University of Uzbek Language and Literature named after Alisher Navoi

Keywords:

Machine translation, natural language, statistical machine translation, corpora

Abstract

Machine translation is the automatic translation of text from one natural language into another [1]. It is one of the major and most active areas of natural language processing. The last decade has seen a rise in the use of statistical approaches to machine translation. Statistical machine translation approaches learn translation parameters automatically from aligned text rather than relying on hand-written rules. There has been quite extensive work on statistical machine translation for some language pairs. However, very few research resources are available for the Uzbek to English language pair [2]. In this paper, we propose a statistical machine translation algorithm for Uzbek to English. Developing a statistical machine translation algorithm for the English and Uzbek language pair is an interesting challenge from a number of perspectives. The most important challenge is that English and Uzbek are typologically distant languages. English has very limited morphology, whereas Uzbek is an agglutinative language with very rich and productive derivational and inflectional morphology. A single Uzbek word can therefore correspond to a complete phrase of several words in English. We propose to implement the Uzbek to English statistical machine translation algorithm using a phrase-based model. Moreover, statistical machine translation requires English-Uzbek corpora, so we also briefly describe the development of such corpora.
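
As a rough illustration of what "learning translation parameters from aligned text" can mean in a phrase-based model, the sketch below estimates phrase translation probabilities by relative frequency. It is not the algorithm proposed in the paper: the phrase pairs, the function names `estimate_phrase_table` and `best_translation`, and the toy counts are all assumptions made for this example, and a full phrase-based system would typically also combine such scores with a language model and a reordering model.

```python
from collections import Counter, defaultdict

# Hypothetical (Uzbek, English) phrase pairs, as if already extracted from a
# word-aligned parallel corpus. The data is illustrative only.
PHRASE_PAIRS = [
    ("kitoblarimizda", "in our books"),
    ("kitoblarimizda", "in our books"),
    ("kitoblarimizda", "at our books"),
    ("uyda", "at home"),
    ("uyda", "in the house"),
]


def estimate_phrase_table(pairs):
    """Relative-frequency estimate: p(en | uz) = count(uz, en) / count(uz)."""
    pair_counts = Counter(pairs)
    source_counts = Counter(uz for uz, _ in pairs)
    table = defaultdict(dict)
    for (uz, en), count in pair_counts.items():
        table[uz][en] = count / source_counts[uz]
    return table


def best_translation(table, uz_phrase):
    """Return the most probable English phrase, or None if the phrase is unseen."""
    candidates = table.get(uz_phrase)
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    table = estimate_phrase_table(PHRASE_PAIRS)
    print(best_translation(table, "kitoblarimizda"))
```

The example also shows the morphological point made in the abstract: a single agglutinated Uzbek word such as "kitoblarimizda" is paired with a multi-word English phrase, which is one reason a phrase-level rather than word-level model is a natural fit.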

References

Aripov, M., Sharipbay, A., Abdurakhmonova, N., Razakhova B.: Ontology of grammar rules as example of noun of Uzbek and Kazakh languages. In: Abstract of the VI International Conference “Modern Problems of Applied Mathematics and Information Technology - Al-Khorezmiy 2018”, pp. 37–38, Tashkent, Uzbekistan (2018)

Abduraxmonova, N. Z. "Linguistic support of the program for translating English texts into Uzbek (on the example of simple sentences): Doctor of Philosophy (PhD) il dis. aftoref." (2018).

Abdurakhmonova N. The bases of automatic morphological analysis for machine translation. Izvestiya Kyrgyzskogo gosudarstvennogo tekhnicheskogo universiteta. 2016;2 (38):12-7.

Abdurakhmonova N, Tuliyev U. Morphological analysis by finite state transducer for Uzbek-English machine translation. Foreign Philology: Language, Literature, Education. 2018(3):68.

Abdurakhmonova N, Urdishev K. Corpus based teaching Uzbek as a foreign language. Journal of Foreign Language Teaching and Applied Linguistics (J-FLTAL). 2019;6(1-2019):131-7.

Abdurakhmonov N. Modeling Analytic Forms of Verb in Uzbek as Stage of Morphological Analysis in Machine Translation. Journal of Social Sciences and Humanities Research. 2017;5(03):89-100.

Kubedinova L., Khusainov A., Suleymanov D., Gilmullin R., Abdurakhmonova N. First Results of the TurkLang-7 Project: Creating Russian-Turkic Parallel Corpora and MT Systems. Proceedings of the Computational Models in Language and Speech Workshop (CMLS 2020) co-located with 16th International Conference on Computational and Cognitive Linguistics (TEL 2020). 2020/11: 90-101.

Abdurakhmonova N. Dependency parsing based on Uzbek Corpus. In Proceedings of the International Conference on Language Technologies for All (LT4All) 2019.

A. Ismailov, M. M. A. Jalil, Z. Abdullah and N. H. A. Rahim, "A comparative study of stemming algorithms for use with the Uzbek language," 2016 3rd International Conference on Computer and Information Sciences (ICCOINS), 2016, pp. 7-12, doi: 10.1109/ICCOINS.2016.7783180.

Jalil, Masita & Ismailov, Alisher & Abd Rahim, Noor Hafhizah & Abdullah, Zailani. (2017). The Development of the Uzbek Stemming Algorithm. Advanced Science Letters. 23. 4171-4174. 10.1166/asl.2017.8332.

Published

2021-12-28

How to Cite

Ismailov, A. S., Shamsiyeva, G., & Abdurakhmonova, N. (2021). Statistical machine translation proposal for Uzbek to English. Science and Education, 2(12), 212-219. Retrieved from https://openscience.uz/index.php/sciedu/article/view/2155

Section

Technical Sciences