COMPUTER LINGUISTICS

COMPUTATIONAL LINGUISTICS: A STEP TOWARDS DEVELOPMENT

 

 

 

A computer-ready form of the Uzbek language, adapted to artificial intelligence, is a guarantee of the language's national development and survival.

 

Today, raising literacy in our country, strengthening the status of our state language, and building the prestige of the Uzbek language in the world community have become priorities of state policy.

In this regard, Decree No. PF-5850 of the President of the Republic of Uzbekistan dated October 21, 2019, "On measures to radically increase the prestige and status of the Uzbek language as the state language", has accelerated this work. As noted in the decree, the Uzbek language is actively used in the political, legal, socio-economic, spiritual and educational spheres and is widely heard at international forums. Interest in our language and its study is growing in foreign countries.

The tasks set out in the Decree are therefore particularly urgent. They include radically raising the prestige of the Uzbek language in the social life of our people and internationally, enhancing the role and prestige of the state language, and ensuring a worthy place for the Uzbek language in information and communication technologies, in particular on the Internet. Today, how actively a natural language is used depends on how well it is reflected in digital technologies and on whether a formal description of it has been created.

Computational Linguistics at the Tashkent State University of Uzbek Language and Literature named after Alisher Navoi

To fulfil these tasks, the Tashkent State University of Uzbek Language and Literature named after Alisher Navoi opened a master's program in 5A120106 - "Computer Linguistics" in 2020 and admitted 13 master's students.

Computer linguistics is a field of education aimed at solving language problems through computer programs. Its main focus is on creating a formal grammar of the Uzbek language and a linguistic database covering all language-related phenomena, so that linguistic rules take on a formal form. The role of the computer linguist in translating natural language into a form a machine can process is therefore invaluable.

Uzbek computer linguistics took shape in the late twentieth century, initially through scientific research and the creation of a number of frequency dictionaries, compiled with software that counts the words used most often in periodicals and in works of fiction. By the 2020s it had set out on its own path of development, with research that yields practical results, the creation of various linguistic programs (automatic editing and analysis, transliteration, mobile applications), and the opening of a master's program in "Computer Linguistics" in higher education institutions of the Republic (Tashkent State University).
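The frequency dictionaries mentioned above rest on a simple idea: count how often each word form occurs in a body of text. The following is a minimal Python sketch of that idea, not the software used in the research described here. The tokenizer pattern, the function name frequency_dictionary, and the two sample sentences are assumptions invented for illustration.

```python
import re
from collections import Counter

# Assumed tokenizer: a run of letters, optionally continued by
# apostrophe-like marks, so that spellings such as o'zbek or tog'
# stay in one piece. Real corpora need more careful handling.
WORD_RE = re.compile(r"[^\W\d_]+(?:['ʻʼ’‘][^\W\d_]*)*")


def frequency_dictionary(texts):
    """Count lowercased word forms across an iterable of text strings."""
    counts = Counter()
    for text in texts:
        counts.update(WORD_RE.findall(text.lower()))
    return counts


# Invented sample sentences, standing in for periodicals and fiction.
sample_texts = [
    "Til millatning ko'zgusi. Til bor ekan, millat bor.",
    "O'zbek tili davlat tili.",
]

freq = frequency_dictionary(sample_texts)

# Print the most frequent word forms with their counts.
for word, count in freq.most_common(3):
    print(word, count)
```

This sketch counts surface word forms only. A full frequency dictionary would usually also group inflected forms such as til and tili under one headword, which is not attempted here.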

 

[Figures: Uzbek computer linguistics in the 2010s and the 2020s]

 

 

At our university, disciplines such as "Computer Linguistics", "Machine Translation", "Natural Language Processing / NLP", "Python Programming Language" and "Database" are taught on the basis of two modules: linguistic knowledge and software engineering. This integration of information technology and linguistics is paying off: students are gaining the skills and competencies to build linguistic databases suited to the purposes of these disciplines.

The online translation services of Google and Yandex support Uzbek, but because the language is not fully represented in these translation systems, translations often lack clarity and do not preserve the expression of the original. Eliminating the cause of this problem takes a tremendous amount of work. The following issues in computer linguistics are therefore a priority at our university:

- formation of digital linguistics;

- creation of a national language corpus;

- creation of a language teaching system;

- new-generation electronic dictionaries;

- e-learning literature;

- high-quality translation programs;

- the ability to edit and analyze Uzbek texts in the Word text editor.

Both scientific and practical research is being carried out to implement these tasks, and the efforts are bearing fruit. In particular, a program for editing and analyzing text has been created for the Ubuntu Linux operating system. Research is currently underway to incorporate its linguistic database into Microsoft Office.

A number of practical projects are underway in line with the priorities identified in the Resolution of the President of the Republic of Uzbekistan dated February 17, 2021 "On measures to create conditions for the accelerated introduction of artificial intelligence technologies" and in the Strategy "Digital Uzbekistan - 2030", with the aim of accelerating the creation of artificial intelligence in the social sphere.

Our achievements

As a result of the practical project "Development of Uzbek-based speaking software and voice synthesizer for the visually impaired using computer technology, reading and writing texts", an Uzbek speech synthesizer (Text to Speech / TTS) was created.

This speech synthesizer automatically voices written texts. Its creation is a first in Uzbek computational linguistics. It will help blind people work independently with electronic texts in Uzbek and will support their employment.

Today the effective use of the rich expressive potential of the Uzbek language has become one of the priorities of state policy. One of the main hallmarks of speech culture is correct pronunciation. Problems in this area show up in the pronunciation of words borrowed from languages with free stress. Borrowed words tend to be reshaped according to the nature and rules of the borrowing language, and this is how they settle into its lexical layer.

For example: monosemiya (in Russian) - monosemiya (in Uzbek), muskul (in Russian) - muskul (in Uzbek). Retaining the stress of the source language can prevent an international word from being fully assimilated. Taking such aspects into account, the university's computer linguists have prepared for publication the two-volume "Accent Dictionary of Uzbek words", which includes 11,000 words widely used in various fields and on social networks.

This dictionary plays a special role in eliminating speech errors caused by the misplaced stress of words in the Uzbek language. It serves as a valuable lexicographic source for developing speech corpora and linguistic models for Uzbek speech synthesis and speech recognition software and for automatic analysis and automatic translation programs. It will also play an important role in retraining courses organized to improve the speech culture of TV and radio presenters.

Another such project is the grant project "Creation of the educational corpus of the Uzbek language" for 2020-2022, which is being implemented on the basis of tasks set by the scientific community of the university. As a result of this project, an educational corpus will be created that will serve as a general educational tool for the Third Renaissance. The focus is on creating a modern electronic textbook, 10 types of philological dictionaries, explanatory dictionaries of untranslatable lexical units of the Uzbek language (national and cultural words), lexicographic mobile applications, and multimedia products aimed at developing correct pronunciation skills in Uzbek.

It is known that language teaching requires a lot of work. At present, the creation of innovative educational technologies based on the needs and requirements of the modern student has become a topical issue. As a solution to these problems, active work is underway on a practical project "Creation of a linguodidactical electronic platform of Turkic languages" aimed at teaching Uzbek to Turks, Kazakhs, Kyrgyz, Azerbaijanis and Tatars in their own languages.

The main goal is to present the nature of the Uzbek language to speakers of these Turkic languages through computer technology, to render written and spoken Uzbek conversations in creative graphic designs, and to provide multimedia interpretations of folklore in other Turkic languages on a single platform. The platform is intended both to teach the language and to popularize our spiritual heritage.

Thanks to the targeted online outreach of the university's professors and teachers, our students are now creating and releasing mobile applications on "Navoi ghazals", "Bobur's work", and a number of academic subjects and works of fiction.

 

 

  

"Navoi Ghazals" mobile application

"Bobur Ijodi" mobile application

 

 

In addition, diligent master's students in computer linguistics have created a dedicated site (http://alisher.navoiy-uni.uz/) that reflects the personality and creative work of Alisher Navoi.

Specialists in computer linguistics, as innovative specialists who create and work with philological software, can work as:

- teachers and researchers in higher education institutions of the Republic.

In developing computer linguistics and digital publishing, the university's overarching goal is to bring together the Uzbek language and modern information technologies, to strengthen the position of our national language in the world community, and to further enhance the prestige of our spiritual heritage.

 

 

 

 

Elov Botir Boltaevich - Head of the Department of Information Technology, Tashkent State University of Uzbek Language and Literature named after Alisher Navoi, Doctor of Philosophy (PhD) in Technical Sciences

 

 

 

 

Abjalova Manzura Abdurashetovna - Acting Associate Professor of the Department of Information Technology, Tashkent State University of Uzbek Language and Literature named after Alisher Navoi, Doctor of Philosophy (PhD) in Philology