Vol. 1 No. 1 (2022): COMPUTATIONAL LINGUISTICS: PROBLEMS, SOLUTIONS, PROSPECTS
Articles

UZBEK TEXT ANALYSIS USING ZIPF DISTRIBUTION

Published 2022-05-19

Keywords

  • Uzbek text
  • Zipf’s law
  • mathematical statistics
  • word frequency

Abstract

The frequency distribution of words has long been a key object of study
in statistical linguistics. This article presents a word-level analysis of three works in the
Uzbek language based on a mathematical-statistical law, Zipf’s law. The word frequency
distribution of each document is calculated according to Zipf’s law. The results for the
individual documents are compared with one another and presented in graphs.
The article shows that human language has a highly complex yet reliable structure in its
frequency distribution. Some empirical phenomena related to word frequencies are
then reviewed. These facts are chosen because they are informative about the mechanisms
that give rise to Zipf’s law, and they are then used to evaluate many of the theoretical
explanations of Zipf’s law in language.
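
As an illustration of the kind of calculation the abstract describes, the following minimal sketch builds a rank-frequency table for a text and estimates a Zipf exponent s, under the usual statement of the law that the frequency of the word at rank r is roughly proportional to 1/r^s. The abstract does not state how the texts were tokenized or how the fit was made, so everything here is an assumption: the function names rank_frequency and zipf_exponent are hypothetical, tokenization is a simple lowercase split on letter runs (keeping word-internal apostrophes, as in Uzbek Latin spellings such as o'zbek), and the exponent is an ordinary least-squares slope on the log-log scale.

```python
import math
import re
from collections import Counter

# Assumption: a word is a run of letters, optionally joined by apostrophes
# (covers Uzbek Latin forms such as "o'zbek"). Not the article's tokenizer.
WORD_RE = re.compile(r"[^\W\d_]+(?:['’‘][^\W\d_]+)*")


def rank_frequency(text):
    """Return a list of (rank, word, count), most frequent word first."""
    counts = Counter(WORD_RE.findall(text.lower()))
    # Sort by descending count; break ties alphabetically for determinism.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ordered, start=1)]


def zipf_exponent(table):
    """Estimate s in count ~ C / rank**s by least squares on log-log data."""
    if len(table) < 2:
        raise ValueError("need at least two distinct words to fit a slope")
    xs = [math.log(rank) for rank, _, _ in table]
    ys = [math.log(count) for _, _, count in table]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The fitted slope is negative for Zipf-like data; report s as positive.
    return -sxy / sxx


if __name__ == "__main__":
    sample = "kitob kitob kitob o'zbek o'zbek til"
    for rank, word, count in rank_frequency(sample):
        print(rank, word, count)
    print("estimated exponent:", zipf_exponent(rank_frequency(sample)))
```

Running the two functions on each of several documents and comparing the estimated exponents is one simple way to set the documents side by side. A least-squares fit on log-log data is only a rough estimator, and a different fitting procedure may give different values.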