
Series 13

ASIAN STUDIES. AFRICAN STUDIES.

Issue 1, 2014

CONTENTS

Section LINGUISTICS
Codes UDC 811.411.21 Page 14-22
Title Formation of text corpus and frequency definition for the words in the Arabic language: problems and solutions
Author 1 Redkin Oleg I. St. Petersburg State University
199034, St. Petersburg, Russian Federation
Doctor of Philological Sciences, Professor
e-mail: oleg_redkin@mail.ru
Summary Although corpus building for the Indo-European languages, including Russian, is better developed than for other languages, Arabic in particular, the problem is still far from a final solution. The article discusses problems and solutions in building an Arabic corpus from Internet material and other available sources, and identifies the principles of data selection. It also considers the results of compiling a frequency dictionary of Arabic, as well as peculiarities of Arabic phonology, morphology and script, and examines some features of stress in Arabic. The article includes a list of the most common Arabic words with their frequency indices.
Keywords Arabic, corpus, computer, data, proceeding, frequency, dictionary.