Towards a log-normal model of phonation units lengths distribution in the oral utterances

Research article in the field of Linguistics and Literary Studies

CC BY
Keywords
ACOUSTIC PHONETICS / ORAL SPEECH / SPEECH UTTERANCE / LOG-NORMAL DISTRIBUTION / PHONATION UNIT / PAUSE

Abstract of the research article in linguistics and literary studies; author: Ratnikova E.I.

This article addresses the development of workable approaches to a more robust parametrization of the phonetic and prosodic characteristics of a speech utterance. It presents the results of an experimental study of the distribution of phonation units within speech utterances produced in French, Russian and English in three communicative situations: 1) interview; 2) general oral discussion and 3) a student’s spoken answer in a foreign language test. The analysis of an extensive corpus of recorded speech data showed that a fundamental intrinsic characteristic of a speech utterance, the length of its constituent phonation units, distinctly follows a log-normal distribution irrespective of the communicative situation, the language spoken (native or foreign), the subject and overall duration of the utterance, and the individual manner of the speaker. The findings can be a valuable resource for developing a practicable methodology of speech-utterance analysis, as well as for fine-tuning and improving existing speech-recognition algorithms and approaches.


DISTRIBUTION OF PHONATION UNIT LENGTHS IN THE ORALLY PRODUCED UTTERANCE: A LOG-NORMAL MODEL

The article presents the results of an experimental study of the distribution of phonation unit lengths in orally produced utterances in French, Russian and English in three communicative situations: 1) interview, 2) oral conversation, 3) oral answer in a foreign language examination. Analysis of a statistically representative corpus of oral texts established that a temporal characteristic of the orally produced utterance such as phonation unit length follows a log-normal statistical distribution regardless of the type of situation studied, the individual characteristics of the speakers, the topic and overall duration of the utterance, and whether the utterance is produced in a native or a foreign language. When the quantitative characteristics of the manifestations of a process follow a log-normal distribution, this indicates the stability of the underlying mechanism responsible for producing them. The results of this work can be used in developing a methodology for analysing orally produced utterances, as well as computer algorithms for automatic speech recognition.


DOI: https://doi.org/10.23670/IRJ.2017.57.103


Ratnikova E.I.

PhD in Philology, Lomonosov Moscow State University

TOWARDS A LOG-NORMAL MODEL OF PHONATION UNITS LENGTHS DISTRIBUTION IN THE ORAL UTTERANCES


Introduction

One of the principal temporal-acoustic characteristics of a speech signal is the length of its constituent segments. A number of studies have shown that several such lengths follow a log-normal model: the lengths of filled pauses in spoken French [2], the lengths of unfilled pauses in the speech of people with ataxic dysarthria [4], and the lengths of vowels and consonants in spoken English [3]. In addition, length-related characteristics of certain kinds of written communication have also been found to be log-normally distributed, for example the length of posts on internet forums devoted to various subjects [5]. It is also well known that in many languages the length of speech segments plays a key role in forming the rhythmic structure of an utterance. Further and more thorough analysis of the observed distribution patterns is therefore of considerable scientific interest and should deepen our knowledge of the speech-production process.

As it is impossible to explore the whole multitude of speech-production forms within a single study, this article narrows the research subject to three communicative situations: 1) interview, 2) general oral discussion and 3) a student's spoken answer in a foreign language test. The subject of the analysis is therefore the spontaneous monologue oral utterance. Its spontaneous nature is determined by the absence of any visual (textual) support for the speaker and by the unpredictability of the course of its realization; the latter does not, however, exclude the presence of some general cognitive concept or, in some cases, even of a speech plan. By monologue is meant an extended stretch of speech production during which the speaker is not interrupted by the listener.

The study adopts a formal approach to the analysis of the oral utterance, regarding it from the acoustic viewpoint as a combination of phonation units (speech chunks) and pauses. This approach appears to be fully workable at the initial stages of the research, as it ensures the required degree of consistency of the input data, a large oral-speech corpus (more than 100 speech productions) whose preliminary systematization by means of perceptual analysis would have posed considerable difficulties.

While intentionally leaving the linguistic content of the phonation units outside the boundaries of the analysis, we presume that there exists a certain mechanism, a definitive model, responsible for turning these units into the final speech production. It is the effort to gain some insight, however limited, into the workings of this mechanism that underlies the current research.

The study explores the frequency distribution of phonation units of different lengths making up oral utterances produced, as mentioned above, in three different communicative situations. This article presents preliminary results of this experimental research and sets out the hypothesis that, irrespective of the communicative situation, the subject of the speech, the individual manner of the speaker and the language spoken (native or taught foreign language), the lengths of the constituent speech segments follow a log-normal distribution.

Experimental Platform

The study examined a corpus of recorded speech data comprising 100 spontaneous monologue speech utterances with a total length of 4 minutes, produced in three communicative situations: 1) interview, 2) general oral discussion and 3) a student's spoken answer in a foreign language test. The first group contains recorded radio broadcasts in interview format on various subjects (music, sports, cinema, literature, computer games), conducted in Russian and French on Russian and French radio stations respectively. The second group contains recordings made by the author: the speakers were asked to give an extended monologue answer in their native Russian to a question put by the researcher. The third group consists of monologue answers given by Russian school students during a French language competition (language competency level B2). The analyzed corpus therefore comprises speech recordings in both native and taught foreign languages, with the speakers, both male and female, ranging in age from 15 to 65 years. The sound files were recorded in .wav format with a sample rate of 44 kHz and processed in the Adobe Audition program.

Method

Acoustic analysis and speech signal annotation were carried out in Praat; the quantitative analysis was performed in Excel and the statistical analysis in Minitab.

Experiment Stages

1. Segmentation of the speech signals into pauses and phonation units in Praat. The utterances were analyzed from the acoustic standpoint, i.e. as an interconnected sequence of phonation units and pauses. A 'pause' is a period of complete silence (interruption of phonation) lasting at least 200 ms [1]. In line with the approaches of traditional linguistics, so-called 'filled-in pauses' are regarded as phonation, which allows the selected segmentation methodology to be applied more precisely.

2. Annotation of the identified segments with an orthographic transcription (see fig. 1).

Fig. 1 - Segmentation and annotation of an English utterance as L2

3. Calculation of the lengths of the phonation segments in Excel.

4. Histogram and distribution analysis in Minitab.
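Stages 1 and 3 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study (which relied on Praat and Excel): the function names and the example silence intervals are hypothetical, and the only value taken from the text is the 200 ms pause threshold.

```python
from typing import List, Tuple

MIN_PAUSE_S = 0.200  # pause threshold from the text: silence of at least 200 ms


def phonation_units(
    silences: List[Tuple[float, float]], total_s: float
) -> List[Tuple[float, float]]:
    """Return (start, end) of the phonation units between qualifying pauses.

    `silences` are (start, end) silent stretches in seconds, sorted and
    non-overlapping. A silence shorter than MIN_PAUSE_S is not a pause,
    so the speech on both sides of it stays in one phonation unit.
    """
    units = []
    cursor = 0.0  # start of the phonation unit currently being built
    for start, end in silences:
        if end - start < MIN_PAUSE_S:
            continue  # too short to count as a pause: keep extending the unit
        if start > cursor:
            units.append((cursor, start))
        cursor = end
    if total_s > cursor:
        units.append((cursor, total_s))
    return units


def unit_lengths(units: List[Tuple[float, float]]) -> List[float]:
    """Length of each phonation unit in seconds, rounded to milliseconds."""
    return [round(end - start, 3) for start, end in units]


# Hypothetical silence intervals for a 6-second recording (not corpus data).
silences = [(1.2, 1.5), (2.0, 2.1), (4.0, 4.6)]
units = phonation_units(silences, total_s=6.0)
lengths = unit_lengths(units)
print(lengths)
```

In this made-up example the 100 ms silence at 2.0 s is ignored, so the speech from 1.5 s to 4.0 s counts as a single phonation unit.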

The purpose of the next stage was to establish the character of the observed frequency distribution, which was done in Minitab. The analysis of the output histograms demonstrated the presence of a distinct log-normal distribution. Histograms graphically represent the experimentally obtained data and describe its frequency distribution. To make the data visually compact and easily readable, the program subdivides it into intervals, or bins, according to the selected grouping algorithm. Along with the total number of data values, this algorithm takes into account their range and variance, aiming to give the best possible representation of the revealed statistical distribution. The most important characteristic of a histogram is its shape, which represents the character of the frequency distribution.

The histograms built in the course of the experiment contain 9 bins (groups), with an interval of 1 second between the average values of the data in adjacent bins. The first bin contains phonation units with a length of 200 to 500 ms, the second 500 ms to 1.5 s, and so on.
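The binning described above can be written out as follows. This is a sketch, not Minitab's grouping algorithm: the first two bins come from the text, while the remaining edges (1 s steps up to 8.5 s) are an assumed reading of "and so on", and the sample lengths are invented.

```python
from bisect import bisect_right
from typing import List

# Bin edges in seconds: 0.2-0.5 s and 0.5-1.5 s are stated in the text;
# the 1 s steps up to 8.5 s are an assumption. Ten edges give nine bins.
EDGES = [0.2, 0.5] + [k + 0.5 for k in range(1, 9)]


def bin_counts(lengths_s: List[float]) -> List[int]:
    """Count phonation unit lengths per bin; each bin includes its lower edge."""
    counts = [0] * (len(EDGES) - 1)
    for x in lengths_s:
        if EDGES[0] <= x < EDGES[-1]:
            counts[bisect_right(EDGES, x) - 1] += 1
    return counts


# Hypothetical lengths in seconds; 0.1 s is below the first edge and is skipped.
example = [0.3, 0.45, 0.9, 1.2, 1.4, 2.0, 3.1, 6.8, 0.1]
counts = bin_counts(example)
print(counts)
```

With these edges the bins are centred roughly on 0, 1, 2, ... 8 seconds, which would be consistent with the 1 second spacing mentioned above.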

The Y axis shows the frequency, i.e. the number of phonation units in each bin. The higher the bar, the higher the frequency of phonation units within the given group. This approach was used for all the oral-speech utterances in the analyzed corpus, i.e. a separate histogram was built for each speaker.

The histogram (frequency) analysis demonstrated that the lengths of the phonation units considered are log-normally distributed (see fig. 2).

[Figure: Histogram of Phonation Units Lengths of an English utterance as L2. X axis: Phonation Units Lengths Intervals, ms (bins 0 to 8); Y axis: frequency (0 to 25).]

Fig. 2 - The example of the log-normal distribution of the phonation units' lengths in an utterance with a duration of 4 minutes in non-native English (p< 0.05).
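One way to check a set of lengths against a log-normal model without a statistics package is sketched below. The article does not say which Minitab test was used, so the Kolmogorov-Smirnov distance here is a substitute chosen for illustration; the function names are hypothetical and the sample is synthetic, not corpus data. The idea is that lengths are log-normal exactly when their logarithms are normal.

```python
import math
import random
from statistics import NormalDist, fmean, stdev
from typing import List, Tuple


def fit_lognormal(lengths_s: List[float]) -> Tuple[float, float]:
    """Estimate (mu, sigma) of ln(length); needs at least two positive values."""
    logs = [math.log(x) for x in lengths_s]
    return fmean(logs), stdev(logs)


def ks_statistic(lengths_s: List[float], mu: float, sigma: float) -> float:
    """Largest gap between the empirical CDF of ln(length) and the fitted normal CDF."""
    model = NormalDist(mu, sigma)
    logs = sorted(math.log(x) for x in lengths_s)
    n = len(logs)
    gap = 0.0
    for i, v in enumerate(logs):
        cdf = model.cdf(v)
        # compare against the empirical CDF just before and just after v
        gap = max(gap, abs(cdf - i / n), abs((i + 1) / n - cdf))
    return gap


# Synthetic log-normal sample with a fixed seed (assumed parameters, not real data).
rng = random.Random(0)
sample = [rng.lognormvariate(0.0, 0.8) for _ in range(500)]
mu, sigma = fit_lognormal(sample)
d = ks_statistic(sample, mu, sigma)
print(f"mu={mu:.3f} sigma={sigma:.3f} KS distance={d:.3f}")
```

A small distance means the fitted curve tracks the data closely. Because the parameters are estimated from the same sample, standard Kolmogorov-Smirnov critical values do not apply directly, so the number is best read as a descriptive measure of fit.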

The quantitative analysis found that the total number of phonation units varies with the total length of the utterance. The range of lengths of the individual phonation units within an utterance varies much less: it is almost identical in each instance, on average from 200 ms to 7 s, meaning that each utterance contains short, medium and long phonation units.

The same graph also shows the inverse relation between the frequency and the length of the phonation units. The most frequent units lie to the left of the mean value and the least frequent to the right, so the left side contains fewer bins than the right side, which contains most of them. The most frequent phonation units are grouped in bins 1 and 2, while the less frequent are contained in bins 5 to 8, i.e. the frequency falls as the length grows. As a result, the right-hand side of the histogram has a distinctly elongated, narrowing tail, showing the skewness and general asymmetry typical of a log-normal distribution.
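For reference, the asymmetry described above follows from the standard form of the log-normal density. The symbols \mu and \sigma (the mean and standard deviation of \ln x) are not used in the article and are introduced here only for illustration.

```latex
f(x) = \frac{1}{x\,\sigma\sqrt{2\pi}}
       \exp\!\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right), \qquad x > 0,

\text{mode} = e^{\mu - \sigma^2}
\;<\; \text{median} = e^{\mu}
\;<\; \text{mean} = e^{\mu + \sigma^2/2}.
```

Because the mode lies below the mean for any \sigma > 0, the most frequent lengths fall to the left of the mean and the tail extends to the right.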

Conclusions

The log-normal distribution pattern identified for phonation unit lengths proved valid for all the types of utterance analyzed in the experiment. This allows us to conclude that the variation in phonation unit lengths follows a certain universal model, which is realized each time irrespective of the subject of the utterance, the individual traits of the speaker, the language spoken, etc. It seems natural to assume that the general applicability of this distribution pattern reflects the working of an intrinsic mechanism regulating the production of a spontaneous oral utterance as such, or at least its temporal structure.

Moreover, as the experiment has also shown, this model is not violated by the degree of predictability of the speech stimulus presented to the speaker (an unexpected question from the researcher), by the amount of time provided for preparing or delivering the answer, or by whether the communicative task was explicitly specified or merely contextually implied. There was only one condition common to all the speakers: the requirement to produce an oral monologue utterance.

References

1. Boomer D. S., Dittmann A. T. Hesitation pauses and juncture pauses in speech // Language and Speech. - 1962. - 5.- P. 215-220.


2. Christodoulides G., Avanzi M. Phonetic and Prosodic Characteristics of Disfluencies in French Spontaneous Speech // Poster presented at the 14th Conference on Laboratory Phonology 2014, July 25-27, 2014. Tokyo, Japan.

3. Rosen K. M. Analysis of speech segment duration with the lognormal distribution: A basis for unification and comparison // Journal of Phonetics. - 2005. - 33(4). - P. 411-426.

4. Rosen K. M., Kent R. D., Duffy J. R. Lognormal distribution of pause length in ataxic dysarthria // Clinical Linguistics & Phonetics. - 2003. - 17(6). - P. 469-486.

5. Sobkowicz P. et al. Lognormal distributions of user post lengths in Internet discussions - a consequence of the Weber-Fechner law? // EPJ Data Science, 2013.

DOI: https://doi.org/10.23670/IRJ.2017.57.120 Turanina N.A.1, Kuljupina G.A.2, Kurganskaja L.M.3

1ORCID: 0000-0001-8280-6486, Doctor of Philology, 2ORCID: 0000-0001-9790-3545, PhD in Philology, 3ORCID: 0000-0002-7555-6439, PhD in Pedagogy, Belgorod State Institute of Arts and Culture EXPRESSIVE POSSIBILITIES OF SYNONYMS AND ANTONYMS IN THE WORKS OF A. LIKHANOV

Abstract

The article examines the use of synonyms and antonyms in the works of A. Likhanov as figurative and expressive means of the Russian language. Using specific examples, it analyses how synonyms and antonyms function and shows the variety of functions they perform. It identifies the writer's use of synonyms joined by a coordinating conjunction, which is generally not characteristic of words of this class, and the use of antonymic pairs formed by words belonging to different parts of speech.

Keywords: synonym, antonym, figurative and expressive means.

Turanina N.A.1, Kuljupina G.A.2, Kurganskaja L.M.3

1ORCID: 0000-0001-8280-6486, PhD in Philology,

2ORCID: 0000-0001-9790-3545, PhD in Philology,

3ORCID: 0000-0002-7555-6439, PhD in Pedagogy, Belgorod State Institute of Arts and Culture EXPRESSIVE OPPORTUNITIES OF SYNONYMS AND ANTONYMS IN THE WORKS OF A. LIKHANOV

Abstract

The article deals with the use of synonyms and antonyms in A. Likhanov's works as expressive means of the Russian language. The main features of the functioning of synonyms and antonyms are analysed on specific examples, and the variety of functions they perform is shown. The paper reveals the writer's use of synonyms connected by a co-ordinating conjunction, which is not typical of words of this class in general, as well as the use of antonymous pairs formed by words that belong to different parts of speech.

Keywords: synonym, antonym, expressive means.

Skilful use of the expressive and figurative resources of vocabulary in the language of fiction allows a writer to draw attention to an object or phenomenon, to evaluate it, and to strengthen the effect on the reader.

The language of A. Likhanov's prose is rich in various techniques for using the synonymic and antonymic resources of the language.

Most common are cases of open use of synonyms, where synonymous words stand side by side in the text and perform a variety of functions. First of all, synonyms are used for substitution, to avoid simple repetition of words: A photograph (фотокарточка) lay on the table in front of the mother. The snapshot (снимок) showed a young fellow with a forelock showing from under his cap and an accordion in his hands («Звезды в сентябре»); Who does not know that every stove has its own character (характер); they are just like people. As many stoves, as many tempers (норовов) («Кикимора»).

With the help of synonyms the author clarifies the meaning of individual words and helps to distinguish shades of meaning: ... the thought that this milk was payment for Vaska vanished (исчезала) of itself, as if dissolving (растворялась) in the milk I had drunk; I felt disgusted (противно), sickened (гадко); Perhaps a premonition is not a superstition (суеверие), not a prejudice (предрассудок), but something that really exists? («Крутые горы»); Something was happening to him, something was seething (бурлило), boiling (кипело) inside him as in a cauldron. I had noticed before that his hands always trembled (дрожат) - try chopping that much firewood! - but now they were simply shaking (тряслись) («Кикимора»). In the last example, the clarifying meaning is overlaid with gradational relations between the synonyms and with relations of contrast.

In some cases synonyms are accompanied by words that emphasise the differences in their meanings: He had many duties at the children's polyclinic, the chief of them being coachman (кучер), or more precisely carter (извозчик), because a coachman carries only passengers, while a carter carries loads as well (in the latter use of the pair кучер - извозчик the comparative function of synonyms also shows itself: attention is drawn to the differences in the meanings of the words) («Кикимора»); - Yes, comrade womenfolk (бабы), or rather, women (женщины)! («Деревянные кони»).

Synonyms play a different role in the following example: ... he will lead Mashka into the shafts of the carriage (возка), black and lacquer-shiny, and then drive his equipage (экипаж) up to the front (парадному), or "clean" («чистому»), as Zakharovna used to say, entrance
