THE USE OF GLOTTOGRAPHIC ANALYSIS IN LINGUISTICS

Research article in the field of linguistics and literary studies

CC BY
Keywords
GLOTTOGRAPHY / SPEECH / SOUNDS / ACOUSTICS / EXPERIMENT

Abstract (author: Vikhrova A. Yu.)

Glottography, as a method for studying spoken language, is a synthesis of medical technology and linguistic knowledge. The glottograph works by measuring changes in the resistance to ultra-high-frequency currents passed through the larynx. The device is attached to the neck in the region of the larynx and records the movement of the vocal folds. The glottogram shows the phases of vocal fold vibration in the form of their electrical counterpart. The study of speech with a glottograph can therefore give a more complete picture of how the articulatory apparatus works and, as a result, help in teaching foreign languages. This study deals with modern Korean and with speech as spoken in Korean. An experiment based on Korean is of particular interest because Koreans themselves do not often investigate the sound side of their language. Glottographic analysis makes it possible to study Korean speech in greater depth.


Vikhrova Anastasia Yurievna,

Candidate of Philological Sciences, Leading Researcher, Institute of Asian and African Studies, Lomonosov Moscow State University. E-mail: avikhrova@gmail.com


In recent years, linguistics has adopted many innovative approaches to the study of spoken language. One of them is the use of the glottograph. A glottograph, or electroglottograph, is an electronic device that is attached to the neck in the region of the larynx and records the movement of the vocal cords (folds). Glottography is a recording of speech utterances or, more precisely, a recording of the vocal cords during breathing and phonation, that is, the production of vocal sounds. In physics and anatomy, the term "recording" refers here to the measurement and study of such phonic expressions. In linguistics, it refers to the fixation of spoken sounds in the form of symbols or signs that define a writing system. "In principle, a writing system can be derived from glottographic and non-glottographic origins of symbols or characters [3]. In the latter case, no relation between symbols and sounds is evident. Such a writing system is called semasiographic, formerly ideographic [4]. In semasiography, symbols are constructed by humans who agree upon their meaning. The international road sign system and the ancient quipu of Inca Peru - connected, color-coded cords with tied knots - are examples" [2].

The glottograph was invented by the French physiologist F. Fabre in 1957. Its operation is based on changes in the resistance to ultra-high-frequency currents passed through the larynx. The device is attached to the neck in the region of the larynx and records the movement of the vocal folds. The glottogram shows the phases of vocal fold vibration in the form of their electrical counterpart. Glottography gives an objective assessment of dynamic changes in the vocal apparatus during correctional and restorative training and after its completion.

Thus, electronic glottography is a diagnostic research method that makes it possible to observe dynamic changes in the function of the vocal folds during restorative training and to record the results achieved after its completion. The criterion for assessing voice function is the ratio of the relative duration of the phase of maximum vocal fold opening to the duration of the contact phase, which normally approaches unity.

When the vocal folds close, the resistance to the current decreases; when they open, it increases. Each such fluctuation is recorded graphically as a glottogram. In a person with a healthy vocal apparatus, the graph shows a clearly expressed, uniform periodicity of oscillations. When the patient has problems with the vocal apparatus, the curve is irregular rather than uniform, and in severe forms of pathology periodic oscillation may be absent altogether.
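The criterion described above (the ratio of the open phase to the contact phase, which approaches unity in a healthy voice) can be illustrated with a minimal sketch. It assumes that the glottogram is available as a list of samples in which larger values mean more vocal fold contact (lower resistance), and that "contact" is any sample above 50% of the peak-to-peak amplitude. The function name, the threshold and the synthetic signal are illustrative assumptions, not the procedure used in the article.

```python
import math

def open_to_contact_ratio(egg, threshold_fraction=0.5):
    """Return (number of open samples) / (number of contact samples).

    Assumptions: larger sample values mean more vocal fold contact, and
    'contact' is any sample above a fraction of the peak-to-peak range.
    """
    lo, hi = min(egg), max(egg)
    threshold = lo + threshold_fraction * (hi - lo)
    contact = sum(1 for s in egg if s > threshold)
    open_ = len(egg) - contact
    if contact == 0:
        raise ValueError("no contact phase found (no vocal fold vibration?)")
    return open_ / contact

# Synthetic, idealized glottogram: ten periods of a 100 Hz sine at 10 kHz.
fs, f0 = 10_000, 100
egg = [math.sin(2 * math.pi * f0 * n / fs) for n in range(1000)]
ratio = open_to_contact_ratio(egg)
print(f"open/contact ratio: {ratio:.2f}")  # close to 1 for this symmetric signal
```

On a real recording the ratio would be computed per vibration cycle, and a flat signal (no contact phase at all) is reported as an error rather than as a ratio.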


To understand how the glottograph works, and whether it is really needed for the study of the sound chain, one must start with the definition of the word "voice". According to Dmitriev's explanatory dictionary, a voice is:

1) the flow of sounds formed by the human vocal apparatus and differing in pitch, strength, timbre, etc., as well as sounding speech itself;

2) the ability to speak loudly, to shout ("shout out loud", "cry out loud", "raise your voice");

3) the ability to speak ("raise a voice", "lower or raise your voice", "shouting voice");

4) the ability to sing ("piece for four voices", "voice of the flute");

5) sounds of animals ("bird voices", "sweet voice of a nightingale");

6) opinion, assessment ("the right to vote", "casting vote", "advisory vote", "voice of conscience", "voice of reason").

From a linguistic point of view, a voice is a collection of sounds of various pitch, strength and timbre that arise from the oscillation of the elastic vocal folds; physically, it is an oscillation of air particles propagating as waves of compression and rarefaction. The main acoustic characteristics (features) are:

• Pitch (Hz) - the subjective perception of the frequency of vocal fold vibration.

• Intensity (dB) - the subjective sensation of the amplitude of vocal fold vibration.

• Timbre - a complex quality that results from the combination of vocal fold vibration and the action of the resonators.

• Duration of phonation - the subjective perception of how long the voice sounds.

• Fundamental frequency (F0) - the vibration frequency of the vocal cords. For each speaker, the fundamental frequency is individual and is determined by the structure of the larynx. On average, it ranges from 80 to 210 Hz for a male voice and from 150 to 320 Hz for a female voice.
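As an illustration of how the fundamental frequency listed above might be measured, here is a minimal sketch that estimates F0 by autocorrelation and checks it against the average ranges quoted in the text. The search range of 60-400 Hz, the function names and the synthetic test tone are assumptions made for the example; real speech analysis tools use more robust variants of this idea.

```python
import math

def estimate_f0(signal, fs, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency (Hz) by autocorrelation.

    The search range fmin..fmax is an assumption chosen to cover the
    male (80-210 Hz) and female (150-320 Hz) ranges quoted in the text.
    """
    min_lag = int(fs / fmax)
    max_lag = int(fs / fmin)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Similarity of the signal with a copy of itself shifted by `lag`.
        score = sum(signal[i] * signal[i + lag] for i in range(len(signal) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return fs / best_lag

def plausible_voice_types(f0):
    """Return the voice types whose average range (from the text) contains f0."""
    types = []
    if 80 <= f0 <= 210:
        types.append("male")
    if 150 <= f0 <= 320:
        types.append("female")
    return types

# Synthetic test tone: 0.1 s of a 200 Hz sine sampled at 8 kHz.
fs = 8000
tone = [math.sin(2 * math.pi * 200 * n / fs) for n in range(800)]
f0 = estimate_f0(tone, fs)
print(f0, plausible_voice_types(f0))
```

Because the two ranges overlap between 150 and 210 Hz, a value such as 200 Hz is compatible with both, which is one reason F0 alone is a weak cue to who is speaking.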

Acoustic features are features that quantitatively reflect the physical characteristics of speech signals and that are extracted and measured on a computer using software and mathematical tools.

When studying a sounding voice, it is necessary to investigate all acoustic characteristics not separately, but in aggregate - this is the only way to obtain correct data. At the center of all studies of the voice and speech apparatus is the person - the carrier of the voice.

Whatever the researcher is interested in, and whatever speech features and deviations he or she would like to study, it is first necessary to establish the speech norm for the particular speaker and to find out which acoustic characteristics are inherent in that speaker. Without this, it will be extremely difficult to establish the features and deviations in the speaker's speech. It is this process, the establishment of the speech and pronunciation norm, that formed the basis of speaker identification by oral speech. About 70 years ago, phoneticians were first recruited as qualified experts in matters related to the analysis of audio recordings of speech. For a long time, investigative practice and the identification of the speaker by oral speech were not considered two sides of the same issue. Although phonetic studies have long had practical applications, the need to apply them in, for example, legal practice arose only in the 1960s.

The central issue of applied speech research is the identification of the speaker by oral speech. Speaker identification is the process of determining, from the characteristics of the speech signal and of the speech flow as a whole, whether a given utterance belongs to a particular speaker, given a choice among n stimuli belonging to n persons. In addition to the linguistic message, voice and speech carry information about the speaker's regional and social background, emotional state, attitude to the interlocutor, to the utterance and to the situation as a whole, and about the speaker's physiological, mental and intellectual characteristics. Traditionally, speaker recognition is carried out in two directions: speaker identification and speaker verification.

Speaker identification is "one out of many" recognition of the speaker on the basis of speech characteristics. Speaker verification determines whether a given voice sample belongs to a particular speaker or not ("he / not he"). In practice, identification of a voice against an open set of reference samples implies the possibility of a negative decision (rejection of all references) when the available voice sample does not belong to any of the reference voices. Speaker verification involves a choice between two alternatives. In both cases, the main problem is to determine a measure of similarity between speakers that would ensure the reliability of the classification process. Identification is clearly a more difficult task than verification.
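The difference between the two tasks can be sketched in a few lines. The example assumes that each voice has already been reduced to a short numeric feature vector and that similarity is measured by the cosine of the angle between vectors; the vectors, the speaker labels and the threshold of 0.95 are hypothetical values chosen only for illustration.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def verify(sample, reference, threshold=0.95):
    """Verification: a two-way decision, 'this speaker' or 'not this speaker'."""
    return cosine_similarity(sample, reference) >= threshold

def identify(sample, references, threshold=0.95):
    """Open-set identification: choose the best of n references,
    or return None when every reference is rejected."""
    best_name, best_score = None, threshold
    for name, ref in references.items():
        score = cosine_similarity(sample, ref)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name

# Hypothetical, already normalized feature vectors for three known speakers.
references = {
    "A": [1.0, 0.2, 0.1],
    "B": [0.1, 1.0, 0.3],
    "C": [0.2, 0.1, 1.0],
}
sample_known = [0.9, 0.25, 0.1]    # resembles speaker A
sample_unknown = [0.6, 0.6, 0.6]   # resembles none of the references

print(identify(sample_known, references))       # A
print(identify(sample_unknown, references))     # None (all references rejected)
print(verify(sample_known, references["A"]))    # True
print(verify(sample_known, references["B"]))    # False
```

Identification has to compare the sample with every reference and may still reject them all, whereas verification makes a single comparison, which matches the observation that identification is the harder task.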

In the 1960s, the practical need to identify a speaker from audio recordings was first recognized. Two diametrically opposed approaches to this problem arose independently of each other, in the United States and in England. In the United States, speaker identification came down mainly to several forms of acoustic analysis: spectrograms of fragments of one speaker's speech recorded on magnetic tape were compared with spectrograms of another speaker's speech. This approach is called the videogram, or voiceprint, method. It is based on the assumptions that:

1) spectrographic structures of various pronunciations of the same words or sounds by one speaker have relevant similarities;

2) the speech of different speakers on the spectrogram differs significantly.

However, problems of both an empirical and a theoretical nature arose here. Studies show that the spectrographic patterns of different utterances by the same speaker inevitably differ from each other: first, in the duration of the segments (vowels and consonants); second, in their frequency and energy structures. At the same time, the spectrograms of utterances by different people who speak the same regional or social dialect can have very similar energy-frequency and temporal structures. Supporters of the voiceprint method, who defended its validity, were unable to explain what could serve as a sufficient criterion for declaring spectrograms similar or different, which is essential for making a decision. According to many phoneticians, this approach is a crude image comparison, and a complete match is impossible.

Nevertheless, in the 1960s and 1970s voiceprints evoked great enthusiasm among phoneticians and employees of US legal institutions. Later, however, the method proved to be ineffective.

In parallel with the development of a purely acoustic method in the USA in the 1960s-1980s, another trend developed in England: identification was carried out exclusively on the basis of auditory phonetic analysis. In this method, specially trained phoneticians listened to speech recordings and singled out certain voice and speech patterns found in the voice samples. Auditory impressions, transcribed with the International Phonetic Alphabet, were compared on the basis of segmental and prosodic characteristics, which served as the basis for assessing the identity of speakers. The adherents of the purely auditory method, however, never argued against performing acoustic analysis. The auditory method was developed extensively in England and has been used successfully for many years. Its disadvantage is obvious: recordings of speech fragments that differ in acoustic parameters can be perceived by ear as the same, so important differences between recordings may go undetected during auditory analysis. Thus, each of these methods has advantages and disadvantages. It is a legitimate principle that in any matter, including speaker identification, a conclusion based on two forms of analysis carried out independently of each other is considered more reliable.

Even when voice samples of different speakers are compared on all possible auditory and acoustic-phonetic parameters, it is impossible to establish the speaker's identity with absolute accuracy. As a form of human behavior, speech is influenced by a wide range of factors that are not yet fully understood. Internal factors such as fatigue, illness, alcohol in the blood and mental state can affect the speech signal at the segmental and prosodic levels. Because the speaker's emotional state influences the prosodic and spectral characteristics of speech, the study of prosodic characteristics, which must take many factors into account, acquires a prominent role in phonetic research. The study of prosodic phenomena is complicated by their internal interdependence at various levels (lexical, phonetic, pragmatic). A person who perceives speech does not decode a combination of sounds but tries to understand a communication partner. The prosodic organization of an utterance is the leading pronunciation means of expressing emotional and evaluative meanings, and the elements of the prosodic structure contribute unequally to conveying them. The most indicative parameter is F0. The range of F0, its contour and its register have the greatest effect on the listener's assessment of the speaker's state of affect or arousal. These acoustic variables turn out to depend on the emotional content of the text. The emotive function of tone is also associated with individual psychology, since it concerns the expression of the speaker's state. At the emotive level, duration is used in most languages to increase the degree of emphasis. Tempo is largely determined by the emotional content of the speech situation. An increase in emotional tension is associated with an acceleration or deceleration of the rate of speech. The expression of fear is characterized, as a rule, by a higher speed of articulation than the expression of longing and sadness. At the prosodic level, all types of emotional states are clearly contrasted as negative or positive.

The universal characteristics of negative emotional states are:

• lowering the frequency components of the melody;

• increasing the time of consonant realization;

• changing the formant structure of vowels by lowering the average value of the formants.

Positive states are noted with:

• the rise and wider range of melodies;

• a shift in the formant structure of vowels to high frequencies compared to neutral pronunciation;

• increasing duration and level of intensity.

Thus, prosodic characteristics play a huge role in the analysis of emotionally colored speech. In practice, F0 is constantly used as an analyzed speech characteristic in identifying a speaker by speech. Experiments confirm that the speech situation affects, to varying degrees, the parameters of F0 (the mean F0, the standard deviation, the maximum and minimum F0, and the frequency range). This leads to the conclusion that, as a rule, F0 is not one of the stable speech characteristics of a speaker, since its range of variation within the speech of a single speaker is too wide.
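The F0 parameters named above (mean, standard deviation, maximum, minimum and range) are simple to compute once an F0 contour is available. The sketch below assumes a contour given as one F0 value per analysis frame, with 0 marking unvoiced frames; the two contours are invented numbers meant only to show how the same speaker's statistics can shift between speech situations, not data from the study.

```python
import statistics

def f0_statistics(f0_contour):
    """Summarize an F0 contour (Hz per frame); 0 marks an unvoiced frame (assumption)."""
    voiced = [f for f in f0_contour if f > 0]
    return {
        "mean": statistics.mean(voiced),
        "sd": statistics.stdev(voiced),
        "min": min(voiced),
        "max": max(voiced),
        "range": max(voiced) - min(voiced),
    }

# Hypothetical contours for one speaker in two speech situations.
neutral = [118, 120, 0, 122, 119, 121]
agitated = [140, 165, 0, 190, 150, 175]

for label, contour in (("neutral", neutral), ("agitated", agitated)):
    print(label, f0_statistics(contour))
```

In this invented example the mean rises from 120 Hz to 164 Hz and the range widens from 4 Hz to 50 Hz, which is the kind of within-speaker variation that makes F0 an unstable identifying feature.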

The difficulties of using F0 parameters in speaker identification were confirmed by an analysis of mean F0 values obtained from telephone conversations recorded on tape. The mean values of F0 in natural situations turned out to be much higher than those obtained in laboratory conditions. It was also found that this parameter varied within 30 Hz for a given speaker, depending on whether the voice was recorded during a natural telephone conversation or for a specially spoken sample. The researchers concluded that the mean F0 and its standard deviation obtained from natural recordings of conversations cannot automatically be regarded as the real characteristics of the voice under normal conditions. To obtain reliable data on the normal F0 distribution of an anonymous telephone speaker, it is necessary to use auditory analysis to identify sufficiently large segments of speech that are not affected by abnormal psychological and/or situational factors.


Returning to the application of glottographic analysis to the acoustic characteristics of sounds, it is worth giving an example: a study of Korean obstruent ("noisy") consonants, illustrated by the [t] group. The study used the acoustic glottographic method.

One of the characteristic phonetic features of Korean is its system of obstruent ("noisy") consonants, which are opposed to each other in strength of articulation and in the presence of aspiration. Thus, weak, strong and aspirated consonants are distinguished. Weak consonants are pronounced with an insignificant burst (a sound midway between «k» and «g», or «t» and «d»), strong ones with a sharp burst (as when pronouncing «t» in the word «shade»), and aspirated ones with aspiration (the sounds «kh», «th», etc.). It is important to remember that the position of the letter in the word determines how the sound is pronounced: voiced, voiceless or implosive, that is, without a burst.

[tae-hae], preposition 'about, of'

To examine these sounds, a native speaker of Korean with the standard pronunciation of the capital (Seoul) read a short excerpt from a text written in modern Korean into a microphone. The recording was made at the Laboratory of Experimental Phonetics, Institute of Asian and African Studies, Moscow State University, using the Real-Time EGG software in two channels: the microphone signal was recorded in the first channel and the glottographic signal in the second. Both channels were processed with the Praat software.

Weak [t]

The Korean weak front-lingual [t] has three positional variants.

• Perceptually weak position

In the perceptually weak position, this sound occurs at the beginning of a word and is realized as a voiceless front-lingual plosive, aurally reminiscent of Russian [t] (Pic. 1).

Length   F0       F0 area   Imean   Iarea   F1     F2     F3     Volume
msec     Hz       Hz*s      dB      dB*s    Hz     Hz     Hz     Hz*dB*s
21.6     209.85   0.55      75.4    1.63    1021   2358   3352   40.9

Pic. 1

The interval without oscillatory movements corresponds to the closure phase of this sound and lasts 0.007 sec; it is followed by a very weak aperiodic signal, which corresponds to the release of the closure without aspiration. The force of the burst is negligible. The glottogram shows that the vocal cords are not involved in producing this sound.

• Strong position

In the strong position, this sound occurs between vowels and after sonorant consonants and is realized as a voiced plosive reminiscent of Russian [d] (Pic. 2).

The closure phase is represented by weak but very frequent oscillations, since this sound is voiced; it is followed by another, stronger aperiodic oscillation, which corresponds to the release of the closure. The glottogram shows that the vocal cords vibrate slightly when this sound is pronounced, which means that they are involved in the articulation.
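The distinction drawn here, between a segment in which the vocal cords vibrate and one in which they do not, can be approximated automatically by testing the glottographic signal for periodicity. The following is a minimal sketch under stated assumptions: the segment is a list of samples, voicing is declared when the normalized autocorrelation reaches 0.5 at some lag in the 60-400 Hz range, and both test signals are synthetic. None of these choices comes from the article, whose conclusions are based on visual inspection of the glottogram.

```python
import math
import random

def is_voiced(segment, fs, fmin=60.0, fmax=400.0, threshold=0.5):
    """Return True if the segment shows periodic (vocal fold) vibration.

    The segment should be longer than fs / fmin samples; for shorter
    segments the long lags contribute nothing and are effectively skipped.
    """
    mean = sum(segment) / len(segment)
    x = [s - mean for s in segment]
    energy = sum(v * v for v in x)
    if energy == 0:
        return False  # completely flat signal: no vibration at all
    best = 0.0
    for lag in range(int(fs / fmax), int(fs / fmin) + 1):
        r = sum(x[i] * x[i + lag] for i in range(len(x) - lag)) / energy
        best = max(best, r)
    return best >= threshold

fs = 8000
# Synthetic 'voiced' segment: weak but regular 200 Hz oscillation, 50 ms long.
voiced = [0.1 * math.sin(2 * math.pi * 200 * n / fs) for n in range(400)]
# Synthetic 'voiceless' segment: aperiodic noise of similar duration.
rng = random.Random(0)
voiceless = [rng.uniform(-1.0, 1.0) for _ in range(400)]

print(is_voiced(voiced, fs), is_voiced(voiceless, fs))
```

The amplitude of the oscillation does not matter in this sketch, only its regularity, which fits the description of the voiced variant as having weak but regular vibrations.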

In foreign literature, pairs of voiced and voiceless consonants are distinguished by means of the VOT (voice onset time) correlate. This feature takes into account the state of the glottis at the moment the closure is released and measures the delay of voicing relative to the release. It is believed that for voiced consonants this parameter is much lower than for voiceless ones [1]. This is also clearly seen in our intonograms. Voiced [d] has a VOT of zero, since it follows the sonorant [n] in the speech chain, while the VOT of [t] is 0.001 sec.

• Neutralization position

In the position of neutralization, this sound occurs at the end of words and before obstruent ("noisy") consonants and is realized as a front-lingual implosive, that is, pronounced with a hold phase but without a burst (Pic. 3).

[sin-mun-dùr-ùl] 'newspaper', accusative

Length   F0       F0 area   Imean   Iarea   F1     F2     F3     Volume
msec     Hz       Hz*s      dB      dB*s    Hz     Hz     Hz     Hz*dB*s
10.0     247.68   2.47      78.1    0.78    851    2224   3703   193.2

Pic. 2

[tút-ko] 'hear', adverbial participle

Length   F0       F0 area   Imean   Iarea   F1     F2     F3     Volume
msec     Hz       Hz*s      dB      dB*s    Hz     Hz     Hz     Hz*dB*s
30.2     194.95   0.58      74.2    2.23    791    2089   3391   42.6


Pic. 3

The intonogram clearly shows that the intensity does not drop to zero during the hold phase, while the articulators prepare to pronounce the next sound.

Strong [tt]

This sound occurs in all positions except at the end of a word and before voiceless consonants, since in these positions it is neutralized and becomes implosive. In the strong position, this sound is realized as a voiceless front-lingual plosive, aurally reminiscent of Russian [t], but with a longer hold and greater intensity than the weak [t] (Pic. 4).

[ttae-mun-im-ni-da] 'because', noun in the formal-polite form

Length   F0       F0 area   Imean   Iarea   F1     F2     F3     Volume
msec     Hz       Hz*s      dB      dB*s    Hz     Hz     Hz     Hz*dB*s
16.5     260.43   3.51      74.1    1.22    516    2331   3451   260.1

Pic. 4

[t'el-le-bi-jon-i-na] 'TV' with the disjunctive particle «or»

Length   F0       F0 area   Imean   Iarea   F1     F2     F3     Volume
msec     Hz       Hz*s      dB      dB*s    Hz     Hz     Hz     Hz*dB*s
33.6     330.81   3.50      76.4    2.55    1013   2329   3493   259.1

Pic. 5

For the strong [tt], the intonogram shows that the closure is more intense than that of the weak [t]. VOT = 0.005 sec, which also indicates a longer hold.

Aspirated [t']

This sound occurs in all positions except at the end of a word and before voiceless consonants, since in these positions it, like the strong [tt], is neutralized and becomes implosive. In the strong position, this sound is realized as an aspirated voiceless front-lingual plosive, aurally reminiscent of an aspirated Russian [t] (Pic. 5).

The interval without oscillatory movements corresponds to the closure phase of this sound and lasts 0.006 sec, about the same as for the weak [t] in its perceptually weak position. It is followed by an aperiodic signal of moderate strength, which corresponds to the release with aspiration. No vocal cord vibrations are visible on the glottogram, which indicates that the vocal cords are not involved in forming this sound. The VOT of the aspirated [t'], as expected, turned out to be the highest: 0.03 sec.
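The VOT values reported in this study can be collected and compared in a short sketch. VOT is computed here as the time of voicing onset minus the time of the release; the four values in the dictionary are the ones given in the text, while the function name, the labels and the two timestamps in the usage example are hypothetical.

```python
def voice_onset_time(release_time, voicing_onset_time):
    """VOT in seconds: delay of voicing relative to the release of the closure."""
    return voicing_onset_time - release_time

# VOT values reported in the text, in seconds.
reported_vot = {
    "[d] (weak, after a sonorant)": 0.0,
    "[t] (weak, word-initial)": 0.001,
    "[tt] (strong)": 0.005,
    "[t'] (aspirated)": 0.03,
}

# Order the variants from shortest to longest VOT.
ranked = sorted(reported_vot, key=reported_vot.get)
for name in ranked:
    print(f"{name}: {reported_vot[name] * 1000:.0f} ms")

# Hypothetical timestamps read off a two-channel recording (seconds).
vot = voice_onset_time(release_time=0.120, voicing_onset_time=0.150)
print(f"measured VOT: {vot * 1000:.0f} ms")
```

Sorting the reported values reproduces the ordering described in the text: voiced [d], then weak [t], then strong [tt], with aspirated [t'] having the longest VOT.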

Conclusion

The experiment using glottographic analysis showed in which cases the vocal cords are involved in pronouncing Korean consonants and in which they are not. This is extremely important in both theoretical and practical terms. In practice, the method makes it possible to teach students of Korean studies to pronounce the specific sounds of the target language correctly. In theoretical terms, such experiments provide fertile ground for further research in phonetics and phonology.


References

1. Baart J. L. G. A Field Manual of Acoustic Phonetics. Dallas, TX: SIL International, 2010.

2. Coe M. D. Breaking the Maya Code. Revised edition. New York: Thames & Hudson, 1999. P. 18.

3. Hyman M. D. Of Glyphs and Glottography. DRAFT 2006-0401, to appear in Language & Communication [archimedes.fas.harvard.edu/mdh/glottography.pdf].

4. Sampson G. Writing Systems [www.icosilune.com/2009/01/geoffrey-sampson-writing-systems].
