How about emotions? Two cultures and varieties of Portuguese
Research Article
Ana Margarida Belém Nunes
Abstract
The capability to precisely recognize and perceive the emotional state of a speaker is essential for successful communication. Emotion expression and perception have many universal characteristics. However, European Portuguese shows some particularities. Values obtained for neutral expression, sadness, and joy are very close, contrary to what happens in other languages. This phenomenon, which differentiates European Portuguese from other languages, may occur due to cultural factors. The main point of this study is to verify whether cultural aspects and different prosody and pronunciation of the same language affect the acuity of listeners' recognition of emotion in spoken European Portuguese. To do so, this study compared the perception of European Portuguese (EP) speakers and Brazilian Portuguese (BP) speakers, given that they share a language but vary in terms of culture and portrayal of emotion. The emotions investigated in the present study include the following: sadness, happiness, anger, despair, fear, and a neutral state. Results have shown that EP listeners are better at interpreting their own emotions, but that certain emotions, such as happiness, sadness, and neutral state, present similar results in both groups' interpretations.
Received: 30 April 2020 Reviewed: 18 May 2020 Accepted: 30 May 2020 Published: 7 June 2020
UDC: 811.134.3 34
Keywords
emotions; culture; speech perception; European Portuguese; Brazilian Portuguese
Faculty of Arts and Humanities, Department of Portuguese, University of Macau, Avenida da Universidade, Taipa, Macau, China
Corresponding author:
Ana Margarida Belém Nunes (Ms.), [email protected]
For citation:
Nunes, Ana Margarida Belem. 2020. "How about emotions? Two cultures and varieties of Portuguese." Language. Text. Society 7 (1). https://ltsj.online/2020-07-1-nunes.
Language. Text. Society
Vol. 7 No. 1, 2020 ISSN 2687-0487
Introduction
Observing different languages, cultures, societies, epochs, and religions makes it evident that there are different ways of thinking and talking about feelings and emotions. Nevertheless, there is no doubt that some characteristics are shared across times and cultures, which allows us to speak of emotional universals, as stated by Wierzbicka (2005). The question that remains is how to separate culture-specific traits from universals. For that reason, this work uses voice measures to describe emotions and then applies a perception test to determine how listeners recognize and identify the emotions of others. The main objective of this study is to investigate whether speakers of European Portuguese (EP) and Brazilian Portuguese (BP) obtain the same results when perceiving emotions produced in the European variety of Portuguese. If the results differ, it will be important to identify the main difficulties and differences and to try to understand them, since both groups share the same language.
The first problem one faces in starting the investigation is choosing valid corpora. These corpora can be divided into three main categories: spontaneous speech, acted speech, and elicited speech. All three present pros and cons (for further information, see Scherer 1989; Scherer 2003). The Belfast Naturalistic Emotional Database (Cowie, Cowie and Schroeder 2003) includes spontaneous speech, comprising 298 audio-visual clips from 125 speakers. The corpus was described according to several tiers of descriptors: impaired communication, pitch and volume, timing, paralanguage, voice quality and articulation.
Despite the controversies, many studies on voice quality and emotions use actors for corpus creation. As an example of acted speech, Vogt, Andre and Bee (2008) recorded 10 professional actors (five men and five women) producing 10 utterances with six different emotions (anger, joy, sadness, fear, disgust, and boredom) as well as a neutral emotional state. The sentences were semantically neutral. Consistent with other works, the authors found acted emotions to be more easily recognized than realistic emotions.
Objectives
One of the first objectives of the previous study that led to the present research was to obtain, to our knowledge for the first time for European Portuguese, values for the parameters commonly considered in acoustic analyses of emotional speech. Based on those results, it was possible to compare the obtained parameter values with those reported for other languages in order to determine which follow the general tendencies and which are specific to EP. Part of that previous work is presented here to clarify the values for emotions in EP and to ground the principal study of this paper, the perception test for Brazilian Portuguese speakers.
The aim of the present study is to investigate how speakers of Brazilian Portuguese perceive and identify emotions conveyed in European Portuguese. Given that listeners are reported to perceive and understand emotions better when they are expressed in their own language, we wanted to verify whether cultural aspects and differences in prosody and pronunciation affect this acuity.
The targeted emotions in the present research are sadness, happiness, anger, despair, fear, and a neutral state. The results show that EP listeners are better at interpreting their own emotions and that certain emotions, such as happiness, sadness, and a neutral state, present similar results for both groups' interpretations, which this article will attempt to explain.
Theoretical framework
Cross linguistic studies on Emotional Speech
The Expression of the Emotions in Man and Animals by Darwin ([1872] 2000) is probably the starting point of descriptive and theoretical studies of the expression of emotions. Since Darwin's work, many discussions, perspectives and theoretical works have been published, and other fields (such as psychology, neurology, cognitive science and sociolinguistics) have taken an interest in the production and perception of emotions. Among the questions that matter in this field, the role of language and culture has become especially pertinent. According to Darwin, culture does not play an important role in the expression of emotions since, from his perspective, the recognition of emotions is part of a biological heritage and is therefore universal. Darwin recognized that different societies, ethnicities, languages, cultures and social environments could influence the expression of emotions; for him, however, the important matter of study was what humans had in common with animals in the realm of emotions. Darwin focused his studies on the expression of emotions, and since then facial expressions have been regarded as more universal than prosody; studies that considered only prosody or non-verbal aspects revealed that anger is reasonably well perceived but that other emotions, such as joy, are not. It is also well known that listeners are commonly better at perceiving emotions in their own language.
To further address these questions, scientific studies have been conducted to investigate the interaction between speakers and listeners of several diverging origins. In the following paragraphs some studies are mentioned to serve as background.
According to Zinken, Knoll and Panksepp (2012), the languages and cultures studied so far are not actually very diverse. Moreover, only a few specific emotions have been studied systematically, usually 'basic' emotions. It is universally accepted that anger is the best-perceived emotion, even when interacting in a foreign language, and that other emotions are better perceived in our native language and culture. Findings in that study show that there is, in fact, an influence of linguistic and cultural aspects on recognizing emotion, especially if we refer only to vocal emotion without any facial or corporal expression.
Abelin (2004) carried out an experiment in cross-cultural multimodal interpretation of emotional expressions. The aim of the study was to investigate how speakers of Spanish and Swedish interpret emotions in each other's languages. The emotions studied were sadness, tiredness, anger, scepticism, happiness, fear, depression and extreme happiness. The results showed that Spanish listeners were better at interpreting the Swedish speakers. Certain emotions, such as happiness and fear, were more difficult to interpret only from prosodic information for both groups.
Sawamura et al. (2007), in a study on Japanese, American and Chinese speakers, showed that there are some common factors independent of language and culture that determine emotion perception from speech sounds. The authors also found that multiple emotional components were perceived in most speech materials, even when a single emotion was intended. Anger, joy and sadness seem to be the three basic emotions, while the other emotions were found to interweave with them.
Johnstone and Scherer (1999) report studies in which emotional vocal recordings were made using a computer emotion induction task involving an 'imagination technique'. Voice quality acoustic parameters included F0 minimum, F0 range, jitter and spectral energy distribution. The emotions studied were tenseness, neutral state, irritation, happiness, depression, boredom and anxiousness. The authors reported that "values for jitter are correlated with F0 floor, thus indicating that period to period F0 variation tends to be larger with higher F0. This tendency is absent for anxious and tense speech though, which is in agreement with previous findings of a reduction of jitter for speakers under stress". In sum, they found that happy speech presents significantly higher values of jitter than all other emotions. Also, as expected, F0 floor was found to be lowest for the emotions of boredom and depression and highest for happy and anxious speech.
"Increased emotional arousal is accompanied by greater laryngeal tension and increased sub glottal pressure, which increases a speaker's vocal intensity". For example, Darwin observed that angry utterances sound harsh and unpleasant because they are meant to strike terror into an enemy (Darwin [1872] 2000).
Anger is usually associated with an increase in mean F0 and energy. Anger also includes "increases in high frequency energy and downward-directed F0 contours. The increase of mean F0 and range is also a characteristic of fear, which also has a high frequency energy; sadness shows a decrease in mean F0, F0 range and mean energy; joy, a positive emotion (one of the few that are usually studied), has an increase in mean F0, F0 range, F0 variability, mean energy and an increase in high frequency energy" (Banse and Scherer 1996). "Understanding a vocal emotional message requires the analysis and integration of a variety of acoustic cues" (Schirmer and Kotz 2006).
Other well-known researchers in the area of voice and emotion, Zovato, Pacchiotti, Quazza and Sandri (2004), used three basic simulated emotional styles (angry, happy and sad) besides a neutral one. An Italian female professional speaker recorded 25 sentences, and the aim was to investigate the correlation between emotions and acoustic parameters (F0 minimum, maximum, mean and range, and RMS energy). The authors also ran a perceptual test with 10 participants to evaluate the corpus. It showed some confusion in discerning the pairs neutral/sad and happy/angry (Zovato et al. 2004).
Drioli, Tisato, Cosi and Tesser (2003) analysed F0, duration, intensity, jitter, shimmer, HNR (harmonics-to-noise ratio) and other voice quality indices, such as the Hammarberg Index. The authors used the Praat voice report. Regarding irregularities, and for stressed vowels, they reported a high shimmer value for anger and higher jitter values for joy and surprise (with anger in third place). The HNR was found to be lower for anger and joy.
Chung (2000) investigated acoustic properties of Korean emotional speech. The author measured F0 parameters (mean, maximum, minimum, mean of the 20% lowest values and range), jitter, shimmer, speaking rate and spectral distribution. The analysis showed that joy increases mean F0, whereas sadness enhances the decrease of F0 minimum. The increase of F0 maximum and of F0 range was found to be "a good indicator of the general emotional arousal." "The jitter and the shimmer values seem to increase under the emotional tension (...). However, these variations (...) were not statistically significant in the case of Korean data" (Chung 2000).
It should be said that voice quality aspects are very often described qualitatively. More recently, however, the list of investigated parameters has expanded to include jitter, shimmer, HNR, glottal source parameters and open quotient, among others, so that voice quality can now be measured in a more objective, scientific way.
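As an illustration of how one of these measures is put on an objective scale, the sketch below converts a normalized autocorrelation peak into an HNR value in decibels, following the relation documented for Praat's harmonicity analysis. The function name `hnr_db` and the input values are hypothetical and serve only as an example; this is not the procedure used in any of the cited studies.

```python
import math


def hnr_db(r):
    """Convert a normalized autocorrelation peak r (0 < r < 1) to HNR in dB.

    r is read as the fraction of signal energy that is periodic, so
    (1 - r) is the noise fraction and the ratio r / (1 - r) is expressed
    on a decibel scale.
    """
    if not 0.0 < r < 1.0:
        raise ValueError("r must lie strictly between 0 and 1")
    return 10.0 * math.log10(r / (1.0 - r))


# Hypothetical example: equal harmonic and noise energy gives 0 dB.
print(hnr_db(0.5))
```

A higher value means a more periodic (less noisy) voice, which is why lower HNR is typically reported for rougher or breathier phonation.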
European Portuguese vs. Brazilian Portuguese
It is curious that the literature often contrasts biology with culture. Even so, it must be highlighted that the categorization of our feelings depends on the introspective vocabulary of the individual, which in turn depends on his or her language and culture (James 1890, 485, quoted in Harkins and Wierzbicka 2001).
The Portuguese influence on Brazilian culture is evident and pervasive across many aspects of life, for example the taste for wines, crafts, and different types of food and pastries. Religion also shows this influence, since Catholicism is prevalent in both countries. Several Jesuits went to Brazil with the mission to catechize the Indians and the residents of different areas, and to this day Portuguese churches and saints are held in esteem by Brazilians. Art and literature also took on Portuguese influence, since it was through the Portuguese that Brazil had access to European artistic movements, such as the Renaissance, Rococo, Baroque, and Neoclassicism. Naturally, Brazilian Portuguese, too, is the result of years of interaction between the Portuguese and the Brazilians.
This exchange also occurred in the reverse direction. For example, Portugal became a big consumer of popular Brazilian soap operas, Bossa Nova music, Samba dance and music, Brazilian gastronomy and the typical Brazilian drink, the Caipirinha. The Portuguese see Brazil as a splendid place to spend vacations, due to its breath-taking seashores, great food and cheerful people. In January 2020, according to the Foreigner Border Authorities, Portugal was already home to 150,864 Brazilian citizens¹.
Portugal and Brazil have a strong, affectionate relationship. It is rare to find, either in Brazil or in Portugal, someone who does not have a family connection on the other side of the Atlantic. According to Feldman-Bianco (2001), "Today, as in past times, Luso-Brazilian relations are not restricted to economics. Through their very existence, both the Portuguese community resident in Brazil and the Brazilian community resident in Portugal give body and form to the bonds and affinities that have united both countries."
Beyond empirical knowledge and daily observation, little is known about how the expression of emotions compares across the varieties of Portuguese. Comparing only two varieties, European Portuguese and Brazilian Portuguese, it would appear that Brazilians are more enthusiastic and euphoric in expressing happiness, and generally happier, than the Portuguese, who are often described as nostalgic.
This impression may be closely related to some cultural and historical characteristics, such as the famous Portuguese music genre Fado² and the well-known word saudade. In contrast, most regions of Brazil have summer weather almost all year round (which is regarded as a mood regulator), and the best-known dance and music is Samba³.
In this study, our interest is to learn whether the two groups of Portuguese speakers can recognize each other's emotions. While it has been established that we recognize emotions better in our own language, these two groups are interesting to study because BP and EP speakers share the same language, with different accents and some variation in vocabulary and language structure.
1 Cipriano, Rita. Observador (https://observador.pt/perfil/rcipriano/).
2 Fado - "Fado is the music genre that most evokes the Portuguese spirit. Encompasses various styles and themes but is essentially characterized by music and sorrowful and melancholic lyrics. In 2011 Fado was classified as Intangible Heritage of Humanity by UNESCO. The word "Fado" means "fate". It is a musical treasure and one of the largest national pride. Fado is beautiful and touching, is a lyrical and sentimental musical style that have arisen in Lisbon around the year of 1830. The singers often wear a black costume and sing to an audience with the musicians behind. When singers sing, the room darkens and becomes silent". (http://www.portugal-live.net/P/essential/culture-music.html)
3 Samba - "The samba was born in Bahia (Brazil) in the 19th century as a mixture of African rhythms. But it was in Rio de Janeiro that it creates roots and grew. During the 1920s, for example, anyone caught dancing or singing samba ran a great risk of going behind bars, because the samba was linked to black culture, which was unpopular at the time. It is only later that he came to be regarded as a national symbol, especially in the early 40s. In this very Brazilian music, harmony is made by string instruments like the ukulele and the guitar, over time, other instruments such as flute, piano and saxophone were incorporated, giving rise to new styles of samba." (http://mundoestranho.abril.com.br/materia/como-surgiu-o-samba)
Studies on Brazilian Portuguese
It is known from the production point of view that speech acts and expressive patterns are independent categories, since emotions do not distort the melodic contours that are typical of different speech acts. This is confirmed by the fact that the normally "proposed phonological representation for a neutral utterance is also applied to expressive utterances". In Brazilian Portuguese, according to a study carried out by Colamarco and Moraes (2008), emotional patterns do not always affect different speech acts in the same way. The authors verified that the relationship between the emotional patterns is different in every speech act. Yet their findings showed a general tendency: neutral utterances and those expressing sadness present lower values for pitch level and average intensity and higher values for duration, whereas in utterances expressing joy and anger, pitch level is higher and duration is lower. The expression of anger and joy also presents values similar to what is generally described for other languages: an increase in pitch and in the average intensity of the melodic contours. Even though these emotions affect F0 in BP in a very similar way, they are not confused in perception tests. This means that, as the study presented here also suggests, vocal quality certainly has a relevant function in distinguishing these emotions in BP.
For sadness, BP presents different values, just like EP, when compared to results for other languages. In BP, sadness when compared to neutral utterances generally presents not a decrease of pitch and intensity, as in other languages, but an increase. Values described for sadness in BP are closer to the ones reported for despair.
In another study on BP, Peres (2014) found that intonation parameters play an important role in the prediction, perception and distinction of emotional states. The author considered F0 related parameters along with duration parameters. The stimuli were 32 excerpts of spontaneous emotional speech collected from a website. In a first procedure, two Brazilians and two non-Latin speakers classified the stimuli (presented randomly) into four basic emotions: happiness, sadness, fear and anger. After this first analysis, 18 native BP speakers and 18 English speakers participated in the experiment. They scored each emotion on three dimensions: valence (pleasant/unpleasant), activation (non-agitated/agitated) and dominance (submissive/non-submissive). The results showed that the perceived degree of activation could be predicted by some acoustic parameters of intonation. Regarding the degree of dominance, middle tone had significant results for BP speakers, and the coefficient of variation of medium tone and duration (intonation) had significantly better results for English speakers.
Summing up, the activation judgments of the English speakers were very similar to those reported for BP speakers. For both groups, the valence dimension did not present any significant result when compared to the acoustic parameters. However, according to the author, there is an important difference between the two groups of participants: Brazilians were better at differentiating each dimension (valence, activation and dominance) than the English speakers, who were somewhat more confused. Additionally, the author stated that evaluation by non-native speakers could be explained by acoustic information, without the influence of lexicon. For Peres (2014), it was still necessary to find more acoustic parameters that could help explain the differences between judgments made by BP and English participants. According to these findings, there seems to be a linguistic component related to the perception of emotion, in addition to the acoustic parameters (co-variation principle), that may explain the performance of native speakers. In contrast, in the case of non-native speakers, the lack of linguistic knowledge of BP could explain their performance.
European Portuguese analysis and results
Analysis and results of studies on the expression of emotion in European Portuguese were reported for the first time in a previous work by Nunes et al. (2010). The analyzed corpus was composed of two sentences: a simple one, "O melhor será tomares conta deles" (You'd better take care of them) /u mɨˈʎɔɾ sɨˈɾa tuˈmaɾɨʃ ˈkõtɐ ˈdelɨʃ/; and a complex one, "Não tenho com certeza a voz de uma pessoa que esconde qualquer coisa" (I don't really have the voice of a person who hides something) /ˈnɐ̃w ˈtɐɲu kõ sɨɾˈtezɐ ɐ ˈvɔʒ dɨ ˈumɐ pɨˈsoɐ kɨ ʃˈkõdɨ kwalˈkɛɾ ˈkojzɐ/. Both sentences were extracted from the Portuguese version of the naturalistic dialogue "The human voice" by the French writer Jean Cocteau (1989). The chosen sentences carry no emotional charge or meaning in themselves, so the actor was free to interpret them according to the intended emotional states: joy, despair, anger, fear, sadness and the neutral form. The informant was a professional male actor.
Sentences were first annotated at word and phone levels, using the Speech Assessment Methods Phonetic Alphabet (SAMPA) transcription with the Speech Filing System (SFS)⁴. The limits of each segment were marked, and a broad phonetic transcription was made, taking into account phenomena such as elision, crasis and the addition of certain sounds. All data were processed in Praat (Boersma 2001), which allowed the targeted measures to be extracted with the Praat Voice Report function. Statistical analysis was carried out in SPSS (v. 16) and R. As some measures departed from a normal distribution, non-parametric tests were employed.
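The text does not name the non-parametric tests that were used. As one common choice for comparing a measure across several emotion categories, the sketch below computes the Kruskal-Wallis H statistic in plain Python. It is a minimal illustration under stated assumptions: the function names are hypothetical, no tie correction is applied, and the numbers in the example are invented, not data from the study.

```python
def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based ranks in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (assumption: no tie correction applied)."""
    pooled = [x for group in groups for x in group]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    position = 0
    for group in groups:
        rank_sum = sum(ranks[position:position + len(group)])
        weighted += rank_sum ** 2 / len(group)
        position += len(group)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)


# Invented jitter-like values for three hypothetical emotion categories.
h = kruskal_wallis_h([[0.8, 0.9], [1.1, 1.3], [1.6, 1.8]])
print(round(h, 3))
```

The statistic is compared against a chi-square distribution with (number of groups minus 1) degrees of freedom; pairwise post-hoc comparisons would then be run separately.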
Figure 1. Comparing all four parameters analyzed (mean F0, HNR, jitter and shimmer) for the 5 emotions and the neutral utterance. Note: figure labels are in Portuguese: raiva 'anger'; desespero 'despair'; tristeza 'sadness'; medo 'fear'; neutra 'neutral'; alegria 'joy'; média F0 'mean F0'.
In this study, four different F0 related parameters were investigated: F0 minimum, maximum, mean and standard deviation. Data analysis of different F0 parameters showed that anger is clearly differentiated, presenting an average value near 300 Hz and the highest standard deviation and range.
4 Editor's Note: Transcription is changed to IPA.
Joy and despair present similar values across the four F0 parameters, with a mean around 150 Hz for F0 mean and F0 maximum. One difference between the two is the wider range of values for despair. Fear has F0 values a little lower than this pair, and its standard deviation is also lower. Sadness presents the lowest values for those parameters, similar to neutral. For F0 maximum, minimum and mean, post-hoc tests showed significant differences for all pairs except despair-fear, despair-joy, fear-joy and neutral-sadness. For F0 standard deviation, the following pairs were not significantly different: fear-neutral, fear-sadness and joy-sadness. It can be said that some pairs are difficult to differentiate based on F0 parameters. The standard deviation presents the lowest discrimination power.
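The F0 measures discussed above (minimum, maximum, mean, standard deviation and range) can be derived from a frame-by-frame F0 contour. The sketch below is a hypothetical illustration, not the Praat procedure used in the study: it assumes the contour is a list of values in Hz in which 0 marks unvoiced frames, and it uses the sample standard deviation.

```python
from statistics import mean, stdev


def f0_summary(f0_hz):
    """Summarize an F0 contour given as a list of values in Hz.

    Assumption: frames with a value of 0 are unvoiced and are ignored.
    """
    voiced = [f for f in f0_hz if f > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")
    return {
        "min": min(voiced),
        "max": max(voiced),
        "mean": mean(voiced),
        "sd": stdev(voiced),  # sample standard deviation
        "range": max(voiced) - min(voiced),
    }


# Invented contour with two unvoiced frames.
print(f0_summary([0, 100, 200, 0, 300]))
```

Computing one such summary per utterance and per emotion gives the values that the post-hoc comparisons operate on.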
Jitter and shimmer were also studied. For jitter, PPQ5 was the only measure considered. Results showed that higher jitter values were associated with despair, fear, anger and sadness, the most negative emotions. Neutral and joyful speech presented similar, lower values. Joy appears clearly lower than three of the other emotions, so jitter seems a relevant factor for detecting joy.
Shimmer values were found to be particularly high for anger, followed by the group that combines despair, sadness and fear. Sadness and despair are the only emotions for which anger does not present significantly higher shimmer values. Despair also had shimmer values that were significantly higher than those for joy and the neutral state, while the other emotions presented no significant differences. In sum, shimmer only differentiated anger and despair from the other emotional states.
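For readers unfamiliar with these measures, the sketch below shows one way a perturbation quotient can be computed from a sequence of glottal periods (for jitter PPQ5) or peak amplitudes (for shimmer APQ3). It follows the usual description of these quotients: the mean absolute deviation of each value from a local window average, divided by the overall mean. The function names are hypothetical, the window average is assumed to include the centre value, and implementations such as Praat may differ in detail.

```python
def perturbation_quotient(values, window):
    """Relative perturbation quotient over an odd-sized sliding window.

    Each value is compared with the average of the window centred on it;
    the mean absolute deviation is divided by the mean of all values.
    """
    if window < 3 or window % 2 == 0:
        raise ValueError("window must be an odd number of at least 3")
    if len(values) < window:
        raise ValueError("not enough values for this window")
    half = window // 2
    deviations = [
        abs(values[i] - sum(values[i - half:i + half + 1]) / window)
        for i in range(half, len(values) - half)
    ]
    return (sum(deviations) / len(deviations)) / (sum(values) / len(values))


def jitter_ppq5(periods):
    """Five-point period perturbation quotient (periods in seconds)."""
    return perturbation_quotient(periods, 5)


def shimmer_apq3(amplitudes):
    """Three-point amplitude perturbation quotient."""
    return perturbation_quotient(amplitudes, 3)


# Invented period values: a perfectly regular voice gives zero jitter.
print(jitter_ppq5([0.01, 0.01, 0.01, 0.01, 0.01, 0.01]))
```

Because both quotients are normalized by the mean, they can be compared across speakers and emotions with different average F0 or intensity.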
In general, the results obtained for European Portuguese were in accordance with the consulted literature. This is particularly true for F0 related parameters and for emotions such as anger and despair. The results for joy contrasted, at least in some parameters, with some previously reported results. F0 maximum and mean differentiated anger, sadness and joy, which agrees with results reported by Banse and Scherer (1996) and Cowie et al. (2003). Anger presented the highest F0, joy an equally high value, and sadness demonstrated the lowest value. Sadness, according to Cowie et al. (2003), has values close to neutral. However, results for EP do not confirm the increase of F0 for fear.
One can report some general results for EP and BP that are in accordance with other languages: neutral utterances and those expressing sadness present lower values for pitch level, and for anger, both languages present a higher pitch level.
It is worth mentioning that joy in BP presents (just like anger) higher values for pitch, while in EP joy has results very close to despair and not always significantly different from sadness. On the one hand, results for BP show that values for sadness and despair are close; on the other, it is clear that in EP the values for joy and despair are the closest.
Material and methods
While some of the parameters in this study are language independent, others are specific to certain languages or speakers. Speakers also vary in their capacity to express, recognize and interpret attitudes or emotions. Research has shown that emotions are not equally recognized across individuals. Perceptual tests have demonstrated, for example, identification problems between the pairs neutral/sadness and joy/anger; furthermore, Sawamura et al. (2007) found that disgust and anger are similar, that surprise and joy are similar, and that fear is often confused with sadness. According to current knowledge, joy, sadness and anger can be considered basic emotions, as they are easier to identify perceptually across different languages and cultures. Other emotions can be understood as being part of the specificities of a language or of an individual speaker.
In comparing identification scores across language groups and emotions, linguistic factors that carry prosodic information, such as sentence length, proved important. In the present study, native speakers of European Portuguese were better at identifying all emotions, especially sadness. All emotions except fear showed statistically significant differences between groups. Both groups perceived anger with relative ease, and identification improved with longer sentences in both groups.
Informants and procedure
The perception test had a total of 40 participants: 18 Brazilian Portuguese native speakers and 22 European Portuguese native speakers.
Each participant could listen only once to each of the 42 utterances, in which the actor portrayed the previously mentioned emotions. Listeners had access to voice information only and had to identify which emotion they perceived in each sentence. All participants heard the sentences under the same conditions, in a silent room using Windows Media Player connected to the audio system, and completed the task at the same time.
Results and discussion
Identification scores across the different groups and across emotion and sentence length (prosody information) were compared to discuss the results with reference to both linguistic and cultural variables.
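The identification percentages reported in this section can be thought of as rows of a confusion table: for each listener group and intended emotion, the share of responses falling on each perceived emotion. The sketch below shows one way to tabulate such scores. The function name, the tuple layout and the toy responses are hypothetical and are not the study's data.

```python
from collections import Counter, defaultdict


def identification_table(responses):
    """Tabulate identification percentages.

    responses: iterable of (group, intended, perceived) tuples.
    Returns {(group, intended): {perceived: percentage of responses}}.
    """
    counts = defaultdict(Counter)
    for group, intended, perceived in responses:
        counts[(group, intended)][perceived] += 1
    table = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        table[key] = {label: 100.0 * n / total for label, n in counter.items()}
    return table


# Invented toy responses: four EP listeners hearing an utterance meant as fear.
toy = [
    ("EP", "fear", "fear"),
    ("EP", "fear", "sadness"),
    ("EP", "fear", "sadness"),
    ("EP", "fear", "despair"),
]
print(identification_table(toy)[("EP", "fear")])
```

The entry whose perceived label matches the intended one is the correct-identification score; the remaining entries show which emotions it was confused with.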
A—Results comparing listener, emotion, and sentence length
1) Correct answers on perceiving neutral utterances:
Figure 2. Identification percentage for neutral utterances, by group (EP, BP), %. Response categories: neutral, sadness, fear, anger, despair, joy.
From figure 2, it is clear that neither group had great difficulty identifying neutral sentences and that, when in doubt, both groups confused the neutral state with largely the same alternative emotions. The exceptions were joy, which was chosen only by Brazilian Portuguese speakers, and anger, which was chosen only by EP speakers. Generally, the neutral state was confused with despair, fear and sadness (emotions mainly reported as negative).
2) Correct answers on perceiving sadness:
Figure 3. Identification scores for perception of sadness, by group (EP, BP), %. Response categories: sadness, fear, neutral, despair, joy, anger.
For sadness, EP native speakers were better at the identification task. Although BP native speakers also showed fairly good results, their responses were somewhat fuzzier and more diverse. For EP speakers, sadness was at times confused with the neutral state, despair, fear and even joy, which belong to different families of emotions.
3) Correct answers on perceiving joy:
Figure 4. Percentage of correct answers for joy, by group (EP, BP), %. Response categories: joy, neutral, sadness, despair, fear, anger.
As can be observed, both groups, besides being fairly good at identifying joy, tended to confuse it with the neutral state. This agrees with findings from previous research, which showed that EP speakers do not emphasize the expression of joy, so that it comes close to neutral utterances or even to the expression of sadness. It should be emphasized that joy, neutrality and sadness belong to three different families of emotions, even though they are portrayed in similar ways in EP. Again, as our results show, the most common confusion is with neutrality, especially for BP native speakers, which reinforces the idea that BP speakers are probably much more joyful than EP speakers. This comparative expression and perception of joy is something we would like to analyse further in future research.
4) Correct answers on perceiving despair:
Figure 5. Identification results for the perception of despair, by group (EP, BP), %. Response categories: despair, fear, sadness, anger, neutral, joy.
Despair, which is seen as a negative emotion, appears to be the most difficult emotion to identify for both groups. Despair is clearly confused with fear and, according to the analysed voice parameters for EP (Nunes et al. 2010), the voice parameter values for fear and despair are in fact very close (cf. figure 1). Consequently, on a perception test, they cause greater difficulties.
5) Correct answers on perceiving anger:
Frequently defended as the most universally recognized emotion, anger was in fact the easiest emotion to perceive for both groups. This could be explained by the F0 differences present in the expression of anger, which set it apart from all the other emotions, whose parameter values lie close to one another.
[Chart: Anger. Groups: EP (99), BP (94). Response categories shown: anger, despair.]
Figure 6. Identification scores for perception of anger, by group, %
6) Correct answers on perceiving fear:
[Chart: Fear. Groups: EP, BP. Response categories: fear, sadness, despair, neutral, anger, joy.]
Figure 7. Results for perception of fear, by group, %
Distinguishing fear from sadness also appears to be a challenge; in both groups, fear was perceived as sadness much more often than as fear. From what can be observed in the charts, EP speakers identified it as sadness much more often than BP speakers did. In the voice parameter analysis for EP, fear presents values much closer to those of sadness than to those of all the other emotions. In fact, fear, sadness and despair present almost the same values for jitter PPQ5; fear and sadness have the same values for shimmer APQ3, and HNR values are also very close between the two emotions, meaning that all these emotions are easily confused, even among native speakers.
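The confusion patterns described for each emotion can be tabulated as row percentages of a confusion matrix. The following is only a minimal sketch of that kind of tabulation: the function name `confusion_percentages`, the data layout (pairs of intended and perceived emotion), and the toy responses are hypothetical and are not taken from the study's data.

```python
from collections import Counter, defaultdict

# The six response categories used in the perception test.
EMOTIONS = ["neutral", "sadness", "joy", "despair", "anger", "fear"]

def confusion_percentages(responses):
    """Return {intended: {perceived: percent}} from (intended, perceived) pairs.

    Each row sums to 100 and shows how often an intended emotion was
    identified as each of the response categories.
    """
    counts = defaultdict(Counter)
    for intended, perceived in responses:
        counts[intended][perceived] += 1
    table = {}
    for intended, row in counts.items():
        total = sum(row.values())
        table[intended] = {e: 100.0 * row[e] / total for e in EMOTIONS}
    return table

# Hypothetical toy responses for sentences intended as fear (not real data).
toy = [("fear", "sadness")] * 5 + [("fear", "fear")] * 3 + [("fear", "despair")] * 2
table = confusion_percentages(toy)
print(table["fear"])
```

Computing one such table per listener group would give the values plotted in charts of this kind, with the diagonal entry being the share of correct answers.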
B—Correct answers according to sentence length
The possible influence of the sentence length, and therefore, additional or reduced prosody information, was also analysed.
[Charts: "Simple: correct and wrong answers by group" and "Long: correct and wrong answers by group". Groups: European Portuguese, Brazilian Portuguese. Vertical axis: percentage of answers.]
Figure 8. Percentage of correct and incorrect answers by sentence length and group of participants
Figure 8 shows that the longer the sentences were, the easier it was for both groups to identify the emotions correctly. This corroborates the importance of prosodic information in the perception of emotions.
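A breakdown of correct answers by group and sentence length, as summarized above, could be computed along the following lines. This is an illustrative sketch only: the record fields (`group`, `length`, `intended`, `perceived`), the helper `accuracy_by`, and the toy records are assumptions made for the example and do not reproduce the study's data.

```python
from collections import defaultdict

def accuracy_by(records, *keys):
    """Percentage of correct answers for each combination of the given keys."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for rec in records:
        cell = tuple(rec[k] for k in keys)
        totals[cell] += 1
        # An answer counts as correct when the perceived emotion
        # matches the emotion the speaker intended to portray.
        if rec["intended"] == rec["perceived"]:
            hits[cell] += 1
    return {cell: 100.0 * hits[cell] / totals[cell] for cell in totals}

# Hypothetical toy responses (not real data).
records = [
    {"group": "EP", "length": "simple", "intended": "joy", "perceived": "neutral"},
    {"group": "EP", "length": "simple", "intended": "anger", "perceived": "anger"},
    {"group": "EP", "length": "long", "intended": "joy", "perceived": "joy"},
    {"group": "EP", "length": "long", "intended": "fear", "perceived": "fear"},
    {"group": "BP", "length": "simple", "intended": "fear", "perceived": "sadness"},
    {"group": "BP", "length": "simple", "intended": "anger", "perceived": "anger"},
    {"group": "BP", "length": "long", "intended": "joy", "perceived": "joy"},
    {"group": "BP", "length": "long", "intended": "despair", "perceived": "fear"},
    {"group": "BP", "length": "long", "intended": "anger", "perceived": "anger"},
    {"group": "BP", "length": "long", "intended": "sadness", "perceived": "sadness"},
]

scores = accuracy_by(records, "group", "length")
for cell, pct in sorted(scores.items()):
    print(cell, f"{pct:.1f}%")
```

Passing only `"group"` as the key would give the overall accuracy per listener group instead.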
C—Total of correct and wrong answers by group
[Pie charts of total correct and incorrect answers. European Portuguese: Total Correct 66%, Total Incorrect 34%. Brazilian Portuguese: Total Incorrect 37%.]
Figure 9. General results of correct answers
Figure 9 shows that the accuracy level did not differ significantly between the two groups. Although one may observe small differences indicating better performance by native European Portuguese speakers, these differences were not statistically significant. The biggest differences were in the identification of sadness, where EP speakers performed better, and fear, where the results were better for BP native speakers.
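One common way to check whether two accuracy rates differ is a two-proportion z-test. The article does not state which test was used, so the sketch below is an illustration and not the author's procedure. The 66% figure for EP comes from figure 9; the 63% for BP is assumed here as the complement of the 37% incorrect answers; the number of responses per group (100) is purely hypothetical, and the test treats responses as independent, which repeated judgments by the same listeners may not be.

```python
import math

def two_proportion_z_test(correct_a, n_a, correct_b, n_b):
    """Two-sided z-test for the difference between two proportions."""
    p_a = correct_a / n_a
    p_b = correct_b / n_b
    # Pooled proportion under the null hypothesis of equal accuracy.
    pooled = (correct_a + correct_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 100 responses per group (assumption, not from the study).
z, p_value = two_proportion_z_test(correct_a=66, n_a=100, correct_b=63, n_b=100)
print(f"z = {z:.2f}, p = {p_value:.2f}")
```

With these assumed counts the difference would not reach the conventional 0.05 threshold; the outcome with the real data depends on the actual number of responses collected.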
Observing all the figures above, it is clear that anger was, for both groups, the easiest emotion to identify, followed by the neutral state. Sadness and fear present bigger differences between the two groups. Nonetheless, the results support the hypothesis that Portuguese native speakers were better at perceiving and distinguishing these two emotions. In fact, fear and despair were the most difficult emotions to identify for both groups and were often perceived as sadness. It is also curious to observe that fear was very often perceived as sadness (by both groups) and that despair was confused with fear; in other words, the expression of all these emotions presents very close values.
Conclusion
Very few languages and cultures have been studied from this perspective, taking into account solely vocal analysis without the effects of facial expression. From our results, it is clear that native speakers of EP achieved better results in the perception of sadness but that the two groups were very close in their recognition of other emotions.
In terms of differences between the two groups, it was possible to observe a major discrepancy in the identification of fear, despair, and sadness, emotions that presented difficulties for both groups and that all belong to the family of negative emotions.
As for the better perception of sadness by European Portuguese native speakers, one can speculate about the importance of knowing the language and its corresponding culture. Nevertheless, in this particular analysis, one could argue that both groups share the same language and that the differences in perception come from cultural differences and even from the diverging intonation cues that differentiate EP and BP. This would also explain why BP speakers confused the expression of joy in EP with the neutral state, given that EP speakers are not very enthusiastic in expressing positive emotions and that the expressions of neutrality, joy and sadness are all very close and therefore easily confused.
One more conclusion to be drawn is that in both groups, the emotion that stood out in terms of correct answers was anger, which supports findings from prior research showing that anger is universally the most recognized emotion (Pell et al., 2009). It can also be noted that the correct identification of each emotion improved in conjunction with increased sentence length.
The present research comparing Portuguese varieties on the expression and perception of vocally portrayed emotions needs to be continued. It would be pertinent to consider the prosody parameters of both varieties by taking into consideration the different word stress patterns of EP and BP (Santos, 2017). It would also be interesting to repeat the same procedure with a focus on spoken Brazilian Portuguese. Is it possible that EP speakers would achieve good results in differentiating sadness, joy, and even neutral utterances when they are spoken in BP? If so, could this be the result of the prolonged contact with the BP variety (through music, soap operas, etc.) that prevails in Portugal?
References
Abelin, Asa. 2004. "Cross-Cultural Multimodal Interpretation of Emotional Expression: An Experimental Study of Spanish and Swedish." In Proceedings of Speech Prosody. ISCA, March 23-26, Nara.
Banse, Rainer, and Klaus Rainer Scherer. 1996. "Acoustic Profiles in Vocal Emotion Expression." Journal of Personality and Social Psychology 70 (3): 614-36. https://doi.org/10.1037/0022-3514.70.3.614.
Boersma, Paul, and David J. M. Weenink. 2001. "PRAAT, a system for doing phonetics by computer." Glot International 5 (9/10): 341-347.
Castro, Sao Luis, and César F. Lima. 2010. "Recognizing Emotions in Spoken Language: A Validated Set of Portuguese Sentences and Pseudosentences for Research on Emotional Prosody." Behavior Research Methods 42 (1): 74-81. https://doi.org/10.3758/BRM.42.1.74.
Colamarco, Manuela, and Joao Antonio de Moraes. 2008. "Emotion expression in speech acts in Brazilian Portuguese: production and perception." In Speech Prosody 2008, Fourth International Conference, Campinas, Brazil, May 6-9, 2008, 717-720. https://www.isca-speech.org/archive/sp2008/papers/sp08_717.pdf.
Chung, S.-J. 2000. L'expression et la perception de l'émotion extraite de la parole spontanée: Evidences du coréen et de l'anglais [Expression and perception of emotion extracted from spontaneous speech in Korean and English]. Unpublished doctoral dissertation. Universite Paris III - Sorbonne Nouvelle, Paris, France.
Cocteau, Jean. 1989. A Voz Humana [The Human Voice]. Assirio and Alvim.
Darwin, Charles. (1872) 2000. The expression of emotions in man and animals. Portuguese translation by Relogio D' Agua.
Douglas-Cowie, Ellen, Roddy Cowie, and M. Schroeder. 2003. "The description of naturally occurring emotional speech." In Proceedings of ICPhS, 2877-2880. Barcelona, Spain.
Drioli, Carlo, Graziano Tisato, Piero Cosi, and Fabio Tesser. 2003. "Emotions and Voice Quality: Experiments with Sinusoidal Modeling." In Voice Quality: Functions, Analysis and Synthesis (VOQUAL'03), 127-132. Geneva, Switzerland.
Feldman-Bianco, Bela. 2001. "Brazilians in Portugal, Portuguese in Brazil: Constructions of Sameness and Difference." Identities 8 (4): 607-50. https://doi.org/10.1080/1070289X.2001.9962710.
Gobl, Christer, and Ailbhe Ni Chasaide. 2003. "The Role of Voice Quality in Communicating Emotion, Mood and Attitude." Speech Communication 40 (1-2): 189-212. https://doi.org/10.1016/S0167-6393(02)00082-1.
Harkins, Jean, and Anna Wierzbicka, eds. 2001. Emotions in crosslinguistic perspective. Cognitive linguistics research 17. Berlin; New York: Mouton de Gruyter.
Johnstone, Tom, and Klaus Rainer Scherer. 1999. "The effects of emotions on voice quality." In Proceedings of ICPhS, 2029-2032. UCLA, San Francisco, CA.
Martins, C., A. I. de Lemos, and P. E. Bebbington. 1992. "A Portuguese/Brazilian study of Expressed Emotion." Social Psychiatry and Psychiatric Epidemiology 27 (1): 22-27. https://doi.org/10.1007/BF00788952.
Monzo, Carlos, Francesc Alias, Ignasi Iriondo Sanz, Xavi Gonzalvo, and Santiago Planet. 2007. "Discriminating Expressive Speech Styles By Voice Quality Parameterization." In Proceedings of ICPhS, 2081-2084. Saarbrucken, Germany.
Moraes, Joao Antonio de, Albert Rilliard, Bruno Alberto de Oliveira Mota, and Takaaki Schochi. 2010. "Multimodal perception and production of attitudinal meaning in Brazilian Portuguese." In Proceedings of 5th International Conference on Speech Prosody, Chicago, IL.
Moraes, Joao Antonio de. 2008. "The Pitch accents in Brazilian Portuguese: analysis by synthesis." In Speech Prosody 2008, Fourth International Conference, Campinas, Brazil, May 6-9, 2008, 389-397. https://www.isca-speech.org/archive/sp2008/papers/sp08_389.pdf.
Nunes, Ana Margarida Belem, Nancye Roussel, Americo Rodrigues, Rosa Lidia Coimbra, and Antonio Teixeira. 2008. "Cross-linguistic effects on the perception of emotions." In Proceedings of the International Clinical Phonetics and Linguistics 25-28 June, Istanbul, Turkey.
Nunes, Ana Margarida Belem, Rosa Lidia Coimbra, and Antonio Teixeira. 2010. "Voice Quality of European Portuguese Emotional Speech." In Computational Processing of the Portuguese Language. PROPOR 2010. Lecture Notes in Computer Science, vol 6001. Berlin, Heidelberg: Springer. https://doi.org/10.1007/978-3-642-12320-7_19.
Pell, Marc D., Silke Paulmann, Chinar Dara, Areej Alasseri, and Sonja A. Kotz. 2009. "Factors in the recognition of vocally expressed emotions: A comparison of four languages." Journal of Phonetics 37 (4): 417-435. https://doi.org/10.1016/j.wocn.2009.07.005.
Peres, Daniel Oliveira. 2014. "Perception of emotional speech in Brazilian Portuguese: an intonational and multidimensional approach." Nouveaux cahiers de linguistique française 31: 153-196.
Santos, Raquel Santana. 2017. "Aquisição da fonologia em língua materna: acento e palavra prosódica." In Aquisição de língua materna e não materna. Questões gerais e dados do português, edited by Maria João Freitas and Ana Lúcia Santos, 95-117. Berlin: Language Science Press. https://doi.org/10.5281/zenodo.889425.
Sawamura, Kanae, Jianwu Dang, Masato Akagi, Donna Erickson, Aijun Li, Kyoko Sakuraba, Nobuaki Minematsu, and Keikichi Hirose. 2007. "Common Factors in Emotion Perception among Different Cultures." In Proceedings of ICPhS, 2113-2116. Saarbrucken, Germany.
Scherer, Klaus Rainer. 2003. "Vocal Communication of Emotion: A Review of Research Paradigms." Speech Communication 40 (1-2): 227-256. https://doi.org/10.1016/S0167-6393(02)00084-5.
Schirmer, Annett, and Sonja A. Kotz. 2006. "Beyond the Right Hemisphere: Brain Mechanisms Mediating Vocal Emotional Processing." Trends in Cognitive Sciences 10 (1): 24-30. https://doi.org/10.1016/j.tics.2005.11.009.
Vogt, Thurid, Elisabeth Andre, and Nikolaus Bee. 2008. "EmoVoice: A Framework for Online Recognition of Emotions from Voice." In Perception in Multimodal Dialogue Systems. PIT 2008. Lecture Notes in Computer Science, vol 5078, 188-199. Berlin, Heidelberg: Springer. https://doi.org/10.1007/978-3-540-69369-7_21.
Wierzbicka, Anna. 2005. "In Defense of 'Culture'." Theory & Psychology 15 (4): 575-97. https://doi.org/10.1177/0959354305054752.
Zinken, Jorg, Monja Knoll, and Jaak Panksepp. 2012. "Universality and Diversity in the Vocalization of Emotions." In Emotions of the human voice, edited by K. Isdebski. San Diego: Plural Publishing.
Zovato, Enrico, Alberto Pacchiotti, Silvia Quazza, and Stefano Sandri. 2004. "Towards emotional speech synthesis: A rule based approach." Proceedings of the 5th ISCA Speech Synthesis Workshop, 219-220. Pittsburgh.
Acknowledgments
I want to acknowledge the financial support provided through the MYRG2015-00200 research project of the University of Macau, in which I was the Principal Investigator.
Copyrights
Copyright for this article is retained by the author, with publication rights granted to the journal.
This open access article is distributed under a custom license: freely available to download, save, reproduce, and transmit for noncommercial, scholarly, and educational purposes; to reuse portions or extracts in other works—all with proper attribution to the original author(s), title, and the journal. Commercial use, reproduction or distribution requires additional permissions.