Facial Expression Recognition Using the Chimeric Face Technique
Galina Ya. Menshikova
Lomonosov Moscow State University, Moscow
The aim of this study was to investigate the roles of holistic and feature processing in encoding facial expressions, using the chimeric face technique. In the experiment we tested the recognition accuracy of universal and “chimeric” facial expressions. The distributions of participants’ responses differed considerably depending on the location of the expression features (top or bottom part of the face). For chimeric faces, identification accuracy decreased substantially, the names given to the expressions varied considerably, and the names assigned were hardly determined by distinguishing features. These changes in the recognition of facial expressions may be explained by a disruption of the holistic processing involved in expression encoding.
Keywords: face expression recognition, holistic / feature processing for encoding face expressions, chimeric face technique, identification accuracy.
Introduction
Face perception, especially the recognition of facial expressions, plays a crucial role in social communication. Understanding the mechanisms of facial expression encoding will help us communicate with people who differ in cultural background, level of intelligence, and so on.
The main objective of face perception research is to reveal the coding mechanisms that underlie our unique ability to recognize facial expressions. Psychophysical and neurophysiological evidence suggests that we have a specialized system for encoding facial expressions. Two mechanisms may underlie expression encoding: feature processing and holistic processing. In the former, encoding is driven mostly by local features such as the wide or narrow shape of the eyes, the curve of the mouth corners, and so on. Holistic or configurational processing, by contrast, is a special process of strong perceptual integration of information across the whole face. It depends on all the features as well as on their mutual configuration. Both processes are important for expression encoding, but the particular role of each is still under investigation (Goffaux & Rossion, 2006; Pitcher, Walsh, Yovel, & Duchaine, 2007).
It has been shown that expression encoding depends heavily on configurational or holistic processes rather than on feature coding (Thompson, 1980; Tanaka & Farah, 1993; Hole, George, & Dunsmore, 1999; Maurer, LeGrand, & Mondloch, 2002; Barabanschikov, 2002; McKone, Kanwisher, & Duchaine, 2007). Taken separately, local features can also contribute to facial expression recognition. For example, an observer may identify a facial expression by the shape of the eyebrows (Duchaine & Weidenfeld, 2003; Robbins & McKone, 2003), recognize a face by its hairstyle (Sinha & Poggio, 1996), or discriminate single face parts (McKone, Martini, & Nakayama, 2001).
Hypothesis
The aim of our study was to investigate the role of holistic and feature processing in expression recognition. We assumed that local features combine automatically into a global percept of a facial expression. Expression recognition may therefore be disrupted when different features represent different universal expressions. We used the chimeric face technique to construct such faces. This technique was used by A.W. Young (Young, Hellawell, & Hay, 1987) to investigate face identification. In that experiment a chimeric face was composed of two halves taken from two different individuals: for example, the top part of Peter’s face was combined with the bottom part of Paul’s face. We instead employed the technique to construct “two-expression” faces, i.e. pictures of one and the same face conveying two different expressions, one above and one below the horizontal midline. For example, the top part of Paul’s face would convey one expression and the bottom part another. The chimeric face technique for combining the “happy” and “angry” expressions of a face from Ekman’s atlas is illustrated in Fig. 1: the “happy” (a) and “angry” (b) expressions are combined into the “angry (top) - happy (bottom)” (c) and “happy (top) - angry (bottom)” (d) chimeric expressions.
We tested the recognition of chimeric facial expressions. If feature processing dominates, the observer will name one of the two expressions in a chimeric face, because certain features (a mouth spread in a smile, or frowning brows) signal the emotional state of the face. The naming will then depend on the location of the feature: an observer will name the expression conveyed by either the top or the bottom part of the face. As a result, there will be only slight variation in the names of emotions. The dominance of holistic processing, on the other hand, would produce the opposite result: the prevalence of a global percept will considerably change the recognition of facial expressions. If “the whole is greater than the sum of its parts” (as gestalt theory puts it), the observer will name some other expression rather than either of those actually present in the face, and the range of names suggested for the emotion will be wide.
Figure 1. (a) - “happy” face expression (FE); (b) - “angry” FE; (c) - “angry (top) + happy (bottom)” FE; (d) - “happy (top) + angry (bottom)” FE.
Methods
Participants
Fourteen students of Moscow State University (ten female and four male) participated in the experiment. All participants were between 20 and 25 years of age and had normal or corrected-to-normal vision.
Stimuli
The stimulus set was based on 14 digital photographs of two faces (8-bit grayscale, 256 levels) selected from Ekman’s face atlas (Ekman & Friesen, 1975). Each of the two faces (one male, one female) was shown in seven versions: six universal expressions (“happiness,” “sadness,” “fear,” “anger,” “disgust,” “surprise”) and a neutral expression. The chimeric face technique was used to construct 9 combinations of expressions for each face. Two ecologically important expressions, “happiness” and “anger,” which carry pronounced positive and negative emotional valence respectively, were chosen to form chimeric combinations with the other universal expressions. The combinations were the following: 1) “happiness” in the top part of the face + each of the other 5 universal expressions in turn (sadness, fear, anger, disgust, surprise) in the bottom part of the same face, and vice versa; 2) “anger” in the top part + each of the other 5 universal expressions in the bottom part, and vice versa.
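The combination scheme can be checked with a short enumeration. The sketch below is only one reading of the description: it assumes that a chimeric stimulus is an ordered (top, bottom) pair in which at least one half is “happiness” or “anger”, and the names `UNIVERSAL`, `ANCHORS` and `chimeric_combinations` are hypothetical. Under that assumption the count agrees with the 9 combinations mentioned here (as unordered pairings) and with the 18 composed expressions mentioned in the experimental design (as top/bottom arrangements).

```python
from itertools import permutations

UNIVERSAL = ["happiness", "sadness", "fear", "anger", "disgust", "surprise"]
ANCHORS = {"happiness", "anger"}  # expressions paired with all the others


def chimeric_combinations():
    """Return sorted (top, bottom) pairs in which at least one half is an anchor."""
    return sorted(
        (top, bottom)
        for top, bottom in permutations(UNIVERSAL, 2)
        if top in ANCHORS or bottom in ANCHORS
    )


ordered = chimeric_combinations()                  # top/bottom arrangements
pairings = {frozenset(pair) for pair in ordered}   # arrangement ignored
print(len(pairings), len(ordered))  # 9 18
```

The “happiness” + “anger” pairing appears in both lists of the description, which is why the two lists of 10 arrangements yield 18 distinct stimuli rather than 20.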
We used “happiness” and “anger” for the combinations because the features specific to their recognition are located in different parts of the face: the mouth (a “happy” smile) in the bottom part, and the frown (an “angry” expression) in the top part. Adobe Photoshop CS2 was used to create the stimuli. Contrast along the horizontal midline of the face was reduced as much as possible.
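The stimuli in the study were prepared by hand in Photoshop, so the following is not the procedure actually used. It is a minimal sketch of the same idea in code, assuming that both images are grayscale, equal in size and already aligned, and that the midline is smoothed by a linear cross-fade over a band of rows. The function name `make_chimera` and the parameter `blend_rows` are hypothetical.

```python
def make_chimera(top_img, bottom_img, blend_rows=8):
    """Join the top half of one grayscale image to the bottom half of another.

    Images are lists of rows of 0-255 integers with identical dimensions.
    Rows within `blend_rows` of the horizontal midline are linearly
    cross-faded; this smoothing rule is an assumption of the sketch.
    """
    if len(top_img) != len(bottom_img) or any(
        len(a) != len(b) for a, b in zip(top_img, bottom_img)
    ):
        raise ValueError("images must have the same dimensions")
    mid = len(top_img) // 2
    chimera = []
    for r, (top_row, bottom_row) in enumerate(zip(top_img, bottom_img)):
        if blend_rows > 0:
            # Weight of the bottom image: 0 above the band, 1 below it.
            w = min(1.0, max(0.0, (r - (mid - blend_rows)) / (2 * blend_rows)))
        else:
            w = 1.0 if r >= mid else 0.0  # hard cut at the midline
        chimera.append(
            [round((1.0 - w) * t + w * b) for t, b in zip(top_row, bottom_row)]
        )
    return chimera
```

With `blend_rows=0` the function makes a hard cut at the midline, which would leave exactly the kind of visible contrast edge that the stimuli were edited to avoid.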
Experimental design
The experiment consisted of two parts. In part 1, faces conveying universal expressions were presented to test the accuracy of their recognition. In part 2, chimeric faces conveying composite expressions were presented. In total, each participant received 750 presentations: 25 stimuli (7 universal expressions and 18 composite expressions), each repeated 30 times.
Procedure
The experiment was programmed using Medialab 4 (OS X) software. Stimuli were presented on a Sony CRT monitor. Participants sat in front of the monitor at a distance of approximately 55 cm. When chimeric faces were presented, participants were asked to answer two questions: to determine the valence of the expression (positive or negative) and to identify the expression, that is, to name it freely without reference to any list of expression names. They were also asked to give several names in ambiguous cases. No training session was conducted before the experiment, and participants received no feedback during it. Each presentation began with a fixation cross in the center of the screen for about 1000 ms. A test face then appeared for 500 ms, always in the center of the monitor. Participants then saw a dark screen until they responded verbally to the questions. Their answers were recorded in a table. The order of presentations was fully randomized.
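The presentation schedule can be sketched as follows. This is an illustration and not the Medialab script used in the study: it assumes a single pooled list of 25 stimuli, each repeated 30 times and shuffled as a whole, whereas the text does not say whether randomization was applied within each part or across both. The stimulus labels and the function `build_schedule` are hypothetical.

```python
import random

FIXATION_MS = 1000  # fixation cross duration given in the text
STIMULUS_MS = 500   # face exposure given in the text


def build_schedule(stimuli, repetitions=30, seed=None):
    """Return a fully randomized list of trials, one dict per presentation."""
    rng = random.Random(seed)
    trials = [
        {"stimulus": s, "fixation_ms": FIXATION_MS, "stimulus_ms": STIMULUS_MS}
        for s in stimuli
        for _ in range(repetitions)
    ]
    rng.shuffle(trials)
    return trials


# Hypothetical labels: 7 universal + 18 composite expressions = 25 stimuli.
stimuli = [f"universal_{i}" for i in range(7)] + [f"chimera_{i}" for i in range(18)]
schedule = build_schedule(stimuli, seed=1)
print(len(schedule))  # 750
```

Passing a fixed `seed` makes a given participant's order reproducible, while omitting it gives a fresh order on every run.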
Results and discussion
The results of part 1 of the experiment are plotted in Fig. 2. Mean accuracy (correct responses in %) of universal expression recognition across participants was calculated separately for each expression and for the female and male faces. The results appear to be typical (Ekman, 1992). A high percentage of correct responses was found for the expressions “happiness,” “surprise” and “anger.” Some differences were found between the male and female faces. “Happiness” and “fear” were recognized more easily in male faces than in female faces, whereas “anger” was recognized better in female faces.
[Figure 2: bar chart. Horizontal axis: face expression (happy, angry, surprise, fear, disgust, sad). Legend: ■ male face, □ female face.]
Figure 2. Mean accuracy (correct responses in %) across subjects for universal expressions.
[Figure 3: bar charts, panels a and b. Horizontal axis: face expression (happy, angry, surprise, fear, disgust, sad). Legend: □ “happy”, ■ “another”.]
Figure 3. Responses “happy” when the expression “happiness” was (a) in the bottom part of the face and other universal expressions in the top part, (b) in the top part of the face and other universal expressions in the bottom part.
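The accuracy measure plotted in Fig. 2 can be expressed as a short computation. The sketch below assumes that percent correct is first computed per participant and then averaged across participants. The text does not state the order of averaging, so this is an assumption, and the tuple layout and the function name `mean_accuracy` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_accuracy(trials):
    """Mean percent correct across subjects per (expression, face_sex) cell.

    `trials` is an iterable of (subject, face_sex, expression, correct)
    tuples. Accuracy is computed per subject first, then averaged.
    """
    per_subject = defaultdict(list)
    for subject, face_sex, expression, correct in trials:
        per_subject[(expression, face_sex, subject)].append(bool(correct))
    cells = defaultdict(list)
    for (expression, face_sex, _subject), outcomes in per_subject.items():
        cells[(expression, face_sex)].append(100.0 * sum(outcomes) / len(outcomes))
    return {cell: mean(values) for cell, values in cells.items()}


# Invented toy data, not results from the study.
demo = [
    ("s1", "male", "fear", True),
    ("s1", "male", "fear", False),
    ("s2", "male", "fear", True),
]
print(mean_accuracy(demo))  # {('fear', 'male'): 75.0}
```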
The results of part 2 of the experiment are plotted in Fig. 3 (a, b) and Fig. 4 (a, b). Fig. 3 (a, b) shows the percentage of “happiness” responses when this expression was located (a) in the bottom part of a chimeric face, with another universal expression in the top part, and (b) in the top part, with another expression in the bottom part. The first column on the left in Fig. 3 (a, b) indicates the percentage of “happiness” responses when the face bore the universal expression “happiness.” The other columns show the proportion of responses “happiness” (white columns) and “another universal expression” (black columns) for a chimeric expression. The same diagrams were plotted for the data obtained when one part of the face bearing “anger” was combined with a part taken from a different universal expression (Fig. 4 a, b). The second column on the left in Fig. 4 (a, b) indicates the percentage of “anger” responses when the face bore the expression “anger” alone. The other columns show the proportion of responses “angry” (white columns) and “another universal expression” (black columns) for chimeric faces. Comparison of the diagrams in Fig. 3 and 4 shows a considerable divergence in the distribution of responses depending on the location of the expression features (top or bottom part of the face). The combination of “happy” eyes with a “sad / frightened / angry / disgusted / surprised” bottom part of the face was perceived as “happy” only for the “happy (top) - angry (bottom)” combination, and only in 20% of all responses (Fig. 3 b).
[Figure 4: bar charts, panels a and b. Horizontal axis: face expression (happy, angry, surprise, fear, disgust, sad). Vertical axis: responses “angry” / “another”. Legend: □ “angry”, ■ “another”.]
Figure 4. Responses “angry” when the expression “anger” was (a) in the bottom part of the face and other universal expressions in the top part, (b) in the top part of the face and other universal expressions in the bottom part.
By contrast, when the bottom (mouth) part of the face bearing the “happy” expression was combined with a top part bearing another expression, the number of “happiness” responses increased. Participants identified the expression as “happiness” in 20% of responses for combinations of a “happy” lower part with an “anger / fear” upper part, and in 50-60% of responses for the other combinations. Our data confirm previous investigations (Ekman & Friesen, 1978; McKone et al., 2001; Maurer et al., 2002) which showed the importance of a “smiling mouth” as a strong feature for identifying the “happy” expression. Placing the expression “anger” in either part of the face resulted in a considerable reduction in its identification (Fig. 4 a, b).
Data analysis indicated that in half of the combinations neither of the two expressions presented in a chimeric face was identified (nearly 50% of all responses). In the remaining combinations only one of the two component expressions was recognized: these were “happiness,” “surprise” and “disgust” located in the bottom part of the face. Thus, our results demonstrate that the encoding mechanism of expression recognition was “broken” when the configuration of certain features did not correspond to a holistic representation of the expression.
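The tally behind this kind of analysis can be sketched as below: each response to a chimeric face is classified as naming the top expression, the bottom expression, or neither. The sketch assumes that free verbal responses have already been mapped to canonical expression labels, a coding step the text does not describe. The function name `response_shares` and the example responses are hypothetical.

```python
from collections import Counter


def response_shares(top, bottom, responses):
    """Percent of responses naming the top expression, the bottom one, or neither."""
    total = len(responses)
    if total == 0:
        raise ValueError("no responses")
    counts = Counter(
        "top" if r == top else "bottom" if r == bottom else "neither"
        for r in responses
    )
    return {k: 100.0 * counts[k] / total for k in ("top", "bottom", "neither")}


# Invented responses to a "happiness (top) + anger (bottom)" face.
shares = response_shares(
    "happiness", "anger", ["anger", "happiness", "confusion", "malevolence"]
)
print(shares)  # {'top': 25.0, 'bottom': 25.0, 'neither': 50.0}
```

A large “neither” share is the pattern predicted when holistic processing dominates, while large “top” or “bottom” shares are the pattern predicted when feature processing dominates.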
We also analyzed the names that participants gave to the chimeric expressions. The distribution of names showed that participants found it difficult to name a facial expression composed of two different expressions. For example, the names suggested for the combination “happiness” (top) - “surprise” (bottom) were “horror,” “fear” and “inspiration.” The combination “fear” (top) - “anger” (bottom) was perceived as “surprise” by half of the participants. The combination “sad” (top) - “happy” (bottom) was called “confusion,” “malevolence” or “insincerity.”
The analysis of participants’ responses to the question about the positive or negative valence of chimeric expressions showed that any combination of the negative expression “anger” with another expression led observers to perceive the chimeric expression as negative. By contrast, combinations of the positive “happy” expression with other basic expressions were perceived ambiguously. Two unexpected results emerged from the experiment. First, the combination of two positive expressions, “happiness” in the top part of the face and “surprise” in the bottom part, was identified as a negative expression by almost all participants. Second, the combination of a “smiling” mouth (a very strong feature for recognizing a positive expression) with “angry” eyes (see Fig. 1 c) was identified as a negative expression.
Conclusion
The recognition of facial expressions was substantially reduced for chimeric faces composed of two universal expressions. Observers tended to name chimeric expressions as something other than either of the expressions represented. As a result, there was strong variation in the naming of the expressions.
The decrease in the number of correct responses in the recognition of chimeric facial expressions, and the variation in their naming, may be explained by a disruption of holistic processes in expression encoding. Our results demonstrate the complexity and sophistication of the holistic mechanisms of expression processing.
References
Barabanschikov, V.A. (2002). Vospriatie ekspressii lica [Perception of Face Expressions]. In Vospriatie i sobytie [Perception and Events] (pp. 221-270). Saint Petersburg: Aleteja.
Duchaine, B.C., & Weidenfeld, A. (2003). An Evaluation of Two Commonly Used Tests of Unfamiliar Face Recognition. Neuropsychologia, 41, 713-720.
Ekman, P. (1992). An Argument for Basic Emotions. Cognition and Emotion, 6, 169-200.
Ekman, P., & Friesen, W.V. (1975). Unmasking the Face. Englewood Cliffs: Prentice-Hall.
Ekman, P., & Friesen, W.V. (1978). Facial Action Coding System. Palo Alto, CA: Consulting Psychologists Press.
Goffaux, V., & Rossion, B. (2006). Faces are “Spatial” - Holistic Face Perception is Supported by Low Spatial Frequencies. Journal of Experimental Psychology: Human Perception and Performance, 32, 1023-1039.
Hole, G.J., George, P.A., & Dunsmore, V. (1999). Evidence for Holistic Processing of Faces Viewed as Photographic Negatives. Perception, 28, 341-359.
Maurer, D., LeGrand, R., & Mondloch, C.J. (2002). The Many Faces of Configural Processing. Trends in Cognitive Sciences, 6, 255-260.
McKone, E., Kanwisher, N., & Duchaine, B. (2007). Can Generic Expertise Explain Special Processing for Faces? Trends in Cognitive Sciences, 11, 8-15.
McKone, E., Martini, P., & Nakayama, K. (2001). Categorical Perception of Face Identity in Noise Isolates Configural Processing. Journal of Experimental Psychology: Human Perception and Performance, 27, 573-599.
Pitcher, D., Walsh, V., Yovel, G., & Duchaine, B. (2007). TMS Evidence for the Involvement of the Right Occipital Face Area in Early Face Processing. Current Biology, 17, 1568-1573.
Robbins, R., & McKone, E. (2003). Can Holistic Processing be Learned for Inverted Faces? Cognition, 88, 79-107.
Sinha, P., & Poggio, T. (1996). I think I know that face... Nature, 384, 404.
Tanaka, J.W., & Farah, M.J. (1993). Parts and Wholes in Face Recognition. Quarterly Journal of Experimental Psychology, 46A, 225-245.
Thompson, P. (1980). Margaret Thatcher - A New Illusion. Perception, 9, 483-484.
Young, A.W., Hellawell, D., & Hay, D.C. (1987). Configural Information in Face Perception. Perception, 16, 747-759.