ERP CORRELATES OF EMOTIONAL FACE PROCESSING IN VIRTUAL REALITY
L.A. Kirasirova1*, A.V. Zakharov2, M.V. Morozova3, A.Ya. Kaplan4, V.P. Pyatin1,2
1 Department of Physiology, Samara State Medical University;
2 Neurosciences Research Institute, Samara State Medical University;
3 Skoltech V. Zelman Center for Neurobiology and Brain Restoration;
4 Laboratory for Neurophysiology and Neuro-Computer Interfaces, Lomonosov Moscow State University. * Corresponding author: [email protected]
Abstract. Emotion regulation is a popular research topic in social, clinical, and cognitive psychology, as well as in neurophysiology. Event-related potential (ERP) studies have high temporal resolution and are therefore conventionally used in emotion research to study the time course of emotion processing. Advances in digital technologies are promoting neuropsychological research on emotion and attention in virtual reality (VR). In this work, for the first time, we investigated how emotional facial expressions presented in VR modulate ERP components under different combinations of passive or active attention and random or linear presentation sequence. We found higher amplitudes of the C1, N170, P2, P3, and P4 ERP components in the condition of passive attention compared to active attention during random presentation of emotional 3D facial expressions. During linear presentation of emotional 3D facial expressions, a statistically significant difference between the passive and active attention conditions was found only for the C1 ERP component. We showed that the P2 ERP component reflects the perception of positive and negative 3D facial expressions, encoding the emotional valence of the stimuli. We also found no statistically significant differences in the latencies of ERP components between passive and active attention to emotional 3D facial expressions.
Keywords: face emotion, EEG, relevant and non-relevant stimuli, virtual reality.
List of Abbreviations
EEG - electroencephalogram
ERP - event-related potential
VR - virtual reality
Introduction
Emotion regulation studies are part of social, clinical, and cognitive psychology, as well as neurophysiology (Hajcak et al., 2010; Ding et al., 2017). The ERP method has high temporal resolution and is used in neurophysiology to study the temporal patterns of emotion processing. This study set out to investigate the modulation of ERP components by emotional stimuli. Previous ERP studies have confirmed that components such as the N1, P1, N170, vertex positive potential (VPP), N250, N3, P3, late positive potential (LPP), late positive complex (LPC), and early posterior negativity (EPN) are highly sensitive to the processing of emotional stimuli (Luo et al., 2010). It has been shown that the ERP, as an experimental tool for studying the temporal characteristics of emotion processing in the brain, is sensitive not only to the properties of the presented stimuli but also to the mental states of the subjects.
Numerous studies have investigated the interaction and integration of emotion and sensory processing. For example, emotional stimuli are processed more quickly than non-emotional stimuli (Pourtois et al., 2013). However, little is known about how perceived information interacts with the processing of emotional stimuli. In particular, the mechanism by which the physical properties of a stimulus modulate the emotional response remains poorly understood (De Cesarei & Codispoti, 2008), especially in research paradigms in which facial expressions are used as emotional stimuli to elicit ERPs. Emotion is viewed as a continuum along two dimensions, valence and arousal. Valence refers to the degree to which people feel pleasant or unpleasant, while arousal denotes the subjective sensation of activation or deactivation, i.e., the intensity of the internal emotional reaction.
The P1 is a positive ERP component with an onset latency of 60-80 ms that peaks approximately 100-130 ms after stimulus onset and is usually recorded at parieto-occipital electrodes. It has been shown that the P1 component is an electrophysiological correlate of facial expression processing as well as attention (e.g., frightened facial expressions elicit a higher P1 than happy and neutral ones) (Luo et al., 2010).
The N170 is a negative ERP component recorded at lateral occipito-temporal electrodes. It peaks about 170 ms after stimulus onset and is more specific to face perception than to other visual stimuli. Moreover, significant emotional modulation of the N170 amplitude occurs in response to happy and frightened facial expressions rather than neutral ones. There is also evidence that a high-amplitude N170 is elicited by angry, fearful, and happy facial expressions compared to neutral faces (Hinojosa et al., 2015; Calvo et al., 2014).
The EPN, or early posterior negativity, reaches its maximum at 210-350 ms, with a topographic distribution over the occipito-temporal areas of the cerebral cortex. Calvo et al. (2014) showed that EPN amplitudes in the right hemisphere in response to angry expressions were greater than those elicited by other facial expressions, and greater than those for neutral faces in the left hemisphere, although these effects were observed only when the whole face was presented (Calvo et al., 2014). Research has also shown that facial expressions, emotional pictures (Price et al., 2012), and emotional words (Scott et al., 2009) induce EPN modulation. Moreover, angry facial expressions elicit a higher-amplitude EPN than happy ones (Rellecke et al., 2012). It is believed that the EPN wave, occurring after the N170, may be a marker for discriminating emotions.
The P3 is a positive deflection with a peak latency after 300 ms, detectable at parieto-occipital electrodes. This component reflects higher-order cognitive operations associated with attentional processes and their speed, responding to the allocation of attention to motivationally significant stimuli. Luo et al. (2010) showed that P3 amplitude differed in response to frightened, happy, and neutral faces. This is consistent with data showing that angry faces elicit a greater P3 response than happy and neutral facial expressions (Schupp et al., 2004).
Rapid responses to emotional stimuli are critical to human survival. ERPs are also used to study the neural dynamics involved, for example, in the emotional processing of text (Zhang et al., 2014). Imbir et al. (2015) investigated brain activity during reading and responses to emotionally positive, negative, and neutral words. The early ERP response had a frontal-occipital topography, and positive words elicited a greater response amplitude than negative words. Emotional stimuli engaging the automatic response system of the brain were associated with significantly higher amplitudes in the left parietal region of the cerebral cortex, whereas the response to neutral words under these conditions was the same regardless of the system involved. However, in the central region, neutral stimuli activating the reflexive system elicited a higher-amplitude response, while there was no system effect for emotional stimuli. Therefore, some authors (Imbir et al., 2015) propose dividing emotions into automatic and reflexive groups depending on the neural system involved: emotiogenic stimuli mediated by the automatic system evoke a higher ERP amplitude than those of reflexive origin.
The question of ERP correlates of emotiogenic stimuli can be addressed by studying ERP correlates of emotion and attention in VR (Marin-Morales et al., 2020). Direct comparisons of responses to two-dimensional (2D) and VR stimulation (the LPP and the target P3) showed that alpha/beta desynchronization (event-related desynchronization, ERD) takes place in both VR and 2D environments; however, responses were significantly stronger in VR environments than with 2D stimulation (Schubring et al., 2020). There is evidence in the literature that VR gives researchers unprecedented opportunities for studying human emotional behavior
and allows them to clarify the role of ERP components in the coding of emotions (Diemer et al., 2015). However, we have not found publications in which emotionally significant and neutral stimuli in the form of 3D virtual images of faces with different emotions were used to study the basic ERP components (N1, P1, N170, N250, N3, P3) under conditions of passive and active attention. Therefore, the purpose of this work was to study ERP correlates of the perception of 3D faces in VR.
Materials and methods
Fifteen healthy male subjects aged 19-21 years volunteered to participate in the experiment. After reviewing information about the study, all subjects gave informed consent to participation. The study was approved by the Bioethics Committee of the Samara State Medical University of the Ministry of Health of Russia. Visual stimuli were presented sequentially in the oddball paradigm in an HTC Vive Pro Eye VR headset. Stimuli from the FACES database (Ebner et al., 2010) were used for presentation in VR and were shown against a dark gray background. The duration of stimulus presentation was 200 ms, and the interval between adjacent stimuli was 800 ms, during which the headset screen remained blank. Stimulus presentation was implemented in the Unity environment using an original virtual environment. The study comprised two recording parts: passive attention and active attention to emotional stimuli. The stimuli were composed so that each part contained 3D face objects of different emotionality.
Part 1 - passive attention. The subjects were instructed to look at the center of the virtual scene. The stimuli were four objects without facial emotion (neutral), one object with a smiling face, and one object with a displeased face.
Part 2 - active attention to emotional faces. Two objects (the happy and the displeased face) were designated as targets, and subjects were instructed to mentally count the appearances of the target faces. The same set of 3D objects as in part 1 was used as stimuli.
The stimuli for each block were grouped into sets of six per recording cycle. Each block was repeated four times on different experimental days. The order of stimuli within a cycle was linear (in sessions 1 and 2) or random (in sessions 3 and 4). One recording contained 50 cycles, corresponding to the presentation of each of the six objects 50 times. In total, 300 stimuli were presented in each block.
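For illustration, a minimal Python sketch of how such a presentation schedule could be generated. The actual experiment used Unity; the object labels, function names, and seed below are ours, while the timing constants and counts come from the text above.

```python
import random

# Six 3D face objects per cycle: four neutral, one happy (S2), one displeased (S4).
# Labels are illustrative, not the authors' identifiers.
OBJECTS = ["neutral_1", "neutral_2", "neutral_3", "neutral_4",
           "happy_S2", "displeased_S4"]

STIM_MS = 200   # stimulus duration
ISI_MS = 800    # blank inter-stimulus interval

def build_schedule(n_cycles: int = 50, order: str = "linear", seed: int = 0):
    """Return a list of (onset_ms, object) pairs for one recording.

    'linear' repeats the same fixed order in every cycle (sessions 1-2);
    'random' shuffles the six objects within each cycle (sessions 3-4),
    so every object still appears exactly n_cycles times.
    """
    rng = random.Random(seed)
    schedule, t = [], 0
    for _ in range(n_cycles):
        cycle = list(OBJECTS)
        if order == "random":
            rng.shuffle(cycle)
        for obj in cycle:
            schedule.append((t, obj))
            t += STIM_MS + ISI_MS
    return schedule

sched = build_schedule(order="random")
assert len(sched) == 300  # 50 cycles x 6 objects = 300 stimuli per block
```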
EEG was recorded monopolarly from 64 electrodes using a 128-channel EEG recording system "VR-01030 BrainAmp Standart128" (Brain Products, Germany) with an ActiCap active electrode system. The electrodes were fixed with a textile electroencephalographic cap with sockets arranged according to the "10-10" system (a modification of the international "10-20" system; Jasper, 1957). To reduce the impedance between the electrodes and the skin surface, an electrically conductive gel was applied under each electrode. A contact impedance of no more than 20 kΩ was achieved when installing the electrodes.
Data processing. For ERP analysis, the EEG was bandpass filtered with a 4th-order Butterworth IIR zero-phase filter in the 1.0-15.0 Hz band, and in the 1-40 Hz band for time-frequency analysis. No procedure was applied for the removal of eye blinks or other artifacts. For the ERP analysis, 1500 ms EEG fragments starting 500 ms before stimulus onset were used. Each epoch and channel was individually baselined by subtracting the mean of the baseline period from -500 to 0 ms before the stimulus and then normalized. As expected, the most pronounced difference in the grand-average ERPs between stimuli and conditions across all electrodes was found at the Oz electrode, and all subsequent analyses of ERP characteristics were performed on data recorded at Oz. For the exploratory analysis, we labeled the C1, P1, N1, P2, N2, P3, N3, and P4 peaks of the averaged ERPs in all sessions separately for different types of stimuli (S2, positive; S4, negative; S2+S4, emotional stimuli) by taking the first derivative of the averaged ERP and including only explicit peaks in the analysis.
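A minimal Python/SciPy sketch of this preprocessing and peak-labeling pipeline. The sampling rate and the exact form of normalization are our assumptions; the paper specifies neither.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 500  # Hz; assumed for illustration, not stated in the text

def bandpass_zero_phase(eeg, lo=1.0, hi=15.0, fs=FS, order=4):
    """4th-order Butterworth IIR applied forward and backward (zero-phase).
    Note: the forward-backward pass steepens the net response beyond a
    single 4th-order pass; we read the paper's '4th order' as the design order."""
    sos = butter(order, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, eeg, axis=-1)

def epoch_and_baseline(signal, onsets_s, fs=FS, tmin=-0.5, tmax=1.0):
    """Cut 1500 ms epochs (-500..+1000 ms around onset) and subtract the
    mean of the -500..0 ms baseline from each epoch."""
    n_pre, n_post = int(-tmin * fs), int(tmax * fs)
    epochs = np.stack([signal[int(o * fs) - n_pre:int(o * fs) + n_post]
                       for o in onsets_s])
    epochs = epochs - epochs[:, :n_pre].mean(axis=1, keepdims=True)
    # One plausible reading of "normalized": per-epoch z-scaling.
    return epochs / epochs.std(axis=1, keepdims=True)

def label_peaks(erp, fs=FS):
    """Candidate peaks = sign changes of the first derivative of the
    averaged ERP; the polarity at each index separates P from N components."""
    d = np.diff(erp)
    idx = np.where(np.sign(d[:-1]) != np.sign(d[1:]))[0] + 1
    return [(i / fs, erp[i]) for i in idx]  # (latency in s, amplitude)
```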
Amplitudes and latencies were checked for normality (Shapiro-Wilk test) and homogeneity (Levene test) and then analyzed by univariate and multivariate ANOVA. Both peak latencies and amplitudes were homogeneous, but latencies were not normally distributed; hence we used the nonparametric Kruskal-Wallis ANOVA. We considered the ANOVA results biased and inconclusive because some peaks were missing or merged in some of the averaged ERPs, so the subsequent analysis of ERPs was performed using a permutation cluster-based F-test with 1000 permutations and Threshold-Free Cluster Enhancement (TFCE) correction for multiple comparisons, with a starting threshold of 0 and a step of 0.2. Time-frequency analysis of epochs was performed using the continuous Morlet wavelet transform (CWT), with the initial spread of the Gaussian wavelet set at 2.5/(πω0), where ω0 is the central frequency of the wavelet. The wavelet transform was applied to single epochs in the 1-40 Hz frequency band. To calculate event-related spectral perturbations (ERSPs), the absolute values of the single-trial time-frequency maps were averaged. The degree of phase locking (inter-trial coherence, ITC) was calculated by taking the absolute value of the average of the single-trial complex time-frequency maps normalized by their absolute values.
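A minimal NumPy sketch of the ERSP and ITC computation. The 2.5/(πω0) spread is our reconstruction of garbled text in the original and should be treated as an assumption, as should the sampling rate.

```python
import numpy as np

def morlet_kernel(freq, fs, n_cycles=2.5):
    """Complex Morlet wavelet with Gaussian spread sigma_t = n_cycles / (pi * freq),
    mirroring the 2.5/(pi*omega0) setting (our reading of the original)."""
    sigma_t = n_cycles / (np.pi * freq)
    t = np.arange(-4 * sigma_t, 4 * sigma_t, 1 / fs)
    kern = np.exp(2j * np.pi * freq * t) * np.exp(-t**2 / (2 * sigma_t**2))
    return kern / np.sqrt((np.abs(kern) ** 2).sum())  # unit-energy kernel

def tfr_single_trials(epochs, freqs, fs):
    """CWT of each epoch: complex array of shape (n_epochs, n_freqs, n_times)."""
    n_ep, n_t = epochs.shape
    W = np.empty((n_ep, len(freqs), n_t), dtype=complex)
    for fi, f in enumerate(freqs):
        k = morlet_kernel(f, fs)
        for ei in range(n_ep):
            W[ei, fi] = np.convolve(epochs[ei], k, mode="same")
    return W

def ersp_and_itc(epochs, fs, freqs=np.arange(1, 41)):
    """ERSP = trial-averaged magnitude; ITC = magnitude of the trial average
    of phase-normalized maps, per the description above."""
    W = tfr_single_trials(epochs, freqs, fs)
    ersp = np.abs(W).mean(axis=0)
    itc = np.abs((W / np.abs(W)).mean(axis=0))
    return ersp, itc
```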
Results
To compare responses to the presentation of emotional facial expressions under passive and active attention, we analyzed the average amplitudes and latencies of the ERP components C1, P1, N1, P2, N2, P3, N3, and P4 at the Oz lead (amplitudes, Table 1; latencies, Table 2).
In response to random presentation of emotional facial expressions in the passive attention part, the amplitudes of the ERP components (C1: -0.413 ± 0.214; N170: -0.0755 ± 0.376; P2: 0.864 ± 0.308; P3: 0.795 ± 0.348; P4: 0.237 ± 0.236) were higher than in the active attention part (C1: -0.258 ± 0.233; N170: -0.0387 ± 0.398; P2: 0.708 ± 0.248; P3: 0.694 ± 0.349; P4: 0.0789 ± 0.242) (p < 0.01) (Fig. 1). In response to linear presentation of emotional facial expressions, a statistically significant difference was found only for the C1 component: -0.317 ± 0.281 in the passive attention session versus -0.222 ± 0.228 in active attention (p < 0.01) (Fig. 2).
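The cluster-level p-values reported here come from the permutation cluster-based F-test with TFCE described in Methods. A minimal sketch of such a comparison using MNE-Python; the tooling choice and variable names are our assumptions, while the test parameters (1000 permutations, TFCE start 0, step 0.2) follow the paper.

```python
from mne.stats import permutation_cluster_test

def compare_conditions(passive_epochs, active_epochs, seed=0):
    """Cluster-based permutation F-test with TFCE correction over time.

    passive_epochs, active_epochs: arrays of shape (n_epochs, n_times) at Oz,
    e.g. produced by the epoching sketch above (illustrative names).
    """
    f_obs, clusters, cluster_pv, h0 = permutation_cluster_test(
        [passive_epochs, active_epochs],
        threshold=dict(start=0.0, step=0.2),  # TFCE, per Methods
        n_permutations=1000,
        tail=1,                               # F statistic is one-tailed
        seed=seed,
        out_type="mask",
    )
    return f_obs, cluster_pv
```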
When comparing positive and negative facial expressions, a statistically significant difference was found for the P2 component: 0.816 ± 0.342 for positive versus 0.695 ± 0.248 for negative stimuli in the active attention part (Fig. 3), and 0.943 ± 0.348 for positive versus 0.803 ± 0.306 for negative stimuli in the passive attention part (Fig. 4).
Fig. 1. Comparison of ERP components in response to emotional stimuli in conditions of passive and active attention, random presentation sequence
Fig. 2. Comparison of ERP components in response to emotional stimuli in conditions of passive and active attention, linear presentation sequence
[Plot: grand-average ERPs for S2 (N = 1401 epochs) and S4 (N = 1392 epochs); significant cluster with p < 0.01; time axis -0.4 to 1.0 s]
Fig. 3. Comparison of ERP components in response to positive (S2) and negative (S4) facial expressions in condition of active attention, random presentation sequence
Fig. 4. Comparison of ERP components in response to positive (S2) and negative (S4) facial expressions in condition of passive attention, random presentation sequence
[Time-frequency maps for S2 (N = 5472 epochs) and S4 (N = 5561 epochs); frequency axis 5-40 Hz; time axis -0.50 to 1.00 s; color scale 0.0-0.5]
Fig. 5. Inter-trial Coherence (ITC) in response to positive (S2) and negative (S4) stimuli
Fig. 6. Event-Related Spectral Perturbations (ERSPs) in response to positive (S2) and negative (S4) stimuli
Table 1
Amplitudes of ERP components in response to presentation of emotional stimuli in all sessions
| | C1 | P1 | N1 | P2 | N2 | P3 | N3 | P4 |
|---|---|---|---|---|---|---|---|---|
| N | 536 | 536 | 522 | 516 | 484 | 496 | 531 | 531 |
| Missing | 0 | 0 | 14 | 20 | 52 | 40 | 5 | 5 |
| Mean | -0.337 | 1.000 | 0.0410 | 0.827 | 0.434 | 0.707 | -0.0281 | 0.164 |
| Median | -0.330 | 0.919 | 0.0759 | 0.787 | 0.431 | 0.708 | -0.00582 | 0.163 |
| Standard deviation | 0.253 | 0.369 | 0.423 | 0.350 | 0.368 | 0.360 | 0.278 | 0.241 |
| Minimum | -1.17 | 0.163 | -1.05 | -0.219 | -0.535 | -0.293 | -0.964 | -0.544 |
| Maximum | 0.500 | 2.07 | 0.963 | 1.77 | 1.34 | 1.65 | 0.982 | 1.18 |
Table 2
Latency of ERP components in response to presentation of emotional stimuli in all sessions (in seconds)

| | C1 | P1 | N1 | P2 | N2 | P3 | N3 | P4 |
|---|---|---|---|---|---|---|---|---|
| N | 536 | 536 | 522 | 516 | 484 | 496 | 531 | 531 |
| Missing | 0 | 0 | 14 | 20 | 52 | 40 | 5 | 5 |
| Mean | 0.0669 | 0.129 | 0.177 | 0.232 | 0.276 | 0.317 | 0.391 | 0.429 |
| Median | 0.0700 | 0.128 | 0.174 | 0.230 | 0.274 | 0.316 | 0.386 | 0.420 |
| Standard deviation | 0.0124 | 0.0147 | 0.0188 | 0.0190 | 0.0293 | 0.0215 | 0.0350 | 0.0368 |
| Minimum | 0.0140 | 0.102 | 0.132 | 0.116 | 0.00200 | 0.240 | 0.300 | 0.292 |
| Maximum | 0.0983 | 0.184 | 0.244 | 0.304 | 0.344 | 0.390 | 0.497 | 0.549 |
Comparison of the latencies of the ERP components did not show a statistically significant difference between passive and active attention to emotional stimuli.
After presentation of facial expression stimuli in the VR headset, an increase in inter-trial coherence (Fig. 5) and a decrease in amplitude (Fig. 6) in the 8-15 Hz range were revealed.
Discussion
The aim of this work was to study the ERP correlates of the perception of 3D faces in VR. All relatively recent studies of ERP components directly related to the processing of emotional stimuli (N1, P1, N170, VPP, N250, N3, P3, LPP, LPC, and EPN) were performed using 2D facial expressions (Luo et al., 2010). ERP components have been shown to differentially encode information about different emotional facial expressions (Luo et al., 2010). This reflects three stages in the recognition of emotional facial expressions: the stage of automatic, early processing (N1 and P1); the stage of distinguishing between emotional and neutral facial expressions (the latency range of the N170 and VPP); and, finally, the stage of processing and awareness of facial emotions (N3 and P3). In our study, for the first time, VR was used to study ERPs in response to the presentation of emotional facial expressions under conditions of passive and active attention. For this purpose, we developed random and linear protocols for presenting 3D facial expressions of anger and joy implementing the oddball paradigm. First of all, it should be noted that in our study of 3D facial expressions all known ERP components were identified: C1, P1, N1, P2, N2, P3, N3, and P4; however, only the early ERP components differed significantly between the compared stimuli and conditions. For the first time, it was found that in response to random presentation of 3D emotional facial expressions under passive attention, the amplitudes of the C1, N1, P2, N2, and P3 ERP components were higher than under active attention (p < 0.01). The amplitude of the C1 ERP component in response to linear presentation of 3D facial expressions was likewise higher under passive than under active attention (p < 0.01). Only the P2 ERP component showed a statistically significant difference in amplitude between positive and negative facial expressions, with the P2 amplitude being higher for positive stimuli (p < 0.01). However, the latencies of the ERP components did not change depending on stimulus and attention type. The time-frequency analysis revealed a specific increase in coherence among epochs in the wide frequency range from 3 to 40 Hz at 250 ms after stimulus presentation, and an amplitude decrease in the 8-15 Hz range from 200 to 600 ms.
Thus, in our work, for the first time, data were obtained on the impact of the emotional valence of facial expressions and the attention type, as well as their combination, on ERP components when presenting 3D facial expressions in VR. The 3D facial expressions evoked stronger ERP responses in the condition of passive attention. Further research employing wider sets of stimuli is needed to confirm the present findings and to investigate them in depth.
Acknowledgements
The study was funded by RFBR, project number 19-315-90120.
References
CALVO M.G., BELTRÁN D. & FERNÁNDEZ-MARTÍN A. (2014): Processing of facial expressions in peripheral vision: Neurophysiological evidence. Biological Psychology, 100, 60-70. doi:10.1016/j.biopsycho.2014.05.007
DE CESAREI A. & CODISPOTI M. (2008): Fuzzy picture processing: Effects of size reduction and blurring on emotional processing. Emotion, 8(3), 352-363. doi:10.1037/1528-3542.8.3.352
DIEMER J., ALPERS G.W., PEPERKORN H.M., SHIBAN Y. & MÜHLBERGER A. (2015): The impact of perception and presence on emotional reactions: a review of research in virtual reality. Frontiers in Psychology, 6. doi:10.3389/fpsyg.2015.00026
DING R., LI P., WANG W. & LUO W. (2017): Emotion Processing by ERP Combined with Development and Plasticity. Neural Plasticity, 2017, 1-15. doi:10.1155/2017/5282670
EBNER N., RIEDIGER M. & LINDENBERGER U. (2010): FACES—A database of facial expressions in young, middle-aged, and older women and men: Development and validation. Behavior Research Methods, 42, 351-362. doi:10.3758/BRM.42.1.351
HAJCAK G., MACNAMARA A. & OLVET D.M. (2010): Event-Related Potentials, Emotion, and Emotion Regulation: An Integrative Review. Developmental Neuropsychology, 35(2), 129-155. doi:10.1080/87565640903526504
HINOJOSA J.A., MERCADO F. & CARRETIÉ L. (2015): N170 sensitivity to facial expression: A meta-analysis. Neuroscience & Biobehavioral Reviews, 55, 498-509. doi:10.1016/j.neubiorev.2015.06.002
IMBIR K.K., JARYMOWICZ M.T., SPUSTEK T., KUS R. & ZYGIEREWICZ J. (2015): Origin of Emotion Effects on ERP Correlates of Emotional Word Processing: The Emotion Duality Approach. PLOS ONE, 10(5), e0126129. doi:10.1371/journal.pone.0126129
JARYMOWICZ M. (2012): Understanding Human Emotions. Journal of Russian & East European Psychology, 50(3), 9-25. doi:10.2753/rpo1061-0405500301
LUO W., FENG W., HE W., WANG N.-Y. & LUO Y.-J. (2010): Three stages of facial expression processing: ERP study with rapid serial visual presentation. NeuroImage, 49(2), 1857-1867. doi:10.1016/j.neuroimage.2009.09.
MARÍN-MORALES J., LLINARES C., GUIXERES J. & ALCAÑIZ M. (2020): Emotion Recognition in Immersive Virtual Reality: From Statistics to Affective Computing. Sensors, 20(18), 5163. doi:10.3390/s20185163
POURTOIS G., SCHETTINO A. & VUILLEUMIER P. (2013): Brain mechanisms for emotional influences on perception and attention: What is magic and what is not. Biological Psychology, 92(3), 492-512. doi:10.1016/j.biopsycho.2012.02.007
RELLECKE J., SOMMER W. & SCHACHT A. (2012): Emotion Effects on the N170: A Question of Reference? Brain Topography, 26(1), 62-71. doi:10.1007/s10548-012-0261-y
SCHUBRING D., KRAUS M., STOLZ C., WEILER N., KEIM D.A. & SCHUPP H. (2020): Virtual Reality Potentiates Emotion and Task Effects of Alpha/Beta Brain Oscillations. Brain Sciences, 10(8), 537. doi:10.3390/brainsci10080537
SINGH M.I. & SINGH M. (2021): Emotion Recognition: An Evaluation of ERP Features Acquired from Frontal EEG Electrodes. Applied Sciences, 11(9), 4131. doi:10.3390/app11094131
SUHAIMI N.S., MOUNTSTEPHENS J. & TEO J. (2020): EEG-Based Emotion Recognition: A State-of-the-Art Review of Current Trends and Opportunities. Computational Intelligence and Neuroscience, 2020, 1-19. doi:10.1155/2020/8875426
ZHANG D., HE W., WANG T., LUO W., ZHU X., GU R., ... LUO Y. (2014): Three stages of emotional word processing: an ERP study with rapid serial visual presentation. Social Cognitive and Affective Neuroscience, 9(12), 1897-1903. doi:10.1093/scan/nst188