CC BY
CONTROL IN MEDICINE AND BIOLOGY

UDC 004.93'1 Articles

doi:10.31799/1684-8853-2020-6-37-49

Evaluation of EEG identification potential using statistical approach and convolutional neural networks

A. E. Sulavko (a), PhD, Tech., Associate Professor, orcid.org/0000-0002-9029-8028, [email protected]

P. S. Lozhnikov (a), Dr. Sc., Tech., Associate Professor, orcid.org/0000-0001-7878-1976

A. G. Choban (a), Student, orcid.org/0000-0003-1834-6651

D. G. Stadnikov (a), Student, orcid.org/0000-0002-5405-2450

A. A. Nigrey (b), Post-Graduate Student, orcid.org/0000-0002-8391-5374

D. P. Inivatov (a), Student, orcid.org/0000-0001-9911-1218

(a) Omsk State Technical University, 11, Mira Pr., 644050, Omsk, Russian Federation

(b) Omsk State Transport University, 35, Karl Marx Pr., 644046, Omsk, Russian Federation

Introduction: Electroencephalograms (EEG) contain information about the individual characteristics of brain activity and the psychophysiological state of a subject. Purpose: To evaluate the identification potential of EEG, and to develop methods that use convolutional neural networks to identify users, their psychophysiological states and the activities they perform on a computer from their EEGs. Results: The information content of EEG rhythms was assessed in terms of the possibility of identifying a person and his/her state. A high accuracy of identity recognition (98.5-99.99% for 10 electrodes, 96.47% for the two electrodes Fp1 and Fp2) was achieved with a low transit time (2-2.5 s). A significant decrease in accuracy was detected when the person was in different psychophysiological states during training and testing. Earlier studies did not give this aspect enough attention. A method is proposed for increasing the robustness of identity recognition in altered psychophysiological states. An accuracy of 82-94% was achieved in recognizing states of alcohol intoxication, drowsiness or physical fatigue, and of 77.8-98.72% in recognizing the user's activities (reading, typing or watching video). Practical relevance: The results can be applied in security and remote monitoring applications.

Keywords — deep learning, multilayer neural networks, biometrics, machine learning, feature extraction, electrical brain activity, psychophysiological state, pattern recognition, spectrograms.

For citation: Sulavko A. E., Lozhnikov P. S., Choban A. G., Stadnikov D. G., Nigrey A. A., Inivatov D. P. Evaluation of EEG identification potential using statistical approach and convolutional neural networks. Informatsionno-upravliaiushchie sistemy [Information and Control Systems], 2020, no. 6, pp. 37-49. doi:10.31799/1684-8853-2020-6-37-49

Introduction

Technologies for building neurocomputer interfaces that transmit commands to devices without physical contact are developing rapidly. Most neurointerfaces are based on recording and interpreting electroencephalograms (EEG), which reflect how the electrical activity of the brain changes over time. The identification potential of EEG is extremely high: EEG analysis is used in brain-computer interfaces [1], biometric identification and authentication [2], identification of risky behavior [3], and evaluation of a person's functional (physical, mental, emotional) state [4]. The last group of tasks is of particular interest, since timely detection that a subject (an employee, an operator, a driver, a student) is asleep or intoxicated helps avoid emergencies. These states are characterized by reduced performance, slower reactions and distracted attention [5]. EEG analysis also makes it possible to infer the brain state (sleep, waking up, relaxation/meditation, concentration during demanding intellectual tasks). This allows building a continuous process of identifying not only a subject [2] but also the type of task being performed (at work, in gaming, during remote examinations, etc.). Such methods are in particular demand when subjects' activities must be monitored automatically, without direct surveillance, and the "human factor" must be excluded from decision-making.

Automatic EEG analysis is difficult because the signals are noisy and depend on many factors: the equipment (electrode sampling frequency, the number and characteristics of electrodes), the installation and location of electrodes, and the individual EEG features of the subjects. Traditional EEG analysis methods based on frequency filtering and artifact removal do not give results good enough for practical use: the resulting features have low informational value, which leads to low accuracy in classifying EEG images.

This study focuses on developing methods for recognizing the user's identity, his or her psychophysiological state (PPS), and the tasks he or she performs on a computer, using multilayer convolutional neural networks (CNN) and deep learning methods [1]. These tasks are considered in one paper because they are closely connected. The study examines this connection and estimates the informational value of EEG rhythms in terms of their ability to support recognition of the subject's identity or of particular states.

A psychophysiological state is a set of personal characteristics that reflect the biological aspects of adaptation to changing environmental conditions [6]. The following key PPS are considered in the paper: mild alcohol intoxication, drowsiness, physical fatigue, and the norm. The "norm" state means that before the experiment the test person was not subjected to any physical or mental stress and did not take any medication affecting the PPS.

The research has the following objectives.

1. To estimate the informational value of the EEG rhythms in terms of the possibility of identifying the user and his/her PPS.

2. To propose the architecture of convolutional networks for the recognition of EEG images.

3. To evaluate the accuracy (a ratio of correct decisions to the total number of experiments) of user identification, his or her PPS, as well as the tasks performed by the user in the "norm" state (watching an entertaining video, reading scientific literature, typing, inactivity/rest).

EEG database of test persons

EEG data were collected from 30 test persons. In all experiments the electrodes were placed as shown in Fig. 1, according to the standard "10-20" scheme. The EEG in the "norm" state was recorded while the test users performed four tasks (the electrodes were re-installed before each task):

— the test user was sitting in a chair with his eyes closed (standard conditions);

— the test user was typing a scientific and technical text on the keyboard;

— the user was reading a scientific article;

— the user was watching a comedy.

Each test person performed each task for 10 min. Experiments with the first task were carried out on different days using two different devices, Mitsar EEG-201 (19 channels with a noise level of < 2 µV and a sampling rate (SN) of 250 Hz per channel) and Neuron-Spectrum-4/P from Neurosoft (21 channels with a noise level below 0.3 µV and an SN of 500 Hz per channel), in order to assess the variability of the EEG over time and its dependence on the installation and the device. Neuron-Spectrum data for each test person were recorded 7 times on different days.

EEG data in the states of intoxication, drowsiness and fatigue were recorded only under standard conditions. To bring a test person into a state of intoxication, the necessary dose of alcohol was calculated according to the modified Widmark formula [see 7, formula (1)], based on a target of 0.7 ‰, which corresponds to the second stage of intoxication according to the Federal Aviation Regulation (CFR) 91.17 classification. To induce drowsiness, the test persons were asked to take 2 tablets of Leonurus cardiaca 200 mg and to sit in a chair for 20 min in a quiet, dark room immediately before the EEG recording. To record the EEG in a fatigued state, the test persons performed intensive physical activity before the experiment; its minimum amount was determined by the Martinet method (20 squats in 30 s) and then varied according to the physical abilities of the person. For each subject, a 5-min EEG recording was made in each state. The recordings for each state were made on a separate day.

Estimation of the informational value of the EEG rhythms in terms of PPS

This section analyzes the Mitsar data, since only these data were recorded in different PPS. First, a band-reject filter was applied to the original signals to suppress interference from power lines, which operate at 50 Hz in Russia. EEG signals can show an "overtone effect", where interference is also observed at higher frequencies that are multiples of 50 Hz. Filtering was therefore performed in the 45-55, 95-105 and 145-155 Hz frequency bands.
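The mains-suppression step can be sketched as follows. This is a minimal illustration rather than the authors' filter: the source does not state the filter design, so the sketch assumes the simplest band-reject approach, zeroing the FFT bins inside each stop band. The function name `suppress_mains` and the synthetic test signal are hypothetical.

```python
import numpy as np

# Stop bands from the text, in Hz.
STOP_BANDS_HZ = [(45.0, 55.0), (95.0, 105.0), (145.0, 155.0)]

def suppress_mains(signal, fs, stop_bands=STOP_BANDS_HZ):
    """Zero the spectrum inside each stop band and return the filtered signal.

    Assumption: a crude FFT-domain band-reject filter, used only for illustration.
    """
    x = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    for low, high in stop_bands:
        spectrum[(freqs >= low) & (freqs <= high)] = 0.0
    return np.fft.irfft(spectrum, n=x.size)

# Synthetic example: a 10 Hz component plus 50 Hz mains interference.
fs = 500.0
t = np.arange(1000) / fs
clean = np.sin(2 * np.pi * 10.0 * t)
noisy = clean + 0.5 * np.sin(2 * np.pi * 50.0 * t)
filtered = suppress_mains(noisy, fs)
```

Stop bands that lie above the Nyquist frequency of a recording select no bins and are simply ignored by this sketch.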

The following EEG rhythms are distinguished by their frequency, duration, amplitude and waveform: delta (1-4 Hz), theta (4-8 Hz), alpha (8-14 Hz), beta (13-35 Hz), gamma (30-170 Hz), lambda (4-5 Hz), mu (7-13 Hz), kappa (8-12 Hz), tau (8-13 Hz), sigma (10-14 Hz). The first five are the main rhythms. Let us assess the informational value of the EEG rhythms in terms of the possibility of identifying the subject, the subject's PPS, and the task the subject is performing on a computer.
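Because the listed bands overlap, a single harmonic can belong to several rhythms. The lookup below is a small sketch of that mapping; the band limits are taken from the list above, while the inclusive treatment of the band edges and the helper name `rhythms_at` are assumptions made for illustration.

```python
# Rhythm bands in Hz, as listed in the text.
RHYTHM_BANDS_HZ = {
    "delta": (1, 4),
    "theta": (4, 8),
    "alpha": (8, 14),
    "beta": (13, 35),
    "gamma": (30, 170),
    "lambda": (4, 5),
    "mu": (7, 13),
    "kappa": (8, 12),
    "tau": (8, 13),
    "sigma": (10, 14),
}

def rhythms_at(freq_hz):
    """Return the names of all rhythms whose band contains freq_hz (edges inclusive)."""
    return [name for name, (low, high) in RHYTHM_BANDS_HZ.items()
            if low <= freq_hz <= high]

print(rhythms_at(11))  # an 11 Hz harmonic falls into five overlapping bands
print(rhythms_at(50))  # only the gamma band covers 50 Hz
```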

Spectrograms provide a sufficiently complete representation of the signal. In this study they were calculated with a short-time (windowed) fast Fourier transform using a rectangular window (window duration 1 s, window overlap 50%). If the amplitude $a$ of each harmonic with frequency $\nu$ is taken as a feature, a probability density function (PDF) of this harmonic can be built for each image class (e.g. when identifying a PPS, the class for each state is formed from one-second EEG images of the subjects who were in that state).
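A minimal sketch of such a spectrogram is given below, assuming a rectangular 1 s window with 50% overlap as stated above. The amplitude scaling (division by the window length), the function name `amplitude_spectrogram` and the synthetic 23 Hz test tone are assumptions, since the source does not give these details.

```python
import numpy as np

def amplitude_spectrogram(signal, fs, window_s=1.0, overlap=0.5):
    """Short-time FFT with a rectangular window.

    Returns (freqs, amplitudes) where amplitudes has shape (n_frames, n_bins).
    """
    x = np.asarray(signal, dtype=float)
    win = int(round(window_s * fs))
    step = int(round(win * (1.0 - overlap)))
    starts = range(0, x.size - win + 1, step)
    frames = np.stack([x[s:s + win] for s in starts])  # no tapering: rectangular window
    amplitudes = np.abs(np.fft.rfft(frames, axis=1)) / win
    freqs = np.fft.rfftfreq(win, d=1.0 / fs)
    return freqs, amplitudes

# Synthetic 5 s recording at 250 Hz containing a single 23 Hz harmonic.
fs = 250
t = np.arange(5 * fs) / fs
freqs, amps = amplitude_spectrogram(np.sin(2 * np.pi * 23.0 * t), fs)
peak_bin = int(np.argmax(amps[0]))
```

Each column of `amps` then holds the amplitudes of one harmonic across the one-second images, which is the sample a per-class PDF would be estimated from.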

Fig. 2. Examples of determining the error probability in the classification of EEG images by one feature: a, subject identification under standard conditions in the "norm" state; b, assessment of the subject EEG variability under standard conditions in the "norm" state; c, PPS identification under standard conditions; d, identification of the task performed by the subject in the "norm" state (x-axes: amplitude of a harmonic of the given signal, µV)

The area of intersection of the PDFs of a particular feature for two classes (with numbers $j$ and $i$) is approximately equal to the probability of error $Er(\nu)_{j,i}$ of the two-class identification of images by this feature (Fig. 2, a-d). The probability of correct classification is numerically equal to $i(\nu)_{j,i} = 1 - Er(\nu)_{j,i}$. The accuracy of this assessment depends on the size of the sample from which the PDFs were built. The informational value of the feature for the majority of classes can be judged by the average estimated probability of correct classification $I_\nu = m(i(\nu)_{j,i})$ over all pairs of classes (the higher $I_\nu$ is, the more informational value the frequency has). As a first approximation, the informational value of an EEG rhythm for identifying a subject can be calculated as the average $I = m(I_\nu)$ of the estimates in the corresponding frequency range. In addition, it must be borne in mind that the amplitude spectra of EEG signals may vary over time and with the installation of the electrodes. These changes can be estimated through the probability densities $P_{j,\mathrm{day\_1}}(a_{\nu\,\mathrm{Hz}})$ and $P_{j,\mathrm{day\_2}}(a_{\nu\,\mathrm{Hz}})$, which are derived from the EEG data of subjects recorded in identical PPS but on different days (see Fig. 2, b). Therefore, the informational value should be corrected to take into account the average probability of mismatch between the densities $P_{j,\mathrm{day\_1}}(a_{\nu\,\mathrm{Hz}})$ and $P_{j,\mathrm{day\_2}}(a_{\nu\,\mathrm{Hz}})$ over all test persons in the "norm" state (hereinafter referred to as $Er_{norm}$).
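The overlap-based estimate described above can be sketched numerically. The code assumes that each PDF is approximated by a histogram over a shared set of bins and that the error is taken as the full overlap area, following the text; the bin count and the function names `overlap_error` and `informational_value` are hypothetical.

```python
import itertools
import numpy as np

def overlap_error(samples_a, samples_b, bins=50):
    """Area under min(PDF_a, PDF_b), used here as the two-class error estimate Er."""
    low = min(np.min(samples_a), np.min(samples_b))
    high = max(np.max(samples_a), np.max(samples_b))
    edges = np.linspace(low, high, bins + 1)
    pdf_a, _ = np.histogram(samples_a, bins=edges, density=True)
    pdf_b, _ = np.histogram(samples_b, bins=edges, density=True)
    return float(np.sum(np.minimum(pdf_a, pdf_b) * np.diff(edges)))

def informational_value(class_samples, bins=50):
    """Average of i = 1 - Er over all pairs of classes for one harmonic."""
    pairs = itertools.combinations(class_samples, 2)
    return float(np.mean([1.0 - overlap_error(a, b, bins) for a, b in pairs]))

# Toy amplitudes of one harmonic for two well-separated classes.
class_1 = [0.0, 1.0, 2.0, 3.0]
class_2 = [10.0, 11.0, 12.0, 13.0]
value = informational_value([class_1, class_2])
```

Averaging `informational_value` over the harmonics of a frequency band would then give the per-rhythm estimate $I$; the histogram resolution matters for small samples, which matches the remark that the estimate depends on sample size.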

Figure 3 shows graphs of the informational values of EEG rhythms for the task of user identification (under standard conditions of EEG recording), corrected for the dependence of the EEG on the installation and the subject's PPS: $I \leftarrow I \cdot Er_{PPS} \cdot Er_{norm}$, where $Er_{PPS}$ is the average error probability of the two-class identification in which the first class is the "norm" state and the second is any other state. This product reflects the probability of correct identification of the subject when the installation and the PPS do not match.

Fig. 3. Informational value of rhythms in multiclass subject identification under standard conditions in the "norm" state with respect to EEG variability due to changing mounting and a psychophysiological state (x-axis: frequency, Hz; one curve per electrode: Fp1, Fp2, F7, F3, Fz, F4, F8, T3, T4, T5, T6)

Figure 4, a-f shows graphs of the informational values of EEG rhythms for the tasks of identifying the PPS and the subject's activity. This assessment takes into account the dependence of the informational value on the individual features of the subjects' EEG and on the installation.

Taken individually, none of the rhythms is informative enough for highly accurate automatic identification of either the subject's identity or the PPS. However, some results should be noted. The most informative signal for EEG identification is recorded in the rear right temporal zone (T6). In this area, the most informative rhythm is the kappa rhythm, followed by the mu and tau rhythms. The 7-14 Hz frequency range in this area is highly informative when recognizing types of activity that require concentration (watching films, reading). The most informative rhythms ($I \approx 0.395$) are the lambda and theta rhythms in the right frontal area (Fp2) when recognizing the subject's intense activity related to typing on a computer, and the sigma rhythm recorded in the rear left temporal area (T5) when identifying a subject. High frequencies contain less information about the subject's individual EEG characteristics.

Based on the analysis carried out, it can be concluded that it is not worth excluding any frequencies from the input data while building a classifier, since all rhythmic oscillations carry parts of information for certain classes of images (states or subjects).

The mutual correlation between the amplitudes of harmonics with different frequencies has been assessed (Fig. 5, a-d).

Figure 5, d demonstrates that on average (for all subjects and all PPS) approximately 50% of the harmonics of the EEG signals have a noticeable or high mutual correlation (over 0.3). The nature of the correlations varies both from subject to subject and from PPS to PPS, as shown in Fig. 5. Pattern recognition methods that take the correlations between features into account should therefore be used (e.g. the naive Bayes classification scheme is ineffective in this case). In this respect, convolutional neural networks are preferable because they can take into account the way the spectrum changes over time as well as the correlations between different rhythms and electrodes.
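The share of strongly correlated harmonics can be estimated as in the sketch below. It assumes that the amplitudes are arranged as a matrix with one row per EEG image and one column per harmonic, and that "over 0.3" refers to the absolute value of the pair correlation coefficient; both points, and the function name `high_correlation_share`, are assumptions not stated in the source.

```python
import numpy as np

def high_correlation_share(amplitudes, threshold=0.3):
    """Share of harmonic pairs whose |pair correlation| exceeds the threshold.

    amplitudes: array of shape (n_images, n_harmonics).
    """
    r = np.corrcoef(np.asarray(amplitudes, dtype=float), rowvar=False)
    upper = r[np.triu_indices_from(r, k=1)]  # each pair counted once
    return float(np.mean(np.abs(upper) > threshold))

# Toy data: harmonic 1 is a linear function of harmonic 0, harmonic 2 is uncorrelated.
toy = np.array([
    [1.0, 3.0, 1.0],
    [2.0, 5.0, -1.0],
    [3.0, 7.0, -1.0],
    [4.0, 9.0, 1.0],
])
share = high_correlation_share(toy)
```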

Identification of EEG images

Two series of experiments were carried out.

1. Identification of a subject (from a closed set). The data generated by this study (Mitsar, Neuron-Spectrum) and the Physionet data set (64-channel EEGs with a duration of one minute, 109 test subjects in the "norm" state recorded under standard conditions with a sampling rate of 160 Hz) were used. The Physionet data set is one of the most representative, as it includes many test subjects and is often used to compare EEG classification methods [8]. Training was carried out on EEGs recorded in the "norm" state on one or more days, and testing was carried out by cross-checking on data not used in training (in the "norm" state or other PPS).

2. Identification of the PPS and the activity of the subjects (from a closed set). Only the Mitsar data set was used (30 subjects). The training sample was formed from all EEG data of 25 subjects; data from the other subjects (not included in the training sample) were used for testing.

Fig. 4. The informational value of rhythms in the two-class identification of psychophysiological state and the tasks performed by the subject, where the first class is the "norm" state, and the second is: a, intoxication; b, drowsiness; c, fatigue; d, watching a movie; e, typing; f, reading (x-axis: frequency, Hz; y-axis: I; one curve per electrode: Fp1, Fp2, F7, F3, Fz, F4, F8, T3, T4, T5, T6)

The EEG records were divided (with a 50% window overlap) into shorter fragments: 2.5 s each (for Neuron-Spectrum data), 5 s each (for Mitsar data) and 2 s each (for Physionet data), as shown in Fig. 6. Each fragment obtained is an EEG image.
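The fragmentation can be sketched as a sliding window over sample indices. The fragment lengths and the 50% overlap come from the text; the helper name `split_into_fragments` and the 5000-sample example record are illustrative assumptions.

```python
def split_into_fragments(n_samples, fs, fragment_s, overlap=0.5):
    """Return (start, stop) sample indices of overlapping fragments of a record."""
    size = int(round(fragment_s * fs))
    step = int(round(size * (1.0 - overlap)))
    return [(start, start + size)
            for start in range(0, n_samples - size + 1, step)]

# Neuron-Spectrum settings: 500 Hz sampling, 2.5 s fragments, 50% overlap.
fragments = split_into_fragments(n_samples=5000, fs=500, fragment_s=2.5)
for index, (start, stop) in enumerate(fragments, start=1):
    print(f"image {index}: samples {start}..{stop - 1}")
```

With these settings each image holds 1250 samples and consecutive images start 625 samples apart.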

Fig. 5. Histograms of the relative frequencies of the pair correlation coefficients between the amplitudes of different harmonics (with frequencies from 1 to 128 Hz): a, for the test subject 1 in the "norm" state; b, for the test subject 2 in the "norm" state; c, for the test subject 1 in a state of intoxication; d, for all subjects in all states (x-axis: coefficient of pair correlation between the amplitudes of harmonics of EEG signals)

Fig. 6. Division of EEG into fragments of 2.5 s (Neuron-Spectrum data) (x-axis: sample number; channels: Fp1, Fp2, F3, F4, O1, O2, Fpz, Fz, Cz, Oz)

Images were submitted to the input of the artificial neural network (ANN) in two variants: as initial signals (IS) and as spectrograms (SG). The spectrograms were calculated with a window of 64 samples and a step of 16 samples and then normalized to the minimum and maximum amplitude values of all signals for all subjects (to bring them to a single amplitude interval [0; 1]).
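As a quick consistency check, the window and step above can be related to the size of one spectrogram image. The sketch assumes that half of the window length is kept as frequency bins (one bin of the one-sided spectrum dropped), which is not stated in the source; the helper names are hypothetical.

```python
def spectrogram_shape(n_samples, window=64, step=16):
    """Return (frequency bins, time frames) of one spectrogram image.

    Assumption: window // 2 frequency bins are kept per frame.
    """
    n_frames = (n_samples - window) // step + 1
    n_bins = window // 2
    return n_bins, n_frames

def min_max_normalize(values, global_min, global_max):
    """Scale amplitudes to [0, 1] using extremes taken over all signals and subjects."""
    span = global_max - global_min
    return [(value - global_min) / span for value in values]

# Mitsar image: 5 s at 250 Hz = 1250 samples per channel.
mitsar_shape = spectrogram_shape(5 * 250)
# Physionet image: 2 s at 160 Hz = 320 samples per channel.
physionet_shape = spectrogram_shape(2 * 160)
scaled = min_max_normalize([2.0, 4.0, 6.0], global_min=2.0, global_max=6.0)
```

Under this assumption a Mitsar image has 32 frequency bins and 75 time frames per channel.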

Many ANN architectures were formed for both IS and SG processing. Each architecture was built with respect to the peculiarities of a particular data set: the sampling frequency, the number of electrodes, and the number of identifiable classes (test subjects). The process of creating a network model from an architecture consisted of several stages. At the architecture design stage, the neural network structure (the number of layers, the number of neurons in layers) was selected and the hyperparameters were tuned. The model was trained 10 times, and each time the EEG data were randomly divided into training and test samples (in the proportions originally set). During training, an initial accuracy assessment was made on the validation sample (a subset of the test sample comprising 5-10% of the test examples). The primary accuracy estimates for each model were averaged. Next, the best models (with an accuracy margin of more than 10%) were fully tested (using the full test sample), after which the average accuracy estimates Q for each model were determined.
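The repeated random-split protocol can be outlined as below. This is only a sketch of the averaging scheme: the 80/20 split, the fixed seed and the callback `train_and_score` (standing in for training a network and returning its accuracy) are hypothetical, since the source gives neither the proportions nor the training code.

```python
import random
from statistics import mean

def repeated_holdout(samples, train_and_score, train_fraction=0.8, repeats=10, seed=0):
    """Average the score of train_and_score over several random train/test splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        shuffled = list(samples)
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_fraction)
        scores.append(train_and_score(shuffled[:cut], shuffled[cut:]))
    return mean(scores)

# Dummy callback: reports the share of data that ended up in the test sample.
def dummy_score(train, test):
    return len(test) / (len(train) + len(test))

average_q = repeated_holdout(list(range(10)), dummy_score)
```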

The networks were assembled from BLOCKS (from two to four per network). Each BLOCK consisted of two convolution layers (CL) with ReLU activation functions, one batch normalization (BN) layer and a dropout layer. Categorical cross-entropy was used as the error function [9]. Each network also included an input layer (IL) and two fully connected layers (FL).

Table 1. Configuration of one of the promising ANNs for analysis of spectrograms

| Layer type | Layer parameters |
|---|---|
| Input layer | Dimension = 11, 32, 75 ([channel][frequency][time]) |
| BLOCK_2d, Convolutional 2D | Number of filters = 10, convolution window = 3 × 3, stride = 2 × 2 |
| BLOCK_2d, Convolutional 2D | Number of filters = 10, convolution window = 3 × 3, stride = 2 × 2 |
| BLOCK_2d, Batch normalization | |
| BLOCK_2d, Dropout | Rate = 0.125 |
| BLOCK_2d, Convolutional 2D | Number of filters = 20, convolution window = 3 × 3, stride = 2 × 2 |
| BLOCK_2d, Convolutional 2D | Number of filters = 20, convolution window = 3 × 3, stride = 2 × 2 |
| BLOCK_2d, Batch normalization | |
| BLOCK_2d, Dropout | Rate = 0.175 |
| BLOCK_2d, Convolutional 2D | Number of filters = 30, convolution window = 2 × 1, stride = 1 × 1 |
| BLOCK_2d, Convolutional 2D | Number of filters = 30, convolution window = 1 × 5, stride = 1 × 1 |
| BLOCK_2d, Batch normalization | |
| BLOCK_2d, Dropout | Rate = 0.25 |
| Fully connected layer | Number of neurons = 30, activation function: sigmoid |
| Fully connected layer | Number of neurons = number of classes, activation: softmax |
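To see how the convolution windows and strides of Table 1 shrink an 11 × 32 × 75 input, the feature-map sizes can be traced layer by layer. The padding mode is not given in the source; this sketch assumes "same" padding in the first two blocks and "valid" padding in the third, a combination under which the last convolution reduces each map to a single value. The names `conv_output`, `LAYERS` and `trace_shapes` are hypothetical.

```python
import math

def conv_output(size, window, stride, padding):
    """Output length of a convolution along one axis."""
    if padding == "same":
        return math.ceil(size / stride)
    return (size - window) // stride + 1  # "valid" padding

# (filters, window (frequency, time), stride (frequency, time), assumed padding)
LAYERS = [
    (10, (3, 3), (2, 2), "same"),
    (10, (3, 3), (2, 2), "same"),
    (20, (3, 3), (2, 2), "same"),
    (20, (3, 3), (2, 2), "same"),
    (30, (2, 1), (1, 1), "valid"),
    (30, (1, 5), (1, 1), "valid"),
]

def trace_shapes(freq_bins=32, time_frames=75, layers=LAYERS):
    """Return (filters, frequency, time) after each convolution layer."""
    shapes = []
    f, t = freq_bins, time_frames
    for filters, window, stride, padding in layers:
        f = conv_output(f, window[0], stride[0], padding)
        t = conv_output(t, window[1], stride[1], padding)
        shapes.append((filters, f, t))
    return shapes

for layer_shape in trace_shapes():
    print(layer_shape)
```

Under these assumed paddings the convolutional part ends with 30 values, which would match the 30 neurons of the first fully connected layer; other padding choices give different sizes.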

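Table 2. The results of subject identification by EEG (cells merged in the original layout are left empty; they continue the value of the nearest filled cell above)

Neuron-Spectrum (30 classes, transit time (image size) 2.5 s, sampling rate 500 Hz)

| Input, quantity of channels | ANN structure | Quantity of epochs, batch size | Training sample per 1 man | Test sample (PPS) | Q, % |
|---|---|---|---|---|---|
| IS, 10 | IL + 3BLOCKS_1d + 2FL | 20, 50 | Day 1 (4 min), day 2 (1 min) | Norm, day 2 | > 99.99 |
| IS, 2 (Fp1, Fp2) | | | | | 95 |
| SG, 10 | IL + 3BLOCKS_2d + FL + BN + FL | | | | > 99.99 |
| SG, 2 (Fp1, Fp2) | | | | | 96.47 |

Mitsar (30 classes, transit time (image size) 5 s, sampling rate 250 Hz)

| Input, quantity of channels | ANN structure | Quantity of epochs, batch size | Training sample per 1 man | Test sample (PPS) | Q, % |
|---|---|---|---|---|---|
| IS, 11 | IL + 3BLOCKS_1d + 2FL | 100, 20 | Day 1 (5 min) | Norm | 94 |
| | | 100, 20 | | Intoxication, drowsiness, fatigue | 28.3 |
| SG, 11 | IL + 3BLOCKS_2d + 2FL (Table 1) | 20, 20 | | Norm | 98.5 |
| | | 20, 20 | | Intoxication, drowsiness, fatigue | 41.8 |
| | | 25, 20 | Days 1 & 2 (2.5 min each) | | 64 |
| | | 35, 30 | Days 1, 2 & 3 (2.5 min each) | | 78.7 |
| | | 50, 30 | 7 days (2.5 min each) | | 97.59 |

Physionet (109 classes, transit time (image size) 2 s, sampling rate 160 Hz)

| Input, quantity of channels | ANN structure | Quantity of epochs, batch size | Training sample per 1 man | Test sample (PPS) | Q, % |
|---|---|---|---|---|---|
| IS, 64 | IL + 4BLOCKS_1d + FL + BN + FL | 70, 50 | 40 s | Norm | 96.97 |
| SG, 64 | IL + 2BLOCKS_2d + FL + BN + FL | 100, 50 | | | 98.5 |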

Table 3. The results of the identification of the subject's PPS and activity by the EEG (cells merged in the original layout are left empty; they continue the value of the nearest filled cell above)

| Input, quantity of channels | ANN structure | Quantity of epochs, batch size | Classes | Q, % |
|---|---|---|---|---|
| IS, 11 | IL + 3BLOCKS_1d + 2FL | 100, 50 | 4 (norm, intoxication, drowsiness, fatigue) | 35 |
| SG, 11 | IL + 3BLOCKS_2d + 2FL (Table 1) | 40, 50 | | 41.5 |
| IS, 11 | IL + 3BLOCKS_1d + 2FL | 100, 50 | 4 (norm, typing, reading, watching a movie) | 56.68 |
| SG, 11 | IL + 3BLOCKS_2d + 2FL (Table 1) | 40, 50 | | 59.21 |
| SG, 11 | IL + 3BLOCKS_2d + 2FL (Table 1) | 20, 25 | 2 (norm, fatigue) | 94 |
| | IL + 3BLOCKS_2d + FL + BN + FL | 20, 25 | 2 (norm, reading) | 92 |
| | | 20, 25 | 2 (norm, drowsiness) | 84 |
| | | 20, 25 | 2 (norm, intoxication) | 82 |
| IS, 11 | IL + 3BLOCKS_1d + FL + BN + FL | 50, 25 | | 72 |
| SG, 11 | IL + 3BLOCKS_2d + 2FL (Table 1) | 20, 25 | 2 (norm, movie) | 98.72 |
| | | 20, 25 | 2 (norm, typing) | 77.8 |

Thus, each neural network included from six to ten hidden layers (CL and FL). The parameters of the convolutions differed in various network implementations. One-dimensional convolutions were used for the IS analysis (time series analysis [9]), and two-dimensional convolutions were used for SG (image analysis [9]). Table 1 provides a description of one of the CNN architectures used in the experiment. The most representative results, as well as consolidated data on the parameters of the experiment (a number of training epochs, a mini-batch size, layers used, description of samples, etc.), are presented in Tables 2 and 3 (these tables describe CNN architectures in a shorter form).

Analysis of the obtained results for the identification of the EEG and their comparison with previous achievements

A survey of the literature shows that traditional methods of signal analysis (frequency filtering, removal of artifacts, reduction of the feature space dimension by principal component analysis (PCA)) and of pattern recognition (the k-nearest neighbors method (k-NN), the support vector machine (SVM), the C4.5 decision tree algorithm, etc.) are used more often to solve the problems under consideration [2, 4]. Artificial neural networks (ANNs) are also used in EEG analysis, with CNNs giving better results in some tasks (emotion recognition in particular) [10].

The great majority of known studies treat identity recognition and PPS identification as independent problems [2, 4]. Little data is available on the robustness of identification results when subjects are in different PPS during the training and testing phases. Practical use of EEG identification requires that the results remain consistent when the installation changes and that the state of the subject is identified.

The experiment has shown that the accuracy of EEG identification drops to unsatisfactory levels if the subject's PPS during training and testing do not coincide. This indicates a high variability of the EEG depending on the subject's state (and possibly the installation). In previous studies this aspect has either not been taken into account or has been taken into account only indirectly (for example, by normalizing signals to the alpha rhythm [8], which does not guarantee their independence from PPS). It is proposed to train the network on EEG data recorded from the subject on different days. This significantly increased the reliability of identification, including when the PPS changes, as follows from the results obtained: training with the 2-day data increases the accuracy by 22.2%, with the 5-day data by 53.5% (see Table 2).

When identifying an individual without regard to PPS (see Table 2), high accuracy was obtained: 98.5% for Mitsar (30 classes) and Physionet (109 classes). With the Neuron-Spectrum-4/P device, which has a higher sampling rate and a lower internal noise level, the accuracy exceeded 99.99% (no errors were recorded). Notably, when only two frontal electrodes were used, the accuracy on this device was 96.47%.

The percentage of correct decisions for two-class identification of PPS (see Table 3) ranged from 82 to 94%, and for two-class identification of tasks performed by a user on a computer, from 77.8 to 98.72%. For multiclass identification, the accuracy is significantly lower (approximately twofold). The most difficult task to identify is real-time typing on the keyboard (for this task, the accuracy was 77.8%).

Based on the results, it can be concluded that recognition accuracy and training rates increase (fewer epochs are required) if spectrograms are used as input data (see Tables 2 and 3).

Let us give a brief summary of the achieved accuracy rates of the subject's recognition and the PPS identification by EEG (Tables 4 and 5). The results of the analytical study on these issues are described in more detail in [2, 4].

As can be seen from Table 4, convolutional networks achieve higher EEG identification accuracy with much shorter transit times. The result achieved in this paper surpasses the previous results.

The accuracy achieved in this paper in detecting drowsiness from the EEG is comparable to that obtained by other researchers; for intoxication, the result is on average slightly lower. However, the results for recognizing fatigue and the tasks performed by the user on the computer are quite high. No directly comparable results have been found in the literature. The paper [3] should be mentioned: it presents several hypotheses about the possibility of detecting a subject's "risky behavior" (the possibility of performing dangerous or illegal actions) from EEG data, and a number of experiments have been carried out to test them. Sixty-two volunteers

Table 4. Comparison of the results of user recognition by EEG

| Number of test persons | Methods | Transit time, s | Accuracy, % |
|---|---|---|---|
| 45 | Analysis of the activity of the brain areas responsible for reading and recognizing words based on artificial neural networks. A combination of 3 classifiers was used: cross-correlation, "naive" Bayes and feed-forward ANN. Single-channel EEGs were used | 50 | 97 [11] |
| 50 | Evaluation of individual brain reactions to various stimuli: primary visual perception, recognition of familiar faces, tastes. A time series cross-correlation (fragments of EEG signals) was used as a classifier. Testing was repeated to account for the effect of EEG variability on the result over time. PPS was not monitored. The test subjects were placed in a chamber shielded from radio frequencies | 27 | No error recorded [12] |
| 15 | The application of an algorithm of generating 256-bit EEG-based keys using P300 evoked potential and two-layer neural networks trained in accordance with GOST R 52633.5.2011 (EEG authentication, one-to-one comparison). To be authenticated the user made a certain movement with his eyes (left, right, up, down, etc.) | Over 10 | 10^-10 [13] |
| 80 | Conversion of the test subjects' EEG in the "norm" state into a cryptographic key based on the fuzzy extractor method (EEG authentication, one-to-one comparison). The effect of alcohol ingestion on accuracy was studied | | Before alcohol: 0.9742; after: 0.9389 [14] |
| 109 (Physionet) | The EEG was recorded under standard conditions. The EEG was normalized by level, a bandpass filter (1-50 Hz), STFT (Hamming window) and the Fisher linear discriminant classifier were applied. The accuracy is higher at the moment of relaxation of the test subjects (alpha rhythm determines the optimal moment for authentication) | 10 | 95.3-97.2 (64 electrodes); 93 (1 electrode) [8] |
| 109 (Physionet) | The results obtained. Identification of images using CNN (EEG spectrogram analysis), one-to-many comparison. The impact of different PPSs was taken into account | 2 | 98.5 (PPS is norm) |
| 30 (Neuron-Spectrum) | The results obtained (same method) | 2.5 | No errors recorded (PPS is norm) |
| 30 (Mitsar) | The results obtained (same method) | 5 | 98.5 (norm); 97.59 (modified) |

■ Table 5. Comparison of results of recognizing subject's states and actions by EEG

| Test persons, electrodes | Methods | Accuracy, % |
|---|---|---|
| Sleep stage / drowsiness | | |
| 29 test persons, Fp1, A1 | SVM | 72.7 [15] |
| 10 test persons, 19 electrodes: Fp1-2, F3-4, C3-4, P3-4, O1-2, F7-8, T3-6, Fz, Cz, Pz | ANN | 83.3 [7] |
| 12 test persons, 32 electrodes | SVM, ANN, random tree and k-NN | 93-97 [16] |
| 6 test persons, 32 electrodes | SVM, k-NN | 95 [17] |
| - | Hurst method | 52.2 [18] |
| - | Higher-order spectrum analysis | 88.7 [19] |
| - | Fuzzy logic, cluster analysis, Euclidean distance | 80 [20] |
| 30 test persons, 11 electrodes: Fp1, Fp2, Fz, F3-8, T3-6 | Achieved result (drowsiness recognition) | 84 |
| Intoxication stages (the first stage is soberness) | | |
| Alcohol ingestion: 50 ml, 40% ABV, 3 times a day (3 stages), electrodes: AF3-4, F3-8, FC5-6, P7-8, T7-8, O1-2 | The signal is divided into fragments of 1 s each with a step of 0.5 s. 11 features are extracted from each fragment | 89.95 (4 stages) [21] |
| The test persons ingested beer (750 and 1500 ml), 1 electrode Fp1 | ANN | 92.3 and 59.2 (2 and 7 stages) [22] |
| 50 test persons in a "norm" state and 50 intoxicated persons, 64 electrodes | ANN; training on 40 subjects for each class, testing on 10 subjects | 95 (1200 training epochs) [23] |
| 40 test persons in a "norm" state and 40 intoxicated persons | SVM; the training sample contains 20 subjects per class, the test sample contains 20 subjects. The EEG was processed with a filter (0.5-50 Hz) | 98.83 [24] |
| 50 intoxicated persons and in a "norm" state | Features are the autocorrelation coefficients of the Yule-Walker equations. The training sample contains 40 users, the test sample contains 10 subjects | 95 [25] |
| 1341 visual records of evoked potentials (EP) (1129 for training and 212 for testing) | Power spectrum of EEG signals, average and dispersion of EP reports, PCA, fuzzy output | 98.11 [26] |
| 10 test persons in the "norm" state and intoxicated, 64 electrodes (sampling rate is 256 Hz) | PCA, C4.5, k-NN, SVM | 79.3 (1 electrode) and 96.8 (64 electrodes) [27] |
| 30 test persons, 11 electrodes: Fp1, Fp2, Fz, F3-8, T3-6 | Obtained results (recognition of intoxication) | 82 |
| Other states (obtained results) | | |
| 30 test persons, 11 electrodes: Fp1, Fp2, Fz, F3-8, T3-6 | Movie | 98.72 |
| | Fatigue | 94 |
| | Reading scientific articles | 92 |
| | Typing | 77.8 |

(16 women and 46 men) were invited to take part in the experiments. The EEG was registered at 128 scalp areas. It was found that risky behavior is effectively predicted by the EEG through event-related electrical potentials.

Conclusion

The tasks of identifying an individual and identifying the PPS by EEG are closely connected. The EEG contains information on both the individual characteristics of the subject's brain and the subject's state, as well as states that depend on his or her activity in real time. This paper estimates the informational value of EEG rhythms for recognizing a subject and a subject's PPS (taking into account the variability of the EEG over time, changes in the installation, and dependence on the PPS), recorded by the electrodes Fp1, Fp2, Fz, F3-8, T3-6 (in accordance with the "10-20" scheme). The most informative signal for personal identification is recorded in the rear right (especially kappa, mu and tau rhythms) and left (sigma rhythm) temporal zones. The frequency range of 7-14 Hz in the posterior right temporal area is also informative when recognizing types of activity that require concentration (watching movies, reading). When recognizing typing on a computer, the lambda and theta rhythms in the right frontal lobe are the most informative.
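One simple way to quantify how much of a channel's activity lies in a given frequency range, such as the 7-14 Hz band mentioned above, is the share of spectral power in that band. The sketch below is only an illustration under stated assumptions: the function name `band_power_fraction`, the 125 Hz sampling rate, and the synthetic test signal are hypothetical, and the paper's own statistical assessment of rhythm informativeness may be computed differently.

```python
import cmath
import math


def band_power_fraction(signal, fs, f_lo, f_hi):
    """Fraction of the non-DC spectral power that falls within [f_lo, f_hi] Hz."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the DC offset
    in_band = 0.0
    total = 0.0
    for k in range(1, n // 2 + 1):  # positive-frequency bins only
        freq = k * fs / n
        coeff = sum(centered[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        power = abs(coeff) ** 2
        total += power
        if f_lo <= freq <= f_hi:
            in_band += power
    return in_band / total if total > 0 else 0.0


# Hypothetical signal: a 10 Hz component (amplitude 1) plus a 30 Hz component
# (amplitude 0.5), 2 s long, sampled at an assumed 125 Hz
FS = 125
signal = [math.sin(2 * math.pi * 10 * t / FS)
          + 0.5 * math.sin(2 * math.pi * 30 * t / FS)
          for t in range(2 * FS)]
share = band_power_fraction(signal, FS, 7.0, 14.0)
```

For this synthetic signal the 10 Hz component carries four times the power of the 30 Hz component, so about 80% of the power falls in the 7-14 Hz band. Computing such a value per electrode and per band gives one feature that could be compared across subjects or activities.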

In the experiments, convolutional neural networks showed a higher accuracy of EEG identification (98.5-99.99% for 10 or more electrodes) with a shorter transit time (from 2 to 5 s) than the previously achieved level. Notably, high accuracy (96.47%) is retained when only two frontal sensors are used, which makes it possible to use "dry" electrodes that come into direct contact with the skin.
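If the transit time is read as the length of the EEG fragment needed for one identification decision (an interpretation assumed here), a recording can be cut into fragments of that length before classification. The sketch below shows this segmentation; the 250 Hz sampling rate, the 1 s step, and the function name `split_into_windows` are hypothetical values chosen for illustration.

```python
def split_into_windows(samples, fs, transit_time_s, step_s):
    """Cut a recording into fragments of `transit_time_s` seconds.

    `fs` is the sampling rate in Hz; consecutive fragments start
    `step_s` seconds apart, so they overlap when step_s < transit_time_s.
    """
    win = int(round(transit_time_s * fs))
    step = int(round(step_s * fs))
    return [samples[i:i + win]
            for i in range(0, len(samples) - win + 1, step)]


# Hypothetical 10 s single-channel recording at an assumed 250 Hz
FS = 250
recording = [0.0] * (10 * FS)
windows = split_into_windows(recording, FS, transit_time_s=2.0, step_s=1.0)
```

With these assumed values, each fragment holds 500 samples and a 10 s recording yields 9 overlapping fragments, each of which could be converted to a spectrogram and classified separately.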

EEG-based identification is usable in practice only if its results remain consistent when the installation and the state of the identified subject change. Without accounting for the PPS, EEG-based identification results are of limited value. Our experiments have shown a significant drop in identification accuracy if the subject was in different PPSs during the training and testing phases. Previous studies paid insufficient attention to this aspect, and the tasks of identifying (authenticating) a subject and recognizing his or her PPS by EEG were treated as independent. In this work, it has been suggested that the network should be trained on EEG data recorded by the subject on different days (without control of the PPS). The results we have obtained indicate that this could significantly improve the reliability of identification, including the cases when the subject's PPS changes. If systems are trained on EEG data recorded on several different days, the recognition results become almost robust to the condition of the subjects.

The accuracy of PPS recognition by EEG achieved in this paper is comparable to the level obtained by other researchers. The percentage of correct decisions for two-class identification ranges from 82 to 94% (depending on the PPS detected: "norm", alcohol intoxication, drowsiness, physical fatigue). For multiclass identification, the accuracy is several times lower. However, it is worth noting that this accuracy is achieved with 25 test subjects in the training sample, which indicates a high potential for convolutional networks in this task.

We also obtained the following estimates of the accuracy of two-class identification of tasks performed by the user on the computer (where the first class is inactivity/rest, with the EEG recorded under standard conditions, and the second is one of the following tasks): reading scientific texts, 92%; watching an entertainment video, 98.72%; typing a text on the keyboard, 77.8%. For multiclass identification, the accuracy drops to 59.21%. These results are obtained for the first time, and they can be used when it is necessary to automatically monitor the activity of subjects without the ability to observe them directly (for example, when taking examinations remotely).

Financial support

This work was supported by the Russian Foundation for Basic Research, No. 18-41-550002.

References

1. Craik A., He Y., Contreras-Vidal J. L. Deep learning for electroencephalogram (EEG) classification tasks: a review. Journal of Neural Engineering, 2019, vol. 16, no. 3. doi:10.1088/1741-2552/ab0ab5

2. Sulavko A. E., Kuprik A. I., Starkov M. A., Stadnikov D. G. Analysis of methods for recognizing human images by the characteristics of electroencephalograms (Review). Information Security Questions, 2018, no. 4, pp. 36-46 (In Russian).

3. Vance A., Anderson B. B., Kirwan B. C., Eargle D. Using measures of risk perception to predict information security behavior: Insights from electroencephalography (EEG). Journal of the Association for Information Systems, 2014, vol. 15, no. 10, pp. 679-722. doi:10.17705/1jais.00375

4. Nigrey A. A., Zhumazhanova S. S., Sulavko A. E. Methods for automatic assessment of the psychophysiological state of a person according to the parameters of electroencephalograms (review). Biomedical Radioelectronics, 2020, no. 5, pp. 5-18 (In Russian).

5. Yazdani A., Setarehdan S. K. Classification of EEG signal correlated with alcohol abusers. Proceedings of the ISSPA Conference Sharjah UAE, 2007. doi:10.1109/ISSPA.2007.4555309

6. Bogomolov A. V., Gridin L. A., Kukushkin Yu. A., Ushakov I. B. Diagnostika sostoyaniya cheloveka: matematicheskie podxody [Diagnosis of human condition: mathematical approaches]. Moscow, Medicina Publ., 2003. 464 p. (In Russian).

7. Schmitz A., Grillon C. Assessing fear and anxiety in humans using the threat of predictable and unpredictable aversive events (the NPU-threat test). Nature Protocols, 2012, pp. 527-532. doi:10.1038/nprot.2012.001

8. Suppiah R., Vinod A. P. Biometric identification using single channel EEG during relaxed resting state. IET Biometrics, 2018, vol. 7, pp. 342-348. doi:10.1049/iet-bmt.2017.0142

9. Deng L., Yu D. Deep learning: methods and applications. Foundations and Trends in Signal Processing, 2014, vol. 7, no. 3-4, pp. 197-387.

10. Yang H., Han J., Min K. A multi-column CNN model for emotion recognition from EEG signals. Sensors, 2019, vol. 19, iss. 21. doi:10.3390/s19214736

11. Armstrong B., Blondet M. R., Khalifian N., Kurtz K. J., Zhanpeng Jin, Laszlo S. Brainprint: Assessing the uniqueness, collectability, and permanence of a novel method for ERP. Neurocomputing, 2015, vol. 166, pp. 59-67. doi:10.1016/j.neucom.2015.04.025

12. Ruiz-Blondet M. V., Zhanpeng Jin, Laszlo S. CEREBRE: A novel method for very high accuracy event-related potential biometric identification. IEEE Transactions on Information Forensics and Security, 2016, vol. 11, pp. 1618-1629. doi:10.1109/TIFS.2016.2543524

13. Goncharov S. M., Borshevnikov A. E. Neural network transformer "Biometry — access code" based on the electroencephalogram in modern cryptographic applications. Vestnik SibGUTI, 2016, no. 1, pp. 17-22. (In Russian).

14. Nguyen D., Tran D., Sharma D., Ma W. On the study of impacts of brain conditions on eeg-based cryptographic key generation systems. Procedia Computer Science, 2018, vol. 126, pp. 713-722.

15. Ogino M., Mitsukura Y. Portable drowsiness detection through use of a prefrontal single-channel electroencephalogram. Sensors, 2018, vol. 18, no. 12. doi:10.3390/s18124477

16. Zunhammer M., Eberle H., Eichhammer P., Busch V. Somatic symptoms evoked by exam stress in university students: the role of alexithymia, neuroticism, anxiety and depression. PLOS One, 2013, vol. 8, iss. 12, pp. 1-11.

17. Guntekin B., Basar E. A review of brain oscillations in perception of faces and emotional pictures. Neuropsychologia, 2014, pp. 33-51. doi:10.1016/j.neuropsychologia.2014.03.014

18. Antipov O. I., Zakharov A. V., Poverennova I. E., Neganov V. A., Erofeev A. E. Facilities of different methods of automatic recognition of sleep stages. Saratov Journal of Medical Scientific Research, 2012, vol. 8, no. 2, pp. 374-379 (In Russian).

19. Rajendra Acharya U., Eric Chern-Pin Chua, Kuang Chua, Lim Choo, Toshiyo Tamura. Analysis and automatic identification of sleep stages using higher order spectra. International Journal of Neural Systems, 2010, vol. 20, no. 6, pp. 509-530.

20. Zaharov E. S. Automated sleep stage recognition. Izvestiya SFedU. Engineering sciences, 2008, no. 5, pp. 117-120 (In Russian).

21. Tzimourta K. D., Tsoulos I. G., Bilero T., Tzallas A. T., Tsipouras M., Giannakeas N. Direct assessment of alcohol consumption in mental state using brain computer interfaces and grammatical evolution. Inventions, 2018, vol. 3, no. 3, 51 p. doi:10.3390/inventions3030051

22. Karungaru S., Yoshida T., Seo T., Fukumi M., Terada K. Monotonous tasks and alcohol consumption effects on the brain by EEG analysis using neural networks. International Journal of Computational Intelligence and Applications, 2012, vol. 11, no. 03. doi:10.1142/S1469026812500150

23. Sarraf J., Chakrabarty S., Pattnaik P. K. EEG based oscitancy classification system for accidental prevention. Proceedings of the 5th International Conference on Frontiers in Intelligent Computing: Theory and Applications, 2017, pp. 235-243. doi:10.1007/978-981-10-3156-4_24

24. Thangarajah V., Denshiya D. A., Senaka A., Jayalath E. BCI-based alcohol patient detection. 17th World Congress of International Fuzzy Systems Association and 9th International Conference on Soft Computing and Intelligent, 2017. doi:10.1109/IFSA-SCIS.2017.8305564

25. Ziya E., Akif A., Mehmet R. B. The classification of EEG signals recorded in drunk and non-drunk people. International Journal of Computer Applications, 2013, vol. 68, no. 10, pp. 40-44.

26. Yazdani A., Ataee P., Setarehdan K., Araabi B. N., Lucas C. Neural, fuzzy and neurofuzzy approach to classification of normal and alcoholic electroencephalograms. Proceedings of the 5th International Symposium on Image and Signal Processing and Analysis, 2007. doi:10.1109/ISPA.2007.4383672

27. Shuaifang Wang, Yan Li, Pen Wen, Guohun Zhu. Analyzing EEG signals using graph entropy based principle component analysis and J48 decision tree. International Journal of Signal Processing Systems, 2016, vol. 4, no. 1, pp. 67-72. doi:10.12720/ijsps.4.1.67-72

UDC 004.93'1

doi:10.31799/1684-8853-2020-6-37-49

Evaluation of the identification potential of electroencephalograms using a statistical approach and convolutional neural networks

A. E. Sulavko (a), PhD (Tech.), Associate Professor, orcid.org/0000-0002-9029-8028, [email protected]

P. S. Lozhnikov (a), Dr. Sc. (Tech.), Associate Professor, orcid.org/0000-0001-7878-1976

A. G. Choban (a), Student, orcid.org/0000-0003-1834-6651

D. G. Stadnikov (a), Student, orcid.org/0000-0002-5405-2450

A. A. Nigrey (b), Postgraduate Student, orcid.org/0000-0002-8391-5374

D. P. Inivatov (a), Student, orcid.org/0000-0001-9911-1218

(a) Omsk State Technical University, 11, Mira Pr., Omsk, 644050, Russian Federation
(b) Omsk State Transport University, 35, K. Marksa Pr., Omsk, 644046, Russian Federation

Introduction: electroencephalograms contain information about the individual features of brain activity and the psychophysiological state of a subject. Purpose: to evaluate the identification potential of electroencephalograms; to develop methods for identifying the identity and psychophysiological state of subjects, as well as the actions a user performs on a computer, from the electroencephalogram using convolutional neural networks. Results: the information content of electroencephalogram rhythms was assessed in terms of the possibility of identifying a person and his or her state. High identification accuracy was achieved (98.5-99.99% for 10 electrodes, 96.47% for the two electrodes Fp1 and Fp2) with a low transit time (2-2.5 s). A significant drop in identification accuracy was found when the subject was in different psychophysiological states during network training and testing. (Earlier studies paid insufficient attention to this aspect.) A method is proposed for increasing the robustness of identity recognition in altered states. An accuracy of 82-94% was achieved in recognizing the states of alcohol intoxication, drowsiness and physical fatigue, and of 77.8-98.72% in recognizing user actions (reading, typing, watching video). Practical relevance: the results will be in demand in information security applications and in remote monitoring of subjects (when they cannot be observed directly).

Keywords: deep learning, multilayer neural networks, biometrics, machine learning, feature extraction, electrical brain activity, psychophysiological state, pattern recognition, spectrograms.


For citation: Sulavko A. E., Lozhnikov P. S., Choban A. G., Stadnikov D. G., Nigrey A. A., Inivatov D. P. Evaluation of EEG identification potential using statistical approach and convolutional neural networks. Informatsionno-upravliaiushchie sistemy [Information and Control Systems], 2020, no. 6, pp. 37-49. doi:10.31799/1684-8853-2020-6-37-49
