Linguistic Profiling of Internet Personality

Keywords
forensic linguistics / profiling / internet personality / linguistic examination / phonoscopy / authorship studies

Abstract of a research article in linguistics and literary studies. Authors: R. K. Potapova, V. V. Potapov, K. A. Nefedova

With the advent and widespread use of the Internet, the concept of a linguistic personality has changed and acquired the name "digital personality". The main goal of the study is to find out whether a new algorithm makes it possible to profile and identify a "digital personality" from a written text. The article discusses phonoscopic and authorship examination as methods of personality profiling, the features of a digital personality, and a methodology for developing new ways of profiling and identifying a personality that takes into account the paradigm shift in linguistic expertology.

Research article. UDC 81'33:81'42:343.98

Original article

Rodmonga K. Potapova¹, Vsevolod V. Potapov², Kseniya A. Nefedova³

¹ Moscow State Linguistic University, Moscow, Russia, RKpotapova@yandex.ru

² Lomonosov Moscow State University, Moscow, Russia, Volikpotapov@gmail.com

³ Moscow Research Center of the Department of Regional Security and Anti-Corruption, Moscow, Russia, nord344@mail.ru


For citation: Potapova, R. K., Potapov, V. V., Nefedova, K. A. (2024). Linguistic profiling of internet personality. Vestnik of Moscow State Linguistic University. Humanities, 6(887), 78-83.


With the advent and development of the Internet, so-called "electronic communication" began to take shape as an integral part of social communication [Курьянова, Лопаткин, 2022]. An increasing number of people prefer online communication and use Internet resources such as Telegram, WhatsApp, and VKontakte. This widespread use of the Internet has given rise to the concept of the "Internet personality", or "digital personality" [Потапова, Курьянова, 2022]. Its specific features distinguish it from the linguistic personality and make it difficult to identify when traditional profiling methods are applied.

Initially, the term "profiling" was used in forensics for the creation of a psychological search portrait of a suspect. More recently, it has been used in a broader sense that extends to the discovery of hidden psychological information about the individual under consideration (see, e.g., [Ekman, 1985]).

Anonymity and the possibility of masking a user's identity have led to uncontrolled behavior on the Internet. Concepts such as "cybercrime" and "cyber threat" have emerged and spread rapidly in recent years. Copyright infringement, fraud, violation of human dignity, blackmail, extortion, calls for extremist activity, and other illegal actions are encountered online with increasing frequency. New methods for profiling a digital personality are therefore needed, which determines the relevance of this study [Курьянова, Лопаткин, 2022].

Compiling a speech portrait of the speaker, i.e. linguistic profiling, is based on describing mental phenomena by establishing their correlation with features of another kind, specifically with their manifestations in speech. In linguistic profiling these tasks are addressed through phonoscopic and authorship examinations [Курьянова, Лопаткин, 2022; Галяшина, 2003].

Phonoscopic examination deals with identification of the speaker (the speaker is recognized by their speech features, on the "one of many" principle) and verification of the speaker (an expert determines whether a given voice sample belongs to a particular speaker, on the "yes - no" principle). Both rely on the speaker's voice and speech features [Потапова, Потапов, 2006].
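
The difference between the two tasks can be illustrated with a minimal Python sketch. It assumes that some comparison procedure has already produced a similarity score for each candidate speaker; the scores, the speaker labels and the threshold below are hypothetical and are not taken from the study.

```python
def identify(scores):
    """'One of many': pick the candidate whose sample is most similar."""
    if not scores:
        raise ValueError("at least one candidate is required")
    return max(scores, key=scores.get)


def verify(score, threshold):
    """'Yes - no': does the sample match the single claimed speaker?"""
    return score >= threshold


# Hypothetical similarity scores between a questioned recording and
# reference recordings of three known speakers (higher = more similar).
scores = {"speaker_A": 0.41, "speaker_B": 0.87, "speaker_C": 0.66}

best_match = identify(scores)                     # compares all candidates
is_speaker_c = verify(scores["speaker_C"], 0.80)  # checks one claim only
print(best_match, is_speaker_c)
```

Identification always returns some candidate, while verification can reject the claim outright, which is why the two tasks call for different decision rules.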

The object of phonoscopic examination is a phonogram (sound recording). What makes such an examination specific is that the expert must have sound knowledge of linguistics, acoustics, radio engineering, mathematics, and related fields (see, e.g., [Gibbons, 2003; Hollien, 1990; Hollien, 2002; Hudson, McDougall, Hughes, 2021; Галяшина, 2021]).

It is worth noting that the human voice carries not only linguistic features but also information about the speaker's emotional state, psychological and physiological characteristics, geographical origin, social status, level of education, etc. [Потапова, Потапов, 2006].

Such studies are always interdisciplinary and take into account factors that can change the speech signal: stress, mental state, fatigue, drug or alcohol intoxication, illness, defects in the structure of the articulatory organs, and various techniques of voice disguise [Бурыгина, 2016]. Voice disguise can include changing the speaker's language or dialect, imitating a different age category, or introducing a prominent feature such as a "harsh / creaky voice" [Потапова, Потапов, 2006].

Authorship examination is a type of examination in which the text is examined to establish its authorship, individual features of the author, and the conditions for creating the text. The features of the author can be identified by a level-by-level analysis of speech: from punctuation to discourse [Моисеева, Огорелков, 2022].

When identifying the author of a text, the expert should consider the possibility that the author disguised (masked) their individual features. Masking can usually be detected through inconsistencies between individual linguistic features in the text. For example, a high level of spelling skill may be found that does not correspond to a low level of syntactic skill.
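
Such a consistency check can be sketched in Python. The three-point skill scale, the level names and the tolerance of one step are assumptions made for illustration; the article does not specify how skill levels are scored.

```python
from itertools import combinations

# Hypothetical ordinal scale for a writer's skill at each language level.
LOW, MEDIUM, HIGH = 1, 2, 3


def inconsistent_pairs(skill_levels, max_gap=1):
    """Return pairs of language levels whose skill ratings differ by more
    than max_gap steps; such pairs may point to deliberate masking."""
    return [
        (a, b)
        for (a, level_a), (b, level_b) in combinations(skill_levels.items(), 2)
        if abs(level_a - level_b) > max_gap
    ]


# Hypothetical profile: flawless spelling next to weak syntax.
profile = {"spelling": HIGH, "punctuation": MEDIUM, "syntax": LOW}
suspicious = inconsistent_pairs(profile)
print(suspicious)
```

A flagged pair is only a prompt for closer expert reading, since uneven skills can also have innocent explanations.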

The traditional methods that linguists use in authorship and phonoscopic examinations remain the basis of all forensic linguistic studies. The transition to a digital personality, which realizes itself in a virtual communication environment, has required new fundamental developments from forensic linguists [Карпова, 2007; Лутовинова, 2009]. The digital personality has characteristic language features that manifest themselves clearly on the Internet, mainly because of anonymity but also because no editing takes place. The Internet has given rise to a completely new form of communication at the intersection of the two known forms: written spoken speech, which differs from both standard written speech and standard spoken speech [Потапова, Курьянова, 2021]. Such speech may include vernacular or foreign words and neologisms, and it may contain a large number of errors and typos because it is almost never checked or corrected. In addition, a characteristic feature of the digital personality is simplification of the language and the desire to expend the least effort in speech production.

Since this communication is computer-mediated and, as a rule, the communicants do not see each other (there is no video channel), the virtual personality has another characteristic feature: the use of graphic techniques to convey paralinguistic information in writing, such as bold, italics, underlining, and capital letters. A special role belongs to emoticons (pictograms depicting emotions), which are used everywhere in Internet communication.

Thus, the emergence and spread of network communication changes the speaker's speech behavior and leads to a linguistic transformation. The Internet personality acquires qualities and features of its own that distinguish it from the linguistic personality. In addition, in recent years Internet users have been using digital processing tools that help authors disguise their speech features by changing them significantly. This complicates the expert's task of identification and calls for new methods of profiling such a personality in forensic examination [Потапова, Курьянова, 2022].


An experimental study was carried out to solve the tasks at hand. The purpose of the study was to profile and identify the previously selected respondent among other participants of the experiment by identifying the most significant individual features and peculiarities of their written speech.

The experiment involved 10 male and female subjects aged 19 to 25 years.

The material of the study consisted of essays written by each respondent in Russian, in electronic format, on the painting by I. Repin "Ivan the Terrible and His Son Ivan on November 16, 1581". The main respondent to be identified had also provided an additional sample of written speech, which was later used in a comparative analysis to identify this respondent.

The experimental research methodology includes:

1) formation of databases (DBs) for further analysis (n=10);

2) material analysis:

- analysis of the text sentiment (all words were divided into 3 groups: negative, positive and neutral). The results were converted to relative data (in %);

- identification of tenses used in the text (all verbs in the text were divided into 3 tense groups: present, past and future tense. The largest group was the one that determined the main tense of the text);

- analysis of the intertextuality of the text (identification of references to someone else's opinion in the text).

Next, some individual language writing skills were considered:

a) individual attributes of punctuation skills:

- persistent violation of a certain punctuation rule;

- peculiar or predominant use of certain punctuation marks;

b) individual attributes of spelling skills:

- persistent violation of spelling rules;

c) individual attributes of lexical and phraseological skills:

- use of certain lexemes;

- persistent lexical and phraseological errors;

- use of dialectisms, vulgarisms, elements of some jargon, vernacular words, etc.;

d) individual attributes of syntactic skills:

- use of simple and complex sentences;

- violation of syntax norms;

e) individual attributes of stylistic skills:

- presence of stylistic errors;

3) processing of obtained data;

4) description of the results and summing up.

During the experiment, the texts were analyzed at all language levels. After analyzing the data, we were able to identify patterns in the texts of the same author and thereby single out that author among the other respondents. The characteristic features of these texts are:

- essay structure;

- persistent violations of punctuation rules, in particular missing commas in compound and complex sentences and in adverbial participial phrases;

- frequent use of sentences with negation and conjunction;

- frequent use of phraseological units and set phrases;

- frequent use of personal pronouns: "I", "we", "you".
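
The sentiment and tense steps of the methodology above reduce to simple counting once every word or verb has a label. The Python sketch below assumes the labelling has already been done (by hand or by a tool of the reader's choice); the function names and the sample labels are invented for illustration and do not come from the respondents' essays.

```python
from collections import Counter

SENTIMENTS = ("negative", "positive", "neutral")


def relative_shares(labels, categories=SENTIMENTS):
    """Convert absolute label counts to relative data (in %)."""
    counts = Counter(labels)
    total = sum(counts[c] for c in categories)
    if total == 0:
        return {c: 0.0 for c in categories}
    return {c: 100.0 * counts[c] / total for c in categories}


def dominant_tense(verb_tenses):
    """Return the tense of the largest verb group, or None if there are
    no verbs. On a tie, the tense seen first in the text wins."""
    counts = Counter(verb_tenses)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Invented labels for a ten-word fragment and its five verbs.
word_sentiments = ["neutral"] * 6 + ["negative"] * 3 + ["positive"]
verb_tenses = ["past", "past", "present", "past", "future"]

shares = relative_shares(word_sentiments)
main_tense = dominant_tense(verb_tenses)
print(shares, main_tense)
```

Working with percentages rather than raw counts keeps essays of different length comparable.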

See Table 1 for a summary of the analyzed data on the main respondent.

Table 1

| Respondent features | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | Main |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Punctuation errors | 1 | 1 | 2 | 2 | 5 | 2 | 2 | 0 | 0 | 1 | 6 |
| Personal pronouns | 4 | 3 | 8 | 0 | 15 | 4 | 9 | 13 | 0 | 2 | 18 |
| Set phrases and phraseological units | 5 | 3 | 5 | 5 | 11 | 6 | 7 | 8 | 5 | 5 | 10 |
| Simple sentences | 5 | 15 | 19 | 11 | 31 | 35 | 11 | 16 | 19 | 6 | 20 |
| Complex sentences | 19 | 16 | 34 | 4 | 26 | 16 | 13 | 23 | 7 | 14 | 19 |
| Negation | 8 | 6 | 11 | 4 | 6 | 13 | 4 | 10 | 2 | 10 | 12 |
| Conjunction | 1 | 0 | 2 | 1 | 5 | 2 | 2 | 2 | 2 | 2 | 13 |

The results of the study led to the following conclusions:

1. The discourse of the Internet is characterized by a combination of various parameters manifested in the verbal characteristics of a written text.

2. The methodological approach to text analysis being tested in this study has proven to be applicable in the analysis of written texts for solving the tasks of profiling and identifying a digital personality.

3. The approach proposed in this study provides a holistic, comprehensive analysis of written speech, which makes it possible to decide on the identity of the author of a text and to compile their speech portrait.
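
One simple way to compare the columns of Table 1 numerically is to treat each column as a vector of counts and measure how far each respondent's vector lies from that of the comparison sample ("Main"). The Python sketch below uses the Manhattan distance for this. It is an illustrative assumption, not the procedure used in the study, which relied on level-by-level qualitative analysis. The counts are not normalised for essay length, so the distances should not be read as the study's identification result. The feature keys are hypothetical names for the table rows.

```python
# Counts from Table 1: one list per feature, respondents 1..10 in order.
TABLE_1 = {
    "punctuation_errors": [1, 1, 2, 2, 5, 2, 2, 0, 0, 1],
    "personal_pronouns": [4, 3, 8, 0, 15, 4, 9, 13, 0, 2],
    "set_phrases": [5, 3, 5, 5, 11, 6, 7, 8, 5, 5],
    "simple_sentences": [5, 15, 19, 11, 31, 35, 11, 16, 19, 6],
    "complex_sentences": [19, 16, 34, 4, 26, 16, 13, 23, 7, 14],
    "negation": [8, 6, 11, 4, 6, 13, 4, 10, 2, 10],
    "conjunction": [1, 0, 2, 1, 5, 2, 2, 2, 2, 2],
}

# The "Main" column of Table 1.
MAIN = {
    "punctuation_errors": 6,
    "personal_pronouns": 18,
    "set_phrases": 10,
    "simple_sentences": 20,
    "complex_sentences": 19,
    "negation": 12,
    "conjunction": 13,
}


def manhattan_to_main(respondent):
    """Sum of absolute count differences between one respondent
    (numbered 1..10) and the Main column, over all features."""
    index = respondent - 1
    return sum(
        abs(values[index] - MAIN[feature])
        for feature, values in TABLE_1.items()
    )


distances = {r: manhattan_to_main(r) for r in range(1, 11)}
for respondent, distance in sorted(distances.items(), key=lambda item: item[1]):
    print(f"respondent {respondent}: {distance}")
```

Because longer essays naturally contain more sentences and pronouns, a more careful comparison would first convert the counts to rates per sentence or per word.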


The technological and communicative development of the Internet and the migration of the personality to the digital space have created new opportunities for intruders and fraudsters to commit crimes remotely. On the Internet, one can easily become a victim of fraud, blackmail or extortion. These are only a small part of the threats that can be encountered while communicating over the Internet [Курьянова, Лопаткин, 2022].

The large-scale use and rapid development of Internet technologies have led to the transformation of the linguistic personality into a digital one. Various ways of self-presentation, a variety of virtual communication environments, anonymity, the absence of restraints on virtual behavior, a written colloquial form of communication, and simplification of speech are among the main features of such a digital personality.

The transformation into a digital personality has complicated the tasks of forensic linguistics, since the traditional methods previously used to analyze speech now prove unequal to the task of profiling and identifying a digital personality. They do not yield an informative set of individual features by which a given digital personality can be identified and characterized. Therefore, the main objective of this study was to determine and test those conventional and newly available methods that would allow us to profile and identify a personality by their written speech. The resulting algorithm allows us to do so, which means it can be used to study written texts published on the Internet. With the examinations described above, an expert can identify the author of both a spoken and a written message by the stable skills and abilities manifested in the generation of a verbal utterance, as well as by the results of the level-by-level analysis of written speech.



Potapova Rodmonga Kondratyevna

Doctor of Philology (Dr. habil.), Professor

Full Member of the International Informatization Academy

Director of Institute of Applied and Mathematical Linguistics of Moscow State Linguistic University

Potapov Vsevolod Viktorovich

Doctor of Philology (Dr. habil.)

Senior Researcher of the Centre of New Technologies for Humanities, Philological Faculty, Lomonosov Moscow State University

Nefedova Kseniya Andreevna

Forensic Expert

Moscow Research Center of the Department of Regional Security and Anti-Corruption

