"Sociopolitical Insider" system: promises and limitations for the political analysis and prognosis

Research article in the field "Mass media and mass communications"

CC BY
Journal: PolitBook (listed by VAK, the Higher Attestation Commission)
Keywords: Big Data / social networks / Sociopolitical Insider system / social media mining / internetization of the political space

Abstract of the research article in mass media and mass communications. Author: Galina Nikiporets-Takigawa

The paper characterizes the current stage of the internetization of the political space in Russia, which is marked by a growing interest in real-time analysis and prognosis of political and social trends based on Big Data. First, it introduces the 'Sociopolitical Insider' system, created for these tasks by a team led by the author. Second, it describes a project commissioned by the government and discusses the methods used to handle Big Data in that project: social media mining, online community detection, social network analysis, opinion mining, leader detection, quantitative and qualitative content analysis, and sentiment analysis. Based on the project's results, the paper discusses the limitations of Big Data and of the system and argues that the major limitation is the manipulation of results through the preselection of sources, together with the practice of using Big Data in electoral campaigns as a tool of populism. Finally, the paper discusses better practices and examples of how Big Data and the system can be used for political analysis and prognosis and related tasks, both academic and applied, and concludes that the potential of Big Data technology for finding and collecting relevant, credible and timely information from a vast spectrum of sources needs to be better understood. The paper also concludes that the skills the government needs to assess and use all-source information should be acquired in close partnership with researchers and with research laboratories and centres where political scientists and sociologists work with IT specialists and developers of Big Data processing systems.




Introduction

In addition to investigating the impact of new technologies on social and political trends, a question that has not lost its relevance [see, for instance, 2; 3; 7-8; 17], researchers in the field of the internet and politics have begun to work on another question, newer to political science but no less important: the promise of the internet as a source of data for monitoring and foreseeing social and political trends [6; 16; 19; 20]. In numerous papers, this monitoring and analysis relies on technologies that are an "organic" part of cyberspace itself: Big Data. In Russia, this methodological perspective is booming. But because the approach is relatively new to political scientists, researchers are deeply divided: there is little agreement over the notion itself, considerable disagreement over the validity of Big Data and, at the other extreme, overvaluation of its promises.

So far, Big Data has been used extensively to study consumer behaviour and to target advertising in trade, the economy, banking, public administration, transport and medical services [4; 9; 10; 11; 12; 26; 27; 28], with these studies actively promoting the idea that monitoring demand and supply has a positive impact on the relevant sectors of the economy. Using Big Data for political purposes, however, is a much more recent trend [5, p. 85; 22, p. 26]. Today Big Data is evidently entering the Russian political space, but the number of applications for political analysis and prognosis remains disproportionate to the demand that has formed among politicians, political technologists and political scientists.

Yet the last of these groups, the scientists, began to incorporate Big Data as both data and method for political and sociopolitical analysis and prognosis long before the 2018 electoral cycle began, as they fully understand the extraordinary advantages of large amounts of data over classic data in their research [19]. Among politicians, Big Data has attracted interest only since the last electoral cycle began. Before that, and for a long time, the Russian political elite did not recognize the potential of Big Data or even of the internet itself, although in the West the internet entered electoral technologies as early as 1996, when the first election website was made and the elections were held under the slogan "The Year of the Internet". Since then, as observers argue: "Nearly every US election since 1996 has been labeled 'the Year of the Internet'. Important milestones have indeed been reached in each of these elections, with 1996 marking the first campaign website, Jesse Ventura's Internet-supported 1998 victory in the Minnesota Governor's race, John McCain's online fundraising in the 2000 Presidential Primary, Howard Dean's landmark 2004 primary campaign, the netroots fundraising and Senator George Allen's YouTube 'Macaca Moment' in 2006, and Barack Obama's historic 2008 campaign mobilization" [16].

Russian elections cannot yet be labelled 'the Year of the Internet', but the internet is evidently climbing onto the political scene step by step, from one electoral campaign to the next. In the last electoral cycle, politicians used Big Data from the internet as a technological hit for quickly improving their acquaintance with their own electorate.

There are many reasons for this new popularity of Big Data in connection with electoral campaigns. Populism is effective if it is based on knowledge of social problems, their scale and their subject. (The paper is partly based on the theses of the plenary talk at the conference "Politics of post-truth and populism in the modern world", St. Petersburg State University, September 22, 2017.) This knowledge helps politicians to target their appeals to the electorate and to expect a more positive reaction from it. To arm themselves with this knowledge, politicians at various levels need to monitor the 'social well-being' of their electorate, and the closer the upcoming election, the more desperate the need. Thus, the analysis and prognosis of sociopolitical trends, movements, mass sentiment and attitudes towards various social and political issues, with the help of data mining in social networking sites and messengers, is characteristic of the current stage of the internetization of the political space in Russia.

Now a sought-after part of the mainstream, the term Big Data is highly popular in academic and professional communities yet still very new to many of them. It refers to at least three different things: large data sets located in open media sources and social networks; large data sets whose items have multiple heterogeneous characteristics and meet the criterion of "significant diversity"; and the methods used to gather, process and analyze large data sets.

In any of these meanings, Big Data requires automated or semi-automated processing of the raw data or of a prepared dispersed sample. Big Data provides a huge number of opportunities for social and political analysis and prognosis. It also makes it possible to exclude any intervention by the researcher, and the results are more valid for that reason. The data are really 'big' because they provide the researcher with an amount of information that cannot be obtained from other sources: on the internet and social media one can see the whole spectrum of social and political subcultures, movements and groups, even those that are barred from any form of public communication. Big Data is a source of information about the spheres of interest, interest groups, communities and subcultures that are in high demand among all strata of the population, but especially among Generation Z, who are accustomed to communicating and expressing themselves only in social media and on the internet. And politicians, for whom young people are the target group in the electoral cycle, can look through social media into the collective mind of this stratum of voters and address their promises precisely to its aspirations.

At the same time, however, the Big Data of social media has its limitations. Let us consider them on the basis of a case study of the 'social well-being' of the citizens of the Moscow region.

The case, material, method and instrument

This project ran from March 1 to November 26, 2016 at the request of the Main Administration of Social Communications of the Moscow Region Government. The task was to obtain quantitative and qualitative information from social media about the opinions of the inhabitants of the Moscow Region on various topics of importance for the region. No system available on the market could be used for this task, since the customer required that every step be agreed with them and set many other conditions that none of the available systems can meet. In total there are about five major systems on the market. They do not open their research 'kitchen': the number of sources and the algorithm for selecting them are hidden from the consumer, only analytical reports are provided, and none of the existing systems allows raw data to be exported. We tuned our own system, 'Sociopolitical Insider', to meet all the requests of our customer and in effect created a system with an exceptional range of customization.

The Big Data of social media is our main empirical source. We collect, select, quantify and qualitatively analyze it using the 'Sociopolitical Insider' system and the methods of online community detection, social network analysis, opinion mining, leader detection, and quantitative and qualitative content analysis. The 'Sociopolitical Insider' system can work with all social media sources.

'Sociopolitical Insider' System

The system was created by a research team comprising the author of this paper together with Andrei Koniaev, Artem Krasheninnikov and Anna Larionova. The technologies used to create it are the unique know-how of the laboratory of socio-political reality, which worked at the RSSU in 2015-2016 under an internal RSSU grant for the creation of critical technologies. The system is the result of deep interdisciplinary research at the intersection of the social sciences (political science, sociology, communication theory, media studies), computer science and mathematics. We created a system that can be configured for any class of tasks. So far, it has been tested for more than 30 government and commercial customers. It allows specified sources in the main social networks to be monitored by keywords or phrases, works with the tone of posts and commentaries, and collects as much information about social network users as is needed for a purely academic or an applied analysis. At its output, the system makes it possible to map the 'social well-being' and political attitudes of various groups for further analysis and prognosis of political and social trends in real time. The distinguishing feature of our system's interface, and its significant advantage over those available on the market, is that it can be customized for the needs of any task and any customer and adjusted further in the course of monitoring. The user web interface of the 'Sociopolitical Insider' system consists of an authorization module; a personal account; a personal 'basket' that stores the keywords, the queries for each topic and the list of topics; reports, which can be downloaded in csv or xlsx format; and all posts and commentaries collected for a specified period. Reports can be converted to doc, docx and pdf formats from pre-created templates and, if necessary, adjusted manually on the basis of the data exported in csv or xlsx format.

The system meets the following criteria: it surpasses the available analogues on a set of indicators, above all by covering commentaries and not just posts; it can be configured for a specific task and readjusted during the monitoring process; it allows sources to be changed and filtered for a specific task; and it allows posts or commentaries to be sorted and filtered on any set of parameters. To build it, we used Python 2.7 x64 (including NumPy, SciPy, pandas, scikit-learn and codecs), MySQL, a VPS/VDS server and Django, and carried out:

- the development of a special program, a parser, for automatically parsing and compiling an array of data from social media. At the input, the parser receives a list of groups and keywords for analysis; at the output, it returns the posts or commentaries that contain the keywords, together with the parameters selected for extraction from those posts or commentaries.

- the development of a module for classifying posts or commentaries and evaluating their tone. The processed posts or commentaries arrive at the input of the text classification and sentiment analysis system. Based on the results of this processing, a topic and key marks are assigned to each of them. The resulting posts or commentaries are stored in the database.

- the implementation of a database of processed posts or commentaries. For this, two programs (scripts) are used, each performing a sequential scenario. The first forms and exports simple reports according to the technical assignment. The data is stored on a leased virtual server using a database management system. Data extraction and the final reports are carried out using the query language.
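The three components listed above (parser output filtering, tone classification, storage) can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the system's actual code: the tiny tone lexicon, the function names and the table layout are hypothetical, a lexicon count stands in for the classifier the authors trained, and the standard-library `sqlite3` module stands in for MySQL.

```python
import sqlite3

# Hypothetical tone lexicon; a real system would use a trained classifier.
POSITIVE = {"good", "repaired", "thanks"}
NEGATIVE = {"bad", "broken", "expensive"}


def tokenize(text):
    """Lower-case the text and split it into words, dropping edge punctuation."""
    return [w.strip(".,!?;:\"'()") for w in text.lower().split()]


def matched_keywords(text, keywords):
    """Return the query keywords that occur anywhere in one post or commentary."""
    words = set(tokenize(text))
    return sorted(k for k in keywords if k in words)


def classify_tone(text):
    """Label a text positive, negative or neutral by counting lexicon hits."""
    words = tokenize(text)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def process(items, keywords, conn):
    """Keep items that contain a keyword, classify their tone and store them."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items "
        "(source TEXT, kind TEXT, text TEXT, keywords TEXT, tone TEXT)"
    )
    for source, kind, text in items:
        hits = matched_keywords(text, keywords)
        if hits:
            conn.execute(
                "INSERT INTO items VALUES (?, ?, ?, ?, ?)",
                (source, kind, text, ",".join(hits), classify_tone(text)),
            )
    conn.commit()


# Invented sample data: (group, kind of item, text).
conn = sqlite3.connect(":memory:")
sample = [
    ("group_a", "post", "The roads are broken again!"),
    ("group_a", "commentary", "Thanks, the roads were repaired."),
    ("group_b", "post", "Nice weather today."),
]
process(sample, {"roads", "prices"}, conn)
rows = conn.execute(
    "SELECT kind, keywords, tone FROM items ORDER BY rowid"
).fetchall()
print(rows)
```

Reports of the kind described in the last item would then be ordinary queries against the stored table, for example a count of items grouped by tone.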

The keywords for the queries were formed by the method of expert assessment, drawing on the material of monitoring the social well-being of citizens of the Moscow region and on a shallow qualitative analysis of an array of texts from various social media platforms. The list of groups was either created 'manually' or formed 'by default' from the most active groups and/or users with many subscribers. To create a list of groups manually, a comparative analysis of social media bases is used; its step-by-step methodology is fully described in the author's previous publications.

The list of topics used to analyse the attitudes of the residents of the Moscow region is as follows: 1. "communal service"; 2. "road conditions"; 3. "medical care"; 4. "prices"; 5. "utility bills"; 6. "poverty line"; 7. "ecology"; 8. "drug addiction", "alcoholism"; 9. "unemployment"; 10. "corruption"; 11. "crime"; 12. "kindergartens"; 13. "housing quality" (dilapidated housing stock); 14. "morality, culture, values"; 15. "schooling"; 16. "people with disabilities"; 17. "homeless people".

Further, the users who authored posts or commentaries related to the above topics were verified by geolocation, profiles and IP addresses in order to identify those who live in the Moscow region (which was important for the research question of monitoring the 'social well-being' of the residents of the Moscow region). Then, all the posts and commentaries collected with the help of the system, in which the keywords occurred at any distance from one another but within a single post or commentary, were passed to qualitative and quantitative content analysis. The research methodology, set out step by step in the author's previous publications [Mapping, Tweeting, Protest 2.0], assumes two types of content-analysis categories. The first type is associated with the need to measure the relevance of each topic related to the problems of the Moscow region; this is done by mapping the most significant topics in the discussions of Moscow region residents in social networks (here the category is the topic; see the list of topics above). The second type is related to the need to assess the prevailing tone of the discussions of the most significant topics and to monitor the dynamics of that tone; here we have three main categories: "positive", "negative" and "neutral". The frequency of such texts and their tone were then mapped week by week and submitted for analysis and prognosis.
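The week-by-week mapping of topic frequency and tone described above might look roughly like the following sketch. It is an illustration only: the record layout `(date, topic, tone)`, the function names and the sample records are assumptions, and ISO calendar weeks are used as one possible definition of a "week".

```python
from collections import Counter
from datetime import date


def week_key(day):
    """Return (ISO year, ISO week number) for a calendar date."""
    iso = day.isocalendar()
    return (iso[0], iso[1])


def map_by_week(records):
    """Count coded texts per (week, topic, tone) combination."""
    counts = Counter()
    for day, topic, tone in records:
        counts[(week_key(day), topic, tone)] += 1
    return counts


def negative_share(counts, week, topic):
    """Share of negative texts among all texts on a topic in a given week."""
    total = sum(n for (w, t, _), n in counts.items() if w == week and t == topic)
    if total == 0:
        return None  # no texts on this topic in this week
    return counts[(week, topic, "negative")] / total


# Invented records: each text is already coded with a topic and a tone.
records = [
    (date(2016, 3, 1), "road conditions", "negative"),
    (date(2016, 3, 3), "road conditions", "positive"),
    (date(2016, 3, 4), "road conditions", "negative"),
    (date(2016, 3, 8), "prices", "negative"),
]
counts = map_by_week(records)
share = negative_share(counts, (2016, 9), "road conditions")
print(share)
```

Tracking a value such as `share` from one week to the next is one simple way to follow the dynamics of tone for a topic.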

Data mining and data collection require answering several questions: where to look (on which platforms of the mobile internet: only Twitter, or Vkontakte (VK), Telegram, Facebook, YouTube, etc., or some combination of them, or all of them together); what to look for (only posts, posts and commentaries, or only commentaries); when to look; how to collect and process the data; how to discard unwanted data; and how to visualize the results.
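One way to make the answers to 'where', 'what' and 'when' explicit is to record them in a small configuration object that the collection code consults. The sketch below is hypothetical: the class name, field names and platform labels are assumptions, and only the dates are taken from the project period mentioned earlier.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MonitoringPlan:
    """Hypothetical container for the 'where', 'what' and 'when' decisions."""

    platforms: frozenset      # where to look
    content_types: frozenset  # what to look for: "post", "commentary"
    start: date               # when to look: first day of monitoring
    end: date                 # when to look: last day of monitoring

    def accepts(self, platform, content_type, day):
        """Return True if an item falls inside the plan and should be collected."""
        return (
            platform in self.platforms
            and content_type in self.content_types
            and self.start <= day <= self.end
        )


plan = MonitoringPlan(
    platforms=frozenset({"vk", "twitter"}),
    content_types=frozenset({"post", "commentary"}),
    start=date(2016, 3, 1),
    end=date(2016, 11, 26),
)

print(plan.accepts("vk", "commentary", date(2016, 6, 15)))
print(plan.accepts("facebook", "post", date(2016, 6, 15)))
print(plan.accepts("vk", "post", date(2017, 1, 1)))
```

Keeping these decisions in one place also makes it easy to see, and to report, which sources were preselected for a given study.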

The most difficult are the first three questions: 'where', 'what' and 'when'. While the first two can be settled in advance depending on the research task, 'when' is the question of questions, and it requires constant daily monitoring of the political agenda. Dramatic gaps sometimes occur: for example, if you go on vacation not knowing that a serious protest rally which is massively recruiting young people is about to take place, the rally happens and you have no collected data. All the other questions have clear answers. The right tool for Big Data is simply a computer, preferably a powerful one, plus a programmer who sets up the parsing and keeps doing so all the time (tuning and rebuilding it to the needs of the researcher, i.e. customizing) in very close interaction with the analysts. This is an unloved topic for any customer, who usually wants very cheap analytics (preferably free of charge) delivered 'all at once'. We are forced to disappoint: Big Data requires a good programmer, a good programmer costs a lot of money, and nothing that a researcher can achieve with Big Data should even be attempted without one. In our case, only one, but very good, programmer worked with us and managed to build with us the 'Sociopolitical Insider' system, which works successfully on the market and fulfils a variety of orders at a level equal or superior to existing market counterparts.

The question of how to avoid working with unwanted data, spam and the like, and to archive only relevant data, depends on the cost of the server where the data is stored. If storage does not cost much, one can keep everything that is open, since the noise can safely be neglected: with such huge amounts of data, noise will not prevent us from catching trends and correlations. That is the essence of Big Data. With classic data we take limited samples and try to build exact models, which are then extended to the whole population. With Big Data, on the contrary, we try to collect all the available data and to build approximate models which, owing to the huge amount of initial information, are more accurate than the classical ones.

As for visualization, it all depends on the customer: on how well the customer understands what s/he has ordered and how clearly s/he understands what s/he wants. If that understanding is insufficient, the results have to be presented almost like children's colouring books in order to impress the customer; if s/he understands more, one can talk seriously, cooperate and not worry much about the beauty and attractiveness of the charts and tables.

Thus, all these questions can be answered only with the help of another tool that is crucial when working with Big Data: the team of "a competent customer, purposely trained analyst, and the professional programmers" (G. Nikiporets-Takigawa and O. Lobazova, forthcoming). The accuracy of the task that the analyst sets for the programmer and the system determines the validity of the data. That is, even though working with Big Data makes it possible to speak about the direction of trends with greater certainty, one should understand which tasks Big Data solves better and what the limitations of such data are. Analysts should know not only the techniques of working with Big Data but also their applications, challenges and limitations, among them: the problem of fake users, bots and false identities; sampling problems; the legal aspects of working with 'open' data and the risks of this new approach to information gathering, such as legal, privacy, overload and disinformation risks; the problem of representativeness, depending on the task; and the problem of the rapid technological upgrading of the platforms. As Karpf puts it: "The Internet of 2008 is different from the Internet of 1996, 2000, or 2004, and this is a recurrent, ongoing pattern. Consider the following trivia question: 'What was John Kerry's YouTube strategy in the 2004 election?' YouTube is a major component of the Internet today. The video-sharing site is the third most popular destination on the Internet, as recorded by Alexa.com. Political campaigns now develop special 'web advertisements' with no intention of buying airtime on television, simply placing the ads on YouTube in the hopes of attracting commentary from the blogosphere and resultant media coverage. The medium is viewed as being influential that an entire political science conference and a special issue of the Journal of Information Technology and Politics were devoted to 'YouTube and the 2008 election'."

The most important principle for achieving "clean" results is the responsibility of the analyst, as Big Data is sensitive to the choice of sources, queries and keywords. We should always remember that we study 'the Internet data, which consist of a lot of data' [Karpf]. As the popularity of SNSs and messengers is constantly changing, the set of sources should be constantly updated and the system periodically reconfigured. The projects that the author of this paper has conducted since 2011 show how quickly the networks have changed. In 2011 we explored forums for the project "Memory Wars in Russia and Ukraine"; in 2012, LiveJournal (LJ) for the protest movement; in 2013 we studied political mobilization on Twitter, because nobody used LJ by that time; and in the 2013-2014 projects we carried out extensive research on ideological identities using data from Vkontakte. We could have stopped there and used VK as our main source, since it provides an enormous amount of data, but, as we discuss below, a 'demanding customer' likes everything in one, and a researcher who works for a politician has to climb into YouTube, Instagram, etc. These are a very informative addition, as they contain visual materials, which are often more important than texts.

But we faced a new and unexpected kind of limitation for our system, one related to the customer. Our experience showed that the customer of Big Data research does not understand what it is all about. As a rule, customers resort to such research because it is a trend, it is fashionable, and a stir has formed around it. However, this same Big Data stir, combined with a lack of understanding of how the data and methods can be used effectively and which tasks they suit, does not produce the most favourable result. In addition, a high-ranking political customer is often not ready for the results of an objective analysis and, to avoid unpleasant ones, tends to look into the research 'kitchen' to "help" the researcher obtain results more favourable to the customer. Normally, the interaction between a customer and a researcher proceeds as follows: the customer writes a 'specification' for the project on the basis of the commercial offers of several companies. There are just a few such companies in Moscow, as mentioned above, and when we offer our service to a customer, the customer reads all the offers and writes a specification for a tender. From the customer's point of view, it is very tempting to combine the best of each candidate's offer in one specification and to ask for all of it at the lowest possible price. As a result, the company that wins the tender and must handle the project faces a very broadly and unprofessionally written specification, in which one task contradicts another and some are almost impossible, because the capabilities of one system can differ from those of another, and dealing with a task that was set not for your team and your system but for someone else's is quite a headache. But the most interesting fact, which you face at the end, when you have tuned your system to overcome all possible drawbacks and to meet every possible wish of the customer, is that the specification can be ignored because, in fact, a customer from among state officials can be quite uninterested in precise results.

In the project discussed here, this turned out to be the most serious issue. Our research program did not satisfy the customer, and it then became obvious that by "monitoring of the opinions of the residents of the Moscow region" the Moscow region government understood the frequency of mentions of the governor, of other leading officials (who are not opinion leaders in social networks) and of the major projects in the region. We were not allowed to formulate a list of keywords independently: there were requirements to add a search for mentions of Governor Vorobyov and to use a set of quite specific forums as sources for the search. All attempts to explain to the customer that such a set of keywords, with names chosen according to the customer's wishes, does not reflect the opinions of the inhabitants of the Moscow region and that Big Data should be considered objectively were rejected. The customer ignored our expert comments from the very beginning and introduced additional requirements into the terms of reference. The general impression is that the customer was not interested in carrying out the tasks assigned to the project and expressed a very strong unwillingness to see the real picture, which, in fact, is not so unpleasant, because the Moscow region certainly belongs to the quite prosperous regions: "The level of social well-being of the inhabitants of the Moscow region has increased. The Moscow region has risen in the rating of the social well-being of the Russian regions, compiled by the Civil Society Development Foundation (FCO). Moscow region scored 58 points out of 100 possible, having risen by three points since August."

Conclusion

As a result, instead of an objective picture, a glossy picture of an electorate that does not exist is obtained, and this cannot serve as a source for effective policy making. High-level officials lack a comprehensive understanding of what the Big Data can give us. The Big Data have huge potential in terms of representativeness and validity when they are used for political planning at the national and regional levels; for developing and evaluating the effectiveness of political programs; for the PR strategies of public and private organizations; in the practice of political consulting; in the investigation of political behavior and political conflicts; or in the development of effective electoral campaigns. Despite this potential, they are currently used merely as a new tool of populism. This is the main limitation in the use of the Big Data for researchers in the projects that we do for practitioners. The main practical result of the project that we carried out for the Moscow region government was the refinement of our 'Sociopolitical Insider' system, which allows for the analysis and prognosis of social and political trends. Our system is ready to be introduced to the market (personal account, analytical blocks) through grants and projects, both applied and scientific, which we are ready to join.

Our team is currently using the 'Sociopolitical Insider' system for two other projects. The first is named 'Monitoring and prevention of politically and socially destructive behavior of youth through Big Data and mediation in social networks'. The project was created in response to the concern that cyberspace, and the social networks first of all, is the main place where teenagers and young people communicate, and that young people are exposed to the illusions of a virtual 'market of ideas' and emotions, risking their health and lives because many adults manipulate them. February 2017 marked a tragic record in the number of suicides of adolescents who were victims of the suicidal movement "Blue Whale". There is controversy about the existence and the extent of this movement, but there was a "moral panic" associated with a sharp increase in the number of teen suicides, primarily in the Russian provinces. The cases of suicide under the influence of the "Blue Whale" virtual communities revealed to the public and the state the degree of danger of uncontrolled network communication, the indisputable vulnerability of teenagers, and the need to protect them from antisocial, extremist, and deviant behaviour.

Then March and June of 2017 brought evidence of oppositional political activism among young people who were mobilized via YouTube. Again, and at a new level, this raised the task of monitoring the political and social behaviour of adolescents and young people and of preventing its destructive forms. Many practitioners, among them the heads of several schools and officials of the Ministry of Education and Science of the Ulyanovsk Oblast and the Department of Education of Moscow, rushed to ask researchers to develop security measures against the so-called "death groups", or "Blue Whale" virtual communities, as well as against other cases of the involvement of young people in destructive activity, and to arm teachers and others who work on the problems of young people at various levels with the means to counter such phenomena. In parallel, the Synod Committee on Youth Affairs addressed the RSSU with a similar request to develop a set of measures against the involvement of young people in destructive cults.

For this project, the Big Data and our 'Sociopolitical Insider' system are efficiently used to gather a variety of information about the main interests of young people and about the interest groups, communities, and subcultures that are in high demand among them. In addition to providing material for monitoring, social networks are also a platform and a mechanism for preventive measures that limit or minimize the negative impact of the internet and the 'information war' [13; 14; 15; 25].

The second project is related to the first one and aims to teach professionals how to deal with the Big Data and how to apply data analysis to texts, images, network interaction, etc. The Big Data, as we have already discussed above in this paper, imply numerous limitations of material and method and require additional resources and effort in collection, processing, and visualization, but ultimately, provided the sample is correctly constructed and processed thoughtfully at each stage, they far exceed the "classical" data for sociological and political analysis. But before starting to use social networks as data, you need to understand their limitations and the consequence that the Big Data do not help to solve every task. It should also be understood that work with social networks needs a combination of quantitative and qualitative methods, and that manual processing will not be sufficient. This requires a team of professionals who have special skills and a proper understanding of the advantages that the Big Data can offer in informing politicians.

The Big Data have great potential for researchers; therefore, St. Petersburg University, Moscow University, the Higher School of Economics, and several other leading universities are opening Big Data research centers. In these centers, systems alternative to ours are created, intended for internal use, for joint interuniversity studies in the case of jointly won grants, or for commercial orders. Typically, the systems of other universities are almost closed to outside scientists. Therefore, any institution that seriously positions itself in the system of higher education and claims to have or to create a scientific school strives to create its own Big Data system. On the other hand, the state and the government should assess the value, cost, benefit, performance, and impact of information across the spectrum of sources. It has been shown internationally and domestically, in the academic, commercial, state, and third sectors, that carefully managed Big Data can provide best practice in finding and using open-source, published, and social media information, from which the government and the analytical departments of state offices could learn. It has also been shown that the internet can be considered a very informative source that they can draw upon when they need to update their knowledge of current political and social issues.

Increasing the interaction between the authorities and society is possible only with constant analysis and prognosis of the existing social processes and the trends in their development. In this regard, we should expect an influx of orders from the public sector for the analysis of data.

References

1. Avtsinova G., Volodin A., Godyna V. et al. Social'naia politica v Moskovskom regione: trendy razvitia i opyt realizatsii. M., 2015.

2. Boyd D., Ellison N. Social network sites: definition, history, and scholarship. Journal of Computer-Mediated Communication. 2007. №13(1). article 11. URL: http://onlinelibrary.willey.com/doi/10/1111/j.1083-6101.2007.00393.x. (Accessed 06.12.2017).

3. General election 2017: what caused Labour's youth vote surge? BBC News Online. 16 June 2017. URL: http://www.bbc.com/news/uk-politics-40244905 (Accessed 06.12.2017).

4. Gorbachev A.M. Big Data kak instrument protivodejstvija ugrozam jekstremizma. Mezhdunarodno-pravovye sredstva protivodejstvija terrorizmu v uslovijah globalizacii. Problemy terroristicheskogo naemnichestva sredi molodezhi i puti ih preodolenija. Sbornik materialov vserossijskoj konferencii. Stavropol'skij gosudarstvennyj pedagogicheskij institut. 2016.

5. Gricenko R.A., Prokopchuk D.D., Tancura M.S. Ispol'zovanie «Big Data» v prikladnom politicheskom analize. Voprosy nacional'nyh i federativnyh otnoshenij. 2017. №2(37).

6. Habermas J. Political communication in media society: does democracy still enjoy an epistemic dimension? The impact of normative theory on empirical research. Communication Theory. 2006. №16(4).

7. Heverin Th., Zach L. Microblogging for Crisis Communication: Examination of Twitter Use in Response to a 2009 Violent Crisis in the Seattle-Tacoma, Washington Area. In: Proceedings of the 7th International ISCRAM Conference. Seattle, USA, 2010.

8. Huang E. What you need to know about China's VPN crackdown. Quartz. URL: https://qz.com/1026064/what-you-need-to-know-about-chinas-vpn-crackdown/ (Accessed 06.12.2017).

9. Il'jasova N.Ju., Kuprijanov A.V., Popov S.B., Paringer R.A. Osobennosti ispol'zovanija tehnologij Big Data v zadachah medicinskoj diagnostiki. Sistemy vysokoj dostupnosti. 2017. №12(1).

10. Kazakov R.I. Tehnologii Big Data v upravlenii krupnymi bankami. Biznes-obrazovanie v jekonomike znanij, 2015. №2(2).

11. Mal'ceva A.V., Mahnytkina O.V., Shilkina N.E. Izuchenie povedencheskih patternov pol'zovatelej social'nyh setej: vozmozhnosti Big Data. Zhizn' issledovanija posle issledovanija: kak sdelat' rezul'taty ponjatnymi i poleznymi. VI Sociologicheskaja Grushinskaja konferencija. M.: Izd-vo RANHiGS, 2016.

12. Meshherjakov I.S. Tehnologii Big Data v dejatel'nosti organov gosudarstvennoj vlasti. Obrazovanie i nauka kak strategicheskie resursy razvitija sovremennogo gosudarstva: sbornik nauchnyh trudov. Saratov: Povolzhskij in-t upr. im. P. A. Stolypina - fil. RANHiGS, 2017.

13. Kaplan C. Twitter terrorists, cell phone jihadists and citizen bloggers: the "global matrix of war" and the biopolitics of technoculture in Mumbai. Theory, Culture & Society. 2009. №26(7-8).

14. Karatzogianni A., Kuntsman A. (eds.) Digital cultures and the politics of emotion: feelings, affect and technological change. Basingstoke and New York: Palgrave Macmillan, 2012.

15. Karatzogianni A. The politics of cyberconflict. London and New York: Routledge, 2006.

16. Karpf D. MoveOn effect. The unexpected transformations of American Political advocacy. Oxford University Press. 2012.

17. Labour is winning the election on social media. Campaign. 07.06.2017. URL: https://www.campaignlive.co.uk/article/labour-winning-election-social-media/1435748 (Accessed 06.12.2017).

18. Livingstone S. On the Mediation of Everything. Journal of Communication. 2009. №59(1).

19. Pochepcov G.G. Informacionnye vojny. Refl-buk, M., 2000.

20. Sizov I.A. Big Data - bol'shie dannye v biznese. Ekonomika. Biznes. Informatika. 2016 №3.

21. Smirnov V.A. Kontury novoj modeli sociologicheskogo analiza jeffektivnosti molodezhnoj politiki s ispol'zovaniem «Big Data». Aktual'nye problemy sociologii kul'tury, obrazovanija, molodezhi i upravlenija. Materialy Vserossijskoj nauchno-prakticheskoj konferencii s mezhdunarodnym uchastiem. Ekaterinburg, 2016.

22. Terent'eva E.I., Morbah E.S., Vozgrina A.V. Sposoby primeneniya Big Data dlya PR-zadach. Readera. 2016. №3(17). URL: https://readera.ru/sposoby-primenenija-big-data-dlja-pr-zadach-14330410 (Accessed 06.12.2017).

23. Vasil'eva E.N., Cynarjova N.A. Informacionnaja vojna v kontekste teorii massovoj kommunikacii. Vestnik Tverskogo gosudarstvennogo universiteta. Serija: Filologija. 2017. №3.
