
Current Problems of Aviation and Cosmonautics - 2022. Vol. 2

UDC 004.94

THE IMPORTANCE OF BIG DATA PROCESSING

O. I. Kukartseva*, E. A. Korneeva, V. V. Khramkov Scientific supervisor - A. A. Pavlenko

Reshetnev Siberian State University of Science and Technology 31, Krasnoyarskii rabochii prospekt, Krasnoyarsk, 660037, Russian Federation

*E-mail: gutova_ok@mail.ru

This article explores the importance of effectively transforming large volumes of information into meaningful knowledge. Combining tools for processing high-volume data will greatly facilitate this process.

Keywords: high performance computing, data locality, hardware accelerators, modeling and simulation, data processing.

The era of big data presents high-performance computing (HPC) with an enormous challenge: how to effectively turn huge and often unstructured or semi-structured data first into valuable information and then into meaningful knowledge. HPC tools and technologies are increasingly required in a rapidly growing number of data-intensive fields, from the biological and physical sciences to socio-economic systems. Thus, the era of big data also offers remarkable opportunities for HPC to expand its reach and increase its social and economic impact [1].

High-performance computing is at the heart of large-scale processing of complex, data-intensive tasks, enabling demanding applications in various scientific and technical fields such as high-energy physics, genomics, systems and synthetic biology, industrial automation, socio-economic data analytics, and medical informatics. This has led to a significant improvement in understanding of areas ranging from the evolution of the physical world to human societies. Application performance in HPC systems currently depends heavily on the overheads of remote and local data movement (network messages, memory and storage access) [2, 3].
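The cost of local data movement can be seen even without specialized hardware. The sketch below, a hypothetical illustration not taken from the article, times two traversals of the same C-ordered NumPy array: a row-major sweep that reads memory contiguously (good spatial locality) and a column-major sweep that strides across rows. Both compute the same sum; only the memory-access pattern differs.

```python
import time
import numpy as np

def sum_row_major(a):
    # Contiguous traversal of a C-ordered array: good spatial locality.
    total = 0.0
    for i in range(a.shape[0]):
        total += a[i, :].sum()
    return total

def sum_col_major(a):
    # Strided traversal: each column gathers one element per row,
    # jumping across memory and defeating the cache.
    total = 0.0
    for j in range(a.shape[1]):
        total += a[:, j].sum()
    return total

a = np.ones((2000, 2000))  # C (row-major) order by default

t0 = time.perf_counter()
s1 = sum_row_major(a)
t1 = time.perf_counter()
s2 = sum_col_major(a)
t2 = time.perf_counter()

print(f"row-major: {t1 - t0:.4f}s, col-major: {t2 - t1:.4f}s")
```

On typical hardware the strided version is noticeably slower even though both loops perform the same arithmetic, which is exactly the data-locality overhead the text refers to.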

With the advent of hardware accelerators (GPUs, FPGAs), pay-per-use cloud services, and increased performance of general-purpose processors, high-performance computing has become available to many scientific disciplines.

Section "Information and Control Systems"

COST Action IC1406 promotes interoperability between the HPC community (both developers and users) and simulation disciplines where the use of HPC tools, technologies and methodologies is still new. Data-intensive areas make the issue of efficiency especially relevant for tasks such as multi-dimensional and multi-layer integration and accelerated model development. In addition, these complex systems do not lend themselves directly to modular decomposition, which is an important condition for parallelization and, therefore, support for high-performance computing. They often require a significant amount of computing resources, with datasets scattered across multiple sources and geographical locations [4].
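By way of contrast, a problem that does admit modular decomposition can be parallelized in a straightforward split-process-reduce pattern. The sketch below is a minimal illustration, not from the article; the per-chunk computation is a hypothetical placeholder standing in for a real solver kernel.

```python
from concurrent.futures import ThreadPoolExecutor

def process_chunk(chunk):
    # Hypothetical per-chunk work; in an HPC setting this would be
    # a compute kernel operating on an independent piece of the data.
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(data, n_workers=4):
    # Modular decomposition: split the dataset into independent chunks,
    # process each chunk in parallel, then reduce the partial results.
    size = max(1, len(data) // n_workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = pool.map(process_chunk, chunks)
    return sum(partials)

print(parallel_sum_of_squares(list(range(10))))  # 285
```

The pattern works only because the chunks share no state; systems that resist such decomposition, as the text notes, cannot be parallelized this directly.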

Modeling and simulation (M&S) are considered important tools in science and technology for the prediction and analysis of complex systems and natural phenomena. Modeling traditionally tackles complexity by raising the level of abstraction while aiming for a meaningful representation of the domain, which leads to a difficult trade-off between accuracy and efficiency. In other words, the properties of a system can be studied by reproducing (that is, simulating) its behavior through an abstract representation. The application-layer context must also be considered: for example, a Monte Carlo simulation must receive input data, store intermediate results, and filter and combine output data in a correct and reliable manner [5, 6].
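The Monte Carlo pipeline mentioned above (input, intermediate results, filtered and combined output) can be sketched in a few lines. This is an illustrative example assumed for this text, not code from the article: it estimates pi by sampling points in the unit square.

```python
import random

def monte_carlo_pi(n_samples, seed=42):
    # Input stage: a seeded generator makes the run reproducible.
    rng = random.Random(seed)
    # Intermediate results: one hit/miss flag per sample, recording
    # whether the point fell inside the quarter circle of radius 1.
    hits = [(rng.random() ** 2 + rng.random() ** 2) <= 1.0
            for _ in range(n_samples)]
    # Output stage: filter and combine the intermediate results
    # into a single estimate (area ratio scaled by 4).
    return 4.0 * sum(hits) / n_samples

est = monte_carlo_pi(100_000)
print(f"pi estimate: {est:.3f}")
```

Even this toy version shows why correct data handling matters: losing or double-counting intermediate results would silently bias the combined estimate.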

Both big data and M&S are well-established research areas in their own right. However, their tighter integration, aimed at applications from different areas, would bring significant progress in solving big data problems. COST Action members are collaborating on a unified framework to systematically advance M&S and big data, supported by leading HPC-enabled models and tools, through a coordinated effort of HPC and simulation experts [7].

The main goal is to create a long-term, sustainable reference network of research links between the big data community on the one hand and the multiple M&S research communities dealing with big data issues on the other. Such links provide a new and permanent basis for collaboration between the two communities, spanning both academia and industry in Europe and beyond, with a common goal: to turn vast amounts of raw data into useful knowledge.

References

1. Kukartsev V. V., Boyko A. A. Simulation-dynamic model for calculating equipment acquisition through a bonded loan // Patent, 2020, No. 20, pp. 3-7.

2. Business Studio [Electronic resource]. URL: https://www.businessstudio.ru/articles/article/primenenie_imitatsionnogo_modelirovaniya_na_prakti/ (accessed 20.01.2022).

3. Kukartsev V. V. et al. Simulation-dynamic model of working time costs calculation for performance of operations on CNC machines // Journal of Physics: Conference Series. IOP Publishing, 2020. Vol. 1582. No. 1. P. 012052.

4. Vuzlit.ru [Electronic resource]. URL: https://vuzlit.ru/2003691/imitatsionnaya_model_zhiznennogo_tsikla_proekta (accessed 20.01.2022).

5. Tynchenko V. S. et al. Optimization of customer loyalty evaluation algorithm for retail company // Advances in Economics, Business and Management Research (AEBMR). 2018. Pp. 177-182.

6. Milov A. V. et al. Use of artificial neural networks to correct non-standard errors of measuring instruments when creating integral joints // Journal of Physics: Conference Series. IOP Publishing, 2018. Vol. 1118. No. 1. P. 012037.

7. Antamoshkin O. A., Kukartsev V. V. Models and methods for forming reliable structures of information processing systems // Information Technologies and Mathematical Modeling in Economics, Engineering, Ecology, Education, Pedagogy and Trade. 2014. No. 7. Pp. 51-94.

© Kukartseva O. I., Korneeva E. A., Khramkov V. V., 2022
