IMPLEMENTATION OF THE INTERNET OF THINGS NETWORK FOR MONITORING AUDIO INFORMATION ON A MICROPROCESSOR AND CONTROLLER

Vishniakou V. A.; Shaya B. H.

UDK 621.349

VISHNYAKOU U.A., SHAYYA B.H.


Belarusian State University of Informatics and Radioelectronics, Republic of Belarus

The subject of research is the development and implementation of Internet of Things (IoT) network structures for monitoring and analyzing audio information based on a Raspberry Pi microprocessor (MP) and an Arduino controller. The purpose of the article is to detail the process of developing an IoT-based audio information monitoring network and to evaluate the results. The authors have developed two variants of IoT structures for monitoring and analyzing audio and voice information. The IoT network includes a sound sensor (microphone), a unit for analyzing the information received from it, and a decision-making module. A diagram of the first IoT structure, which assesses the sound level using the MP and the controller, is given.

The algorithm by which the IoT network analyzes voice information is detailed. It includes receiving information from the microphone, transmitting it to the MP, processing it according to defined rules, forming a decision by the controller, and issuing recommendations to the user. The algorithm is implemented in an IoT network that includes a microphone, a Raspberry Pi MP, an Arduino controller, software, and applications for the operator.

A prototype of the IoT network for analyzing voice information was created, and experiments were conducted to test its functioning. The audio recognizer was trained using various audio samples. The voice sound analysis system was tested in four scenarios: with a large and a small amount of background noise, and with a loud and a quiet voice. Analysis of the experimental results showed that the system works better when the voice is loud and when background noise is minimal.

Keywords: network structure, sound sensor, microprocessor, controller, software.

Introduction

According to the World Health Organization, the global population has increased dramatically in recent years, with most of this growth occurring in urban areas. As a result, cities and urban planners now face new challenges. City monitoring has become necessary for tracking the well-being of residents, CO2 levels, water quality and noise levels, to name just a few.

The psychological well-being of people is one of the most important aspects of urban life, and noise pollution ranks first among the factors that affect it. Noise pollution is excessive noise or annoying sound that disturbs people and animals in their homes, recreation areas or workplaces. This type of pollution has various physical and psychological consequences for human health. Its main causes are heavy traffic, construction sites and human-generated noise, such as the sounds of nightlife.

Noise levels usually differ between day and night. In residential areas, the accepted limits are 65 dBA during the day and 55 dBA at night [1]. The accepted standards for permissible exposure to continuous time-weighted noise state that for every 3 dBA above 85 dBA, the permissible exposure time before possible damage is halved: 85 dBA corresponds to an acceptable exposure time of 8 hours, 88 dBA to 4 hours, and 91 dBA to 2 hours [2].
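The halving rule quoted above can be written as T = 8 / 2^((L - 85) / 3) hours. The following is a minimal sketch of that arithmetic only. The function and parameter names are illustrative assumptions and do not come from the cited standard.

```python
def permissible_exposure_hours(level_dba: float,
                               reference_dba: float = 85.0,
                               reference_hours: float = 8.0,
                               exchange_rate_db: float = 3.0) -> float:
    """Permissible exposure time under the 3 dBA halving rule.

    Every `exchange_rate_db` above `reference_dba` halves the time
    allowed at the reference level.
    """
    return reference_hours / 2 ** ((level_dba - reference_dba) / exchange_rate_db)


# Reproduce the three values quoted in the text.
for level in (85, 88, 91):
    print(f"{level} dBA -> {permissible_exposure_hours(level):g} h")
```

For the three levels mentioned in the text this gives 8, 4 and 2 hours. Levels below 85 dBA are outside the rule as stated, so the sketch should not be read as a limit for them.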

The authors propose the implementation of two IoT structures based on the previously developed multi-agent system for sound environmental monitoring [3]. The first structure evaluates the level of environmental sounds. The second recognizes spoken commands of the human voice. The IoT network is a suitable implementation of the proposed variants of a multi-agent system for analyzing audio information.

Multi-agent system for monitoring sound information using IoT

This system is built around the Raspberry Pi 3, a small single-board microcomputer that fits in the palm of a hand. Despite its size, the Raspberry Pi 3 has sufficient power for complex computing projects [4, 5] and enough resources to run the full version of Raspbian OS.

Raspberry Pi 3 is well suited for creating an IoT network in wearable and embedded implementations, where small size matters. It is equipped with an HDMI connector that provides access to a smartphone, iPad and other external devices.

The Raspberry Pi 3 is powered by a solar battery, which keeps it running 24/7. It is connected to the Internet wirelessly over Wi-Fi (other communication links can also be used). A sound device is connected to the Raspberry Pi 3 to detect and analyze sound from the environment. The Raspberry Pi 3 is also connected to an Arduino Uno, an open-source microcontroller board based on the Microchip ATmega328P microcontroller and developed by Arduino.cc [6]. The board is equipped with sets of digital and analog I/O pins, which can be connected to various expansion boards (shields) and other circuits (Figure 1) [4, 5].

On the Arduino side, a solar battery and a Global System for Mobile Communications (GSM) module can be connected to communicate with the operator.


Figure 1. Multi-agent system for monitoring sound information using IoT

Algorithm and structure of a multi-agent automatic voice detection system

The sound detection algorithm, based on a median filter, is highly reliable even under significant background noise. At the recognition stage, two statistical classifiers are compared, one using Gaussian mixture models (GMM) and the other hidden Markov models (HMM). A fairly good recognition rate can be achieved (98% at 70 dB and above 80% at a signal-to-noise ratio of 0 dB), even under strong degradation by Gaussian white noise.
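The text does not say how the median filter is applied, so the following is only a sketch of one common arrangement. It assumes that short-time frame energies are smoothed with a sliding median and then compared with a threshold. The function names, frame length, window size and threshold are illustrative assumptions, not the authors' parameters.

```python
from statistics import median


def frame_energies(samples, frame_len):
    """Mean squared amplitude of consecutive, non-overlapping frames."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def median_filter(values, window=3):
    """Sliding median with an odd window; edges are padded by repetition."""
    if not values:
        return []
    half = window // 2
    padded = [values[0]] * half + list(values) + [values[-1]] * half
    return [median(padded[i:i + window]) for i in range(len(values))]


def detect_sound(samples, frame_len=4, window=3, threshold=0.1):
    """One boolean per frame: True where the smoothed energy exceeds threshold."""
    smoothed = median_filter(frame_energies(samples, frame_len), window)
    return [energy > threshold for energy in smoothed]


# An isolated one-frame click followed by a sustained three-frame sound.
signal = [0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0]
print(detect_sound(signal, frame_len=2))
```

In this toy signal the single loud frame is suppressed by the median, while the three-frame burst is kept. This is the property that makes a median filter attractive against impulsive background noise.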

The steps of the proposed algorithm are as follows.

1. The sound (audio) is received by the sensor (microphone).

2. The Google voice API is used to convert audio to text.

3. The text is compared with the commands already stored in the command configuration file.

4. If the text corresponds to a stored command, the bash command interpreter runs it.

5. The recognized command is sent from the Raspberry Pi, acting as an interactive response system, to the Arduino controller.

6. Based on this message, the Arduino interacts with the GSM module connected to it.

7. The GSM module sends a message to the operator at the desired destination.
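Steps 3 to 5 above can be sketched as a small lookup-and-forward routine. This is an illustration under stated assumptions: the command table, the action codes `ALERT` and `LOCATE`, and the `send_to_arduino` callable are hypothetical. Speech-to-text (step 2) is taken as already done, and the bash interpreter of step 4 is replaced here by a plain dictionary lookup.

```python
def normalize(text: str) -> str:
    """Lower-case the text and collapse whitespace so matching is tolerant."""
    return " ".join(text.lower().split())


def match_command(recognized_text, commands):
    """Step 3: return the action for the recognized text, or None."""
    return commands.get(normalize(recognized_text))


def handle_utterance(recognized_text, commands, send_to_arduino):
    """Steps 4-5: forward a matched action to the Arduino link.

    `send_to_arduino` is any callable that accepts a string. On real
    hardware it could wrap a serial-port write (an assumption).
    """
    action = match_command(recognized_text, commands)
    if action is None:
        return False
    send_to_arduino(action + "\n")
    return True


# Hypothetical command configuration: spoken phrase -> action code.
COMMANDS = {"send help": "ALERT", "report location": "LOCATE"}

sent = []  # stands in for the link to the Arduino
handle_utterance("  Send   HELP ", COMMANDS, sent.append)
print(sent)
```

Passing the transport in as a callable keeps the matching logic testable without an Arduino attached. Unrecognized phrases are ignored rather than forwarded.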

The block diagram of the IoT system that implements this algorithm, using the appropriate voice-processing software on the Raspberry Pi MP and the Arduino controller, is shown in Figure 2.

A prototype of the proposed system was created, and experiments were conducted to test it and determine whether it functions correctly. The audio recognizer was trained using various audio samples (in .wav format). After that, the voice of the same person who was registered in the system was used in various situations to assess the accuracy of the IoT system. The results are given below.

[Block diagram: Sound Detector -> Raspberry Pi -> Send command to Arduino -> Arduino + GSM -> Send message with location]

Figure 2. Block diagram for automatic detection of a sound command

Structure of IoT voice detection system

The complete detection and recognition system is described and evaluated based on an audio database containing more than 800 signals distributed across six different classes. The emphasis is on reliable methods that allow using this system in a noisy environment [7].

A Raspberry Pi single-board computer and an Arduino were used to create a sound detection module, which is mainly based on software. All hardware components are ready-to-use peripherals that are connected directly to Raspberry Pi and Arduino and configured according to the needs of the project.

There are various Raspberry Pi models; this project uses the Model B. It has eight general-purpose inputs/outputs (GPIO), two USB ports, a High-Definition Multimedia Interface (HDMI) output and other functions not relevant to the project. In addition, the Raspberry Pi needs an operating system, which is stored on a Secure Digital (SD) card. Choosing an operating system was not easy because many are available; the most relevant are Raspbian, Pidora and RISC OS.

Raspbian was chosen for the current project because of the wide variety of tutorials and information available online. Raspbian is a freely licensed Linux operating system based on Debian and optimized for Raspberry Pi hardware [4, 5].

1, 2022

The approach used for speech recognition

Speech recognition is the process of identifying spoken words by converting the signal [7] captured by an audio device (voice input device). It is also a system that recognizes verbal commands of the human voice and converts them into data that a computer can act on. Sound is anything that can be heard and has certain signal characteristics, while speech is sound consisting of spoken words. Voice or speech recognition is thus the effort needed to make a sound recognizable or identifiable so that it can be used. Voice recognition can be divided into three approaches: the acoustic-phonetic approach, the artificial intelligence approach and the pattern recognition approach [8]. The pattern recognition approach to speech recognition can be explained using the flowchart shown in Figure 3 [9].
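The pattern recognition approach can be illustrated with a toy nearest-pattern classifier covering feature extraction, pattern learning, matching and decision logic. This is a sketch under stated assumptions, not the recognizer used in the prototype. The two features (mean energy and zero-crossing rate), the labels and the distance threshold are hypothetical, and a practical system would use richer features such as those compared in [9].

```python
import math


def extract_features(samples):
    """Toy feature vector: (mean energy, zero-crossing rate).

    Expects at least two samples.
    """
    energy = sum(s * s for s in samples) / len(samples)
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return (energy, crossings / (len(samples) - 1))


def learn_patterns(labelled_samples):
    """Pattern learning: average the feature vectors seen for each label."""
    sums = {}
    for label, samples in labelled_samples:
        f = extract_features(samples)
        total, count = sums.get(label, ((0.0, 0.0), 0))
        sums[label] = ((total[0] + f[0], total[1] + f[1]), count + 1)
    return {label: (t[0] / n, t[1] / n) for label, (t, n) in sums.items()}


def recognize(samples, model, max_distance=0.5):
    """Match against the learned patterns, then apply the decision logic:
    accept the nearest pattern only if it is close enough."""
    if not model:
        return None
    f = extract_features(samples)
    label, dist = min(((name, math.dist(f, p)) for name, p in model.items()),
                      key=lambda pair: pair[1])
    return label if dist <= max_distance else None


# Two hypothetical training sounds: a steady "hum" and an alternating "buzz".
model = learn_patterns([("hum", [1, 1, 1, 1]), ("buzz", [1, -1, 1, -1])])
print(recognize([1, 1, 1, -1], model))   # closest to the "hum" pattern
print(recognize([0, 0, 0, 0], model))    # silence matches nothing -> None
```

The rejection threshold in `recognize` plays the role of the decision logic: an input that is far from every learned pattern is reported as unrecognized instead of being forced into the nearest class.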

Machine learning is part of Google's cloud platform for creating applications that can hear, see and understand the world around them. Among the pre-trained machine learning models, the Google Translate API and the Cloud Vision API have been joined by the Google Cloud Speech API.

(a) Block diagram of pattern learning. Blocks: Learning Sound, Feature Extraction, Pattern Learning, Model

(b) Block diagram of voice recognition. Blocks: Feature Extraction, Match with model patterns, Decision Logic, Recognized Sound

Figure 3. Speech recognition flowchart

With such a complete set of APIs, developers can build applications that can see, hear and translate [10]. The Google Cloud Speech API lets developers turn audio into text by applying neural network models. The API recognizes more than 110 languages and variants to support a global user base. Among many other use cases, it can transcribe text dictated into an application's microphone, enable voice control, transcribe recorded audio files, recognize uploaded audio on demand, and integrate with audio stored in Google Cloud Storage [11].
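As a rough illustration of turning audio into text, the sketch below builds the JSON body of a synchronous recognition request in the shape used by the Cloud Speech REST interface (`config` plus base64-encoded `audio.content`). It only constructs the request and does not contact the service. The sample rate, language code and encoding are assumed values, and sending the request would also require valid credentials, which are outside this sketch.

```python
import base64
import json


def build_recognize_request(audio_bytes: bytes,
                            sample_rate_hz: int = 16000,
                            language_code: str = "en-US") -> dict:
    """Build the body of a synchronous speech recognition request.

    Assumes raw 16-bit linear PCM audio (for example, the data chunk of
    a .wav file). The defaults are illustrative, not the prototype's.
    """
    return {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": sample_rate_hz,
            "languageCode": language_code,
        },
        "audio": {
            "content": base64.b64encode(audio_bytes).decode("ascii"),
        },
    }


# Four placeholder PCM samples stand in for a recorded command.
body = build_recognize_request(b"\x00\x01" * 4)
print(json.dumps(body, indent=2))
```

The transcript returned by the service would then be the text that is compared with the stored commands.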

Evaluation of sound detection using an automatic detector

A prototype of the proposed system was created, and experiments were conducted to test it and determine its correctness. The audio recognizer was trained using various audio samples (in .wav format). After that, the voice of the same registered person was used in various situations to assess the accuracy of the system. The voice sound detection system was tested in the following scenarios:

1. An environment with a lot of background noise.

2. A quiet environment with minimal background noise.

3. A situation where the voice is not loud (quiet).

4. A situation where the voice is loud.

Figure 4 shows the results achieved.

[Bar chart. Legend: Successful, Unsuccessful. Categories: Voice is not too loud; Quiet environment with minimal background noise; Voice is loud; Busy environment with a lot of background noise]

Figure 4. Sound detection results in various environments

Figure 4 shows the results of how the system worked on 150 test samples of the audio voice, after the audio voice was previously registered in the module database. Table 1 shows the statistical data obtained.

Table 1. Success rate of the automatic sound detection system

S/No. | Test condition | Successful/unsuccessful (of 150) | Accuracy (%)
1 | Voice is not too loud | 90/60 | 60.00
2 | Quiet environment with minimal background noise | 120/30 | 80.00
3 | Voice is loud | 130/20 | 86.67
4 | Busy environment with a lot of background noise | 100/50 | 66.67

The analysis of the data in Table 1 showed that the voice sound analysis system works better when the voice is loud, as well as in a place where there is minimal background noise.
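The percentages in Table 1 follow directly from the counts, reading each pair as successful/unsuccessful detections out of 150 samples. The short sketch below reproduces that arithmetic. The dictionary layout and function name are illustrative assumptions.

```python
# Counts from Table 1: (successful, unsuccessful) per test condition.
RESULTS = {
    "Voice is not too loud": (90, 60),
    "Quiet environment with minimal background noise": (120, 30),
    "Voice is loud": (130, 20),
    "Busy environment with a lot of background noise": (100, 50),
}


def accuracy_percent(successful: int, unsuccessful: int) -> float:
    """Share of successful detections, in percent, rounded to two decimals."""
    return round(100.0 * successful / (successful + unsuccessful), 2)


for condition, (ok, failed) in RESULTS.items():
    print(f"{condition}: {accuracy_percent(ok, failed):.2f}%")
```

Each pair sums to 150, which matches the number of test samples stated in the text, and the computed values agree with the accuracy column.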

Conclusion

1. Sound detection and analysis is implemented as an IoT system that includes a sound sensor, a Raspberry Pi single-board computer, an Arduino microcontroller with the appropriate software, and notification of the operator.

2. Structures and tools for the analysis of sounds and the human voice have been developed. For the second structure, an algorithm is presented that includes converting voice into text, matching it against stored commands, and processing the match by sending a notification to the operator.

3. The IoT network implementation for voice analysis was studied on 150 examples. The tests and their results show that the voice sound analysis system works better when the voice is loud and when background noise is minimal.

REFERENCES

1. UNE-ISO 1996-1:2005. Acoustics: Description, Measurement and Assessment of Environmental Noise. Basic Quantities and Assessment Procedures; ISO: Geneva, Switzerland, 2005.

2. Passchier-Vermeer, W.; Passchier, W.F. Noise exposure and public health. Environ. Health Perspect. 2000, 108, 123-131.

3. Vishnyakou, U.A. Approach to distributed multi-agent system for processing sound information of the environment / U.A. Vishnyakou, B.H. Sayya // System Analysis and Applied Informatics, 2019, No. 3. - P. 47-53.

4. Kasim, Mohammad; Khan, Firoz. Home Automation using Raspberry Pi-3.

5. McManus, S. (2014). Raspberry Pi for dummies. John Wiley & Sons.

6. Arduino. Arduino open-source prototyping platform. http://www.arduino.cc, 2012.

7. Lane, N.D., Georgiev, P., Qendro, L. DeepEar: robust smartphone audio sensing in unconstrained acoustic environments using deep learning. In: Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing. ACM, 2015. Pp. 283-294.

8. Speech Recognition Application for the Speech Impaired using the Android-based Google Cloud Speech API. December 2018.

9. SN Endah, S Adhy, S Sutikno. Comparison of Feature Extraction Mel Frequency Cepstral Coefficients and Linear Predictive Coding in Automatic Speech Recognition for Indonesian. TELKOMNIKA Telecommunication Computer Electronics and Control. 2017; 15(1): 292.

10. H Purwanto. Ortopedagogik Umum. Yogyakarta. IKIP Yogyakarta. 1998.

11. M Stenman. Automatic speech recognition: an evaluation of Google Speech. Umeå University. 2015.

