

USAGE OF EDGE SELECTION METHODS TO REFINE CANDIDATE BLOCKS IN THE MOTION COMPENSATION PROCESS, BASED ON THE SAD ALGORITHM USING CHARACTERISTIC POINTS.

Dibrivniy O.
Ph.D. student / senior lecturer
Grebenyuk V.
Ph.D. student / senior lecturer
State University of Telecommunications, Kyiv

ABSTRACT

Video compression is one of the most important parts of the video distribution process. The main goal of this article is to illustrate and evaluate the use of algorithms for selecting edges in the image in order to optimize the algorithm for estimating the similarity of video sequence frames when the sum of absolute differences (SAD) is used as a metric in video compression through motion compensation.

Keywords: edge detection, Laplacian of Gaussian, Canny, motion compensation, sum of absolute differences, video compression, SSD, PSNR.

Introduction: Video compression is the reduction and removal of redundant video data in order to optimize the storage and transfer of digital video files.

During this process, the source video signal is processed by an algorithm to create a compressed file ready for transmission and storage. To play a compressed file, an inverse algorithm is applied, which reproduces essentially the same video stream as the original source. The time it takes to compress, send, unpack, and display a file is called the delay. The more complex the compression algorithm, the greater the delay. The joint work of such a pair of algorithms is called a video codec (encoder/decoder). Video codecs built on different standards are incompatible, so video data compressed using one standard cannot be unpacked using another: one algorithm cannot correctly decode the result produced by another algorithm. Different video compression standards use different methods to reduce the amount of data, and thus the results differ in data rate, quality and level of delay [1].

It should be noted that different standards may be based on the same compression techniques. For example, one of the most commonly used video stream compression algorithms is motion compensation, which exploits the similarity of neighboring frames in a video sequence and finds motion vectors of individual parts of the image (usually 16x16 and 8x8 pixel macroblocks; modern encoders also use 32x32 and 64x64 blocks) [2]. One of the main open issues in the motion compensation method is the process of assessing the similarity of images. There are quite a few metrics for evaluating this parameter, such as the sum of squared differences (SSD), on which the peak signal-to-noise ratio is based; it gives good results in terms of accuracy but poor results in terms of execution time (the multiplication operation is slow, and even a table of squares does not greatly speed up the process). Modern systems most often use a metric based on the sum of absolute differences (SAD); it is in fact the simplest of the possible metrics and is calculated by taking the absolute difference between each pixel in the input macroblock and the corresponding pixel in the block used for comparison. It should be noted that the SAD algorithm shows worse precision than the above metric and has low noise resistance, so it is used only as a first stage, after which the resulting set of candidate blocks is processed by a metric that better reflects the specifics of the human eye, such as SSD. Despite its simplicity and relatively easy parallelization (due to the independence of operations on individual pixels), the calculation of SAD takes from 40 to 80% of the time of the entire video stream encoding process. Thus, it is reasonable to consider reducing the number of operations needed to obtain SAD. A possible solution to this problem is to use characteristic points (a comparison template for macroblocks, which are essentially square matrices whose cells store information about one of the frame components) [3].
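As an illustration, the basic SAD metric between a macroblock and a candidate block can be sketched in a few lines of NumPy (a minimal sketch for clarity, not the authors' implementation; the function name and typing choices are ours):

import numpy as np

def sad(block, candidate):
    """Sum of absolute differences between two equally sized luma blocks."""
    # Promote uint8 pixels to a signed type so the subtraction cannot wrap around.
    return int(np.abs(block.astype(np.int32) - candidate.astype(np.int32)).sum())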

Usage of comparison templates to reduce the number of operations of the SAD algorithm: To evaluate the performance of the proposed algorithm, we will use a video stream with a frame rate of 29.97 frames per second, in MPEG4 (H264) format and YUV 4:2:0 color format. The frame size is 1920x1080 pixels with a color depth of 8 bits. The calculation is performed for 16x16 pixel macroblocks by a full search of blocks in the range of ±32x±32 pixels (which gives the highest accuracy of the result), for the Y (brightness) component of the image, because the human eye is more sensitive to changes in image brightness than in color. With such input data, SAD requires 4096 metric calculations for each block; given that there are more than 8 thousand blocks per frame, this amounts to over 33 million metric calculations for a sequence of 2 frames. The first 6 frames of the sequence are presented in Figure 1.

Figure 1 - The first 6 frames of the video sequence
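To make the cost of the full search concrete, a straightforward (unoptimized) sketch of exhaustive block matching on the Y plane might look as follows; offsets from -32 to 31 give 64 x 64 = 4096 candidate positions per block, matching the count above. Function and parameter names are illustrative assumptions, not the authors' code:

import numpy as np

def full_search(ref_y, cur_y, bx, by, block=16, rng=32):
    """Exhaustively match one 16x16 block of the current frame against every
    position within a +/-32 pixel window of the reference frame, using SAD."""
    h, w = ref_y.shape
    cur = cur_y[by:by + block, bx:bx + block].astype(np.int32)
    best_vec, best_cost = None, float("inf")
    for dy in range(-rng, rng):              # -32 .. 31 -> 64 offsets per axis
        for dx in range(-rng, rng):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + block > h or x + block > w:
                continue                     # candidate window falls outside the frame
            cand = ref_y[y:y + block, x:x + block].astype(np.int32)
            cost = np.abs(cur - cand).sum()
            if cost < best_cost:
                best_vec, best_cost = (dy, dx), cost
    return best_vec, best_cost               # motion vector and its SAD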

The idea of the characteristic-points template is to exclude part of the pixels from the calculations, relying on the non-uniform distribution of pixel values within the frames of the video stream. Since the main purpose of this article is to illustrate the operation of edge selection algorithms for template comparison methods, we will focus on the TSAD (third SAD) template (Fig. 2), which calculates the SAD using every third pixel along the diagonals of the matrix parallel to the anti-diagonal.

Figure 2 - View of the TSAD template (red pixels are used for comparison)
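A possible reconstruction of the TSAD template in code: keeping the pixels whose row and column indices sum to a multiple of three selects every third diagonal parallel to the anti-diagonal and leaves 86 of the 256 pixels of a 16x16 block, i.e. a 66.4% reduction in comparisons, consistent with the figure reported below. The exact phase of the template is our assumption based on Figure 2:

import numpy as np

idx = np.arange(16)
# Pixels on every third diagonal parallel to the anti-diagonal: (row + col) % 3 == 0.
TSAD_MASK = ((idx[:, None] + idx[None, :]) % 3) == 0   # 86 True cells out of 256

def tsad(block, candidate, mask=TSAD_MASK):
    """SAD computed only over the pixels selected by the comparison template."""
    diff = np.abs(block.astype(np.int32) - candidate.astype(np.int32))
    return int(diff[mask].sum())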

Thus, by reducing the number of operations by 66.4% for the video sequence described above, we obtain a maximum relative deviation of the SAD of candidate blocks of 5.7%, with an average deviation of 3.81%. When calculating on real data, another important indicator of the effectiveness of the chosen block-similarity estimation algorithm is the number of candidate blocks for each individual block. This number is greatest in homogeneous areas of the frame, which remain unchanged from frame to frame; such areas can be neglected, because their SAD will be zero. As a result of using the TSAD template, the average number of candidate blocks per block grows from 2.35 to 4.88. The overlap of the blocks found using the TSAD template is 97.34% of the blocks found by the classic SAD. As a result, the number of candidate blocks slightly more than doubles, with a small loss of the required information. It is also worth noting that these effects manifest themselves in different parts of the image: the loss of candidate blocks occurs in areas with minimal entropy, while the increase in the number of candidate blocks occurs in the blocks with the highest entropy (sharp differences in the brightness component), which are observed at the edges of objects in the image. Since homogeneous blocks will in most cases be neglected due to the absence of an inter-frame difference, for the remaining blocks we can re-evaluate the similarity with the candidate blocks using SSD, which eliminates the uncertainty among the candidate blocks obtained through the classic SAD and reduces the average number of candidate blocks for TSAD to 1.89 (which is still not sufficient to form a coded sequence). It is also worth noting that, due to the peculiarities of the implementation of SAD, refining candidate blocks with this algorithm can lead to the appearance of artifacts in areas of high entropy. So in our case it is worth considering algorithms that are most effective when working with areas of high entropy [4].

Such methods are edge selection algorithms. Since changes in lighting and color usually do not greatly affect the edges of the image, the search for edges is usually performed on a grayscale image; for a video stream in YUV format, the luminance component Y is essentially such a grayscale image.

The idea of using image edge search algorithms is as follows. After creating an array of candidate blocks using TSAD, a block for which the sum of absolute differences is lower than a threshold S (selected on the basis of the desired image quality parameters) is immediately considered a correctly found candidate, without further refinement by standard methods such as SSD or RDO. Next, we remove the compensated areas from the subsequent calculations and perform edge selection using one of the algorithms described later in this section. After that, for blocks whose difference is greater than the selected validity threshold, we calculate the SAD between the candidate blocks and the block over which the search is performed, exclusively on the obtained map of object edges, choosing the candidate for which the SAD is smallest. In this article, we consider two edge selection methods: Canny and LoG (Laplacian of Gaussian).
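The refinement step described above can be sketched as follows: given binary edge maps of the two frames, the SAD is recomputed only over the edge maps for each remaining candidate offset, and the closest candidate is kept. All names here are illustrative, and the thresholding and bookkeeping around this step are simplified:

import numpy as np

def refine_by_edges(cur_edges, ref_edges, bx, by, candidates, block=16):
    """Pick, among the remaining candidate offsets, the one whose edge map is
    closest (by SAD) to the edge map of the block being matched.
    cur_edges / ref_edges are binary edge maps of the current and reference
    frames; candidates is a list of (dy, dx) offsets kept after the TSAD stage."""
    cur = cur_edges[by:by + block, bx:bx + block].astype(np.int32)
    best_off, best_cost = None, float("inf")
    for dy, dx in candidates:
        cand = ref_edges[by + dy:by + dy + block,
                         bx + dx:bx + dx + block].astype(np.int32)
        cost = np.abs(cur - cand).sum()
        if cost < best_cost:
            best_off, best_cost = (dy, dx), cost
    return best_off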

Overview of edge selection algorithms

The Canny algorithm consists of four stages:

1. Image blur (decreases the variance of additive noise in the image).

2. Differentiation of the blurred image and calculation of gradient values in the x direction and the y direction.

3. Non-maximum suppression.

4. Threshold processing.

In the first stage of the Canny algorithm, the image is smoothed using a mask with a Gaussian filter.

The equation of the Gaussian distribution in N dimensions has the form:

G(r) = \frac{1}{(2\pi\sigma^2)^{N/2}} e^{-r^2/(2\sigma^2)} \quad (1)

where r is the blur radius and \sigma is the standard deviation of the Gaussian distribution.

Next, the image gradient is found by convolving the smoothed image, obtained with the Gaussian filter, in both the vertical and horizontal directions.

We use the Sobel operator to solve this problem [5]. The convolution is performed by simply moving the filter mask from point to point across the image; at each point (x, y), the filter response is calculated using predefined relationships.

For this step we use the following matrices:

K_{Gx} = \begin{pmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{pmatrix}, \qquad K_{Gy} = \begin{pmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{pmatrix} \quad (2)

Convolution with these kernels gives Gx and Gy, two matrices in which each point contains the approximate derivatives with respect to x and y. Squaring and summing them at each point, taking the square root, and writing the result at the current coordinates x and y of the new image gives the gradient magnitude:

G = \sqrt{G_x^2 + G_y^2} \quad (3)

Using this information, we can also calculate the direction of the gradient:

\theta = \arctan\left(\frac{G_y}{G_x}\right) \quad (4)
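Equations (2)-(4) translate directly into code; a small sketch using SciPy's convolution (our illustration, not part of the original work):

import numpy as np
from scipy.ndimage import convolve

K_GX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
K_GY = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], dtype=np.float64)

def sobel_gradient(image):
    """Gradient magnitude (3) and direction (4) of a grayscale image."""
    f = image.astype(np.float64)
    gx = convolve(f, K_GX)
    gy = convolve(f, K_GY)
    magnitude = np.sqrt(gx ** 2 + gy ** 2)
    direction = np.arctan2(gy, gx)   # quadrant-aware form of arctan(Gy / Gx)
    return magnitude, direction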

The result is an initial selection of edges on the object of interest.

The next step is to compare each pixel with neighboring ones along the gradient direction and calculate the local maximum. Gradient direction information is needed to remove pixels near the boundary without breaking the boundary near the local gradient maxima: boundary pixels are the points at which the local gradient maximum is reached in the direction of the gradient vector.

The next step is to use a threshold to determine the location of the boundary at a given point in the image. The smaller the threshold, the more boundaries will be found, but the more susceptible the result will be to noise, highlighting extraneous image data. Conversely, a high threshold can ignore weak edges or produce a boundary broken into fragments. Boundary selection uses two filtering thresholds: if the pixel value is above the upper threshold, it takes the maximum value (the boundary is considered valid); if it is below the lower threshold, the pixel is suppressed; points with values between the thresholds take a fixed intermediate value. The task is then to select the groups of pixels that received an intermediate value in the previous stage and either assign them to the boundary (if they are connected to one of the established boundaries) or suppress them (otherwise). A pixel is added to a group if it touches it in one of 8 directions. The selection of edges for the areas in which there is an inter-frame difference, for the second frame of the video sequence, is presented in Fig. 3.

Figure 3 - The result of the selection of object boundaries by the Canny method
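In practice the whole chain (blur, gradients, non-maximum suppression, hysteresis) can be exercised with a single library call; a hedged sketch using OpenCV on the Y plane, with placeholder thresholds that would need tuning for the sequence in question:

import cv2

def canny_edges(y_plane, low=50, high=150):
    """Edge map of the luminance plane: Gaussian smoothing followed by
    cv2.Canny, which performs gradient computation, non-maximum suppression
    and double-threshold hysteresis internally."""
    blurred = cv2.GaussianBlur(y_plane, (5, 5), 1.4)
    return cv2.Canny(blurred, low, high)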

The Laplace operator is essentially the second derivative; its application emphasizes discontinuities in the brightness levels of the image and suppresses areas with weak changes in brightness. This results in an image containing grayish lines in place of contours and other discontinuities, superimposed on a dark, featureless background. However, the background can be "restored" while maintaining the sharpening effect achieved by the Laplacian. For this, it is enough to combine the original image with the Laplacian: to obtain the final result, the Laplacian image is subtracted from the original image [6].

Thus, the generalized algorithm for using the Laplacian to improve images is as follows:

g(x, y) = \begin{cases} f(x, y) - \nabla^2 f(x, y), & w(0,0) < 0 \\ f(x, y) + \nabla^2 f(x, y), & w(0,0) > 0 \end{cases} \quad (5)

where w(0,0) is the value of the center coefficient of the Laplacian mask and \nabla^2 f is the Laplace operator, which for two variables is defined as follows:

\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} \quad (6)

Since derivatives of any order are linear operators, the Laplacian is a linear operator. The discrete formula of the two-dimensional Laplacian given by equation (6) is obtained by combining the second-order partial derivatives with respect to the variables x and y:

\frac{\partial^2 f}{\partial x^2} = f(x + 1, y) + f(x - 1, y) - 2f(x, y) \quad (7)

\frac{\partial^2 f}{\partial y^2} = f(x, y + 1) + f(x, y - 1) - 2f(x, y) \quad (8)

from which we obtain the formula for the Laplacian:

\nabla^2 f = [f(x + 1, y) + f(x - 1, y) + f(x, y + 1) + f(x, y - 1)] - 4f(x, y) \quad (9)

An example of the result of this algorithm for the areas with an inter-frame difference, for the second frame of the video sequence, is presented in Fig. 4.

Figure 4 - The result of the selection of object boundaries by the LoG method
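A comparable sketch of the LoG path, again with illustrative sigma and threshold values rather than the authors' settings: the luminance plane is smoothed with a Gaussian and then filtered with the discrete Laplacian of equation (9); strong responses are kept as the edge map.

import cv2
import numpy as np

def log_edges(y_plane, sigma=1.4, threshold=8):
    """Laplacian of Gaussian edge map of the luminance plane."""
    blurred = cv2.GaussianBlur(y_plane, (5, 5), sigma)
    lap = cv2.Laplacian(blurred, cv2.CV_64F)   # 3x3 kernel with a -4 centre
    return (np.abs(lap) > threshold).astype(np.uint8) * 255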

When choosing a method of selecting edges, we will be guided by the following factors:

1. The relative noise level of the image.

2. The desired accuracy of the obtained edge selection.

3. The speed of the algorithm.

It should be noted that in our case the noise in the image will be relatively insignificant, because we select the edges after only the first of the three image-processing stages at which noise arises, so the remaining two do not contribute:

1. The stage of digitization of the image, in which there is noise caused by external influences (electromagnetic, thermal) on the sensors and analog-to-digital converters (ADC) of the registration system.

2. The encoding step, in which noise is caused by quantization and lossy data compression;

3. Transmission stage, noise caused by partial distortion or loss of data as a result of interference.

Thus, we can also afford methods that are strongly affected by image noise, such as LoG, which has rather weak noise resistance.

As a result, by selecting the edges for the image areas with undefined candidate blocks, and then calculating the SAD for the blocks along the object boundaries, we obtain the following reduction in the number of candidate blocks for the blocks at the object boundary. After processing with LoG, the average number of candidate blocks decreased to 1.119, while for Canny it was 1.173, which is due to the excessive selection of small objects by the Canny method. It is also worth noting that edge selection using the Canny algorithm takes on average 31.6% more time than with LoG. As can be seen from the obtained values, we achieved a significant reduction in the number of undefined blocks, after which the remaining blocks can be resolved using the same SSD + PSNR. It is worth noting that after the final evaluation of the candidate blocks using the SAD and the edge map, 13.1% of edge blocks remain undefined for HSAD and 10.7% for TSAD.

It remains to estimate the time spent searching for motion vectors using the classic SAD and using the proposed TSAD search template supplemented by edge selection. For the test video sequence, the calculation with the classical SAD algorithm took 29.7% more time than the proposed algorithm with edge selection.

Conclusions:

The proposed method consists of the following steps:

1. Selection of the inter-frame difference and exclusion from the following steps of the reference image blocks whose inter-frame difference is zero.

2. Obtaining candidate blocks using SAD and the proposed comparison template (one in three) - TSAD.

3. Elimination of blocks for which we have 1 candidate block.

4. Selection of the edges of the rest of the image using the LoG filter.

5. Clarification of candidate blocks for the obtained image using SAD.

6. Final clarification of candidate blocks by one of the similarity metrics such as SSD based on PSNR.

The obtained method allowed us to reduce the execution time by 29.7% for the test video sequence with a length of 1.31 s, a frame rate of 29.97 frames per second, in MPEG4 (H264) format and YUV 4:2:0 color format. The frame size is 1920x1080 pixels with a color depth of 8 bits. At the same time, the number of operations at the stage of forming candidate blocks was reduced by 66.4% due to comparison by characteristic points. The disadvantage of using this template was a sharp deterioration in the accuracy of image estimation in areas with high entropy (at object boundaries and in areas of sharp color transition) - an increase in the average number of candidate blocks from 2.35 per block to 4.88 - which was eliminated by selecting the edges of objects. The performance of two edge selection methods was evaluated - LoG and Canny - among which LoG performed better: the number of candidate blocks was reduced to 1.119 versus 1.173 for Canny, with an edge selection time 31.6% lower than Canny's.


It should be noted that when developing the method, the authors did not aim at a software implementation of a product; the relative execution time was estimated for the same software implementation and the same computer configuration, so it should be kept in mind that the change in algorithm running time may differ for other software implementations.

References

1. Chen T.C. et al. Analysis and Architecture Design of an HD720p 30 Frames/s H.264/AVC Encoder, IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 6, 2006, pp. 673-688.

2. Deng L., Gao W. An efficient hardware implementation for motion estimation of AVC standard, IEEE Trans. Consumer Electron., vol. 51, no. 4, 2005, pp. 1360-1366.

3. Duanmu X.Q., Duanmu C.J., Zou C.R. A multilevel successive elimination algorithm for block matching motion estimation, IEEE Trans. Image Process., vol. 9, no. 3, 2005, pp. 501-504.

4. Liu J., Yuan L., Xie X. Hardware-Oriented Adaptive Multiresolution Motion Estimation Algorithm and Its VLSI Architecture, vol. 78, no. 1, 2016, pp. 4799-5341.

5. Weibin R., Zhanjing L., Wei Z., Lining S. An improved Canny edge detection algorithm, IEEE International Conference on Mechatronics and Automation, 2014.

6. Kong H., Akakin H.C., Sarma S.E. A Generalized Laplacian of Gaussian Filter for Blob Detection and Its Applications, IEEE Transactions on Cybernetics, vol. 43, no. 6, 2013, pp. 1719-1733.
