- Technical Sciences -
IMPLEMENTATION AND COMPARISON OF THE SOBEL OPERATOR ON CPU AND GPU USING CUDA
K.A. Spiridonov¹, Graduate Student
I.S. Stulov¹, Graduate Student
I.A. Ferapontov², Graduate Student
¹Moscow Aviation Institute (National Research University)
²Plekhanov Russian University of Economics
(Russia, Moscow)
DOI:10.24412/2500-1000-2024-10-5-66-69
Abstract. This article examines the Sobel operator, which is used to detect edges in images. Particular attention is paid to two implementations: one on the central processing unit (CPU) and one on the graphics processing unit (GPU). The paper discusses the technical aspects of implementing the Sobel operator on the GPU, including optimization and the distribution of computation across the GPU architecture. It also compares the performance of the CPU and GPU implementations in order to evaluate how effective the GPU is for such tasks. Finally, the article covers key aspects of developing algorithms with CUDA, a parallel computing platform and programming model for GPUs.
Keywords: CPU, GPU, CUDA, Sobel operator.
Computing derivatives is one of the most important applications of convolution. Derivatives play a central role in mathematics and physics, and the same is true of computer vision. The images we work with consist of pixels, each of which, for a grayscale image, holds a brightness value. In other words, an image is simply a two-dimensional matrix of numbers. The derivative of an image is therefore the ratio of the increment in pixel value to the increment in pixel position, which in practice is estimated from differences between neighboring pixels.
An image A is a function of two variables, A(x, y), that is, a scalar field. It is therefore more accurate to speak of the gradient of the image rather than its derivative.
The operator calculates the brightness gradient of the image at each point: the direction of the greatest increase in brightness and the magnitude of the change in that direction. The result shows how sharply or smoothly the brightness changes at each point, and therefore how likely the point is to lie on an edge, as well as the orientation of that edge. In practice, the magnitude of the brightness change (the likelihood that a point belongs to an edge) is more reliable and easier to interpret than the direction.
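In symbols, the quantities described above can be written as follows. The notation is an assumption made here for illustration: G_x and G_y stand for the horizontal and vertical derivative estimates at a pixel, matching the variables Gx and Gy in the listings, and theta is the gradient direction.

$$
\nabla A = (G_x,\; G_y), \qquad
|\nabla A| = \sqrt{G_x^2 + G_y^2}, \qquad
\theta = \operatorname{atan2}(G_y,\; G_x)
$$

Only the magnitude is used in the implementations in this article; the direction is shown for completeness.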
One such convolution is the Sobel operator. This operator is used in computer vision to highlight boundaries. To apply the Sobel operator, we use two matrices:
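$$
G_x = \begin{bmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{bmatrix} * A, \qquad
G_y = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix} * A
$$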
where * denotes the convolution operation.
CPU Implementation:
#include <algorithm>
#include <cmath>
#include <cstdint>
// rgb_to_gray() is assumed to be defined elsewhere; its definition is not shown.
void apply_sobel_operator(const uint8_t *img, int width, int height, int channels,
                          uint8_t *res, int8_t Wx[][3], int8_t Wy[][3]) {
    for (int i = 0; i < width; ++i) {
        for (int j = 0; j < height; ++j) {
            double Gx = 0, Gy = 0;
            for (int u = -1; u <= 1; ++u) {
                for (int v = -1; v <= 1; ++v) {
                    // Clamp neighbor coordinates so border pixels are replicated.
                    int ip = std::max(std::min(i + u, width - 1), 0);
                    int jp = std::max(std::min(j + v, height - 1), 0);
                    const uint8_t *p = img + (jp * width + ip) * channels;
                    double pix = rgb_to_gray(p[0], p[1], p[2]);
                    Gx += Wx[u + 1][v + 1] * pix;
                    Gy += Wy[u + 1][v + 1] * pix;
                }
            }
            // Gradient magnitude, saturated to the 8-bit range.
            double grad = std::min(255., std::sqrt(Gx * Gx + Gy * Gy));
            uint8_t *out = res + (j * width + i) * channels;
            out[0] = out[1] = out[2] = static_cast<uint8_t>(grad);
            // Copy alpha only if the image has a fourth channel.
            if (channels == 4) out[3] = img[(j * width + i) * channels + 3];
        }
    }
}
The CPU implementation has no unusual or advanced features. One possible improvement concerns memory access during the convolution: storing the image in a layout better suited to the access pattern could reduce the number of cache misses and thereby improve performance. However, this would require preprocessing the image, which in turn needs additional memory.
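One possible reading of this suggestion, offered as a minimal sketch rather than the authors' method: convert the image once to a contiguous grayscale plane, then run the filter with rows in the outer loop so that memory is read mostly in order. The names to_gray_plane and sobel_on_gray are hypothetical, the grayscale weights are the ones that appear in the GPU listing, and the kernels are the usual 3x3 Sobel masks in one common sign convention.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: convert an interleaved RGB(A) image into one
// contiguous grayscale plane (one float per pixel).
std::vector<float> to_gray_plane(const uint8_t *img, int width, int height, int channels) {
    std::vector<float> gray(static_cast<std::size_t>(width) * height);
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x) {
            const uint8_t *p = img + (static_cast<std::size_t>(y) * width + x) * channels;
            gray[static_cast<std::size_t>(y) * width + x] =
                0.299f * p[0] + 0.587f * p[1] + 0.114f * p[2];
        }
    return gray;
}

// Hypothetical variant of the CPU loop: rows in the outer loop, so the
// row-major grayscale plane is traversed in storage order.
std::vector<uint8_t> sobel_on_gray(const std::vector<float> &gray, int width, int height) {
    static const int Wx[3][3] = {{1, 0, -1}, {2, 0, -2}, {1, 0, -1}};
    static const int Wy[3][3] = {{1, 2, 1}, {0, 0, 0}, {-1, -2, -1}};
    std::vector<uint8_t> out(gray.size());
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x) {
            float gx = 0.0f, gy = 0.0f;
            for (int v = -1; v <= 1; ++v)
                for (int u = -1; u <= 1; ++u) {
                    // Clamp coordinates so border pixels are replicated.
                    int xp = std::max(std::min(x + u, width - 1), 0);
                    int yp = std::max(std::min(y + v, height - 1), 0);
                    float pix = gray[static_cast<std::size_t>(yp) * width + xp];
                    gx += Wx[v + 1][u + 1] * pix;
                    gy += Wy[v + 1][u + 1] * pix;
                }
            // Gradient magnitude, saturated to the 8-bit range.
            float grad = std::min(255.0f, std::sqrt(gx * gx + gy * gy));
            out[static_cast<std::size_t>(y) * width + x] = static_cast<uint8_t>(grad);
        }
    return out;
}
```

The extra buffer costs width * height floats, which is the additional memory mentioned above. Whether the change pays off depends on the image size and the cache hierarchy, and it was not measured in this article.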
GPU Implementation:
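__constant__ char Wx[3][3], Wy[3][3];
__global__ void apply_sobel_operator(cudaTextureObject_t img, uchar4 *res, int width, int height) {
    // Global thread index and grid-wide stride (not shown in the original listing; standard form assumed).
    int idx = blockDim.x * blockIdx.x + threadIdx.x;
    int idy = blockDim.y * blockIdx.y + threadIdx.y;
    int off_x = blockDim.x * gridDim.x;
    int off_y = blockDim.y * gridDim.y;
    double Gx, Gy, grad, pix;
    uchar4 p;
    for (int y = idy; y < height; y += off_y)
        for (int x = idx; x < width; x += off_x) {
            Gx = 0; Gy = 0;
            for (int u = -1; u <= 1; ++u) {
                for (int v = -1; v <= 1; ++v) {
                    p = tex2D<uchar4>(img, x + u, y + v);
                    pix = 0.299 * p.x + 0.587 * p.y + 0.114 * p.z;
                    Gx += Wx[u + 1][v + 1] * pix;
                    Gy += Wy[u + 1][v + 1] * pix;
                }
            }
            grad = min(255., sqrt(Gx * Gx + Gy * Gy));
            // Take alpha from the center pixel, as the CPU version does.
            p = tex2D<uchar4>(img, x, y);
            res[y * width + x] = make_uchar4(grad, grad, grad, p.w);
        }
}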
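The two outer loops of the kernel follow the grid-stride pattern. The host-side C++ sketch below models only their index arithmetic, assuming that idx and idy are the global thread indices and that off_x and off_y are the total numbers of threads along each axis. The function count_visits and its parameters are hypothetical; grid_* and block_* stand in for gridDim and blockDim. It counts how many times each pixel is processed for a given launch configuration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side model of the kernel's nested grid-stride loops.
// Returns, for every pixel, the number of threads that would process it.
std::vector<int> count_visits(int width, int height,
                              int grid_x, int grid_y, int block_x, int block_y) {
    std::vector<int> visits(static_cast<std::size_t>(width) * height, 0);
    const int off_x = grid_x * block_x;  // total threads along x
    const int off_y = grid_y * block_y;  // total threads along y
    // Enumerate every thread by its global index (idx, idy).
    for (int idy = 0; idy < off_y; ++idy)
        for (int idx = 0; idx < off_x; ++idx)
            // Same loop bounds and strides as in the kernel body.
            for (int y = idy; y < height; y += off_y)
                for (int x = idx; x < width; x += off_x)
                    ++visits[static_cast<std::size_t>(y) * width + x];
    return visits;
}
```

Under these assumptions every pixel is visited exactly once whether the launch has fewer threads than pixels (each thread then handles several pixels) or more (the surplus threads do nothing), which is why the same kernel can be timed with very different grid and block sizes.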
A little bit about constant memory
Constant memory is one of the fastest kinds of memory available on the GPU. Its distinctive feature is that data can be written to it from the host, while code running on the GPU can only read it, which is where the name comes from. The __constant__ specifier is used to place data in constant memory.
If an array is to be stored in constant memory, its size must be specified in advance, because constant memory, unlike global memory, does not support dynamic allocation. The cudaMemcpyToSymbol function writes from the host to constant memory, and cudaMemcpyFromSymbol copies from the device back to the host. This differs somewhat from the approach used with global memory.
The filter coefficients are written to constant memory as follows (the third argument is the size in bytes):
cudaMemcpyToSymbol(Wx, host_Wx, 9);
cudaMemcpyToSymbol(Wy, host_Wy, 9);
Benchmarks and results:
Table 1. Benchmark: execution time, ms, by test image size
Configuration (grid, block)   100x100   500x500   1000x1000   2000x2000   5000x5000
CPU                             1.902    48.823     200.103     853.682    5218.232
1x1, 32x1                       0.618    11.989      65.734     232.912    1308.420
1x1, 32x32                      0.179     3.102      15.083      49.431     299.033
32x32, 32x8                     0.111     0.732       2.682      10.001      58.932
32x32, 32x32                    0.157     0.973       3.992      12.783      59.562
64x64, 32x8                     0.204     1.291       4.712      11.421      60.058
64x64, 32x32                    0.361     1.401       4.302      16.103      62.842
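To make the comparison in Table 1 easier to read, the speedup of a GPU configuration over the CPU can be computed as the ratio of the two execution times. The sketch below does this for the "32x32, 32x8" row, which has the lowest time in every column. The function name speedup and the array names are hypothetical; the numbers are copied from the table.

```cpp
// Execution times in ms from Table 1, ordered by test image size:
// 100x100, 500x500, 1000x1000, 2000x2000, 5000x5000.
const double cpu_ms[5] = {1.902, 48.823, 200.103, 853.682, 5218.232};
const double gpu_ms[5] = {0.111, 0.732, 2.682, 10.001, 58.932};  // "32x32, 32x8"

// Speedup of the GPU configuration over the CPU for test image i (0..4).
double speedup(int i) {
    return cpu_ms[i] / gpu_ms[i];
}
```

With these figures the ratio grows with image size, from about 17x at 100x100 to about 88.5x at 5000x5000, which suggests that fixed launch and transfer overheads weigh more heavily on small images.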
Results:
Figure 1. Original picture
Figure 2. The Sobel operator applied to that image