On Graph-Based Image Segmentation Using Graph Cuts in Feature Space
Anna Fabijanska
Abstract—In this paper the problem of graph-based image segmentation is considered. A modification of the min-cut/max-flow algorithm is proposed. The main change introduced by the proposed approach is to regard the neighborhood in feature space rather than the spatial neighborhood of pixels used in the original method. Results provided by the proposed approach are presented, compared with the results of the source method, and discussed.
Index Terms—Image segmentation, Graph theory, Min-cut/max-flow, Feature space.
I. Introduction
IMAGE segmentation is a crucial problem in machine vision. It has therefore been widely studied over the years, and numerous distinctly different approaches to image segmentation have been proposed.
Recent research on image segmentation has seen an increasing interest in graph-based techniques, which perform object extraction by partitioning a graph-based image representation into subgraphs.
The most representative methods for graph-based image segmentation are: (i) spectral graph partitioning, which uses the eigenvectors of the graph Laplacian to partition the graph [1], and (ii) combinatorial graph cuts, which perform segmentation by solving the min-cut/max-flow problem [2,3]. Spectral graph partitioning methods address problems of NP complexity and are too slow for practical applications of machine vision. Therefore they are not considered in this paper.
The main attention of this paper is focused on image segmentation methods which use combinatorial graph cuts. Specifically, an extension of the min-cut/max-flow method introduced in [2] is proposed. The main change in the proposed approach is to regard the neighborhood in feature space rather than the spatial neighborhood of adjacent pixels used in the original method.
The remainder of this paper is organized as follows. Firstly, in Section 2 the graph representation of an image is explained. Next, in Section 3 min-cut/max-flow image segmentation is outlined. This is followed in Section 4 by the description of the proposed approach. Results of this approach are presented and compared with the source method in Section 5. Some of the computational issues are discussed in Section 6. Finally, Section 7 concludes the paper.
II. Graph-based Image Representation
A digital image can be considered as a weighted graph with pixels represented by nodes $v_i \in V$ connected by edges $e_{ij} = \{v_i, v_j\} \in E$. Each edge has a nonnegative weight $w_{ij}$ which describes the similarity between the incident nodes. With respect to the graph-based image representation, image segmentation is a partitioning of the graph $G = (V, E)$ into two disjoint sets A and B, where $A \cup B = V$ and $A \cap B = \emptyset$. The partitioning is performed according to some criterion and aims at removing the edges that connect subgraphs A and B (see Fig. 1).
Fig. 1. Graph partitioning; (a) input graph G; (b) edges to be removed are denoted by dashed lines; (c) disjoint subgraphs Ga and Gb
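The graph representation described above can be illustrated with a minimal sketch. It assumes a 4-connected grid, row-major node ids starting at 0, and a hypothetical similarity weight 1 - |I_i - I_j| / 255; none of these choices is prescribed by the text. The helper names `grid_graph` and `cut_edges` are introduced here for illustration only.

```python
def grid_graph(image):
    """Return {(i, j): w} for a 4-connected grid graph over a 2-D list of intensities.

    Node ids are row-major indices. The weight is a hypothetical similarity
    1 - |I_i - I_j| / 255 (an assumption made for this sketch).
    """
    rows, cols = len(image), len(image[0])
    edges = {}
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            # Link each pixel to its right and bottom neighbor so that every
            # undirected edge is stored exactly once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    j = rr * cols + cc
                    edges[(i, j)] = 1.0 - abs(image[r][c] - image[rr][cc]) / 255.0
    return edges


def cut_edges(edges, part_a):
    """Edges with exactly one endpoint in part_a, i.e. those removed by the partition."""
    return {e: w for e, w in edges.items() if (e[0] in part_a) != (e[1] in part_a)}
```

Calling `cut_edges(grid_graph(image), A)` for some node set A lists the edges that would be drawn dashed in a partition such as the one in Fig. 1.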
III. Min-Cut/Max-Flow Image Segmentation
Manuscript received March 28, 2012. This research was supported by Ministry of Science and Higher Education of Poland in a framework of research project no. N N516 490439 (funds for science in years 2010-2012). The author receives financial support from the Foundation for Polish Science in a framework of START fellowship.
Anna Fabijanska is with the Computer Engineering Department, Technical University of Lodz, 18/22 Stefanowskiego Str., 90-924 Lodz, Poland, phone: +48 42 631-27-50; fax: +48 42 631-27-55; e-mail: an_fab@kis.p.lodz.pl.
A. Min-Cut/Max-Flow Theorem
A cut (S, T) of a directed graph is a set of edges $C \subset E$ such that the two terminals become separated in the induced graph $G' = (V, E \setminus C)$. The minimal cut (min-cut) is the cut of minimum total capacity. According to the min-cut/max-flow theorem, the capacity of the minimal cut is equal to the maximum flow that can be passed from the source S to the sink T [4]. This concept is presented in Figure 2.
R&I, 2012, №4
Fig. 2. The idea of the min-cut/max-flow theorem. The maximum flow is equal to the total capacity of the minimal cut C = {{S,A},{S,B}}
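The theorem can be checked numerically on a small example. The sketch below uses the Edmonds-Karp augmenting-path algorithm, chosen here only for brevity; it is not the algorithm of [3]. The edge capacities are hypothetical, since the values drawn in Figure 2 are not reproduced in the text; they were picked so that the minimal cut consists of the edges {S,A} and {S,B}.

```python
from collections import deque


def max_flow_min_cut(capacity, source, sink):
    """Edmonds-Karp: return (max flow value, min-cut edges) for a dict-of-dicts graph."""
    # Residual capacities, including zero-capacity reverse edges.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)

    def bfs_parents():
        # Breadth-first search over edges with remaining residual capacity.
        parents = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parents:
                    parents[v] = u
                    queue.append(v)
        return parents

    flow = 0
    while True:
        parents = bfs_parents()
        if sink not in parents:
            break
        # Recover the augmenting path, find its bottleneck, then push flow.
        path, v = [], sink
        while parents[v] is not None:
            path.append((parents[v], v))
            v = parents[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        flow += bottleneck

    # Nodes still reachable from the source form the source side of the minimal cut.
    reachable = set(parents)
    cut = [(u, v) for u in reachable for v in capacity.get(u, {}) if v not in reachable]
    return flow, cut


# Hypothetical capacities (not taken from Figure 2).
capacity = {"S": {"A": 3, "B": 2}, "A": {"B": 1, "T": 2}, "B": {"T": 3}, "T": {}}
flow, cut = max_flow_min_cut(capacity, "S", "T")
# flow == 5 and cut == [("S", "A"), ("S", "B")]: the cut capacity 3 + 2 equals the flow.
```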
B. Graph-Cut Image Segmentation
The inspiration for the approach presented in this paper was the min-cut/max-flow segmentation proposed in [2]. In this method an image is represented by a weighted and undirected graph $G = (V, E)$. The set of nodes $V = P \cup \{S, T\}$ consists of a subset P of nodes corresponding to pixels and two terminal nodes: the object terminal S (source) and the background terminal T (sink). The set E of edges consists of two types of undirected edges: n-links, which connect neighboring pixels, and t-links, which connect pixels with the terminals. Every pixel has up to four n-links to the closest neighboring pixels and two t-links, {p, S} and {p, T}, connecting it to the source and the sink respectively. An exemplary graph obtained for a 3x3 image is shown in Figure 3.
Fig. 3. Exemplary graph obtained for 3x3 image [2]
Weights $B_{\{p,q\}}$ assigned to n-links represent the boundary term and describe the similarity between the neighboring nodes p and q: the higher the weight, the higher the similarity between the pixels. Weights $R_p(\cdot)$ assigned to t-links represent the regional term and define the individual penalties $R_p(\text{"obj"})$ and $R_p(\text{"bkg"})$ for assigning pixel p to the object and the background respectively. The weights of edges suggested by Boykov and Jolly are given in Table I, where:
$$K = 1 + \max_{p \in P} \sum_{q:\,\{p,q\} \in N} B_{\{p,q\}} \qquad (1)$$
and $\lambda$ is a scaling factor indicating the importance of the regional term versus the boundary term.
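TABLE I
Weights for N-links and T-links
Edge      Weight                                For
{p, q}    $B_{\{p,q\}}$                         $\{p,q\} \in N$
{p, S}    $\lambda \cdot R_p(\text{"bkg"})$     $p \in P,\ p \notin O \cup B$
          $K$                                   $p \in O$
          $0$                                   $p \in B$
{p, T}    $\lambda \cdot R_p(\text{"obj"})$     $p \in P,\ p \notin O \cup B$
          $0$                                   $p \in O$
          $K$                                   $p \in B$
Here O and B denote the sets of pixels marked by the user as object and background respectively.
Given the graph defined above, the image segmentation is determined by the edges which become saturated when the maximum flow is sent from terminal S to terminal T. The maximum flow is determined using an efficient algorithm based on augmenting paths proposed in [3].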
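The edge weights suggested by Boykov and Jolly can be assembled as in the following sketch. It assumes that n-link weights are stored as a dictionary keyed by pixel pairs and that the regional penalties are already available per pixel. The function and argument names (`constant_k`, `t_link_weights`, `obj_seeds`, `bkg_seeds`, `lam`) are hypothetical.

```python
def constant_k(n_links, pixels):
    """K = 1 + max over pixels p of the summed weights of the n-links touching p (Eq. 1)."""
    totals = {p: 0.0 for p in pixels}
    for (p, q), b in n_links.items():
        totals[p] += b
        totals[q] += b
    return 1.0 + max(totals.values())


def t_link_weights(pixels, obj_seeds, bkg_seeds, r_obj, r_bkg, lam, k):
    """Return {p: (weight of {p, S}, weight of {p, T})} following Table I."""
    weights = {}
    for p in pixels:
        if p in obj_seeds:
            weights[p] = (k, 0.0)  # hard constraint: p stays connected to the source
        elif p in bkg_seeds:
            weights[p] = (0.0, k)  # hard constraint: p stays connected to the sink
        else:
            # Unmarked pixel: regional penalties scaled by lambda.
            weights[p] = (lam * r_bkg[p], lam * r_obj[p])
    return weights
```

Because K exceeds the total n-link weight at any single pixel, the t-link of weight K at a user-marked pixel is never the cheapest edge to cut there, so the marked pixel keeps its label.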
IV. The Proposed Approach
The main idea of the approach proposed in this paper is to modify Boykov and Jolly’s method by replacing the spatial neighborhood of pixels with the neighborhood in a feature space. Specifically, in the graph representing the input image, n-links connect neighbors in the feature space, i.e. the pixels most similar according to some similarity measure. These are not necessarily adjacent pixels.
In the graph every pixel $p \in P$ can have up to k+MN n-links: k connecting it to its k nearest neighbors, and up to MN coming from pixels which found the pixel p to be one of their nearest neighbors (M and N are the image dimensions). If more pixels than needed are equally similar, the remaining neighbors are chosen randomly.
The idea of graph construction using the proposed method is explained in Figure 4. Specifically, Figure 4a shows an exemplary 3x3 image. The intensity of each pixel is given in bold font. Additionally, the number (id) of the node corresponding to every pixel is given in brackets. Figure 4b presents the adjacency matrix obtained for the spatial adjacency graph as proposed by Boykov and Jolly, where each pixel is connected with up to 4 adjacent pixels. 1’s denote that the nodes of the given ids are connected by n-links; lack of connection between nodes is denoted by 0’s. The graph adjacency matrix obtained for the proposed approach is shown in Figure 4c. The graph was built so that each pixel is connected to its 4 closest (with respect to intensity) neighbors.
It should be mentioned that connecting pixels according to their similarities makes n-links directed edges. This is because the neighborhood in a feature space is not always symmetric, unlike the spatial neighborhood. This may be observed from the graph adjacency matrices shown in Figure 4.
(a) Exemplary 3x3 image (intensity, node id in brackets):
255 (1)   127 (2)    64 (3)
255 (4)   127 (5)    64 (6)
255 (7)     0 (8)     0 (9)

(b) Adjacency matrix, spatial neighborhood:
    1 2 3 4 5 6 7 8 9
1   0 1 0 1 0 0 0 0 0
2   1 0 1 0 1 0 0 0 0
3   0 1 0 0 0 1 0 0 0
4   1 0 0 0 1 0 1 0 0
5   0 1 0 1 0 1 0 1 0
6   0 0 1 0 1 0 0 0 1
7   0 0 0 1 0 0 0 1 0
8   0 0 0 0 1 0 1 0 1
9   0 0 0 0 0 1 0 1 0

(c) Adjacency matrix, neighborhood in the feature space:
    1 2 3 4 5 6 7 8 9
1   0 1 0 1 1 0 1 0 0
2   0 0 1 0 1 1 0 0 1
3   0 1 0 0 1 1 0 0 1
4   1 1 0 0 1 0 1 0 0
5   0 1 1 0 0 1 0 1 0
6   0 1 1 0 1 0 0 1 0
7   1 1 0 1 1 0 0 0 0
8   0 0 1 0 1 1 0 0 1
9   0 0 1 0 0 1 1 1 0
Fig.4. Idea of connecting pixels via n-links; (a) exemplary image 3x3; (b) graph adjacency matrix obtained for a spatial neighborhood; (c) graph adjacency matrix obtained for a neighborhood in a feature space for 4 nearest neighbors with respect to pixel intensities
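The construction of the feature-space adjacency matrix can be sketched as follows. The function name `knn_adjacency` is hypothetical, and the search is a plain brute-force scan. Ties are broken by node id in this sketch, whereas the paper chooses among equally similar pixels at random, so rows that involve ties may differ from those in Figure 4.

```python
def knn_adjacency(features, k):
    """Directed k-nearest-neighbor adjacency: adj[p][q] = 1 if q is among p's k nearest."""
    n = len(features)
    adj = [[0] * n for _ in range(n)]
    for p in range(n):
        # Squared Euclidean distance to every other node, sorted ascending.
        # Equal distances are ordered by node id (an assumption of this sketch).
        dist = sorted(
            (sum((a - b) ** 2 for a, b in zip(features[p], features[q])), q)
            for q in range(n)
            if q != p
        )
        for _, q in dist[:k]:
            adj[p][q] = 1
    return adj


# Intensities of the 3x3 image from Figure 4a, used as one-dimensional features.
features = [(255,), (127,), (64,), (255,), (127,), (64,), (255,), (0,), (0,)]
adj = knn_adjacency(features, 4)
# adj[0] == [0, 1, 0, 1, 1, 0, 1, 0, 0]: node 1 links to nodes 2, 4, 5 and 7.
# adj[0][1] == 1 while adj[1][0] == 0, so the resulting n-links are directed.
```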
Weights $B_{\{p,q\}}$ assigned to n-links in the proposed approach are determined with respect to the Euclidean distance $d(p,q)$ in the feature space. Specifically, they are given by the following equation:
$$B_{\{p,q\}} = \begin{cases} 0 & \text{for } e_{pq} \notin E \\ 1 - \dfrac{d(p,q)}{a} & \text{for } e_{pq} \in E \end{cases} \qquad (2)$$
where
$$d(p,q) = \sqrt{\sum_i (p_i - q_i)^2} \qquad (3)$$
Additionally, $a$ is a scaling factor and $p_i$ is the i-th feature describing pixel p.
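A short numerical sketch of the n-link weights follows. It assumes that Eq. (2) has the form 1 - d(p,q)/a for linked pixels and 0 otherwise; this reading of the equation, the function names, and the example value of the scaling factor are assumptions made for illustration.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two feature vectors (Eq. 3)."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def n_link_weight(p, q, a, connected=True):
    """Assumed reading of Eq. 2: 1 - d(p, q) / a on an n-link, 0 where no link exists."""
    return 1.0 - euclidean(p, q) / a if connected else 0.0


# Two hypothetical three-feature pixels at distance 5, scaled by a = 10.
weight = n_link_weight((3.0, 0.0, 0.0), (0.0, 4.0, 0.0), a=10.0)  # 0.5
```

Under this reading, choosing the scaling factor no smaller than the largest distance between linked pixels keeps all weights nonnegative, as required of edge weights in Section II.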
Weights assigned to t-links are derived from the probabilities that each pixel belongs to the object and to the background, and are given by the following equations:
$$R_p(\text{"obj"}) = -\ln \Pr(I_p \mid \text{"obj"}) \qquad (4)$$
$$R_p(\text{"bkg"}) = -\ln \Pr(I_p \mid \text{"bkg"}) \qquad (5)$$
where $I_p$ denotes the intensity of pixel p and the probability Pr is determined from the intensity distributions in the object and background regions indicated by the user.
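One possible way to obtain the regional penalties is to estimate Pr from a histogram of the intensities marked by the user, as in the sketch below. The additive smoothing constant `eps`, which avoids taking the logarithm of zero for intensities absent from the seeds, is an assumption of this sketch and is not described in the paper; the function name is hypothetical.

```python
import math
from collections import Counter


def regional_term(seed_intensities, levels=256, eps=1e-6):
    """Return a list R with R[I] = -ln Pr(I | region), from a smoothed seed histogram."""
    counts = Counter(seed_intensities)
    # Adding eps to every bin keeps Pr strictly positive for unseen intensities.
    total = len(seed_intensities) + eps * levels
    return [-math.log((counts.get(i, 0) + eps) / total) for i in range(levels)]
```

Calling the function once with the object seeds and once with the background seeds gives lookup tables for Eq. (4) and Eq. (5); intensities that are frequent among the seeds receive a low penalty.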
V. Results
This section shows results of applying the proposed approach to the exemplary 8-bit grey-scale images from Figure 5. Specifically, images of a frog (Fig. 5a), yarn (Fig. 5b), a tree (Fig. 5c), a brain (Fig. 5d) and a plane (Fig. 5e) are considered. The spatial resolution of the considered images does not exceed 256x256 pixels.
During the experiments every pixel was described by three features: its intensity, the average intensity in its 3x3 neighborhood and the corresponding variance of the intensity.
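The three features can be computed as in the following sketch. The paper does not state how image borders are handled; here the window is simply clipped to the image, which is an assumption, as is the use of the population variance. The function name is hypothetical.

```python
def pixel_features(image):
    """Return a row-major list of (intensity, 3x3 mean, 3x3 variance), one per pixel.

    Border pixels use only the neighbors that fall inside the image
    (an assumption; the border handling is not specified in the paper).
    """
    rows, cols = len(image), len(image[0])
    features = []
    for r in range(rows):
        for c in range(cols):
            # Collect the 3x3 window around (r, c), clipped to the image bounds.
            window = [
                image[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
            ]
            mean = sum(window) / len(window)
            variance = sum((v - mean) ** 2 for v in window) / len(window)
            features.append((image[r][c], mean, variance))
    return features
```

The resulting tuples can serve directly as the feature vectors between which the Euclidean distance of Eq. (3) is evaluated. Because the variance spans a different numerical range than the intensity, some rescaling of the features may be advisable, although the paper does not mention one.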
Fig. 5. Exemplary images: (a) frog; (b) yarn; (c) tree; (d) brain; (e) plane
Results of applying the proposed method to the exemplary images are shown and compared with the results of min-cut/max-flow segmentation in Figure 6. Specifically, the first column presents the input image with the constraints imposed by the user on the background and foreground. Green lines indicate pixels which must be included in the object. Similarly, red lines indicate regions belonging to the background. The second column shows the results provided by Boykov and Jolly’s algorithm. The remaining columns present segmentation results obtained using the proposed approach for an increasing number k={5,10,20} of nearest neighbors used to build the graph.
Fig. 6. Results of image segmentation using the proposed method compared to results provided by Boykov and Jolly’s algorithm
Firstly, it should be noted that the main parameter influencing the performance of the proposed method is the number k of nearest neighbors used to build the graph. Changing the value of k makes it possible to adjust the accuracy of image segmentation. Including more neighbors from the feature space in the graph increases the compactness of the result and eliminates regions of slightly lower similarity from the resulting image.
It can also be observed that, regardless of the number of neighbors used to build the graph, performing graph cuts in the feature space as proposed in this paper increases the quality of image segmentation. While segmentation using the graph built with regard to spatial adjacency recovers only the coarse shape of the objects, the proposed approach increases the level of detail present in the output image. This can be observed, for example, in the yarn image, where the proposed method extracted both the yarn core and the protruding fibers, or in the frog image, where the proposed method extracted not only the frog trunk but the legs as well.
VI. Computational Issues
The proposed algorithm requires nearest neighbor searching. This makes it more computationally complex and time consuming than the source method proposed by Boykov and Jolly, which only checks four adjacent pixels. Using a brute-force solution for nearest neighbor searching drastically increases the time of image segmentation. However, the proposed approach usually requires a small, fixed number of neighbors, which can be found efficiently using the approximate nearest neighbor searching algorithm and the corresponding ANN library [5]. As a result, the running time of the proposed method is less than 30 seconds for an image of 256x256 pixels (Intel Core i7 3.2 GHz, 12 GB RAM).
VII. Conclusions
In this paper the problem of graph-based min-cut/max-flow image segmentation was considered and a new approach was proposed. It regards the neighborhood in the feature space during graph construction rather than the spatial neighborhood of pixels, as in the grid graphs used by the previous approaches. This makes it possible to capture non-local properties of images and to obtain more accurate segmentation for a wide spectrum of different images.
Acknowledgements
This research was supported by Ministry of Science and Higher Education of Poland in a framework of research project no. N N516 490439 (funds for science in years 2010-2012). The author receives financial support from the Foundation for Polish Science in a framework of START fellowship.
References
[1] J. Shi, J. Malik, “Normalized Cuts and Image Segmentation”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(22), 2000, pp. 888-905.
[2] Y. Boykov, M.-P. Jolly, “Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D Images”, International Conference on Computer Vision, 2001, pp. 105-112.
[3] Y. Boykov, V. Kolmogorov, “An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 9(26), 2004, pp. 1124-1137.
[4] L. R. Ford, D. R. Fulkerson, “Maximal Flow Through a Network”, Canadian Journal of Mathematics, 8, 1956, pp. 399-404.
[5] D. M. Mount, S. Arya, “ANN: A Library for Approximate Nearest Neighbor Searching”, 2010, available on-line at: http://www.cs.umd.edu/~mount/ANN/
Anna Fabijanska is an Assistant Professor at Computer Engineering Department of Technical University of Lodz (Poland). She received her Ph.D. in Computer Science from Technical University of Lodz in 2007. Her research interests focus on development of image processing and analysis algorithms for industrial and biomedical vision systems.