Probabilistic analysis of the greedy algorithm
N. N. Kuzjurin
Abstract. It is shown that the greedy algorithm in the average case (in some probabilistic model) finds almost minimum covers. It is shown also that in the average case the ratio of the size of a minimum cover to the size of a minimum fractional cover has logarithmic order in the size of the ground set.
1. Introduction
Set cover is one of the oldest and most studied NP-hard problems [7, 6, 8, 2, 4]. Given a ground set $U$ of $m$ elements, the goal is to cover $U$ with the smallest possible number of subsets from a given family $S = \{S_1, \dots, S_n\}$, $S_j \subseteq U$. A cover is an arbitrary subfamily $\{S_j : j \in I\}$, $I \subseteq [n]$, such that
$$U = \bigcup_{j \in I} S_j,$$
where $[n] = \{1, 2, \dots, n\}$.
The value $|I|$ is called the size of the cover. A cover of the smallest size is called a minimum cover. The size of a minimum cover is denoted by $C(S)$.
One of the best polynomial time algorithms for approximating set cover is the greedy algorithm: at each step choose the unused set from the family $S$ which covers the largest number of remaining elements.
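As an illustration (ours, not part of the paper), the greedy rule can be sketched in Python; the function name and the representation of the family as Python sets are our own choices:

```python
def greedy_cover(sets, universe):
    """Greedy set cover: repeatedly pick the set that covers the largest
    number of still-uncovered elements; returns indices of chosen sets."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # index of the set covering the most remaining elements
        best = max(range(len(sets)), key=lambda j: len(sets[j] & uncovered))
        if not sets[best] & uncovered:
            raise ValueError("the family does not cover the universe")
        chosen.append(best)
        uncovered -= sets[best]
    return chosen

# small example: the first pick covers {0,1,2,3}, the second covers {4,5}
print(greedy_cover([{0, 1, 2, 3}, {0, 4}, {4, 5}, {3, 5}], range(6)))  # [0, 2]
```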
A number $R$ is called an approximation ratio of an algorithm $A$ if for all input data $S$ the following holds:
$$\frac{C_A(S)}{C(S)} \le R,$$
where $C_A(S)$ denotes the size of a cover obtained by the algorithm $A$.
Lovász [8] and Johnson [6] showed that the approximation ratio of the greedy algorithm is no worse than $H(m)$, where $H(m) = 1 + 1/2 + \dots + 1/m$ is the $m$th harmonic number, a value which is clearly between $\ln m$ and $1 + \ln m$. Similar results were obtained in [9, 10]. These results were improved slightly by Slavík [11], who showed that the approximation ratio of the greedy algorithm is exactly $\ln m - \ln\ln m + \Theta(1)$. Feige [3] proved that for any $\varepsilon > 0$ no polynomial time algorithm can approximate set cover within $(1 - \varepsilon)\ln m$ unless NP $\subseteq$ DTIME$[n^{O(\log\log n)}]$.
Supported by RFBR, grants 02-01-00713 and 04-01-00359.
It is well known that set cover forms an important class of integer programs
$$\min\{cx \mid Ax \ge b,\ x \in \{0,1\}^n\}, \qquad (1)$$
where $c = (1, \dots, 1)$, $b = (1, \dots, 1)^T$ and $A = (a_{ij})$ is an arbitrary $m \times n$ $(0,1)$-matrix.
To see this it is sufficient to choose the $(0,1)$-matrix $A = (a_{ij})$ such that $a_{ij} = 1$ iff $u_i \in S_j$, where $U = \{u_1, \dots, u_m\}$. In this way covers of a $(0,1)$-matrix (sets of columns covering all rows) correspond to covers by a family of subsets. We denote the size of a minimum cover by $C(A)$ as well.
In particular, it is known (see [10]) that for any $(0,1)$-matrix of size $m \times n$ with at least $k$ 1's in each row the size of the minimum cover $C(A)$ satisfies
$$C(A) \le \frac{\ln(mk/n)}{\ln\frac{n}{n-k}} + \frac{n}{k}. \qquad (2)$$
In fact, it is known that the size of any cover obtained by the greedy algorithm satisfies (2).
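As a quick empirical sanity check of (2) (our own illustration, not from the paper), one can run the greedy algorithm on a random matrix with exactly $k$ 1's per row and compare its cover size with the bound; the matrix size and $k$ are arbitrary choices:

```python
import math
import random

def greedy_cover_size(A):
    """Number of columns chosen by the greedy algorithm to cover all rows
    of a (0,1)-matrix A (each row must contain at least one 1)."""
    m, n = len(A), len(A[0])
    uncovered = set(range(m))
    size = 0
    while uncovered:
        best = max(range(n), key=lambda j: sum(A[i][j] for i in uncovered))
        uncovered -= {i for i in uncovered if A[i][best]}
        size += 1
    return size

def bound_2(m, n, k):
    # the right-hand side of (2): ln(mk/n)/ln(n/(n-k)) + n/k
    return math.log(m * k / n) / math.log(n / (n - k)) + n / k

random.seed(1)
m, n, k = 60, 40, 8
# each row gets exactly k ones in random positions
A = []
for _ in range(m):
    row = [0] * n
    for j in random.sample(range(n), k):
        row[j] = 1
    A.append(row)
print(greedy_cover_size(A), "<=", bound_2(m, n, k))
```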
But all these investigations concerned the worst-case performance of the greedy algorithm. In this paper we consider the average case and show that the asymptotic approximation ratio of the greedy algorithm in the average case is at most $1 + \varepsilon$ for an arbitrary constant $\varepsilon > 0$. It is shown also that the ratio of the size of a minimum cover to the size of a minimum fractional cover is approximately $\ln(mp)$ in the average case.
2. Average case analysis of the greedy algorithm
In this section we consider a probabilistic model in which $A = (a_{ij})$ is a random $(0,1)$-matrix such that $P\{a_{ij} = 1\} = p$ and $P\{a_{ij} = 0\} = 1 - p$ independently for all $i, j$. Then the value $C(A)$ becomes a random variable.
Lemma 1 [1]. Let $Y$ be a sum of $n$ independent random variables each taking the value 1 with probability $p$ and 0 with probability $1 - p$. Then
$$P\{|Y - np| > \delta np\} \le 2\exp\{-(\delta^2/3)np\}.$$
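The bound of Lemma 1 is easy to probe numerically (an illustration we add; the parameters $n$, $p$, $\delta$ and the trial count are arbitrary):

```python
import math
import random

random.seed(0)
n, p, delta, trials = 1000, 0.3, 0.2, 1000

deviations = 0
for _ in range(trials):
    y = sum(random.random() < p for _ in range(n))  # Y ~ Binomial(n, p)
    if abs(y - n * p) > delta * n * p:
        deviations += 1

empirical = deviations / trials
chernoff = 2 * math.exp(-(delta ** 2 / 3) * n * p)
print(empirical, "<=", chernoff)
```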
Let $L_0 = -\ln(mp)/\ln(1 - p)$.
Theorem 1. Let the probability $p$ be such that $0 < p < c < 1$, where $c$ is a constant. Let
$$\frac{\ln\ln n}{\ln(mp)} \to 0 \quad \text{as } n \to \infty, \qquad (3)$$
$$\frac{\ln m}{np} \to 0 \quad \text{as } n \to \infty. \qquad (4)$$
Then for any fixed $\varepsilon > 0$
$$P\{(1 - \varepsilon)L_0 \le C(A) \le (1 + \varepsilon)L_0\} \to 1 \qquad (5)$$
as $n \to \infty$.
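At moderate sizes the theorem can be illustrated numerically (our own experiment, not the paper's): generate a random matrix in this model and compare the greedy cover size with $L_0$. Since the greedy cover only upper-bounds $C(A)$ and $n$ here is finite, we only check agreement within a broad constant factor:

```python
import math
import random

def greedy_cover_size(A):
    """Greedy column cover of the rows of a (0,1)-matrix A."""
    m, n = len(A), len(A[0])
    uncovered = set(range(m))
    size = 0
    while uncovered:
        best = max(range(n), key=lambda j: sum(A[i][j] for i in uncovered))
        newly = {i for i in uncovered if A[i][best]}
        if not newly:  # an all-zero row: the matrix has no cover at all
            return None
        uncovered -= newly
        size += 1
    return size

random.seed(2)
m = n = 300
p = 0.2
A = [[1 if random.random() < p else 0 for _ in range(n)] for _ in range(m)]
L0 = -math.log(m * p) / math.log(1 - p)
size = greedy_cover_size(A)
print(size, "vs L0 =", L0)
```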
Corollary 1. Let $m = cn$, where $c$ is some constant, and let $p$ be a constant. Consider the problem (1) with a random $A$ defined above. Then conditions (3) and (4) hold.

Proof of Theorem 1. Lower bound. Let $X(l)$ be the random variable equal to the number of covers in $A$ of size $l$. We have
$$\mathrm{E}\,X(l) = \binom{n}{l}P(l),$$
where $P(l)$ is the probability that $l$ fixed columns form a cover in $A$. It is not difficult to see that
$$P(l) = (1 - (1 - p)^l)^m \le \exp\{-m(1 - p)^l\}.$$
Thus, using the inequality $\binom{n}{k} \le n^k$, we have
$$\ln \mathrm{E}\,X(l) \le l\ln n - m(1 - p)^l.$$
Taking $l = l_0 = \lceil -(1 - \delta)\ln(mp)/\ln(1 - p)\rceil = \lceil(1 - \delta)L_0\rceil$ and considering two cases ($p$ is a constant, and $p \to 0$), it is not difficult to see that for any fixed $0 < \delta < 1$ under the condition (3) the last expression tends to $-\infty$ as $n$ tends to infinity.
Thus, the probability that there are no covers of size $l_0$ in a random $(0,1)$-matrix $A$ tends to 1, because
$$P\{X(l_0) \ge 1\} \le \mathrm{E}\,X(l_0) \to 0.$$
Clearly, if there are no covers of size $l_0$ in $A$ then there are no covers of size smaller than $l_0$ as well. Therefore,
$$P\{C(A) > l_0\} \to 1.$$
Upper bound. We use the upper bound (2).
By Lemma 1, in a random $(0,1)$-matrix, for any $\delta > 0$, with probability tending to 1 each row contains $k$ 1's with $(1 - \delta)pn \le k \le (1 + \delta)pn$. Indeed, Lemma 1 implies that the probability that some fixed row contains $k$ 1's violating $(1 - \delta)pn \le k \le (1 + \delta)pn$ is
$$P_{\mathrm{bad}} \le 2\exp\{-(\delta^2/3)np\},$$
and the expectation of the number of such rows is at most $mP_{\mathrm{bad}}$. It is not difficult to see that
$$mP_{\mathrm{bad}} \le 2\exp\{\ln m - (\delta^2/3)np\} \to 0 \quad \text{as } n \to \infty,$$
by the condition $(\ln m)/np \to 0$ as $n \to \infty$. Hence Markov's inequality $P\{X \ge 1\} \le \mathrm{E}\,X$ implies that the probability of the event "each row contains $k$ 1's with $(1 - \delta)pn \le k \le (1 + \delta)pn$" tends to 1.
Thus, by (2) we obtain
$$C(A) \le \frac{\ln(mk/n)}{\ln\frac{n}{n-k}} + \frac{n}{k} \le \frac{\ln(mp(1 + \delta))}{-\ln(1 - p(1 - \delta))} + \frac{1}{(1 - \delta)p}.$$
Simplifying, we get
$$\frac{\ln(mp(1 + \delta))}{-\ln(1 - p(1 - \delta))} = \frac{\ln(mp) + \ln(1 + \delta)}{-\ln(1 - p(1 - \delta))}.$$
For any constant $\varepsilon > 0$ there exists a constant $\delta > 0$ such that, for large $n$ (note that $\ln(mp) \to \infty$ by (3), so the term $1/((1-\delta)p)$ is negligible), the latter expression is at most
$$-(1 + \varepsilon)\frac{\ln(mp)}{\ln(1 - p)} = (1 + \varepsilon)L_0.$$
Combining this inequality with the lower bound on $C(A)$ we arrive at the desired inequality (5). The proof of Theorem 1 is complete.
We can reformulate this result in other words. Define the asymptotic approximation ratio of an algorithm as the limit of its approximation ratio as $n$ goes to infinity. Then Theorem 1 gives conditions guaranteeing that the asymptotic approximation ratio of the greedy algorithm in the average case equals 1.
3. Integral and fractional covers
In this section we consider the same probabilistic model, in which $A = (a_{ij})$ is a random $(0,1)$-matrix such that $P\{a_{ij} = 1\} = p$ and $P\{a_{ij} = 0\} = 1 - p$ independently for all $i, j$. In the previous section we estimated the value of $C(A)$ for almost all matrices. In this section we find the value of the optimum of the linear relaxation of (1).
Recall that the linear relaxation of (1) is the same program with the restriction $x \in \{0,1\}^n$ replaced by $0 \le x_j \le 1$, $j = 1, \dots, n$. We denote the optimum value of the linear relaxation by $q(A)$.
Theorem 2. Let
$$\frac{\ln m}{np} \to 0, \quad \frac{\ln n}{mp} \to 0 \quad \text{as } n \to \infty.$$
Then for any fixed $\varepsilon > 0$
$$P\{(1 - \varepsilon)/p \le q(A) \le (1 + \varepsilon)/p\} \to 1$$
as $n \to \infty$.
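The two bounds used in the proof of Theorem 2 can be checked numerically without an LP solver (our own illustration, with arbitrary parameters): summing the constraints gives $q(A) \ge m/(\max$ column sum$)$, while the constant vector with coordinates $1/(\min$ row sum$)$ is feasible, so $q(A) \le n/(\min$ row sum$)$; both should be close to $1/p$:

```python
import random

random.seed(3)
m = n = 500
p = 0.2
A = [[1 if random.random() < p else 0 for _ in range(n)] for _ in range(m)]

row_sums = [sum(row) for row in A]
col_sums = [sum(A[i][j] for i in range(m)) for j in range(n)]

# summing Ax >= 1 over rows: m <= sum_j x_j * colsum_j <= q(A) * max colsum
lower = m / max(col_sums)
# x_j = 1/(min rowsum) for all j is feasible, with objective n / min rowsum
upper = n / min(row_sums)
print(lower, "<= q(A) <=", upper, "  1/p =", 1 / p)
```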
Proof. We have already shown that the condition $(\ln m)/np \to 0$ as $n \to \infty$ implies that the probability of the event "each row contains $k$ 1's with $(1 - \delta)pn \le k \le (1 + \delta)pn$" tends to 1.
Similarly, we can show that the condition $(\ln n)/mp \to 0$ as $n \to \infty$ implies that the probability of the event "each column contains $t$ 1's with $(1 - \delta)pm \le t \le (1 + \delta)pm$" tends to 1.
Let $x = (x_1, \dots, x_n)^T$ be an optimal solution of the linear relaxation of (1) and let $q = q(A) = \sum_{j=1}^n x_j$. Summing the constraints $Ax \ge b$ over all rows, we have
$$\sum_{i=1}^m \sum_{j=1}^n a_{ij}x_j \ge \sum_{i=1}^m 1 = m.$$
On the other hand,
$$\sum_{i=1}^m \sum_{j=1}^n a_{ij}x_j = \sum_{j=1}^n x_j \sum_{i=1}^m a_{ij} \le \sum_{j=1}^n x_j(1 + \delta)pm = q(1 + \delta)mp.$$
Therefore, $q \ge ((1 + \delta)p)^{-1}$. Furthermore, the vector $\bar{x}$ with all coordinates
$$\bar{x}_j = \frac{1}{(1 - \delta)np}$$
is a feasible solution to the linear relaxation of (1) because
$$\sum_{j=1}^n a_{ij}\bar{x}_j = \frac{1}{(1 - \delta)np}\sum_{j=1}^n a_{ij} \ge \frac{(1 - \delta)pn}{(1 - \delta)np} = 1.$$
This implies $\sum_{j=1}^n \bar{x}_j \ge q$, that is, $1/((1 - \delta)p) \ge q$. We have
$$((1 + \delta)p)^{-1} \le q \le ((1 - \delta)p)^{-1}.$$
Since $\delta > 0$ can be taken arbitrarily small, this implies the assertion of Theorem 2.
Corollary 2. Let all the conditions of Theorems 1 and 2 hold. Then for any fixed $\varepsilon > 0$
$$P\{(1 - \varepsilon)\ln(mp) \le C(A)/q(A) \le (1 + \varepsilon)\ln(mp)\} \to 1$$
as $n \to \infty$.
4. Average case analysis: towards the general case
It seems interesting to extend our technique to the general distribution where $P\{a_{ij} = 1\} = p_{ij}$. The main ingredient of this technique was obtaining lower bounds for the size of a minimum cover of random matrices.
In this section we take the first step towards this goal. We consider a probabilistic model in which $A = (a_{ij})$ is a random $(0,1)$-matrix such that $P\{a_{ij} = 1\} = p_i$ and $P\{a_{ij} = 0\} = 1 - p_i$ independently for all $i, j$. The difference between this model and the one from the previous section is that we allow here different probabilities for different rows.
Let
$$p = m^{-1}\sum_{i=1}^m p_i, \qquad p_{\max} = \max_i p_i.$$
Theorem 3. Let $p_{\max} \to 0$ as $n \to \infty$, and
$$\frac{\ln\ln n}{\ln(mp)} \to 0 \quad \text{as } n \to \infty.$$
Then for any fixed $\varepsilon > 0$
$$P\left\{(1 - \varepsilon)\frac{\ln(mp)}{p} \le C(A)\right\} \to 1$$
as $n \to \infty$.
Proof of Theorem 3. Let $X(L)$ be the random variable equal to the number of covers in $A$ of size $L$. Then
$$\mathrm{E}\,X(L) = \binom{n}{L}P(L),$$
where $P(L)$ is the probability that $L$ fixed columns form a cover in $A$. Using the inequality $1 - x \le e^{-x}$, it is not difficult to see that
$$P(L) = \prod_{i=1}^m \left(1 - (1 - p_i)^L\right) \le \exp\left\{-\sum_{i=1}^m (1 - p_i)^L\right\}.$$
Thus, taking into account that $\binom{n}{k} \le n^k$, we get
$$\ln \mathrm{E}\,X(L) \le L\ln n - \sum_{i=1}^m (1 - p_i)^L.$$
Using the fact that the arithmetic mean is always at least the geometric mean, we can estimate the sum as follows:
$$\sum_{i=1}^m (1 - p_i)^L = m\left(\frac{1}{m}\sum_{i=1}^m (1 - p_i)^L\right) \ge m\left(\prod_{i=1}^m (1 - p_i)^L\right)^{1/m} = m\prod_{i=1}^m (1 - p_i)^{L/m}.$$
Taking this into account we get
$$\ln \mathrm{E}\,X(L) \le L\ln n - m\prod_{i=1}^m (1 - p_i)^{L/m}.$$
Using the inequality
$$1 - x \ge e^{-x/(1-x)}, \quad 0 \le x < 1,$$
we have
$$\ln \mathrm{E}\,X(L) \le L\ln n - m\exp\left\{-\frac{L}{m}\sum_{i=1}^m \frac{p_i}{1 - p_i}\right\} \le L\ln n - m\exp\{-Lp(1 + o(1))\},$$
since $p_{\max} \to 0$ implies $p_i/(1 - p_i) = p_i(1 + o(1))$ uniformly in $i$.
Let
$$L_1 = (1 - \varepsilon)\frac{\ln(mp)}{p}.$$
The inequality above implies
$$\ln \mathrm{E}\,X(L_1) \le (1 - \varepsilon)\frac{\ln(mp)}{p}\ln n - m\exp\{-(1 - \varepsilon)\ln(mp)(1 + o(1))\}$$
$$\le \frac{1}{p}\ln m\ln n - m(mp)^{-(1 - \varepsilon)(1 + o(1))}.$$
For any fixed $\varepsilon > 0$ this expression tends to $-\infty$ as $n$ goes to infinity in view of the conditions of Theorem 3.
Thus, the probability that there are no covers of size $L_1$ in a random $(0,1)$-matrix $A$ tends to 1, because by Markov's inequality
$$P\{X(L_1) \ge 1\} \le \mathrm{E}\,X(L_1) \to 0.$$
Clearly, if there are no covers of size $L_1$ in $A$ then there are also no covers of size smaller than $L_1$. Therefore, with probability tending to 1,
$$C(A) \ge L_1 = (1 - \varepsilon)\frac{\ln(mp)}{p}.$$
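As a sanity check of this lower bound (our own experiment, with an arbitrary profile of row probabilities $p_i$), one can generate a matrix with row-dependent probabilities and verify that even the greedy cover, which is at least as large as $C(A)$, exceeds a large fraction of $\ln(mp)/p$:

```python
import math
import random

def greedy_cover_size(A):
    """Greedy column cover of the rows of a (0,1)-matrix A."""
    m, n = len(A), len(A[0])
    uncovered = set(range(m))
    size = 0
    while uncovered:
        best = max(range(n), key=lambda j: sum(A[i][j] for i in uncovered))
        newly = {i for i in uncovered if A[i][best]}
        if not newly:  # an all-zero row: no cover exists
            return None
        uncovered -= newly
        size += 1
    return size

random.seed(4)
m = n = 300
# row probabilities between 0.05 and 0.15, different for different rows
ps = [0.05 + 0.1 * i / (m - 1) for i in range(m)]
A = [[1 if random.random() < ps[i] else 0 for _ in range(n)] for i in range(m)]

pbar = sum(ps) / m
threshold = math.log(m * pbar) / pbar  # ln(mp)/p, without the (1-eps) factor
size = greedy_cover_size(A)
print(size, "vs ln(mp)/p =", threshold)
```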
References
[1] N. Alon and J.H. Spencer, The Probabilistic Method, Wiley, 1992.
[2] V. Chvatal, A greedy heuristic for the set-covering problem, Mathematics of Operations Research, 4 (1979) 233-235.
[3] U. Feige, A threshold of ln n for approximating set cover, Proceedings of the ACM Symposium on Theory of Computing, 1996, pp. 314-318.
[4] M.R. Garey and D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-completeness, Freeman, New York, 1979.
[5] G.P. Gavrilov, A.A. Sapozhenko, Problems and Exercises in Discrete Mathematics, Kluwer Texts in Math. Sci., v. 14, Kluwer Academic Publishers, 1996.
[6] D.S. Johnson, Approximation algorithms for combinatorial problems, J. Comput. System Sci., 9 (1974) 256-278.
[7] R.M. Karp, Reducibility among combinatorial problems, in: Complexity of Computer Computations (R.E. Miller and J.W. Thatcher, Eds.), Plenum, New York, 1972, 85-103.
[8] L. Lovasz, On the ratio of optimal integral and fractional covers, Discrete Math. 13 (1975) 383-390.
[9] R.G. Nigmatullin, An algorithm of steepest descent in the set cover problem (Russian), Proceedings of Symposium on approximation algorithms. Kiev, May 17-22, 1969, p. 36.
[10] A.A. Sapozhenko, On the size of disjunctive normal forms obtained by the gradient algorithm (Russian), Discrete analysis. Novosibirsk, 1972, N 5, p. 111-116.
[11] P. Slavik, A tight analysis of the greedy algorithm for set cover, J. Algorithms, 25 (1997) 237-254.