On vector summation problem in the Euclidean space ? Edward Kh. Gimadi1,2 , Ivan A. Rykov1,2 , and Yury V. Shamardin1 1
2
Sobolev Institute of Mathematics, 4 Acad. Koptyug avenue, 630090 Novosibirsk, Russia Novosibirsk State University, 2 Pirogova Str. 630090, Novosibirsk, Russia
[email protected],
[email protected],
[email protected]
Abstract. We consider a problem of finding a subset of the smallest size in the given set of vectors such that the norm of sum vector is greater or equal to some given value. We show that the problem can be solved optimally with the same complexity as the problem of finding the subset of given cardinality with minimun norm of sum vector.
Keywords: vector subset, sum vector, Euclidean space, exact algorithm
1
Inroduction
We consider the following problem: given the set of vectors V = v1 , . . . , vn in Euclidean space Rd and a real number B. Find a subset X in the set V of minimum cardinality, provided that the norm of sum vector is greater of equal to B. |X| → min ,
P X⊂V (1)
v ≥ B.
v∈X
The given Problem 1 is closely related to the Largest m-Vector Sum (m-LVS) Problem considered in papers [2–7]. In this problem for a given cardinality m one needs to find a subset of exactly m vectors so that the Euclidean norm of the sum vector is maximized. The corresponding problem with arbitrary cardinality of the subset is referred to as LVS [7]. m-LVS Problem is NP-hard in general case [2,6]. However, it was proved that for any fixed size of dimension d of the space Rd an optimal solution can be found in polynomial time [4]. In [7] the exact algorithm with the best known complexity O(nd+1 ) was suggested. In [5] an asymptotically exact randomized algorithm with significantly better time complexity O(nd3/2 (8/7 ln n)d ) was introduced. Problem 1 plays a ”complementary” role to this problem: a cardinality of the subset is minimized provided that the norm of sum vector remains sufficiently large. We will denote the Problem 1 as B-LVS. ?
This research was supported by the Russian Scientifuc Foundation(project 16-1110041)
In particular, the decision variant of the Problem 1 can be stated as ”given a set V , real value B and integer m, is there a subset of size less or equal to m, s.t. the length sum vector is greater or equal to B?”. For the particular case of m equal to n this statement exactly equal to the decision variant of LVS Problem. Thus, B-LVS Problem is also NP-hard.
2
Exact algorithm for solving B-LVS Problem
The idea of exact algorithm for solving this problem is based on using the exact algorithm for the problem of finding subset of vectors of the given cardinality with a minimum norm of sum vector, introduced in [4]. Indeed, solving the m-LVS for m = 1, 2, . . . n, we will choose the smallest m, such that the maximal norm of sum vector for the subset of cardinality m is greater or equals B. In [4] it is shown that m-LVS can be solved optimally with complexity O(d2 n2d ). It is easy to show that the B-LVS Problem can be solved with the same complexity. Theorem 1. Optimal solution of the Problem 1 can be found in O(d2 n2d ) time. Proof. Indeed, consider the exact algorithm for solving m-LVS from [4]. Following notation from [1] let us call an orthogonal hyperplane for the nonzero vector v ∈ Rd a hyperplane defined by the equation (v, x) = 0. This hyperplane is a (d − 1)-dimensional linear subspace, which consists of vectors orthogonal to vector v. We call a family of solution domains for the given non-zero vectors u1 , u2 , . . . , ut a family of maximal (by inclusion) connected subsets of the space Rd , such that these subsets does not intersect with orthogonal hyperplanes of vectors u1 , u2 , . . . , ut . A set of vectors of the space Rd , which contains exactly one vector for each solution domain is called solution domains representatives for the vectors u1 , u2 , . . . , ut . An estimation of time complexity of the algorithm AGP R is based on the following lemma. Lemma 1 ( [1]). A family of domain representatives for non-zero vectors u1 , u2 , . . . , ut in Rd has cardinality O(dtd−1 ) and can be constructed in O(d2 td ) time. In context of solving m-LVS we consider uij = vi − vj , 1 ≤ i < j ≤ n. The full algorithm can be described as follows ( [4]): Step 1. For the set of vectors uij , 1 ≤ i < j ≤ n construct a family W of solution domains representatives. For each w ∈ W perform the steps 2–3; Step 2. Sort the vectors v1 , v2 , . . . , vn by non-increasing projection to the direction given by w; denote the corresponding ordering as v10 , v20 , . . . , vn0 ;
0 Step 3. Put U w = {v10 , v20 , . . . , vm } (i.e., m vectors with maximum projections to w w are taken as U ; Step 4. Find such U ∈ {U w | w ∈ W }, that
X
F (U ) = v = max{F (U w ), | w ∈ W }. v∈U
U is output as algorithm result. By applying Lemma 1 to the set of n(n − 1)/2 vectors {uij } we can see that the time complexity of the step 1 is O(d2 n2d ) and that Steps 2–3 are performed O(dn2d−2 ) times. The total complexity of Steps 2–3 is hence defined by the complexity of Step 2 (equal to O(dn log n)) and Step 3 (equal to O(dn)) multiplied by O(dn2d−2 ), which does not exceed the complexity of the first step. Step 4 has complexity equal to O(dn2d−2 · dn). Now let us notice that the exact algorithm for solving the Problem 1 (we denote it as AB-LVS ) can be obtained from the described algorithm by substituting Steps 3 and 4 by the following steps: 0 w }; = {v10 , v20 , . . . , vm Step 30 . For each m = 1, 2, . . . n put Um 0 w Step 4 . Considering m = 1, 2, . . . n choose Um ∈ {Um , | w ∈ W, m = 1, 2, . . . n}, such
X
w F (Um ) = v = max{F (Um ), | w ∈ W, m = 1, 2, . . . n}; v∈Um
for the smallest m, such that F (Um ) > B, take Um as a solution of the Problem 1. Indeed, the modified algorithm AB-LVS with new steps 30 and 40 consequentially solves m-LVS for each m = 1, 2, . . . n and chooses the least m such that the norm of the the longest sum vector among all the subsets of cardinality m exceeds specified threshold B. At that, the time complexity of Step 30 in the new algorithm equals O(dnd−2 · n · dn) which does not exceed the time complexity of Step 1. Similarly, the time complexity of Step 40 equals O(dn2d−1 · dn2 ), which does not exceed the time complexity of Step 1. Hence, total time complexity of the algorithm is equal to O(d2 n2d ). Thus, we have shown that the algorithm AB-LVS find optimal solution of the Problem 1 in stated time. The proof of the Theorem 1 is complete.
3
Randomized algorithm for the B-LVS Problem
In paper [5] a randomized algorithm Arand based on the similar ”projecting” idea is introduced. Instead of constructing the set of solution domains representatives this algorithm uniformly generates L random directions and solves ”m projections” subproblem (Steps 2, 3 of the algorithm described in previous section) for each of them.
A Recall that a randomized algorithm A is said to have (εA n , δn ) estimates (called relative error and failure probability, correspondingly) over the set In of all the instances with n elements in thee input data, if ∗ f (I) − fA (I) A I ∈ I ≥ 1 − δnA P < ε n n f ∗ (I)
Based on theorem from [5] it is easy to show that with certain value of parameter L algorithm fullfills the condition of the follownig theorem. Theorem 2. Algorithm Arand finds asymptotically optimal solution of m-LVS Problem with the relative error εn ≤
1 2 ln2 n
and the failure probability δn ≤ in
1 n2
T = O nd3/2 (8/7 ln n)d+1
time. It should be noted that this algorithm can’t be directly adapted for solving Problem 1, since it is not guaranteed to find the optimal solution of the problem. Hence, when the input boundary value B is set equal to the optimal length of the sum vector (for any cardinality of the subset), an algorithm can fail to find any feasible solution. E.g., consider the problem input given by value B and the following vectors in R2 : v1 = v2 = . . . = vs = vs+1 = . . . = vs+s/2
B
,0
s B = − ,0 s
for some integer number s. The optimal solution of the Problem 1 if given by the set of m∗ = s vectors with X ∗ = (v1 , . . . vs ). In order to obtain solution with the norm greater or equal B for a sampled random direction, one has to sample exactly (1, 0), the probability of which equals 0. ∗ Denote as Bm — the optimal value of the m-LVS Problem. We consider conditions when application of Arand to each m will provides a solution of the problem. Consider the algorithm AB-LVS-rand : m Step 1. For each m = 1, 2, . . . n find subset Um = {um 1 , . . . um } as result of application of Arand for corresponding m-LVS; Step 2. Choose the least m for which sum vector for Um has norm greater or equal to B. Put U = Um for this m as a solution for problem 1.
∗ ∗ ∗ Theorem 3. If B ≤ maxm (Bm ) and B ∈ / Im = (1 − 2 ln12 n )Bm , Bm for each m = 1, 2, . . . n, then the algorithm AB-LVS-rand finds optimal solution of Problem 1 with failure probability 1 δ≤ n in T = O(n2 d3/2 (8/7 ln n)d+1 ) time. Proof. It is easy to see, that the algorithm AB-LVS-rand finds an optimal solution under the conditions of the theorem. Indeed, optimal value m∗ for the B-LVS ∗ . Under conditionsP of the theorem Problem is the least m, such that B ≤ Bm 1 ∗ ∗ B ≤ Bm if and only if B ≤ (1 − 2 ln2 n )Bm ≤ Cm , where Cm = i (um i ) is norm of the sum vector of the solution Um . The latter inequality holds due to the Theorem 2 in case of no failure. Hence, choosing the least m with B ≤ Cm will provide an optimal solution. The failure probability of solving the Problem m-LVS for each m can be estimated as 1 − (1 − n12 )n ≤ n1 (since the failure is independent for each of n problems). The proof of the Theorem 3 is complete.
4
Conclusion
In this paper we cosidered the problem of minimizing cardinality of vector subset provided that the norm sum vector exceeds certain boundary value. We showed that the exact algorithm for m-LVS can be adapted for solving this problem without increasing of time complexity. We also showed the conditiones on value of B when the randomized algorithm with significantly less complexity can find optimal solution of the problem with high probability (i.e., tending to 1 when n tends to infinity). Acknowledgments. This research was supported by the Russian Scientific Foundation for Basic Research (project 16-11-10041).
References 1. A. Baburin, A. Pyatkin. PolynomialAlgorithms for Solving the Vector Sum Problem. // Journal of Applied and Industrial Mathematics, 2007, Vol. 1, No. 3, pp. 268-272. 2. A. Baburin, E. Gimadi, N. Glebov, A. Pyatkin. The Problem of Finding a Subset of Vectors with the Maximum Total Weight. // Journal of Applied and Industrial Mathematics, 2008, Vol. 2, No. 1, pp. 32-38. 3. E. Gimadi, A. Kel’manov, M. Kel’manova, S. Khamidullin. A Posteriori Detecting a Quasiperiodic Fragment in a Numerical Sequence // Pattern Recognition and Image Analysis. 2008. Vol. 18, No. 1, pp. 30-42.
4. E. Gimadi, A. Pyatkin, I. Rykov. On polynomial solvability of some problems of a vector subset choice in a Euclidean space of fixed dimension. // Journal of Applied and Industrial Mathematics, 2010, Vol. 4, No. 48, pp. 48-53. 5. E. Gimadi, I. Rykov. A randomized algorithm for finding a subset of vectors with the maximum Euclidean norm of their sum // Journal of Applied and Industrial Mathematics, 2015, Vol. 9, No. 3, pp. 351-357. 6. A. Pyatkin. On the complexity of the maximum sum length vectors subset choice problem. // Journal of Applied an Industrial Mathematics, 2010, Vol. 4, No. 4, pp. 549-552 7. V. Shenmaier. Solving some vector subset problems by Voronoi diagrams // Journal of Applied and Industrial Mathematics, 2016, Vol. 10, No. 4, pp. 550–566