
I've developed the following backtracking algorithm, and I'm trying to find its time complexity.

A set of $K$ integers defines a multiset of modular distances between all pairs of them. In this algorithm, I consider the inverse problem: reconstructing all integer sets that realize a given distance multiset. That is:


Inputs: $D=\{(p_i-p_j) \bmod N : i\neq j\}$ and $K$

Output: $P=\{p_1,p_2,\dots,p_K\},\qquad p_i \in \{0,1,2,\dots,N-1\},\qquad p_i > p_j$ for $i>j$

Put simply, the algorithm starts with $K$ blanks to be filled. Initially, it puts 1 in the first blank. For the second blank, it looks for the first integer that, when added to $P$, does not produce any difference beyond those present in $D$. It then does the same for the following blanks. If, while filling a blank, it has checked all possible integers and found none suitable, it goes back to the previous blank and looks for the next suitable integer there. If all blanks are filled, the algorithm is done; otherwise, there is no possible $P$ for this $D$.
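The procedure described above can be sketched roughly as follows. This is only an illustration under assumptions that are not stated in the question: the first blank is fixed to 0 rather than 1 (differences do not change under translation mod $N$), $N$ is passed explicitly, and "suitable" is read as "every new difference is still available in the multiset $D$". The names `reconstruct` and `differences` are hypothetical.

```python
from collections import Counter


def differences(P, N):
    """Multiset of (a - b) mod N over all ordered pairs a != b in P."""
    return Counter((a - b) % N for a in P for b in P if a != b)


def reconstruct(D, K, N):
    """Return one increasing list P of K residues whose difference
    multiset equals D, or None if no such P exists (assumed semantics)."""
    remaining = Counter(d % N for d in D)  # differences not yet explained
    P = [0]                                # first blank fixed to 0 (assumption)

    def fill():
        if len(P) == K:
            return sum(remaining.values()) == 0
        for c in range(P[-1] + 1, N):      # keep P strictly increasing
            new = Counter()
            for p in P:                    # differences c would introduce
                new[(c - p) % N] += 1
                new[(p - c) % N] += 1
            if all(remaining[d] >= n for d, n in new.items()):
                remaining.subtract(new)
                P.append(c)
                if fill():
                    return True
                P.pop()                    # backtrack to the previous blank
                remaining.update(new)
        return False

    return list(P) if fill() else None
```

For instance, with $N=7$ and $K=3$, `reconstruct([1, 6, 3, 4, 2, 5], 3, 7)` returns `[0, 1, 3]`, and `reconstruct([1, 1], 2, 7)` returns `None` because no pair has both differences equal to 1.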

Here's my analysis so far. Since the algorithm checks at most all members of $\{2,\dots,N\}$ for each blank (an upper bound), there are at most $N-1$ candidates to try per blank. If every blank were filled on its first visit, the complexity would be $O((K-1)(N-1))$, since there are $K-1$ blanks to fill (assuming the first one is filled with 1). But the algorithm is more complex than that, because it sometimes goes backward, so some blanks may be visited more than once. I'm looking for the worst-case complexity, i.e., the case in which all blanks are visited and no solution is found.

Mahdi Khosravi
  • I don't understand the algorithm (paragraph 3 of the question). What is the relationship between $K$ and $n$? between the blanks and the set of $n$ integers? Can you give a more self-contained description of the algorithm? By the way, instead of thinking about this as a backtracking algorithm, you could think about this as a recursive algorithm that makes a choice among some number of options at each step; that might be easier to analyze. – D.W. Jul 12 '13 at 00:22

2 Answers


The running time of your algorithm is at most $N (N-1) (N-2) \cdots (N-K+1)$, i.e., $N!/(N-K)!$. This is $O(N^K)$, i.e., exponential in $K$.

Justification: There are $N$ possible choices for what you put into the first blank, and in the worst case you might have to explore each. There are $N-1$ choices for the second blank, and so on. You can draw a tree of the choices made: the first level shows the choice of what to put in the first blank, the second level shows the choice of what to put in the second blank, and so on. The degree of the root is $N$; the degree of the nodes at the second level is $N-1$; and so on. The number of leaves is the product of the degrees at each level, i.e., $N (N-1) (N-2) \cdots (N-K+1)$. In the worst case, your algorithm might have to explore every possible node in this tree (if it is not able to stop early before reaching the $K$th level and backtrack from a higher-up node). Therefore, this is a valid upper bound for the running time of your algorithm.
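As a quick sanity check of this product formula, here is a small sketch that compares it with a brute-force count of the leaves, ignoring any ordering constraint or pruning. The function names are made up for illustration, and `math.perm` requires Python 3.8 or later.

```python
import math
from itertools import permutations


def leaf_upper_bound(N, K):
    """N * (N-1) * ... * (N-K+1), i.e. N! / (N-K)!."""
    return math.perm(N, K)


def brute_force_leaves(N, K):
    """Count every way to fill K blanks with distinct values from 0..N-1."""
    return sum(1 for _ in permutations(range(N), K))
```

For $(N,K)=(6,4)$, both give $6\cdot 5\cdot 4\cdot 3 = 360$.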

If you want a tighter analysis, here is the exact worst-case count (not just an upper bound). The number of leaves in your search tree, in the worst case, is the number of strictly increasing sequences of length $K$ over $\{0,1,\dots,N-1\}$ that start with 0. (We can assume without loss of generality that the first blank contains a 0, as you point out, which is why we can restrict to sequences that start with a 0.) The remaining $K-1$ entries form a $(K-1)$-element subset of $\{1,\dots,N-1\}$, so that number is exactly $C(N-1,K-1) = (N-1)!/((K-1)!(N-K)!)$.
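To see the tighter count concretely, the following sketch walks the unpruned search tree (first blank fixed to 0, values taken from $\{0,\dots,N-1\}$, each blank larger than the previous one) and counts the leaves at level $K$. The helper name `count_leaves` is hypothetical, and the walk assumes the pruning test never fires, which is the worst case discussed here.

```python
import math


def count_leaves(N, K, last=0, depth=1):
    """Count strictly increasing length-K sequences over 0..N-1 that start
    with 0, by recursing over the choices for each successive blank."""
    if depth == K:
        return 1
    # Each child picks a value larger than the one in the previous blank.
    return sum(count_leaves(N, K, c, depth + 1) for c in range(last + 1, N))


def closed_form(N, K):
    """C(N-1, K-1), the count the recursion should reproduce."""
    return math.comb(N - 1, K - 1)
```

For $(N,K)=(6,4)$, both give $C(5,3)=10$ leaves, compared with 360 for the looser product bound.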

This is a tighter analysis, but it doesn't save us from exponential running time. When $N\gg K$, $C(N-1,K-1) \approx N^{K-1}/(K-1)!$, which is still $O(N^K)$, i.e., exponential in $K$.

That said, testing your algorithm experimentally on some real data sets would probably tell you more than deriving a worst-case running time. You might want to compare it to the performance of translating your problem into a SAT instance and using an off-the-shelf SAT solver. Depending upon the values of $N$ and $K$, there might be other, better alternatives as well.

See also the following question for a closely related problem, and for algorithms to solve it:

D.W.
  • Thank you for kindly answering. I've drawn the tree for the case of $(N,K)=(6,4)$ here (note that we can put a $0$ in the first blank without loss of generality). As you said, in the $l$'th level we have at most $N-l$ choices, and the worst case is the case when the algorithm explores all branches. (continued in next comment...) – Mahdi Khosravi Jul 12 '13 at 11:00
  • ... But when the algorithm explores all of the branches, it means that it chose the last branch $(N-1)$. Yet the largest search spaces are within the first branches of each node (as is visible from the tree), and going through the first branches contradicts the worst case of exploring all branches! So is this a good upper bound? – Mahdi Khosravi Jul 12 '13 at 11:03
  • The worst algorithm available is a combinatorial search, which has complexity $\binom NK = \frac{N!}{(N-K)!K!}$, which is lower than the complexity of the proposed algorithm. Is that possible? – Mahdi Khosravi Jul 12 '13 at 11:06
  • Would you mind telling me what a SAT instance and an off-the-shelf SAT solver are? Thanks – Mahdi Khosravi Jul 12 '13 at 11:07