There is a complete graph $G$ with $n$ vertices and each edge has a distinct weight.

Is there an efficient (not necessarily optimal) algorithm to select $k$ vertices from the graph $G$, such that the total cost of the min-spanning tree of the selected $k$ vertices will be highest?

In my case, $n$ is around 1000, and $k$ around 100.

Juho

2 Answers

Your problem is NP-hard (its decision version is NP-complete), so you should not expect to find an efficient exact algorithm for it.

If we could solve your problem efficiently, we could solve the $k$-clique problem, too. Let $G$ be an unweighted, undirected graph; we want to know whether it has a $k$-clique. Translate it into an instance of your problem by giving each edge of $G$ weight 2, and then, for each pair of vertices $u,v$ that are not connected by an edge, adding a new edge of weight 1. Now ask for the solution to your problem. A spanning tree on $k$ vertices has $k-1$ edges, so there exists a way to select $k$ vertices whose minimum spanning tree has cost $2(k-1)$ if and only if $G$ has a $k$-clique: if $G$ does not have a $k$-clique, every selection of $k$ vertices contains a weight-1 edge, the minimum spanning tree uses at least one such edge, and its cost is always $\le 2k-3$. (Your question asks for distinct weights; adding tiny distinct perturbations that sum to less than 1 breaks the ties while preserving the gap between the two cases.)
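A minimal sketch of this reduction, in case it helps to see it concretely. The function names are made up for illustration, and the brute-force loop in `has_k_clique` only stands in for the hypothetical efficient solver; a tree on $k$ vertices has $k-1$ edges, so an all-weight-2 tree costs $2(k-1)$.

```python
from itertools import combinations

def clique_to_weights(n, edges):
    """Complete graph on vertices 0..n-1: weight 2 for edges of G, 1 for non-edges."""
    present = {frozenset(e) for e in edges}
    return {
        frozenset((u, v)): 2 if frozenset((u, v)) in present else 1
        for u, v in combinations(range(n), 2)
    }

def mst_cost(vertices, weight):
    """Prim's algorithm on the complete subgraph induced by `vertices` (k >= 1)."""
    vertices = list(vertices)
    # best[v] = cheapest edge from v to the tree built so far
    best = {v: weight[frozenset((vertices[0], v))] for v in vertices[1:]}
    total = 0
    while best:
        v = min(best, key=best.get)
        total += best.pop(v)
        for u in best:
            best[u] = min(best[u], weight[frozenset((u, v))])
    return total

def has_k_clique(n, edges, k):
    """G has a k-clique iff some k-subset has MST cost 2(k-1) in the weighted graph.

    Brute force is used here only as a placeholder for the solver.
    """
    weight = clique_to_weights(n, edges)
    return any(
        mst_cost(s, weight) == 2 * (k - 1)
        for s in combinations(range(n), k)
    )
```

For a triangle with one pendant vertex, this reports a 3-clique but no 4-clique.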

Since the $k$-clique problem is NP-complete, your problem is NP-hard, too. Therefore, we cannot expect an efficient exact algorithm for your problem when $k$ is arbitrary.

D.W.

A theoretically efficient exact algorithm can be obtained as follows for constant $k$. First, observe there are $\binom{n}{k} = \Theta(n^k)$ ways to choose a set of $k$ vertices. We can step through each set, compute the minimum spanning tree of the subgraph it induces with Kruskal's algorithm, and keep the set whose tree is heaviest. Each induced subgraph has $\binom{k}{2}$ edges, so the total run time will be $O(n^k k^2 \log k)$, or $O(n^k k^2)$ by using Prim's algorithm on an adjacency matrix instead.
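Here is a rough sketch of that enumeration, matching the question's objective (largest minimum spanning tree). It assumes the weights are given as a symmetric matrix `w` with `w[u][v]` the weight of edge $uv$; the function names are illustrative, and Prim's algorithm is used in place of Kruskal's because the induced subgraphs are complete.

```python
from itertools import combinations

def mst_cost(vertices, w):
    """Cost of the minimum spanning tree induced by `vertices` (Prim, O(k^2))."""
    vertices = list(vertices)
    # dist[v] = cheapest edge from v to the tree built so far
    dist = {v: w[vertices[0]][v] for v in vertices[1:]}
    total = 0
    while dist:
        v = min(dist, key=dist.get)
        total += dist.pop(v)
        for u in dist:
            if w[v][u] < dist[u]:
                dist[u] = w[v][u]
    return total

def best_subset_brute_force(w, k):
    """Try every k-subset; return (cost, subset) with the largest MST cost."""
    n = len(w)
    best = max(combinations(range(n), k), key=lambda s: mst_cost(s, w))
    return mst_cost(best, w), best
```

This is only usable for very small $k$, since the number of subsets grows as $\Theta(n^k)$.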

In practice, we can't live with the $n^k$ term when $k$ gets even moderately large. Since you don't require an exact solution, you could apply heuristics on top of this. For example, to make $k$ effectively smaller, first identify suitably many of the heaviest edges and choose their endpoints. Then, by some strategy, complete this partial selection to $k$ vertices (say, by trying all ways of picking the rest of the vertices).
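The first step of that heuristic (collecting endpoints of the heaviest edges) might be sketched like this. It assumes a symmetric weight matrix `w`; the function name and the `size` parameter (how many vertices to seed) are hypothetical, and completing the selection to $k$ vertices is left to whatever strategy you prefer.

```python
def seed_from_heaviest_edges(w, size):
    """Collect up to `size` distinct endpoints, scanning edges from heaviest to lightest."""
    n = len(w)
    edges = sorted(
        ((w[u][v], u, v) for u in range(n) for v in range(u + 1, n)),
        reverse=True,
    )
    chosen = []
    for _, u, v in edges:
        for x in (u, v):
            if x not in chosen and len(chosen) < size:
                chosen.append(x)
        if len(chosen) == size:
            break
    return chosen
```

Sorting all $\Theta(n^2)$ edges is cheap for $n$ around 1000.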

Alternatively, you could consider a totally different approach. Take some good heuristic for finding a maximum weight $k$-clique, and then compute the minimum spanning tree of the vertices it selects. Or just craft a more direct method yourself. I could imagine a genetic algorithm doing pretty well on your problem: use random $k$-sets as your individuals, and use the spanning tree cost computed by Kruskal's algorithm as your fitness function.
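As a rough illustration of the evolutionary idea, here is a mutation-only variant (simpler than a full genetic algorithm with crossover). It assumes a symmetric weight matrix `w` and $k < n$; the names, population size, and generation count are arbitrary choices rather than tuned values, and fitness is the minimum spanning tree cost computed with Prim's algorithm instead of Kruskal's.

```python
import random

def mst_cost(vertices, w):
    """Cost of the minimum spanning tree induced by `vertices` (Prim, O(k^2))."""
    vertices = list(vertices)
    dist = {v: w[vertices[0]][v] for v in vertices[1:]}
    total = 0
    while dist:
        v = min(dist, key=dist.get)
        total += dist.pop(v)
        for u in dist:
            if w[v][u] < dist[u]:
                dist[u] = w[v][u]
    return total

def evolve(w, k, pop_size=30, generations=200, seed=0):
    """Keep the fitter half each generation; children swap one vertex in and out."""
    rng = random.Random(seed)
    n = len(w)

    def fitness(s):
        return mst_cost(s, w)

    # Individuals are random k-subsets of the vertices
    pop = [frozenset(rng.sample(range(n), k)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        children = []
        for parent in survivors:
            out = rng.choice(sorted(parent))
            into = rng.choice([v for v in range(n) if v not in parent])
            children.append((parent - {out}) | {into})
        pop = survivors + children
    best = max(pop, key=fitness)
    return fitness(best), sorted(best)
```

Because the fitter half always survives, the best cost never decreases between generations, but nothing here guarantees an optimal answer.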

By spending some more time thinking about this, you can probably find a faster exact algorithm too. I would start by following the idea in my second paragraph.

Juho