Pick a subgraph that maximizes the total cost of min-spanning tree among all subgraphs of the same size

Question

There is a complete graph $G$ with $n$ vertices and each edge has a distinct weight.

Is there an efficient (not necessarily optimal) algorithm to select $k$ vertices from the graph $G$, such that the total cost of the min-spanning tree of the selected $k$ vertices will be highest?

In my case, $n$ is around 1000, and $k$ around 100.

score 4 · Accepted Answer · answered Aug 03 '14 at 01:26

Your problem is NP-hard (its decision version is NP-complete), so you should not expect to find an efficient exact algorithm for it.

If we could solve your problem efficiently, we could solve the $k$-clique problem, too. Let $G$ be an unweighted, undirected graph; we want to know whether it has a $k$-clique. Translate it into an instance of your problem by giving each edge of $G$ weight 2, and then, for each pair of vertices $u,v$ that are not connected by an edge, adding a new edge of weight 1. Now ask for the solution to your problem. A spanning tree on $k$ vertices has $k-1$ edges, so there exists a way to select $k$ vertices such that the corresponding minimum spanning tree has cost $2(k-1)$ if and only if $G$ has a $k$-clique. (If $G$ does not have a $k$-clique, then any $k$ selected vertices contain a pair joined by a weight-1 edge, which the minimum spanning tree can use, so its cost is always $\le 2k-3$.)

Since the $k$-clique problem is NP-complete, your problem is NP-complete, too. Therefore, we cannot expect an efficient solution for your problem, when $k$ is arbitrary.
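To make the reduction concrete, here is a minimal Python sketch. It builds the weight function described above (2 for edges of $G$, 1 for non-edges) and uses a brute-force search over all $k$-subsets as a stand-in for a solver of your problem. The function names (`mst_cost`, `reduction_weight`, `has_k_clique_via_mst`) are made up for illustration, and the brute-force "oracle" is of course exponential; the point is only to show that the best achievable minimum spanning tree cost equals $2(k-1)$ exactly when a $k$-clique exists.

```python
from itertools import combinations

def mst_cost(vertices, weight):
    """Cost of a minimum spanning tree of the complete graph on `vertices` (Prim)."""
    vertices = list(vertices)
    # best[v] = cheapest edge connecting v to the tree built so far
    best = {v: weight(vertices[0], v) for v in vertices[1:]}
    total = 0
    while best:
        v = min(best, key=best.get)
        total += best.pop(v)
        for u in best:
            best[u] = min(best[u], weight(v, u))
    return total

def reduction_weight(edges):
    """Weight function of the reduction: 2 for edges of G, 1 for non-edges."""
    edge_set = {frozenset(e) for e in edges}
    return lambda u, v: 2 if frozenset((u, v)) in edge_set else 1

def has_k_clique_via_mst(n, edges, k):
    """Decide k-clique by asking for the k-subset with the largest MST cost.

    The max over all subsets is a brute-force placeholder for a hypothetical
    efficient solver of the original problem.
    """
    w = reduction_weight(edges)
    best = max(mst_cost(s, w) for s in combinations(range(n), k))
    return best == 2 * (k - 1)
```

For example, on a triangle `0-1-2` with an extra vertex `3` attached to `2`, `has_k_clique_via_mst(4, [(0, 1), (1, 2), (0, 2), (2, 3)], 3)` should report a 3-clique, while the same call with `k = 4` should not.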

That said, all hope is not lost. – Raphael, Aug 03 '14 at 20:52

Juho · Answer 2 · 2014-08-03T20:16:30.100

A theoretically efficient algorithm can be obtained as follows for constant $k$. First, recall that Kruskal's algorithm computes a minimum spanning tree (and, after negating the edge weights, a maximum weight spanning tree). Then, observe there are $\binom{n}{k} = \Theta(n^k)$ ways to choose a set of $k$ vertices. We can step through each set, compute the minimum spanning tree of the subgraph it induces, and keep the set whose tree is heaviest. The total run time will be $O(n^k m \log n)$, where $m$ is the number of edges examined per set, or even a little bit better by tuning Kruskal's algorithm with a disjoint-set data structure.
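As a rough illustration of this brute-force idea, the following Python sketch enumerates every $k$-subset, runs Kruskal's algorithm with a simple disjoint-set structure on the induced subgraph, and keeps the subset whose minimum spanning tree is heaviest (the objective stated in the question). It assumes the weights are given as a symmetric matrix `w`, and the function names are my own; it is only usable for very small $n$ and $k$.

```python
from itertools import combinations

def kruskal_mst_cost(vertices, w):
    """MST cost of the subgraph induced by `vertices`; w[u][v] is the edge weight."""
    parent = {v: v for v in vertices}

    def find(x):
        # Disjoint-set lookup with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = sorted((w[u][v], u, v) for u, v in combinations(vertices, 2))
    total, used = 0, 0
    for weight, u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:              # edge joins two components: take it
            parent[ru] = rv
            total += weight
            used += 1
            if used == len(vertices) - 1:
                break
    return total

def best_subset_bruteforce(w, k):
    """Return (cost, subset) maximizing the MST cost over all k-subsets."""
    n = len(w)
    return max(
        ((kruskal_mst_cost(s, w), s) for s in combinations(range(n), k)),
        key=lambda pair: pair[0],
    )
```

With $n = 1000$ and $k = 100$ the number of subsets is astronomically large, so this is useful mainly as a reference implementation for checking heuristics on tiny instances.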

In practice, we can't live with the $n^k$ term when $k$ gets even moderately large. Since you don't require an exact solution, you could apply heuristics on top of this. For example, to make $k$ smaller, first identify suitably many of the heaviest edges and choose their endpoints. Then, by some strategy, complete this graph to a spanning tree on $k$ vertices (say, by trying all ways of picking the rest of the vertices).
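One possible reading of this heuristic, sketched in Python: seed the selection with the endpoints of the heaviest edges, then fill up to $k$ vertices. Note that the completion step here is a greedy choice (add whichever vertex keeps the minimum spanning tree cost highest), which is my own substitution for "trying all ways" and carries no quality guarantee. The names `seed_from_heaviest_edges`, `greedy_complete`, and the parameter `seed_size` are hypothetical, and `w` is assumed to be a symmetric weight matrix.

```python
def mst_cost(vertices, w):
    """Prim's algorithm on the subgraph induced by `vertices`."""
    vertices = list(vertices)
    best = {v: w[vertices[0]][v] for v in vertices[1:]}
    total = 0
    while best:
        v = min(best, key=best.get)
        total += best.pop(v)
        for u in best:
            best[u] = min(best[u], w[v][u])
    return total

def seed_from_heaviest_edges(w, seed_size):
    """Collect endpoints of the heaviest edges until `seed_size` vertices are chosen."""
    n = len(w)
    edges = sorted(
        ((w[u][v], u, v) for u in range(n) for v in range(u + 1, n)),
        reverse=True,
    )
    chosen = []
    for _, u, v in edges:
        for x in (u, v):
            if x not in chosen and len(chosen) < seed_size:
                chosen.append(x)
        if len(chosen) == seed_size:
            break
    return chosen

def greedy_complete(w, chosen, k):
    """Assumed completion step: repeatedly add the vertex giving the highest MST cost."""
    chosen = list(chosen)
    n = len(w)
    while len(chosen) < k:
        candidates = [v for v in range(n) if v not in chosen]
        chosen.append(max(candidates, key=lambda v: mst_cost(chosen + [v], w)))
    return chosen
```

A typical call would be `greedy_complete(w, seed_from_heaviest_edges(w, seed_size), k)` with `seed_size` somewhat smaller than `k`; how small is a tuning question the sketch does not answer.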

Alternatively, you could consider a totally different approach. Take some good heuristic for finding a maximum weight $k$-clique, and then drop enough of the lightest edges to obtain a spanning tree. Or just craft a more direct method yourself. I could imagine a genetic algorithm working pretty well for your problem: pick random $k$-sets as your individuals, and use Kruskal's algorithm as your fitness function.
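A bare-bones version of the genetic idea might look like the sketch below. Individuals are $k$-subsets, fitness is the minimum spanning tree cost of the induced subgraph (computed here with Prim's algorithm rather than Kruskal's, purely for brevity), and the only variation operator is a mutation that swaps one selected vertex for an unselected one. The population size, generation count, elitist selection scheme, and all function names are assumptions of mine, not tuned values; it also assumes $k < n$ and a symmetric weight matrix `w`.

```python
import random

def mst_cost(vertices, w):
    """Fitness: MST cost of the subgraph induced by `vertices` (Prim)."""
    vertices = list(vertices)
    best = {v: w[vertices[0]][v] for v in vertices[1:]}
    total = 0
    while best:
        v = min(best, key=best.get)
        total += best.pop(v)
        for u in best:
            best[u] = min(best[u], w[v][u])
    return total

def mutate(subset, n, rng):
    """Swap one selected vertex for one unselected vertex (requires k < n)."""
    out_v = rng.choice(sorted(subset))
    in_v = rng.choice([v for v in range(n) if v not in subset])
    return frozenset((subset - {out_v}) | {in_v})

def evolve(w, k, pop_size=30, generations=200, seed=0):
    """Return (cost, sorted vertex list) of the best k-subset found."""
    rng = random.Random(seed)
    n = len(w)
    fitness = lambda s: mst_cost(s, w)
    population = [frozenset(rng.sample(range(n), k)) for _ in range(pop_size)]
    for _ in range(generations):
        # Elitist selection: keep the better half, refill with mutated copies.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        children = [
            mutate(rng.choice(survivors), n, rng)
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + children
    best = max(population, key=fitness)
    return fitness(best), sorted(best)
```

Because the best individual always survives, the returned cost never decreases from one generation to the next, but nothing here guarantees that the optimum is reached. Adding a crossover operator, or caching fitness values, would be natural next steps.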

By spending some more time thinking about this, you can probably find a faster exact algorithm too. I would start by following the idea in my second paragraph.
