
I was looking at the Advent of Code 2023 day 11 problem and misread it as asking something like the problem below (after transforming the positions of the "galaxies"). I'm wondering what kind of optimization problem this is. See also a nearly equivalent problem, but with specified values for the distance constraint.

Given a set of $N$ points on a lattice, where the pairwise distance is $D_{ij} = |(\Delta x)_{ij}| + |(\Delta y)_{ij}|$, select a set of point pairs that covers at least $N-1$ of the points, such that the sum of the chosen distances is minimized and no point is in more than one pair. I.e., in the graph $G = (V,E)$, find a perfect or near-perfect matching that minimizes the sum of the weights in the matching.
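To make the statement concrete, here is a minimal sketch that solves it by exhaustive search. The function names `manhattan` and `min_near_perfect_matching` are made up for illustration, and the running time is exponential, so this only pins down the objective for small $N$; it is not a practical algorithm.

```python
from functools import lru_cache


def manhattan(p, q):
    """Manhattan distance D_ij between two lattice points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def min_near_perfect_matching(points):
    """Return (total distance, pairs) of a cheapest matching that covers
    all points (N even) or all but one point (N odd).
    Exhaustive search over bitmasks: small N only."""
    n = len(points)

    @lru_cache(maxsize=None)
    def best(mask, may_skip):
        # mask: bitset of still-unmatched indices.
        # may_skip: True while one point may still be left out (N odd).
        if mask == 0:
            return 0, ()
        i = (mask & -mask).bit_length() - 1  # lowest unmatched index
        rest = mask & ~(1 << i)
        options = []
        if may_skip:
            options.append(best(rest, False))  # leave point i unmatched
        for j in range(i + 1, n):
            if (rest >> j) & 1:
                cost, pairs = best(rest & ~(1 << j), may_skip)
                options.append(
                    (cost + manhattan(points[i], points[j]), pairs + ((i, j),))
                )
        return min(options)

    return best((1 << n) - 1, n % 2 == 1)
```

For example, `min_near_perfect_matching([(0, 0), (0, 1), (10, 10)])` pairs the first two points and leaves the third one out.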

I heard of the vertex cover problem from Numberphile, so I googled edge cover and found something on Wikipedia, but I'm not sure whether this is an instance of that problem.

D.W.
Orion Yeung

1 Answer


If you just want any polynomial-time algorithm, you can solve this with a maximum weight matching algorithm.

Build a complete graph with one vertex per point, where the weight of edge $(i,j)$ is $C-D_{ij}$. Here $C$ is a very large constant, chosen to be bigger than $N \cdot \max_{i,j} D_{ij}$. Now find a matching of maximum weight, e.g., with Edmonds' blossom algorithm. I claim this gives you the solution to your original problem.
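The construction can be sketched as follows. This is only an illustration: the helper `max_weight_matching` below is an exhaustive search that stands in for Edmonds' blossom algorithm (so it is exponential and suitable for small $N$ only), and the names `min_distance_matching` and `big_c` are hypothetical. The choice $C = N \cdot \max D_{ij} + 1$ is one value that satisfies the stated bound.

```python
from functools import lru_cache


def max_weight_matching(weights):
    """Maximum-weight matching of ANY size on a complete graph.
    Exhaustive stand-in for Edmonds' blossom algorithm (small N only)."""
    n = len(weights)

    @lru_cache(maxsize=None)
    def best(mask):
        if mask == 0:
            return 0, ()
        i = (mask & -mask).bit_length() - 1  # lowest unmatched vertex
        rest = mask & ~(1 << i)
        result = best(rest)  # option: leave vertex i unmatched
        for j in range(i + 1, n):
            if (rest >> j) & 1:
                w, pairs = best(rest & ~(1 << j))
                result = max(result, (w + weights[i][j], pairs + ((i, j),)))
        return result

    return best((1 << n) - 1)


def min_distance_matching(points):
    """Reduce the minimization problem to maximum-weight matching
    using edge weights C - D_ij."""
    n = len(points)
    dist = [
        [abs(px - qx) + abs(py - qy) for (qx, qy) in points]
        for (px, py) in points
    ]
    d_max = max((max(row) for row in dist), default=0)
    big_c = n * d_max + 1  # assumption: any C > N * max D_ij works
    weights = [[big_c - d for d in row] for row in dist]
    _, pairs = max_weight_matching(weights)
    total = sum(dist[i][j] for i, j in pairs)
    return total, sorted(pairs)
```

In practice one would swap the exhaustive helper for a polynomial-time maximum-weight matching routine; the weight transformation stays the same.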

By construction, this matching will contain $\lfloor N/2 \rfloor$ edges and $2 \lfloor N/2 \rfloor$ vertices. (Why? Because any matching with fewer edges will have total weight at most $(\lfloor N/2 \rfloor -1) \cdot C$, and any matching with $\lfloor N/2 \rfloor$ edges will have total weight at least $\lfloor N/2 \rfloor \cdot (C - \max_{i,j} D_{ij})$, and the latter is larger.)
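To spell out why the latter is larger, write $k = \lfloor N/2 \rfloor$ and $M = \max_{i,j} D_{ij}$ (shorthand introduced here only for this step). Using $C > N \cdot M$:

$$k\,(C - M) - (k-1)\,C \;=\; C - kM \;>\; NM - kM \;=\; (N-k)\,M \;\ge\; 0.$$

A matching with $k$ edges always exists because the graph is complete: pair up the vertices in any order, leaving one out if $N$ is odd.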

Among matchings of such size, maximizing the sum of $C-D_{ij}$ values is equivalent to minimizing the sum of $D_{ij}$ values. Therefore, the maximum-weight matching minimizes the sum of $D_{ij}$ values, among all solutions that include exactly $2 \lfloor N/2 \rfloor$ vertices.

What about the requirement that we include either $N$ or $N-1$ vertices? Well, in any matching, the number of vertices included must be even. If $N$ is even, this means that we only care about solutions that include exactly $N$ vertices. If $N$ is odd, this means we only care about solutions that include exactly $N-1$ vertices. That is equivalent to searching over the space of all solutions with exactly $2 \lfloor N/2 \rfloor$ vertices.

That proves that this algorithm solves your optimization problem.

It's possible there might be a more efficient algorithm, by taking into account the structure of the graph (specifically, that all edge weights are Manhattan distances from a set of points in 2D).

D.W.
  • Thanks for the answer! The reference was quite helpful. Would you expand in your answer on why a perfect matching would exist on such a graph (fully connected)? I'm new to all the terms, but my expectation is that one would need an even $N$ for a perfect matching, so that only one of those graphs (original and +1 vertex) would have a perfect matching – Orion Yeung Dec 12 '23 at 18:29
  • I may not have stated the problem well enough; I would like to choose a matching (at most one edge from the selected set will be incident on a given node). – Orion Yeung Dec 12 '23 at 20:47
  • @OrionYeung, I've revised my answer to fix some errors, and I believe it should now address your questions as well. My apologies for my mistakes, and thank you for your patience with me. I hope it is correct now, but please check it carefully for yourself. – D.W. Dec 13 '23 at 01:43
  • Thanks for the revision and the edit of my question. I think this argument works well. It is curious to me that while C=0 does not motivate the proof, C≠0 does not seem to be a constraint on the constructed graph, as far as I can see, for the applicability of Edmonds' blossom algorithm. I might look at using the nature of the distance information. – Orion Yeung Dec 13 '23 at 04:32
  • @OrionYeung, Thank you for the review of this answer and helping me debug it! It seems to me that the problem with C=0 is that the maximum matching (i.e., the matching whose total weight is maximum) need not be a perfect matching or a near-perfect matching, and Edmonds' algorithm gives the maximum-weight matching, not necessarily the maximum-weight perfect matching. I think. I could be wrong. – D.W. Dec 13 '23 at 05:05
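
A small sketch of the C=0 case discussed in these comments, under the assumption that all points are distinct (so every $D_{ij} > 0$). The helper `best_matching` is an exhaustive stand-in for a maximum-weight matching routine, and `matching_for_offset` is a hypothetical name. With C=0 every edge weight is negative, so the empty matching has the largest total weight.

```python
from functools import lru_cache


def best_matching(weights):
    """Exhaustive maximum-weight matching of any size (small N only)."""
    n = len(weights)

    @lru_cache(maxsize=None)
    def best(mask):
        if mask == 0:
            return 0, ()
        i = (mask & -mask).bit_length() - 1  # lowest unmatched vertex
        rest = mask & ~(1 << i)
        result = best(rest)  # leave vertex i unmatched
        for j in range(i + 1, n):
            if (rest >> j) & 1:
                w, pairs = best(rest & ~(1 << j))
                result = max(result, (w + weights[i][j], pairs + ((i, j),)))
        return result

    return best((1 << n) - 1)


def distances(points):
    """Matrix of Manhattan distances D_ij."""
    return [
        [abs(px - qx) + abs(py - qy) for (qx, qy) in points]
        for (px, py) in points
    ]


def matching_for_offset(points, c):
    """Pairs chosen when the edge weights are c - D_ij."""
    weights = [[c - d for d in row] for row in distances(points)]
    return sorted(best_matching(weights)[1])


points = [(0, 0), (0, 1), (5, 5), (5, 7)]
large_c = len(points) * max(max(row) for row in distances(points)) + 1

print(matching_for_offset(points, 0))        # C = 0
print(matching_for_offset(points, large_c))  # C > N * max D_ij
```

With C=0 the result is the empty matching, while the large constant yields the near-by pairs, which is why the offset is needed when the routine maximizes weight without forcing maximum cardinality.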