This problem is essentially the problem of unit-time job sequencing with deadlines and profits as it appears on GeeksforGeeks: treat each fruit, with its expiry date and energy, as a one-day job with the corresponding deadline and profit.
Basic idea
The basic ideas to solve this problem are:
- A fruit with a higher energy level is prioritized over a fruit with a lower energy level, and
- a fruit should be picked up on the latest of the remaining available days before its expiry, so as to leave the earlier days for fruits with more stringent expiry dates.
A greedy algorithm
1. Let the total energy be 0.
2. Obtain a sorted list of all fruits, in which a fruit with a higher energy level precedes a fruit with a lower energy level.
3. For each fruit $f$ in the sorted list, try to find $d$, the latest available date before the expiry date of $f$.
   - If there is no such date, do nothing (this fruit is ignored).
   - Otherwise, pick up this fruit on date $d$. Add the energy level of $f$ to the total energy. Mark date $d$ as unavailable.
4. Return the total energy.
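The steps above can be sketched directly, before any data-structure optimization. This is a minimal quadratic-time sketch, not a tuned implementation; the name `greedy_max_energy` is hypothetical, and it assumes days are numbered from 0 and that a fruit with expiry `e` may be picked up on any day `d < e`.

```python
def greedy_max_energy(energy, expiry):
    """Naive greedy: scan backwards for the latest free day of each fruit."""
    n = len(energy)
    # Assumption: days are 0, 1, ..., n-1; n days always suffice for n fruits.
    available = [True] * n
    total = 0
    # Process fruits from the highest energy level to the lowest.
    for e, x in sorted(zip(energy, expiry), reverse=True):
        d = min(x, n) - 1  # latest day strictly before the expiry
        while d >= 0 and not available[d]:
            d -= 1  # linear scan for the latest available day
        if d >= 0:
            available[d] = False  # pick up the fruit on day d
            total += e
        # otherwise no day is left for this fruit and it is ignored
    return total


print(greedy_max_energy([76, 66, 52, 51, 47, 23], [3, 2, 2, 4, 2, 1]))  # 245
```

The backward scan makes each fruit cost up to $O(n)$ time, which is what a faster lookup structure for available days is meant to avoid.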
Proof of correctness
Agreement($k$), $0\le k\le n$:
There is an optimal scheduling that agrees with the algorithm on the scheduling of the first $k$ fruits (in the sorted list of all fruits) at step 3.
The correctness of the algorithm is the proposition Agreement(n). Let us prove Agreement(k) for all $k$.
Proof. The base case, $k=0$, holds trivially, since any optimal scheduling agrees with the algorithm on the first $0$ fruits.
As the induction hypothesis, assume the proposition is correct for $k$, i.e., there is an optimal scheduling $\mathcal O$ that agrees with the algorithm on the scheduling of the first $k$ fruits.
Suppose the algorithm has just finished processing the first $k$ fruits at step 3. Consider the next iteration of step 3, in which the next fruit, $g$, is processed.
- If there is no available date before the expiry date of $g$, the algorithm will not pick up $g$. Since $\mathcal O$ uses the same dates as the algorithm for the first $k$ fruits, $\mathcal O$ cannot pick up $g$ either.
- Otherwise, the algorithm will pick up fruit $g$ on date $d$, the latest available date before the expiry date of fruit $g$.
  - If $\mathcal O$ does not pick up any fruit on date $d$,
    - If $\mathcal O$ does pick up $g$, modify $\mathcal O$ so that $\mathcal O$ picks up $g$ on date $d$ instead.
    - Otherwise, $\mathcal O$ does not pick up $g$. Modifying $\mathcal O$ so that it picks up $g$ on date $d$ would increase the total energy (energy levels being positive), contradicting the optimality of $\mathcal O$. So this case cannot happen.
  - Otherwise, $\mathcal O$ picks up some fruit $g'$ on date $d$. Note that $g'$ cannot be one of the first $k$ fruits, since date $d$ is still available after the algorithm has processed them.
    - If $g'$ is $g$, we are done.
    - Otherwise, $g'$ is not $g$.
      - If $\mathcal O$ picks up fruit $g$ on some date $d'$, modify $\mathcal O$ so that $\mathcal O$ picks up $g$ on date $d$ and picks up $g'$ on date $d'$. This modification is valid: both $g$ and $g'$ expire later than date $d$, and date $d'$ must be earlier than date $d$ because $d$ is the latest available date before the expiry date of $g$.
      - Otherwise, $\mathcal O$ does not pick up fruit $g$. Modify $\mathcal O$ so that $\mathcal O$ picks up $g$ instead of $g'$ on date $d$. Since the list of fruits is sorted primarily by energy level, the energy level of $g$ is no less than that of $g'$. So after the modification, $\mathcal O$ is still optimal.
In all cases, $\mathcal O$ with possible modification is an optimal scheduling that agrees with the algorithm on the scheduling of the first $k+1$ fruits. $\quad\checkmark$
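The claim can also be sanity-checked on small inputs by comparing the greedy result against an exhaustive search. Below is a minimal sketch of such a reference; the name `brute_force_max_energy` is hypothetical, and it assumes one fruit per day and that a fruit with expiry `e` may be picked up on days `0, ..., e-1`. It relies on the standard feasibility test for unit-time jobs: a set of fruits can all be picked up if and only if, taken in order of expiry, the $i$-th one (counting from 1) has expiry at least $i$.

```python
from itertools import combinations


def brute_force_max_energy(energy, expiry):
    """Exhaustive reference: best total energy over all feasible subsets."""
    n = len(energy)
    best = 0
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            # Feasible iff the i-th earliest expiry (1-based) is at least i.
            deadlines = sorted(expiry[i] for i in subset)
            if all(dl >= pos for pos, dl in enumerate(deadlines, start=1)):
                best = max(best, sum(energy[i] for i in subset))
    return best


print(brute_force_max_energy([76, 66, 52, 51, 47, 23], [3, 2, 2, 4, 2, 1]))  # 245
```

The search is exponential in the number of fruits, so it is only useful as a cross-check on tiny instances.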
Implementation with $O(n\log n)$ time-complexity
- Use a sorting algorithm with $O(n\log n)$ time-complexity such as merge sort to sort all fruits.
- How can we track available days and find a particular available day?
  - We can use a balanced binary search tree where each node represents a run of consecutive available days.
  - We can also use a disjoint-set data structure (DSDS). Make a DSDS out of the first $n$ days. Maintain a partition of the set where each part consists of some consecutive days such that the only available day among them is the earliest one. Initially each day constitutes its own part of the partition. Whenever a day is used to pick up a fruit, the part containing that day is merged with the part that ends on the previous day, so the representative of a day's part is always the latest available day no later than that day. Thanks to the power of path compression, the scheduling takes $O(n\log n)$ time.
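To illustrate the disjoint-set idea in isolation, here is a small sketch of a "latest available day" structure. The class name `LatestFreeDay` and its methods are hypothetical, and the iterative `find` is a design choice of this sketch: a recursive `find` may run into Python's recursion limit when a long chain of used days builds up.

```python
class LatestFreeDay:
    """Tracks days 0..n-1 and answers 'latest free day no later than d'."""

    def __init__(self, n):
        # Index d + 1 stands for day d; index 0 is a sentinel for "day -1".
        self.parent = list(range(n + 1))

    def find(self, day):
        """Return the latest free day <= `day`, or -1 if there is none."""
        i = day + 1
        root = i
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[i] != root:
            nxt = self.parent[i]
            self.parent[i] = root
            i = nxt
        return root - 1

    def use(self, day):
        """Mark the free day `day` as used by linking it to the previous day."""
        self.parent[day + 1] = day  # index `day` stands for day `day - 1`


days = LatestFreeDay(4)
days.use(3)
days.use(2)
print(days.find(3))  # 1
```

Linking a used day to its predecessor is the "union" step; `find` then skips over any run of used days in one compressed hop on later queries.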
An implementation in Python
Here is an implementation in Python.
There is a minor tweak to boost performance. We will apply the greedy algorithm only to fruits whose expiry dates are less than $n$. All other fruits, whose expiry dates are no less than $n$, will be counted towards the total energy directly: since there are only $n$ fruits, a free day among the first $n$ days always remains for each of them. It is not hard to verify the correctness of this modification. (We do not have to use this tweak. We can just replace every expiry date that is no less than $n$ with $n$.)
def maximum_energy(n, energy, expiry):
    """Return the maximum total energy.

    There are fruits 0, 1, ..., n-1.
    `energy[i]` is the energy level of fruit `i`.
    `expiry[i]` is the expiry day of fruit `i`; the fruit can be picked up
    on any day `d` with `0 <= d < expiry[i]`.
    """
    # Only fruits that expire before day n compete for days;
    # the others are counted directly below.
    fruits = [(energy[i], expiry[i]) for i in range(n) if expiry[i] < n]
    fruits.sort()  # ordered primarily by energy level, ascending

    # Disjoint-set data structure (DSDS).
    # Day `parent[d]` is some day no later than day `d`.
    # For any `0 <= d < n`, day `d` is available iff `parent[d] == d`.
    parent = {i: i for i in range(n)}  # all days are available
    parent[-1] = -1  # sentinel: day -1 is "yesterday"

    def find(date):
        """Return the latest available day no later than `date`, or -1."""
        if parent[date] != date:
            parent[date] = find(parent[date])  # path compression
        return parent[date]

    def use(date):
        """Customized union of the DSDS: the available day `date` gets used."""
        parent[date] = find(date - 1)

    total_energy = sum(energy[i] for i in range(n) if expiry[i] >= n)
    while fruits:
        # Pop from the end, i.e. the highest remaining energy level first.
        fruit_energy, fruit_expiry = fruits.pop()
        pick_up_day = find(fruit_expiry - 1)
        if pick_up_day >= 0:
            total_energy += fruit_energy
            use(pick_up_day)
    return total_energy


def test():
    energy = [76, 66, 52, 51, 47, 23]
    expiry = [3, 2, 2, 4, 2, 1]
    max_energy = maximum_energy(6, energy, expiry)
    print("maximum energy:", max_energy)  # 245


test()