In-place linear sort of integers, again

Question

I am amazed by the many discussions regarding the existence of a linear-time, in-place sorting algorithm and its variants; see e.g.

is-this-implementation-of-bucket-sort-considered-in-place
is-counting-sort-in-place-stable-or-not
sorting-in-linear-time-and-in-place
can-a-counting-sort-be-considered-in-place-when-the-problem-constrains-the-numbe
fastest-in-place-sorting-algorithm-for-epochtime
in-place-and-out-place-sorting-meaning
fast-stable-almost-in-place-radix-and-merge-sorts
sorting-in-place-stable-in-linear-time

I am even more amazed by the lack of a clear and definitive answer, due in part to the fact that "in-place" is loosely defined.

I would therefore like to give it one more try, aiming at being more precise and rigorous, with your help.

Sorry if this is too redundant; if so, please do not hesitate to tell me where.

Problem statement.

Input: $n$ non-negative integers with values lower than $k$, with $k\le n$.

Output: the $n$ integers in non-decreasing order.

Proposal.

This is the closest thing to an in-place, linear-time sorting algorithm that I could find. It relies on a variant of counting sort that scans the original array and counts the occurrences of each integer in place, within the array itself. To do so, it moves aside the content of each cell that it overwrites, and encodes cells that hold a counter as negative integers. Finally, it prints the result without storing it.

This is a Python implementation, with comments explaining the details:

import random, sys
random.seed()
# create input
n = 30
k = 4
l = [random.randrange(0, k) for i in range(n)]
# for debug: print original list and its native sort
print(" ".join([str(x) for x in l]))
print(" ".join([str(x) for x in sorted(l)]))
# "in-place" counting sort
i = 0
while i < n:
  x = l[i]      # the value to count
  if l[x] < 0:  # there already is a counter for it
    l[x] -= 1   # (negatively) increment it
    l[i] = -1   # current cell is consumed: set it to a counter equal to 0, encoded as -1
  else:         # first occurrence of x
    l[i] = l[x] # save the value stored in l[x] before erasing it
    l[x] = -2   # new counter in l[x], initialised to 1, encoded as -2
  while i < n and l[i] < 0: # go to the next non-counter cell
    i += 1
# print result
# l[i] = -x-1 means there are x occurrences of i
for i in range(k):
  for j in range(-l[i] - 1):
    sys.stdout.write("%d " % i)
sys.stdout.write("\n")
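To make the encoding easier to check, here is a minimal sketch of the same counting pass wrapped in a function, run on a small fixed input. The names `count_in_place`, `decode_counts` and `iter_sorted` are hypothetical helpers introduced for this illustration, and the sketch assumes $0 \le l[i] < k \le n$ for every cell.

```python
def count_in_place(l, k):
    """Overwrite l so that l[v] == -(number of occurrences of v) - 1 for v in range(k).

    Assumption: 0 <= l[i] < k <= len(l) for every i.
    """
    n = len(l)
    i = 0
    while i < n:
        x = l[i]                # value to count (l[i] is non-negative here)
        if l[x] < 0:            # a counter for x already exists
            l[x] -= 1           # one more occurrence of x
            l[i] = -1           # current cell is consumed: counter equal to 0
        else:                   # first occurrence of x
            l[i] = l[x]         # save the value stored in l[x]
            l[x] = -2           # counter equal to 1
        while i < n and l[i] < 0:
            i += 1              # skip cells that already hold counters


def decode_counts(l, k):
    """Read the counters back (for checking only; this allocates k cells)."""
    return [-l[v] - 1 for v in range(k)]


def iter_sorted(l, k):
    """Yield the sorted values one by one, without storing them."""
    for v in range(k):
        for _ in range(-l[v] - 1):
            yield v


data = [2, 0, 2, 1, 2]
count_in_place(data, 3)
print(data)                        # [-2, -2, -4, -1, -1]
print(decode_counts(data, 3))      # [1, 1, 3]
print(list(iter_sorted(data, 3)))  # [0, 1, 2, 2, 2]
```

In the final array, the first $k = 3$ cells hold the counters for the values 0, 1 and 2, and the remaining cells hold $-1$, which carries no information.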

Remarks.

This algorithm appears to be in-place, since it seems to use only a constant amount of memory in addition to its input.

However, it is not really in-place, because:

It uses negative numbers to encode counter information within the input array, which actually costs $n$ bits.
Its output is just printed, not stored (not even in the original array) and so it is not available for further computations.

Questions.

A first set of questions is about possible improvements:

Is it possible to save even more space? For instance, only $k$ bits are really needed, as there are $k$ counters. Anything else?
Is it possible to write the sorted values back into the input array? Or can we prove that this is not possible?
Is it possible to extend it to more general inputs? Clearly, any other integer interval of length at most $k$ is possible. Anything else?
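The extension to another integer interval mentioned in the last item can be sketched as follows, assuming the lower bound of the interval is known in advance. The function name `sorted_from_interval` and the parameter `lo` are hypothetical; the values are assumed to satisfy $lo \le l[i] < lo + k$ with $k \le n$. The sketch shifts every value by `lo` in place, runs the same counting pass, and shifts back when producing the output.

```python
def sorted_from_interval(l, lo, k):
    """Yield the values of l in non-decreasing order; l is overwritten.

    Assumption: lo <= l[i] < lo + k and k <= len(l) for every i.
    """
    n = len(l)
    for i in range(n):
        l[i] -= lo              # shift values into [0, k), in place
    i = 0
    while i < n:                # same counting pass as in the question
        x = l[i]
        if l[x] < 0:
            l[x] -= 1
            l[i] = -1
        else:
            l[i] = l[x]
            l[x] = -2
        while i < n and l[i] < 0:
            i += 1
    for v in range(k):          # l[v] == -c - 1 means c occurrences of v + lo
        for _ in range(-l[v] - 1):
            yield v + lo


print(list(sorted_from_interval([13, 10, 12, 10, 13, 13], 10, 4)))
# [10, 10, 12, 13, 13, 13]
```

The shift costs one extra pass over the array and no additional memory beyond a constant number of variables, so it does not change the space discussion above.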

A second set of questions is about possible uses:

Are there cases where one wants to print the result, without storing it? It seems related to streaming algorithms.
Does this algorithm help in finding better solutions to other problems, like median or cumulative distribution computations? or removing duplicate values?
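As one possible reading of the last question, and not a claim that this is the best approach, the counters left in the first $k$ cells can be scanned to obtain a median or the distinct values without sorting anything further. The helper names `count_in_place`, `lower_median` and `iter_distinct` are hypothetical, and the sketch again assumes $0 \le l[i] < k \le n$.

```python
def count_in_place(l, k):
    """Counting pass from the question: afterwards l[v] == -count(v) - 1."""
    n = len(l)
    i = 0
    while i < n:
        x = l[i]
        if l[x] < 0:
            l[x] -= 1
            l[i] = -1
        else:
            l[i] = l[x]
            l[x] = -2
        while i < n and l[i] < 0:
            i += 1


def lower_median(l, k):
    """Return the value at rank (n - 1) // 2 of the sorted order, read from the counters."""
    n = len(l)                  # the counters sum to n
    target = (n - 1) // 2
    seen = 0                    # running cumulative count
    for v in range(k):
        seen += -l[v] - 1
        if seen > target:
            return v
    raise ValueError("empty input")


def iter_distinct(l, k):
    """Yield each value that occurs at least once, in increasing order."""
    for v in range(k):
        if l[v] < -1:           # -1 encodes a count of 0
            yield v


data = [3, 3, 0, 3]
count_in_place(data, 4)
print(lower_median(data, 4))         # 3
print(list(iter_distinct(data, 4)))  # [0, 3]
```

The running total `seen` in `lower_median` is the cumulative distribution evaluated at `v`, so the same loop could report it for every value.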

Thanks, interesting reference that indeed answers some of my questions :) They seem to have an implementation; do you know if it is available? — Matthieu Latapy, Jan 11 '21 at 07:00
