As I understand it, a quick-select algorithm can use median of medians to choose a well-suited pivot when finding the i-th smallest item of an array, say A.
I referred to the Median of Medians algorithm and the steps given here for deterministic selection.
This is the pseudocode for the algorithm as I understand it:
    define partition5(arr)
        Returns middle of the array

    define median_of_medians(arr, N)
        Returns the median index
        if # of elements are 5 or less just find the median
            partition5(arr)
        get n/5 slices
        create array of slices
        fill array with ranges of slices from 0 to n in step=5
        find the median for each of slice of len 5 or less using partition5
        find the median of this median array recursively

    define a swap function

    define partition(ar, lo, hi, pivotIndex):
        i.e. a 3-way Dijkstra partition method
        start = lo
        take the element @ hi as the pivot and swap it to pivotIndex position
        swap(pivotIndex, hi, ar)
        pivotIndex = hi
        eq = lo
        loop range
            if element == pivot
                inc eq
            if element < pivot
                swap index, lo
                inc lo
                inc eq
        swap(lo, pivotIndex, ar)
        return lo

    find the kth smallest element using
    define quickSelect(arr, startIndex, endIndex, k)
        get start, end index range
        if (startIndex <= endIndex):
            find pivot using median_of_medians()
            divide using Linear-partitioning partition(ar, startIndex, endIndex, pivot)
            if ( pivotIndex == k ): return arr[pivotIndex]
            if (pivotIndex > k):
                recurse using quickSelect [startIndex, pivotIndex-1]
            else if (pivotIndex < k):
                recurse using quickSelect [pivotIndex+1, endIndex]
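For comparison, the three-way (Dijkstra) partition step named in the pseudocode can be sketched as below. This is only a minimal sketch under assumptions: the name `partition3`, passing the pivot as a value instead of an index, and returning both bounds of the "equal" block are illustrative choices, not part of the pseudocode above. It is written to run under both Python 2 and Python 3.

```python
def partition3(ar, lo, hi, pivot):
    """Rearrange ar[lo..hi] in place around the value `pivot`.

    Returns (lt, gt) such that ar[lo:lt] < pivot, ar[lt:gt + 1] == pivot
    and ar[gt + 1:hi + 1] > pivot. (Hypothetical helper for illustration.)
    """
    lt, i, gt = lo, lo, hi
    while i <= gt:
        if ar[i] < pivot:
            # Grow the "less than" block and move on.
            ar[lt], ar[i] = ar[i], ar[lt]
            lt += 1
            i += 1
        elif ar[i] > pivot:
            # Grow the "greater than" block; the element swapped in
            # has not been examined yet, so i stays where it is.
            ar[i], ar[gt] = ar[gt], ar[i]
            gt -= 1
        else:
            # Equal to the pivot: leave it in the middle block.
            i += 1
    return lt, gt


data = [5, 1, 5, 9, 3, 5, 7]
lt, gt = partition3(data, 0, len(data) - 1, 5)
```

Returning both bounds lets a caller stop as soon as k falls anywhere inside the block of elements equal to the pivot, which matters when the input contains duplicates.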
This is my understanding of how quickSelect could be implemented using median of medians:
    from collections import deque
    from math import log

    def partition5(arr):
        '''Returns middle of the array'''
        return len(arr)/2

    def median_of_medians(arr, n, orig_arr=None):
        '''Returns the median index'''
        if n < 6:
            # if # of elements are 5 or less just find the median
            return partition5(arr)
        orig_arr = orig_arr or arr
        # find the numbers of slices
        slot_num = n/5 + (1 if n%5 else 0)
        # create array of slices
        median_slots = ([None] * slot_num)
        # and a null aux array
        aux = [None]*slot_num
        count = 0
        # fill the aux array with extremum ranges of slices
        for slot_index in xrange(0,n,5):
            aux[count] = (slot_index,(n-slot_index)%n + 5)
            count += 1
        count = 0
        # find the median for each slice of len 5 or less
        for r in aux:
            median_slots[count] = partition5(sorted(xrange(r[0], r[1]), key = lambda x: orig_arr[x]))
            count += 1
        # print "median_slots is {}".format(median_slots)
        # find the median of this median array
        return median_of_medians(median_slots, len(median_slots), orig_arr)

    def swap(findex, sindex, ar):
        ar[findex], ar[sindex] = ar[sindex], ar[findex]

    def partition(ar, lo, hi, pivotIndex):
        '''3-way Dijkstra partition method'''
        start = lo
        # take the element @ hi as the pivot and swap it to pivotIndex position
        swap(pivotIndex, hi, ar)
        pivotIndex = hi
        pivot = ar[pivotIndex]
        eq = lo
        for index in xrange(lo, hi):
            if (ar[eq] == pivot):
                eq += 1
            if (ar[index] < pivot and index < pivotIndex):
                swap(index, lo, ar)
                lo += 1
                eq += 1
        swap(lo, pivotIndex, ar)
        return lo

    # find the kth smallest element by comparing the returned pivot index
    def quickSelectIter(arr, startIndex, endIndex, k):
        stack = deque([[startIndex,endIndex]], log(endIndex-startIndex+1, 2))
        pivot = pivotIndex=0
        while (stack):
            pop = stack.pop()
            startIndex, endIndex = pop[0], pop[1]
            if (startIndex <= endIndex):
                pivot = median_of_medians(arr, endIndex-startIndex+1)
                # divide using Linear-partitioning
                pivotIndex = partition(arr, startIndex, endIndex, pivot)
                if ( pivotIndex == k ): return arr[pivotIndex]
                if (pivotIndex > k):
                    stack.appendleft([startIndex, pivotIndex-1])
                elif (pivotIndex < k):
                    stack.appendleft([pivotIndex+1, endIndex])
Does this implementation achieve O(N) complexity?
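For reference, the linear bound for deterministic selection with groups of five is usually argued from the recurrence below. It only holds under two assumptions: the pivot is the true median of the group medians (found by a recursive selection call on the medians), and every recursive call works on the current sub-range only. The constant c is a stand-in for the per-element cost of grouping and partitioning.

```latex
T(n) \le T\!\left(\lceil n/5 \rceil\right) + T\!\left(\tfrac{7n}{10} + 6\right) + c\,n
\quad\Longrightarrow\quad T(n) = O(n),
\qquad \text{since } \tfrac{1}{5} + \tfrac{7}{10} = \tfrac{9}{10} < 1 .
```

If either assumption fails, for example if the pivot is only an approximate median of medians, this recurrence no longer describes the code and the O(N) guarantee does not follow from it.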
It seems to work when I run it, but for some cases the result is incorrect.
For example:
    A = [34, 40, 9, 20, 62, 89, 68, 0, 90, 83, 46, 98, 6, 41, 73, 99, 35, 82, 36, 53, 70, 27, 93, 54, 64, 52, 18, 85, 58, 69, 24, 49, 25, 30, 26, 79, 55, 1, 78, 7, 8, 28, 4, 38, 71, 10, 84, 72, 50, 29, 87, 51, 37]
    l = sorted(A)
    # res= median_of_medians(A, len(A))
    # print A[res]
    for index in xrange(len(A)):
        # the i-th element
        print quickSelectIter(A, 0, len(A)-1, index+1), l[index]
So there is a bug, conceptually. What would be the fix?
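As a point of comparison, here is a minimal sketch of deterministic selection that avoids the places where the code above appears to go wrong. The differences are deliberate and should be read as assumptions, not as the only possible fix: k is counted from 0 (the test loop above passes `index+1` but compares against a 0-based `pivotIndex`), the pivot is handled as a value, not as a position (`partition5` above returns `len(arr)/2`, which is a position inside the small sorted slice, not an index into the original array), the median of the group medians is found by a recursive selection call, and each step continues only with the part that contains the answer (`median_of_medians` above receives just a length, so it always looks at the start of `arr`). The name `select` and the use of list copies in place of in-place partitioning are illustrative simplifications. The code runs under both Python 2 and Python 3.

```python
def select(values, k):
    """Return the k-th smallest element of `values`, with k counted from 0.

    Hypothetical helper for illustration; assumes 0 <= k < len(values).
    """
    values = list(values)  # work on a copy; the caller's list is untouched
    while True:
        if len(values) <= 5:
            return sorted(values)[k]
        # Median of each group of at most five elements.
        medians = []
        for start in range(0, len(values), 5):
            group = sorted(values[start:start + 5])
            medians.append(group[len(group) // 2])
        # True median of the medians, found by a recursive selection call.
        pivot = select(medians, len(medians) // 2)
        # Three-way split around the pivot value.
        lows = [v for v in values if v < pivot]
        highs = [v for v in values if v > pivot]
        n_equal = len(values) - len(lows) - len(highs)
        if k < len(lows):
            values = lows
        elif k < len(lows) + n_equal:
            return pivot
        else:
            # Skip everything that is <= pivot and continue in `highs`.
            k -= len(lows) + n_equal
            values = highs


# First few values of A from the question, used as a small check.
sample = [34, 40, 9, 20, 62, 89, 68, 0, 90, 83, 46, 98, 6]
results = [select(sample, i) for i in range(len(sample))]  # should equal sorted(sample)
```

The list copies cost extra memory; an in-place variant would keep `startIndex`/`endIndex` bounds and pass them to both the pivot search and the partition step.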