I stumbled across the median of medians algorithm when I was looking for an efficient way to calculate the nearest 20 cities to any city, given their latitude and longitude.
It is a selection algorithm that has a worst-case O(n) complexity for selecting the kth order statistic (kth smallest number) in an unsorted array with length n.
Say you have an unsorted array and you would like to pick the fourth smallest element. One approach would be to just sort the array and pick the fourth element. However, the best case performance for this approach is O(nlogn).
n <= 5
, sort the array and return the kth
element. (The complexity of sorting small arrays is linear).p = |L| + 1
, where
|L| is the length of
left. In other words, m is the
(|L| + 1)
th smallest element. 7. If k == p, return m. 8. If k < p, return the kth smallest element of left. 9. If k > p, then return the (k - p)th smallest element of right.
Given an array of length n = 10: [10, 6, 8, 3, 7, 1, 2, 4, 9, 5], we would like to find the kth smallest element, where k = 3.
Split the array into two sub-arrays of size 5: [10, 6, 8, 3, 7] and [1, 2, 4, 9, 5].
Sort each sub-array and select the medians: [3, 6, 7, 8, 10] and [1, 2, 4, 5, 9], medians 7 and 4.
Pick the median of the two medians m = 4.
Loop over the initial array splitting it into two arrays: left = [3, 1, 2] and right = [10, 6, 8, 7, 9, 5]
.
The position of m, p = left.length + 1
Since k < p we recursively find the kth smallest element of left.
Since left.size <= 5
, we follow step 1: sort the array and return the kth element, 3.
Since I tagged this post with 'TDD' We'll need to write a few test cases first. My favorite testing framework for ruby is currently rspec with guard.
it 'returns the 3rd smallest number' do
expect(subject.new(array1).kth_smallest(3)).to eq(3)
end
it 'returns the 2nd smallest number' do
expect(subject.new(array1).kth_smallest(2)).to eq(2)
end
it 'returns the 5th smallest number' do
expect(subject.new(array1).kth_smallest(5)).to eq(5)
end
it 'computes the same result as a number obtained by sorting' do
k = 53
array2_sorted = array2.sort
expect(subject.new(array2).kth_smallest(k)).to eq(array2_sorted[k-1])
end
it 'returns the 3rd smallest number' do
expect(subject.new(array1).kth_smallest(3)).to eq(3)
end
it 'returns the 2nd smallest number' do
expect(subject.new(array1).kth_smallest(2)).to eq(2)
end
it 'returns the 5th smallest number' do
expect(subject.new(array1).kth_smallest(5)).to eq(5)
end
it 'computes the same result as a number obtained by sorting' do
k = 53
array2_sorted = array2.sort
expect(subject.new(array2).kth_smallest(k)).to eq(array2_sorted[k-1])
end
Then we fix the breaking specs by writing the lib:
class KthSmallest
attr_reader :array
def initialize(array)
@array = array
end
def kth_smallest(array = array, k)
return select(array, k) if (array.length <= 5)
medians = []
left,right = [],[]
array.each_slice(5) do |array_5|
medians << select(array_5, array_5.length/2)
end
m = kth_smallest(medians, medians.length/2)
array.each { |n| n < m ? left << n : (n > m ? right << n : nil) }
m_posn = left.length + 1
if k == m_posn
m
elsif k < m_posn
kth_smallest(left, k)
else
kth_smallest(right, k - m_posn)
end
end
private
def select(array, k)
array.sort[k - 1]
end
end
class KthSmallest
attr_reader :array
def initialize(array)
@array = array
end
def kth_smallest(array = array, k)
return select(array, k) if (array.length <= 5)
medians = []
left,right = [],[]
array.each_slice(5) do |array_5|
medians << select(array_5, array_5.length/2)
end
m = kth_smallest(medians, medians.length/2)
array.each { |n| n < m ? left << n : (n > m ? right << n : nil) }
m_posn = left.length + 1
if k == m_posn
m
elsif k < m_posn
kth_smallest(left, k)
else
kth_smallest(right, k - m_posn)
end
end
private
def select(array, k)
array.sort[k - 1]
end
end
You can find the complete code on github