Today's Question:  What are you most afraid of as a programmer?        GIVE A SHOUT

Technical Article => Programming =>  Algorithm

Find the kth smallest number in an array

  sonic0002      2013-01-09 06:20:54      3,893    0    0

This is an classical question, the general solution to this question is first using sorting algorithm to sort the array, then get the kth number in the array, the complexity is O(nlogn), we can also use selective sort or head sort to, the complexity of selective sort is O(kn) and heap sort is O(nlogk). The better solution is to use quick sort to find the kth smallest number, the complexity is O(n), the worst case cost is O(n^2). But today we introduce one more solution which has the worst case cost O(n).

First let's check the quick sort solution, quick sort is to find one number, then put all numbers in the array which is less than the number chosen on one side and then put the other numbers on the other side. Then recursively  sort them again. Then to find the kth smallest number in the array, we only need to find it in one side of the array.

  1. import random
  2.  
  3. def partition(arr, left, right, pivot):
  4.     v = arr[pivot]
  5.     arr[pivot], arr[right-1] = arr[right-1], arr[pivot]
  6.     index = left
  7.     for i in xrange(left, right):
  8.         if arr[i] <= v:
  9.             arr[i], arr[index] = arr[index], arr[i]
  10.             index += 1
  11.     return index-1
  12.  
  13. def select(arr, left, right, k):
  14.     while right - left > 1:
  15.         index = partition(arr, left, right, random.randint(left, right-1))
  16.         dist = index - left + 1
  17.         if dist == k:
  18.             return arr[index]
  19.         if dist < k:
  20.             k -= dist
  21.             left = index + 1
  22.         else:
  23.             right = index
  24.     return arr[left]

Here arr is the array to search, we can call select to find the kth smallest number, if we cannot find the pivot element correctly, the worst case cost will be O(n^2).

Now we discuss the solution which has worst case cost O(n), we can divide all these numbers into some subarrays which have 5 elements in each subarray, then there will be n/5 subarrays. For each subarray, we can find the middle number very fast, then we can find the middle number of these n/5 middle numbers. This algorithm is called Median of Medians algorithm.

If we use the median of the medians as the pivot, then there will be 3/5*1/2 which is 3/10 numbers which are less or equal to pivot, also there will be 3/10 numbers which are greater than pivot, so the worst case cost is the array will be divided into 30%,70% or 70%,30% two parts.

The cost is :

T(n)<=T(n/5)+T(7/10*n)+O(n)<=c*n*(1+9/10+(9/10)^2....)
So T(n)=O(n)

The worst case cost is O(n).

  1. import heapq
  2.  
  3. def partition(arr, left, right, pivot):
  4.     v = arr[pivot]
  5.     arr[pivot], arr[right-1] = arr[right-1], arr[pivot]
  6.     index = left
  7.     for i in xrange(left, right):
  8.         if arr[i] <= v:
  9.             arr[i], arr[index] = arr[index], arr[i]
  10.             index += 1
  11.     return index-1
  12.  
  13. def select_heap(arr, left, right, k):
  14.     tmp = arr[left:right]
  15.     heapq.heapify(tmp)
  16.     [heapq.heappop(tmp) for i in xrange(k-1)]
  17.     return heapq.heappop(tmp)
  18.  
  19. def median(arr, left, right):
  20.     num = (right - left - 1) / 5
  21.     for i in xrange(num+1):
  22.         sub_left = left + i*5
  23.         sub_right = sub_left + 5
  24.         if sub_right > right:
  25.             sub_right = right
  26.         m_index = select_heap(arr, sub_left, sub_right, (sub_right-sub_left)/2)
  27.         arr[left+i], arr[m_index] = arr[m_index], arr[left+i]
  28.     return select(arr, left, left+num+1, (num+1)/2)
  29.  
  30. def select(arr, left, right, k):
  31.     while right - left > 1:
  32.         pivot = median(arr, left, right)
  33.         index = partition(arr, left, right, pivot)
  34.         dist = index - left + 1
  35.         if dist == k:
  36.             return arr[index]
  37.         if dist < k:
  38.             k -= dist
  39.             left = index + 1
  40.         else:
  41.             right = index
  42.     return arr[left]

Source : http://www.isnowfy.com/top-k-number/

SORT SEARCH QUICK SORT SMALLEST

  SAVE AS PDF   MARK AS READ   MARK AS IMPORTANT

Share on Facebook  Share on Twitter  Share on Google+  Share on Weibo  Share on Reddit  Share on Digg  Share on Tumblr    Delicious

  RELATED


  0 COMMENT


No comment for this article.


  WRITE ARTICLE

IKEA Bookbook

By sonic0002
IKEA 2015 Catalogue bookbook. Download a free copy or upload yourself to IKEA. Give you an advice, IKEA, be careful of the DOS attack. If you are in mainland China, please watch the video IKEA Bookbook. To be honest. A very creative design.It gets over 10 million visits in less than 15 days.