Go4Expert

Go4Expert (http://www.go4expert.com/)
-   C (http://www.go4expert.com/articles/c-tutorials/)
-   -   Ultimate Guide to Sorting (http://www.go4expert.com/articles/ultimate-guide-sorting-t21294/)

techgeek.in 11Mar2010 13:05

Ultimate Guide to Sorting
 

Introduction



In computer Science and mathematics sorting is any technique that puts elements of a list in a certain numerical or lexicographical order.We will discuss some of the commonly used sorting algorithms which has been implemented in various high level languages like C, C++, Java etc.

Bubble Sort



It is the simplest sorting algorithm. It works by repeatedly stepping through the list to be sorted, comparing each pair of adjacent items and swapping them if they are in the wrong order. The pass through the list is repeated until no swaps are needed, which indicates that the list is sorted. The algorithm is so named because in each pass the smallest element in the list "bubble" to the top of the list.

Example:

Let us take an array of numbers A= (5, 1, 4, 2, 8 ). We want them to be sorted in ascending order.

First Pass:

( 5 1 4 2 8 )---->( 1 5 4 2 8 ), Here, algorithm compares the first two elements, and swaps them.
( 1 5 4 2 8 )---->( 1 4 5 2 8 ), Swap since 5 > 4
( 1 4 5 2 8 )----> ( 1 4 2 5 8 ), Swap since 5 > 2
( 1 4 2 5 8 )----> ( 1 4 2 5 8 ), Now, since these elements are already in order (8 > 5), algorithm does not swap them.

Second Pass:

( 1 4 2 5 8 )----> ( 1 4 2 5 8 )
( 1 4 2 5 8 )----> ( 1 2 4 5 8 ), Swap since 4 > 2
( 1 2 4 5 8 )----> ( 1 2 4 5 8 )
( 1 2 4 5 8 )----> ( 1 2 4 5 8 )
Now, the array is already sorted, but our algorithm does not know if it is completed. The algorithm needs one whole pass without any swap to know it is sorted.

Third Pass:

( 1 2 4 5 8 )----> ( 1 2 4 5 8 )
( 1 2 4 5 8 )----> ( 1 2 4 5 8 )
( 1 2 4 5 8 )----> ( 1 2 4 5 8 )
( 1 2 4 5 8 )----> ( 1 2 4 5 8 )
Finally, the array is sorted, and the algorithm can terminate.

Pseudocode:

Code:

  procedure bubbleSort( array A)
    do
      swapped := false
      for each i in 0 to length(A) - 2 inclusive do:
        if A[i] > A[i+1] then
          swap( A[i], A[i+1] )
          swapped := true
        end if
      end for
    while swapped
  end procedure

Code in C language:

Code:

  Void BubbleSort (int A[], int N)
  {
              int i, j, Tmp;
  for (i=0; i<N-1; i++)
              for (j=i+1; j<N; j++)
                          if (A[i]>A[j])
                          {
                            Tmp=A[i];
                            A[i]=A[j];
                            A[j]=Tmp;
                          }
  }

Runtime Complexity:

The complexity of this algorithm is O(N2) whatever be the nature of the List initially, be it fully sorted, fully unsorted or partly sorted.


Insertion Sort



This technique actually goes through N-1 passes through the list where N is number of elements in the array. Each pass P can be considered to be for the elements from 1 to N-1. Each pass is actually an effort to place the element P in its correct position among the elements from position 0 to P.

Example:
Let us take an array of numbers A= (34, 8 , 64, 51, 32, 21).

Code:

  +---------------+-------+-------+-------+-------+-------+------+-----------------+
  |  Original    |  34  |  8  |  64  |  51  |  32  |  21  | Position Moved  |
  +---------------+-------+-------+-------+-------+-------+------+-----------------+
  | After P1      |    8  |  34  |  64  |  51  |  32  |  21  |          1    |
  +---------------+-------+-------+-------+-------+-------+------+-----------------+
  | After p2      |    8  |  34  |  64  |  51  |  32  |  21  |          0    |
  +---------------+-------+-------+-------+-------+-------+------+-----------------+
  | After p3      |    8  |  34  |  51  |  64  |  32  |  21  |          1    |
  +---------------+-------+-------+-------+-------+-------+------+-----------------+
  | After p4      |    8  |  32  |  34  |  51  |  64  |  21  |          3    |
  +---------------+-------+-------+-------+-------+-------+------+-----------------+
  | After p5      |    8  |  21  |  32  |  34  |  51  |  64  |          4    |
  +---------------+-------+-------+-------+-------+-------+------+-----------------+

Pseudo code:
Code:

  insertionSort(array A)
  begin
      for i := 1 to length[A]-1 do
      begin
          value := A[i];
          j := i - 1;
          done := false;
          repeat
              if A[j] > value then
              begin
                  A[j + 1] := A[j];
                  j := j - 1;
                  if j < 0 then
                      done := true;
              end
              else
                  done := true;
          until done;
          A[j + 1] := value;
      end;
  end;

Code in C language:

Code:

  void InsertionSort (int A[], int N)
  {
              int j, P;
              int Tmp;
 
              for (P=1; P<N; P++)
              {
                          Tmp=A[P];
                          for (j=P; j>0 && A[j-1]>Tmp; j--)
                          A[j] =A[j-1];
                          A[j]=Tmp;
              }
  }

Runtime Complexity:

It is O (N2). This is because there are two for loops and in the worst case the two for loops are both going to run N times. The point where Insertion Sort scores over Bubble Sort is its best case analysis. Note that when the elements are fully sorted, then the inner for loop will immediately break. This will give a runtime of O(N).

Selection Sort



This algorithm works by finding the minimum element in the list then swapping it with the value in the first position.This step is repeated for the remainder of the list.

Example:
Let us take an array of numbers A= (64 25 12 22 11)

64 25 12 22 11 ->Here the minimum element is 11.This is swapped with the value in the first position.

11 25 12 22 64 ->Here the minimum element is 12.This is swapped with the value in the second position.

11 12 25 22 64 ->Here the minimum element is 22.This is swapped with the value in the third position.

11 12 22 25 64 -> Here the minimum element is 25. No swapping.

11 12 22 25 64-> Final sorted array.

Pseudo Code

Code:

  SelectionSort(A)
  begin
    n := length[A]
          for i := 1 to n
            j := FindIndexOfSmallest( A, i, n )
            swap A[i] with A[j]
        end for
  end procedure
 
 
 
  FindIndexOfSmallest( A, i, n )
  // GOAL: return j in the range [i,n] such that A[j]<=A[k] for all k in range [i,n]
  smallestAt := i ;
            for j := (i+1) to n
                  if ( A[j] < A[smallestAt] ) smallestAt := j
            end for
  return smallestAt
  end procedure

Code in C language:

Code:

  void selectionSort( int a[ ] )
  {
      int i = 0;
          while ( i < sizeof( a ) / sizeof( int ) )
                {
                    int min = i;
                    int k = i + 1;
                    while ( k < sizeof( a ) / sizeof( int ) )
                        {
                            if ( a[ k ] < a[ min ] )
                              {
                                  min = k;
                              }
                            ++k;
                        }
                    if ( i != min )
                        {
                          int swap = a[ i ];
                          a[ i ] = a[ min ];
                          a[ min ] = swap;
                        }
                  ++i;
              }
  }

Runtime Complexity:

It has O(n2) complexity, making it inefficient on large lists, and generally performs worse than the similar insertion sort.


Merge Sort



This algorithm works on the basis of Divide and Conquer strategy i.e. dividing the problem into sub problems, solving the sub problems independently and then merging the outcomes of the sub problems to get the outcome of the actual problem.

Example:

if we need to mergesort 24,13,26,1,2,27,38,15, we would divide the problem into the subproblems of sorting 24,13,26,1 and 2,27,38,15 independently, get the outcomes 1,13,24,26 and 2,15,27,38 and then merging the outcomes to get the result 1,2,13,15,24,26,27,38.

Pseudo code

Code:

  function merge_sort(m)
      if length(m) ≤ 1
          return m
      var list left, right, result
      var integer middle = length(m) / 2
            for each x in m up to middle
                  add x to left
                      for each x in m after middle
                        add x to right
                        left = merge_sort(left)
                        right = merge_sort(right)
                            if left.last_item > right.first_item
                              result = merge(left, right)
                            else
                            result = append(left, right)
                            return result
 
 
  function merge(left,right)
      var list result
      while length(left) > 0 and length(right) > 0
          if first(left) ≤ first(right)
              append first(left) to result
              left = rest(left)
          else
              append first(right) to result
              right = rest(right)
      end while
      if length(left) > 0
          append left to result
      else 
          append right to result
      return result

Code in C

Code:

  Void MSort (int A[], int TmpArray[], int Left, int Right)
{
int Center;
          if (Left<Right)
              {
                Center=(Left+Right)/2;
                MSort(A, TmpArray,        Left, Center);
                      MSort(A, TmpArray,        Center+1, Right);
                Merge(A, TmpArray, Left, Center+1,Right);
              }
}

void MergeSort(int A[], int N)
{
int *TmpArray;

TmpArray=malloc(N*sizeof(int));
        if(TmpArray!=NULL)
            {
              MSort(A,TmpArray, 0,N-1);
              free(TmpArray);
            }
        else
        printf(“Out of Space”);
            }

void Merge(int A[], int TmpArray[], int Lpos, int Rpos, int RightEnd)
{
int i, LeftEnd, NumElements,TmpPos;
LeftEnd=Rpos-1;
TmpPos=Lpos;
NumElements=RightEnd-        Lpos+1;
          while(Lpos<=LeftEnd && Rpos<=RightEnd)
                {
                if(A[Lpos]<=A[Rpos])                           
                TmpArray[TmpPos++] =A[Lpos++];
                else
                TmpArray[TmpPos++]  =A[Rpos++];
                    while(Lpos<=LeftEnd)
                      TmpArray[TmpPos++] =A[Lpos++];
                    while(Rpos<=RightEnd)
                      TmpArray[TmpPos++]=A[Rpos++];
                        for(i=0;i<NumElements;i++, RightEnd--)
                          A[RightEnd]  =TmpArray[RightEnd];
                  }

Runtime Complexity

Clearly it is a recurrence relation
T(1)=1
T(N)=2T(N/2)+N
This recurrence relation is solved leading to
T(N)=O(N log N)


Quick sort



The technique for Quick Sort suggests that, we choose one element in the list as the pivot, and then place all the elements which are lesser than the pivot in one side of the pivot and then place all the elements which are larger then the pivot on the other side of the pivot. Then recursively Quick Sort the elements on the two sides of the pivot, then output the lesser side, the pivot and then the greater side. This is another example of Divide and Conquer.

The algorithm:

1.Pick an element, called a pivot, from the list.

2.Reorder the list so that all elements which are less than the pivot come before the pivot and so that all elements greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation.

3.Recursively sort the sub-list of lesser elements and the sub-list of greater elements.

How to select the pivot?
A bad choice of pivot may make all other elements in the list all smaller or all greater than the pivot. Thus almost the entire list has to be sorted again as all other elements in the list except the pivot has been placed in a single subsequence.

First let us solve an example
13, 81, 92, 43, 65, 31, 57, 26, 75, 0

http://www.go4expert.com/images/arti...rting/pics.jpg

http://www.go4expert.com/images/arti...ting/pics2.jpg

Picking the Pivot

This is a very important consideration while measuring runtime, since a bad choice of pivot would lead to an O(N2) runtime.
Random
Often the pivot is randomly chosen. But this is not also a good choice, since random number generators are not often very well designed and may often generate the 1st position.

Median of Three

A technique most often used and also gives good result is to choose three numbers randomly and then the median of the three numbers is taken as the position whose element should be chosen as the pivot.

Partitioning Strategy

The strategy of partitioning is first when the pivot is chosen, then to place the pivot element in the last position of the list. Then a pointer j is set to point to the element in the last but one position and another pointer i is set to point to the first element in the list. i increments without doing anything if the element pointed by it, is lesser than the pivot element. However, if the element pointed by i is larger than the pivot then i stops. j decrements without doing anything if the element pointed by it is larger than the pivot. However if it is smaller j stops. So when both stops, the elements in positions i and j are swapped and the process begins. The idea is that, all the elements lesser than pivot will be to the left of the list and those larger than the pivot should lie to the right of the list. Now, when i and j crosses, the element in position i is swapped again with the pivot present in the last position. When the process is done, the elements to the left of i should be smaller and those to the right of i should be larger than the pivot. The process is continued recursively.

Let us analyse the partitioning strategy with an example:

8 1 4 9 0 3 5 2 7 6

A[i]=8, A[j]=7
A[i] = 8 > 6, so i stops.

8 1 4 9 0 3 5 2 7 6

A[i]=8, A[j]=2
A[j] = 2 < 6, so j stops.
i < j so A[i] and A[j] swapped.

2 1 4 9 0 3 5 8 7 6

A[i]=2, A[j]=8

2 1 4 9 0 3 5 8 7 6

A[i]=1, A[j]=5

2 1 4 9 0 3 5 8 7 6

when, A[i]=9, A[j]=5
A[i] = 9 > 6, so i stops.

2 1 4 9 0 3 5 8 7 6

A[i]=9, A[j]=5
A[j] = 5 < 6, so j stops.
i < j so A[i] and A[j] swapped.

2 1 4 5 0 3 9 8 7 6

A[i]=0, A[j]=3

2 1 4 5 0 3 9 8 7 6

A[i]=9, A[j]=3
A[i] = 9 > 6, so i stops.


2 1 4 5 0 3 9 8 7 6

A[i]=9, A[j]=3
A[j] = 3 < 6, so j stops.

2 1 4 5 0 3 6 8 7 9

i > j, so A[i] and A[j] not swapped, A[i] exchanged with pivot.

Note that after this process is over, everything below i is less than A[i] and everything after i is greater than A[i]. So Quick Sort can be performed individually on elements below i and above i.

Code in C
Code:

  Void QuickSort(int A[], int N)
  {
              Qsort(A, 0, N-1);
  }
 
  int Median(int A[], int Left, int Right)
  {
              int left, center, right;
              int Center=(Left+Right)/2;
              left=A[left];
              center=A[center];
              right=A[right];
              if(left>center)
  Swap(&left,&center);
  if(left>right)
              Swap(&left,&right);
              if(center>right)
              Swap(&center,&right);
  if(center= =A[left])
  Swap(&A[left], &A[right]);
  else if (center= =A[center])
              Swap(&A[center], &A[right]);       
  return A[right];
  }
             
  void Qsort(int A[], int Left, int Right)
  {
              int i, j,Pivot;
  if(Left<Right)
              {
                          Pivot=Median(A, Left, Right);
                          i=Left-1; j=Right;
                          for(;;)
                          {
                                      while (A[++i]<Pivot){}
                                      while (A[--j]>Pivot){}
                                      if (i<j)
                                                  Swap (&A[i], &A[j]);
                                      else
                                                  break;
                          }
                          Swap(&A[i], &A[right]);
                          Qsort(A,Left,i-1);
                          Qsort(A,i+1,Right);
              }
  }

Runtime Complexity

Worst Case- O(n2)
Average Case- O(N log N)


Radix Sort



This is a Sorting technique. In this technique we take a Table of Tablespace equal to the maximum element present in the list. While sorting, any element is placed in a position whose address is equal to the element value. Then writing the elements that are present in the Table sequentially gives the sorted list. This algorithm has the dream runtime of O(N) but the space complexity as worse as possible since, with the maximum element in the list being the maximum address that is possible in the machine, the entire memory space can be filled up. Hence, to counter that, we take a Table of Tablespace equal to 10 having addresses 0 to 9. Then an element is placed in the Table location whose address is equal to the least significant digit of that element. With this there is a possibility of collision, hence every location in the table is not taken as a single memory space but as a Queue. Thus, when the least significant digit of more than one number is equal then they are placed in the Queue of that memory location whose address is equal to the least significant digit. This is the story of the first pass. However in the second pass, the elements are placed in the memory locations whose addresses are given by the second lest significant digit of the numbers. In the third pass the third least significant digit is taken and so on in the kth pass the kth least significant digit is taken. With the end of all the digits present the elements of the queues are listed one after the other, which gives the Sorted list.

http://www.go4expert.com/images/arti...rting/pic4.jpg

The Program for Radix Sort
Code:

#include<stdio.h>
#include<QueueList.h>

void radixsort(int a[], int n )
{
        int greatest, msd, k, i, y, j, qu;
        Queue Q[10];
        for (i=0; i<10; i++)
                MakeQueue(Q[i]);
        greatest=max(a,n);
        msd=countdigit(greatest);

        for (k=1; k<=msd; k++)
        {
                for (i=0; i<n; i++)
                {
                        y=a[i];
                        j=digitatpos(k,y);
                        Enqueue(y,Q[j]);
                }
                for(i=0,qu=0 ; ;)
                {
                        while (qu!=10 && IsEmpty(Q[qu])
                                qu++;
                        if(qu= =10) break;
                        a[i++]=  DeQueue(Q[qu]);
                }
}


Deadly Ghos7 14Mar2010 07:10

Re: Ultimate Guide to Sorting
 
Good article, man. Will help the beginners in sorting.

techgeek.in 14Mar2010 15:30

Re: Ultimate Guide to Sorting
 
thanx a lot..

seangtz 23Mar2010 10:02

Re: Ultimate Guide to Sorting
 
Now, sorting seems very easy for me....

ASRRAJ 24Mar2010 11:18

Re: Ultimate Guide to Sorting
 
Good one !!! put some computation analysis for the same

techgeek.in 25Mar2010 23:00

Re: Ultimate Guide to Sorting
 
Quote:

Originally Posted by seangtz (Post 66104)
Now, sorting seems very easy for me....

Bingo!! then u have learned most of the data structure. Data structure is all about array,queue, stack, searching and sorting..:happy: And sorting part is most tricky out of all. :pleased:

techgeek.in 25Mar2010 23:05

Re: Ultimate Guide to Sorting
 
Quote:

Originally Posted by ASRRAJ (Post 66148)
Good one !!! put some computation analysis for the same

Computation analysis is given on some of the sorting algorithms. Some are missed to keep the article short and simple. Special requests wud be taken care by me..:)..so feel free to ask in which algorithm u need the computation analysis...

shabbir 2Apr2010 09:12

Re: Ultimate Guide to Sorting
 
If you like this this article nominate it for Article of the month - Mar 2010

GeorgeWalker 10Apr2010 14:57

Re: Ultimate Guide to Sorting
 
Thanks for sharing valuable information

manoj1987 12Apr2010 18:45

Re: Ultimate Guide to Sorting
 
i have trouble understanding the big oh, big omega notation and all those related notations. Can anyone direct me to a site or a link with this site that explains it neat and simple.
Thanks :)


All times are GMT +5.5. The time now is 18:35.