Sorting Algorithms: Insertion Sort

History- 

Sorting is the process of putting data into a defined order, either ascending or descending.   It is based on a predefined comparison function which takes 2 pieces of data and determines if the first one is greater than, less than, or equal to the second one.  By using the comparison function iteratively to different pairs of data, a sorting algorithm is formed.  This may then be used to transform the data set from an unsorted state, to one which is.  Sorting is one of the most fundamental tasks performed by the computer.   It is most often used so that searching data sets may be done easier.  The efficiency of a given algorithm depends on the data set that it is being used on.  There are 3 different ways to express the efficiency of an algorithm.   The best case, which describes how the algorithm works under the best conditions, the worst case, which is how it works under the worst conditions, and the average case, which is  considered to be the most common situation.  Since different sets of data perform best under different applications, there are a variety of sorting strategies that can be used: bubble sort, selection sort, insertion sort, merge sort, quick sort and heap sort.

The insertion sort strategy which can be used to sort items contained in an array.  This sort partitions the array into 2 regions, the sorted and the unsorted.  All of the items in the array start out as unsorted, but at each step, the insertion takes the first item of the unsorted part and puts it in its correct position on the sorted side until all the unsorted items are sorted.  This algorithm is very similar to sorting the cards as they are dealt to you in a card game one by one.  As the cards are dealt you you, you pick them up, one at a time, and then put it in its correct position in your hand, whether it be ascending or descending ordered.  

One example of how the insertion sort works is shown below:

23   5   2   11   8   6    {  At the beginning, we assume the first 
|-|  *                        element is sorted, and then we attempt
                              to place the second element into its
                              proper place  }
                                    
5   23   2   11   8   6    }  5 and 23 are now correctly sorted, and
|----|   *                    we'll attempt to put 2 into its place.  } 
      
2   5   23  11   23   6    {  2 has been put into it's place in the
|--------|   *                list of three items.  It's a coincidence
                              that it is also its correct final position  }
                                    
2   5   11   23   8   6    {  Now the first four items are sorted, and
|-------------|   *           we'll insert 8 into that list  }
      
2   5   8   11   23   6    {  8 is inserted between 5 and 11  }
|-----------------|   *
      
2   5   6   8   11   23    {  After inserting 6, the list is sorted  }
 
The Code-
public void insertion(){
   int counter;
   int index;
   int num;
   int temp;
   num = 20;
   for (counter = 1; counter < num; counter++) {
      temp = a[counter];
      // Figure out where this value belongs in 
      // the sorted part of the array 
      index = counter-1;
      while ((index >= 0) && (a[index] > temp)) {
          a[index+1] = a[index]; // move over one position
          index--;
      }
      a[index+1] = temp; // move the copy of a[counter] into order
   }
}

Big-O Analysis-

Assuming that there are n elements in the array, n-1 searches must be made.  For each entry you may need to check and exchange up to n-1 other entries, resulting in a O(n2) algorithm.  In the best case, which is a sorted array, it would run one time with one comparison and no shift, and would result in an O(n) algorithm.  The average case would also be a O(n2) algorithm as like the worst case.

 

Compare & Contrast- 

When measuring the efficiency of a given algorithm, there are mainly 2 things that should be taken into consideration.   Since the  algorithms have to compare the magnitude of different elements and move things around, counting the number of comparisons needed and the number of exchanges made are good measures of performance.  When comparing with larger structures, the number of exchanges would be the primary principal of performance, since exchanging two records would require a lot of work, but in smaller arrays, the number of comparisons would be primary.  

 Below is a table showing the number of elements with the number of comparisons for various sorts:
Sort/Elements 50 100 200 300 400 500
Selection Sort 1225 4950     19900   44850   79800   124750
Exchange Sort 1410 5335 20300 45650 79866 126585
Insertion Sort 1391 5399 20473 44449 78779 123715
Quick Sort         399        990 1954 3384 5066 6256

 

Within the first 3 sorts (all of which are O(n2) algorithms), the insertion sort was the fastest, having to use the least amount of comparisons when the number of elements increased.  Although the quick sort has O(n2) for its worst case, which is the same as the first 3, its average case is O(n*log2n), which will run significantly faster on larger arrays.

Bibliography

Carrano, Frank M., Janet J. Prichard.  Data Abstraction and Problem Solving With Java.  New York: Addison Wesley Longman, Inc., 2001.

http://joda.cis.temple.edu/~koffman/cis223/sortsPart1.doc

http://www.aihorizon.com/essays/basiccs/lists/sorting/insertion.htm

http://planetmath.org/encyclopedia/InsertionSort.html

http://www.cs.hope.edu/csci120/topics/sorting.html

http://atschool.eduweb.co.uk/mbaker/sorts.html

http://epaperpress.com/sortsearch/index.html