Searching and Sorting

Searching and sorting techniques play a vital role in maintaining database applications. Since a database is a collection of the massive volume of data which is stored in a form of records. Each record contains numerous different types of attributes depending on the structure defined. It is very difficult and time-consuming to retrieve a specific record from the database, where records are not stored in sorted order if there are no proper searching and sorting methods.

Searching

Searching is a process of finding the correct location of an element from a list or array of elements. In an array, elements are stored in consecutive memory locations with the remaining element until the exact match is found. If the element is found, the search process is said to be successful or else the search process is terminated unsuccessfully.

The time complexity of searching technique depending on the number of comparisons made to find an exact, match. The methods used for searching are,

  1. Linear /Sequential Search
  2. Binary / Logarithmic Search

Sequential Searching

Linear searching is nothing but searching an element in a linear way. This can be an array or in a linked list. We have a need to start the search from beginning and scan the element one by one until the end of the array or linked list. If the search is successful then it will return the location of the element, otherwise, it will return the failure notification.

Algorithm for sequential search

Let us take a linear search algorithm for array

Let us take the linear search algorithm for array

Analysis

As the linear search behavior of this algorithm, it searches elements one by one in a linear manner. In the case of a successful search, it will search the element up to the position in the array where it finds the element in the array. Suppose it finds the element at the first position of the array then it will behave like best case. in the case of an unsuccessful search, it will search up to the last element array.

The main disadvantage of the linear search method is that, it is a time-consuming method, and also is inefficient when a large number of an element is present in a list. The time complexity for linear search method is O(n)

Logarithmic Search

The sequential search situation will be in the worst case if the element is at the end of the list. For eliminating this problem, we have one efficient searching technique called binary search. The condition for binary search is that all the data should be in a sorted array. We compare the element with the middle element of the array. If it is less than the middle element then search it in the left position of the array and if it is greater than the middle element then the search will be in the right portion of the array. We will take that portion only for search and compare it with the middle element of that portion. This process will be in iteration until we find the element or middle element has no left or right portion to search.

An iterative algorithm for Binary Search

K list of N elements in ascending order, X element to be found, Low temporary variable for the lower limit of the search interval, Middle temporary variable for a middle limit of the search interval and high temporary variable for the upper limit of the search interval.

A recursive algorithm for Binary search

K list of N elements in ascending order, X element to be found, Low temporary variable for the lower limit of the search interval, Middle temporary variable for the middle limit of the search interval and high temporary variable for the upper limit of the search interval, Loc position variable

The time complexity for a binary search method is A non-recursive binary search algorithm using a while loop. The number of iterations takes place depends on the number of elements in the list. This method is fast when compared to the linear search method. Thus, a worst case search requires O(log n) iterations, n is the number of elements in the list.

Fibonacci Search

Fibonacci search is also known as Fibonaccian search, is an improved version of binary search. The improvement is not to use the middle element at each step for splitting the array but rather to guess the key within the smaller internals. This internals is determined with the help of Fibonacci numbers, which are defined as follows.

The Fibonacci search is a divide and conquers algorithm for searching a sorted array. This searching method splits the array corresponding to the Fibonacci number. Fibonacci search is much faster than a binary search.

Algorithm for Fibonacci Search

Let fn is the nth Fibonacci number (where fn+2=fn+fn-1) for k>=0) and f0=0,f1=1. To find whether an item is in the list of l= fn ordered numbers, the following procedure is used.

The time complexity for Fibonacci Search algorithm is O(log(n)) 

Sorting

 Sorting is a technique of organizing the data.  It is a process of arranging the records either in ascending or descending order i.e., bringing some orderliness in the data. Sort methods are very important in data structures. Sorting can be performed on any one or a combination of one or more attributes present in each record. It is very easy and efficient to perform searching if data is stored in sorted order. The efficiency of sorting technique depending on the utilization of memory, sorting time, and utilization of processor, etc. sorting is classified in the following types

Exchange/Bubble Sort

Bubble sort algorithm for sorting is one of the simplest but inefficient algorithms, which organize the sorting procedure by uniformly checking the two adjacent elements and accordingly adjusting their positions.

It is inefficient because the number of iterations increases with the increase in the number of elements to be sorted. Hence, if there are n elements to be sorted then the number of iterations required are n-1.

If N elements are given in memory then for sorting we do the following steps,

The Time Complexity of Bubble Sort Algorithm

Analysis

In bubble sort, n-1 comparisons will be in the 1st pass, n-2 in the 2nd pass, n-3 in a 3rd pass and so on. So it’s very simple to calculate the number of comparisons. Total number of comparisons will be (n-1)+(n-2)+(n-3)+……+3+2+1. it’s a form of arithmetic progression series. So we can apply the formula

sum=n(n-1)/2 which is of O(n2)

The main advantage is the simplicity of the algorithm, the additional space requirement is only one temporary variable and it behaves as O(n) for a sorted array of the element.

Selection Sort

A selection sort algorithm sorts the list by selecting the largest element in the unsorted list and places it at the appropriate position (i.e., swapping). In each pass, the largest element in the list is placed at the appropriate position. If there are n elements in the list,

In the first pass, all the elements are compared and the largest element is placed in the last position. In the second pass, first n-1 elements are compared and the process continues for list of n-2 elements, then for n-3 elements and so on until the first element. This process is continued until all the elements are sorted.

[quote]Note: If there are n elements in a list, we require n-1 passes.[/quote]

Pass 1:

Pass 2:

Pass 3:

Pass N-1:

Result arr[0]…….arr[N-1].

Advantage

Selection sort minimizes data movement by putting one entry at its final position at every pass. Hence this algorithm is useful for a continuous list of large elements where the movement of entries is expensive.

Disadvantage

Analysis

As we have seen selection sort algorithm will search the smallest element in the array and then that element will be a proper position. So in pass 1, it will compare n-1 elements the same process will be for pass 2 but this comparison will be n-2 because the first element is already at the proper position. The elements, which are already at the correct position, will not be disturbed.

Insertion Sort

As the name suggests, insertion sort is a technique, which inserts an element at an appropriate location(in an array) by comparing each element with the corresponding elements present at its left and accordingly moves rest of the elements of the given array.

Algorithm for Insertion Sort

A list of N elements i.e., array, t temporary variable for interchange the value, k total no of passes and j & i another control variables

The time complexity for insertion sort

Analysis

In insertion sort, we insert the element before or after and we start comparison from the first element. Since the first element has no other elements before it. So it does not require any comparison. The second element requires 1 comparison, third requires 2 comparisons. The fourth requires 3 comparisons and so on. The last element requires n-1 comparisons.

Shell Sort/Increment Sort

The shell sort is based on the insertion sort. The given list of elements is broken down in a few smaller lists. After this, the insertion sort is applied on such smaller lists. These smaller lists are again recombined. Another partition is made and the procedure is repeated. The breaking down of the list into smaller lists is by selecting elements at a specific location.

This is determined by a factor called increment. Consider the list 32,21,6,43,31,7,44.we may initially take the increment as 3. This would break the list as 32,43,44;21,31 and 6,7. The first list is done using element[0],element[3] and element[6] where the increment is 3. This list is sorted using insertion sort. The next list is element[1] and element[4] where the increment is the same 3. The final list is element[2] and element[5] with the same increment of 3. After sorting them, the sublists are recombined.

As a next stage, the increment is changed and the process is repeated. The process continues till the increment is 1.the choice of increment is left to our choice.

Algorithm for Shell Sort

A list of N elements i.e., array, incr maximum increment  value, k temporary variable j & i control variables

Quick Sort/Partition Exchange Sort

The most powerful sorting algorithm is quicksort. It uses the policy of divide and conquers. It is also known as a partition exchange sort. In Quicksort we divide the original list into two sublists. We choose the item from the list called key from which all the left side elements are smaller and all the right side of elements are greater than that element. One list is on the left side of the key and the second list is on the right side of the key, means the key will be in its actual position. Hence there are two conditions for choosing key-

All the elements on the left side of the key should be smaller or equal to the key. All the elements on the right side of the key should be greater than are equal to key. Similarly, we choose the pivot for dividing the sublist until there are two or more elements in the sublist.

The process for sorting the elements through quicksort is as-

For placing the key at the proper place we have a need to do the following process

Analysis

The time requirement of quicksort depending on the position of the key in the list, how the key is dividing the list into sublists. It may be an equal division of list or maybe it will not divide also.

The time complexity for Quicksort

Average case: in the average case, we assume that the list is equally divided means list1 is equally divided into two sublists, these two sublists in four sublists and so on. So the total number of elements at a particular level will be 21-1. So the total number of steps will be log2n. the number of comparison at any level will be maximum n. so we can say run time of quicksort will be of O( n log n).

Worst case: Suppose a list of the element is already in sorted order. When we find the key then it will be the first element. So hear it produces only 1 sublist which is on the right side of the first element starts from the second element. Similarly, other sublists will be created only at the right side. The number of comparison for the first element is n, the second element requires n-1 comparisons and so on. In this case O(n2) time required.

Algorithm for Quick Sort

An array of n elements, LB lower bound of the current sub-list, UB upper bound of the current sub-list, i index variable, j index variable, key is key value and flag logical variable

Merge/2-way merge sort

Merge sort is a sorting technique that is generally applied between two sorted arrays. Here a third array is created and the elements of both the sorted arrays are matched, correspondingly, the third array is filled. Hence, the final third array obtained will be consisting of elements belonging to both the arrays in sorted order.

Algorithm for Merge Sort

Consider, that there are two arrays namely array A and B respectively. Each of these arrays consists of ‘3’ elements. The two arrays with their elements are represented below

In order to proceed with the algorithm, a third empty array ‘C’ is created, with a capacity of 6 bits (as array A and B consists of a total of 6 bits). The following steps narrate various stages and mechanisms of the algorithm.

Step 1: Initially both the arrays ‘A’ and ‘B’ are sorted by considering the algorithm like bubble sort or inserting sort etc. the resultant arrays obtained are given below.Step 2: Compare the elements in both arrays. i.e. A and B. and drop that element into the array ‘C’.

Step 3: Element ‘4’ of array A is compared with element 6 of array ‘B’. As A[1]<B[0], hence A[1] is dropped into the array C.

               

Step 4: Element ‘5’ of array ‘A’ is compared with element ‘6’ of array ‘B’. As A[2]<B[0] hence A[2] is dropped into the array ‘C’.

Step 5: As there are no elements left in ‘A’, hence the process gets terminated by inserting rest of the elements of ‘B’ into ‘C’. The resultant ‘C’ obtained will consist of elements in ascending order.

Analysis

Let us take an array of size n is used for merge sort. Because here we take elements in pairs and merge with another pair after sorting. So the merge sort requires maximum log2n passes. Each pass, the total number of comparisons can be maximum n. hence we can say merge sort requires maximum n* log2 n compares with is of O(nlog2n). The main disadvantage of the merge sort is space requirement it requires extra space of O(n).

The time complexity for Mergesort

Heap Sort

The elements of the heap tree are represented by an array. The root will be the largest element of the heap tree. Since it is maintained in the array, so the largest value should be the last element of the array.

Algorithm for Heapsort

Note: The maximum index value of the array is decremented by one when an element is deleted.

Comparing the different type of sorting:

Sorting Technique Best Case Average Case Worst Case
Bubble O(n) O(n2) O(n2)
Insertion O(n) O(n2) O(n2)
Selection O(n2) O(n2) O(n2)
Shell    –
Quick O(n2) O(n log n) O(n log n)
Merge O(n log n) O(n log n) O(n log n)
Heap O(n log n) O(n log n) O(n log n)