Home Linear and Binary search
Post
Cancel

Linear and Binary search

An example of a computer searching algorithm is linear search. This is a simple algorithm used to find a value in a list of data. The algorithm runs as follows:

  • Identify a search term.
  • Look at the first item in the list.
  • Compare the item with the search term.
  • Is the current item the same as the search term? If so, the item has been found. If not, move to the next item.
  • Repeat from step two until the last item in the list has been reached.
  • If the end of the list has been reached and the search term has not been found, then the search term is not in the list and the algorithm can stop.

Linear search example

This algorithm could be used to search the following list for the number 1:

3, 2, 4, 1, 5

The algorithm would produce:

  • 3, 2, 4, 1, 5 (1 compared to 3 - not found)

  • 3, 2, 4, 1, 5 (1 compared to 2 - not found)

  • 3, 2, 4, 1, 5 (1 compared to 4 - not found)

  • 3, 2, 4, 1, 5 (1 compared to 1- found)

Because the linear search algorithm simply moves up the list and checks each item, the data in the list does not need to be in order. However, as the list gets longer the algorithm takes longer to run, making it inefficient for large lists.

In pseudo-code, this may look like:

find <-- 1
found <-- False
length <-- length(list)
counter <-- 0


WHILE found = False AND counter <= length
     IF list[counter] = find THEN
          found <-- True
          OUTPUT 'Found at position', counter
        ELSE
          counter <-- counter + 1
        ENDIF
ENDWHILE
IF found = False THEN
   OUTPUT 'Item not found'
ENDIF

Binary search

Binary search is a ‘divide and conquer’ algorithm which requires the initial array to be sorted before searching.

It is called binary because it splits the array into two halves as part of the algorithm. Initially, a binary search will look at the item in the middle of the array and compare it to the search terms.

This leads to three possible scenarios:

  • Search criteria = middle item: Item Found
  • Search criteria is less than middle item: Search the first half of the array
  • Search criteria is larger than middle item: Search the upper half of the array

The process of dividing the array continues until the middle item meets the search criteria.

The binary search algorithm can be expressed by the following pseudo-code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
OUTPUT "Which customer do you want to find?"
INPUT user inputs John Smith
STORE the user's input in the customer_name variable
customer_found = False
(a flag that identifies if the customer is found)

WHILE customer_found = False:
  Find midpoint of list
  IF customer_name = record at midpoint of list THEN
    customer_found = True
  ELSE IF customer comes before the midpoint THEN
    throw away the second half of the list
  ELSE 
    throw away the first part of the list
    OUTPUT customer details
array = [3, 14, 15, 29, 31, 35, 40, 66, 68, 98]
length = list.length                                     
number = input("What number would you like to find?")
lowerBoundry = 0
upperBoundry = length – 1
match = false
while match == false AND lowerBoundry != upperBoundry
  midPoint = round((lowerBoundry + upperBoundry)/2)
  if array[midPoint] == number then
    print("We have found your number!")
    match = true
  elseif array[midPoint] - < number then
    lowerBoundry = midPoint + 1
  else
    upperBoundry = midPoint – 1
  endif
endwhile

This can be simplified as:

  • Let n = length of the array, let min = 0 and max = n-1 Calculate guess as the average of max and min, rounded down (so it is an integer)
  • If array[guess] equals target, then stop. You found it. Return guess
  • If the guess was too low, that is, array[guess] - < target, then set min = guess + 1
  • Otherwise, the guess was too high. Set max = guess - 1
  • Go back to step 2
  • Evaluating linear and binary searches
  • Although a binary search is a little harder to program, it is far more efficient than a linear search.

Example: For an array with 16 elements, the best case scenario is that a binary search will find the element on the first go and, in the worst case, on the fourth go (24 = 16).

For an array with 1 million elements, this would mean a worst case scenario of 20 goes, while the worst case for a linear search of 1 million items would be 1 million probes.

When should you use a binary search?

  • If the list is large and changing often, with items constantly being added or deleted, then the time it takes to constantly re-order the list to allow for a binary search might be longer than a simple serial search in the first place
  • If the list is large and static (e.g., a database of telephone numbers), then a binary search is very fast compared to a linear search (in maths terms it takes 2log2(n) for a binary search over n items)
  • If the list is small, then it might be simpler to use a linear search
  • If the list is random, then a linear search is the only way to search
  • If the list is skewed so that the most often searched items are placed at the beginning then, on average, a linear search is likely to be better
This post is licensed under CC BY 4.0 by the author.