Binary search


When we are looking in a sorted list, we are looking for which part of the list begins with the character we want. This technique can be implemented in Perl known as binary search, because we divide the list in two pieces after each look. We are starting in the middle of the list and we check if the current word is lower or greater than the word we are looking for. If it is lower, we are going in the middle of the bottom half of the list, if it is higher, we are going in the middle of the top half of the list, and we are continuing the process, dividing the list in halves until we find the word we are looking for, or we are out of the list. This process is faster than linear searching because we skip quite a lot of words in large arrays. The biggest drawn back is that the list must be sorted, but this is fine if we sort the array once and then we make multiple checks to see if some words exists inside the list.

This algorithm is best suited to look for some lines in a very large asciibeticaly sorted file.

It makes between 14 and 16 comparisons before reporting if a given word exists or not inside a file with 32897 lines.

It has full locale support and uses 'seek' and 'tell' for moving between positions inside the file: https://github.com/trizen/perl-scripts/blob/master/Finders/file_binsearch.pl

Comments