Algorithms

Sorting

Insertion sort

http://en.wikipedia.org/wiki/Insertion_sort

  • Worst case performance O(n^2) comparisons, swaps
  • Best case performance O(n) comparisons, O(1) swaps
  • Average case performance О(n^2) comparisons, swaps
  • Worst case space complexity О(n) total, O(1) auxiliary
def insertion_sort(data):
    for i in range(len(data)):
        j = i
        while j > 0 and data[j] < data[j - 1]:
            data[j - 1], data[j] = data[j], data[j - 1]
            j -= 1

"A useful optimization in practice for these algorithms [quicksort, mergesort] is to use insertion sort for sorting small sublists, where insertion sort outperforms these more complex algorithms. The size of list for which insertion sort has the advantage varies by environment and implementation, but is typically between eight and twenty elements."

Quicksort

http://en.wikipedia.org/wiki/Quicksort

  • Worst case performance, O(n^2) (extremely rare)
  • Best case performance, O(n log n)
  • Average case performance, O(n log n)
  • Worst case space complexity O(n) auxiliary (naive), O(log n) auxiliary

http://en.literateprograms.org/Quicksort_(Python)

def quicksort(data):
    if data == []:
        return data
    pivot = data[0]
    lesser = quicksort([x for x in data[1:] if x <= pivot])
    greater = quicksort([x for x in data[1:] if x > pivot])
    return lesser + [pivot] + greater

Merge sort

http://en.wikipedia.org/wiki/Merge_sort

  • Worst case performance O(n log n)
  • Best case performance O(n log n) typical, O(n) natural variant
  • Average case performance O(n log n)
  • Worst case space complexity O(n) auxiliary
def merge(left, right):
    result = []
    i, j = 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result += left[i:]
    result += right[j:]
    return result


def mergesort(lst):
    if len(lst) <= 1:
        return lst
    middle = int(len(lst) / 2)
    left = mergesort(lst[:middle])
    right = mergesort(lst[middle:])
    result = merge(left, right)
    return result

Merge sort is often the best choice for sorting a linked list: in this situation it is relatively easy to implement a merge sort in such a way that it requires only Θ(1) extra space, and the slow random-access performance of a linked list makes some other algorithms (such as quicksort) perform poorly, and others (such as heapsort) completely impossible.

Heapsort

http://en.wikipedia.org/wiki/Heapsort

  • Worst case performance O(log(n))
  • Average case performance O(n log(n))
  • Worst case space complexity O(n)
def heapsort(data):
    # heapq.heapify creates heap from a list in linear time
    heapq.heapify(data)
    return [heapq.heappop(data) for i in range(len(data))]

Searching

Sequential search

http://en.wikipedia.org/wiki/Linear_search

In computer science, linear search or sequential search is a method for finding a particular value in a list, that consists of checking every one of its elements, one at a time and in sequence, until the desired one is found.

The performance of linear search improves if the desired value is more likely to be near the beginning of the list than to its end. Therefore, if some values are much more likely to be searched than others, it is desirable to place them at the beginning of the list.

As a result, even though in theory other search algorithms may be faster than linear search (for instance binary search), in practice even on medium sized arrays (around 100 items or less) it might be infeasible to use anything else. On larger arrays, it only makes sense to use other, faster search methods if the data is large enough, because the initial time to prepare (sort) the data is comparable to many linear searches.

Binary search

http://en.wikipedia.org/wiki/Binary_search_algorithm

bisect Python module: This module provides support for maintaining a list in sorted order without having to sort the list after each insertion. For long lists of items with expensive comparison operations, this can be an improvement over the more common approach. The module is called bisect because it uses a basic bisection algorithm to do its work. The source code may be most useful as a working example of the algorithm (the boundary conditions are already right!).

def binary_search(data, value):
    '''
    Assumes that data is sorted
    :returns: True if value is found, False if not
    '''

    if len(data) == 0:
        return False

    left = 0
    right = len(data) - 1

    if left == right:
        if data[left] == value:
            return True
        else:
            return False

    while right - left > 1:
        middle = (left + right) / 2
        if data[middle] == value:
            return True
        if data[middle] < value:
            left = middle
        else:
            right = middle

    if data[left] == value:
        return True
    if data[right] == value:
        return True

    return False

Hash table

http://en.wikipedia.org/wiki/Hash_table

Best/Worst

  • Space O(n)/O(n)
  • Search O(1)/O(n)
  • Insert O(1)/O(n)
  • Delete O(1)/O(n)

Binary search trees

http://en.wikipedia.org/wiki/Binary_search_tree

  • The left subtree of a node contains only nodes with keys less than the node's key.
  • The right subtree of a node contains only nodes with keys greater than the node's key.
  • The left and right subtree must each also be a binary search tree.
  • There must be no duplicate nodes.

Best/Worst

  • Space O(n)/O(n)
  • Search O(log n)/O(n)
  • Insert O(log n)/O(n)
  • Delete O(log n)/O(n)
Last edited by Artem Dudarev, 2013-10-13 02:01:37. Edit