# Algorithms

## Sorting

### Insertion sort

• Worst case performance O(n^2) comparisons, swaps
• Best case performance O(n) comparisons, O(1) swaps
• Average case performance О(n^2) comparisons, swaps
• Worst case space complexity О(n) total, O(1) auxiliary
def insertion_sort(data):
for i in range(len(data)):
j = i
while j > 0 and data[j] < data[j - 1]:
data[j - 1], data[j] = data[j], data[j - 1]
j -= 1


"A useful optimization in practice for these algorithms [quicksort, mergesort] is to use insertion sort for sorting small sublists, where insertion sort outperforms these more complex algorithms. The size of list for which insertion sort has the advantage varies by environment and implementation, but is typically between eight and twenty elements."

### Quicksort

• Worst case performance, O(n^2) (extremely rare)
• Best case performance, O(n log n)
• Average case performance, O(n log n)
• Worst case space complexity O(n) auxiliary (naive), O(log n) auxiliary
def quicksort(data):
if data == []:
return data
pivot = data[0]
lesser = quicksort([x for x in data[1:] if x <= pivot])
greater = quicksort([x for x in data[1:] if x > pivot])
return lesser + [pivot] + greater


### Merge sort

• Worst case performance O(n log n)
• Best case performance O(n log n) typical, O(n) natural variant
• Average case performance O(n log n)
• Worst case space complexity O(n) auxiliary
def merge(left, right):
result = []
i, j = 0, 0
while i < len(left) and j < len(right):
if left[i] <= right[j]:
result.append(left[i])
i += 1
else:
result.append(right[j])
j += 1
result += left[i:]
result += right[j:]
return result

def mergesort(lst):
if len(lst) <= 1:
return lst
middle = int(len(lst) / 2)
left = mergesort(lst[:middle])
right = mergesort(lst[middle:])
result = merge(left, right)
return result


Merge sort is often the best choice for sorting a linked list: in this situation it is relatively easy to implement a merge sort in such a way that it requires only Θ(1) extra space, and the slow random-access performance of a linked list makes some other algorithms (such as quicksort) perform poorly, and others (such as heapsort) completely impossible.

### Heapsort

• Worst case performance O(log(n))
• Average case performance O(n log(n))
• Worst case space complexity O(n)
def heapsort(data):
# heapq.heapify creates heap from a list in linear time
heapq.heapify(data)
return [heapq.heappop(data) for i in range(len(data))]


## Searching

### Sequential search

In computer science, linear search or sequential search is a method for finding a particular value in a list, that consists of checking every one of its elements, one at a time and in sequence, until the desired one is found.

The performance of linear search improves if the desired value is more likely to be near the beginning of the list than to its end. Therefore, if some values are much more likely to be searched than others, it is desirable to place them at the beginning of the list.

As a result, even though in theory other search algorithms may be faster than linear search (for instance binary search), in practice even on medium sized arrays (around 100 items or less) it might be infeasible to use anything else. On larger arrays, it only makes sense to use other, faster search methods if the data is large enough, because the initial time to prepare (sort) the data is comparable to many linear searches.

## Binary search

bisect Python module: This module provides support for maintaining a list in sorted order without having to sort the list after each insertion. For long lists of items with expensive comparison operations, this can be an improvement over the more common approach. The module is called bisect because it uses a basic bisection algorithm to do its work. The source code may be most useful as a working example of the algorithm (the boundary conditions are already right!).

def binary_search(data, value):
'''
Assumes that data is sorted
:returns: True if value is found, False if not
'''

if len(data) == 0:
return False

left = 0
right = len(data) - 1

if left == right:
if data[left] == value:
return True
else:
return False

while right - left > 1:
middle = (left + right) / 2
if data[middle] == value:
return True
if data[middle] < value:
left = middle
else:
right = middle

if data[left] == value:
return True
if data[right] == value:
return True

return False


### Hash table

Best/Worst

• Space O(n)/O(n)
• Search O(1)/O(n)
• Insert O(1)/O(n)
• Delete O(1)/O(n)

### Binary search trees

• The left subtree of a node contains only nodes with keys less than the node's key.
• The right subtree of a node contains only nodes with keys greater than the node's key.
• The left and right subtree must each also be a binary search tree.
• There must be no duplicate nodes.

Best/Worst

• Space O(n)/O(n)
• Search O(log n)/O(n)
• Insert O(log n)/O(n)
• Delete O(log n)/O(n)
Last edited by Artem Dudarev, 2013-10-13 02:01:37. Edit