Table of Contents
- Introduction to Sorting
- Sorting Algorithm Classification
- Simple Sorting Algorithms
- Efficient Sorting Algorithms
- Non-Comparison Sorts
- Python Built-in Sorting
- Specialized Sorting Algorithms
- Sorting Applications
- Performance Analysis
- Best Practices
Introduction to Sorting
Sorting is the process of arranging data in a particular order (ascending or descending). It’s one of the most fundamental operations in computer science.
Why Sorting is Important
- Search Efficiency: Binary search requires sorted data (O(log n) vs O(n))
- Data Organization: Makes data easier to understand and process
- Algorithm Prerequisites: Many algorithms require sorted input
- Database Operations: Indexes, query optimization
- User Experience: Display data in meaningful order
Real-World Applications
- E-commerce: Sort products by price, rating, popularity
- Social Media: Sort posts by time, likes, relevance
- File Systems: Sort files by name, date, size
- Databases: ORDER BY queries
- Search Engines: Rank search results
- Banking: Transaction history
Sorting Algorithm Classification
By Time Complexity
| Class | Algorithms | Best/Avg/Worst |
|---|---|---|
| O(n²) | Bubble, Selection, Insertion | Simple, small datasets |
| O(n log n) | Merge, Heap, Quick (avg) | General purpose |
| O(n) | Counting, Radix, Bucket | Special cases |
By Space Complexity
- In-Place: O(1) extra space (Bubble, Selection, Insertion, Heap, Quick)
- Not In-Place: O(n) extra space (Merge, Counting, Radix)
By Stability
- Stable: Preserves relative order of equal elements (Bubble, Insertion, Merge, Tim)
- Unstable: May change relative order (Selection, Heap, Quick)
By Method
- Comparison-Based: Compare elements (most algorithms)
- Non-Comparison: Use other properties (Counting, Radix, Bucket)
By Adaptability
- Adaptive: Faster on partially sorted data (Insertion, Bubble, Tim)
- Non-Adaptive: Same speed regardless (Selection, Merge, Heap)
Simple Sorting Algorithms
1. Bubble Sort
Repeatedly swaps adjacent elements if they’re in wrong order.
def bubble_sort(arr):
"""
Bubble Sort - Compare adjacent elements and swap
Time: O(n²) - Best: O(n) if already sorted
Space: O(1)
Stable: Yes
Adaptive: Yes (with optimization)
"""
n = len(arr)
for i in range(n):
# Flag to detect if any swap happened
swapped = False
# Last i elements are already in place
for j in range(0, n - i - 1):
if arr[j] > arr[j + 1]:
arr[j], arr[j + 1] = arr[j + 1], arr[j]
swapped = True
# If no swap occurred, array is sorted
if not swapped:
break
return arr
def bubble_sort_verbose(arr):
"""Bubble sort with step-by-step output"""
n = len(arr)
print(f"Initial array: {arr}")
for i in range(n):
swapped = False
for j in range(0, n - i - 1):
if arr[j] > arr[j + 1]:
arr[j], arr[j + 1] = arr[j + 1], arr[j]
swapped = True
print(f"Pass {i + 1}: {arr}")
if not swapped:
print("Array is sorted!")
break
return arr
# Example
arr = [64, 34, 25, 12, 22, 11, 90]
print("Sorted array:", bubble_sort(arr.copy()))PythonWhen to Use:
- Small datasets (< 10 elements)
- Nearly sorted data
- Educational purposes
- Simple implementation needed
2. Selection Sort
Repeatedly finds minimum element and places it at the beginning.
def selection_sort(arr):
"""
Selection Sort - Select minimum and swap with first unsorted
Time: O(n²) - Always
Space: O(1)
Stable: No
Adaptive: No
"""
n = len(arr)
for i in range(n):
# Find minimum element in remaining unsorted array
min_idx = i
for j in range(i + 1, n):
if arr[j] < arr[min_idx]:
min_idx = j
# Swap found minimum with first element
arr[i], arr[min_idx] = arr[min_idx], arr[i]
return arr
def selection_sort_descending(arr):
"""Selection sort in descending order"""
n = len(arr)
for i in range(n):
max_idx = i
for j in range(i + 1, n):
if arr[j] > arr[max_idx]:
max_idx = j
arr[i], arr[max_idx] = arr[max_idx], arr[i]
return arr
# Example
arr = [64, 25, 12, 22, 11]
print("Selection sort:", selection_sort(arr.copy()))
print("Descending:", selection_sort_descending(arr.copy()))PythonWhen to Use:
- Memory is limited (in-place)
- Minimize number of swaps
- Small datasets
3. Insertion Sort
Builds sorted array one element at a time by inserting elements in correct position.
def insertion_sort(arr):
"""
Insertion Sort - Insert each element at correct position
Time: O(n²) - Best: O(n) if already sorted
Space: O(1)
Stable: Yes
Adaptive: Yes
"""
for i in range(1, len(arr)):
key = arr[i]
j = i - 1
# Move elements greater than key one position ahead
while j >= 0 and arr[j] > key:
arr[j + 1] = arr[j]
j -= 1
arr[j + 1] = key
return arr
def insertion_sort_binary(arr):
"""Insertion sort with binary search for position"""
import bisect
for i in range(1, len(arr)):
key = arr[i]
# Find position using binary search
pos = bisect.bisect_left(arr, key, 0, i)
# Shift elements and insert
arr = arr[:pos] + [key] + arr[pos:i] + arr[i+1:]
return arr
def insertion_sort_recursive(arr, n=None):
"""Recursive insertion sort"""
if n is None:
n = len(arr)
if n <= 1:
return arr
# Sort first n-1 elements
insertion_sort_recursive(arr, n - 1)
# Insert last element at correct position
last = arr[n - 1]
j = n - 2
while j >= 0 and arr[j] > last:
arr[j + 1] = arr[j]
j -= 1
arr[j + 1] = last
return arr
# Example
arr = [12, 11, 13, 5, 6]
print("Insertion sort:", insertion_sort(arr.copy()))PythonWhen to Use:
- Small datasets
- Nearly sorted data (very efficient!)
- Online sorting (sort as data arrives)
- Part of more complex algorithms (Timsort)
Efficient Sorting Algorithms
1. Merge Sort
Divide-and-conquer algorithm that divides array into halves, sorts them, and merges.
def merge_sort(arr):
"""
Merge Sort - Divide and conquer
Time: O(n log n) - Always
Space: O(n)
Stable: Yes
Adaptive: No
"""
if len(arr) <= 1:
return arr
# Divide
mid = len(arr) // 2
left = merge_sort(arr[:mid])
right = merge_sort(arr[mid:])
# Conquer (merge)
return merge(left, right)
def merge(left, right):
"""Merge two sorted arrays"""
result = []
i = j = 0
# Compare elements from left and right
while i < len(left) and j < len(right):
if left[i] <= right[j]:
result.append(left[i])
i += 1
else:
result.append(right[j])
j += 1
# Append remaining elements
result.extend(left[i:])
result.extend(right[j:])
return result
def merge_sort_inplace(arr, left=0, right=None):
"""In-place merge sort (still O(n) auxiliary space for merge)"""
if right is None:
right = len(arr) - 1
if left < right:
mid = (left + right) // 2
merge_sort_inplace(arr, left, mid)
merge_sort_inplace(arr, mid + 1, right)
merge_inplace(arr, left, mid, right)
return arr
def merge_inplace(arr, left, mid, right):
"""Merge for in-place merge sort"""
# Create temporary arrays
left_arr = arr[left:mid + 1]
right_arr = arr[mid + 1:right + 1]
i = j = 0
k = left
while i < len(left_arr) and j < len(right_arr):
if left_arr[i] <= right_arr[j]:
arr[k] = left_arr[i]
i += 1
else:
arr[k] = right_arr[j]
j += 1
k += 1
while i < len(left_arr):
arr[k] = left_arr[i]
i += 1
k += 1
while j < len(right_arr):
arr[k] = right_arr[j]
j += 1
k += 1
# Example
arr = [38, 27, 43, 3, 9, 82, 10]
print("Merge sort:", merge_sort(arr))PythonWhen to Use:
- Guaranteed O(n log n) performance
- Stable sort needed
- Linked lists (efficient in-place for linked lists)
- External sorting (large datasets on disk)
2. Quick Sort
Divide-and-conquer using pivot element to partition array.
def quick_sort(arr):
"""
Quick Sort - Partition around pivot
Time: O(n log n) avg, O(n²) worst
Space: O(log n) recursion stack
Stable: No
Adaptive: No
"""
if len(arr) <= 1:
return arr
pivot = arr[len(arr) // 2]
left = [x for x in arr if x < pivot]
middle = [x for x in arr if x == pivot]
right = [x for x in arr if x > pivot]
return quick_sort(left) + middle + quick_sort(right)
def quick_sort_inplace(arr, low=0, high=None):
"""In-place quick sort"""
if high is None:
high = len(arr) - 1
if low < high:
# Partition and get pivot index
pi = partition(arr, low, high)
# Recursively sort elements before and after partition
quick_sort_inplace(arr, low, pi - 1)
quick_sort_inplace(arr, pi + 1, high)
return arr
def partition(arr, low, high):
"""Partition array around pivot"""
pivot = arr[high] # Choose last element as pivot
i = low - 1 # Index of smaller element
for j in range(low, high):
if arr[j] <= pivot:
i += 1
arr[i], arr[j] = arr[j], arr[i]
arr[i + 1], arr[high] = arr[high], arr[i + 1]
return i + 1
def quick_sort_random_pivot(arr, low=0, high=None):
"""Quick sort with random pivot (avoids worst case)"""
import random
if high is None:
high = len(arr) - 1
if low < high:
pi = partition_random(arr, low, high)
quick_sort_random_pivot(arr, low, pi - 1)
quick_sort_random_pivot(arr, pi + 1, high)
return arr
def partition_random(arr, low, high):
"""Partition with random pivot"""
import random
# Random pivot
pivot_idx = random.randint(low, high)
arr[pivot_idx], arr[high] = arr[high], arr[pivot_idx]
return partition(arr, low, high)
def quick_sort_three_way(arr, low=0, high=None):
"""
Three-way quick sort (efficient for many duplicates)
Partitions into: < pivot, = pivot, > pivot
"""
if high is None:
high = len(arr) - 1
if low >= high:
return arr
lt, gt = low, high
pivot = arr[low]
i = low
while i <= gt:
if arr[i] < pivot:
arr[lt], arr[i] = arr[i], arr[lt]
lt += 1
i += 1
elif arr[i] > pivot:
arr[i], arr[gt] = arr[gt], arr[i]
gt -= 1
else:
i += 1
quick_sort_three_way(arr, low, lt - 1)
quick_sort_three_way(arr, gt + 1, high)
return arr
# Example
arr = [10, 7, 8, 9, 1, 5]
print("Quick sort:", quick_sort(arr.copy()))
print("In-place:", quick_sort_inplace(arr.copy()))PythonWhen to Use:
- Average case O(n log n) is very fast (cache-friendly)
- In-place sorting with O(log n) space
- When stability is not required
- General-purpose sorting
3. Heap Sort
Uses binary heap data structure to sort.
def heap_sort(arr):
"""
Heap Sort - Build max heap and extract elements
Time: O(n log n) - Always
Space: O(1)
Stable: No
Adaptive: No
"""
n = len(arr)
# Build max heap
for i in range(n // 2 - 1, -1, -1):
heapify(arr, n, i)
# Extract elements from heap one by one
for i in range(n - 1, 0, -1):
arr[0], arr[i] = arr[i], arr[0] # Swap
heapify(arr, i, 0)
return arr
def heapify(arr, n, i):
"""Heapify subtree rooted at index i"""
largest = i
left = 2 * i + 1
right = 2 * i + 2
# Check if left child is larger
if left < n and arr[left] > arr[largest]:
largest = left
# Check if right child is larger
if right < n and arr[right] > arr[largest]:
largest = right
# If largest is not root
if largest != i:
arr[i], arr[largest] = arr[largest], arr[i]
heapify(arr, n, largest)
def heap_sort_using_heapq(arr):
"""Heap sort using Python's heapq module"""
import heapq
# heapq creates min heap
heapq.heapify(arr)
return [heapq.heappop(arr) for _ in range(len(arr))]
# Example
arr = [12, 11, 13, 5, 6, 7]
print("Heap sort:", heap_sort(arr.copy()))PythonWhen to Use:
- Guaranteed O(n log n) performance
- Memory is limited (in-place)
- When finding k largest/smallest elements
- Priority queue applications
Non-Comparison Sorts
1. Counting Sort
Counts occurrences of each value (works for integers in known range).
def counting_sort(arr):
"""
Counting Sort - Count occurrences and reconstruct
Time: O(n + k) where k is range of input
Space: O(k)
Stable: Yes
Works for non-negative integers
"""
if not arr:
return arr
# Find range
max_val = max(arr)
min_val = min(arr)
range_size = max_val - min_val + 1
# Count occurrences
count = [0] * range_size
output = [0] * len(arr)
for num in arr:
count[num - min_val] += 1
# Cumulative count
for i in range(1, len(count)):
count[i] += count[i - 1]
# Build output array
for i in range(len(arr) - 1, -1, -1):
output[count[arr[i] - min_val] - 1] = arr[i]
count[arr[i] - min_val] -= 1
return output
def counting_sort_simple(arr):
"""Simpler version (not stable)"""
if not arr:
return arr
max_val = max(arr)
min_val = min(arr)
range_size = max_val - min_val + 1
count = [0] * range_size
# Count occurrences
for num in arr:
count[num - min_val] += 1
# Reconstruct array
result = []
for i, freq in enumerate(count):
result.extend([i + min_val] * freq)
return result
# Example
arr = [4, 2, 2, 8, 3, 3, 1]
print("Counting sort:", counting_sort(arr))PythonWhen to Use:
- Small range of integers
- Need O(n) time complexity
- Stable sort required
- As subroutine in radix sort
2. Radix Sort
Sorts numbers digit by digit (uses counting sort as subroutine).
def radix_sort(arr):
"""
Radix Sort - Sort by each digit
Time: O(d * (n + k)) where d is digits, k is base
Space: O(n + k)
Stable: Yes
Works for non-negative integers
"""
if not arr:
return arr
# Find maximum number to know number of digits
max_val = max(arr)
# Do counting sort for every digit
exp = 1
while max_val // exp > 0:
counting_sort_by_digit(arr, exp)
exp *= 10
return arr
def counting_sort_by_digit(arr, exp):
"""Counting sort based on digit represented by exp"""
n = len(arr)
output = [0] * n
count = [0] * 10
# Count occurrences
for num in arr:
digit = (num // exp) % 10
count[digit] += 1
# Cumulative count
for i in range(1, 10):
count[i] += count[i - 1]
# Build output array
for i in range(n - 1, -1, -1):
digit = (arr[i] // exp) % 10
output[count[digit] - 1] = arr[i]
count[digit] -= 1
# Copy output to arr
for i in range(n):
arr[i] = output[i]
def radix_sort_negative(arr):
"""Radix sort that handles negative numbers"""
if not arr:
return arr
# Separate positive and negative
positive = [x for x in arr if x >= 0]
negative = [-x for x in arr if x < 0]
# Sort both
radix_sort(positive)
radix_sort(negative)
# Combine (negatives reversed)
return [-x for x in reversed(negative)] + positive
# Example
arr = [170, 45, 75, 90, 802, 24, 2, 66]
print("Radix sort:", radix_sort(arr.copy()))PythonWhen to Use:
- Sorting integers or strings
- Fixed-length keys
- Large number of elements
- O(n) time needed
3. Bucket Sort
Distributes elements into buckets and sorts each bucket.
def bucket_sort(arr, bucket_size=5):
"""
Bucket Sort - Distribute into buckets and sort
Time: O(n + k) average, O(n²) worst
Space: O(n + k)
Stable: Depends on sorting algorithm used
Works best for uniformly distributed data
"""
if not arr:
return arr
# Find range
min_val, max_val = min(arr), max(arr)
# Create buckets
bucket_count = (max_val - min_val) // bucket_size + 1
buckets = [[] for _ in range(bucket_count)]
# Distribute elements into buckets
for num in arr:
bucket_idx = (num - min_val) // bucket_size
buckets[bucket_idx].append(num)
# Sort each bucket and concatenate
result = []
for bucket in buckets:
result.extend(sorted(bucket)) # Can use any sorting algorithm
return result
def bucket_sort_floats(arr):
"""Bucket sort for floats in range [0, 1)"""
n = len(arr)
buckets = [[] for _ in range(n)]
# Distribute into buckets
for num in arr:
bucket_idx = int(num * n)
if bucket_idx == n: # Handle edge case for 1.0
bucket_idx = n - 1
buckets[bucket_idx].append(num)
# Sort each bucket
for bucket in buckets:
insertion_sort(bucket)
# Concatenate
result = []
for bucket in buckets:
result.extend(bucket)
return result
# Example
arr = [0.897, 0.565, 0.656, 0.1234, 0.665, 0.3434]
print("Bucket sort:", bucket_sort_floats(arr))PythonWhen to Use:
- Uniformly distributed data
- Floating-point numbers in range
- Need O(n) average time
- Can parallelize easily
Python Built-in Sorting
1. sorted() Function
# Basic usage
arr = [64, 34, 25, 12, 22, 11, 90]
sorted_arr = sorted(arr) # Returns new sorted list
print(f"Original: {arr}")
print(f"Sorted: {sorted_arr}")
# Reverse order
reverse_sorted = sorted(arr, reverse=True)
print(f"Descending: {reverse_sorted}")
# Sort with key function
words = ['banana', 'pie', 'Washington', 'book']
sorted_words = sorted(words, key=str.lower) # Case-insensitive
print(f"Case-insensitive: {sorted_words}")
# Sort by length
sorted_by_len = sorted(words, key=len)
print(f"By length: {sorted_by_len}")
# Sort complex objects
students = [
{'name': 'John', 'grade': 90, 'age': 20},
{'name': 'Jane', 'grade': 95, 'age': 19},
{'name': 'Dave', 'grade': 85, 'age': 21}
]
# Sort by grade
sorted_by_grade = sorted(students, key=lambda x: x['grade'], reverse=True)
print("By grade:", [s['name'] for s in sorted_by_grade])
# Multiple sort keys
from operator import itemgetter
sorted_multi = sorted(students, key=itemgetter('age', 'grade'))
print("By age then grade:", [s['name'] for s in sorted_multi])Python2. list.sort() Method
# In-place sorting
arr = [64, 34, 25, 12, 22, 11, 90]
arr.sort() # Modifies original list
print(f"Sorted in-place: {arr}")
# With parameters
arr = [64, 34, 25, 12, 22, 11, 90]
arr.sort(reverse=True)
print(f"Descending: {arr}")
# Custom comparison
class Student:
def __init__(self, name, grade):
self.name = name
self.grade = grade
def __repr__(self):
return f"{self.name}({self.grade})"
students = [
Student('John', 90),
Student('Jane', 95),
Student('Dave', 85)
]
students.sort(key=lambda s: s.grade, reverse=True)
print("Students by grade:", students)Python3. Timsort (Python’s Default Algorithm)
Python’s built-in sort uses Timsort, a hybrid stable sorting algorithm derived from merge sort and insertion sort.
"""
Timsort Properties:
- Time: O(n log n) worst case, O(n) best case
- Space: O(n)
- Stable: Yes
- Adaptive: Yes (fast on partially sorted data)
Key Features:
1. Identifies "runs" (already sorted subsequences)
2. Uses insertion sort for small runs
3. Merges runs using merge sort
4. Optimized for real-world data
Why it's great:
- Excellent performance on real-world data
- Handles partially sorted data very well
- Stable (maintains order of equal elements)
- Used by Python, Java, Android, Swift
"""
def timsort_demo():
"""Demonstrate Timsort's efficiency on different data"""
import time
# Nearly sorted data (Timsort excels here)
nearly_sorted = list(range(10000))
nearly_sorted[100], nearly_sorted[500] = nearly_sorted[500], nearly_sorted[100]
start = time.time()
sorted(nearly_sorted)
print(f"Nearly sorted: {time.time() - start:.4f}s")
# Random data
import random
random_data = [random.randint(1, 10000) for _ in range(10000)]
start = time.time()
sorted(random_data)
print(f"Random data: {time.time() - start:.4f}s")
timsort_demo()Python4. Advanced Sorting with functools
from functools import cmp_to_key
# Custom comparison function
def compare(x, y):
"""
Compare function:
Return -1 if x < y
Return 0 if x == y
Return 1 if x > y
"""
if x < y:
return -1
elif x > y:
return 1
else:
return 0
# Sort using comparison function
arr = [5, 2, 8, 1, 9]
sorted_arr = sorted(arr, key=cmp_to_key(compare))
print(f"With cmp_to_key: {sorted_arr}")
# Complex comparison
def compare_complex(item1, item2):
"""Compare by length first, then alphabetically"""
if len(item1) != len(item2):
return len(item1) - len(item2)
if item1 < item2:
return -1
elif item1 > item2:
return 1
return 0
words = ['banana', 'pie', 'apple', 'cherry', 'a']
sorted_words = sorted(words, key=cmp_to_key(compare_complex))
print(f"Complex sort: {sorted_words}")PythonSpecialized Sorting Algorithms
1. Shell Sort
Generalization of insertion sort that allows exchange of far items.
def shell_sort(arr):
"""
Shell Sort - Insertion sort with gap
Time: O(n²) worst, better in practice
Space: O(1)
Stable: No
Adaptive: Yes
"""
n = len(arr)
gap = n // 2
while gap > 0:
for i in range(gap, n):
temp = arr[i]
j = i
while j >= gap and arr[j - gap] > temp:
arr[j] = arr[j - gap]
j -= gap
arr[j] = temp
gap //= 2
return arr
# Example
arr = [12, 34, 54, 2, 3]
print("Shell sort:", shell_sort(arr.copy()))Python2. Comb Sort
Improvement over bubble sort with gap.
def comb_sort(arr):
"""
Comb Sort - Bubble sort with shrinking gap
Time: O(n²) worst, O(n log n) average
Space: O(1)
Stable: No
"""
n = len(arr)
gap = n
shrink = 1.3
sorted_flag = False
while not sorted_flag:
gap = int(gap / shrink)
if gap <= 1:
gap = 1
sorted_flag = True
i = 0
while i + gap < n:
if arr[i] > arr[i + gap]:
arr[i], arr[i + gap] = arr[i + gap], arr[i]
sorted_flag = False
i += 1
return arr
# Example
arr = [8, 4, 1, 56, 3, -44, 23, -6, 28, 0]
print("Comb sort:", comb_sort(arr.copy()))Python3. Cycle Sort
Minimizes number of writes (useful for flash memory).
def cycle_sort(arr):
"""
Cycle Sort - Minimize number of writes
Time: O(n²)
Space: O(1)
Stable: No
Useful when writing to memory is expensive
"""
writes = 0
for cycle_start in range(len(arr) - 1):
item = arr[cycle_start]
# Find position where we put the item
pos = cycle_start
for i in range(cycle_start + 1, len(arr)):
if arr[i] < item:
pos += 1
# If item is already in correct position
if pos == cycle_start:
continue
# Skip duplicates
while item == arr[pos]:
pos += 1
# Put item to its right position
arr[pos], item = item, arr[pos]
writes += 1
# Rotate rest of the cycle
while pos != cycle_start:
pos = cycle_start
for i in range(cycle_start + 1, len(arr)):
if arr[i] < item:
pos += 1
while item == arr[pos]:
pos += 1
arr[pos], item = item, arr[pos]
writes += 1
return arr, writes
# Example
arr = [1, 8, 3, 9, 10, 10, 2, 4]
sorted_arr, write_count = cycle_sort(arr.copy())
print(f"Cycle sort: {sorted_arr}, writes: {write_count}")Python4. Pancake Sort
Sorting by flipping prefixes.
def pancake_sort(arr):
"""
Pancake Sort - Sort by reversing prefixes
Time: O(n²)
Space: O(1)
Stable: No
Fun algorithm, not practical
"""
def flip(arr, k):
"""Reverse arr[0:k+1]"""
left, right = 0, k
while left < right:
arr[left], arr[right] = arr[right], arr[left]
left += 1
right -= 1
n = len(arr)
for curr_size in range(n, 1, -1):
# Find index of maximum in arr[0:curr_size]
max_idx = arr.index(max(arr[0:curr_size]))
# Move maximum to beginning
if max_idx != 0:
flip(arr, max_idx)
# Move maximum to correct position
flip(arr, curr_size - 1)
return arr
# Example
arr = [23, 10, 20, 11, 12, 6, 7]
print("Pancake sort:", pancake_sort(arr.copy()))PythonSorting Applications
1. Sort Objects by Multiple Criteria
from operator import attrgetter, itemgetter
class Employee:
def __init__(self, name, age, salary):
self.name = name
self.age = age
self.salary = salary
def __repr__(self):
return f"Employee({self.name}, {self.age}, ${self.salary})"
employees = [
Employee('Alice', 30, 70000),
Employee('Bob', 25, 60000),
Employee('Charlie', 30, 80000),
Employee('Diana', 25, 75000)
]
# Sort by age, then by salary (descending)
sorted_emp = sorted(employees, key=lambda e: (e.age, -e.salary))
print("By age, then -salary:", sorted_emp)
# Using attrgetter
sorted_emp2 = sorted(employees, key=attrgetter('age', 'salary'))
print("Using attrgetter:", sorted_emp2)Python2. Stable Sort for Maintaining Order
# Stability example
data = [
('Alice', 25),
('Bob', 30),
('Charlie', 25),
('Diana', 30)
]
# Sort by age only - stable sort preserves original order for equal keys
sorted_data = sorted(data, key=lambda x: x[1])
print("Stable sort:", sorted_data)
# Output: [('Alice', 25), ('Charlie', 25), ('Bob', 30), ('Diana', 30)]
# Alice before Charlie, Bob before Diana (preserved)Python3. Custom Sort for Special Requirements
def sort_by_custom_order(arr, order):
"""Sort array based on custom ordering"""
order_map = {val: idx for idx, val in enumerate(order)}
return sorted(arr, key=lambda x: order_map.get(x, float('inf')))
# Example: Sort by custom priority
items = ['banana', 'apple', 'cherry', 'date']
priority = ['cherry', 'apple', 'date', 'banana']
sorted_items = sort_by_custom_order(items, priority)
print("Custom order:", sorted_items)
def sort_version_numbers(versions):
"""Sort version numbers (e.g., '1.2.10' vs '1.2.9')"""
def version_key(version):
return tuple(map(int, version.split('.')))
return sorted(versions, key=version_key)
versions = ['1.2.10', '1.2.9', '1.10.1', '1.2.1']
print("Sorted versions:", sort_version_numbers(versions))Python4. Sorting with Missing or None Values
def sort_with_none(arr, none_last=True):
"""Sort array with None values"""
if none_last:
return sorted(arr, key=lambda x: (x is None, x))
else:
return sorted(arr, key=lambda x: (x is not None, x))
data = [3, None, 1, None, 2, 5]
print("None last:", sort_with_none(data, none_last=True))
print("None first:", sort_with_none(data, none_last=False))Python5. Topological Sort Applications
def topological_sort(tasks):
"""
Topological sort for task scheduling
tasks: dict mapping task to list of dependencies
"""
from collections import defaultdict, deque
# Build graph and calculate in-degrees
graph = defaultdict(list)
in_degree = defaultdict(int)
for task, deps in tasks.items():
for dep in deps:
graph[dep].append(task)
in_degree[task] += 1
if task not in in_degree:
in_degree[task] = 0
# Find tasks with no dependencies
queue = deque([task for task in in_degree if in_degree[task] == 0])
result = []
while queue:
task = queue.popleft()
result.append(task)
for next_task in graph[task]:
in_degree[next_task] -= 1
if in_degree[next_task] == 0:
queue.append(next_task)
return result if len(result) == len(in_degree) else None
# Example
tasks = {
'A': [],
'B': ['A'],
'C': ['A'],
'D': ['B', 'C'],
'E': ['D']
}
print("Task order:", topological_sort(tasks))PythonPerformance Analysis
Complexity Comparison Table
| Algorithm | Best | Average | Worst | Space | Stable | Method |
|---|---|---|---|---|---|---|
| Bubble Sort | O(n) | O(n²) | O(n²) | O(1) | Yes | Comparison |
| Selection Sort | O(n²) | O(n²) | O(n²) | O(1) | No | Comparison |
| Insertion Sort | O(n) | O(n²) | O(n²) | O(1) | Yes | Comparison |
| Merge Sort | O(n log n) | O(n log n) | O(n log n) | O(n) | Yes | Comparison |
| Quick Sort | O(n log n) | O(n log n) | O(n²) | O(log n) | No | Comparison |
| Heap Sort | O(n log n) | O(n log n) | O(n log n) | O(1) | No | Comparison |
| Counting Sort | O(n+k) | O(n+k) | O(n+k) | O(k) | Yes | Non-comparison |
| Radix Sort | O(nk) | O(nk) | O(nk) | O(n+k) | Yes | Non-comparison |
| Bucket Sort | O(n+k) | O(n+k) | O(n²) | O(n) | Yes* | Non-comparison |
| Timsort | O(n) | O(n log n) | O(n log n) | O(n) | Yes | Hybrid |
*Depends on sorting algorithm used for buckets
Benchmark Code
import time
import random
def benchmark_sort(sort_func, arr, name):
"""Benchmark a sorting algorithm"""
arr_copy = arr.copy()
start = time.time()
sort_func(arr_copy)
end = time.time()
return end - start
def run_benchmarks(size=10000):
"""Run comprehensive benchmarks"""
print(f"Benchmarking with {size} elements\n")
# Different data patterns
patterns = {
'Random': [random.randint(1, size) for _ in range(size)],
'Sorted': list(range(size)),
'Reverse': list(range(size, 0, -1)),
'Nearly Sorted': list(range(size)),
'Many Duplicates': [random.randint(1, 100) for _ in range(size)]
}
# Modify nearly sorted
for i in range(0, size, 100):
if i < size - 1:
patterns['Nearly Sorted'][i], patterns['Nearly Sorted'][i+1] = \
patterns['Nearly Sorted'][i+1], patterns['Nearly Sorted'][i]
# Algorithms to test
algorithms = {
'Bubble Sort': bubble_sort,
'Selection Sort': selection_sort,
'Insertion Sort': insertion_sort,
'Merge Sort': merge_sort,
'Quick Sort': quick_sort,
'Heap Sort': heap_sort,
'Python sorted()': lambda arr: sorted(arr)
}
# Run benchmarks
for pattern_name, data in patterns.items():
print(f"\n{pattern_name} data:")
print("-" * 50)
for algo_name, algo_func in algorithms.items():
try:
time_taken = benchmark_sort(algo_func, data[:1000], algo_name)
print(f"{algo_name:20}: {time_taken:.4f}s")
except Exception as e:
print(f"{algo_name:20}: Error - {e}")
# Run with small size for quick results
# run_benchmarks(1000)PythonWhen to Use Which Algorithm
def choose_sorting_algorithm(data_characteristics):
"""
Decision tree for choosing sorting algorithm
"""
recommendations = {
'small_size': {
'description': 'Array size < 50',
'algorithm': 'Insertion Sort',
'reason': 'Simple, efficient for small arrays, low overhead'
},
'nearly_sorted': {
'description': 'Data is nearly sorted',
'algorithm': 'Insertion Sort or Timsort',
'reason': 'O(n) performance on nearly sorted data'
},
'guaranteed_performance': {
'description': 'Need guaranteed O(n log n)',
'algorithm': 'Merge Sort or Heap Sort',
'reason': 'No worst-case O(n²) like Quick Sort'
},
'memory_constrained': {
'description': 'Limited memory (in-place required)',
'algorithm': 'Heap Sort or Quick Sort',
'reason': 'O(1) or O(log n) auxiliary space'
},
'stability_required': {
'description': 'Must preserve relative order',
'algorithm': 'Merge Sort or Timsort',
'reason': 'Stable sorting algorithms'
},
'integer_range': {
'description': 'Integers in small range',
'algorithm': 'Counting Sort',
'reason': 'O(n + k) linear time'
},
'general_purpose': {
'description': 'Mixed, real-world data',
'algorithm': 'Python sorted() (Timsort)',
'reason': 'Optimized for real-world patterns'
},
'linked_list': {
'description': 'Sorting linked list',
'algorithm': 'Merge Sort',
'reason': 'Efficient in-place for linked lists'
}
}
return recommendationsPythonBest Practices
1. Use Built-in Sort When Possible
# ✓ Recommended: Use Python's built-in sort
arr = [3, 1, 4, 1, 5, 9, 2, 6]
sorted_arr = sorted(arr) # Timsort - highly optimized
# ✗ Avoid: Implementing your own for production
# (unless you have very specific requirements)Python2. Choose Right Key Function
# ✓ Efficient key function
data = ['apple', 'Banana', 'cherry']
sorted_data = sorted(data, key=str.lower)
# ✗ Inefficient: Multiple transformations
# sorted_data = sorted(data, key=lambda x: (x.lower(), len(x), x[::-1]))
# Better: precompute if used multiple timesPython3. Understand Stability
# Stable sort preserves original order for equal keys
students = [
('Alice', 90),
('Bob', 85),
('Charlie', 90),
('Diana', 85)
]
# First sort by name (establishes order)
students.sort(key=lambda x: x[0])
# Then stable sort by grade (preserves name order within same grade)
students.sort(key=lambda x: x[1], reverse=True)
print(students)
# Students with same grade maintain alphabetical orderPython4. Profile Before Optimizing
import cProfile
import pstats
def profile_sort():
"""Profile sorting performance"""
data = [random.randint(1, 1000) for _ in range(10000)]
profiler = cProfile.Profile()
profiler.enable()
sorted(data)
profiler.disable()
stats = pstats.Stats(profiler)
stats.sort_stats('cumulative')
stats.print_stats(10)
# profile_sort()Python5. Type Hints for Clarity
from typing import List, TypeVar, Callable, Any
T = TypeVar('T')
def generic_sort(
arr: List[T],
key: Callable[[T], Any] = None,
reverse: bool = False
) -> List[T]:
"""
Generic sorting function with type hints
Args:
arr: List to sort
key: Optional key function
reverse: Sort in descending order if True
Returns:
Sorted list
"""
return sorted(arr, key=key, reverse=reverse)Python6. Error Handling
def safe_sort(arr, key=None, reverse=False):
"""Sort with error handling"""
try:
return sorted(arr, key=key, reverse=reverse)
except TypeError as e:
print(f"Cannot compare elements: {e}")
# Return original or handle appropriately
return arr
except Exception as e:
print(f"Unexpected error: {e}")
return arr
# Test with mixed types
mixed = [1, '2', 3, 'a']
print(safe_sort(mixed))Python7. Memory-Efficient Sorting
# For very large datasets, consider:
# 1. Generator expressions (don't create intermediate lists)
def sorted_items_generator(items):
"""Yield sorted items without storing all in memory"""
yield from sorted(items)
# 2. External sorting for data larger than memory
def external_merge_sort(file_path, chunk_size=1000000):
"""
Sort file larger than memory
1. Read chunks
2. Sort each chunk
3. Write to temp files
4. Merge temp files
"""
# Implementation would involve:
# - Reading chunks from file
# - Sorting each chunk
# - Writing sorted chunks to temp files
# - K-way merge of temp files
pass
# 3. In-place algorithms for memory constraints
arr = [3, 1, 4, 1, 5, 9, 2, 6]
arr.sort() # In-place, no extra list createdPythonSummary
Key Takeaways
- Know the Basics: Understand O(n²), O(n log n), and O(n) algorithms
- Use Built-in: Python’s
sorted()andlist.sort()are highly optimized - Timsort is Amazing: Excellent on real-world data, adaptive and stable
- Context Matters: Choose algorithm based on data characteristics
- Stability: Important when sorting by multiple criteria
- Space vs Time: Trade-offs between in-place and auxiliary space
Quick Decision Guide
Small data (< 50 elements):
→ Insertion Sort
Nearly sorted data:
→ Insertion Sort or Timsort
Need stability:
→ Merge Sort or Timsort
Memory limited:
→ Heap Sort or Quick Sort
Integers in small range:
→ Counting Sort
General purpose:
→ Python sorted() (Timsort)
Guaranteed O(n log n):
→ Merge Sort or Heap SortPythonPython Sorting Cheat Sheet
# Sort list in-place
arr.sort()
arr.sort(reverse=True)
arr.sort(key=lambda x: x[1])
# Return new sorted list
sorted_arr = sorted(arr)
sorted_arr = sorted(arr, reverse=True)
sorted_arr = sorted(arr, key=len)
# Multiple criteria
sorted_arr = sorted(arr, key=lambda x: (x.age, -x.salary))
# Custom comparison
from functools import cmp_to_key
sorted_arr = sorted(arr, key=cmp_to_key(compare_func))
# Operator functions
from operator import itemgetter, attrgetter
sorted_arr = sorted(arr, key=itemgetter(1, 2))
sorted_arr = sorted(arr, key=attrgetter('name', 'age'))PythonFurther Reading
- Sorting Networks
- Parallel Sorting Algorithms
- Cache-Aware Sorting
- External Sorting
- Approximate Sorting
- Online Sorting Algorithms
Practice Resources:
- LeetCode Sorting Problems
- HackerRank Sorting Challenges
- Sorting Algorithm Visualizations
- Algorithm Design Manual
Happy Sorting! 🔄
Discover more from Altgr Blog
Subscribe to get the latest posts sent to your email.
