Sorting Algorithms in Python – DSA

    Table of Contents

    1. Introduction to Sorting
    2. Sorting Algorithm Classification
    3. Simple Sorting Algorithms
    4. Efficient Sorting Algorithms
    5. Non-Comparison Sorts
    6. Python Built-in Sorting
    7. Specialized Sorting Algorithms
    8. Sorting Applications
    9. Performance Analysis
    10. Best Practices

    Introduction to Sorting

    Sorting is the process of arranging data in a particular order (ascending or descending). It’s one of the most fundamental operations in computer science.

    Why Sorting is Important

    • Search Efficiency: Binary search requires sorted data (O(log n) vs O(n))
    • Data Organization: Makes data easier to understand and process
    • Algorithm Prerequisites: Many algorithms require sorted input
    • Database Operations: Indexes, query optimization
    • User Experience: Display data in meaningful order

    Real-World Applications

    • E-commerce: Sort products by price, rating, popularity
    • Social Media: Sort posts by time, likes, relevance
    • File Systems: Sort files by name, date, size
    • Databases: ORDER BY queries
    • Search Engines: Rank search results
    • Banking: Transaction history

    Sorting Algorithm Classification

    By Time Complexity

    ClassAlgorithmsBest/Avg/Worst
    O(n²)Bubble, Selection, InsertionSimple, small datasets
    O(n log n)Merge, Heap, Quick (avg)General purpose
    O(n)Counting, Radix, BucketSpecial cases

    By Space Complexity

    • In-Place: O(1) extra space (Bubble, Selection, Insertion, Heap, Quick)
    • Not In-Place: O(n) extra space (Merge, Counting, Radix)

    By Stability

    • Stable: Preserves relative order of equal elements (Bubble, Insertion, Merge, Tim)
    • Unstable: May change relative order (Selection, Heap, Quick)

    By Method

    • Comparison-Based: Compare elements (most algorithms)
    • Non-Comparison: Use other properties (Counting, Radix, Bucket)

    By Adaptability

    • Adaptive: Faster on partially sorted data (Insertion, Bubble, Tim)
    • Non-Adaptive: Same speed regardless (Selection, Merge, Heap)

    Simple Sorting Algorithms

    1. Bubble Sort

    Repeatedly swaps adjacent elements if they’re in wrong order.

    def bubble_sort(arr):
        """
        Bubble Sort - Compare adjacent elements and swap
    
        Time: O(n²) - Best: O(n) if already sorted
        Space: O(1)
        Stable: Yes
        Adaptive: Yes (with optimization)
        """
        n = len(arr)
    
        for i in range(n):
            # Flag to detect if any swap happened
            swapped = False
    
            # Last i elements are already in place
            for j in range(0, n - i - 1):
                if arr[j] > arr[j + 1]:
                    arr[j], arr[j + 1] = arr[j + 1], arr[j]
                    swapped = True
    
            # If no swap occurred, array is sorted
            if not swapped:
                break
    
        return arr
    
    
    def bubble_sort_verbose(arr):
        """Bubble sort with step-by-step output"""
        n = len(arr)
        print(f"Initial array: {arr}")
    
        for i in range(n):
            swapped = False
    
            for j in range(0, n - i - 1):
                if arr[j] > arr[j + 1]:
                    arr[j], arr[j + 1] = arr[j + 1], arr[j]
                    swapped = True
    
            print(f"Pass {i + 1}: {arr}")
    
            if not swapped:
                print("Array is sorted!")
                break
    
        return arr
    
    
    # Example
    arr = [64, 34, 25, 12, 22, 11, 90]
    print("Sorted array:", bubble_sort(arr.copy()))
    Python

    When to Use:

    • Small datasets (< 10 elements)
    • Nearly sorted data
    • Educational purposes
    • Simple implementation needed

    2. Selection Sort

    Repeatedly finds minimum element and places it at the beginning.

    def selection_sort(arr):
        """
        Selection Sort - Select minimum and swap with first unsorted
    
        Time: O(n²) - Always
        Space: O(1)
        Stable: No
        Adaptive: No
        """
        n = len(arr)
    
        for i in range(n):
            # Find minimum element in remaining unsorted array
            min_idx = i
    
            for j in range(i + 1, n):
                if arr[j] < arr[min_idx]:
                    min_idx = j
    
            # Swap found minimum with first element
            arr[i], arr[min_idx] = arr[min_idx], arr[i]
    
        return arr
    
    
    def selection_sort_descending(arr):
        """Selection sort in descending order"""
        n = len(arr)
    
        for i in range(n):
            max_idx = i
    
            for j in range(i + 1, n):
                if arr[j] > arr[max_idx]:
                    max_idx = j
    
            arr[i], arr[max_idx] = arr[max_idx], arr[i]
    
        return arr
    
    
    # Example
    arr = [64, 25, 12, 22, 11]
    print("Selection sort:", selection_sort(arr.copy()))
    print("Descending:", selection_sort_descending(arr.copy()))
    Python

    When to Use:

    • Memory is limited (in-place)
    • Minimize number of swaps
    • Small datasets

    3. Insertion Sort

    Builds sorted array one element at a time by inserting elements in correct position.

    def insertion_sort(arr):
        """
        Insertion Sort - Insert each element at correct position
    
        Time: O(n²) - Best: O(n) if already sorted
        Space: O(1)
        Stable: Yes
        Adaptive: Yes
        """
        for i in range(1, len(arr)):
            key = arr[i]
            j = i - 1
    
            # Move elements greater than key one position ahead
            while j >= 0 and arr[j] > key:
                arr[j + 1] = arr[j]
                j -= 1
    
            arr[j + 1] = key
    
        return arr
    
    
    def insertion_sort_binary(arr):
        """Insertion sort with binary search for position"""
        import bisect
    
        for i in range(1, len(arr)):
            key = arr[i]
            # Find position using binary search
            pos = bisect.bisect_left(arr, key, 0, i)
            # Shift elements and insert
            arr = arr[:pos] + [key] + arr[pos:i] + arr[i+1:]
    
        return arr
    
    
    def insertion_sort_recursive(arr, n=None):
        """Recursive insertion sort"""
        if n is None:
            n = len(arr)
    
        if n <= 1:
            return arr
    
        # Sort first n-1 elements
        insertion_sort_recursive(arr, n - 1)
    
        # Insert last element at correct position
        last = arr[n - 1]
        j = n - 2
    
        while j >= 0 and arr[j] > last:
            arr[j + 1] = arr[j]
            j -= 1
    
        arr[j + 1] = last
        return arr
    
    
    # Example
    arr = [12, 11, 13, 5, 6]
    print("Insertion sort:", insertion_sort(arr.copy()))
    Python

    When to Use:

    • Small datasets
    • Nearly sorted data (very efficient!)
    • Online sorting (sort as data arrives)
    • Part of more complex algorithms (Timsort)

    Efficient Sorting Algorithms

    1. Merge Sort

    Divide-and-conquer algorithm that divides array into halves, sorts them, and merges.

    def merge_sort(arr):
        """
        Merge Sort - Divide and conquer
    
        Time: O(n log n) - Always
        Space: O(n)
        Stable: Yes
        Adaptive: No
        """
        if len(arr) <= 1:
            return arr
    
        # Divide
        mid = len(arr) // 2
        left = merge_sort(arr[:mid])
        right = merge_sort(arr[mid:])
    
        # Conquer (merge)
        return merge(left, right)
    
    
    def merge(left, right):
        """Merge two sorted arrays"""
        result = []
        i = j = 0
    
        # Compare elements from left and right
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                result.append(left[i])
                i += 1
            else:
                result.append(right[j])
                j += 1
    
        # Append remaining elements
        result.extend(left[i:])
        result.extend(right[j:])
    
        return result
    
    
    def merge_sort_inplace(arr, left=0, right=None):
        """In-place merge sort (still O(n) auxiliary space for merge)"""
        if right is None:
            right = len(arr) - 1
    
        if left < right:
            mid = (left + right) // 2
    
            merge_sort_inplace(arr, left, mid)
            merge_sort_inplace(arr, mid + 1, right)
            merge_inplace(arr, left, mid, right)
    
        return arr
    
    
    def merge_inplace(arr, left, mid, right):
        """Merge for in-place merge sort"""
        # Create temporary arrays
        left_arr = arr[left:mid + 1]
        right_arr = arr[mid + 1:right + 1]
    
        i = j = 0
        k = left
    
        while i < len(left_arr) and j < len(right_arr):
            if left_arr[i] <= right_arr[j]:
                arr[k] = left_arr[i]
                i += 1
            else:
                arr[k] = right_arr[j]
                j += 1
            k += 1
    
        while i < len(left_arr):
            arr[k] = left_arr[i]
            i += 1
            k += 1
    
        while j < len(right_arr):
            arr[k] = right_arr[j]
            j += 1
            k += 1
    
    
    # Example
    arr = [38, 27, 43, 3, 9, 82, 10]
    print("Merge sort:", merge_sort(arr))
    Python

    When to Use:

    • Guaranteed O(n log n) performance
    • Stable sort needed
    • Linked lists (efficient in-place for linked lists)
    • External sorting (large datasets on disk)

    2. Quick Sort

    Divide-and-conquer using pivot element to partition array.

    def quick_sort(arr):
        """
        Quick Sort - Partition around pivot
    
        Time: O(n log n) avg, O(n²) worst
        Space: O(log n) recursion stack
        Stable: No
        Adaptive: No
        """
        if len(arr) <= 1:
            return arr
    
        pivot = arr[len(arr) // 2]
        left = [x for x in arr if x < pivot]
        middle = [x for x in arr if x == pivot]
        right = [x for x in arr if x > pivot]
    
        return quick_sort(left) + middle + quick_sort(right)
    
    
    def quick_sort_inplace(arr, low=0, high=None):
        """In-place quick sort"""
        if high is None:
            high = len(arr) - 1
    
        if low < high:
            # Partition and get pivot index
            pi = partition(arr, low, high)
    
            # Recursively sort elements before and after partition
            quick_sort_inplace(arr, low, pi - 1)
            quick_sort_inplace(arr, pi + 1, high)
    
        return arr
    
    
    def partition(arr, low, high):
        """Partition array around pivot"""
        pivot = arr[high]  # Choose last element as pivot
        i = low - 1  # Index of smaller element
    
        for j in range(low, high):
            if arr[j] <= pivot:
                i += 1
                arr[i], arr[j] = arr[j], arr[i]
    
        arr[i + 1], arr[high] = arr[high], arr[i + 1]
        return i + 1
    
    
    def quick_sort_random_pivot(arr, low=0, high=None):
        """Quick sort with random pivot (avoids worst case)"""
        import random
    
        if high is None:
            high = len(arr) - 1
    
        if low < high:
            pi = partition_random(arr, low, high)
            quick_sort_random_pivot(arr, low, pi - 1)
            quick_sort_random_pivot(arr, pi + 1, high)
    
        return arr
    
    
    def partition_random(arr, low, high):
        """Partition with random pivot"""
        import random
    
        # Random pivot
        pivot_idx = random.randint(low, high)
        arr[pivot_idx], arr[high] = arr[high], arr[pivot_idx]
    
        return partition(arr, low, high)
    
    
    def quick_sort_three_way(arr, low=0, high=None):
        """
        Three-way quick sort (efficient for many duplicates)
        Partitions into: < pivot, = pivot, > pivot
        """
        if high is None:
            high = len(arr) - 1
    
        if low >= high:
            return arr
    
        lt, gt = low, high
        pivot = arr[low]
        i = low
    
        while i <= gt:
            if arr[i] < pivot:
                arr[lt], arr[i] = arr[i], arr[lt]
                lt += 1
                i += 1
            elif arr[i] > pivot:
                arr[i], arr[gt] = arr[gt], arr[i]
                gt -= 1
            else:
                i += 1
    
        quick_sort_three_way(arr, low, lt - 1)
        quick_sort_three_way(arr, gt + 1, high)
    
        return arr
    
    
    # Example
    arr = [10, 7, 8, 9, 1, 5]
    print("Quick sort:", quick_sort(arr.copy()))
    print("In-place:", quick_sort_inplace(arr.copy()))
    Python

    When to Use:

    • Average case O(n log n) is very fast (cache-friendly)
    • In-place sorting with O(log n) space
    • When stability is not required
    • General-purpose sorting

    3. Heap Sort

    Uses binary heap data structure to sort.

    def heap_sort(arr):
        """
        Heap Sort - Build max heap and extract elements
    
        Time: O(n log n) - Always
        Space: O(1)
        Stable: No
        Adaptive: No
        """
        n = len(arr)
    
        # Build max heap
        for i in range(n // 2 - 1, -1, -1):
            heapify(arr, n, i)
    
        # Extract elements from heap one by one
        for i in range(n - 1, 0, -1):
            arr[0], arr[i] = arr[i], arr[0]  # Swap
            heapify(arr, i, 0)
    
        return arr
    
    
    def heapify(arr, n, i):
        """Heapify subtree rooted at index i"""
        largest = i
        left = 2 * i + 1
        right = 2 * i + 2
    
        # Check if left child is larger
        if left < n and arr[left] > arr[largest]:
            largest = left
    
        # Check if right child is larger
        if right < n and arr[right] > arr[largest]:
            largest = right
    
        # If largest is not root
        if largest != i:
            arr[i], arr[largest] = arr[largest], arr[i]
            heapify(arr, n, largest)
    
    
    def heap_sort_using_heapq(arr):
        """Heap sort using Python's heapq module"""
        import heapq
    
        # heapq creates min heap
        heapq.heapify(arr)
        return [heapq.heappop(arr) for _ in range(len(arr))]
    
    
    # Example
    arr = [12, 11, 13, 5, 6, 7]
    print("Heap sort:", heap_sort(arr.copy()))
    Python

    When to Use:

    • Guaranteed O(n log n) performance
    • Memory is limited (in-place)
    • When finding k largest/smallest elements
    • Priority queue applications

    Non-Comparison Sorts

    1. Counting Sort

    Counts occurrences of each value (works for integers in known range).

    def counting_sort(arr):
        """
        Counting Sort - Count occurrences and reconstruct
    
        Time: O(n + k) where k is range of input
        Space: O(k)
        Stable: Yes
    
        Works for non-negative integers
        """
        if not arr:
            return arr
    
        # Find range
        max_val = max(arr)
        min_val = min(arr)
        range_size = max_val - min_val + 1
    
        # Count occurrences
        count = [0] * range_size
        output = [0] * len(arr)
    
        for num in arr:
            count[num - min_val] += 1
    
        # Cumulative count
        for i in range(1, len(count)):
            count[i] += count[i - 1]
    
        # Build output array
        for i in range(len(arr) - 1, -1, -1):
            output[count[arr[i] - min_val] - 1] = arr[i]
            count[arr[i] - min_val] -= 1
    
        return output
    
    
    def counting_sort_simple(arr):
        """Simpler version (not stable)"""
        if not arr:
            return arr
    
        max_val = max(arr)
        min_val = min(arr)
        range_size = max_val - min_val + 1
    
        count = [0] * range_size
    
        # Count occurrences
        for num in arr:
            count[num - min_val] += 1
    
        # Reconstruct array
        result = []
        for i, freq in enumerate(count):
            result.extend([i + min_val] * freq)
    
        return result
    
    
    # Example
    arr = [4, 2, 2, 8, 3, 3, 1]
    print("Counting sort:", counting_sort(arr))
    Python

    When to Use:

    • Small range of integers
    • Need O(n) time complexity
    • Stable sort required
    • As subroutine in radix sort

    2. Radix Sort

    Sorts numbers digit by digit (uses counting sort as subroutine).

    def radix_sort(arr):
        """
        Radix Sort - Sort by each digit
    
        Time: O(d * (n + k)) where d is digits, k is base
        Space: O(n + k)
        Stable: Yes
    
        Works for non-negative integers
        """
        if not arr:
            return arr
    
        # Find maximum number to know number of digits
        max_val = max(arr)
    
        # Do counting sort for every digit
        exp = 1
        while max_val // exp > 0:
            counting_sort_by_digit(arr, exp)
            exp *= 10
    
        return arr
    
    
    def counting_sort_by_digit(arr, exp):
        """Counting sort based on digit represented by exp"""
        n = len(arr)
        output = [0] * n
        count = [0] * 10
    
        # Count occurrences
        for num in arr:
            digit = (num // exp) % 10
            count[digit] += 1
    
        # Cumulative count
        for i in range(1, 10):
            count[i] += count[i - 1]
    
        # Build output array
        for i in range(n - 1, -1, -1):
            digit = (arr[i] // exp) % 10
            output[count[digit] - 1] = arr[i]
            count[digit] -= 1
    
        # Copy output to arr
        for i in range(n):
            arr[i] = output[i]
    
    
    def radix_sort_negative(arr):
        """Radix sort that handles negative numbers"""
        if not arr:
            return arr
    
        # Separate positive and negative
        positive = [x for x in arr if x >= 0]
        negative = [-x for x in arr if x < 0]
    
        # Sort both
        radix_sort(positive)
        radix_sort(negative)
    
        # Combine (negatives reversed)
        return [-x for x in reversed(negative)] + positive
    
    
    # Example
    arr = [170, 45, 75, 90, 802, 24, 2, 66]
    print("Radix sort:", radix_sort(arr.copy()))
    Python

    When to Use:

    • Sorting integers or strings
    • Fixed-length keys
    • Large number of elements
    • O(n) time needed

    3. Bucket Sort

    Distributes elements into buckets and sorts each bucket.

    def bucket_sort(arr, bucket_size=5):
        """
        Bucket Sort - Distribute into buckets and sort
    
        Time: O(n + k) average, O(n²) worst
        Space: O(n + k)
        Stable: Depends on sorting algorithm used
    
        Works best for uniformly distributed data
        """
        if not arr:
            return arr
    
        # Find range
        min_val, max_val = min(arr), max(arr)
    
        # Create buckets
        bucket_count = (max_val - min_val) // bucket_size + 1
        buckets = [[] for _ in range(bucket_count)]
    
        # Distribute elements into buckets
        for num in arr:
            bucket_idx = (num - min_val) // bucket_size
            buckets[bucket_idx].append(num)
    
        # Sort each bucket and concatenate
        result = []
        for bucket in buckets:
            result.extend(sorted(bucket))  # Can use any sorting algorithm
    
        return result
    
    
    def bucket_sort_floats(arr):
        """Bucket sort for floats in range [0, 1)"""
        n = len(arr)
        buckets = [[] for _ in range(n)]
    
        # Distribute into buckets
        for num in arr:
            bucket_idx = int(num * n)
            if bucket_idx == n:  # Handle edge case for 1.0
                bucket_idx = n - 1
            buckets[bucket_idx].append(num)
    
        # Sort each bucket
        for bucket in buckets:
            insertion_sort(bucket)
    
        # Concatenate
        result = []
        for bucket in buckets:
            result.extend(bucket)
    
        return result
    
    
    # Example
    arr = [0.897, 0.565, 0.656, 0.1234, 0.665, 0.3434]
    print("Bucket sort:", bucket_sort_floats(arr))
    Python

    When to Use:

    • Uniformly distributed data
    • Floating-point numbers in range
    • Need O(n) average time
    • Can parallelize easily

    Python Built-in Sorting

    1. sorted() Function

    # Basic usage
    arr = [64, 34, 25, 12, 22, 11, 90]
    sorted_arr = sorted(arr)  # Returns new sorted list
    print(f"Original: {arr}")
    print(f"Sorted: {sorted_arr}")
    
    # Reverse order
    reverse_sorted = sorted(arr, reverse=True)
    print(f"Descending: {reverse_sorted}")
    
    # Sort with key function
    words = ['banana', 'pie', 'Washington', 'book']
    sorted_words = sorted(words, key=str.lower)  # Case-insensitive
    print(f"Case-insensitive: {sorted_words}")
    
    # Sort by length
    sorted_by_len = sorted(words, key=len)
    print(f"By length: {sorted_by_len}")
    
    # Sort complex objects
    students = [
        {'name': 'John', 'grade': 90, 'age': 20},
        {'name': 'Jane', 'grade': 95, 'age': 19},
        {'name': 'Dave', 'grade': 85, 'age': 21}
    ]
    
    # Sort by grade
    sorted_by_grade = sorted(students, key=lambda x: x['grade'], reverse=True)
    print("By grade:", [s['name'] for s in sorted_by_grade])
    
    # Multiple sort keys
    from operator import itemgetter
    sorted_multi = sorted(students, key=itemgetter('age', 'grade'))
    print("By age then grade:", [s['name'] for s in sorted_multi])
    Python

    2. list.sort() Method

    # In-place sorting
    arr = [64, 34, 25, 12, 22, 11, 90]
    arr.sort()  # Modifies original list
    print(f"Sorted in-place: {arr}")
    
    # With parameters
    arr = [64, 34, 25, 12, 22, 11, 90]
    arr.sort(reverse=True)
    print(f"Descending: {arr}")
    
    # Custom comparison
    class Student:
        def __init__(self, name, grade):
            self.name = name
            self.grade = grade
    
        def __repr__(self):
            return f"{self.name}({self.grade})"
    
    students = [
        Student('John', 90),
        Student('Jane', 95),
        Student('Dave', 85)
    ]
    
    students.sort(key=lambda s: s.grade, reverse=True)
    print("Students by grade:", students)
    Python

    3. Timsort (Python’s Default Algorithm)

    Python’s built-in sort uses Timsort, a hybrid stable sorting algorithm derived from merge sort and insertion sort.

    """
    Timsort Properties:
    - Time: O(n log n) worst case, O(n) best case
    - Space: O(n)
    - Stable: Yes
    - Adaptive: Yes (fast on partially sorted data)
    
    Key Features:
    1. Identifies "runs" (already sorted subsequences)
    2. Uses insertion sort for small runs
    3. Merges runs using merge sort
    4. Optimized for real-world data
    
    Why it's great:
    - Excellent performance on real-world data
    - Handles partially sorted data very well
    - Stable (maintains order of equal elements)
    - Used by Python, Java, Android, Swift
    """
    
    def timsort_demo():
        """Demonstrate Timsort's efficiency on different data"""
        import time
    
        # Nearly sorted data (Timsort excels here)
        nearly_sorted = list(range(10000))
        nearly_sorted[100], nearly_sorted[500] = nearly_sorted[500], nearly_sorted[100]
    
        start = time.time()
        sorted(nearly_sorted)
        print(f"Nearly sorted: {time.time() - start:.4f}s")
    
        # Random data
        import random
        random_data = [random.randint(1, 10000) for _ in range(10000)]
    
        start = time.time()
        sorted(random_data)
        print(f"Random data: {time.time() - start:.4f}s")
    
    timsort_demo()
    Python

    4. Advanced Sorting with functools

    from functools import cmp_to_key
    
    # Custom comparison function
    def compare(x, y):
        """
        Compare function:
        Return -1 if x < y
        Return 0 if x == y
        Return 1 if x > y
        """
        if x < y:
            return -1
        elif x > y:
            return 1
        else:
            return 0
    
    # Sort using comparison function
    arr = [5, 2, 8, 1, 9]
    sorted_arr = sorted(arr, key=cmp_to_key(compare))
    print(f"With cmp_to_key: {sorted_arr}")
    
    # Complex comparison
    def compare_complex(item1, item2):
        """Compare by length first, then alphabetically"""
        if len(item1) != len(item2):
            return len(item1) - len(item2)
        if item1 < item2:
            return -1
        elif item1 > item2:
            return 1
        return 0
    
    words = ['banana', 'pie', 'apple', 'cherry', 'a']
    sorted_words = sorted(words, key=cmp_to_key(compare_complex))
    print(f"Complex sort: {sorted_words}")
    Python

    Specialized Sorting Algorithms

    1. Shell Sort

    Generalization of insertion sort that allows exchange of far items.

    def shell_sort(arr):
        """
        Shell Sort - Insertion sort with gap
    
        Time: O(n²) worst, better in practice
        Space: O(1)
        Stable: No
        Adaptive: Yes
        """
        n = len(arr)
        gap = n // 2
    
        while gap > 0:
            for i in range(gap, n):
                temp = arr[i]
                j = i
    
                while j >= gap and arr[j - gap] > temp:
                    arr[j] = arr[j - gap]
                    j -= gap
    
                arr[j] = temp
    
            gap //= 2
    
        return arr
    
    
    # Example
    arr = [12, 34, 54, 2, 3]
    print("Shell sort:", shell_sort(arr.copy()))
    Python

    2. Comb Sort

    Improvement over bubble sort with gap.

    def comb_sort(arr):
        """
        Comb Sort - Bubble sort with shrinking gap
    
        Time: O(n²) worst, O(n log n) average
        Space: O(1)
        Stable: No
        """
        n = len(arr)
        gap = n
        shrink = 1.3
        sorted_flag = False
    
        while not sorted_flag:
            gap = int(gap / shrink)
    
            if gap <= 1:
                gap = 1
                sorted_flag = True
    
            i = 0
            while i + gap < n:
                if arr[i] > arr[i + gap]:
                    arr[i], arr[i + gap] = arr[i + gap], arr[i]
                    sorted_flag = False
                i += 1
    
        return arr
    
    
    # Example
    arr = [8, 4, 1, 56, 3, -44, 23, -6, 28, 0]
    print("Comb sort:", comb_sort(arr.copy()))
    Python

    3. Cycle Sort

    Minimizes number of writes (useful for flash memory).

    def cycle_sort(arr):
        """
        Cycle Sort - Minimize number of writes
    
        Time: O(n²)
        Space: O(1)
        Stable: No
    
        Useful when writing to memory is expensive
        """
        writes = 0
    
        for cycle_start in range(len(arr) - 1):
            item = arr[cycle_start]
    
            # Find position where we put the item
            pos = cycle_start
            for i in range(cycle_start + 1, len(arr)):
                if arr[i] < item:
                    pos += 1
    
            # If item is already in correct position
            if pos == cycle_start:
                continue
    
            # Skip duplicates
            while item == arr[pos]:
                pos += 1
    
            # Put item to its right position
            arr[pos], item = item, arr[pos]
            writes += 1
    
            # Rotate rest of the cycle
            while pos != cycle_start:
                pos = cycle_start
    
                for i in range(cycle_start + 1, len(arr)):
                    if arr[i] < item:
                        pos += 1
    
                while item == arr[pos]:
                    pos += 1
    
                arr[pos], item = item, arr[pos]
                writes += 1
    
        return arr, writes
    
    
    # Example
    arr = [1, 8, 3, 9, 10, 10, 2, 4]
    sorted_arr, write_count = cycle_sort(arr.copy())
    print(f"Cycle sort: {sorted_arr}, writes: {write_count}")
    Python

    4. Pancake Sort

    Sorting by flipping prefixes.

    def pancake_sort(arr):
        """
        Pancake Sort - Sort by reversing prefixes
    
        Time: O(n²)
        Space: O(1)
        Stable: No
    
        Fun algorithm, not practical
        """
        def flip(arr, k):
            """Reverse arr[0:k+1]"""
            left, right = 0, k
            while left < right:
                arr[left], arr[right] = arr[right], arr[left]
                left += 1
                right -= 1
    
        n = len(arr)
    
        for curr_size in range(n, 1, -1):
            # Find index of maximum in arr[0:curr_size]
            max_idx = arr.index(max(arr[0:curr_size]))
    
            # Move maximum to beginning
            if max_idx != 0:
                flip(arr, max_idx)
    
            # Move maximum to correct position
            flip(arr, curr_size - 1)
    
        return arr
    
    
    # Example
    arr = [23, 10, 20, 11, 12, 6, 7]
    print("Pancake sort:", pancake_sort(arr.copy()))
    Python

    Sorting Applications

    1. Sort Objects by Multiple Criteria

    from operator import attrgetter, itemgetter
    
    class Employee:
        def __init__(self, name, age, salary):
            self.name = name
            self.age = age
            self.salary = salary
    
        def __repr__(self):
            return f"Employee({self.name}, {self.age}, ${self.salary})"
    
    employees = [
        Employee('Alice', 30, 70000),
        Employee('Bob', 25, 60000),
        Employee('Charlie', 30, 80000),
        Employee('Diana', 25, 75000)
    ]
    
    # Sort by age, then by salary (descending)
    sorted_emp = sorted(employees, key=lambda e: (e.age, -e.salary))
    print("By age, then -salary:", sorted_emp)
    
    # Using attrgetter
    sorted_emp2 = sorted(employees, key=attrgetter('age', 'salary'))
    print("Using attrgetter:", sorted_emp2)
    Python

    2. Stable Sort for Maintaining Order

    # Stability example
    data = [
        ('Alice', 25),
        ('Bob', 30),
        ('Charlie', 25),
        ('Diana', 30)
    ]
    
    # Sort by age only - stable sort preserves original order for equal keys
    sorted_data = sorted(data, key=lambda x: x[1])
    print("Stable sort:", sorted_data)
    # Output: [('Alice', 25), ('Charlie', 25), ('Bob', 30), ('Diana', 30)]
    # Alice before Charlie, Bob before Diana (preserved)
    Python

    3. Custom Sort for Special Requirements

    def sort_by_custom_order(arr, order):
        """Sort array based on custom ordering"""
        order_map = {val: idx for idx, val in enumerate(order)}
        return sorted(arr, key=lambda x: order_map.get(x, float('inf')))
    
    # Example: Sort by custom priority
    items = ['banana', 'apple', 'cherry', 'date']
    priority = ['cherry', 'apple', 'date', 'banana']
    
    sorted_items = sort_by_custom_order(items, priority)
    print("Custom order:", sorted_items)
    
    
    def sort_version_numbers(versions):
        """Sort version numbers (e.g., '1.2.10' vs '1.2.9')"""
        def version_key(version):
            return tuple(map(int, version.split('.')))
    
        return sorted(versions, key=version_key)
    
    versions = ['1.2.10', '1.2.9', '1.10.1', '1.2.1']
    print("Sorted versions:", sort_version_numbers(versions))
    Python

    4. Sorting with Missing or None Values

    def sort_with_none(arr, none_last=True):
        """Sort array with None values"""
        if none_last:
            return sorted(arr, key=lambda x: (x is None, x))
        else:
            return sorted(arr, key=lambda x: (x is not None, x))
    
    data = [3, None, 1, None, 2, 5]
    print("None last:", sort_with_none(data, none_last=True))
    print("None first:", sort_with_none(data, none_last=False))
    Python

    5. Topological Sort Applications

    def topological_sort(tasks):
        """
        Topological sort for task scheduling
    
        tasks: dict mapping task to list of dependencies
        """
        from collections import defaultdict, deque
    
        # Build graph and calculate in-degrees
        graph = defaultdict(list)
        in_degree = defaultdict(int)
    
        for task, deps in tasks.items():
            for dep in deps:
                graph[dep].append(task)
                in_degree[task] += 1
            if task not in in_degree:
                in_degree[task] = 0
    
        # Find tasks with no dependencies
        queue = deque([task for task in in_degree if in_degree[task] == 0])
        result = []
    
        while queue:
            task = queue.popleft()
            result.append(task)
    
            for next_task in graph[task]:
                in_degree[next_task] -= 1
                if in_degree[next_task] == 0:
                    queue.append(next_task)
    
        return result if len(result) == len(in_degree) else None
    
    # Example
    tasks = {
        'A': [],
        'B': ['A'],
        'C': ['A'],
        'D': ['B', 'C'],
        'E': ['D']
    }
    
    print("Task order:", topological_sort(tasks))
    Python

    Performance Analysis

    Complexity Comparison Table

    AlgorithmBestAverageWorstSpaceStableMethod
    Bubble SortO(n)O(n²)O(n²)O(1)YesComparison
    Selection SortO(n²)O(n²)O(n²)O(1)NoComparison
    Insertion SortO(n)O(n²)O(n²)O(1)YesComparison
    Merge SortO(n log n)O(n log n)O(n log n)O(n)YesComparison
    Quick SortO(n log n)O(n log n)O(n²)O(log n)NoComparison
    Heap SortO(n log n)O(n log n)O(n log n)O(1)NoComparison
    Counting SortO(n+k)O(n+k)O(n+k)O(k)YesNon-comparison
    Radix SortO(nk)O(nk)O(nk)O(n+k)YesNon-comparison
    Bucket SortO(n+k)O(n+k)O(n²)O(n)Yes*Non-comparison
    TimsortO(n)O(n log n)O(n log n)O(n)YesHybrid

    *Depends on sorting algorithm used for buckets

    Benchmark Code

    import time
    import random
    
    def benchmark_sort(sort_func, arr, name):
        """Benchmark a sorting algorithm"""
        arr_copy = arr.copy()
        start = time.time()
        sort_func(arr_copy)
        end = time.time()
        return end - start
    
    def run_benchmarks(size=10000):
        """Run comprehensive benchmarks"""
        print(f"Benchmarking with {size} elements\n")
    
        # Different data patterns
        patterns = {
            'Random': [random.randint(1, size) for _ in range(size)],
            'Sorted': list(range(size)),
            'Reverse': list(range(size, 0, -1)),
            'Nearly Sorted': list(range(size)),
            'Many Duplicates': [random.randint(1, 100) for _ in range(size)]
        }
    
        # Modify nearly sorted
        for i in range(0, size, 100):
            if i < size - 1:
                patterns['Nearly Sorted'][i], patterns['Nearly Sorted'][i+1] = \
                    patterns['Nearly Sorted'][i+1], patterns['Nearly Sorted'][i]
    
        # Algorithms to test
        algorithms = {
            'Bubble Sort': bubble_sort,
            'Selection Sort': selection_sort,
            'Insertion Sort': insertion_sort,
            'Merge Sort': merge_sort,
            'Quick Sort': quick_sort,
            'Heap Sort': heap_sort,
            'Python sorted()': lambda arr: sorted(arr)
        }
    
        # Run benchmarks
        for pattern_name, data in patterns.items():
            print(f"\n{pattern_name} data:")
            print("-" * 50)
    
            for algo_name, algo_func in algorithms.items():
                try:
                    time_taken = benchmark_sort(algo_func, data[:1000], algo_name)
                    print(f"{algo_name:20}: {time_taken:.4f}s")
                except Exception as e:
                    print(f"{algo_name:20}: Error - {e}")
    
    # Run with small size for quick results
    # run_benchmarks(1000)
    Python

    When to Use Which Algorithm

    def choose_sorting_algorithm(data_characteristics):
        """
        Decision tree for choosing sorting algorithm
        """
    
        recommendations = {
            'small_size': {
                'description': 'Array size < 50',
                'algorithm': 'Insertion Sort',
                'reason': 'Simple, efficient for small arrays, low overhead'
            },
            'nearly_sorted': {
                'description': 'Data is nearly sorted',
                'algorithm': 'Insertion Sort or Timsort',
                'reason': 'O(n) performance on nearly sorted data'
            },
            'guaranteed_performance': {
                'description': 'Need guaranteed O(n log n)',
                'algorithm': 'Merge Sort or Heap Sort',
                'reason': 'No worst-case O(n²) like Quick Sort'
            },
            'memory_constrained': {
                'description': 'Limited memory (in-place required)',
                'algorithm': 'Heap Sort or Quick Sort',
                'reason': 'O(1) or O(log n) auxiliary space'
            },
            'stability_required': {
                'description': 'Must preserve relative order',
                'algorithm': 'Merge Sort or Timsort',
                'reason': 'Stable sorting algorithms'
            },
            'integer_range': {
                'description': 'Integers in small range',
                'algorithm': 'Counting Sort',
                'reason': 'O(n + k) linear time'
            },
            'general_purpose': {
                'description': 'Mixed, real-world data',
                'algorithm': 'Python sorted() (Timsort)',
                'reason': 'Optimized for real-world patterns'
            },
            'linked_list': {
                'description': 'Sorting linked list',
                'algorithm': 'Merge Sort',
                'reason': 'Efficient in-place for linked lists'
            }
        }
    
        return recommendations
    Python

    Best Practices

    1. Use Built-in Sort When Possible

    # ✓ Recommended: Use Python's built-in sort
    arr = [3, 1, 4, 1, 5, 9, 2, 6]
    sorted_arr = sorted(arr)  # Timsort - highly optimized
    
    # ✗ Avoid: Implementing your own for production
    # (unless you have very specific requirements)
    Python

    2. Choose Right Key Function

    # ✓ Efficient key function
    data = ['apple', 'Banana', 'cherry']
    sorted_data = sorted(data, key=str.lower)
    
    # ✗ Inefficient: Multiple transformations
    # sorted_data = sorted(data, key=lambda x: (x.lower(), len(x), x[::-1]))
    # Better: precompute if used multiple times
    Python

    3. Understand Stability

    # Stable sort preserves original order for equal keys
    students = [
        ('Alice', 90),
        ('Bob', 85),
        ('Charlie', 90),
        ('Diana', 85)
    ]
    
    # First sort by name (establishes order)
    students.sort(key=lambda x: x[0])
    
    # Then stable sort by grade (preserves name order within same grade)
    students.sort(key=lambda x: x[1], reverse=True)
    
    print(students)
    # Students with same grade maintain alphabetical order
    Python

    4. Profile Before Optimizing

    import cProfile
    import pstats
    
    def profile_sort():
        """Profile sorting performance"""
        data = [random.randint(1, 1000) for _ in range(10000)]
    
        profiler = cProfile.Profile()
        profiler.enable()
    
        sorted(data)
    
        profiler.disable()
        stats = pstats.Stats(profiler)
        stats.sort_stats('cumulative')
        stats.print_stats(10)
    
    # profile_sort()
    Python

    5. Type Hints for Clarity

    from typing import List, TypeVar, Callable, Any
    
    T = TypeVar('T')
    
    def generic_sort(
        arr: List[T],
        key: Callable[[T], Any] = None,
        reverse: bool = False
    ) -> List[T]:
        """
        Generic sorting function with type hints
    
        Args:
            arr: List to sort
            key: Optional key function
            reverse: Sort in descending order if True
    
        Returns:
            Sorted list
        """
        return sorted(arr, key=key, reverse=reverse)
    Python

    6. Error Handling

    def safe_sort(arr, key=None, reverse=False):
        """Sort with error handling"""
        try:
            return sorted(arr, key=key, reverse=reverse)
        except TypeError as e:
            print(f"Cannot compare elements: {e}")
            # Return original or handle appropriately
            return arr
        except Exception as e:
            print(f"Unexpected error: {e}")
            return arr
    
    # Test with mixed types
    mixed = [1, '2', 3, 'a']
    print(safe_sort(mixed))
    Python

    7. Memory-Efficient Sorting

    # For very large datasets, consider:
    
    # 1. Generator expressions (don't create intermediate lists)
    def sorted_items_generator(items):
        """Yield sorted items without storing all in memory"""
        yield from sorted(items)
    
    # 2. External sorting for data larger than memory
    def external_merge_sort(file_path, chunk_size=1000000):
        """
        Sort file larger than memory
    
        1. Read chunks
        2. Sort each chunk
        3. Write to temp files
        4. Merge temp files
        """
        # Implementation would involve:
        # - Reading chunks from file
        # - Sorting each chunk
        # - Writing sorted chunks to temp files
        # - K-way merge of temp files
        pass
    
    # 3. In-place algorithms for memory constraints
    arr = [3, 1, 4, 1, 5, 9, 2, 6]
    arr.sort()  # In-place, no extra list created
    Python

    Summary

    Key Takeaways

    1. Know the Basics: Understand O(n²), O(n log n), and O(n) algorithms
    2. Use Built-in: Python’s sorted() and list.sort() are highly optimized
    3. Timsort is Amazing: Excellent on real-world data, adaptive and stable
    4. Context Matters: Choose algorithm based on data characteristics
    5. Stability: Important when sorting by multiple criteria
    6. Space vs Time: Trade-offs between in-place and auxiliary space

    Quick Decision Guide

    Small data (< 50 elements):
        → Insertion Sort
    
    Nearly sorted data:
        → Insertion Sort or Timsort
    
    Need stability:
        → Merge Sort or Timsort
    
    Memory limited:
        → Heap Sort or Quick Sort
    
    Integers in small range:
        → Counting Sort
    
    General purpose:
        → Python sorted() (Timsort)
    
    Guaranteed O(n log n):
        → Merge Sort or Heap Sort
    Python

    Python Sorting Cheat Sheet

    # Sort list in-place
    arr.sort()
    arr.sort(reverse=True)
    arr.sort(key=lambda x: x[1])
    
    # Return new sorted list
    sorted_arr = sorted(arr)
    sorted_arr = sorted(arr, reverse=True)
    sorted_arr = sorted(arr, key=len)
    
    # Multiple criteria
    sorted_arr = sorted(arr, key=lambda x: (x.age, -x.salary))
    
    # Custom comparison
    from functools import cmp_to_key
    sorted_arr = sorted(arr, key=cmp_to_key(compare_func))
    
    # Operator functions
    from operator import itemgetter, attrgetter
    sorted_arr = sorted(arr, key=itemgetter(1, 2))
    sorted_arr = sorted(arr, key=attrgetter('name', 'age'))
    Python

    Further Reading

    • Sorting Networks
    • Parallel Sorting Algorithms
    • Cache-Aware Sorting
    • External Sorting
    • Approximate Sorting
    • Online Sorting Algorithms

    Practice Resources:

    • LeetCode Sorting Problems
    • HackerRank Sorting Challenges
    • Sorting Algorithm Visualizations
    • Algorithm Design Manual

    Happy Sorting! 🔄


    Discover more from Altgr Blog

    Subscribe to get the latest posts sent to your email.

    Leave a Reply

    Your email address will not be published. Required fields are marked *