Python NumPy – Altgr Blog

Introduction to NumPy
Installation
NumPy Arrays
Array Creation
Array Attributes
Array Indexing and Slicing
Array Operations
Mathematical Functions
Array Reshaping
Broadcasting
Linear Algebra
Statistical Functions
Random Number Generation
File I/O
Advanced Topics
Best Practices

Introduction to NumPy

NumPy (Numerical Python) is the fundamental package for scientific computing in Python. It provides:

A powerful N-dimensional array object
Sophisticated broadcasting functions
Tools for integrating C/C++ and Fortran code
Useful linear algebra, Fourier transform, and random number capabilities

Why Use NumPy?

Performance: NumPy arrays are stored in one contiguous block of memory (unlike lists, which hold pointers to separate objects), making them faster to access and manipulate
Convenience: Built-in functions for mathematical operations
Less Code: Operations that would require loops in Python can be done in one line with NumPy
Foundation: Many scientific libraries (such as pandas and scikit-learn) are built on top of NumPy, and others (such as TensorFlow) interoperate with NumPy arrays
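
To make the contiguous-memory point concrete, here is a small sketch that inspects how an array is laid out. It relies only on the standard `flags`, `strides`, and `nbytes` attributes; the shape and dtype are arbitrary choices made for illustration.

```python
import numpy as np

# A 2x3 array of 8-byte integers (dtype set explicitly so the numbers
# below hold on any platform)
arr = np.arange(6, dtype=np.int64).reshape(2, 3)

# True when the elements sit in one row-major (C-order) block of memory
print(arr.flags['C_CONTIGUOUS'])  # True

# Bytes to step over to reach the next row and the next column
print(arr.strides)  # (24, 8)

# Total size of the data buffer: 6 elements * 8 bytes
print(arr.nbytes)  # 48
```

Because every element has the same size and sits at a predictable offset, NumPy can hand the whole buffer to compiled loops instead of visiting Python objects one by one.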

Installation

# Using pip
pip install numpy

# Using conda
conda install numpy

# Verify installation
python -c "import numpy; print(numpy.__version__)"

Import NumPy in your Python script:

import numpy as np

NumPy Arrays

The core of NumPy is the ndarray (n-dimensional array) object. Unlike Python lists, NumPy arrays:

Are homogeneous (all elements must be of the same type)
Have a fixed size at creation (changing the size creates a new array)
Support vectorized operations
Are more memory-efficient
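
The first two properties in the list above can be seen in a short sketch. The variable names are illustrative only.

```python
import numpy as np

# Homogeneous: mixing ints and a float upcasts every element to one dtype
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)  # float64

# Fixed size: np.append does not grow the array in place,
# it returns a new, larger array and leaves the original unchanged
base = np.array([1, 2, 3])
grown = np.append(base, 4)
print(base.shape, grown.shape)  # (3,) (4,)
```

This is why repeatedly appending inside a loop tends to be slow: each call copies the data. Collecting values in a list and converting once is usually the simpler choice.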

Array vs List Comparison

import numpy as np

# Python list
python_list = [1, 2, 3, 4, 5]

# NumPy array
numpy_array = np.array([1, 2, 3, 4, 5])

# Multiplying by 2
# List: requires loop or list comprehension
result_list = [x * 2 for x in python_list]

# NumPy: vectorized operation
result_numpy = numpy_array * 2
print(result_numpy)  # [ 2  4  6  8 10]

Array Creation

From Python Lists

import numpy as np

# 1D array
arr1d = np.array([1, 2, 3, 4, 5])
print(arr1d)  # [1 2 3 4 5]

# 2D array (matrix)
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
print(arr2d)
# [[1 2 3]
#  [4 5 6]]

# 3D array
arr3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(arr3d.shape)  # (2, 2, 2)

Using Built-in Functions

# Array of zeros
zeros = np.zeros((3, 4))  # 3x4 array of zeros
print(zeros)

# Array of ones
ones = np.ones((2, 3, 4))  # 2x3x4 array of ones

# Empty array (uninitialized values)
empty = np.empty((2, 2))

# Array with a range of values
arange = np.arange(0, 10, 2)  # [0 2 4 6 8]

# Array with evenly spaced values
linspace = np.linspace(0, 1, 5)  # [0.   0.25 0.5  0.75 1.  ]

# Identity matrix
identity = np.eye(3)  # 3x3 identity matrix

# Array filled with a constant value
full = np.full((2, 3), 7)  # 2x3 array filled with 7

# Array like another array
arr = np.array([1, 2, 3])
zeros_like = np.zeros_like(arr)
ones_like = np.ones_like(arr)

Specifying Data Types

# Integer array
int_arr = np.array([1, 2, 3], dtype=np.int32)

# Float array
float_arr = np.array([1, 2, 3], dtype=np.float64)

# Complex array
complex_arr = np.array([1+2j, 3+4j], dtype=np.complex128)

# Boolean array
bool_arr = np.array([True, False, True], dtype=np.bool_)

# String array
str_arr = np.array(['a', 'b', 'c'], dtype='U1')

# Check data type
print(int_arr.dtype)  # int32

Array Attributes

import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Shape: dimensions of the array
print(arr.shape)  # (3, 4)

# ndim: number of dimensions
print(arr.ndim)  # 2

# size: total number of elements
print(arr.size)  # 12

# dtype: data type of elements
print(arr.dtype)  # int64 (or int32 depending on system)

# itemsize: size in bytes of each element
print(arr.itemsize)  # 8 (for int64)

# nbytes: total bytes consumed by the array
print(arr.nbytes)  # 96 (12 elements * 8 bytes)

Array Indexing and Slicing

Basic Indexing

import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Access single element
print(arr[0])   # 10
print(arr[-1])  # 50

# Modify element
arr[2] = 99
print(arr)  # [10 20 99 40 50]

Slicing 1D Arrays

arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

# Basic slicing: arr[start:stop:step]
print(arr[2:7])      # [2 3 4 5 6]
print(arr[:5])       # [0 1 2 3 4]
print(arr[5:])       # [5 6 7 8 9]
print(arr[::2])      # [0 2 4 6 8]
print(arr[::-1])     # [9 8 7 6 5 4 3 2 1 0] (reverse)

Indexing 2D Arrays

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Access single element
print(arr2d[0, 0])   # 1
print(arr2d[1, 2])   # 6
print(arr2d[-1, -1]) # 9

# Access row
print(arr2d[1])      # [4 5 6]

# Access column
print(arr2d[:, 1])   # [2 5 8]

# Slicing
print(arr2d[0:2, 1:3])
# [[2 3]
#  [5 6]]

Boolean Indexing

arr = np.array([1, 2, 3, 4, 5, 6])

# Create boolean mask
mask = arr > 3
print(mask)  # [False False False  True  True  True]

# Use mask to filter
print(arr[mask])  # [4 5 6]

# Direct boolean indexing
print(arr[arr > 3])  # [4 5 6]

# Multiple conditions
print(arr[(arr > 2) & (arr < 5)])  # [3 4]

Fancy Indexing

arr = np.array([10, 20, 30, 40, 50, 60])

# Index with array of integers
indices = np.array([0, 2, 4])
print(arr[indices])  # [10 30 50]

# 2D fancy indexing
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
rows = np.array([0, 2])
cols = np.array([1, 2])
print(arr2d[rows, cols])  # [2 9]
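
The 2D case above pairs the index arrays element by element, which is easy to confuse with selecting a sub-grid. The sketch below contrasts the two behaviours, using `np.ix_` for the sub-grid, and also shows that fancy indexing returns a copy. The variable names `paired` and `block` are illustrative.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
rows = np.array([0, 2])
cols = np.array([1, 2])

# Paired indexing: picks the elements at (0, 1) and (2, 2)
paired = arr2d[rows, cols]
print(paired)  # [2 9]

# np.ix_ builds an open mesh, selecting every row/column combination
block = arr2d[np.ix_(rows, cols)]
print(block)
# [[2 3]
#  [8 9]]

# Fancy indexing returns a copy, so editing it leaves arr2d untouched
paired[0] = -1
print(arr2d[0, 1])  # 2
```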

Array Operations

Arithmetic Operations

import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

# Element-wise operations
print(a + b)   # [11 22 33 44]
print(a - b)   # [ -9 -18 -27 -36]
print(a * b)   # [ 10  40  90 160]
print(a / b)   # [0.1 0.1 0.1 0.1] (true division always returns floats)
print(a ** 2)  # [ 1  4  9 16]
print(a % 2)   # [1 0 1 0]

# Operations with scalars
print(a + 5)   # [6 7 8 9]
print(a * 2)   # [2 4 6 8]

Universal Functions (ufuncs)

arr = np.array([1, 4, 9, 16, 25])

# Mathematical functions
print(np.sqrt(arr))      # [1. 2. 3. 4. 5.]
print(np.exp(arr))       # [2.72e+00 5.46e+01 8.10e+03 8.89e+06 7.20e+10]
print(np.log(arr))       # [0.    1.39  2.20  2.77  3.22]
print(np.sin(arr))       # [0.84  -0.76 0.41  -0.29 -0.13]

# Absolute value
arr_neg = np.array([-1, -2, 3, -4])
print(np.abs(arr_neg))   # [1 2 3 4]

# Sign function
print(np.sign(arr_neg))  # [-1 -1  1 -1]

Comparison Operations

a = np.array([1, 2, 3, 4, 5])
b = np.array([5, 4, 3, 2, 1])

print(a == b)   # [False False  True False False]
print(a < b)    # [ True  True False False False]
print(a >= 3)   # [False False  True  True  True]

# Check if any or all elements satisfy condition
print(np.any(a > 3))    # True
print(np.all(a > 0))    # True

Aggregate Functions

arr = np.array([1, 2, 3, 4, 5])

print(np.sum(arr))      # 15
print(np.min(arr))      # 1
print(np.max(arr))      # 5
print(np.mean(arr))     # 3.0
print(np.median(arr))   # 3.0
print(np.std(arr))      # 1.414... (standard deviation)
print(np.var(arr))      # 2.0 (variance)

# 2D array operations
arr2d = np.array([[1, 2, 3], [4, 5, 6]])

print(np.sum(arr2d))           # 21 (all elements)
print(np.sum(arr2d, axis=0))   # [5 7 9] (column sums: axis 0 is collapsed)
print(np.sum(arr2d, axis=1))   # [ 6 15] (row sums: axis 1 is collapsed)

Mathematical Functions

Trigonometric Functions

import numpy as np

angles = np.array([0, np.pi/6, np.pi/4, np.pi/3, np.pi/2])

print(np.sin(angles))
print(np.cos(angles))
print(np.tan(angles))

# Inverse functions
print(np.arcsin(np.sin(angles)))
print(np.arccos(np.cos(angles)))
print(np.arctan(np.tan(angles[:-1])))  # Exclude π/2, where tan is mathematically undefined (floating point yields a huge finite value)

# Hyperbolic functions
print(np.sinh(angles))
print(np.cosh(angles))
print(np.tanh(angles))

Rounding Functions

arr = np.array([1.23, 2.67, 3.45, 4.89])

print(np.round(arr))       # [1. 3. 3. 5.]
print(np.floor(arr))       # [1. 2. 3. 4.]
print(np.ceil(arr))        # [2. 3. 4. 5.]
print(np.trunc(arr))       # [1. 2. 3. 4.]

# Round to specific decimals
print(np.round(arr, 1))    # [1.2 2.7 3.4 4.9]

Exponential and Logarithmic

arr = np.array([1, 2, 3, 4, 5])

# Exponential
print(np.exp(arr))         # e^x
print(np.exp2(arr))        # 2^x
print(np.power(3, arr))    # 3^x

# Logarithm
print(np.log(arr))         # Natural log (ln)
print(np.log10(arr))       # Base 10
print(np.log2(arr))        # Base 2

Array Reshaping

Reshape

import numpy as np

arr = np.arange(12)  # [0 1 2 3 4 5 6 7 8 9 10 11]

# Reshape to 2D
arr2d = arr.reshape(3, 4)
print(arr2d)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

# Reshape to 3D
arr3d = arr.reshape(2, 3, 2)
print(arr3d.shape)  # (2, 3, 2)

# Use -1 to infer dimension
arr_auto = arr.reshape(2, -1)  # NumPy calculates: 12/2 = 6
print(arr_auto.shape)  # (2, 6)

Flatten and Ravel

arr2d = np.array([[1, 2, 3], [4, 5, 6]])

# Flatten: returns a copy
flat = arr2d.flatten()
print(flat)  # [1 2 3 4 5 6]

# Ravel: returns a view if possible (more efficient)
ravel = arr2d.ravel()
print(ravel)  # [1 2 3 4 5 6]
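
The copy-versus-view distinction matters as soon as the flattened result is modified. The following sketch, which assumes a contiguous input so that `ravel` can return a view, uses `np.shares_memory` to check which result aliases the original.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6]])

flat = arr2d.flatten()  # always a copy
rav = arr2d.ravel()     # a view here, because arr2d is contiguous

print(np.shares_memory(arr2d, flat))  # False
print(np.shares_memory(arr2d, rav))   # True

# Writing through the view changes the original array
rav[0] = 100
print(arr2d[0, 0])  # 100

# Writing to the copy does not
flat[1] = -5
print(arr2d[0, 1])  # 2
```

When the input is not contiguous (for example, a transposed array), `ravel` falls back to returning a copy, so code should not rely on it always being a view.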

Transpose

arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr.shape)  # (2, 3)

# Transpose
arr_t = arr.T
print(arr_t.shape)  # (3, 2)
print(arr_t)
# [[1 4]
#  [2 5]
#  [3 6]]

# For multi-dimensional arrays
arr3d = np.arange(24).reshape(2, 3, 4)
arr3d_t = np.transpose(arr3d, (2, 0, 1))  # Reorder axes: old axis 2 becomes first, then axes 0 and 1
print(arr3d_t.shape)  # (4, 2, 3)

Stack and Split

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Vertical stack (row-wise)
v_stack = np.vstack([a, b])
print(v_stack)
# [[1 2 3]
#  [4 5 6]]

# Horizontal stack (column-wise)
h_stack = np.hstack([a, b])
print(h_stack)  # [1 2 3 4 5 6]

# Concatenate along axis
concat = np.concatenate([a, b])
print(concat)  # [1 2 3 4 5 6]

# Split
arr = np.arange(9)
split = np.split(arr, 3)  # Split into 3 equal parts
print(split)  # [array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])]

Broadcasting

Broadcasting allows NumPy to perform element-wise operations on arrays of different but compatible shapes, without explicitly copying the smaller array.

Broadcasting Rules

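If arrays have different numbers of dimensions, pad the smaller shape with ones on the left
Shapes are then compared dimension by dimension: two dimensions are compatible if they are equal or one of them is 1
After broadcasting, each array behaves as if it had the larger size along every dimension; if any pair of dimensions is incompatible, NumPy raises a ValueError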
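
The rules above can be checked without doing any arithmetic. This is a minimal sketch using `np.broadcast`, which reports the shape an operation would produce; the arrays `a` through `d` and the `compatible` flag are illustrative names.

```python
import numpy as np

a = np.ones((2, 3))
b = np.ones(3)        # shape (3,) is padded on the left to (1, 3)
c = np.ones((3, 1))
d = np.ones((2, 1))

# Rules 1 and 2: (2, 3) with (1, 3) gives (2, 3)
print(np.broadcast(a, b).shape)  # (2, 3)

# (1, 3) with (3, 1): each size-1 dimension stretches, giving (3, 3)
print(np.broadcast(b, c).shape)  # (3, 3)

# Incompatible: first dimensions are 3 and 2, and neither is 1
compatible = True
try:
    c + d
except ValueError:
    compatible = False
print(compatible)  # False
```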

Examples

import numpy as np

# Scalar with array
arr = np.array([1, 2, 3])
result = arr + 10  # 10 is broadcast to [10, 10, 10]
print(result)  # [11 12 13]

# 1D with 2D
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
arr1d = np.array([10, 20, 30])

result = arr2d + arr1d  # arr1d is broadcast to each row
print(result)
# [[11 22 33]
#  [14 25 36]]

# Column vector with row vector
col = np.array([[1], [2], [3]])  # Shape: (3, 1)
row = np.array([10, 20, 30])      # Shape: (3,)

result = col + row  # Broadcasting creates (3, 3) result
print(result)
# [[11 21 31]
#  [12 22 32]
#  [13 23 33]]

# Visual representation
# col:          row:          result:
# [[1]    +    [10 20 30] =  [[11 21 31]
#  [2]                        [12 22 32]
#  [3]]                       [13 23 33]]

Common Broadcasting Patterns

# Center each row (subtract the row mean from every element)
arr = np.array([[1, 2, 3], [4, 5, 6]])
row_means = arr.mean(axis=1, keepdims=True)  # Shape: (2, 1)
normalized = arr - row_means
print(normalized)

# Outer product
a = np.array([1, 2, 3])
b = np.array([10, 20])
outer = a.reshape(-1, 1) * b.reshape(1, -1)
print(outer)
# [[10 20]
#  [20 40]
#  [30 60]]

Linear Algebra

Matrix Operations

import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Matrix multiplication
C = np.dot(A, B)  # or A @ B
print(C)
# [[19 22]
#  [43 50]]

# Element-wise multiplication
element_wise = A * B
print(element_wise)
# [[ 5 12]
#  [21 32]]

# Matrix power
print(np.linalg.matrix_power(A, 2))  # A^2

# Inner product (for 1D arrays)
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(np.inner(a, b))  # 32 (1*4 + 2*5 + 3*6)

# Outer product
print(np.outer(a, b))
# [[ 4  5  6]
#  [ 8 10 12]
#  [12 15 18]]

Matrix Decomposition

# Determinant
A = np.array([[1, 2], [3, 4]])
det = np.linalg.det(A)
print(det)  # approximately -2.0 (floating-point rounding can show e.g. -2.0000000000000004)

# Inverse
A_inv = np.linalg.inv(A)
print(A_inv)
# [[-2.   1. ]
#  [ 1.5 -0.5]]

# Verify: A @ A_inv should be identity
print(np.round(A @ A_inv))
# [[1. 0.]
#  [0. 1.]]

# Eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)
print("Eigenvalues:", eigenvalues)
print("Eigenvectors:\n", eigenvectors)

# Singular Value Decomposition (SVD)
U, s, Vt = np.linalg.svd(A)
print("U:\n", U)
print("Singular values:", s)
print("Vt:\n", Vt)

# QR decomposition
Q, R = np.linalg.qr(A)
print("Q:\n", Q)
print("R:\n", R)
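
One way to sanity-check these decompositions is to rebuild the matrix from the returned factors. The sketch below does this for the same 2x2 matrix; the flag names `eig_ok`, `svd_ok`, and `qr_ok` are illustrative, and `np.allclose` is used because the factors are only accurate up to floating-point error.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])

# Eigen-decomposition: each column v of `eigenvectors` satisfies A @ v = lambda * v
eigenvalues, eigenvectors = np.linalg.eig(A)
eig_ok = np.allclose(A @ eigenvectors, eigenvectors * eigenvalues)

# SVD: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A)
svd_ok = np.allclose(U @ np.diag(s) @ Vt, A)

# QR: A = Q @ R, with Q orthogonal (Q.T @ Q is the identity)
Q, R = np.linalg.qr(A)
qr_ok = np.allclose(Q @ R, A) and np.allclose(Q.T @ Q, np.eye(2))

print(eig_ok, svd_ok, qr_ok)  # True True True
```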

Solving Linear Systems

# Solve Ax = b
A = np.array([[3, 1], [1, 2]])
b = np.array([9, 8])

x = np.linalg.solve(A, b)
print(x)  # [2. 3.]

# Verify solution
print(np.allclose(A @ x, b))  # True

# Least squares solution (overdetermined system)
A = np.array([[1, 1], [1, 2], [1, 3]])
b = np.array([2, 3, 4])

x, residuals, rank, s = np.linalg.lstsq(A, b, rcond=None)
print(x)
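
For the overdetermined system above, the least-squares solution can be cross-checked against the normal equations, (AᵀA)x = Aᵀb. This is a sketch for illustration; `x_normal` is a name introduced here, and in practice `lstsq` is generally preferred because forming AᵀA can amplify rounding error.

```python
import numpy as np

# Columns of A: a constant term and t = 1, 2, 3; b lies exactly on the line 1 + t
A = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
b = np.array([2.0, 3.0, 4.0])

x, residuals, rank, s = np.linalg.lstsq(A, b, rcond=None)

# Same answer from the normal equations
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

print(np.allclose(x, x_normal))    # True
print(np.allclose(x, [1.0, 1.0]))  # True (intercept 1, slope 1)
print(rank)                        # 2
```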

Matrix Properties

A = np.array([[1, 2, 3], [4, 5, 6]])

# Trace (sum of diagonal elements)
B = np.array([[1, 2], [3, 4]])
print(np.trace(B))  # 5

# Rank
print(np.linalg.matrix_rank(A))  # 2

# Norm
print(np.linalg.norm(A))  # Frobenius norm
print(np.linalg.norm(A, ord=2))  # 2-norm (spectral norm)

Statistical Functions

Basic Statistics

import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Central tendency
print(np.mean(data))      # 5.5
print(np.median(data))    # 5.5
print(np.average(data))   # 5.5

# Weighted average
weights = np.array([1, 1, 1, 1, 1, 2, 2, 2, 2, 2])
print(np.average(data, weights=weights))  # 6.333...

# Spread
print(np.std(data))       # 2.872... (standard deviation)
print(np.var(data))       # 8.25 (variance)
print(np.ptp(data))       # 9 (peak to peak, max - min)

# Percentiles and quantiles
print(np.percentile(data, 25))   # 3.25
print(np.percentile(data, 50))   # 5.5 (median)
print(np.percentile(data, 75))   # 7.75
print(np.quantile(data, [0.25, 0.5, 0.75]))

Correlation and Covariance

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 5, 4, 5])

# Correlation coefficient
correlation_matrix = np.corrcoef(x, y)
print(correlation_matrix)
# [[1.    0.775]
#  [0.775 1.   ]]

# Covariance
covariance_matrix = np.cov(x, y)
print(covariance_matrix)

# For multiple variables (by default each row is a variable and each column an observation)
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
cov_matrix = np.cov(data)
print(cov_matrix)

Binning and Histograms

data = np.random.randn(1000)  # Random normal distribution

# Histogram
hist, bin_edges = np.histogram(data, bins=10)
print("Histogram counts:", hist)
print("Bin edges:", bin_edges)

# Digitize (assign to bins)
bins = np.array([-2, -1, 0, 1, 2])
indices = np.digitize(data, bins)
print(indices[:10])  # First 10 bin assignments
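
Because the data above is random, the bin assignments are hard to read off. A small deterministic sketch with hand-picked values (the `samples` array is an assumption made for illustration) shows what the indices returned by `np.digitize` mean and how they differ from `np.histogram` counts.

```python
import numpy as np

bins = np.array([-2, -1, 0, 1, 2])
samples = np.array([-2.5, -1.5, 0.5, 2.5])

# digitize returns i such that bins[i-1] <= x < bins[i];
# 0 means "below the first edge", len(bins) means "at or above the last edge"
indices = np.digitize(samples, bins)
print(indices)  # [0 1 3 5]

# histogram with the same edges counts only values inside [-2, 2]
counts, edges = np.histogram(samples, bins=bins)
print(counts)  # [1 0 1 0]
```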

Random Number Generation

Random Module

import numpy as np

# Set seed for reproducibility
np.random.seed(42)

# Random floats between 0 and 1
print(np.random.rand(3, 2))  # 3x2 array

# Random floats from uniform distribution [low, high)
print(np.random.uniform(0, 10, size=5))

# Random integers
print(np.random.randint(0, 10, size=5))  # [low, high)

# Random integers from range
print(np.random.randint(low=1, high=7, size=(2, 3)))  # Like dice rolls

# Random choice from array
arr = np.array([10, 20, 30, 40, 50])
print(np.random.choice(arr, size=3))

# Random choice without replacement (replace=False gives unique picks)
print(np.random.choice(arr, size=3, replace=False))

# Random choice with probabilities
print(np.random.choice(arr, size=5, p=[0.1, 0.1, 0.2, 0.3, 0.3]))

Statistical Distributions

# Normal (Gaussian) distribution
normal = np.random.normal(loc=0, scale=1, size=1000)  # mean=0, std=1
print(normal.mean(), normal.std())

# Standard normal
standard_normal = np.random.randn(1000)

# Binomial distribution
binomial = np.random.binomial(n=10, p=0.5, size=1000)  # 10 trials, p=0.5

# Poisson distribution
poisson = np.random.poisson(lam=5, size=1000)  # lambda=5

# Exponential distribution
exponential = np.random.exponential(scale=2, size=1000)

# Beta distribution
beta = np.random.beta(a=2, b=5, size=1000)

# Gamma distribution
gamma = np.random.gamma(shape=2, scale=2, size=1000)

Array Manipulation with Random

arr = np.arange(10)

# Shuffle in place
np.random.shuffle(arr)
print(arr)

# Permutation (returns shuffled copy)
original = np.arange(10)
shuffled = np.random.permutation(original)
print(original)  # Unchanged
print(shuffled)  # Shuffled

New Random Generator (Recommended)

# Modern approach using Generator
from numpy.random import default_rng

rng = default_rng(42)  # Seed

# Generate random numbers
print(rng.random(5))
print(rng.integers(0, 10, size=5))
print(rng.normal(0, 1, size=5))
print(rng.choice([1, 2, 3, 4, 5], size=3))
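
Two practical points about the Generator API are sketched below: generators created with the same seed produce the same stream, and `integers` excludes the upper bound unless `endpoint=True` is passed. The seeds and variable names are arbitrary choices for illustration.

```python
import numpy as np
from numpy.random import default_rng

# Each Generator carries its own state, so two generators
# with the same seed yield identical values
rng_a = default_rng(123)
rng_b = default_rng(123)
same_stream = np.array_equal(rng_a.random(5), rng_b.random(5))
print(same_stream)  # True

# integers() draws from [low, high) by default;
# endpoint=True makes the range [low, high], e.g. a six-sided die
rolls = default_rng(0).integers(1, 6, size=1000, endpoint=True)
in_range = bool(rolls.min() >= 1 and rolls.max() <= 6)
print(in_range)  # True
```

Passing a Generator explicitly into functions, rather than relying on the global state set by `np.random.seed`, keeps results reproducible even when other code also draws random numbers.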

File I/O

Text Files

import numpy as np

# Save array to text file
arr = np.array([[1, 2, 3], [4, 5, 6]])
np.savetxt('data.txt', arr)

# Load from text file
loaded = np.loadtxt('data.txt')
print(loaded)

# Save with formatting
np.savetxt('data.csv', arr, delimiter=',', fmt='%d')

# Load CSV
loaded_csv = np.loadtxt('data.csv', delimiter=',')

Binary Files (.npy)

# Save single array
arr = np.array([1, 2, 3, 4, 5])
np.save('array.npy', arr)

# Load
loaded = np.load('array.npy')
print(loaded)

# Save multiple arrays
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
np.savez('multiple.npz', a=arr1, b=arr2)

# Load multiple
data = np.load('multiple.npz')
print(data['a'])
print(data['b'])

# Save compressed
np.savez_compressed('compressed.npz', a=arr1, b=arr2)

Memory-Mapped Files

# For very large files that don't fit in memory
# Create memory-mapped array
mm_arr = np.memmap('memmap.dat', dtype='float32', mode='w+', shape=(1000, 1000))

# Write data
mm_arr[:] = np.random.rand(1000, 1000)
mm_arr.flush()

# Read memory-mapped array
mm_loaded = np.memmap('memmap.dat', dtype='float32', mode='r', shape=(1000, 1000))
print(mm_loaded[0, :10])  # Access without loading entire file
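
A raw memory-mapped file stores no shape or dtype, which is why both must be passed again when reopening it. The sketch below, with illustrative sizes and a temporary file path as assumptions, shows one way to reduce over a memmap in row chunks so that only one chunk is resident at a time.

```python
import os
import tempfile

import numpy as np

n_rows, n_cols, chunk_rows = 1000, 100, 256  # illustrative sizes

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "memmap.dat")

    # Create the file and fill it with a known value
    mm = np.memmap(path, dtype="float32", mode="w+", shape=(n_rows, n_cols))
    mm[:] = 1.0
    mm.flush()
    del mm  # release the writable mapping

    # Reopen read-only; shape and dtype must match what was written
    mm_read = np.memmap(path, dtype="float32", mode="r", shape=(n_rows, n_cols))

    # Accumulate column sums one block of rows at a time
    col_sums = np.zeros(n_cols, dtype=np.float64)
    for start in range(0, n_rows, chunk_rows):
        chunk = mm_read[start:start + chunk_rows]
        col_sums += chunk.sum(axis=0, dtype=np.float64)

    del chunk, mm_read  # release the mapping before the directory is removed

print(col_sums[:3])  # [1000. 1000. 1000.]
```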

Advanced Topics

Structured Arrays

import numpy as np

# Define structured data type
dt = np.dtype([('name', 'U10'), ('age', 'i4'), ('weight', 'f4')])

# Create structured array
data = np.array([('Alice', 25, 55.5),
                 ('Bob', 30, 75.0),
                 ('Charlie', 35, 80.2)], dtype=dt)

print(data)
print(data['name'])    # ['Alice' 'Bob' 'Charlie']
print(data['age'])     # [25 30 35]
print(data[0])         # ('Alice', 25, 55.5)

# Sorting by field
sorted_data = np.sort(data, order='age')
print(sorted_data)
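
Field access combines with ordinary boolean indexing, which makes row filtering straightforward. A small sketch that reuses the same dtype and records as above:

```python
import numpy as np

dt = np.dtype([('name', 'U10'), ('age', 'i4'), ('weight', 'f4')])
data = np.array([('Alice', 25, 55.5),
                 ('Bob', 30, 75.0),
                 ('Charlie', 35, 80.2)], dtype=dt)

# Build a boolean mask from one field and use it to select whole records
older = data[data['age'] >= 30]
print(older['name'])  # ['Bob' 'Charlie']

# Look up a value in one field by the position of the maximum in another
heaviest = data['name'][np.argmax(data['weight'])]
print(heaviest)  # Charlie
```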

Masked Arrays

# Handle missing or invalid data
data = np.array([1, 2, -999, 4, -999, 6])

# Create masked array
masked = np.ma.masked_equal(data, -999)
print(masked)  # [1 2 -- 4 -- 6]

# Operations ignore masked values
print(masked.mean())  # 3.25 (ignores -999)
print(masked.sum())   # 13

# Manual mask
mask = np.array([False, False, True, False, True, False])
masked2 = np.ma.array(data, mask=mask)
print(masked2)
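
Sentinel values such as -999 are one source of missing data; NaN and infinity are another. A brief sketch with made-up readings, using np.ma.masked_invalid to mask them and filled to convert back to a plain array:

```python
import numpy as np

readings = np.array([1.0, np.nan, 3.0, np.inf, 5.0])  # made-up sample values

# Mask every NaN and +/-inf entry
masked = np.ma.masked_invalid(readings)
print(masked.count())  # 3 valid values
print(masked.mean())   # 3.0

# Replace masked entries with a chosen fill value to get a regular ndarray
filled = masked.filled(0.0)
print(filled)  # [1. 0. 3. 0. 5.]
```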

Vectorization

# Vectorize a Python function to work on arrays
def my_function(x, y):
    if x > y:
        return x - y
    else:
        return x + y

# Vectorize
vec_function = np.vectorize(my_function)

a = np.array([1, 2, 3, 4])
b = np.array([4, 3, 2, 1])

result = vec_function(a, b)
print(result)  # [5 5 1 3]

# Note: np.vectorize is a convenience wrapper around a Python loop, not a speed-up.
# For better performance, prefer built-in NumPy functions
result_fast = np.where(a > b, a - b, a + b)
print(result_fast)  # [5 5 1 3]

Advanced Indexing

# Integer array indexing
arr = np.arange(12).reshape(3, 4)
rows = np.array([0, 0, 2, 2])
cols = np.array([0, 2, 0, 2])
print(arr[rows, cols])  # [0 2 8 10]

# Boolean mask with multiple conditions
arr = np.arange(20)
mask = (arr % 2 == 0) & (arr > 10)
print(arr[mask])  # [12 14 16 18]

# np.where for conditional replacement
arr = np.array([1, 2, 3, 4, 5])
result = np.where(arr > 3, 100, arr)
print(result)  # [1 2 3 100 100]

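# np.select for multiple conditions (the first matching condition wins)
conditions = [arr < 2, arr < 4, arr >= 4]
choices = ['small', 'medium', 'large']
# Pass a string default: the implicit default of 0 may raise a TypeError
# with string choices on NumPy 2.x
result = np.select(conditions, choices, default='unknown')
print(result)  # ['small' 'medium' 'medium' 'large' 'large']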
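
Integer array indexing has a subtlety when used for in-place updates: repeated indices are applied only once. The sketch below contrasts that with np.add.at, which accumulates every occurrence.

```python
import numpy as np

idx = np.array([0, 0, 2])  # index 0 appears twice

# Fancy-indexed += is buffered, so index 0 is incremented only once
counts = np.zeros(4, dtype=int)
counts[idx] += 1
print(counts)  # [1 0 1 0]

# np.add.at performs an unbuffered update and counts each occurrence
counts_at = np.zeros(4, dtype=int)
np.add.at(counts_at, idx, 1)
print(counts_at)  # [2 0 1 0]
```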

Memory Views and Copies

arr = np.array([1, 2, 3, 4, 5])

# View (shares memory)
view = arr[1:4]
view[0] = 999
print(arr)  # [1 999 3 4 5] - original is modified!

# Copy (independent)
copy = arr[1:4].copy()
copy[0] = 111
print(arr)  # [1 999 3 4 5] - original unchanged

# Check if array owns its data
print(arr.flags['OWNDATA'])    # True
print(view.flags['OWNDATA'])   # False
print(copy.flags['OWNDATA'])   # True

# Base attribute
print(view.base is arr)   # True (view references arr)
print(copy.base is None)  # True (copy is independent)
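
Not every indexing expression returns a view. As a quick check, basic slicing shares memory with the original, while integer-array and boolean indexing produce copies; np.shares_memory makes the difference visible.

```python
import numpy as np

arr = np.arange(10)

sliced = arr[2:6]          # basic slicing: view
fancy = arr[[2, 3, 4, 5]]  # integer-array indexing: copy
masked = arr[arr > 5]      # boolean indexing: copy

print(np.shares_memory(arr, sliced))  # True
print(np.shares_memory(arr, fancy))   # False
print(np.shares_memory(arr, masked))  # False

# Writing to the copy leaves the original untouched
fancy[0] = -1
print(arr[2])  # 2
```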

Einstein Summation (einsum)

# Powerful tool for multi-dimensional operations
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Matrix multiplication
C = np.einsum('ij,jk->ik', A, B)
print(C)  # Same as np.dot(A, B)

# Trace
trace = np.einsum('ii->', A)
print(trace)  # 5 (1 + 4)

# Transpose
At = np.einsum('ij->ji', A)
print(At)

# Element-wise multiplication and sum
result = np.einsum('ij,ij->', A, B)
print(result)  # Sum of A * B element-wise
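
The same subscript notation extends to more dimensions. As an illustration with randomly generated inputs (the shapes are arbitrary), here are a batched matrix product and an outer product, each checked against the equivalent NumPy call:

```python
import numpy as np

rng = np.random.default_rng(0)
batch_a = rng.random((4, 2, 3))  # 4 matrices of shape (2, 3)
batch_b = rng.random((4, 3, 5))  # 4 matrices of shape (3, 5)

# Batched matrix multiplication: b is kept, j is summed over
batched = np.einsum('bij,bjk->bik', batch_a, batch_b)
print(batched.shape)                            # (4, 2, 5)
print(np.allclose(batched, batch_a @ batch_b))  # True

# Outer product: no index is repeated, so nothing is summed
u = np.array([1, 2, 3])
v = np.array([10, 20])
outer = np.einsum('i,j->ij', u, v)
print(np.array_equal(outer, np.outer(u, v)))    # True
```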

Best Practices

1. Use Vectorization Instead of Loops

# Bad: Using loops
arr = np.arange(1000)
result = np.zeros(1000)
for i in range(len(arr)):
    result[i] = arr[i] ** 2

# Good: Vectorized operation
result = arr ** 2

2. Specify Data Types

# Default data type: float64 (8 bytes per element)
arr = np.zeros(1000000)  # About 8 MB

# If reduced precision is acceptable, specify a smaller type
arr = np.zeros(1000000, dtype=np.float32)  # About 4 MB, half the memory
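
The memory saving comes with a precision cost. A short sketch that measures both sides of the trade-off, using 2**24 as an example of where float32 stops representing consecutive integers:

```python
import numpy as np

a64 = np.zeros(1_000_000)
a32 = np.zeros(1_000_000, dtype=np.float32)
print(a64.nbytes)  # 8000000
print(a32.nbytes)  # 4000000

# float32 has a 24-bit significand, so 2**24 + 1 is not representable
big32 = np.float32(16_777_216)  # 2**24
lost_in_float32 = bool(big32 + np.float32(1) == big32)
lost_in_float64 = bool(np.float64(16_777_216) + 1 == 16_777_216)
print(lost_in_float32)  # True: the +1 is rounded away
print(lost_in_float64)  # False
```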

3. Use In-Place Operations

arr = np.arange(1000)

# Creates new array
arr = arr + 1

# In-place operation (more memory efficient)
arr += 1
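
In-place operations keep the array's dtype, so they fail when the result does not fit. The sketch below shows that failure for an integer array and, as one possible alternative, the out= argument for writing into a preallocated buffer.

```python
import numpy as np

arr = np.arange(5)  # integer dtype

# Adding a float in place would need a float result, which an int array cannot hold
try:
    arr += 0.5
    in_place_ok = True
except TypeError:  # NumPy's casting error is a TypeError subclass
    in_place_ok = False
print(in_place_ok)  # False

# Convert once, then operate in place
as_float = arr.astype(np.float64)
as_float += 0.5

# out= writes the result into an existing array instead of allocating a new one
buf = np.empty_like(as_float)
np.multiply(as_float, 2, out=buf)
print(buf)  # [1. 3. 5. 7. 9.]
```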

4. Avoid Copying When Possible

# Use views when you don't need independence
large_arr = np.arange(1000000)
subset = large_arr[::2]  # View, not copy

# Only copy when necessary
subset_copy = large_arr[::2].copy()

5. Use Appropriate Functions

# Bad: Manual implementation
arr = np.array([1, 2, 3, 4, 5])
mean = np.sum(arr) / len(arr)

# Good: Built-in function
mean = np.mean(arr)

6. Pre-allocate Arrays

# Bad: Growing array
result = np.array([])
for i in range(1000):
    result = np.append(result, i)

# Good: Pre-allocate
result = np.zeros(1000)
for i in range(1000):
    result[i] = i

# Even better: Use arange
result = np.arange(1000)
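
Pre-allocation needs the final size up front. When the size is not known in advance, one common approach is to collect values in a Python list and convert once at the end. The function name squares_below is hypothetical and used only for illustration.

```python
import numpy as np

def squares_below(limit):
    """Collect perfect squares smaller than limit (hypothetical helper)."""
    values = []  # list.append is cheap; np.append copies the whole array each time
    i = 0
    while i * i < limit:
        values.append(i * i)
        i += 1
    return np.array(values)  # single conversion at the end

result = squares_below(50)
print(result)  # [ 0  1  4  9 16 25 36 49]
```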

7. Use Broadcasting

# Bad: Manual broadcasting
arr = np.array([[1, 2, 3], [4, 5, 6]])
result = np.zeros_like(arr)
for i in range(arr.shape[0]):
    result[i] = arr[i] + np.array([10, 20, 30])

# Good: Automatic broadcasting
result = arr + np.array([10, 20, 30])

8. Profile Your Code

import numpy as np
import time

# Create the input once, outside the timed regions
arr = np.arange(1000000)

# Method 1
start = time.perf_counter()  # High-resolution timer intended for measuring intervals
result = arr ** 2
end = time.perf_counter()
print(f"Method 1: {end - start:.4f} seconds")

# Method 2
start = time.perf_counter()
result = np.power(arr, 2)
end = time.perf_counter()
print(f"Method 2: {end - start:.4f} seconds")
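
A single timed run is noisy. As a sketch of a more stable measurement, the standard-library timeit module can repeat each statement and report the best time; the repeat counts here are arbitrary choices.

```python
import timeit

import numpy as np

arr = np.arange(1_000_000)

# Run each candidate 5 times per measurement, take 3 measurements, keep the fastest
t_operator = min(timeit.repeat(lambda: arr ** 2, number=5, repeat=3))
t_function = min(timeit.repeat(lambda: np.power(arr, 2), number=5, repeat=3))

print(f"arr ** 2:         {t_operator / 5:.6f} s per call")
print(f"np.power(arr, 2): {t_function / 5:.6f} s per call")
```

Taking the minimum rather than the mean reduces the influence of other processes that happen to run during a measurement.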

9. Handle Memory Efficiently

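# For large datasets, use memory-mapped files
# For operations on portions of data, use slicing/views
# Delete intermediate arrays when done
intermediate_result = np.random.rand(1000, 1000)  # Stand-in for a large temporary array
total = intermediate_result.sum()
del intermediate_result  # Removes this reference; memory is freed once no references remain

# Use generators for large data processing
def data_generator(n):
    for _ in range(n):
        yield np.random.rand(1000)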

10. Document Array Shapes

def process_data(X):
    """
    Process input data.

    Parameters
    ----------
    X : ndarray, shape (n_samples, n_features)
        Input data matrix

    Returns
    -------
    result : ndarray, shape (n_samples,)
        Processed output
    """
    # Shape: (n_samples, n_features) -> (n_samples,)
    return np.mean(X, axis=1)

Common Pitfalls and Solutions

Pitfall 1: Modifying Array Through View

# Problem
arr = np.arange(10)
subset = arr[5:]  # View
subset[:] = 0     # Modifies original!
print(arr)        # [0 1 2 3 4 0 0 0 0 0]

# Solution: Use copy
arr = np.arange(10)
subset = arr[5:].copy()
subset[:] = 0
print(arr)  # [0 1 2 3 4 5 6 7 8 9] - unchanged

Pitfall 2: Integer Division

# Problem (Python 2 style)
arr = np.array([1, 2, 3, 4, 5])
result = arr / 2  # Under Python 2 (without `from __future__ import division`), this truncated to integers

# Solution on Python 2: force float division
result = arr / 2.0  # or arr / float(2)
# In Python 3, / always does true division: [0.5 1.  1.5 2.  2.5]
# Use // when floor division is what you want: [0 1 1 2 2]
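
To make the two operators concrete on Python 3, here is a short comparison, including the behavior of floor division on negative values:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

true_div = arr / 2    # always floating point
floor_div = arr // 2  # stays an integer array
print(true_div)   # [0.5 1.  1.5 2.  2.5]
print(floor_div)  # [0 1 1 2 2]

# Floor division rounds toward negative infinity, not toward zero
neg = np.array([-3, 3]) // 2
print(neg)  # [-2  1]
```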

Pitfall 3: Dimension Confusion

# Problem
arr = np.array([1, 2, 3])
print(arr.shape)  # (3,) - 1D array

arr2d = np.array([[1, 2, 3]])
print(arr2d.shape)  # (1, 3) - 2D array with 1 row

# They behave differently in some operations!

# Solution: Be explicit about dimensions
arr_column = arr.reshape(-1, 1)  # (3, 1)
arr_row = arr.reshape(1, -1)     # (1, 3)
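
A concrete case where the difference matters is broadcasting: adding a 1D array to a column version of itself produces a 2D result rather than an element-wise sum. A quick sketch:

```python
import numpy as np

arr = np.array([1, 2, 3])
col = arr.reshape(-1, 1)

print((arr + arr).shape)  # (3,)   element-wise
print((arr + col).shape)  # (3, 3) broadcast to a full grid

# Transposing a 1D array does nothing; add an axis explicitly instead
print(arr.T.shape)               # (3,)
print(arr[:, np.newaxis].shape)  # (3, 1)
```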

Useful Resources

Official Documentation: https://numpy.org/doc/
NumPy User Guide: https://numpy.org/doc/stable/user/index.html
NumPy Reference: https://numpy.org/doc/stable/reference/
NumPy Tutorial: https://numpy.org/numpy-tutorials/
Performance Tips: https://numpy.org/doc/stable/user/performance.html

Summary

NumPy is the foundation of scientific computing in Python, providing:

Efficient multi-dimensional arrays
Broadcasting for implicit operations on arrays of different shapes
Comprehensive mathematical functions
Linear algebra operations
Random number generation
File I/O capabilities
Integration with other scientific libraries

Master NumPy to unlock the full potential of Python for data science, machine learning, and scientific computing!

Quick Reference Card

# Array Creation
np.array([1,2,3])              # From list
np.zeros((3,4))                # Array of zeros
np.ones((2,3))                 # Array of ones
np.arange(10)                  # Range of values
np.linspace(0,1,5)             # Evenly spaced values

# Array Info
arr.shape                      # Dimensions
arr.dtype                      # Data type
arr.size                       # Number of elements
arr.ndim                       # Number of dimensions

# Indexing
arr[i]                         # 1D indexing
arr[i,j]                       # 2D indexing
arr[i:j]                       # Slicing
arr[arr > 5]                   # Boolean indexing

# Operations
arr + 5                        # Element-wise addition
arr * arr2                     # Element-wise multiplication
arr @ arr2                     # Matrix multiplication
np.dot(arr, arr2)              # Dot product

# Aggregations
np.sum(arr)                    # Sum
np.mean(arr)                   # Mean
np.std(arr)                    # Standard deviation
np.max(arr)                    # Maximum
np.argmax(arr)                 # Index of maximum

# Reshaping
arr.reshape(3,4)               # Reshape
arr.flatten()                  # Flatten to 1D
arr.T                          # Transpose

# Random
np.random.rand(3,4)            # Random values [0,1)
np.random.randn(100)           # Standard normal
np.random.randint(0,10,5)      # Random integers

# Linear Algebra
np.linalg.inv(A)               # Matrix inverse
np.linalg.det(A)               # Determinant
np.linalg.eig(A)               # Eigenvalues/vectors
np.linalg.solve(A,b)           # Solve Ax=b

Comprehensive NumPy Cheatsheet

📦 Import Convention

import numpy as np

🎯 Array Creation

From Existing Data

np.array([1, 2, 3])                    # 1D array from list
np.array([[1,2], [3,4]])               # 2D array from nested lists
np.asarray([1, 2, 3])                  # Convert to array (no copy if already array)
np.copy(arr)                           # Create a copy
np.frombuffer(b'\x01\x02', dtype=np.uint8)  # From buffer (size must be a multiple of the item size)
np.fromiter(range(5), dtype=int)       # From iterable

Zeros, Ones, and Empty

np.zeros(5)                            # [0. 0. 0. 0. 0.]
np.zeros((3, 4))                       # 3x4 array of zeros
np.ones((2, 3, 4))                     # 2x3x4 array of ones
np.empty((2, 2))                       # Uninitialized 2x2 array
np.zeros_like(arr)                     # Zeros with same shape as arr
np.ones_like(arr)                      # Ones with same shape as arr
np.empty_like(arr)                     # Empty with same shape as arr
np.full((3, 3), 7)                     # 3x3 array filled with 7
np.full_like(arr, 5)                   # Like arr, filled with 5

Ranges and Sequences

np.arange(10)                          # [0 1 2 ... 9]
np.arange(2, 10)                       # [2 3 4 ... 9]
np.arange(0, 1, 0.1)                   # [0. 0.1 0.2 ... 0.9]
np.linspace(0, 10, 5)                  # 5 evenly spaced values
np.logspace(0, 2, 5)                   # [1. 3.16... 10. 31.6... 100.]
np.geomspace(1, 1000, 4)               # Geometric sequence

Identity and Diagonal

np.eye(3)                              # 3x3 identity matrix
np.eye(3, 4)                           # 3x4 matrix with ones on the main diagonal
np.identity(3)                         # 3x3 identity matrix
np.diag([1, 2, 3])                     # Diagonal matrix
np.diag(arr)                           # Extract diagonal
np.diagflat([1, 2])                    # Create diagonal array

Random Arrays

np.random.rand(3, 4)                   # Uniform [0, 1), shape (3,4)
np.random.randn(3, 4)                  # Standard normal, shape (3,4)
np.random.randint(0, 10, (3, 4))       # Random ints [0, 10)
np.random.random((3, 4))               # Random floats [0, 1)
np.random.uniform(0, 10, (3, 4))       # Uniform [0, 10)
np.random.normal(0, 1, (3, 4))         # Normal (μ=0, σ=1)
np.random.choice([1,2,3,4], 10)        # Random choices
np.random.permutation(10)              # Random permutation
np.random.shuffle(arr)                 # Shuffle in place

📊 Array Attributes

arr.shape                              # Dimensions (rows, cols, ...)
arr.ndim                               # Number of dimensions
arr.size                               # Total number of elements
arr.dtype                              # Data type
arr.itemsize                           # Size of each element (bytes)
arr.nbytes                             # Total bytes (size * itemsize)
arr.T                                  # Transpose
arr.real                               # Real part (complex arrays)
arr.imag                               # Imaginary part
arr.flat                               # Flat iterator
arr.flags                              # Memory layout info

🎯 Data Types

np.int8, np.int16, np.int32, np.int64  # Signed integers
np.uint8, np.uint16, np.uint32, np.uint64  # Unsigned integers
np.float16, np.float32, np.float64     # Floating point
np.complex64, np.complex128            # Complex numbers
np.bool_                               # Boolean
np.object_                             # Python objects
np.bytes_, np.str_                     # Byte and Unicode strings (np.string_ and np.unicode_ were removed in NumPy 2.0)

# Convert types
arr.astype(np.float32)                 # Convert to float32
arr.astype('int')                      # Convert to int

🔍 Indexing & Slicing

Basic Indexing

arr[0]                                 # First element
arr[-1]                                # Last element
arr[2:5]                               # Elements 2, 3, 4
arr[::2]                               # Every other element
arr[::-1]                              # Reverse
arr[1:8:2]                             # Start:stop:step

Multi-dimensional Indexing

arr[i, j]                              # Element at row i, col j
arr[i]                                 # Row i
arr[:, j]                              # Column j
arr[0:2, 1:3]                          # Subarray
arr[..., 0]                            # Index 0 along the last axis (any number of dimensions)
arr[:, :, 0]                           # Same as above for 3D

Boolean Indexing

arr[arr > 5]                           # Elements > 5
arr[(arr > 5) & (arr < 10)]            # Elements 5 < x < 10
arr[(arr < 5) | (arr > 10)]            # Elements x < 5 or x > 10
arr[~(arr > 5)]                        # Elements <= 5 (NOT operator)

Fancy Indexing

arr[[0, 2, 4]]                         # Elements at indices 0, 2, 4
arr[[0, 1], [2, 3]]                    # Elements (0,2) and (1,3)
arr[np.ix_([0,2], [1,3])]              # Outer indexing
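
The last two forms are easy to confuse. On a small 4x4 example, paired index lists pick individual elements, while np.ix_ selects the full block where the given rows and columns intersect:

```python
import numpy as np

arr = np.arange(16).reshape(4, 4)

# Paired lists: one element per (row, col) pair
paired = arr[[0, 1], [2, 3]]
print(paired)  # [2 7]

# np.ix_: every combination of rows 0, 2 with columns 1, 3
block = arr[np.ix_([0, 2], [1, 3])]
print(block)
# [[ 1  3]
#  [ 9 11]]
```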

➕ Mathematical Operations

Arithmetic

arr + 5                                # Add scalar
arr - 5                                # Subtract scalar
arr * 5                                # Multiply by scalar
arr / 5                                # Divide by scalar
arr // 5                               # Floor division
arr % 5                                # Modulo
arr ** 2                               # Power
np.add(arr1, arr2)                     # Element-wise addition
np.subtract(arr1, arr2)                # Element-wise subtraction
np.multiply(arr1, arr2)                # Element-wise multiplication
np.divide(arr1, arr2)                  # Element-wise division
np.power(arr, 2)                       # Element-wise power
np.sqrt(arr)                           # Square root
np.square(arr)                         # Square
np.exp(arr)                            # e^x
np.log(arr)                            # Natural log
np.log10(arr)                          # Log base 10
np.log2(arr)                           # Log base 2

Trigonometric

np.sin(arr)                            # Sine
np.cos(arr)                            # Cosine
np.tan(arr)                            # Tangent
np.arcsin(arr)                         # Inverse sine
np.arccos(arr)                         # Inverse cosine
np.arctan(arr)                         # Inverse tangent
np.arctan2(y, x)                       # Atan2(y, x)
np.sinh(arr)                           # Hyperbolic sine
np.cosh(arr)                           # Hyperbolic cosine
np.tanh(arr)                           # Hyperbolic tangent
np.deg2rad(arr)                        # Degrees to radians
np.rad2deg(arr)                        # Radians to degrees

Rounding

np.round(arr)                          # Round to nearest (ties go to the even value)
np.round(arr, 2)                       # Round to 2 decimals
np.floor(arr)                          # Round down
np.ceil(arr)                           # Round up
np.trunc(arr)                          # Truncate
np.rint(arr)                           # Round to nearest int
np.fix(arr)                            # Round towards zero
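
These functions differ mainly on halves and negative numbers. A compact comparison on a few sample values (signed zeros compare equal to 0):

```python
import numpy as np

vals = np.array([0.5, 1.5, 2.5, -0.5, -1.5])

rounded = np.round(vals)  # ties go to the nearest even value: 0, 2, 2, -0, -2
floored = np.floor(vals)  # toward negative infinity: 0, 1, 2, -1, -2
ceiled = np.ceil(vals)    # toward positive infinity: 1, 2, 3, -0, -1
truncated = np.trunc(vals)  # toward zero: 0, 1, 2, -0, -1

for name, out in [("round", rounded), ("floor", floored),
                  ("ceil", ceiled), ("trunc", truncated)]:
    print(name, out)
```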

Comparison

arr == 5                               # Equal to
arr != 5                               # Not equal to
arr > 5                                # Greater than
arr < 5                                # Less than
arr >= 5                               # Greater or equal
arr <= 5                               # Less or equal
np.equal(arr1, arr2)                   # Element-wise ==
np.not_equal(arr1, arr2)               # Element-wise !=
np.greater(arr1, arr2)                 # Element-wise >
np.less(arr1, arr2)                    # Element-wise <
np.allclose(arr1, arr2)                # All close (tolerance)
np.isclose(arr1, arr2)                 # Element-wise close

Python
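
Exact comparison with `==` is often too strict for floating-point data. The sketch below (arrays `a` and `b` are illustrative) shows where `np.isclose` and `np.allclose` help:

```python
import numpy as np

# 0.1 + 0.2 is not exactly 0.3 in binary floating point
a = np.array([0.1 + 0.2, 1.0])
b = np.array([0.3, 1.0])

exact = a == b                 # element-wise exact comparison
close = np.isclose(a, b)       # element-wise comparison within tolerance
all_close = np.allclose(a, b)  # single bool: are all elements close?

print(exact)      # [False  True]
print(close)      # [ True  True]
print(all_close)  # True
```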

📈 Aggregate Functions

Basic Aggregations

np.sum(arr)                            # Sum all elements
np.sum(arr, axis=0)                    # Sum along axis 0
np.sum(arr, axis=1)                    # Sum along axis 1
np.prod(arr)                           # Product of all elements
np.cumsum(arr)                         # Cumulative sum
np.cumprod(arr)                        # Cumulative product
np.diff(arr)                           # Differences between consecutive

Python
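
The `axis` argument names the axis that is collapsed. A small sketch on an illustrative 2×3 matrix `m`:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

total = np.sum(m)             # all elements
col_sums = np.sum(m, axis=0)  # collapse rows -> one value per column
row_sums = np.sum(m, axis=1)  # collapse columns -> one value per row
running = np.cumsum(m)        # cumulative sum over the flattened array
steps = np.diff(m)            # differences along the last axis

print(total)     # 21
print(col_sums)  # [5 7 9]
print(row_sums)  # [ 6 15]
print(running)   # [ 1  3  6 10 15 21]
print(steps)     # [[1 1]
                 #  [1 1]]
```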

Statistics

np.mean(arr)                           # Mean
np.median(arr)                         # Median
np.average(arr)                        # Average
np.average(arr, weights=w)             # Weighted average
np.std(arr)                            # Standard deviation
np.var(arr)                            # Variance
np.min(arr)                            # Minimum
np.max(arr)                            # Maximum
np.ptp(arr)                            # Peak to peak (max - min)
np.percentile(arr, 50)                 # 50th percentile
np.quantile(arr, 0.5)                  # 0.5 quantile

Python

Indices of Extrema

np.argmin(arr)                         # Index of minimum
np.argmax(arr)                         # Index of maximum
np.argmin(arr, axis=0)                 # Indices along axis
np.argmax(arr, axis=1)                 # Indices along axis
np.nanargmin(arr)                      # Ignore NaN
np.nanargmax(arr)                      # Ignore NaN

Python

Logical Operations

np.all(arr)                            # True if all True
np.any(arr)                            # True if any True
np.all(arr > 0)                        # Check condition
np.any(arr > 0)                        # Check condition

Python

🔄 Array Manipulation

Reshaping

arr.reshape(3, 4)                      # Reshape to 3x4
arr.reshape(-1, 1)                     # Column vector
arr.reshape(1, -1)                     # Row vector
arr.reshape(2, -1)                     # Auto-calculate columns
arr.flatten()                          # Flatten to 1D (copy)
arr.ravel()                            # Flatten to 1D (view when possible)
arr.squeeze()                          # Remove single dimensions
np.expand_dims(arr, axis=0)            # Add dimension at axis

Python
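
The copy/view distinction between `flatten()` and `ravel()` matters when the result is modified. A minimal sketch, with the array `arr` chosen only for illustration:

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)

flat_copy = arr.flatten()  # always a copy
flat_view = arr.ravel()    # a view here, because arr is C-contiguous

flat_view[0] = 99          # writes through to arr
flat_copy[1] = -1          # does not affect arr

view_shares = np.shares_memory(arr, flat_view)
copy_shares = np.shares_memory(arr, flat_copy)

# ravel() falls back to a copy when no view is possible,
# for example on a transposed (non C-contiguous) array
transposed_flat = arr.T.ravel()
transposed_shares = np.shares_memory(arr, transposed_flat)

print(arr[0, 0], arr[0, 1])                         # 99 1
print(view_shares, copy_shares, transposed_shares)  # True False False
```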

Transposing

arr.T                                  # Transpose
np.transpose(arr)                      # Transpose
np.transpose(arr, (2, 0, 1))           # Permute axes
np.swapaxes(arr, 0, 1)                 # Swap two axes
np.moveaxis(arr, 0, -1)                # Move axis

Python

Joining Arrays

np.concatenate([arr1, arr2])           # Concatenate along axis 0
np.concatenate([arr1, arr2], axis=1)   # Along axis 1
np.vstack([arr1, arr2])                # Vertical stack (rows)
np.hstack([arr1, arr2])                # Horizontal stack (cols)
np.dstack([arr1, arr2])                # Depth stack
np.stack([arr1, arr2])                 # Stack along new axis
np.stack([arr1, arr2], axis=1)         # Stack along axis 1
np.column_stack([arr1, arr2])          # Stack as columns
np.row_stack([arr1, arr2])             # Alias of vstack (deprecated in NumPy 2.0)

Python
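
The joining functions are easiest to tell apart by the shape they produce. A sketch using two illustrative 2×3 arrays `a` and `b`:

```python
import numpy as np

a = np.zeros((2, 3))
b = np.ones((2, 3))

# concatenate joins along an existing axis
rows = np.concatenate([a, b])          # shape (4, 3)
cols = np.concatenate([a, b], axis=1)  # shape (2, 6)

# stack inserts a new axis
stacked_first = np.stack([a, b])       # shape (2, 2, 3)
stacked_last = np.stack([a, b], axis=-1)  # shape (2, 3, 2)

# dstack joins along the third axis, like stacking on the last axis for 2D input
depth = np.dstack([a, b])              # shape (2, 3, 2)

for name, result in [("rows", rows), ("cols", cols),
                     ("stacked_first", stacked_first),
                     ("stacked_last", stacked_last), ("depth", depth)]:
    print(name, result.shape)
```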

Splitting Arrays

np.split(arr, 3)                       # Split into 3 equal parts
np.split(arr, [3, 5])                  # Split at indices 3, 5
np.vsplit(arr, 2)                      # Vertical split (rows)
np.hsplit(arr, 2)                      # Horizontal split (cols)
np.dsplit(arr, 2)                      # Depth split
np.array_split(arr, 3)                 # Split (unequal allowed)

Python
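
`np.split` insists on equal parts, while `np.array_split` tolerates a remainder. A short sketch with an illustrative 10-element array:

```python
import numpy as np

arr = np.arange(10)

# 10 elements cannot be divided into 3 equal parts
try:
    np.split(arr, 3)
    split_failed = False
except ValueError:
    split_failed = True

uneven = np.array_split(arr, 3)   # sizes 4, 3, 3
at_indices = np.split(arr, [3, 5])  # arr[:3], arr[3:5], arr[5:]

print(split_failed)                  # True
print([len(p) for p in uneven])      # [4, 3, 3]
print([p.tolist() for p in at_indices])
```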

Adding/Removing Elements

np.append(arr, [7, 8, 9])              # Append elements (copy)
np.insert(arr, 3, [99])                # Insert at index
np.delete(arr, [1, 3])                 # Delete at indices
np.resize(arr, (4, 4))                 # Resize (repeats if needed)
np.pad(arr, 2, mode='constant')        # Pad with zeros
np.pad(arr, 2, mode='edge')            # Pad with edge values

Python

Repeating Elements

np.repeat(arr, 3)                      # Repeat each element 3 times
np.repeat(arr, 3, axis=0)              # Repeat along axis
np.tile(arr, 3)                        # Tile array 3 times
np.tile(arr, (2, 3))                   # Tile in 2D

Python

🧮 Linear Algebra

Matrix Products

np.dot(A, B)                           # Matrix multiplication
A @ B                                  # Matrix multiplication (Python 3.5+)
np.matmul(A, B)                        # Matrix multiplication
np.inner(a, b)                         # Inner product
np.outer(a, b)                         # Outer product
np.tensordot(A, B, axes=1)             # Tensor dot product
np.einsum('ij,jk->ik', A, B)           # Einstein summation
np.kron(A, B)                          # Kronecker product

Python

Matrix Properties

np.trace(A)                            # Trace (sum of diagonal)
np.linalg.det(A)                       # Determinant
np.linalg.matrix_rank(A)               # Rank
np.linalg.norm(A)                      # Frobenius norm
np.linalg.norm(A, ord=2)               # 2-norm (spectral)
np.linalg.norm(A, ord='fro')           # Frobenius norm
np.linalg.cond(A)                      # Condition number

Python

Matrix Decomposition

np.linalg.inv(A)                       # Matrix inverse
np.linalg.pinv(A)                      # Pseudo-inverse (Moore-Penrose)
np.linalg.eig(A)                       # Eigenvalues & eigenvectors
np.linalg.eigvals(A)                   # Eigenvalues only
np.linalg.eigh(A)                      # Hermitian/symmetric eigendecomp
np.linalg.svd(A)                       # Singular value decomposition
np.linalg.qr(A)                        # QR decomposition
np.linalg.cholesky(A)                  # Cholesky decomposition

Python

Solving Systems

np.linalg.solve(A, b)                  # Solve Ax = b
np.linalg.lstsq(A, b, rcond=None)      # Least squares solution

Python
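
As a worked example of the two solvers, the sketch below solves a small square system and then fits a line by least squares. The matrices and data points are invented for illustration:

```python
import numpy as np

# Square system: 3x + y = 9, x + 2y = 8
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)  # exact solution, here x=2, y=3

# Overdetermined system: fit y = slope * t + intercept to four points
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])
design = np.column_stack([t, np.ones_like(t)])
coeffs, residuals, rank, singular_values = np.linalg.lstsq(design, y, rcond=None)
slope, intercept = coeffs

print(x)                 # approximately [2. 3.]
print(slope, intercept)  # approximately 2.0 1.0
```

`solve` is generally preferable to computing `np.linalg.inv(A) @ b`, since it avoids forming the inverse explicitly.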

📉 Statistical Functions

Descriptive Statistics

np.mean(arr)                           # Arithmetic mean
np.median(arr)                         # Median
np.std(arr)                            # Standard deviation
np.std(arr, ddof=1)                    # Sample std (N-1)
np.var(arr)                            # Variance
np.var(arr, ddof=1)                    # Sample variance
np.nanmean(arr)                        # Mean (ignore NaN)
np.nanmedian(arr)                      # Median (ignore NaN)
np.nanstd(arr)                         # Std (ignore NaN)
np.nanvar(arr)                         # Var (ignore NaN)

Python
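
The `ddof` argument and the `nan*` variants are easy to overlook. A minimal sketch on made-up data:

```python
import numpy as np

data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

pop_std = np.std(data)             # divides by N
sample_var = np.var(data, ddof=1)  # divides by N - 1

with_nan = np.array([1.0, np.nan, 3.0])
plain_mean = np.mean(with_nan)     # NaN propagates
nan_mean = np.nanmean(with_nan)    # NaN is skipped

print(pop_std)     # 2.0
print(sample_var)  # 32 / 7, about 4.571
print(plain_mean)  # nan
print(nan_mean)    # 2.0
```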

Correlation

np.corrcoef(x, y)                      # Correlation coefficient matrix
np.cov(x, y)                           # Covariance matrix
np.correlate(x, y)                     # Cross-correlation

Python

Histograms

np.histogram(arr, bins=10)             # Histogram
np.histogram2d(x, y, bins=10)          # 2D histogram
np.bincount(arr)                       # Count occurrences
np.digitize(arr, bins)                 # Bin indices

Python

🎲 Random Sampling

Distributions

np.random.random(10)                   # Uniform [0, 1)
np.random.rand(3, 4)                   # Uniform [0, 1), shape (3,4)
np.random.randn(3, 4)                  # Standard normal
np.random.randint(0, 10, 5)            # Random integers [0, 10)
np.random.uniform(0, 10, 5)            # Uniform [0, 10)
np.random.normal(5, 2, 100)            # Normal(μ=5, σ=2)
np.random.binomial(10, 0.5, 100)       # Binomial(n=10, p=0.5)
np.random.poisson(5, 100)              # Poisson(λ=5)
np.random.exponential(2, 100)          # Exponential(scale=2)
np.random.gamma(2, 2, 100)             # Gamma(shape=2, scale=2)
np.random.beta(2, 5, 100)              # Beta(α=2, β=5)
np.random.chisquare(2, 100)            # Chi-square(df=2)

Python

Sampling

np.random.choice([1,2,3,4,5], 10)      # Random choices
np.random.choice(arr, 5, replace=False) # Sample without replacement
np.random.choice(arr, 5, p=probs)      # Weighted sampling
np.random.shuffle(arr)                 # Shuffle in place
np.random.permutation(arr)             # Random permutation (copy)

Python

Random Generator (Recommended)

from numpy.random import default_rng
rng = default_rng(42)                  # Create generator with seed
rng.random(10)                         # Random floats
rng.integers(0, 10, 5)                 # Random integers
rng.normal(0, 1, 100)                  # Normal distribution
rng.choice([1,2,3,4,5], 10)            # Random choices

Python
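
One practical reason to prefer a `Generator` is that each instance carries its own state, so seeding is local rather than global. A sketch (the variable names are illustrative, and no specific random values are assumed):

```python
import numpy as np
from numpy.random import default_rng

# Two generators with the same seed produce the same stream
rng_a = default_rng(42)
rng_b = default_rng(42)
sample_a = rng_a.random(3)
sample_b = rng_b.random(3)
same_stream = np.array_equal(sample_a, sample_b)

# integers() excludes the upper bound unless endpoint=True
ints = rng_a.integers(0, 10, 5)
dice = rng_a.integers(1, 6, 5, endpoint=True)

# Sampling without replacement yields distinct items
hand = rng_a.choice(np.arange(10), 4, replace=False)

print(same_stream)  # True
print(ints, dice, hand)
```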

💾 File I/O

Text Files

np.savetxt('data.txt', arr)            # Save to text
np.savetxt('data.csv', arr, delimiter=',')  # Save as CSV
np.savetxt('data.txt', arr, fmt='%.2f')     # Format specifier
np.loadtxt('data.txt')                 # Load from text
np.loadtxt('data.csv', delimiter=',')  # Load CSV
np.loadtxt('data.txt', skiprows=1)     # Skip header
np.genfromtxt('data.csv', delimiter=',')    # More flexible
np.genfromtxt('data.csv', delimiter=',', names=True)  # With column names

Python

Binary Files

np.save('arr.npy', arr)                # Save single array
np.load('arr.npy')                     # Load single array
np.savez('arrays.npz', a=arr1, b=arr2) # Save multiple arrays
np.savez_compressed('arr.npz', a=arr1) # Compressed
data = np.load('arrays.npz')           # Load multiple
data['a']                              # Access by name

Python
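
A round trip through an `.npz` archive might look like the sketch below. It writes to a temporary directory so nothing is left behind; the array names `a` and `b` mirror the listing above, while the paths are made up:

```python
import os
import tempfile

import numpy as np

arr1 = np.arange(6).reshape(2, 3)
arr2 = np.linspace(0.0, 1.0, 5)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "arrays.npz")
    np.savez(path, a=arr1, b=arr2)

    # np.load on an .npz file returns a lazy, dict-like object;
    # using it as a context manager closes the file afterwards
    with np.load(path) as data:
        names = sorted(data.files)
        loaded_a = data["a"]
        loaded_b = data["b"]

print(names)  # ['a', 'b']
print(np.array_equal(loaded_a, arr1), np.array_equal(loaded_b, arr2))
```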

Memory-Mapped Files

# Create memory-mapped file
mm = np.memmap('data.dat', dtype='float32',
               mode='w+', shape=(1000, 1000))
mm[:] = np.random.rand(1000, 1000)
mm.flush()

# Load memory-mapped file
mm = np.memmap('data.dat', dtype='float32',
               mode='r', shape=(1000, 1000))

Python

🔧 Utility Functions

Array Testing

np.isnan(arr)                          # Check for NaN
np.isinf(arr)                          # Check for infinity
np.isfinite(arr)                       # Check for finite
np.isreal(arr)                         # Check for real
np.iscomplex(arr)                      # Check for complex

Python

Array Comparison

np.array_equal(arr1, arr2)             # True if identical
np.array_equiv(arr1, arr2)             # True if broadcastable & equal
np.allclose(arr1, arr2)                # True if close (tolerance)
np.allclose(arr1, arr2, rtol=1e-5)     # Relative tolerance
np.allclose(arr1, arr2, atol=1e-8)     # Absolute tolerance

Python

Sorting

np.sort(arr)                           # Sort (returns copy)
arr.sort()                             # Sort in place
np.argsort(arr)                        # Indices that would sort
np.sort(arr, axis=0)                   # Sort along axis
np.lexsort((arr1, arr2))               # Sort by multiple keys
np.partition(arr, 3)                   # Partial sort (index 3 in sorted position)
np.argpartition(arr, 3)                # Indices of partial sort

Python
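
`argsort` and `partition` are often used together for ranking and top-k selection. A sketch on an illustrative array:

```python
import numpy as np

arr = np.array([30, 10, 50, 20, 40])

order = np.argsort(arr)  # indices that sort arr
sorted_arr = arr[order]

# partition(arr, k) puts the element that belongs at index k in place;
# everything before it is <= and everything after it is >=
k = 2
part = np.partition(arr, k)

# Indices of the two largest values (their relative order is not guaranteed)
top2 = np.argpartition(arr, -2)[-2:]

print(order)       # [1 3 0 4 2]
print(sorted_arr)  # [10 20 30 40 50]
print(part[k])     # 30
print(sorted(arr[top2].tolist()))  # [40, 50]
```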

Searching

np.where(arr > 5)                      # Indices where condition
np.where(arr > 5, x, y)                # x if condition else y
np.argwhere(arr > 5)                   # Indices (2D format)
np.nonzero(arr)                        # Indices of non-zero
np.flatnonzero(arr)                    # Flat indices of non-zero
np.searchsorted(arr, 5)                # Index to insert 5
np.extract(arr > 5, arr)               # Extract elements

Python
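
The one-argument and three-argument forms of `np.where` behave quite differently. A sketch with made-up values:

```python
import numpy as np

arr = np.array([3, 8, 1, 9, 5])

# One argument: a tuple of index arrays, one per dimension
idx = np.where(arr > 4)[0]

# Three arguments: element-wise selection
kept = np.where(arr > 4, arr, 0)

# argwhere returns one row of coordinates per match
coords = np.argwhere(arr > 4)

# searchsorted expects sorted input
sorted_arr = np.sort(arr)
insert_at = np.searchsorted(sorted_arr, 5)

print(idx)           # [1 3 4]
print(kept)          # [0 8 0 9 5]
print(coords.shape)  # (3, 1)
print(insert_at)     # 2
```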

Set Operations

np.unique(arr)                         # Unique elements (sorted)
np.unique(arr, return_counts=True)     # With counts
np.unique(arr, return_index=True)      # With first indices
np.isin(arr1, arr2)                    # Test membership (np.in1d is deprecated)
np.intersect1d(arr1, arr2)             # Intersection
np.union1d(arr1, arr2)                 # Union
np.setdiff1d(arr1, arr2)               # Set difference
np.setxor1d(arr1, arr2)                # Symmetric difference

Python
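
A short sketch of the set helpers on illustrative arrays; it uses `np.isin` for the membership test:

```python
import numpy as np

arr = np.array([3, 1, 2, 3, 1, 3])

values, counts = np.unique(arr, return_counts=True)
most_common = values[np.argmax(counts)]

left = np.array([1, 2, 3, 4])
right = np.array([3, 4, 5])

member = np.isin(left, right)            # element-wise membership
common = np.intersect1d(left, right)
only_left = np.setdiff1d(left, right)
either_not_both = np.setxor1d(left, right)

print(values, counts)   # [1 2 3] [2 1 3]
print(most_common)      # 3
print(member)           # [False False  True  True]
print(common, only_left, either_not_both)  # [3 4] [1 2] [1 2 5]
```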

Miscellaneous

np.clip(arr, 0, 10)                    # Clip values to [0, 10]
np.piecewise(x, [x<0, x>=0], [lambda x: 0, lambda x: x])  # Piecewise
np.select([cond1, cond2], [val1, val2]) # Select based on conditions
np.where(condition, x, y)              # Ternary operator
np.choose(indices, [arr1, arr2, arr3]) # Choose from list
np.vectorize(func)                     # Vectorize function
np.apply_along_axis(func, 0, arr)      # Apply function along axis
np.apply_over_axes(func, arr, [0,1])   # Apply over multiple axes

Python

🎭 Advanced Indexing

Mesh Grids

x = np.linspace(0, 5, 5)
y = np.linspace(0, 3, 3)
X, Y = np.meshgrid(x, y)               # 2D coordinate matrices, shape (3, 5)
X, Y = np.mgrid[0:5:5j, 0:3:3j]        # Using mgrid ('ij' order, shape (5, 3))
X, Y = np.ogrid[0:5:5j, 0:3:3j]        # Open grid, shapes (5, 1) and (1, 3)

Python
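
The three grid helpers return different shapes for the same inputs, which is a common source of axis mix-ups. A sketch that checks them, reusing the `x` and `y` ranges from the listing:

```python
import numpy as np

x = np.linspace(0, 5, 5)
y = np.linspace(0, 3, 3)

# Default 'xy' indexing: rows follow y, columns follow x
X_xy, Y_xy = np.meshgrid(x, y)

# 'ij' indexing: first axis follows x, matching mgrid
X_ij, Y_ij = np.meshgrid(x, y, indexing="ij")
X_m, Y_m = np.mgrid[0:5:5j, 0:3:3j]

# ogrid returns broadcastable arrays instead of full grids
X_o, Y_o = np.ogrid[0:5:5j, 0:3:3j]

print(X_xy.shape)            # (3, 5)
print(X_ij.shape, X_m.shape) # (5, 3) (5, 3)
print(X_o.shape, Y_o.shape)  # (5, 1) (1, 3)
print((X_o + Y_o).shape)     # (5, 3)
```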

Index Tricks

np.ix_([0, 1], [2, 3])                 # Index mesh for fancy indexing
np.r_[1:4, 0, 4]                       # Concatenate slices
np.c_[arr1, arr2]                      # Column stack shortcut
np.s_[::2]                             # Slice object
np.indices((3, 3))                     # Index arrays
np.unravel_index(7, (3, 3))            # Convert flat index to coords
np.ravel_multi_index([[0,1], [1,2]], (3,3))  # Coords to flat

Python

🧪 Special Arrays

Structured Arrays

# Define dtype
dt = np.dtype([('name', 'U10'), ('age', 'i4'), ('weight', 'f4')])

# Create structured array
arr = np.array([('Alice', 25, 55.5), 
                ('Bob', 30, 70.0)], dtype=dt)

arr['name']                            # Access by field name
arr[0]                                 # Access by row
arr[['name', 'age']]                   # Multiple fields

Python

Masked Arrays

import numpy.ma as ma

# Create masked array
data = np.array([1, 2, -999, 4, -999, 6])
masked = ma.masked_equal(data, -999)

# Operations ignore masked values
masked.mean()                          # 3.25
masked.sum()                           # 13

# Manual masking
mask = [False, False, True, False, True, False]
masked = ma.array(data, mask=mask)

Python

Character Arrays

np.char.add(['Hello'], [' World'])     # String concatenation
np.char.multiply('Ha', 3)              # 'HaHaHa'
np.char.upper(['hello', 'world'])      # Uppercase
np.char.lower(['HELLO', 'WORLD'])      # Lowercase
np.char.strip(['  hello  '])           # Strip whitespace
np.char.replace('hello', 'l', 'L')     # Replace
np.char.split('hello world')           # Split
np.char.join('-', ['hello', 'world'])  # Join characters: 'h-e-l-l-o', 'w-o-r-l-d'

Python

⚡ Performance Tips

Vectorization

# Bad: Loop
result = np.zeros(len(arr))
for i in range(len(arr)):
    result[i] = arr[i] ** 2

# Good: Vectorized
result = arr ** 2

Python

Broadcasting

# Bad: Explicit loop
for i in range(arr.shape[0]):
    arr[i] += vector

# Good: Broadcasting
arr += vector

Python
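
Broadcasting aligns shapes from the trailing axis, so a per-row vector and a per-column vector need different handling. A sketch with illustrative shapes:

```python
import numpy as np

grid = np.zeros((3, 4))
per_column = np.array([1.0, 2.0, 3.0, 4.0])  # length matches the last axis
per_row = np.array([10.0, 20.0, 30.0])       # length matches the first axis

# Shape (4,) broadcasts against (3, 4) directly
added_cols = grid + per_column

# Shape (3,) does not; add a trailing axis to make it (3, 1)
try:
    grid + per_row
    mismatch = False
except ValueError:
    mismatch = True
added_rows = grid + per_row[:, None]

print(added_cols[0])     # [1. 2. 3. 4.]
print(mismatch)          # True
print(added_rows[:, 0])  # [10. 20. 30.]
```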

In-Place Operations

arr += 1                               # In-place (no copy)
arr = arr + 1                          # Creates new array
np.add(arr, 1, out=arr)                # Explicit in-place

Python

Memory Views vs Copies

view = arr[::2]                        # View (no copy)
copy = arr[::2].copy()                 # Explicit copy
arr.base is None                       # True if owns data
view.base is arr                       # True if view of arr

Python

🎓 Common Patterns

Normalize Array

# Z-score normalization
normalized = (arr - arr.mean()) / arr.std()

# Min-max normalization
normalized = (arr - arr.min()) / (arr.max() - arr.min())

Python

Distance Matrix

from scipy.spatial.distance import cdist
X = np.random.rand(100, 2)
dist = cdist(X, X)                     # Pairwise Euclidean distances (SciPy)
# Or manually, with broadcasting:
dist = np.sqrt(((X[:, None] - X) ** 2).sum(axis=2))

Python

One-Hot Encoding

labels = np.array([0, 1, 2, 1, 0])
n_classes = 3
one_hot = np.eye(n_classes)[labels]

Python

Moving Average

window = 3
weights = np.ones(window) / window
moving_avg = np.convolve(arr, weights, mode='valid')

Python
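
For a concrete check of the convolution approach, the sketch below runs it on a small made-up series and compares it with a cumulative-sum variant. The cumulative-sum version is an alternative added here for comparison, not part of the original recipe:

```python
import numpy as np

arr = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
window = 3

# Convolution with uniform weights; 'valid' keeps only full windows
weights = np.ones(window) / window
ma_conv = np.convolve(arr, weights, mode="valid")

# Alternative: difference of cumulative sums
csum = np.cumsum(np.insert(arr, 0, 0.0))
ma_cumsum = (csum[window:] - csum[:-window]) / window

print(ma_conv)    # approximately [2. 3. 4.]
print(ma_cumsum)  # [2. 3. 4.]
```

With `mode='valid'` the output has `len(arr) - window + 1` elements, so it is shorter than the input.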

Polynomial Fitting

x = np.array([0, 1, 2, 3, 4])
y = np.array([0, 1, 4, 9, 16])
coeffs = np.polyfit(x, y, 2)           # Fit 2nd degree polynomial
poly = np.poly1d(coeffs)               # Create polynomial
y_pred = poly(x)                       # Predict

Python

🔗 Integration with Other Libraries

With Pandas

import pandas as pd
df = pd.DataFrame(arr)                 # Array to DataFrame
arr = df.values                        # DataFrame to array
arr = df.to_numpy()                    # Recommended method

Python

With Matplotlib

import matplotlib.pyplot as plt
plt.plot(arr)                          # Plot array
plt.imshow(arr)                        # Display 2D array as image
plt.hist(arr.flatten(), bins=50)       # Histogram

Python

With PIL/Pillow

from PIL import Image
img_array = np.array(Image.open('image.jpg'))
img = Image.fromarray(arr.astype('uint8'))

Python

📚 Quick Reference Table

Operation	Syntax	Description
Creation
From list	`np.array([1,2,3])`	Create from list
Zeros	`np.zeros((3,4))`	3×4 array of zeros
Ones	`np.ones((2,3))`	2×3 array of ones
Range	`np.arange(10)`	0 to 9
Linspace	`np.linspace(0,1,5)`	5 evenly spaced values
Identity	`np.eye(3)`	3×3 identity matrix
Indexing
Single element	`arr[i,j]`	Element at row i, col j
Slice	`arr[1:3,:]`	Rows 1-2, all columns
Boolean	`arr[arr>5]`	Elements > 5
Fancy	`arr[[0,2,4]]`	Elements at indices 0,2,4
Math
Add	`arr + 5`	Add 5 to each element
Multiply	`arr * 2`	Multiply by 2
Power	`arr ** 2`	Square each element
Sqrt	`np.sqrt(arr)`	Square root
Exp	`np.exp(arr)`	e^x
Log	`np.log(arr)`	Natural log
Aggregate
Sum	`np.sum(arr)`	Sum all elements
Mean	`np.mean(arr)`	Average
Min/Max	`np.min(arr)`, `np.max(arr)`	Minimum, maximum
Std	`np.std(arr)`	Standard deviation
Shape
Reshape	`arr.reshape(3,4)`	Change shape to 3×4
Flatten	`arr.flatten()`	Convert to 1D
Transpose	`arr.T`	Swap rows and columns
Join/Split
Concatenate	`np.concatenate([a,b])`	Join arrays
Stack	`np.vstack([a,b])`	Stack vertically
Split	`np.split(arr, 3)`	Split into 3 parts
Linear Algebra
Dot product	`np.dot(A,B)` or `A @ B`	Matrix multiplication
Inverse	`np.linalg.inv(A)`	Matrix inverse
Determinant	`np.linalg.det(A)`	Determinant
Eigenvalues	`np.linalg.eig(A)`	Eigenvalues & vectors
Random
Random floats	`np.random.rand(3,4)`	Uniform [0,1)
Random ints	`np.random.randint(0,10,5)`	Integers [0,10)
Normal dist	`np.random.randn(100)`	Standard normal
Choice	`np.random.choice([1,2,3])`	Random selection
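
The Random rows in the table use the legacy `np.random` functions. Since the guide recommends the newer Generator interface, here is a rough sketch of the equivalent calls. The seed value 42 is arbitrary and only there to make runs reproducible.

```python
import numpy as np

rng = np.random.default_rng(42)        # arbitrary seed for reproducibility

floats = rng.random((3, 4))            # like np.random.rand(3, 4): uniform [0, 1)
ints = rng.integers(0, 10, size=5)     # like np.random.randint(0, 10, 5): integers [0, 10)
normal = rng.standard_normal(100)      # like np.random.randn(100)
pick = rng.choice([1, 2, 3])           # like np.random.choice([1, 2, 3])

print(floats.shape, ints.shape, normal.shape)  # (3, 4) (5,) (100,)
```

Note that `rng.random` takes the shape as a single tuple, while the legacy `np.random.rand` takes each dimension as a separate argument.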

Happy NumPy coding! 🚀
