Table of Contents
- Introduction to Python
- Getting Started
- Python Basics
- Control Flow
- Functions
- Data Structures
- Object-Oriented Programming
- Error Handling
- File Operations
- Modules and Packages
- Advanced Topics
- Web Development
- Data Science and Analytics
- Testing and Debugging
- Best Practices
Introduction to Python
Python is a high-level, interpreted programming language known for its simplicity and readability. Created by Guido van Rossum and first released in 1991, Python emphasizes code readability and allows programmers to express concepts in fewer lines of code.
Why Choose Python?
mindmap
root((Python Advantages))
Easy to Learn
Simple Syntax
Readable Code
Beginner Friendly
Versatile
Web Development
Data Science
AI/ML
Automation
Large Community
Extensive Libraries
Active Support
Open Source
Cross Platform
Windows
Mac
Linux
Python Applications
graph TD
A[Python Applications] --> B[Web Development]
A --> C[Data Science]
A --> D[Machine Learning]
A --> E[Automation]
A --> F[Game Development]
A --> G[Desktop Applications]
B --> B1[Django]
B --> B2[Flask]
B --> B3[FastAPI]
C --> C1[Pandas]
C --> C2[NumPy]
C --> C3[Matplotlib]
D --> D1[TensorFlow]
D --> D2[PyTorch]
D --> D3[Scikit-learn]
Getting Started
Installation
- Download Python: Visit python.org and download the latest version
- Install: Run the installer and ensure “Add Python to PATH” is checked
- Verify Installation: Open command prompt and type:
python --version
Development Environment
graph LR
A[Choose IDE] --> B[VS Code]
A --> C[PyCharm]
A --> D[Jupyter Notebook]
A --> E[IDLE]
B --> F[Recommended for beginners]
C --> G[Professional development]
D --> H[Data science & research]
E --> I[Built-in with Python]
Your First Python Program
# Hello World program
print("Hello, World!")
print("Welcome to Python programming!")
# Variables and basic operations
name = "Python"
version = 3.12
print(f"Learning {name} version {version}")PythonPython Basics
Variables and Data Types
graph TD
A[Python Data Types] --> B[Numeric]
A --> C[Sequence]
A --> D[Mapping]
A --> E[Set]
A --> F[Boolean]
A --> G[None]
B --> B1[int]
B --> B2[float]
B --> B3[complex]
C --> C1[str]
C --> C2[list]
C --> C3[tuple]
D --> D1[dict]
E --> E1[set]
E --> E2[frozenset]
mindmap
root((Python Data Types))
Numeric
Integer
integer_num = 42
Float
float_num = 3.14159
Complex
complex_num = 3 + 4j
String
text = "Hello, Python!"
multiline = """This is a\nmultiline string"""
Boolean
is_python_fun = True
is_difficult = False
Sequence
List (mutable)
fruits = ['apple', 'banana', 'orange']
Tuple (immutable)
coordinates = (10, 20)
Mapping
Dictionary
person = #123;'name': 'Alice', 'age': 30, 'city': 'New York'#125;
Set
unique_numbers = #123;1, 2, 3, 4, 5#125;
# Numeric types
integer_num = 42
float_num = 3.14159
complex_num = 3 + 4j
# String
text = "Hello, Python!"
multiline = """This is a
multiline string"""
# Boolean
is_python_fun = True
is_difficult = False
# List (mutable)
fruits = ["apple", "banana", "orange"]
# Tuple (immutable)
coordinates = (10, 20)
# Dictionary
person = {
"name": "Alice",
"age": 30,
"city": "New York"
}
# Set
unique_numbers = {1, 2, 3, 4, 5}
Numeric Types
- `integer_num` (int): Represents the whole number 42, which is an immutable integer value. The `int` type is for storing positive or negative whole numbers without any decimal places.
- `float_num` (float): Represents the decimal number 3.14159. The `float` type is used for real numbers that contain a decimal point and can be positive or negative.
- `complex_num` (complex): Represents the complex number 3 + 4j, where 3 is the real part and 4j is the imaginary part. The `j` suffix indicates the imaginary component.
String Type
- `text` (str): Represents the immutable sequence of characters "Hello, Python!". It is enclosed in double quotes.
- `multiline` (str): Represents an immutable, multi-line string. It is created using triple double quotes (`"""`), which preserve the formatting and line breaks exactly as they appear in the code.
Boolean Type
- `is_python_fun` (bool): Represents the truth value `True`. In Python, booleans are a built-in type with only two possible values: `True` and `False`.
- `is_difficult` (bool): Represents the truth value `False`. Boolean types are used for logical operations and conditional statements.
List Type (Mutable Sequence)
- `fruits` (list): Represents an ordered, mutable sequence of strings, in this case a list of fruit names. Items can be added, removed, or changed after the list is created.
Tuple Type (Immutable Sequence)
- `coordinates` (tuple): Represents an ordered, immutable sequence of two integers. Once defined, the items within a tuple cannot be changed.
Dictionary Type (Mapping)
- `person` (dict): Represents a mutable collection of key-value pairs. Each unique key (a string) maps to a specific value (a string or an integer). Dictionaries are useful for storing structured data and accessing it by key.
Set Type
- `unique_numbers` (set): Represents an unordered, mutable collection of unique integer elements. Sets automatically discard duplicate values and are efficient for membership testing and mathematical set operations.
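The duplicate handling and membership testing can be seen directly in the interpreter; the snippet below is a small added illustration (the variable name `nums` is ours, not from the example above):

```python
nums = {1, 2, 2, 3, 3, 3}   # duplicate values are written out...
print(nums)                 # {1, 2, 3} -- ...but the set keeps only unique elements
print(2 in nums)            # True -- membership tests on sets are fast
```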
Operators
graph TD
A[Python Operators] --> B[Arithmetic]
A --> C[Comparison]
A --> D[Logical]
A --> E[Assignment]
A --> F[Membership]
A --> G[Identity]
B --> B1["+ - * / % ** //"]
C --> C1["== != < > <= >="]
D --> D1["and or not"]
E --> E1["= += -= *= /="]
F --> F1["in not in"]
G --> G1["is is not"]
# Arithmetic operators
a, b = 10, 3
print(f"Addition: {a + b}") # 13
print(f"Subtraction: {a - b}") # 7
print(f"Multiplication: {a * b}") # 30
print(f"Division: {a / b}") # 3.333...
print(f"Floor Division: {a // b}") # 3
print(f"Modulus: {a % b}") # 1
print(f"Exponentiation: {a ** b}") # 1000
# Comparison operators
print(f"Equal: {a == b}") # False
print(f"Not equal: {a != b}") # True
print(f"Greater than: {a > b}") # True
# Logical operators
x, y = True, False
print(f"AND: {x and y}") # False
print(f"OR: {x or y}") # True
print(f"NOT: {not x}") # FalsePythonMultiple Assignment
- Code: `a, b = 10, 3`
- Explanation: This concisely assigns `10` to variable `a` and `3` to variable `b` in a single line. This is also called "tuple unpacking" because it effectively unpacks the values on the right-hand side and assigns them to the variables on the left.
Arithmetic Operators
These are used to perform mathematical calculations on numeric values.
| Operator | Operation | Code | Calculation | Output |
|---|---|---|---|---|
| + | Addition | a + b | 10 + 3 | 13 |
| - | Subtraction | a - b | 10 - 3 | 7 |
| * | Multiplication | a * b | 10 * 3 | 30 |
| / | Division | a / b | 10 / 3 | 3.333... |
| // | Floor Division | a // b | 10 // 3 | 3 |
| % | Modulus | a % b | 10 % 3 | 1 |
| ** | Exponentiation | a ** b | 10 ** 3 | 1000 |
Comparison Operators
These compare two values and return a boolean result (True or False).
| Operator | Comparison | Code | Condition | Output |
|---|---|---|---|---|
| == | Equal to | a == b | 10 == 3 | False |
| != | Not equal to | a != b | 10 != 3 | True |
| > | Greater than | a > b | 10 > 3 | True |
Logical Operators
These combine conditional statements and evaluate to a boolean value.
| Operator | Operation | Code | Condition | Output |
|---|---|---|---|---|
| and | Logical AND | x and y | True and False | False |
| or | Logical OR | x or y | True or False | True |
| not | Logical NOT | not x | not True | False |
Use of f-strings
The code uses f-strings (formatted string literals), a feature in Python 3.6+.
- The `f` before the opening quote signifies that it is an f-string.
- Expressions or variables placed inside curly braces (`{...}`) are evaluated and converted into a string.
- For example, `f"Addition: {a + b}"` evaluates `a + b` to `13` and embeds it in the string. This makes the output readable and the code cleaner than other formatting methods.
Input and Output
# Getting user input
name = input("Enter your name: ")
age = int(input("Enter your age: "))
# Formatted output
print(f"Hello {name}, you are {age} years old!")
# Different ways to format strings
print("Hello %s, you are %d years old!" % (name, age))
print("Hello {}, you are {} years old!".format(name, age))
print("Hello {name}, you are {age} years old!".format(name=name, age=age))PythonUser input
- `name = input("Enter your name: ")`: The `input()` function is a built-in Python function that prompts the user for input, pauses the program until the user types a value, and then returns that value as a string. The string prompt inside the parentheses is optional and is displayed to the user to give them instructions.
- `age = int(input("Enter your age: "))`:
  - The inner `input("Enter your age: ")` call first collects a string from the user, such as `"30"`.
  - The outer `int()` function then converts that string into an integer (`30`).
  - This explicit type conversion (or "type casting") is necessary because the `input()` function always returns a string, even if the user enters a number. If you tried to perform a mathematical operation with the original string input, it would result in an error.
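Because `int()` raises a `ValueError` when the text cannot be converted, real programs often wrap the conversion in a `try`/`except` block. The sketch below is an added illustration of that pattern (error handling is covered in its own chapter later); the variable `raw_age` is hypothetical:

```python
raw_age = "thirty"  # imagine this came from input()
try:
    age = int(raw_age)
except ValueError:
    print(f"'{raw_age}' is not a valid whole number")
    age = None
```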
Formatted string output
The code demonstrates several methods for formatting strings in Python, from the newest and most readable to older, legacy methods.
1. F-strings (Formatted String Literals)
- Code: `print(f"Hello {name}, you are {age} years old!")`
- Explanation: F-strings, introduced in Python 3.6, are the modern and most recommended way to format strings.
  - The `f` at the beginning signifies a formatted string.
  - Variables and expressions enclosed in curly braces (`{...}`) are evaluated at runtime and replaced with their values.
  - This method is highly readable and performant.
2. % Operator (C-style Formatting)
- Code: `print("Hello %s, you are %d years old!" % (name, age))`
- Explanation: This is an older, legacy method of formatting strings inspired by the `printf` function in the C language.
  - Format Specifiers: It uses format specifiers like `%s` for a string and `%d` for a decimal integer.
  - Variables: The variables to be substituted are passed as a tuple to the right of the `%` operator.
3. str.format() Method
- Code: `print("Hello {}, you are {} years old!".format(name, age))`
- Explanation: This method, introduced in Python 2.6, is more flexible than the `%` operator. It uses `{}` as placeholders.
  - Positional Arguments: Without any numbers in the `{}` placeholders, the `format()` method fills them in with the arguments provided, in order.
4. str.format() with Named Arguments
- Code: `print("Hello {name}, you are {age} years old!".format(name=name, age=age))`
- Explanation: This variation of the `str.format()` method uses named placeholders.
  - Keyword Arguments: You can pass arguments to the `format()` method using keyword arguments, which improves the readability of the formatting string by clearly associating each placeholder with its corresponding variable.
Control Flow
Conditional Statements
flowchart TD
A[Start] --> B{Condition}
B -->|True| C[Execute if block]
B -->|False| D{Elif condition?}
D -->|True| E[Execute elif block]
D -->|False| F[Execute else block]
C --> G[End]
E --> G
F --> G
# If-elif-else statements
score = 85
if score >= 90:
grade = "A"
print("Excellent!")
elif score >= 80:
grade = "B"
print("Good job!")
elif score >= 70:
grade = "C"
print("Average")
elif score >= 60:
grade = "D"
print("Needs improvement")
else:
grade = "F"
print("Failed")
print(f"Your grade is: {grade}")
# Ternary operator
status = "Pass" if score >= 60 else "Fail"
print(f"Status: {status}")PythonIf-elif-else statements
This block of code is a conditional statement that executes different code depending on whether the specified conditions are true or false. Python evaluates each condition in order, from top to bottom.
- `score = 85`: A variable `score` is initialized with the integer value `85`.
- `if score >= 90:`: The interpreter first checks if `score` is greater than or equal to 90.
  - Since `85 >= 90` is false, this block is skipped.
- `elif score >= 80:`: The interpreter moves to the next condition, checking if `score` is greater than or equal to 80.
  - `85 >= 80` is true, so the code inside this block is executed.
  - `grade` is set to `"B"`.
  - `"Good job!"` is printed to the console.
- Execution stops: As soon as a true condition is found in an `if-elif-else` block, the corresponding code runs, and the rest of the conditions (`elif` and `else`) are skipped.
- `print(f"Your grade is: {grade}")`: After the conditional block, the final `grade` value, which is `"B"`, is printed using an f-string.
Ternary operator
This is a single-line shorthand for a simple if-else statement. It is a concise way to assign a value to a variable based on a single condition.
- Code: `status = "Pass" if score >= 60 else "Fail"`
- How it works:
  - The code first evaluates the condition `score >= 60`.
  - Since `85 >= 60` is true, the expression returns the value before the `if`, which is `"Pass"`.
  - This value is then assigned to the `status` variable.
- Equivalent `if-else` statement:
  if score >= 60:
      status = "Pass"
  else:
      status = "Fail"
- `print(f"Status: {status}")`: The final value of the `status` variable, `"Pass"`, is printed to the console.
Loops
flowchart TD
A[For Loop] --> B[Iterate over sequence]
B --> C[Execute code block]
C --> D{More items?}
D -->|Yes| C
D -->|No| E[End]
F[While Loop] --> G{Condition True?}
G -->|Yes| H[Execute code block]
H --> G
G -->|No| I[End]
# For loops
fruits = ["apple", "banana", "orange"]
# Iterate over list
for fruit in fruits:
print(f"I like {fruit}")
# Iterate with index
for index, fruit in enumerate(fruits):
print(f"{index + 1}. {fruit}")
# Range function
for i in range(1, 6): # 1 to 5
print(f"Number: {i}")
# While loops
count = 0
while count < 5:
print(f"Count: {count}")
count += 1
# Loop control statements
for i in range(10):
if i == 3:
continue # Skip iteration
if i == 7:
break # Exit loop
print(i)
# Nested loops
for i in range(3):
for j in range(3):
print(f"({i}, {j})")PythonFor loops
A for loop iterates over a sequence, such as a list, tuple, or string, or other iterable objects. The loop runs a block of code for each item in the sequence.
- Iterating over a list:
  - Code: `for fruit in fruits:`
  - Explanation: The loop steps through each element of the `fruits` list. In each iteration, the variable `fruit` is assigned the current element's value, which is then used within the loop.
- Iterating with `enumerate()`:
  - Code: `for index, fruit in enumerate(fruits):`
  - Explanation: The `enumerate()` function is a way to iterate through a sequence while automatically keeping track of the index. It provides both the index and the value during each iteration.
- Using the `range()` function:
  - Code: `for i in range(1, 6):`
  - Explanation: The `range()` function generates a sequence of numbers. `range(start, stop)` generates numbers from `start` up to, but not including, `stop`. Here, it produces the sequence 1, 2, 3, 4, 5.
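`range()` also accepts an optional third argument, `range(start, stop, step)`. The loop below is a small added illustration of the step form (it is not part of the original code above):

```python
for i in range(0, 10, 2):   # start at 0, stop before 10, count by 2
    print(i)                # prints 0, 2, 4, 6, 8
```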
While loops
A while loop repeatedly runs a block of code as long as a specified condition is true.
- Code: `while count < 5:`
- Explanation: The program first initializes `count` to 0. The loop checks if `count < 5` is true. If true, the code inside the loop runs, printing the count and then incrementing it by 1. This repeats until `count` becomes 5, at which point the condition becomes false and the loop terminates.
Loop control statements
These statements alter the normal flow of a loop based on specific conditions.
- `continue`: Jumps to the next iteration of the loop, skipping any remaining code in the current iteration.
  - Code: `if i == 3: continue`
  - Explanation: In the `for` loop, when `i` is 3, the `continue` statement is executed. This causes the program to skip the `print(i)` statement for that iteration and proceed to the next number in the range.
- `break`: Immediately exits the loop entirely.
  - Code: `if i == 7: break`
  - Explanation: When `i` becomes 7, the `break` statement is executed, and the program terminates the loop completely, skipping the rest of the numbers in the range.
Nested loops
A nested loop is a loop inside another loop. The inner loop completes all of its iterations for each single iteration of the outer loop.
- Code: `for i in range(3):` followed by the nested `for j in range(3):`
- Explanation: The outer loop (`for i`) runs three times. For each time the outer loop runs, the inner loop (`for j`) runs completely, also three times. This results in a total of nine print statements, producing all the coordinate pairs from (0, 0) to (2, 2).
Loop Patterns
flowchart TD
A[Start] --> B[Initialize range of x]
B --> C{Loop through each x}
C --> D["Check condition (if any)"]
D -->|True| E[Compute x**2]
D -->|False| C
E --> F[Store in list or dict]
F --> C
C --> G[End]
# List comprehension
squares = [x**2 for x in range(1, 6)]
print(squares) # [1, 4, 9, 16, 25]
# Conditional list comprehension
even_squares = [x**2 for x in range(1, 11) if x % 2 == 0]
print(even_squares) # [4, 16, 36, 64, 100]
# Dictionary comprehension
square_dict = {x: x**2 for x in range(1, 6)}
print(square_dict) # {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
List comprehension
List comprehensions provide a concise way to create lists. They consist of an expression followed by a for clause, and optionally, one or more if or for clauses. They offer a more readable and often more efficient alternative to traditional for loops for creating lists.
- Basic list comprehension:
  - Code: `squares = [x**2 for x in range(1, 6)]`
  - Explanation:
    - `range(1, 6)` generates numbers from 1 to 5 (i.e., 1, 2, 3, 4, 5).
    - `for x in range(1, 6)` iterates through each number in this sequence.
    - `x**2` is the expression that calculates the square of each number `x`.
    - The results of this expression are collected into a new list called `squares`.
  - Output: `[1, 4, 9, 16, 25]`
- Conditional list comprehension:
  - Code: `even_squares = [x**2 for x in range(1, 11) if x % 2 == 0]`
  - Explanation: This comprehension adds an `if` clause to filter elements.
    - `range(1, 11)` generates numbers from 1 to 10.
    - `for x in range(1, 11)` iterates through these numbers.
    - `if x % 2 == 0` checks if the current number `x` is even (i.e., its remainder when divided by 2 is 0).
    - Only if the `if` condition is `True` is `x**2` calculated and added to the `even_squares` list.
  - Output: `[4, 16, 36, 64, 100]`
Dictionary comprehension
Similar to list comprehensions, dictionary comprehensions provide a concise way to create dictionaries. They consist of a key-value pair expression followed by a for clause, and optionally, one or more if or for clauses.
- Code: `square_dict = {x: x**2 for x in range(1, 6)}`
- Explanation:
  - `range(1, 6)` generates numbers from 1 to 5.
  - `for x in range(1, 6)` iterates through each number in this sequence.
  - `x: x**2` is the expression that defines the key-value pair for each item. Here, `x` becomes the key, and `x**2` becomes its corresponding value.
  - These key-value pairs are collected into a new dictionary called `square_dict`.
- Output: `{1: 1, 2: 4, 3: 9, 4: 16, 5: 25}`
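Dictionary comprehensions accept an `if` filter as well, exactly like list comprehensions. The following line is an added illustration combining both ideas (not part of the original snippet):

```python
even_square_dict = {x: x**2 for x in range(1, 11) if x % 2 == 0}
print(even_square_dict)  # {2: 4, 4: 16, 6: 36, 8: 64, 10: 100}
```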
Functions
Function Basics
graph TD
A[Function Definition] --> B[def keyword]
A --> C[Function name]
A --> D[Parameters]
A --> E[Function body]
A --> F[Return statement]
G[Function Call] --> H[Function name]
G --> I[Arguments]
J[Function Benefits] --> K[Code reusability]
J --> L[Modularity]
J --> M[Easier debugging]
J --> N[Better organization]
# Basic function
def greet(name):
"""Greet a person with their name."""
return f"Hello, {name}!"
# Function call
message = greet("Alice")
print(message)
# Function with multiple parameters
def add_numbers(a, b):
"""Add two numbers and return the result."""
result = a + b
return result
sum_result = add_numbers(5, 3)
print(f"Sum: {sum_result}")
# Function with default parameters
def introduce(name, age=25, city="Unknown"):
"""Introduce a person with optional age and city."""
return f"Hi, I'm {name}, {age} years old from {city}"
print(introduce("Bob"))
print(introduce("Carol", 30))
print(introduce("Dave", 35, "Boston"))PythonBasic function
- `def greet(name):`: This defines a function named `greet` using the `def` keyword. The function accepts one argument, `name`, which is a placeholder for the value that will be passed when the function is called.
- `"""Greet a person with their name."""`: This is a docstring, a type of comment used to explain the function's purpose. Good practice dictates that you should use docstrings to document your functions for clarity.
- `return f"Hello, {name}!"`: The `return` statement sends a value back to the code that called the function. In this case, it returns a formatted string literal (f-string) that includes the value of the `name` argument.
Function call
message = greet("Alice"): This line calls thegreetfunction, passing the string"Alice"as an argument. The value returned by the function ("Hello, Alice!") is assigned to themessagevariable.print(message): This prints the value of themessagevariable to the console.
Function with multiple parameters
- `def add_numbers(a, b):`: This function is defined with two parameters, `a` and `b`. The function needs two arguments to be passed when it is called.
- `result = a + b`: The function's body calculates the sum of the two parameters and stores it in a local variable `result`.
- `return result`: The function returns the sum stored in the `result` variable.
- `sum_result = add_numbers(5, 3)`: This calls `add_numbers`, passing the values `5` and `3`. The arguments are assigned to the parameters `a` and `b` based on their position. The returned value (`8`) is stored in `sum_result`.
Function with default parameters
- `def introduce(name, age=25, city="Unknown"):`: This function definition includes default values for the `age` and `city` parameters. If a caller does not provide a value for these parameters, the default value will be used automatically.
- `print(introduce("Bob"))`:
  - This call provides only the `name` argument.
  - Since `age` and `city` are not provided, the function uses their default values (`25` and `"Unknown"`).
- `print(introduce("Carol", 30))`:
  - This call provides values for `name` and `age`.
  - The provided `30` overrides the default `age` value. The default `city` value is still used.
- `print(introduce("Dave", 35, "Boston"))`:
  - This call explicitly provides values for all three parameters.
  - In this case, none of the default values are used.
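Arguments can also be passed by name, which lets a caller override a later default while keeping an earlier one. The call below is an added illustration that reuses the `introduce` function defined above:

```python
# Override only `city`; `age` keeps its default of 25
print(introduce("Eve", city="Paris"))  # Hi, I'm Eve, 25 years old from Paris
```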
Advanced Function Features
# Variable arguments
def sum_all(*numbers):
"""Sum all provided numbers."""
return sum(numbers)
print(sum_all(1, 2, 3, 4, 5)) # 15
# Keyword arguments
def create_profile(**info):
"""Create a profile with keyword arguments."""
profile = {}
for key, value in info.items():
profile[key] = value
return profile
profile = create_profile(name="Alice", age=30, profession="Engineer")
print(profile)
# Lambda functions
square = lambda x: x**2
print(square(5)) # 25
# Higher-order functions
numbers = [1, 2, 3, 4, 5]
squared = list(map(lambda x: x**2, numbers))
even_numbers = list(filter(lambda x: x % 2 == 0, numbers))
print(f"Squared: {squared}")
print(f"Even numbers: {even_numbers}")PythonVariable arguments (*args)
- `def sum_all(*numbers):`: The `*numbers` syntax in a function definition allows the function to accept any number of positional arguments.
- How it works: When the `sum_all()` function is called, all the passed arguments (e.g., `1, 2, 3, 4, 5`) are collected and "packed" into a tuple named `numbers`.
- Inside the function: The `sum()` built-in function is then applied to the `numbers` tuple to calculate the total sum.
- Output: `15`.
Keyword arguments (**kwargs)
- `def create_profile(**info):`: The `**info` syntax allows the function to accept any number of keyword arguments (key-value pairs).
- How it works: All the keyword arguments passed to `create_profile()` are collected and "packed" into a dictionary named `info`.
- Inside the function: The code iterates through the items of the `info` dictionary to build a new profile dictionary, demonstrating how to access the collected keyword arguments.
- Output: `{'name': 'Alice', 'age': 30, 'profession': 'Engineer'}`.
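A single function may accept both forms at once. The short sketch below is an added illustration (the function name `describe` is ours, not from the original text):

```python
def describe(*args, **kwargs):
    # args collects positional values into a tuple, kwargs collects named values into a dict
    print(f"positional: {args}, keyword: {kwargs}")

describe(1, 2, role="admin")  # positional: (1, 2), keyword: {'role': 'admin'}
```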
Lambda functions
- `square = lambda x: x**2`: A lambda function is a small, anonymous (unnamed) function defined with the `lambda` keyword.
- Syntax: `lambda arguments: expression`
- Characteristics:
  - They are restricted to a single expression.
  - They are often used for short-term, throwaway operations.
- Output: `square(5)` returns `25`, the result of the expression `5**2`.
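Lambdas are most useful where a small function is passed as an argument, for example as a sort key. The snippet below is an added illustration (the `words` list is ours):

```python
words = ["banana", "fig", "apple"]
print(sorted(words, key=lambda w: len(w)))  # ['fig', 'apple', 'banana'] -- shortest first
```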
Higher-order functions
A higher-order function is one that takes one or more functions as arguments or returns a function as its result. The map() and filter() functions are common examples.
- `map()` with a lambda:
  - Code: `squared = list(map(lambda x: x**2, numbers))`
  - Explanation: The `map()` function applies the specified function (in this case, the `lambda` function that squares a number) to every item in an iterable (`numbers`).
  - Output: `squared` is `[1, 4, 9, 16, 25]`.
- `filter()` with a lambda:
  - Code: `even_numbers = list(filter(lambda x: x % 2 == 0, numbers))`
  - Explanation: The `filter()` function constructs an iterator from elements of an iterable for which a function (in this case, the `lambda` function that checks if a number is even) returns true.
  - Output: `even_numbers` is `[2, 4]`.
Note: Both map() and filter() return iterator objects in Python 3, so list() is used to convert them into a list for printing.
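For comparison, the same results can be produced with comprehensions, which many style guides prefer for readability; this is an added side-by-side sketch, not part of the original listing:

```python
numbers = [1, 2, 3, 4, 5]
squared = [x**2 for x in numbers]                   # equivalent to the map() example
even_numbers = [x for x in numbers if x % 2 == 0]   # equivalent to the filter() example
```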
Decorators
# Basic decorator
def timing_decorator(func):
"""Decorator to measure function execution time."""
import time
def wrapper(*args, **kwargs):
start_time = time.time()
result = func(*args, **kwargs)
end_time = time.time()
print(f"{func.__name__} took {end_time - start_time:.4f} seconds")
return result
return wrapper
@timing_decorator
def slow_function():
"""A function that takes some time."""
import time
time.sleep(1)
return "Done!"
result = slow_function()
This code demonstrates a fundamental use of Python decorators: modifying the behavior of a function without changing its source code. The `timing_decorator` is applied to `slow_function` using the `@timing_decorator` syntax. When `slow_function` is called, the `wrapper` function within the decorator is executed instead.
timing_decorator(func)
This is the decorator function, which accepts another function (func) as an argument.
- It defines a nested function named `wrapper`.
- The `wrapper` function is a "closure" that retains access to the `func` variable from the enclosing scope, even after `timing_decorator` has finished executing.
- This `wrapper` is a general-purpose function that can accept any number of positional arguments (`*args`) and keyword arguments (`**kwargs`), making it reusable for any function signature.
- The decorator function returns the `wrapper` function.
wrapper(*args, **kwargs)
This nested function contains the enhanced behavior.
- `start_time = time.time()`: It records the current timestamp before calling the original function.
- `result = func(*args, **kwargs)`: It calls the original function (`slow_function` in this case), passing along all arguments it received. It also captures the return value of the original function.
- `end_time = time.time()`: It records the timestamp after the original function has finished executing.
- `print(...)`: It prints a formatted string showing the original function's name and the elapsed time.
- `return result`: It returns the result of the original function, ensuring the decorator doesn't interfere with the function's intended output.
@timing_decorator
- This is “syntactic sugar” for replacing the original function with the decorated version.
- The line `@timing_decorator` is equivalent to writing `slow_function = timing_decorator(slow_function)`.
- The original `slow_function` variable is reassigned to the `wrapper` function returned by `timing_decorator`, but the original function is still accessible within the `wrapper` as the `func` variable.
slow_function()
- The function is executed, but because it is decorated, it is actually the `wrapper` that runs.
- Inside `wrapper`, `slow_function()` is called. It waits for one second using `time.sleep(1)` before returning the string `"Done!"`.
- After the original function completes, the `wrapper` prints the time taken and then returns the `"Done!"` string, which is assigned to the `result` variable.
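One side effect of this pattern is that the decorated function's metadata changes: `slow_function.__name__` becomes `"wrapper"`. The standard-library helper `functools.wraps` copies the original name and docstring onto the wrapper. The sketch below shows how the decorator above could be adjusted; it is an optional refinement, not part of the original code:

```python
import functools
import time

def timing_decorator(func):
    @functools.wraps(func)  # keep func.__name__ and func.__doc__ on the wrapper
    def wrapper(*args, **kwargs):
        start_time = time.time()
        result = func(*args, **kwargs)
        print(f"{func.__name__} took {time.time() - start_time:.4f} seconds")
        return result
    return wrapper
```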
Data Structures
Lists
graph TD
A[List Operations] --> B[Creation]
A --> C[Access]
A --> D[Modification]
A --> E[Methods]
B --> B1["Empty list: []"]
B --> B2["With values: [1,2,3]"]
B --> B3["list() constructor"]
C --> C1["Indexing: list[0]"]
C --> C2["Slicing: list[1:3]"]
C --> C3["Negative indexing: list[-1]"]
D --> D1[Append]
D --> D2[Insert]
D --> D3[Remove]
D --> D4[Pop]
E --> E1["sort()"]
E --> E2["reverse()"]
E --> E3["count()"]
E --> E4["index()"]
# List creation and operations
fruits = ["apple", "banana", "orange"]
# Adding elements
fruits.append("grape") # Add at end
fruits.insert(1, "kiwi") # Insert at index
fruits.extend(["mango", "peach"]) # Add multiple items
print(f"Fruits: {fruits}")
# Accessing elements
print(f"First fruit: {fruits[0]}")
print(f"Last fruit: {fruits[-1]}")
print(f"First three: {fruits[:3]}")
# Modifying elements
fruits[0] = "green apple"
print(f"Modified: {fruits}")
# List methods
fruits.sort() # Sort in place
print(f"Sorted: {fruits}")
fruits.reverse() # Reverse in place
print(f"Reversed: {fruits}")
# List comprehension with conditions
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
even_squares = [x**2 for x in numbers if x % 2 == 0]
print(f"Even squares: {even_squares}")PythonList creation and initialization
fruits = ["apple", "banana", "orange"]: This creates and initializes a list namedfruits. A list is an ordered, mutable collection of items enclosed in square brackets.
Adding elements
fruits.append("grape"): The.append()method adds a single element to the end of the list.fruits.insert(1, "kiwi"): The.insert()method adds an element at a specific index. The existing elements from that index onward are shifted to make space for the new item.fruits.extend(["mango", "peach"]): The.extend()method adds all the elements from an iterable (like another list) to the end of the current list.- Result: The list
fruitsbecomes['apple', 'kiwi', 'banana', 'orange', 'grape', 'mango', 'peach'].
Accessing elements
fruits[0]: Accesses the first element of the list using its index, which is0in Python.fruits[-1]: Accesses the last element of the list using negative indexing.fruits[:3]: Uses list slicing to get a sub-list containing the first three elements. The slice includes the starting index (0) but excludes the ending index (3).
Modifying elements
fruits[0] = "green apple": Since lists are mutable, you can change an element by assigning a new value to its index.- Result: The list
fruitsbecomes['green apple', 'kiwi', 'banana', 'orange', 'grape', 'mango', 'peach'].
List methods
fruits.sort(): The.sort()method sorts the elements of the list in place (it modifies the original list). For strings, this is done alphabetically.fruits.reverse(): The.reverse()method reverses the order of the elements in the list in place.
List comprehension with conditions
- **
even_squares = [x**2 **for x in numbers if x % 2 == 0]**:- This is a concise way to create a new list by applying an expression to each item in an iterable.
for x in numbers: Iterates through each element in thenumberslist.if x % 2 == 0: A condition that filters the elements. Only even numbers (xdivided by 2 has a remainder of 0) will be processed.- **
x**2****: The expression that squares each even number.
- Output: The resulting list
even_squareswill contain the squares of all the even numbers from the originalnumberslist.
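A related point: `.sort()` modifies the list and returns `None`, while the built-in `sorted()` returns a new sorted list and leaves the original untouched. The short comparison below is an added illustration (not part of the original listing):

```python
fruits = ["orange", "apple", "banana"]
print(sorted(fruits))  # ['apple', 'banana', 'orange'] -- new list
print(fruits)          # ['orange', 'apple', 'banana'] -- original unchanged
fruits.sort()          # sorts in place and returns None
print(fruits)          # ['apple', 'banana', 'orange']
```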
Dictionaries
graph TD
A[Dictionary] --> B[Key-Value Pairs]
A --> C[Mutable]
A --> D[Unordered]
A --> E[Methods]
B --> B1[Keys must be immutable]
B --> B2[Values can be any type]
E --> E1["keys()"]
E --> E2["values()"]
E --> E3["items()"]
E --> E4["get()"]
E --> E5["pop()"]
E --> E6["update()"]
# Dictionary creation and operations
student = {
"name": "Alice",
"age": 20,
"grades": [85, 90, 78, 92],
"is_enrolled": True
}
# Accessing values
print(f"Name: {student['name']}")
print(f"Age: {student.get('age', 'Unknown')}")
# Adding/updating values
student["major"] = "Computer Science"
student["age"] = 21
# Dictionary methods
print(f"Keys: {list(student.keys())}")
print(f"Values: {list(student.values())}")
print(f"Items: {list(student.items())}")
# Iterating through dictionary
for key, value in student.items():
print(f"{key}: {value}")
# Dictionary comprehension
squares_dict = {x: x**2 for x in range(1, 6)}
print(f"Squares: {squares_dict}")
# Nested dictionaries
class_roster = {
"student1": {"name": "Alice", "grade": 85},
"student2": {"name": "Bob", "grade": 90},
"student3": {"name": "Carol", "grade": 78}
}
for student_id, info in class_roster.items():
print(f"{student_id}: {info['name']} - Grade: {info['grade']}")PythonDictionary creation and initialization
- `student = { ... }`: A dictionary named `student` is created using curly braces `{}`.
  - A dictionary stores data as unique `key: value` pairs.
  - The `student` dictionary stores a variety of data types, including a string (`"Alice"`), an integer (`20`), a list of integers (`[85, 90, 78, 92]`), and a boolean (`True`).
Accessing values
- `student['name']`: Uses bracket `[]` notation to access the value associated with the key `"name"`. This method will raise a `KeyError` if the key does not exist.
- `student.get('age', 'Unknown')`: Uses the `.get()` method to access the value for the key `"age"`.
  - Safeguard: This method is safer than bracket notation because it returns `None` by default (or a specified default value like `'Unknown'`) if the key is not found, preventing a `KeyError`.
Adding and updating values
- `student["major"] = "Computer Science"`: Adds a new key-value pair to the dictionary.
- `student["age"] = 21`: Updates the value for an existing key. Since dictionary keys must be unique, this overwrites the old value of `20` with `21`.
Dictionary methods
- `.keys()`: Returns a view object containing all the keys in the dictionary.
- `.values()`: Returns a view object containing all the values in the dictionary.
- `.items()`: Returns a view object containing all key-value pairs as tuples.
- `list(...)`: The view objects returned by these methods are dynamic and can be iterated over, but they are not lists. Using `list()` converts the view into a static list for printing or other operations.
Iterating through a dictionary
- `for key, value in student.items():`: This is the most efficient and Pythonic way to loop through a dictionary to access both keys and values simultaneously. The `.items()` method returns key-value pairs as tuples, which are then unpacked into the `key` and `value` variables for each iteration.
Dictionary comprehension
- `squares_dict = {x: x**2 for x in range(1, 6)}`: A concise way to create a new dictionary from an iterable.
- How it works: It loops through the numbers 1 to 5 from `range(1, 6)`. For each number `x`, it creates a key-value pair where the key is `x` and the value is `x` squared (`x**2`).
Nested dictionaries
- `class_roster = { ... }`: A dictionary where the values are themselves other dictionaries. This allows for structuring more complex data.
- `for student_id, info in class_roster.items():`: The outer loop iterates through the `class_roster` dictionary. `student_id` is the key (e.g., `"student1"`), and `info` is the value, which is the inner dictionary (e.g., `{"name": "Alice", "grade": 85}`).
- `info['name']`: In each iteration, you can access the values of the inner dictionary `info` using its keys, such as `'name'` and `'grade'`.
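When a key might be missing at either level, chained `.get()` calls avoid a `KeyError`. The lines below are an added illustration based on the `class_roster` structure above (the key `"student9"` is deliberately absent):

```python
# Returns "N/A" instead of raising KeyError for the missing student
grade = class_roster.get("student9", {}).get("grade", "N/A")
print(grade)  # N/A
```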
Sets
flowchart TD
A["fruits = {'apple', 'banana', 'orange'}"] --> B["fruits.add('grape')"]
B --> C["fruits = {'apple', 'banana', 'orange', 'grape'}"]
C --> D["fruits.discard('banana')"]
D --> E["fruits = {'apple', 'orange', 'grape'}"]# Set operations
fruits = {"apple", "banana", "orange"}
vegetables = {"carrot", "broccoli", "spinach"}
healthy_foods = {"apple", "banana", "carrot", "broccoli"}
# Set operations
print(f"Fruits: {fruits}")
print(f"Union: {fruits | healthy_foods}")
print(f"Intersection: {fruits & healthy_foods}")
print(f"Difference: {fruits - healthy_foods}")
# Set methods
fruits.add("grape")
fruits.discard("banana") # Won't raise error if not found
print(f"Modified fruits: {fruits}")PythonSet creation and initialization
fruits = {"apple", "banana", "orange"}: A set namedfruitsis created using curly braces{}. Unlike a list or tuple, a set is an unordered collection and does not allow duplicate elements.vegetables = { ... }andhealthy_foods = { ... }: These lines create other sets that are used to demonstrate set operations.
Set operations
Set operations are mathematical-like operations that can be performed on sets using operators or built-in methods.
- Union (`|`): Combines all unique elements from both sets.
  - Code: `fruits | healthy_foods`
  - Explanation: The union of `fruits` (`{"apple", "banana", "orange"}`) and `healthy_foods` (`{"apple", "banana", "carrot", "broccoli"}`) results in a new set containing all the unique elements from both.
  - Output: `{'apple', 'banana', 'orange', 'carrot', 'broccoli'}` (the order may vary)
- Intersection (`&`): Returns a new set containing only the elements common to both sets.
  - Code: `fruits & healthy_foods`
  - Explanation: The intersection identifies the elements that are in both `fruits` and `healthy_foods`.
  - Output: `{'apple', 'banana'}`
- Difference (`-`): Returns a new set with elements from the first set that are not present in the second.
  - Code: `fruits - healthy_foods`
  - Explanation: This operation removes all elements found in `healthy_foods` from the `fruits` set.
  - Output: `{'orange'}`
Set methods
Sets have built-in methods for modifying or inspecting the set.
.add("grape"): Adds a single element to the set. If the element is already present, the set remains unchanged..discard("banana"): Removes a specified element from the set if it is present. The key difference from the.remove()method is that.discard()does not raise an error if the item is not found, making it safer to use when you are unsure if an element exists.fruitsmodification: After the.add()and.discard()calls, thefruitsset is modified in place.- Final Output:
{'apple', 'orange', 'grape'}
- Final Output:
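The difference between `.discard()` and `.remove()` is easiest to see side by side. The snippet below is an added illustration (the `colors` set is ours, not from the example above):

```python
colors = {"red", "green"}
colors.discard("blue")       # no error, even though "blue" is not in the set
try:
    colors.remove("blue")    # .remove() raises KeyError for a missing element
except KeyError:
    print("'blue' was not in the set")
```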
Tuples
mindmap
root((Tuple Operations))
Basic Tuples
coordinates = (10, 20)
rgb_color = (255, 128, 0)
Unpacking
x, y = coordinates
r, g, b = rgb_color
Named Tuples
Point(x, y)
point = (10, 20)
Person(name, age, city)
person = ('Alice', 30, 'New York')
classDiagram
class Point {
+ x : int
+ y : int
}
class Person {
+ name : str
+ age : int
+ city : str
}
class point {
<<instance>>
x = 10
y = 20
}
class person {
<<instance>>
name = "Alice"
age = 30
city = "New York"
}
Point <|-- point : instantiates
Person <|-- person : instantiates
# Tuple operations
coordinates = (10, 20)
rgb_color = (255, 128, 0)
# Tuple unpacking
x, y = coordinates
r, g, b = rgb_color
print(f"Coordinates: x={x}, y={y}")
print(f"Color: R={r}, G={g}, B={b}")
# Named tuples
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
point = Point(10, 20)
print(f"Point: x={point.x}, y={point.y}")
Person = namedtuple('Person', ['name', 'age', 'city'])
person = Person("Alice", 30, "New York")
print(f"Person: {person.name}, {person.age}, {person.city}")PythonTuple creation and unpacking
- `coordinates = (10, 20)`: A tuple is an immutable, ordered sequence of elements. Like lists, tuples can store different types of data. Tuples are defined using parentheses `()` and are often used for data that should not change, such as a coordinate pair.
- `x, y = coordinates`: This is called tuple unpacking. Python assigns the values from the `coordinates` tuple (`10`, `20`) to the variables `x` and `y` respectively. The number of variables on the left must match the number of elements in the tuple to avoid a `ValueError`.
- `rgb_color = (255, 128, 0)`: Another tuple is created, representing a color code.
- `r, g, b = rgb_color`: The color tuple is unpacked into three variables for the red, green, and blue values. This improves readability by giving meaningful names to the variables.
Named tuples
- `from collections import namedtuple`: The `namedtuple` factory function is imported from the `collections` module. It is used to create tuple-like subclasses with named fields.
- `Point = namedtuple('Point', ['x', 'y'])`: A new class, `Point`, is created using `namedtuple`. It is defined with two field names, `'x'` and `'y'`, which function like attributes for each instance of `Point`.
- `point = Point(10, 20)`: An instance of the `Point` named tuple is created, with `x` set to `10` and `y` set to `20`.
- `print(f"Point: x={point.x}, y={point.y}")`: Named tuple fields can be accessed via their names using dot notation (`.`). This is more descriptive and readable than accessing elements via numerical index (e.g., `point[0]`), especially when a tuple contains many elements.
- `Person = namedtuple('Person', ['name', 'age', 'city'])`: A second named tuple class, `Person`, is created to represent structured data.
- `person = Person("Alice", 30, "New York")`: An instance is created with the given values. Named tuples offer a lightweight way to create an object-like structure without the overhead of defining a full class.
- `print(f"Person: {person.name}, {person.age}, {person.city}")`: Accessing the `Person` data is clean and self-documenting.
Benefits of using named tuples
- Readability: Accessing data by name (e.g., `person.name`) is much clearer than using an index (`person[0]`).
- Immutability: Like regular tuples, named tuples are immutable. This means their data cannot be changed after creation, making them safe for storing constant data.
- Efficiency: They are more memory-efficient than dictionaries and offer fast attribute access, making them suitable for representing a large number of data records.
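Named tuples also come with a few convenience methods. The lines below are an added illustration using the `Point` instance defined above:

```python
print(point._asdict())        # {'x': 10, 'y': 20} -- convert to a dictionary
moved = point._replace(x=15)  # returns a new Point; the original stays unchanged
print(moved)                  # Point(x=15, y=20)
```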
Here is a beginner-friendly example of a simple but robust program that calculates the area of a rectangle.
flowchart TD
A["Start calculate_rectangle_area(length, width)"] --> B{"length <= 0 or width <= 0?"}
B -->|Yes| C["Print Error: Length and width must be positive"]
C --> D[Return None]
B -->|No| E["area = length * width"]
E --> F[Return area]
sequenceDiagram
participant U as User
participant F as calculate_rectangle_area
U->>F: (10, 5)
F-->>U: 50
U->>U: Print "The area of the rectangle is: 50"
U->>F: (10, -5)
F-->>U: None + "Error message"
U->>U: Print "Could not calculate area due to invalid input."
U->>F: (0, 5)
F-->>U: None + "Error message"
U->>U: Print "Could not calculate area due to invalid input.""""
This module contains functions for working with rectangles.
"""
from typing import Optional, Union
# Good practice: use type hints for clarity.
def calculate_rectangle_area(length: Union[int, float], width: Union[int, float]) -> Optional[Union[int, float]]:
"""
Calculates the area of a rectangle.
This function takes a rectangle's length and width, validates them,
and returns the area.
Args:
length: The length of the rectangle. Must be a positive number.
width: The width of the rectangle. Must be a positive number.
Returns:
The calculated area, or None if the input is invalid.
"""
# Good practice: handle potential errors and edge cases.
if length <= 0 or width <= 0:
print("Error: Length and width must be positive numbers.")
return None
# Good practice: use meaningful variable names.
area = length * width
return area
# This is a good practice to prevent code from running when the module is imported.
if __name__ == "__main__":
# Example 1: Valid input
rect_area = calculate_rectangle_area(10, 5)
if rect_area is not None:
# Good practice: use f-strings for clear, readable output.
print(f"The area of the rectangle is: {rect_area}")
# Example 2: Invalid input (negative width)
invalid_area = calculate_rectangle_area(10, -5)
# The function prints an error message, but we still check the return value.
if invalid_area is None:
print("Could not calculate area due to invalid input.")
# Example 3: Invalid input (zero length)
zero_area = calculate_rectangle_area(0, 5)
if zero_area is None:
print("Could not calculate area due to invalid input.")PythonExplanation of production-grade practices
- Modularization: The code is organized into a single file (`rectangle.py`) with a clear purpose. For larger projects, code would be split across multiple modules and packages, a key principle of the "Don't Repeat Yourself" (DRY) philosophy.
- Documentation (Docstrings): The code includes a docstring that explains what the module does. The `calculate_rectangle_area` function also has a docstring explaining its purpose, arguments (Args), and return values (Returns), following the PEP 257 convention.
- Type Hinting: The code uses type hints (`->`, `:`) to indicate the expected types for function parameters and return values. This makes the code easier to understand and can be used by linters and IDEs to catch potential errors.
- Error Handling and Input Validation: The `calculate_rectangle_area` function explicitly checks for invalid input (non-positive numbers) and handles it gracefully by printing an error and returning `None`. This prevents the program from crashing and helps with debugging.
- Clear and Consistent Naming: The code follows PEP 8 naming conventions, such as `snake_case` for function and variable names, which makes the code easy to read.
- Safe Entry Point (`if __name__ == "__main__":`): The example usage code is placed within an `if __name__ == "__main__":` block. This is a standard practice that ensures the code inside the block only runs when the script is executed directly, not when it is imported as a module by another script.
Object-Oriented Programming
Classes and Objects
classDiagram
class Person {
<<class>>
- species : str = "Homo sapiens"
- name : str
- age : int
- email : str
+ __init__(name: str, age: int, email: str)
+ introduce() str
+ have_birthday() void
+ __str__() str
+ __repr__() str
}
class person1 {
<<instance>>
name = "Alice"
age = 25
email = "alice@email.com"
}
class person2 {
<<instance>>
name = "Bob"
age = 30
email = "bob@email.com"
}
Person <|-- person1 : instantiates
Person <|-- person2 : instantiates
# Basic class definition
class Person:
"""A class to represent a person."""
# Class variable
species = "Homo sapiens"
def __init__(self, name, age, email):
"""Initialize a Person object."""
# Instance variables
self.name = name
self.age = age
self.email = email
def introduce(self):
"""Return an introduction string."""
return f"Hi, I'm {self.name}, {self.age} years old."
def have_birthday(self):
"""Increment age by 1."""
self.age += 1
print(f"Happy birthday! {self.name} is now {self.age}")
def __str__(self):
"""String representation of the person."""
return f"Person(name='{self.name}', age={self.age})"
def __repr__(self):
"""Developer representation of the person."""
return f"Person('{self.name}', {self.age}, '{self.email}')"
# Creating objects
person1 = Person("Alice", 25, "alice@email.com")
person2 = Person("Bob", 30, "bob@email.com")
print(person1.introduce())
print(person2.introduce())
person1.have_birthday()
print(f"Alice's new age: {person1.age}")PythonClass Person
This code defines a class named Person which acts as a blueprint for creating objects that represent people.
Attributes
- Class variable (`species`):
  - `species = "Homo sapiens"`
  - This variable is defined at the class level and is shared by all instances (objects) of the `Person` class.
  - It holds a value that is constant for all people created from this blueprint.
- Instance variables (`self.name`, `self.age`, `self.email`):
  - These are unique to each instance of the class and are initialized within the special `__init__` method.
  - Each `Person` object you create will have its own `name`, `age`, and `email` with potentially different values.
- `self` parameter:
  - The `self` parameter in method definitions refers to the instance of the class that the method is being called on.
  - It allows instance methods to access and modify the instance's unique attributes.
Methods
- `__init__(self, name, age, email)`:
  - This is the constructor method, or "initializer," and is automatically called whenever a new `Person` object is created.
  - It sets up the initial state of the object by assigning the passed `name`, `age`, and `email` arguments to the instance variables.
- `introduce(self)`:
  - This is a custom instance method that returns a string introducing the person, using their unique `name` and `age` instance variables.
- `have_birthday(self)`:
  - This method modifies the state of the object by incrementing the `age` instance variable by 1.
  - It shows that instance methods can change an object's internal data.
- `__str__(self)`:
  - This magic or "dunder" method is used to provide a user-friendly string representation of the object.
  - It is automatically called by the `print()` and `str()` functions.
  - Its output is meant for the end user.
- `__repr__(self)`:
  - This magic method returns an official, unambiguous string representation of the object, typically used for debugging.
  - Ideally, the string could be used to recreate the object.
  - Its output is meant for the developer.
Creating and using objects
- `person1 = Person("Alice", 25, "alice@email.com")`: This line creates an instance of the `Person` class named `person1`. The arguments `"Alice"`, `25`, and `"alice@email.com"` are passed to the `__init__` method to initialize the object's instance variables.
- `person2 = Person("Bob", 30, "bob@email.com")`: A second, separate instance named `person2` is created.
- `person1.introduce()`: This calls the `introduce` method on the `person1` object. The method uses `person1`'s unique data (`"Alice"`, `25`) to produce the output.
- `person1.have_birthday()`: This calls the `have_birthday` method on `person1`, modifying only that object's `age`. `person2`'s age remains unchanged.
- `print(person1)`: This would implicitly call the `person1` object's `__str__` method and print `Person(name='Alice', age=26)` (after the birthday).
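The difference between `__str__` and `__repr__` can be seen by calling `str()` and `repr()` explicitly. The lines below are an added illustration, assuming the `Person` class and the `person1` object from the example above (after the birthday call):

```python
print(str(person1))   # Person(name='Alice', age=26)            -- user-friendly __str__
print(repr(person1))  # Person('Alice', 26, 'alice@email.com')  -- developer-facing __repr__
```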
Inheritance
classDiagram
class Person {
<<class>>
- species : str = "Homo sapiens"
- name : str
- age : int
- email : str
+ __init__(name: str, age: int, email: str)
+ introduce() str
+ have_birthday() void
+ __str__() str
+ __repr__() str
}
class Student {
<<subclass>>
- student_id : str
- grades : list
+ __init__(name: str, age: int, email: str, student_id: str)
+ add_grade(grade: int) void
+ get_average() float
+ is_passing(passing_grade: int=60) bool
+ introduce() str
}
class Teacher {
<<subclass>>
- subject : str
- salary : float
+ __init__(name: str, age: int, email: str, subject: str, salary: float)
+ teach() str
+ grade_student(student: Student, grade: int) void
}
%% Inheritance
Person <|-- Student
Person <|-- Teacher
%% Instances
class student {
<<instance>>
name = "Carol"
age = 20
email = "carol@email.com"
student_id = "S12345"
grades = [85, 92, 78, 88]
}
class teacher {
<<instance>>
name = "Dr. Smith"
age = 45
email = "smith@email.com"
subject = "Mathematics"
salary = 75000
}
Student <|-- student : instantiates
Teacher <|-- teacher : instantiates
Teacher --> Student : grades
# Inheritance example
class Student(Person):
"""A class to represent a student, inheriting from Person."""
def __init__(self, name, age, email, student_id):
"""Initialize a Student object."""
super().__init__(name, age, email) # Call parent constructor
self.student_id = student_id
self.grades = []
def add_grade(self, grade):
"""Add a grade to the student's record."""
if 0 <= grade <= 100:
self.grades.append(grade)
else:
print("Grade must be between 0 and 100")
def get_average(self):
"""Calculate and return the average grade."""
if self.grades:
return sum(self.grades) / len(self.grades)
return 0
def is_passing(self, passing_grade=60):
"""Check if student is passing."""
return self.get_average() >= passing_grade
def introduce(self):
"""Override parent method."""
return f"Hi, I'm {self.name}, a student with ID {self.student_id}."
class Teacher(Person):
"""A class to represent a teacher."""
def __init__(self, name, age, email, subject, salary):
"""Initialize a Teacher object."""
super().__init__(name, age, email)
self.subject = subject
self.salary = salary
def teach(self):
"""Return a teaching message."""
return f"{self.name} is teaching {self.subject}"
def grade_student(self, student, grade):
"""Grade a student."""
student.add_grade(grade)
print(f"{self.name} gave {student.name} a grade of {grade}")
# Using inheritance
student = Student("Carol", 20, "carol@email.com", "S12345")
teacher = Teacher("Dr. Smith", 45, "smith@email.com", "Mathematics", 75000)
print(student.introduce())
print(teacher.introduce())
student.add_grade(85)
student.add_grade(92)
student.add_grade(78)
teacher.grade_student(student, 88)
print(f"Student average: {student.get_average():.2f}")
print(f"Is passing: {student.is_passing()}")PythonExplanation of inheritance
Inheritance is a fundamental concept in Object-Oriented Programming (OOP) that allows a new class (the child or subclass) to adopt the attributes and methods of an existing class (the parent or base class). This promotes code reuse and creates a logical, hierarchical relationship between classes.
Parent class (Person)
The Person class, from the previous example, serves as the base class for Student and Teacher. Both students and teachers are people and share common attributes like name, age, and email, which are defined in Person.
Subclass: Student(Person)
The Student class inherits from Person, indicated by class Student(Person):.
- `__init__(...)`: The constructor for `Student` takes its own specific parameters (`student_id`), but it also needs to initialize the inherited attributes from `Person`.
  - `super().__init__(name, age, email)`: The `super()` function is used to call the `__init__` method of the parent class (`Person`). This efficiently handles the initialization of attributes inherited from the parent without duplicating code.
- `self.student_id`, `self.grades`: These are new instance variables specific to the `Student` class.
- `add_grade()`, `get_average()`, `is_passing()`: These are new methods that extend the functionality of the `Student` class beyond what the `Person` class offers.
- `introduce()`: This method overrides the `introduce()` method from the `Person` class. When `introduce()` is called on a `Student` object, Python executes the `Student` version, not the `Person` version. This allows `Student` objects to provide a more specific introduction.
Subclass: Teacher(Person)
The Teacher class also inherits from Person, and its methods and attributes extend the Person class in a different way.
- `__init__(...)`: It uses `super().__init__(...)` to initialize the `Person` attributes and adds its own specific attributes like `subject` and `salary`.
- `teach()`: A method specific to teachers.
- `grade_student(self, student, grade)`:
  - This method demonstrates how objects of different classes collaborate: a `Teacher` object interacts with a `Student` object.
  - It uses a method from another class (`student.add_grade(grade)`) to perform its task.
How inheritance is used in the example
- Object Creation: When `student = Student(...)` is called, a `Student` object is created. Python first calls the `Person` constructor via `super()` to set the name, age, and email, and then it sets the `student_id` and `grades` specific to the student.
- Method Overriding: `print(student.introduce())` calls the `introduce` method from the `Student` class, while `print(teacher.introduce())` calls the `introduce` method from the `Person` class, which `Teacher` inherited without modifying.
- Interaction between objects: The `teacher` object calls the `add_grade` method on the `student` object, demonstrating how different parts of a program can work together.
- Polymorphism: Both `student` and `teacher` objects are fundamentally `Person` objects, so they share the common characteristics of `Person` but also have their own specific methods and behaviors.
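The "a Student is also a Person" relationship can be checked at runtime with `isinstance()` and `issubclass()`. The lines below are an added illustration, assuming the classes and objects created above:

```python
print(isinstance(student, Student))  # True
print(isinstance(student, Person))   # True  -- a Student is also a Person
print(isinstance(teacher, Student))  # False
print(issubclass(Teacher, Person))   # True
```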
Advanced OOP Concepts
classDiagram
class Shape {
<<abstract>>
+ area()*
+ perimeter()*
}
class Rectangle {
- width : float
- height : float
+ __init__(width: float, height: float)
+ area() float
+ perimeter() float
}
class Circle {
- radius : float
+ __init__(radius: float)
+ area() float
+ perimeter() float
}
class BankAccount {
- _balance : float
+ __init__(initial_balance: float=0)
+ balance : float <<property>>
+ deposit(amount: float) void
+ withdraw(amount: float) void
}
%% Inheritance
Shape <|-- Rectangle
Shape <|-- Circle
%% Instances
class rectangle {
<<instance>>
width = 5
height = 10
}
class circle {
<<instance>>
radius = 3
}
class account {
<<instance>>
_balance = 1000
}
Rectangle <|-- rectangle : instantiates
Circle <|-- circle : instantiates
BankAccount <|-- account : instantiates
# Abstract classes and methods
from abc import ABC, abstractmethod
class Shape(ABC):
"""Abstract base class for shapes."""
@abstractmethod
def area(self):
"""Calculate the area of the shape."""
pass
@abstractmethod
def perimeter(self):
"""Calculate the perimeter of the shape."""
pass
class Rectangle(Shape):
"""Rectangle class inheriting from Shape."""
def __init__(self, width, height):
self.width = width
self.height = height
def area(self):
return self.width * self.height
def perimeter(self):
return 2 * (self.width + self.height)
class Circle(Shape):
"""Circle class inheriting from Shape."""
def __init__(self, radius):
self.radius = radius
def area(self):
return 3.14159 * self.radius ** 2
def perimeter(self):
return 2 * 3.14159 * self.radius
# Property decorators
class BankAccount:
"""Bank account with property decorators."""
def __init__(self, initial_balance=0):
self._balance = initial_balance
@property
def balance(self):
"""Get the current balance."""
return self._balance
@balance.setter
def balance(self, amount):
"""Set the balance with validation."""
if amount < 0:
raise ValueError("Balance cannot be negative")
self._balance = amount
def deposit(self, amount):
"""Deposit money to the account."""
if amount > 0:
self._balance += amount
else:
raise ValueError("Deposit amount must be positive")
def withdraw(self, amount):
"""Withdraw money from the account."""
if amount > self._balance:
raise ValueError("Insufficient funds")
if amount <= 0:
raise ValueError("Withdrawal amount must be positive")
self._balance -= amount
# Using the classes
rectangle = Rectangle(5, 10)
circle = Circle(3)
print(f"Rectangle area: {rectangle.area()}")
print(f"Circle area: {circle.area():.2f}")
account = BankAccount(1000)
print(f"Initial balance: ${account.balance}")
account.deposit(500)
print(f"After deposit: ${account.balance}")
account.withdraw(200)
print(f"After withdrawal: ${account.balance}")PythonAbstract classes and methods
- from abc import ABC, abstractmethod: The abc (Abstract Base Classes) module provides the tools for defining abstract classes in Python. ABC is the base class for abstract classes, and @abstractmethod is a decorator used to declare an abstract method.
- class Shape(ABC):: This defines Shape as an abstract base class. An abstract class cannot be instantiated directly. Its purpose is to define a common interface for its subclasses.
- @abstractmethod: This decorator marks the area() and perimeter() methods as abstract. Any non-abstract subclass of Shape must provide its own implementation for all abstract methods. If a subclass (e.g., Rectangle) fails to implement all abstract methods, Python will prevent you from creating an instance of that subclass.
- pass: The pass statement is a null operation. It is used here as a placeholder for the abstract method's body, indicating that the method is intended to be implemented by a subclass.
- class Rectangle(Shape): and class Circle(Shape):: These are concrete subclasses that inherit from Shape. They are "concrete" because they provide implementations for all of Shape's abstract methods (area() and perimeter()).
Property decorators
Property decorators provide a way to create “managed” attributes in a class. They let you implement getter, setter, and deleter methods for an attribute, allowing you to add validation, logging, or other logic without changing the public interface.
- class BankAccount:: A class that models a bank account.
- _balance: The private instance variable _balance is created with a leading underscore _ by convention to indicate that it is intended for internal use only and should not be accessed directly by code outside the class.
- @property: This decorator is used on the balance() method. It turns this method into a "getter" method. Now, when you access account.balance (without parentheses), Python implicitly calls this balance() method.
- @balance.setter: This decorator is used on the setter method. It is the "setter" for the balance property.
  - Validation: It includes a check to ensure that the amount is not negative. If an invalid value is provided (e.g., account.balance = -100), it raises a ValueError.
- Direct method calls (deposit and withdraw): These methods show the standard way of performing actions on the account. They include input validation and logic to ensure the account state remains consistent.
Using the classes
- rectangle = Rectangle(5, 10): An instance of Rectangle is created. Since it has implemented the area() and perimeter() methods, it can be instantiated.
- circle = Circle(3): An instance of Circle is created.
- account = BankAccount(1000): A BankAccount object is created with an initial balance.
- print(f"Initial balance: ${account.balance}"): This line accesses the balance using the @property getter method. It is read as if it were a normal attribute.
- account.deposit(500): This line calls the deposit method to add funds.
- account.withdraw(200): This line calls the withdraw method to remove funds.
Note: If you tried to create an instance of the abstract Shape class, you would get a TypeError because it is an abstract class.
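Both behaviors described above are easy to confirm. A quick sketch, assuming the Shape and BankAccount classes from the example are already defined:
# Quick check of the two behaviors described above; assumes the Shape and
# BankAccount classes defined in the example are in scope.
try:
    shape = Shape()                      # abstract classes cannot be instantiated
except TypeError as e:
    print(f"TypeError: {e}")

account = BankAccount(1000)
try:
    account.balance = -100               # the setter rejects negative balances
except ValueError as e:
    print(f"ValueError: {e}")            # "Balance cannot be negative"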
Error Handling
Exception Handling Flow
flowchart TD
A[Start] --> B[Try Block]
B --> C{Exception Occurs?}
C -->|No| D[Continue Execution]
C -->|Yes| E{Matching Except Block?}
E -->|Yes| F[Execute Except Block]
E -->|No| G[Unhandled Exception]
F --> H[Finally Block]
D --> H
H --> I[End]
G --> J[Program Crashes]

flowchart TD
A["Start divide_numbers(a, b)"] --> B["Try: result = a / b"]
B -->|Success| C[Return result]
B -->|ZeroDivisionError| D["Print Error: Cannot divide by zero!"]
D --> E[Return None]
B -->|TypeError| F["Print Error: Please provide numeric values!"]
F --> E
C --> G[End]
E --> G

flowchart TD
A["Start process_data(data)"] --> B[Try block]
B --> C["Convert data to int -> number = int(data)"]
C --> D["result = 100 / number"]
D --> E["index = [1,2,3][number]"]
E -->|Success| F["Return (result, index)"]
C -->|ValueError| G["Print Error: not a valid integer"]
D -->|ZeroDivisionError| H["Print Error: Cannot divide by zero"]
E -->|IndexError| I["Print Error: Index out of range"]
B -->|Other Exception| J["Print Unexpected error: e"]
G --> K["Finally: Print Processing completed"]
H --> K
I --> K
J --> K
F --> K
K --> L[End]

# Basic exception handling
def divide_numbers(a, b):
"""Divide two numbers with error handling."""
try:
result = a / b
return result
except ZeroDivisionError:
print("Error: Cannot divide by zero!")
return None
except TypeError:
print("Error: Please provide numeric values!")
return None
# Test the function
print(divide_numbers(10, 2)) # 5.0
print(divide_numbers(10, 0)) # Error message, returns None
print(divide_numbers(10, "2")) # Error message, returns None
# Multiple exception handling
def process_data(data):
"""Process data with comprehensive error handling."""
try:
# Try to convert to integer and perform operations
number = int(data)
result = 100 / number
index = [1, 2, 3][number] # This might raise IndexError
return result, index
except ValueError:
print(f"Error: '{data}' is not a valid integer")
except ZeroDivisionError:
print("Error: Cannot divide by zero")
except IndexError:
print("Error: Index out of range")
except Exception as e:
print(f"Unexpected error: {e}")
finally:
print("Processing completed")
# Test different scenarios
process_data("5") # Normal case
process_data("abc") # ValueError
process_data("0") # ZeroDivisionError
process_data("10") # IndexErrorPythonBasic exception handling (try-except)
This block of code is a fundamental mechanism for anticipating and handling potential errors at runtime.
- try block: This block contains the code that might raise an exception. If an error occurs during execution of this code, Python stops and immediately jumps to the appropriate except block.
- except ZeroDivisionError: This is an exception handler that specifically catches a ZeroDivisionError. If the try block attempts to divide by zero, this code is executed, printing a user-friendly error message instead of crashing the program. The function then returns None.
- except TypeError: This handler catches a TypeError, which occurs when an operation is performed on an inappropriate data type (e.g., trying to divide a number by a string). This block also prints an error and returns None.
Example function calls
- divide_numbers(10, 2): This call executes the try block successfully, calculates 10 / 2, and returns 5.0.
- divide_numbers(10, 0): This call raises a ZeroDivisionError in the try block. Python catches it with the except ZeroDivisionError block, prints the error message, and returns None.
- divide_numbers(10, "2"): This call raises a TypeError in the try block. Python catches it with the except TypeError block, prints the error message, and returns None.
Comprehensive exception handling (try-except-finally)
This example shows a more robust approach to handling different types of exceptions and introduces the finally block.
- try block: Contains multiple operations that could raise different exceptions:
  - int(data): Could raise a ValueError if data is not a valid number string.
  - 100 / number: Could raise a ZeroDivisionError if number is 0.
  - [1, 2, 3][number]: Could raise an IndexError if number is outside the valid index range (0, 1, or 2).
- Specific except blocks: The code provides separate except blocks for ValueError, ZeroDivisionError, and IndexError. When an exception is raised, Python looks for the first matching except block.
- General except block (except Exception as e): This acts as a catch-all for any other unforeseen exceptions that might occur. The exception object is assigned to the variable e, allowing you to inspect its details, such as the error message.
- finally block: This block is guaranteed to execute, regardless of whether an exception occurred or was handled. It is typically used for cleanup actions, like closing files or releasing resources.
Example function calls
process_data("5"): Thetryblock runs without any exceptions. Thefinallyblock is executed afterward.process_data("abc"):int("abc")raises aValueError, which is caught by the correspondingexceptblock. Thefinallyblock then executes.process_data("0"):100 / 0raises aZeroDivisionError, which is caught. Thefinallyblock executes.process_data("10"): The number is converted to10, but[1, 2, 3][10]raises anIndexError, which is caught. Thefinallyblock executes.
Custom Exceptions
classDiagram
class Exception {
}
class InsufficientFundsError {
- balance : float
- amount : float
+ __init__(balance: float, amount: float)
}
class InvalidAccountError {
}
class BankAccount {
- account_number : str
- balance : float
+ __init__(account_number: str, initial_balance: float=0)
+ withdraw(amount: float) void
}
%% Inheritance
Exception <|-- InsufficientFundsError
Exception <|-- InvalidAccountError
%% Relationships
BankAccount --> InsufficientFundsError : raises
BankAccount --> InvalidAccountError : raises

flowchart TD
A["Start withdraw(amount)"] --> B{"amount <= 0?"}
B -->|Yes| C["Raise ValueError: 'Invalid amount'"]
B -->|No| D{"amount > balance?"}
D -->|Yes| E["Raise InsufficientFundsError(balance, amount)"]
D -->|No| F[balance -= amount]
F --> G[Return success]
C --> H[Exception handling outside]
E --> H
G --> I[End]

sequenceDiagram
participant U as User
participant BA as BankAccount
participant E as InsufficientFundsError
U->>BA: withdraw(600)
BA->>BA: Check amount > balance
BA-->>E: raise InsufficientFundsError(500, 600)
E-->>U: Exception caught
U->>U: Print "Transaction failed: Insufficient funds..."# Custom exception classes
class InsufficientFundsError(Exception):
"""Exception raised when account has insufficient funds."""
def __init__(self, balance, amount):
self.balance = balance
self.amount = amount
super().__init__(f"Insufficient funds: ${balance} available, ${amount} requested")
class InvalidAccountError(Exception):
"""Exception raised for invalid account operations."""
pass
class BankAccount:
"""Bank account with custom exception handling."""
def __init__(self, account_number, initial_balance=0):
if not account_number:
raise InvalidAccountError("Account number cannot be empty")
self.account_number = account_number
self.balance = initial_balance
def withdraw(self, amount):
"""Withdraw money with custom exception handling."""
if amount <= 0:
raise ValueError("Withdrawal amount must be positive")
if amount > self.balance:
raise InsufficientFundsError(self.balance, amount)
self.balance -= amount
return self.balance
def deposit(self, amount):
"""Deposit money to the account."""
if amount <= 0:
raise ValueError("Deposit amount must be positive")
self.balance += amount
return self.balance
# Using custom exceptions
try:
account = BankAccount("ACC123", 1000)
print(f"Initial balance: ${account.balance}")
account.withdraw(500)
print(f"After withdrawal: ${account.balance}")
account.withdraw(600) # This will raise InsufficientFundsError
except InsufficientFundsError as e:
print(f"Transaction failed: {e}")
except InvalidAccountError as e:
print(f"Account error: {e}")
except ValueError as e:
print(f"Value error: {e}")PythonCustom exception classes
This code demonstrates how to create and use custom exceptions, which are user-defined classes that inherit from Python’s built-in Exception class. Using custom exceptions provides clearer, domain-specific error handling and improves code readability.
InsufficientFundsError
- class InsufficientFundsError(Exception):: This defines a new exception class that inherits from the base Exception class. By convention, custom exception class names end with Error.
- def __init__(self, balance, amount):: The constructor of the custom exception accepts specific parameters (balance, amount) related to the error. This allows the exception to carry meaningful context about what went wrong.
- super().__init__(...): This line calls the constructor of the parent Exception class, passing it a formatted error message. This ensures that the message is stored correctly within the exception object.
InvalidAccountError
- class InvalidAccountError(Exception):: Another custom exception class is defined for a different type of application-specific error.
- pass: This simple exception class does not require a custom constructor. It inherits everything from the base Exception class and serves as a specific type marker for catching this particular kind of error.
BankAccount class
This class uses the custom exceptions to handle business logic errors.
- __init__(self, account_number, initial_balance=0):: The constructor for the BankAccount validates the account_number. If it's empty, it raises the InvalidAccountError custom exception, which is more specific than a generic ValueError.
- withdraw(self, amount)::
  - Built-in ValueError: Used for an invalid input value (amount <= 0), which aligns with Python's convention for parameter validation.
  - Custom InsufficientFundsError: Used for a domain-specific error condition (amount > self.balance). The raise statement creates and throws an instance of the custom exception, passing the current balance and the requested amount to its constructor.
Using and catching custom exceptions
The try-except block demonstrates how to use and handle the different types of exceptions.
account = BankAccount("ACC123", 1000): A newBankAccountobject is created successfully.account.withdraw(500): This line executes without an error, and the balance is updated.account.withdraw(600):- The
withdrawmethod detects that600is greater than the current balance (500). - It raises an
InsufficientFundsErrorwith contextual information. - The execution of the
tryblock immediately stops.
- The
except InsufficientFundsError as e:: The raisedInsufficientFundsErroris caught by this specificexceptblock. The exception object is assigned to the variablee.- The message printed includes the custom message created in the exception’s
__init__method: “Transaction failed: Insufficient funds: 500 available, 600 requested”.
- The message printed includes the custom message created in the exception’s
except InvalidAccountError as e:: This block would catch anInvalidAccountError, for example, ifBankAccount("", 1000)was called.except ValueError as e:: This block would catch theValueError, for example, ifaccount.withdraw(0)was called.
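The last two cases are not exercised by the example output; here is a short sketch of what they would look like, reusing the classes defined above:
# Sketch of the two remaining error paths, assuming the custom exception
# classes and BankAccount from the example are already defined.
try:
    broken = BankAccount("", 1000)      # empty account number
except InvalidAccountError as e:
    print(f"Account error: {e}")        # "Account number cannot be empty"

account = BankAccount("ACC123", 1000)
try:
    account.withdraw(0)                 # non-positive amount
except ValueError as e:
    print(f"Value error: {e}")          # "Withdrawal amount must be positive"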
Context Managers
classDiagram
class FileManager {
- filename : str
- mode : str
- file : File
+ __init__(filename: str, mode: str)
+ __enter__() File
+ __exit__(exc_type, exc_val, exc_tb) bool
}
class File {
+ write(str) void
+ read() str
+ close() void
}
FileManager --> File : manages

sequenceDiagram
participant U as User
participant CM as FileManager
participant F as File
U->>CM: with FileManager("test.txt", "w")
CM->>CM: __enter__()
CM-->>U: file object (F)
U->>F: write("Hello, World!")
U->>F: write("This is a test file.")
U->>CM: Exit 'with' block
CM->>CM: __exit__(exc_type=None)
CM->>F: close()
U->>CM: with FileManager("test.txt", "r")
CM->>CM: __enter__()
CM-->>U: file object (F)
U->>F: read()
F-->>U: "Hello, World!\nThis is a test file."
U->>CM: Exit 'with' block
CM->>CM: __exit__(exc_type=None)
CM->>F: close()

# Using context managers for resource management
class FileManager:
"""Custom context manager for file operations."""
def __init__(self, filename, mode):
self.filename = filename
self.mode = mode
self.file = None
def __enter__(self):
print(f"Opening file: {self.filename}")
self.file = open(self.filename, self.mode)
return self.file
def __exit__(self, exc_type, exc_val, exc_tb):
print(f"Closing file: {self.filename}")
if self.file:
self.file.close()
if exc_type:
print(f"Exception occurred: {exc_val}")
return False # Don't suppress exceptions
# Using the context manager
try:
with FileManager("test.txt", "w") as file:
file.write("Hello, World!")
file.write("\nThis is a test file.")
# File will be automatically closed even if an exception occurs
with FileManager("test.txt", "r") as file:
content = file.read()
print(f"File content:\n{content}")
except FileNotFoundError:
print("File not found!")
except PermissionError:
print("Permission denied!")PythonThis code defines a custom context manager called FileManager, which ensures that files are properly opened and, most importantly, closed, even if errors occur during file operations. This prevents resource leaks and results in cleaner, more robust code.
FileManager class
To function as a context manager, a class must implement the __enter__() and __exit__() special methods.
- __init__(self, filename, mode): The constructor initializes the FileManager instance with the filename and mode (e.g., 'w' for write, 'r' for read) for the file. It also sets self.file to None initially.
- __enter__(self): This method is called when the with statement is entered.
  - It prints a message indicating the file is being opened.
  - It opens the specified file and stores the file object in self.file.
  - It returns the file object, which is then assigned to the variable after the as keyword in the with statement (file in this case).
- __exit__(self, exc_type, exc_val, exc_tb): This method is called when the with block is exited, whether it completes successfully or an exception occurs.
  - It prints a message indicating the file is being closed.
  - Resource Cleanup: It checks if self.file exists and, if so, calls its close() method, ensuring the file is always closed.
  - Exception Handling: The parameters exc_type, exc_val, and exc_tb contain information about any exception that was raised in the with block. If an exception occurred (exc_type is not None), it prints a message about the exception.
  - Exception Suppression: Returning False (the default return value if a function has no return statement) from __exit__ causes any exception that occurred inside the with block to be re-raised. Returning a True value would suppress the exception.
Using the context manager
with FileManager("test.txt", "w") as file:: Thiswithstatement creates an instance ofFileManagerand calls its__enter__method. The returned file object is assigned to thefilevariable. The code within the indented block is then executed.file.write(...): Thefileobject is used to write data.- Exiting the block: After the writing is complete, the
__exit__method is automatically called. It closes the file, ensuring the changes are saved and the resource is released. - Second
withstatement: A new instance ofFileManageris created to read from the file. try...exceptblock: Wrapping thewithstatements in atry...exceptblock is a good practice for handling errors likeFileNotFoundErrororPermissionErrorthat might occur during the initialopen()call.
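To see the cleanup-plus-re-raise behavior explicitly, here is a small sketch that forces an error inside the with block (it assumes the FileManager class and the test.txt file from the example above):
# Sketch of the error path: an exception raised inside the with block still
# triggers __exit__ (so the file is closed), and is then re-raised because
# __exit__ returns False.
try:
    with FileManager("test.txt", "r") as file:
        raise RuntimeError("something went wrong while processing")
except RuntimeError as e:
    print(f"Caught after cleanup: {e}")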
File Operations
File Handling Flow
flowchart TD
A[File Operations] --> B[Open File]
B --> C{File Opened?}
C -->|Yes| D[Perform Operations]
C -->|No| E[Handle Error]
D --> F[Read/Write/Append]
F --> G[Close File]
G --> H[End]
E --> H
I[File Modes] --> J[r - Read]
I --> K[w - Write]
I --> L[a - Append]
I --> M[r+ - Read/Write]
I --> N[x - Exclusive Create]

classDiagram
class FileOperations {
+write_to_file(filename, content) void
+read_from_file(filename) str | None
+append_to_file(filename, content) void
}
class File {
+write(str) void
+read() str
+close() void
}
FileOperations --> File : uses

flowchart TD
A[Start] --> B[Open file in 'w' mode]
B --> C{Error?}
C -- No --> D[Write content]
D --> E[Close file]
E --> F[Print success message]
C -- Yes --> G[Catch Exception]
G --> H[Print error message]
H --> I[End]
F --> I[End]

# Basic file operations
def write_to_file(filename, content):
"""Write content to a file."""
try:
with open(filename, 'w') as file:
file.write(content)
print(f"Successfully wrote to {filename}")
except Exception as e:
print(f"Error writing to file: {e}")
def read_from_file(filename):
"""Read content from a file."""
try:
with open(filename, 'r') as file:
content = file.read()
return content
except FileNotFoundError:
print(f"File {filename} not found")
return None
except Exception as e:
print(f"Error reading file: {e}")
return None
def append_to_file(filename, content):
"""Append content to a file."""
try:
with open(filename, 'a') as file:
file.write(content)
print(f"Successfully appended to {filename}")
except Exception as e:
print(f"Error appending to file: {e}")
# Example usage
sample_text = """Python File Handling
This is a sample text file.
It contains multiple lines.
Each line demonstrates file operations."""
# Write to file
write_to_file("sample.txt", sample_text)
# Read from file
content = read_from_file("sample.txt")
if content:
print("File content:")
print(content)
# Append to file
append_to_file("sample.txt", "\nThis line was appended.")
# Read updated content
updated_content = read_from_file("sample.txt")
if updated_content:
print("\nUpdated file content:")
print(updated_content)

File handling functions and the with statement
This code demonstrates the three fundamental file operations: writing, reading, and appending. It also effectively utilizes the with statement and try...except blocks for robust and safe file management.
- File Modes: The open() function takes a file mode as an argument to specify the intended operation (the 'x' and 'r+' modes from the diagram are sketched after this list).
  - 'w' (Write): Opens a file for writing. If the file already exists, its contents are erased and replaced with the new data. If the file does not exist, a new one is created.
  - 'r' (Read): Opens a file for reading. If the file does not exist, a FileNotFoundError is raised.
  - 'a' (Append): Opens a file for writing. New content is added to the end of the existing file. If the file does not exist, a new one is created.
- with open(...) as file:: The with statement creates a context manager that ensures files are properly closed automatically, even if errors occur. This eliminates the risk of resource leaks and is the recommended practice for file handling.
- try...except: Wrapping the file operations in a try...except block is a standard practice for anticipating and handling potential exceptions gracefully.
- except FileNotFoundError: Catches the specific error that occurs when a file is opened in read mode ('r') and does not exist.
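The flowchart above also lists the 'x' (exclusive create) and 'r+' (read/write) modes, which the example functions do not use. A brief sketch of both, assuming sample.txt already exists from the example:
# Sketch of the 'x' and 'r+' modes from the diagram (not used by the example).
try:
    with open("sample.txt", "x") as f:    # 'x' fails if the file already exists
        f.write("never reached")
except FileExistsError:
    print("sample.txt already exists, so 'x' mode refuses to create it")

with open("sample.txt", "r+") as f:       # read and write on the same handle
    first_line = f.readline()
    f.seek(0, 2)                          # jump to the end before writing
    f.write("\nAdded via r+ mode")
print(first_line)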
write_to_file(filename, content)
- This function opens filename in 'w' (write) mode.
- The content from the sample_text variable is written to the file, overwriting any previous content.
- A try...except block is included to catch potential errors during the file-writing process.
read_from_file(filename)
- This function opens filename in 'r' (read) mode.
- The file.read() method is used to read the entire content of the file into a string.
- A try...except block handles the case where the file might not exist (FileNotFoundError), returning None instead of crashing the program.
append_to_file(filename, content)
- This function opens filename in 'a' (append) mode.
- The new content string, starting with a newline character \n, is added to the end of the file, preserving the original content.
- A try...except block catches any potential errors during the append operation.
Example usage
The script demonstrates a full cycle of file operations:
- Writing: sample.txt is created with the initial text.
- Reading: The content is read back and printed to the console.
- Appending: A new line is added to the end of sample.txt.
- Re-reading: The updated content of sample.txt is read back and printed, showing the original text plus the new, appended line.
Advanced File Operations
flowchart TD
A[Start] --> B{"Save JSON?"}
B -- Yes --> C["save_json()"]
C --> D["students.json created"]
B -- No --> E{"Save CSV?"}
E -- Yes --> F["save_csv()"]
F --> G["students.csv created"]
E -- No --> H["Skip file ops"]
D --> I["Move to data/"]
G --> I
I --> J["explore_directory()"]
J --> K[End]

project/
│── main.py
│── data/
│ ├── students.json
│ ├── students.csv

sequenceDiagram
participant U as User
participant S as save_json()
participant F as File (students.json)
U->>S: save_json(student_data, "students.json")
S->>F: open("students.json", "w")
S->>F: json.dump(data, indent=4)
F-->>S: Success
S-->>U: "Data saved to students.json"
U->>S: load_json("students.json")
S->>F: open("students.json", "r")
F-->>S: file content
S->>U: parsed dict

import os
import json
import csv
from pathlib import Path
# Working with different file formats
# JSON files
def save_json(data, filename):
"""Save data to JSON file."""
try:
with open(filename, 'w') as file:
json.dump(data, file, indent=4)
print(f"Data saved to {filename}")
except Exception as e:
print(f"Error saving JSON: {e}")
def load_json(filename):
"""Load data from JSON file."""
try:
with open(filename, 'r') as file:
return json.load(file)
except FileNotFoundError:
print(f"JSON file {filename} not found")
return None
except json.JSONDecodeError:
print(f"Invalid JSON in {filename}")
return None
# CSV files
def save_csv(data, filename, headers):
"""Save data to CSV file."""
try:
with open(filename, 'w', newline='') as file:
writer = csv.writer(file)
writer.writerow(headers)
writer.writerows(data)
print(f"CSV data saved to {filename}")
except Exception as e:
print(f"Error saving CSV: {e}")
def load_csv(filename):
"""Load data from CSV file."""
try:
with open(filename, 'r') as file:
reader = csv.reader(file)
data = list(reader)
return data
except FileNotFoundError:
print(f"CSV file {filename} not found")
return None
except Exception as e:
print(f"Error loading CSV: {e}")
return None
# Example data
student_data = {
"students": [
{"name": "Alice", "age": 20, "grade": 85},
{"name": "Bob", "age": 22, "grade": 90},
{"name": "Carol", "age": 19, "grade": 78}
],
"course": "Python Programming",
"semester": "Fall 2024"
}
# Save to JSON
save_json(student_data, "students.json")
# Load from JSON
loaded_data = load_json("students.json")
if loaded_data:
print("Loaded JSON data:")
print(f"Course: {loaded_data['course']}")
for student in loaded_data['students']:
print(f" {student['name']}: {student['grade']}")
# Save to CSV
csv_data = [[s['name'], s['age'], s['grade']] for s in student_data['students']]
csv_headers = ['Name', 'Age', 'Grade']
save_csv(csv_data, "students.csv", csv_headers)
# Load from CSV
loaded_csv = load_csv("students.csv")
if loaded_csv:
print("\nLoaded CSV data:")
for row in loaded_csv:
print(row)
# File and directory operations using pathlib
def explore_directory(path):
"""Explore directory contents."""
directory = Path(path)
if not directory.exists():
print(f"Directory {path} does not exist")
return
print(f"Contents of {path}:")
for item in directory.iterdir():
if item.is_file():
size = item.stat().st_size
print(f" 📄 {item.name} ({size} bytes)")
elif item.is_dir():
print(f" 📁 {item.name}/")
# Create directory if it doesn't exist
data_dir = Path("data")
data_dir.mkdir(exist_ok=True)
# Move files to the data directory
if Path("students.json").exists():
Path("students.json").rename(data_dir / "students.json")
if Path("students.csv").exists():
Path("students.csv").rename(data_dir / "students.csv")
explore_directory("data")PythonThis code demonstrates advanced file handling in Python, focusing on working with different file formats like JSON and CSV, and performing file system operations in a modern, object-oriented way using the pathlib module.
Working with JSON files
The json module is a built-in library for working with JSON (JavaScript Object Notation), a common data format for data exchange. Python dictionaries and lists can be seamlessly converted to and from JSON format.
- save_json(data, filename):
  - json.dump(data, file, indent=4): This function serializes a Python object (data) and writes it to a file-like object (file). The indent=4 argument formats the output to be more human-readable.
- load_json(filename):
  - json.load(file): This function deserializes a JSON file, converting its content back into a Python object.
  - Error handling: The function includes specific except blocks to handle FileNotFoundError (if the file doesn't exist) and json.JSONDecodeError (if the file contains invalid JSON).
Working with CSV files
The csv module provides functionality to read from and write to CSV (Comma-Separated Values) files, which are commonly used for tabular data.
- save_csv(data, filename, headers):
  - newline='': This parameter in the open() function prevents blank lines from being inserted between rows, a common issue when writing CSV files in Python.
  - csv.writer(file): Creates a writer object that converts a list of lists into CSV format.
  - writer.writerow(headers): Writes a single row, typically used for headers.
  - writer.writerows(data): Writes all rows from an iterable (like a list of lists) at once.
- load_csv(filename):
  - csv.reader(file): Returns a reader object that iterates over lines in the CSV file.
  - data = list(reader): Reads all rows from the reader object and stores them as a list of lists. An alternative, DictWriter/DictReader, is sketched below.
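For tabular data with headers, the csv module also offers DictWriter and DictReader, which map each row to a dictionary keyed by the header names. A small sketch (the students_dict.csv filename is just for illustration):
# Alternative to csv.writer/csv.reader: dictionary-based rows.
import csv

rows = [
    {"Name": "Alice", "Age": 20, "Grade": 85},
    {"Name": "Bob", "Age": 22, "Grade": 90},
]

with open("students_dict.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["Name", "Age", "Grade"])
    writer.writeheader()
    writer.writerows(rows)

with open("students_dict.csv", "r", newline="") as f:
    for row in csv.DictReader(f):
        print(row["Name"], row["Grade"])  # values are read back as strings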
File system operations with pathlib
The pathlib module (introduced in Python 3.4) offers a modern, object-oriented approach to file system paths, providing a cleaner and more cross-platform alternative to the older os.path module.
- Path(...): Creates a Path object that represents a file or directory path. Path objects have intuitive methods for file system interaction.
- directory.exists(): Checks if a path exists.
- directory.iterdir(): Iterates over all files and subdirectories within a directory.
- item.is_file() / item.is_dir(): Methods to determine if a path is a file or a directory.
- / operator: A clean and intuitive way to join paths. For example, data_dir / "students.json" creates a new Path object for the file inside the data_dir directory.
- data_dir.mkdir(exist_ok=True): Creates a new directory. The exist_ok=True argument prevents an error if the directory already exists, making the operation idempotent.
- path.rename(target): Moves or renames a file or directory.
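A few more pathlib conveniences that the example does not use, sketched here assuming the data directory created above exists (the notes.txt file is hypothetical):
# Additional pathlib features (not used in the example above).
from pathlib import Path

notes = Path("data") / "notes.txt"            # hypothetical file for illustration
notes.write_text("pathlib can write text directly\n")
print(notes.read_text())                      # read the whole file back as a string

for json_file in Path("data").glob("*.json"): # pattern matching within a directory
    print(json_file.name, json_file.suffix)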
Example walkthrough
- JSON Creation and Storage: student_data (a Python dictionary with nested lists and dictionaries) is saved to students.json using save_json.
- JSON Loading: The students.json file is loaded back into the program, and its contents are printed to verify the data integrity.
- CSV Preparation and Storage: A list of lists is created from student_data to match the structure required by writer.writerows. This data is then saved to students.csv.
- CSV Loading: The students.csv file is loaded and printed, showing the data arranged as a list of lists.
- Directory Management: A directory named data is created using pathlib.Path.mkdir. The exist_ok=True flag is crucial for ensuring the script can be run multiple times without failure.
- File Movement: The previously created students.json and students.csv files are moved into the newly created data directory using the rename method.
- Directory Exploration: The explore_directory function uses pathlib's iterdir() and path properties to inspect the data directory and print its contents, including file names and sizes.
Modules and Packages
Module Structure
graph TD
A[Python Module System] --> B[Built-in Modules]
A --> C[Standard Library]
A --> D[Third-party Packages]
A --> E[Custom Modules]
B --> B1[math, random, datetime]
C --> C1[os, sys, json, csv]
D --> D1[requests, pandas, numpy]
E --> E1[Your own .py files]
F[Import Methods] --> G[import module]
F --> H[from module import function]
F --> I[import module as alias]
F --> J[from module import *]

flowchart TD
subgraph MathOperations["math_operations.py"]
A1["add(a, b)"] --> R1["a + b"]
A2["subtract(a, b)"] --> R2["a - b"]
A3["multiply(a, b)"] --> R3["a * b"]
A4["divide(a, b)"] -->|Check b==0| Err1["ValueError"] --> R4["a / b"]
A5["power(base, exponent)"] --> R5["base ** exponent"]
A6["square_root(number)"] -->|Check number<0| Err2["ValueError"] --> R6["math.sqrt(number)"]
A7["factorial(n)"] -->|n<0| Err3["ValueError"]
A7 --> R7["n * factorial(n-1)"]
A8["is_prime(n)"] -->|"Loop 2..sqrt(n)"| R8["Check divisibility"]
A8 --> Bool1["Return True/False"]
A9["fibonacci(n)"] -->|"Base cases (0,1,2)"| R9["Generate sequence iteratively"]
Const1["PI = math.pi"]
Const2["E = math.e"]
end
%% Group outputs
R1 & R2 & R3 & R4 & R5 & R6 & R7 & R8 & R9 -.-> Output["Results"]

Creating Custom Modules
# Create a file called math_operations.py
"""
Mathematical operations module.
Contains functions for common mathematical calculations.
"""
import math
def add(a, b):
"""Add two numbers."""
return a + b
def subtract(a, b):
"""Subtract two numbers."""
return a - b
def multiply(a, b):
"""Multiply two numbers."""
return a * b
def divide(a, b):
"""Divide two numbers."""
if b == 0:
raise ValueError("Cannot divide by zero")
return a / b
def power(base, exponent):
"""Calculate base raised to the power of exponent."""
return base ** exponent
def square_root(number):
"""Calculate square root of a number."""
if number < 0:
raise ValueError("Cannot calculate square root of negative number")
return math.sqrt(number)
def factorial(n):
"""Calculate factorial of a number."""
if n < 0:
raise ValueError("Factorial is not defined for negative numbers")
if n == 0 or n == 1:
return 1
return n * factorial(n - 1)
def is_prime(n):
"""Check if a number is prime."""
if n < 2:
return False
for i in range(2, int(math.sqrt(n)) + 1):
if n % i == 0:
return False
return True
def fibonacci(n):
"""Generate Fibonacci sequence up to n terms."""
if n <= 0:
return []
elif n == 1:
return [0]
elif n == 2:
return [0, 1]
sequence = [0, 1]
for i in range(2, n):
sequence.append(sequence[i-1] + sequence[i-2])
return sequence
# Module-level variables
PI = math.pi
E = math.e
# Module test code
if __name__ == "__main__":
# This code runs only when the module is executed directly
print("Testing math_operations module:")
print(f"Add: {add(5, 3)}")
print(f"Square root of 16: {square_root(16)}")
print(f"Factorial of 5: {factorial(5)}")
print(f"Is 17 prime? {is_prime(17)}")
print(f"First 10 Fibonacci numbers: {fibonacci(10)}")PythonThis is a well-structured Python module for mathematical operations, saved as math_operations.py. It demonstrates several important programming concepts.
Module structure and purpose
- Module: A Python module is simply a .py file containing Python code. This file bundles related functions, variables, and constants (PI, E) into a reusable unit. Other scripts can import this module to use its functionality.
- Docstrings ("""..."""): The code uses docstrings at the module level and for each function. This is a best practice for documenting code. These docstrings can be accessed programmatically (e.g., help(math_operations)) and provide a description of the code's purpose and functionality.
- Import: The statement import math brings the functionality of Python's built-in math module into scope, allowing the custom functions to use it (e.g., math.sqrt and math.pi).
Mathematical functions
The module defines several functions for different mathematical tasks, with clear examples of good function design.
- Basic Arithmetic (add, subtract, multiply): Simple functions that perform common arithmetic operations.
- Division with Error Handling (divide): This function includes explicit error handling to prevent division by zero. If b is 0, it raises a ValueError, which is a standard way to signal that a function received an invalid argument.
- Advanced Operations (power, square_root, factorial):
  - power: Uses the ** operator for exponentiation.
  - square_root: Uses math.sqrt() for efficiency and includes error handling for negative numbers, as the square root of a negative number is undefined in the real number system.
  - factorial: Uses a recursive approach and handles edge cases for negative numbers and zero.
- Algorithmic Functions (is_prime, fibonacci):
  - is_prime: Efficiently checks for primality by only iterating up to the square root of the number.
  - fibonacci: Generates a list of Fibonacci numbers up to a given term n. It correctly handles edge cases for n being 0, 1, or 2.
Module-level variables
- Constants (PI, E): math.pi and math.e are assigned to module-level variables. This allows other scripts that import math_operations to access these constants more conveniently (e.g., math_operations.PI).
Module entry point (if __name__ == "__main__":)
This code block is a standard Python idiom.
- How it works:
  - __name__ is a built-in variable that is set to "__main__" when a script is executed directly from the command line.
  - __name__ is set to the module's name (e.g., "math_operations") when it is imported by another script.
- Purpose: The code inside this block (the "module test code") will only run if you execute math_operations.py directly. This allows a single file to contain both reusable functions for a larger program and a test suite or demonstration code for when it is run independently. A brief sketch of the difference follows.
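A brief sketch of the difference, assuming math_operations.py sits in the same directory as the importing script:
# Importing the module does NOT trigger the test block, because inside the
# module __name__ is "math_operations" rather than "__main__".
import math_operations

print(math_operations.__name__)    # "math_operations"
print(math_operations.add(2, 3))   # the functions are still available: 5
# Running "python math_operations.py" directly would print the test output instead.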
Using Modules
flowchart TD
subgraph CustomModule["Custom Module: math_operations.py"]
M1["add()"]
M2["multiply()"]
M3["divide()"]
M4["is_prime()"]
M5["fibonacci()"]
M6["square_root()"]
end
subgraph BuiltInModules["Built-in Modules"]
B1["random"]
B2["datetime"]
B3["os"]
end
%% Import Methods
User["Your Script"] -->|Method 1: import math_operations| CustomModule
User -->|Method 2: from math_operations import multiply, divide, is_prime| M2 & M3 & M4
User -->|Method 3: import math_operations as math_ops| M5
User -->|Method 4: from math_operations import *| M6 & M1
%% Built-in
User --> B1
User --> B2
User --> B3
%% Examples
B1 --> R1["randint(1,100), choice(list)"]
B2 --> R2["datetime.now(), strftime()"]
B3 --> R3["getcwd(), listdir()"]# Different ways to import modules
# Method 1: Import entire module
import math_operations
result = math_operations.add(10, 5)
print(f"10 + 5 = {result}")
# Method 2: Import specific functions
from math_operations import multiply, divide, is_prime
product = multiply(4, 7)
quotient = divide(20, 4)
prime_check = is_prime(23)
print(f"4 * 7 = {product}")
print(f"20 / 4 = {quotient}")
print(f"Is 23 prime? {prime_check}")
# Method 3: Import with alias
import math_operations as math_ops
fibonacci_seq = math_ops.fibonacci(8)
print(f"Fibonacci sequence: {fibonacci_seq}")
# Method 4: Import all (use with caution)
from math_operations import *
sqrt_result = square_root(25)
print(f"Square root of 25: {sqrt_result}")
# Using built-in modules
import random
import datetime
import os
# Random module
random_number = random.randint(1, 100)
random_choice = random.choice(['apple', 'banana', 'orange'])
print(f"Random number: {random_number}")
print(f"Random choice: {random_choice}")
# Datetime module
current_time = datetime.datetime.now()
formatted_time = current_time.strftime("%Y-%m-%d %H:%M:%S")
print(f"Current time: {formatted_time}")
# OS module
current_directory = os.getcwd()
files_in_directory = os.listdir(".")
print(f"Current directory: {current_directory}")
print(f"Files in directory: {files_in_directory[:5]}") # Show first 5 filesPythonThis code demonstrates four different ways to import the custom math_operations module and showcases the use of several built-in standard library modules.
Methods for importing modules
The different import methods offer a trade-off between convenience and clarity.
Method 1: Import entire module
- import math_operations: Imports the entire math_operations module.
- Usage: To access any function or variable from the module, you must use the module name followed by a dot (.), for example, math_operations.add().
- Pros: This is very explicit and makes it clear where a function or variable originated. It prevents naming conflicts if you import from multiple modules with similar function names.
- Cons: Can be more verbose.
Method 2: Import specific functions
- from math_operations import multiply, divide, is_prime: Imports only the specified names directly into the current script's namespace.
- Usage: You can call the imported functions directly without a module prefix, for example, multiply().
- Pros: Makes the code more concise and easier to read, especially if you only need a few functions.
- Cons: Can lead to naming conflicts if you import functions with the same name from different modules.
Method 3: Import with alias
- import math_operations as math_ops: Imports the entire module but assigns it a shorter, more convenient alias.
- Usage: You access the module's contents using the alias, for example, math_ops.fibonacci().
- Pros: Reduces typing for long module names while still providing a clear namespace. This is a common practice for scientific libraries like NumPy (import numpy as np) and Pandas (import pandas as pd).
- Cons: Adds another name to remember.
Method 4: Import all (from ... import *)
- from math_operations import *: Imports all public names (functions, classes, variables) from the module directly into the current namespace.
- Usage: You can call all public functions and variables directly, without any prefix, for example, square_root().
- Pros: Extremely convenient for interactive sessions or small scripts where you need many items from a single module.
- Cons: Not recommended for production code. It makes the code less readable because it is unclear where a function came from. It can also easily cause naming conflicts and is more difficult for automatic tools to analyze. A module can limit what the wildcard exposes via __all__, as sketched below.
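One mitigation is for the module itself to define __all__, which restricts what a wildcard import exposes. The math_operations.py shown above does not define it; this is a sketch of what adding it would look like:
# Hypothetical addition to math_operations.py (not present in the version above):
__all__ = ["add", "subtract", "multiply", "divide"]

# With this in place, "from math_operations import *" would bring in only those
# four names; anything else (e.g., is_prime) would need an explicit import.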
Built-in standard library modules
Python’s standard library includes many modules for common tasks.
- random module: Provides functions for generating pseudo-random numbers and making random choices.
  - random.randint(1, 100): Returns a random integer between 1 and 100, inclusive.
  - random.choice([...]): Returns a randomly selected item from a sequence.
- datetime module: Offers classes and functions for working with dates and times.
  - datetime.datetime.now(): Returns a datetime object representing the current date and time.
  - strftime(...): Formats a datetime object into a human-readable string based on specified codes.
- os module: Provides a way to interact with the operating system, including file system operations.
  - os.getcwd(): Returns the current working directory as a string.
  - os.listdir("."): Returns a list of all files and directories in the current directory (.).
Package Creation
flowchart TD
subgraph my_package["📦 my_package/"]
direction TB
I0["__init__.py"]
subgraph geometry["📂 geometry/"]
I1["__init__.py"]
S1["shapes.py"]
S2["calculations.py"]
end
subgraph utilities["📂 utilities/"]
I2["__init__.py"]
U1["helpers.py"]
U2["validators.py"]
end
end
%% shapes.py classes
S1 --> C1["class Circle\n- radius\n+ area()\n+ circumference()"]
S1 --> C2["class Rectangle\n- width, height\n+ area()\n+ perimeter()"]
%% calculations.py functions
S2 --> F1["def distance(point1, point2)"]
S2 --> F2["def midpoint(point1, point2)"]
%% geometry/__init__.py exports
I1 -->|exports| C1
I1 -->|exports| C2
I1 -->|exports| F1
I1 -->|exports| F2
%% utilities placeholders (future use)
U1 --> UH["helper functions"]
U2 --> UV["validation functions"]
%% Usage Example
User["User Script"] -->|from my_package.geometry import Circle, Rectangle, distance| C1 & C2 & F1# Create a package structure:
# my_package/
# __init__.py
# geometry/
# __init__.py
# shapes.py
# calculations.py
# utilities/
# __init__.py
# helpers.py
# validators.py
# geometry/shapes.py
"""Geometric shapes module."""
import math
class Circle:
"""Circle class."""
def __init__(self, radius):
self.radius = radius
def area(self):
return math.pi * self.radius ** 2
def circumference(self):
return 2 * math.pi * self.radius
class Rectangle:
"""Rectangle class."""
def __init__(self, width, height):
self.width = width
self.height = height
def area(self):
return self.width * self.height
def perimeter(self):
return 2 * (self.width + self.height)
# geometry/__init__.py
"""Geometry package."""
from .shapes import Circle, Rectangle
from .calculations import distance, midpoint
__version__ = "1.0.0"
__all__ = ["Circle", "Rectangle", "distance", "midpoint"]
# geometry/calculations.py
"""Geometric calculations."""
import math
def distance(point1, point2):
"""Calculate distance between two points."""
x1, y1 = point1
x2, y2 = point2
return math.sqrt((x2 - x1)**2 + (y2 - y1)**2)
def midpoint(point1, point2):
"""Calculate midpoint between two points."""
x1, y1 = point1
x2, y2 = point2
return ((x1 + x2) / 2, (y1 + y2) / 2)
# Using the package
from my_package.geometry import Circle, Rectangle, distance
# Create shapes
circle = Circle(5)
rectangle = Rectangle(4, 6)
print(f"Circle area: {circle.area():.2f}")
print(f"Rectangle area: {rectangle.area()}")
# Calculate distance
point_a = (0, 0)
point_b = (3, 4)
dist = distance(point_a, point_b)
print(f"Distance between {point_a} and {point_b}: {dist}")PythonThis example demonstrates the creation and use of a Python package, a way to organize related modules into a directory hierarchy. The use of packages is essential for managing and reusing code effectively in larger projects.
Package structure
The directory structure shown is a standard layout for a Python package named my_package.
- my_package/: The top-level package directory.
  - __init__.py: This file marks the my_package directory as a package and is executed when the package is imported.
  - geometry/: A sub-package for all geometry-related code.
    - __init__.py: Defines what functions and classes from the geometry sub-package should be publicly available when a user imports it.
    - shapes.py: A module containing classes for different geometric shapes, like Circle and Rectangle.
    - calculations.py: A module containing utility functions for geometric calculations, like distance and midpoint.
  - utilities/: Another sub-package for general utility functions (this package is defined but not used in the example).
The role of __init__.py
While __init__.py is no longer strictly required in Python 3.3+ for marking a directory as a package, it remains a common and best practice for several reasons.
- Simplifies imports: It allows you to expose specific functionality from inner modules to a higher level. For example, by importing Circle in my_package/geometry/__init__.py, users can import it with from my_package.geometry import Circle instead of the longer from my_package.geometry.shapes import Circle.
- Defines the namespace: The __all__ variable in __init__.py explicitly defines what gets imported when a user runs a wildcard import like from my_package.geometry import *.
- Initializes the package: It can contain code that runs once when the package is first imported, which is useful for setting up package-wide configurations.
- Metadata: It is a good place to define package-level variables, such as __version__.
Package components
geometry/shapes.py
This module defines classes like Circle and Rectangle. These classes encapsulate related data (radius, width, height) and functionality (area, circumference, perimeter).
geometry/calculations.py
This module defines standalone functions like distance and midpoint that operate on geometric data.
Using the package
- from my_package.geometry import Circle, Rectangle, distance: This line imports specific items from the geometry sub-package. Because __init__.py exposes Circle, Rectangle, and distance, they can be imported directly from the geometry package instead of from their specific modules.
- circle = Circle(5): An instance of the Circle class is created and used to perform calculations.
- dist = distance(point_a, point_b): The distance function is called directly to calculate the distance between two points.
This demonstrates how a well-structured package provides a clear and organized API, allowing developers to easily find and use the desired functionality.
Advanced Topics
Generators and Iterators
graph TD
A[Generators] --> B[Memory Efficient]
A --> C[Lazy Evaluation]
A --> D[yield keyword]
A --> E[Generator Expressions]
F[Iterators] --> G[__iter__ method]
F --> H[__next__ method]
F --> I[StopIteration]
J[Benefits] --> K[Memory Conservation]
J --> L[Performance]
J --> M[Infinite Sequences]

flowchart TD
A["Call fibonacci_generator(5)"] --> B[Generator object created]
B -->|"next()"| C[Start execution: a=0, b=1, count=0]
C -->|"yield 0"| D[Pause state saved]
D -->|"next()"| E[Resume: a=1, b=1, count=1]
E -->|"yield 1"| F[Pause state saved]
F -->|"next()"| G[Resume: a=1, b=2, count=2]
G -->|"yield 1"| H[Pause state saved]
H -->|"next()"| I[Resume: a=2, b=3, count=3]
I -->|"yield 2"| J[Pause state saved]
J -->|"next()"| K[Resume: a=3, b=5, count=4]
K -->|"yield 3"| L[Pause state saved]
L -->|"next()"| M[Resume: count=5 -> stop condition]
M --> N[StopIteration raised]

classDiagram
class CountDown {
- int start
+ __init__(start: int)
+ __iter__() Iterator
+ __next__() int
}
class Iterator {
<<interface>>
+ __iter__() Iterator
+ __next__() Any
}
CountDown ..|> Iterator

# Generator functions
def fibonacci_generator(n):
"""Generate Fibonacci sequence using generator."""
a, b = 0, 1
count = 0
while count < n:
yield a
a, b = b, a + b
count += 1
# Using generator
print("Fibonacci sequence using generator:")
for num in fibonacci_generator(10):
print(num, end=" ")
print()
# Generator expressions
squares_gen = (x**2 for x in range(1, 6))
print("Squares generator:", list(squares_gen))
# Infinite generator
def infinite_counter():
"""Generate infinite sequence of numbers."""
num = 0
while True:
yield num
num += 1
counter = infinite_counter()
print("First 5 numbers from infinite counter:")
for _ in range(5):
print(next(counter), end=" ")
print()
# Custom iterator
class CountDown:
"""Custom iterator for countdown."""
def __init__(self, start):
self.start = start
def __iter__(self):
return self
def __next__(self):
if self.start <= 0:
raise StopIteration
self.start -= 1
return self.start + 1
print("Countdown from 5:")
for num in CountDown(5):
print(num, end=" ")
print()
# Generator for file processing (memory efficient)
def read_large_file(file_path):
"""Generator to read large files line by line."""
try:
with open(file_path, 'r') as file:
for line in file:
yield line.strip()
except FileNotFoundError:
print(f"File {file_path} not found")
return
# Create a sample file for demonstration
with open("sample_data.txt", "w") as f:
for i in range(1000):
f.write(f"Line {i + 1}: Some data here\n")
# Process large file efficiently
line_count = 0
for line in read_large_file("sample_data.txt"):
line_count += 1
if line_count <= 5: # Show first 5 lines
print(line)
print(f"Total lines processed: {line_count}")PythonGenerator functions and the yield keyword
- Generator functions: These are a special type of function that can pause their execution and resume later. Unlike normal functions that return a single value and terminate, generator functions use the yield keyword to produce a sequence of values over time.
- fibonacci_generator(n): This function is a generator.
  - yield a: Instead of return, yield is used to produce the next value in the Fibonacci sequence.
  - State preservation: The function's local variables (a, b, count) and state are preserved between calls. When the for loop requests the next value, the generator resumes from where it left off.
- Memory efficiency: Generators are memory-efficient because they produce one item at a time, so they don't need to store the entire sequence in memory. This is especially useful for processing large datasets or infinite sequences.
Generator expressions
- squares_gen = (x**2 for x in range(1, 6)): This is a concise syntax for creating a generator object, similar to a list comprehension but with parentheses instead of square brackets.
- Lazy evaluation: It does not immediately generate all the squares. It's a generator object that produces the squares only when requested.
- list(squares_gen): Converts the generator into a list, forcing all the values to be generated at once. The memory difference is sketched below.
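A rough illustration of the memory difference between a list comprehension and the equivalent generator expression (exact numbers vary by platform):
# Rough comparison: the list stores every value up front, the generator does not.
import sys

squares_list = [x**2 for x in range(1_000_000)]
squares_gen = (x**2 for x in range(1_000_000))

print(sys.getsizeof(squares_list))  # several megabytes
print(sys.getsizeof(squares_gen))   # a small, constant-size object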
Infinite generators
- infinite_counter(): This generator uses a while True loop to produce an infinite sequence of numbers.
- next(counter): The next() function is used to manually advance the generator and get the next value.
- This demonstrates that generators can handle sequences of arbitrary length without running out of memory; a convenient way to take a bounded slice of such a generator is sketched below.
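itertools.islice is the standard-library way to take a bounded slice from an otherwise infinite generator without materializing the whole sequence. A minimal sketch using the same counter:
# Taking the first five values from an infinite generator with itertools.islice.
from itertools import islice

def infinite_counter():
    num = 0
    while True:
        yield num
        num += 1

print(list(islice(infinite_counter(), 5)))  # [0, 1, 2, 3, 4]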
Custom iterators
- class CountDown:: This class demonstrates how to create a custom iterable object from scratch by implementing the __iter__ and __next__ special methods.
- __iter__(self): This method returns the iterator object itself (self).
- __next__(self): This method defines the iteration logic. It checks for a termination condition (self.start <= 0). If met, it raises a StopIteration exception, which signals to the for loop that the iteration is complete.
Generator for file processing
- read_large_file(file_path): This generator provides a memory-efficient way to read large files.
- How it works:
  - The with open(...) statement ensures proper file handling.
  - for line in file:: This directly iterates over the file object, which itself is an efficient iterator that reads one line at a time.
  - yield line.strip(): For each line read, the generator yields the stripped line.
- Benefit: If the file were extremely large (many gigabytes), reading it line by line with this generator would consume a minimal amount of memory, unlike file.readlines(), which would attempt to load the entire file into a list in memory.
- Usage: The example shows how to process the file line by line without storing the entire file content, printing only the first few lines and then counting the rest. This illustrates the memory-saving advantage of using a generator.
Decorators
flowchart TD
A["Call slow_function()"] --> B["CallCounter wrapper (__call__)"]
B --> C[Increase call_count]
C --> D[Execute original slow_function]
D --> E[Return result]
E --> F[timer_decorator wrapper]
F --> G[Start timer]
G --> H[Execute wrapped function result]
H --> I[End timer]
I --> J[Print execution time]
J --> K[Return final result]

classDiagram
class CallCounter {
- func
- call_count
+ __init__(func)
+ __call__(*args, **kwargs)
}
class timer_decorator {
<<function decorator>>
+ wrapper(func)
}
class repeat {
<<parameterized decorator>>
+ __call__(func)
}
class slow_function {
<<decorated>>
+ slow_function()
}
CallCounter --> slow_function : wraps
timer_decorator --> slow_function : wraps
repeat --> greet : wraps
class Temperature {
- float _celsius
+ __init__(celsius=0)
+ celsius : float
+ fahrenheit : float
+ kelvin : float
}
Temperature : + celsius (getter/setter)
Temperature : + fahrenheit (getter/setter)
Temperature : + kelvin (getter)

import time
import functools
# Basic decorator
def timer_decorator(func):
"""Decorator to measure function execution time."""
@functools.wraps(func)
def wrapper(*args, **kwargs):
start_time = time.time()
result = func(*args, **kwargs)
end_time = time.time()
print(f"{func.__name__} took {end_time - start_time:.4f} seconds")
return result
return wrapper
# Decorator with parameters
def repeat(times):
"""Decorator to repeat function execution."""
def decorator(func):
@functools.wraps(func)
def wrapper(*args, **kwargs):
results = []
for _ in range(times):
result = func(*args, **kwargs)
results.append(result)
return results
return wrapper
return decorator
# Class-based decorator
class CallCounter:
"""Decorator to count function calls."""
def __init__(self, func):
self.func = func
self.call_count = 0
functools.update_wrapper(self, func)
def __call__(self, *args, **kwargs):
self.call_count += 1
print(f"{self.func.__name__} has been called {self.call_count} times")
return self.func(*args, **kwargs)
# Using decorators
@timer_decorator
@CallCounter
def slow_function():
"""A function that takes some time."""
time.sleep(0.1)
return "Task completed"
@repeat(3)
def greet(name):
"""Greet a person."""
return f"Hello, {name}!"
# Test decorated functions
result = slow_function()
print(f"Result: {result}")
slow_function() # Call again to see call count
greetings = greet("Alice")
print(f"Greetings: {greetings}")
# Property decorators for classes
class Temperature:
"""Temperature class with property decorators."""
def __init__(self, celsius=0):
self._celsius = celsius
@property
def celsius(self):
"""Get temperature in Celsius."""
return self._celsius
@celsius.setter
def celsius(self, value):
"""Set temperature in Celsius."""
if value < -273.15:
raise ValueError("Temperature cannot be below absolute zero")
self._celsius = value
@property
def fahrenheit(self):
"""Get temperature in Fahrenheit."""
return (self._celsius * 9/5) + 32
@fahrenheit.setter
def fahrenheit(self, value):
"""Set temperature in Fahrenheit."""
self.celsius = (value - 32) * 5/9
@property
def kelvin(self):
"""Get temperature in Kelvin."""
return self._celsius + 273.15
# Using property decorators
temp = Temperature(25)
print(f"Temperature: {temp.celsius}°C, {temp.fahrenheit}°F, {temp.kelvin}K")
temp.fahrenheit = 86
print(f"After setting to 86°F: {temp.celsius}°C")PythonContext Managers
flowchart TD
A[Enter with timer_context] --> B[Record start_time]
B --> C[Execute wrapped block]
C --> D[After block finishes]
D --> E[Record end_time]
E --> F[Print execution time]
F --> G[Exit context]
classDiagram
class DatabaseConnection {
- db_path : str
- connection : sqlite3.Connection
+ __init__(db_path: str)
+ __enter__() sqlite3.Connection
+ __exit__(exc_type, exc_val, exc_tb) bool
}
flowchart TD
A[Enter with DatabaseConnection] --> B[Open sqlite3 connection]
B --> C[Return connection object]
C --> D[Execute DB operations inside with-block]
D --> E{Error occurred?}
E -- No --> F[Commit transaction]
E -- Yes --> G[Rollback transaction]
F --> H[Close connection]
G --> H
H --> I[Exit context]flowchart TD
A[Enter with file_backup] --> B{File exists?}
B -- Yes --> C[Create backup file]
B -- No --> D[Skip backup]
C --> E[Yield original file to block]
D --> E
E --> F{Exception raised?}
F -- Yes --> G[Restore from backup]
G --> H[Re-raise exception]
F -- No --> I[Continue execution]
H --> J[Remove backup file]
I --> J[Remove backup file]
J --> K[Exit context]
import sqlite3
import tempfile
import os
# Simple context manager using contextlib
from contextlib import contextmanager
@contextmanager
def timer_context():
"""Context manager to measure execution time."""
start_time = time.time()
print("Starting timer...")
try:
yield
finally:
end_time = time.time()
print(f"Execution took {end_time - start_time:.4f} seconds")
# Database context manager
class DatabaseConnection:
"""Context manager for database connections."""
def __init__(self, db_path):
self.db_path = db_path
self.connection = None
def __enter__(self):
print(f"Opening database connection to {self.db_path}")
self.connection = sqlite3.connect(self.db_path)
return self.connection
def __exit__(self, exc_type, exc_val, exc_tb):
if self.connection:
if exc_type is None:
print("Committing transaction...")
self.connection.commit()
else:
print("Rolling back transaction due to error...")
self.connection.rollback()
self.connection.close()
print("Database connection closed")
# File backup context manager
@contextmanager
def file_backup(file_path):
"""Context manager to backup and restore files."""
backup_path = f"{file_path}.backup"
# Create backup if original exists
if os.path.exists(file_path):
with open(file_path, 'r') as original:
with open(backup_path, 'w') as backup:
backup.write(original.read())
print(f"Backup created: {backup_path}")
try:
yield file_path
except Exception as e:
# Restore backup on error
if os.path.exists(backup_path):
with open(backup_path, 'r') as backup:
with open(file_path, 'w') as original:
original.write(backup.read())
print(f"File restored from backup due to error: {e}")
raise
finally:
# Clean up backup
if os.path.exists(backup_path):
os.remove(backup_path)
print("Backup file removed")
# Using context managers
print("Testing timer context manager:")
with timer_context():
time.sleep(0.5)
result = sum(range(1000000))
print(f"\nResult: {result}")
# Database operations with context manager
with DatabaseConnection(":memory:") as conn:
cursor = conn.cursor()
cursor.execute("""
CREATE TABLE users (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
email TEXT UNIQUE
)
""")
cursor.execute("INSERT INTO users (name, email) VALUES (?, ?)",
("Alice", "alice@example.com"))
cursor.execute("INSERT INTO users (name, email) VALUES (?, ?)",
("Bob", "bob@example.com"))
cursor.execute("SELECT * FROM users")
users = cursor.fetchall()
print("Users in database:")
for user in users:
print(f" {user}")
# File backup example
test_file = "test_file.txt"
with open(test_file, 'w') as f:
f.write("Original content")
print(f"\nTesting file backup context manager:")
try:
with file_backup(test_file) as file_path:
with open(file_path, 'w') as f:
f.write("Modified content")
# Simulate an error
# raise Exception("Something went wrong!")
print("File modified successfully")
except Exception as e:
print(f"Error occurred: {e}")
# Check final file content
with open(test_file, 'r') as f:
print(f"Final file content: {f.read()}")PythonThis code provides practical examples of context managers, a Python feature that simplifies resource management by ensuring that setup and teardown operations are handled automatically, even when errors occur.
Context managers with @contextmanager
timer_context(): This is a function-based context manager created using the @contextmanager decorator from the contextlib module.
- Setup: The code before the yield statement is executed when the with block is entered. In this case, it records the start time and prints a message.
- Teardown: The code in the finally block is executed when the with block is exited, regardless of whether an exception occurred. This ensures the timer always stops and reports the execution duration.
- Usage: The with timer_context(): block provides a clean, elegant way to time any block of code.
Class-based context managers
DatabaseConnection: This class-based context manager handles SQLite database connections, providing a robust way to manage database transactions and prevent resource leaks.
- __init__(self, db_path): Initializes the instance with the database path. The :memory: path creates a temporary, in-memory database that is destroyed when the connection is closed.
- __enter__(self): Establishes the database connection when the with block is entered. The connection object is returned and assigned to the as conn variable.
- __exit__(self, exc_type, exc_val, exc_tb): Manages the connection cleanup when the block is exited.
  - It checks exc_type to determine if an exception occurred.
  - If no exception occurred (exc_type is None), it calls conn.commit() to save the changes.
  - If an exception occurred, it calls conn.rollback() to undo any changes, ensuring data integrity.
  - It always calls conn.close() to ensure the resource is released.
File backup context manager
file_backup(file_path): This is another function-based context manager that creates a backup of a file before an operation and restores it if an error occurs.
- Backup: In the setup phase, it creates a backup of the original file.
- Restore on error: If an exception propagates out of the with block, the except clause restores the file from the backup before re-raising the exception.
- Cleanup: In the finally block, it always removes the temporary backup file, keeping the file system clean.
Usage and demonstration
- Timer usage: A timer_context is used to time a simple loop, demonstrating its ability to measure execution time.
- Database usage: A DatabaseConnection is used with an in-memory SQLite database to perform database operations. The with statement ensures the connection is automatically handled, and the transaction is committed correctly upon successful completion.
- File backup usage: The file_backup context manager demonstrates how to protect files from modification errors (see the error-path sketch below).
  - The with file_backup(...) block handles the modification, and the try...except block outside the context manager catches any errors, such as the commented-out raise Exception.
  - The final open(...) call verifies the file’s content after the context manager has done its work.
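Because the demonstration above leaves the error path commented out, here is a minimal sketch of what triggering it could look like. It reuses file_backup and test_file from the code above; the RuntimeError is purely illustrative.
# Illustrative error path: the exception inside the with-block causes
# file_backup to restore whatever the file contained before the block ran.
try:
    with file_backup(test_file) as file_path:
        with open(file_path, 'w') as f:
            f.write("Half-finished changes")
        raise RuntimeError("Simulated failure")  # hypothetical error
except RuntimeError as e:
    print(f"Error occurred: {e}")

with open(test_file, 'r') as f:
    # Prints the pre-block content, because the backup was restored
    print(f"Content after rollback: {f.read()}")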
Web Development
Web Development Overview
graph TD
A[Web Development with Python] --> B[Frameworks]
A --> C[Components]
A --> D[Deployment]
B --> B1[Flask - Lightweight]
B --> B2[Django - Full-featured]
B --> B3[FastAPI - Modern & Fast]
C --> C1[Templates]
C --> C2[Forms]
C --> C3[Database]
C --> C4[Authentication]
C --> C5[APIs]
D --> D1[Heroku]
D --> D2[AWS]
D --> D3[Docker]
flowchart TD
A["Start Application"] --> B["init_fastapi_db()"]
B --> C["uvicorn.run(app)"]
C --> D[User sends request]
D --> E{Route?}
E -->|"GET '/'"| F["read_root()"]
E -->|"GET '/api/posts'"| G["get_posts()"]
E -->|"GET '/api/posts/{id}'"| H["get_post()"]
E -->|"POST '/api/posts'"| I["create_post()"]
E -->|"DELETE '/api/posts/{id}'"| J["delete_post()"]
G --> K[Fetch all posts from DB]
H --> L[Fetch single post from DB]
I --> M[Insert post into DB]
J --> N[Delete post from DB]
K --> O[Return JSON List of Posts]
L --> O
M --> O[Return success + post_id]
N --> O[Return success message]
F --> P[Return HTML page]
O --> Q[Response to Client]
P --> Q
Class Diagram (conceptual model for DB and API)
classDiagram
class Post {
+ id: int
+ title: str
+ content: str
+ author: str
+ created_at: datetime
}
class PostCreate {
+ title: str
+ content: str
+ author: str
}
class Database {
+ init_fastapi_db()
+ get_db()
}
class API {
+ read_root()
+ get_posts()
+ get_post(post_id: int)
+ create_post(post: PostCreate)
+ delete_post(post_id: int)
}
Database --> Post
API --> Post
API --> PostCreate
# Install Flask: pip install flask
from flask import Flask, render_template, request, jsonify, redirect, url_for
import sqlite3
import os
app = Flask(__name__)
app.secret_key = 'your-secret-key-here'
# Database setup
def init_db():
"""Initialize the database."""
conn = sqlite3.connect('blog.db')
cursor = conn.cursor()
cursor.execute('''
CREATE TABLE IF NOT EXISTS posts (
id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT NOT NULL,
content TEXT NOT NULL,
author TEXT NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
''')
conn.commit()
conn.close()
def get_db_connection():
"""Get database connection."""
conn = sqlite3.connect('blog.db')
conn.row_factory = sqlite3.Row
try:
yield conn
finally:
conn.close()
def init_fastapi_db():
"""Initialize FastAPI database."""
conn = sqlite3.connect('fastapi_blog.db')
cursor = conn.cursor()
cursor.execute('''
CREATE TABLE IF NOT EXISTS posts (
id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT NOT NULL,
content TEXT NOT NULL,
author TEXT NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
''')
conn.commit()
conn.close()
# API Endpoints
@app.get("/", response_class=HTMLResponse)
async def read_root():
"""Serve the main page."""
html_content = """
<!DOCTYPE html>
<html>
<head>
<title>FastAPI Blog</title>
<style>
body { font-family: Arial, sans-serif; margin: 40px; }
.post { border: 1px solid #ddd; padding: 20px; margin: 10px 0; }
.nav { margin-bottom: 20px; }
.nav a { margin-right: 10px; color: #007bff; text-decoration: none; }
</style>
</head>
<body>
<h1>FastAPI Blog</h1>
<div class="nav">
<a href="/docs">API Documentation</a>
<a href="/api/posts">View Posts JSON</a>
</div>
<p>Use the API endpoints to manage blog posts. Check the documentation at <a href="/docs">/docs</a></p>
</body>
</html>
"""
return HTMLResponse(content=html_content)
@app.get("/api/posts", response_model=List[Post])
async def get_posts(db: sqlite3.Connection = Depends(get_db)):
"""Get all blog posts."""
cursor = db.cursor()
cursor.execute('SELECT * FROM posts ORDER BY created_at DESC')
posts = cursor.fetchall()
return [Post(
id=post[0],
title=post[1],
content=post[2],
author=post[3],
created_at=post[4]
) for post in posts]
@app.get("/api/posts/{post_id}", response_model=Post)
async def get_post(post_id: int, db: sqlite3.Connection = Depends(get_db)):
"""Get a specific post by ID."""
cursor = db.cursor()
cursor.execute('SELECT * FROM posts WHERE id = ?', (post_id,))
post = cursor.fetchone()
if not post:
raise HTTPException(status_code=404, detail="Post not found")
return Post(
id=post[0],
title=post[1],
content=post[2],
author=post[3],
created_at=post[4]
)
@app.post("/api/posts", response_model=dict)
async def create_post(post: PostCreate, db: sqlite3.Connection = Depends(get_db)):
"""Create a new blog post."""
cursor = db.cursor()
cursor.execute('INSERT INTO posts (title, content, author) VALUES (?, ?, ?)',
(post.title, post.content, post.author))
post_id = cursor.lastrowid
db.commit()
return {"message": "Post created successfully", "post_id": post_id}
@app.delete("/api/posts/{post_id}")
async def delete_post(post_id: int, db: sqlite3.Connection = Depends(get_db)):
"""Delete a blog post."""
cursor = db.cursor()
cursor.execute('DELETE FROM posts WHERE id = ?', (post_id,))
if cursor.rowcount == 0:
raise HTTPException(status_code=404, detail="Post not found")
db.commit()
return {"message": "Post deleted successfully"}
if __name__ == "__main__":
init_fastapi_db()
uvicorn.run(app, host="0.0.0.0", port=8000)
This Python script is a web application that manages blog posts using a SQLite database. The code initially imports modules for Flask but is ultimately configured to run using FastAPI, a more modern and high-performance framework for building APIs.
Here is a breakdown of the code and the concepts it demonstrates.
Web framework setup
The script uses two popular Python web frameworks, though only FastAPI is utilized in the final execution.
- Flask imports: The line from flask import Flask, render_template, request, jsonify, redirect, url_for indicates an initial intent to build a traditional web application using Flask, but the route decorators that follow belong to FastAPI.
- App initialization: app = Flask(__name__) is the typical way to create a Flask app instance. In this code it appears to be a leftover from an earlier version, because the subsequent @app.get decorators, HTMLResponse, Depends, HTTPException, and the Post/PostCreate response models are FastAPI (and Pydantic) constructs whose imports are missing, and the final execution command uvicorn.run(app, ...) is how a FastAPI application is started. A corrected version would remove the Flask imports and app = Flask(__name__), explicitly import FastAPI to create the app instance, and define the missing models (see the corrected sketch below).
- uvicorn.run(): This is the command used to run a FastAPI application. Uvicorn is an ASGI (Asynchronous Server Gateway Interface) server that runs FastAPI and other asynchronous frameworks.
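A minimal sketch of that corrected setup, under the assumption that the endpoint code stays as shown above (the model fields mirror the posts table; exact names are illustrative):
# Corrected framework setup (sketch): FastAPI instead of Flask
from typing import List, Optional

import uvicorn
from fastapi import FastAPI, Depends, HTTPException  # used by the endpoints above
from fastapi.responses import HTMLResponse
from pydantic import BaseModel

app = FastAPI(title="FastAPI Blog")

class PostCreate(BaseModel):
    """Request body for creating a post."""
    title: str
    content: str
    author: str

class Post(PostCreate):
    """Full post as returned from the database."""
    id: int
    created_at: Optional[str] = None  # SQLite stores the timestamp as text
With these definitions in place, the @app.get/@app.post/@app.delete endpoints shown earlier can validate request bodies via PostCreate and serialize responses via Post.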
Database setup and connection
The script uses the built-in sqlite3 module to interact with a local database file.
- init_db(): This function is designed for a Flask app, creating a blog.db file with a posts table, but it is never called in the final executable block.
- get_db_connection(): This generator-style helper is written for a Flask or traditional web app. It sets row_factory = sqlite3.Row so that rows provide name-based access to columns, like a dictionary. It is also not used by the FastAPI endpoints.
- init_fastapi_db(): This function, called at the end of the script, correctly initializes the database for the FastAPI application by creating a fastapi_blog.db file and a posts table within it.
- Depends(get_db): This FastAPI dependency-injection pattern is intended to provide a database connection to the endpoints. The get_db dependency itself is not defined in the provided snippet, but it serves the same purpose as get_db_connection(): managing the database connection lifecycle for each request (one possible definition is sketched below).
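Since get_db is referenced but not shown, here is one plausible definition, sketched under the assumption that it should mirror get_db_connection() while pointing at the FastAPI database file:
import sqlite3

def get_db():
    """FastAPI dependency (sketch): open a connection per request and always close it."""
    conn = sqlite3.connect('fastapi_blog.db')
    conn.row_factory = sqlite3.Row  # name-based column access, as in get_db_connection()
    try:
        yield conn
    finally:
        conn.close()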
API endpoints (using FastAPI)
The core of the application consists of RESTful API endpoints for managing blog posts, defined using FastAPI’s decorators.
@app.get("/"): Serves a root HTML page with links to the API documentation and the JSON list of posts. Theasynckeyword indicates that this function is asynchronous.@app.get("/api/posts"): Retrieves and returns all blog posts from the database. It usesresponse_model=List[Post]for automatic data validation and serialization.@app.get("/api/posts/{post_id}"): Retrieves a single blog post based on itsID. It includes error handling to return a404 Not Foundif the post does not exist.@app.post("/api/posts"): Creates a new blog post by inserting data into the database. It uses a Pydantic model (PostCreate) for automatic request body validation.@app.delete("/api/posts/{post_id}"): Deletes a blog post by itsID. It returns a404 Not Founderror if the post ID does not exist in the database.
Execution block
The if __name__ == "__main__": block ensures that the following code only runs when the script is executed directly.
- init_fastapi_db(): Initializes the database for the FastAPI app.
- uvicorn.run(app, host="0.0.0.0", port=8000): Starts the ASGI server (Uvicorn) that hosts the FastAPI application, making it available at http://127.0.0.1:8000 or http://localhost:8000. The 0.0.0.0 host is used to make the application accessible externally, not just locally.
REST API Client
flowchart TD
A[Start Client] --> B["Initialize BlogAPIClient(base_url)"]
B --> C{Action?}
C -->|create_post| D[Prepare JSON payload]
D --> E[POST /api/posts]
E --> F[Server Response: Post Created]
C -->|get_posts| G[GET /api/posts]
G --> H[Server Response: List of Posts]
C -->|"get_post(post_id)"| I["GET /api/posts/{id}"]
I --> J[Server Response: Single Post]
C -->|"delete_post(post_id)"| K["DELETE /api/posts/{id}"]
K --> L[Server Response: Delete Confirmation]
F --> M[Return JSON to Client]
H --> M
J --> M
L --> M
M --> N[Display Results]
Class Diagram – BlogAPIClient
classDiagram
class BlogAPIClient {
- base_url: str
+ __init__(base_url="http://localhost:8000")
+ get_posts() dict
+ get_post(post_id: int) dict
+ create_post(title: str, content: str, author: str) dict
+ delete_post(post_id: int) dict
}
class requests {
+ get(url, **kwargs)
+ post(url, **kwargs)
+ delete(url, **kwargs)
}
BlogAPIClient --> requests : "uses"
# API client example using requests library
import requests
import json
class BlogAPIClient:
"""Client for interacting with the blog API."""
def __init__(self, base_url="http://localhost:8000"):
self.base_url = base_url
def get_posts(self):
"""Get all posts."""
response = requests.get(f"{self.base_url}/api/posts")
response.raise_for_status()
return response.json()
def get_post(self, post_id):
"""Get a specific post."""
response = requests.get(f"{self.base_url}/api/posts/{post_id}")
response.raise_for_status()
return response.json()
def create_post(self, title, content, author):
"""Create a new post."""
data = {
"title": title,
"content": content,
"author": author
}
response = requests.post(
f"{self.base_url}/api/posts",
json=data,
headers={"Content-Type": "application/json"}
)
response.raise_for_status()
return response.json()
def delete_post(self, post_id):
"""Delete a post."""
response = requests.delete(f"{self.base_url}/api/posts/{post_id}")
response.raise_for_status()
return response.json()
# Example usage
if __name__ == "__main__":
client = BlogAPIClient()
# Create a new post
new_post = client.create_post(
title="My First API Post",
content="This post was created using the API client!",
author="API User"
)
print(f"Created post: {new_post}")
# Get all posts
posts = client.get_posts()
print(f"Total posts: {len(posts)}")
# Get specific post
if posts:
first_post = client.get_post(posts[0]['id'])
print(f"First post: {first_post['title']}")PythonThis Python script provides a robust and reusable client for interacting with a blog API. It uses the popular requests library to send HTTP requests, which is the standard choice for performing API calls in Python due to its ease of use and comprehensive features.
BlogAPIClient class
The client is implemented as a class, which encapsulates all the logic for interacting with the API. This is a best practice that promotes modularity and maintainability.
- __init__(self, base_url): The constructor initializes the client with the base_url of the API. This makes the client easily configurable for different environments (e.g., development, staging, production).
- Method-per-endpoint design: Each public method (get_posts, get_post, create_post, delete_post) corresponds to a specific API endpoint. This creates a clear and intuitive interface for the user.
- requests.get(...): Sends HTTP GET requests for retrieving data. GET requests do not modify resources on the server.
- requests.post(...): Sends HTTP POST requests for creating a new resource. The json=data parameter lets the requests library handle JSON serialization automatically and set the appropriate Content-Type header to application/json.
- requests.delete(...): Sends HTTP DELETE requests for deleting the resource identified by its URL.
- response.raise_for_status(): A crucial part of robust error handling.
  - It checks the response status code and automatically raises an HTTPError if the response indicates an error (a status code in the 4xx or 5xx range).
  - This prevents “silent failures” and ensures that the program does not proceed with an invalid or incomplete response.
- response.json(): Parses the JSON content of the response into a Python dictionary or list. It is the standard way to handle JSON responses from APIs.
Example usage (if __name__ == "__main__":)
This block demonstrates how to use the BlogAPIClient to perform a sequence of actions.
- Client instantiation: An instance of BlogAPIClient is created.
- create_post(): A new post is created by calling the create_post method. The method sends the data to the API, and the response is printed to the console.
- get_posts(): All posts are retrieved from the API, and the total count is displayed. This shows that the newly created post is now included.
- get_post(): A specific post is retrieved by ID. The posts list from the previous step is used to get the ID of the first post. The title of this post is then printed, demonstrating that the data was retrieved correctly.
Data Science and Analytics
Data Science Ecosystem
graph TD
A[Data Science with Python] --> B[Data Collection]
A --> C[Data Processing]
A --> D[Analysis & Visualization]
A --> E[Machine Learning]
B --> B1[Web Scraping]
B --> B2[APIs]
B --> B3[Databases]
B --> B4[Files - CSV/JSON/Excel]
C --> C1[Pandas]
C --> C2[NumPy]
C --> C3[Data Cleaning]
D --> D1[Matplotlib]
D --> D2[Seaborn]
D --> D3[Plotly]
D --> D4[Statistical Analysis]
E --> E1[Scikit-learn]
E --> E2[TensorFlow]
E --> E3[PyTorch]
NumPy Fundamentals
flowchart TD
A[Start Program] --> B["Import NumPy, Pandas, Matplotlib, Seaborn"]
B --> C["Create Arrays arr1...arr6"]
C --> D["Print Array Properties: shape, dtype"]
D --> E["Array Operations: Addition, Multiplication, Dot Product, Sqrt"]
E --> F["Generate Random Data (Normal Dist.)"]
F --> G["Compute Statistics: Mean, Median, Std, Min, Max"]
G --> H[Define Matrix]
H --> I[Indexing & Slicing]
I --> J["Boolean Indexing → Extract Even Numbers"]
J --> K[End Program]
Class Diagram – NumPy Objects in Use
classDiagram
class ndarray {
+ shape: tuple
+ dtype: data-type
+ ndim: int
+ size: int
+ itemsize: int
+ T
+ reshape()
+ flatten()
+ sum()
+ mean()
+ std()
+ min()
+ max()
+ dot()
}
class numpy {
+ array(object)
+ zeros(shape)
+ ones(shape)
+ arange(start, stop, step)
+ linspace(start, stop, num)
+ random.normal(mean, std, size)
+ sqrt()
+ dot()
}
numpy --> ndarray : "creates"
# Install required packages: pip install numpy pandas matplotlib seaborn
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# NumPy basics
print("=== NumPy Fundamentals ===")
# Creating arrays
arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([[1, 2, 3], [4, 5, 6]])
arr3 = np.zeros((3, 4))
arr4 = np.ones((2, 3))
arr5 = np.arange(0, 10, 2) # Start, stop, step
arr6 = np.linspace(0, 1, 5) # Start, stop, number of points
print(f"1D Array: {arr1}")
print(f"2D Array:\n{arr2}")
print(f"Shape of arr2: {arr2.shape}")
print(f"Data type: {arr1.dtype}")
# Array operations
print("\n=== Array Operations ===")
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])
print(f"Addition: {a + b}")
print(f"Multiplication: {a * b}")
print(f"Dot product: {np.dot(a, b)}")
print(f"Square root: {np.sqrt(a)}")
# Statistical operations
data = np.random.normal(100, 15, 1000) # Mean=100, Std=15, 1000 samples
print(f"\nStatistical Operations:")
print(f"Mean: {np.mean(data):.2f}")
print(f"Median: {np.median(data):.2f}")
print(f"Standard deviation: {np.std(data):.2f}")
print(f"Min: {np.min(data):.2f}, Max: {np.max(data):.2f}")
# Array indexing and slicing
matrix = np.array([[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12]])
print(f"\nMatrix:\n{matrix}")
print(f"Element at [1,2]: {matrix[1, 2]}")
print(f"First row: {matrix[0, :]}")
print(f"Second column: {matrix[:, 1]}")
print(f"Submatrix:\n{matrix[1:3, 1:3]}")
# Boolean indexing
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
even_numbers = arr[arr % 2 == 0]
print(f"Even numbers: {even_numbers}")PythonThis script provides a concise and comprehensive introduction to NumPy, the fundamental library for numerical computing in Python. It covers the key concepts of creating, manipulating, and operating on arrays, highlighting the power and efficiency of this library for scientific and data analysis tasks.
NumPy fundamentals
- np.array(): The primary function for creating a NumPy array from a Python list or tuple. Arrays are homogeneous, meaning all elements must be of the same data type, which allows for fast and efficient computations.
- Array creation functions: NumPy provides convenient functions to create arrays with pre-filled values or in a specific pattern.
  - np.zeros((3, 4)): Creates a 3×4 array with all elements initialized to 0.
  - np.ones((2, 3)): Creates a 2×3 array with all elements initialized to 1.
  - np.arange(0, 10, 2): Similar to Python’s range(), this function creates an array with a fixed step size.
  - np.linspace(0, 1, 5): Creates an array with a specified number of evenly spaced points between a start and stop value.
- .shape: An attribute that returns a tuple indicating the size of the array along each dimension.
- .dtype: An attribute that describes the data type of the elements in the array.
Array operations
- Element-wise operations: When standard arithmetic operators (+, *, etc.) are used on two arrays, the operation is applied element by element, producing a new array.
- np.dot(a, b): Calculates the dot product of two arrays, which is a common linear algebra operation.
- np.sqrt(a): A universal function (ufunc) that applies a mathematical function to each element of an array.
Statistical operations
- np.random.normal(100, 15, 1000): Generates a sample of random numbers from a normal distribution with a specified mean (100), standard deviation (15), and size (1000).
- np.mean(), np.median(), np.std(), np.min(), np.max(): NumPy includes a comprehensive set of statistical functions for performing calculations on arrays.
Array indexing and slicing
- Accessing elements: Individual elements can be accessed using zero-based indexing, similar to Python lists, but for multi-dimensional arrays you specify indices for each dimension, separated by a comma (e.g., matrix[1, 2]).
- Slicing: Subsets of an array can be extracted using slicing syntax (start:stop:step). A colon alone represents all elements along that axis.
  - matrix[0, :]: Selects all columns of the first row.
  - matrix[:, 1]: Selects all rows of the second column.
  - matrix[1:3, 1:3]: Selects a submatrix from rows 1 to 2 and columns 1 to 2.
- Boolean indexing: This powerful feature allows you to filter elements based on a condition, using a boolean array (or “mask”) to select only the elements that satisfy the condition.
  - arr[arr % 2 == 0]: The expression arr % 2 == 0 creates a boolean array ([False, True, False, True, ...]). This is then used to index the original array arr, returning only the elements where the mask is True.
Pandas Data Manipulation
flowchart TD
A[Start Program] --> B[Create Data Dictionary]
B --> C["Convert to DataFrame (df)"]
C --> D["Display Info: Shape, Columns, Dtypes"]
D --> E[Data Selection & Filtering]
E --> F["Add New Columns: Salary_USD, Salary_EUR, Experience_Level"]
F --> G["GroupBy Department → Aggregate Stats"]
G --> H[Generate Large Dataset sales_df]
H --> I[Add Month & Quarter Columns]
I --> J["Pivot Table: Avg Sales by Product & Region"]
J --> K["GroupBy Month → Monthly Sales"]
K --> L[End Program]
classDiagram
class pandas {
+ DataFrame(data)
+ Series(data)
+ date_range(start, periods)
}
class DataFrame {
+ shape: tuple
+ columns: Index
+ dtypes: Series
+ head(n)
+ groupby(keys)
+ agg(func)
+ pivot_table(values, index, columns, aggfunc)
+ apply(func)
}
class Series {
+ values: ndarray
+ dtype
+ head(n)
+ apply(func)
}
pandas --> DataFrame : "creates"
pandas --> Series : "creates"
DataFrame --> Series : "columns are"
DataFrame --> DataFrame : "groupby/agg returns"
print("\n=== Pandas Data Manipulation ===")
# Creating DataFrames
data = {
'Name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
'Age': [25, 30, 35, 28, 32],
'City': ['New York', 'London', 'Tokyo', 'Paris', 'Sydney'],
'Salary': [70000, 80000, 90000, 75000, 85000],
'Department': ['IT', 'Finance', 'IT', 'HR', 'Finance']
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
# Basic DataFrame operations
print(f"\nDataFrame Info:")
print(f"Shape: {df.shape}")
print(f"Columns: {list(df.columns)}")
print(f"Data types:\n{df.dtypes}")
# Data selection and filtering
print("\n=== Data Selection ===")
print(f"Names column:\n{df['Name']}")
print(f"First 3 rows:\n{df.head(3)}")
print(f"IT employees:\n{df[df['Department'] == 'IT']}")
print(f"High earners (>80000):\n{df[df['Salary'] > 80000]}")
# Adding new columns
df['Salary_USD'] = df['Salary']
df['Salary_EUR'] = df['Salary'] * 0.85 # Approximate conversion
df['Experience_Level'] = df['Age'].apply(lambda x: 'Senior' if x >= 30 else 'Junior')
print(f"\nDataFrame with new columns:\n{df}")
# Grouping and aggregation
print("\n=== Grouping and Aggregation ===")
dept_stats = df.groupby('Department').agg({
'Salary': ['mean', 'min', 'max'],
'Age': 'mean'
}).round(2)
print("Department Statistics:")
print(dept_stats)
# Data manipulation examples
print("\n=== Advanced Data Manipulation ===")
# Creating a larger dataset for demonstration
np.random.seed(42)
large_data = {
'Date': pd.date_range('2024-01-01', periods=100),
'Sales': np.random.normal(1000, 200, 100),
'Product': np.random.choice(['A', 'B', 'C'], 100),
'Region': np.random.choice(['North', 'South', 'East', 'West'], 100),
'Customer_Count': np.random.poisson(50, 100)
}
sales_df = pd.DataFrame(large_data)
sales_df['Month'] = sales_df['Date'].dt.month
sales_df['Quarter'] = sales_df['Date'].dt.quarter
print("Sales DataFrame (first 10 rows):")
print(sales_df.head(10))
# Pivot tables
print("\nPivot Table - Average Sales by Product and Region:")
pivot_table = sales_df.pivot_table(
values='Sales',
index='Product',
columns='Region',
aggfunc='mean'
).round(2)
print(pivot_table)
# Time series analysis
monthly_sales = sales_df.groupby('Month')['Sales'].sum()
print(f"\nMonthly Sales:\n{monthly_sales}")PythonThis code provides a practical and comprehensive overview of key data manipulation techniques using the Pandas library, a foundational tool for data analysis in Python. It covers creating DataFrames, selecting and filtering data, adding columns, aggregating data, reshaping with pivot tables, and basic time series operations.
Creating and inspecting a DataFrame
- pd.DataFrame(data): Creates a Pandas DataFrame, a 2D labeled data structure resembling a spreadsheet or database table, from a Python dictionary.
- df.shape: Returns a tuple representing the dimensions of the DataFrame ((rows, columns)).
- df.columns: Returns a list of the column names.
- df.dtypes: Returns the data type of each column, which is important for understanding how to perform operations on the data.
Data selection and filtering
Pandas offers powerful and intuitive ways to access and filter data.
- df['Name']: Selects a single column by name. This returns a Pandas Series, which is a 1D labeled array.
- df.head(3): Displays the first three rows of the DataFrame, which is useful for a quick look at the data.
- df[df['Department'] == 'IT']: This is a powerful technique called boolean indexing. The expression df['Department'] == 'IT' returns a boolean Series (True for 'IT' rows, False otherwise), and using it to index the DataFrame filters for only the rows where the condition is True.
- df[df['Salary'] > 80000]: Another example of boolean indexing, this time filtering rows based on a numerical condition.
Adding and manipulating columns
- df['Salary_EUR'] = df['Salary'] * 0.85: Adds a new column, Salary_EUR, by performing a vectorized arithmetic operation on an existing column. Vectorized operations are highly efficient in Pandas.
- df['Age'].apply(lambda x: ...): The .apply() method applies a function (here, a lambda function) to every value in a Series (df['Age']), creating a new Series used for the Experience_Level column.
Grouping and aggregation
The groupby() method is one of Pandas’ most powerful features, enabling a “split-apply-combine” strategy for data analysis.
- df.groupby('Department'): Splits the DataFrame into groups based on the unique values in the Department column.
- .agg(...): Applies one or more aggregation functions to the grouped data. In this example, it calculates the mean, min, and max of Salary and the mean of Age for each department.
- .round(2): Rounds the result to two decimal places for cleaner output.
Advanced data manipulation
The code also shows more complex data operations.
- Creating a larger dataset: A sample dataset is created to demonstrate more realistic data manipulation tasks, including dates and different data distributions using NumPy.
  - pd.date_range(...): Generates a range of dates.
  - .dt.month and .dt.quarter: Pandas provides special .dt accessors for datetime Series, allowing you to extract components like month and quarter.
- pivot_table(...): This method is a powerful tool for reshaping and summarizing data, similar to pivot tables in spreadsheet applications.
  - values='Sales': Specifies the column to aggregate.
  - index='Product': Groups the data by Product for the rows.
  - columns='Region': Groups the data by Region for the columns.
  - aggfunc='mean': Defines the aggregation function to apply.
- Time series analysis: sales_df.groupby('Month')['Sales'].sum() groups the sales data by month and calculates the total sales for each month, demonstrating a simple time series aggregation. Pandas provides extensive tools for working with time series data (a small resampling sketch follows below).
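For instance, a minimal sketch of the same monthly aggregation done through Pandas’ date-based resampling (using the sales_df defined above; the 'ME' frequency alias requires pandas 2.2 or newer, older versions use 'M'):
# Resample the date-indexed sales to month-end totals
monthly_by_date = (
    sales_df.set_index('Date')['Sales']
    .resample('ME')  # use 'M' on pandas < 2.2
    .sum()
)
print(monthly_by_date.head())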
Data Visualization
flowchart TD
A[Start Visualization] --> B[Set Style seaborn-v0_8]
B --> C[Create Subplots 2x2 Matplotlib]
C --> D1[Line Plot: Monthly Sales Trend]
C --> D2[Bar Plot: Avg Sales by Product]
C --> D3[Histogram: Sales Distribution]
C --> D4[Scatter Plot: Sales vs Customer Count]
D4 --> E[Add Colorbar by Month]
D1 & D2 & D3 & E --> F[Save Plot sales_analysis.png]
F --> G[Show Matplotlib Figure]
G --> H[Create Subplots 2x2 Seaborn]
H --> I1[Box Plot: Sales by Region]
H --> I2[Heatmap: Correlation Matrix]
H --> I3[Violin Plot: Sales by Quarter]
H --> I4[Count Plot: Product by Region]
I4 --> J[Save Plot advanced_sales_analysis.png]
J --> K[Show Seaborn Figure]
K --> L[End]
classDiagram
class matplotlib.pyplot {
+style.use()
+subplots()
+show()
+savefig()
+colorbar()
}
class Axes {
+plot(x, y)
+bar(x, y)
+hist(data)
+scatter(x, y)
+set_title(str)
+set_xlabel(str)
+set_ylabel(str)
+grid()
}
class seaborn {
+boxplot(data, x, y, ax)
+heatmap(data, annot, cmap, ax)
+violinplot(data, x, y, ax)
+countplot(data, x, hue, ax)
}
matplotlib.pyplot --> Axes : "creates"
seaborn --> Axes : "draws on"
print("\n=== Data Visualization ===")
# Set up the plotting style
plt.style.use('seaborn-v0_8')
fig, axes = plt.subplots(2, 2, figsize=(15, 12))
# 1. Line plot - Monthly sales trend
axes[0, 0].plot(monthly_sales.index, monthly_sales.values, marker='o', linewidth=2)
axes[0, 0].set_title('Monthly Sales Trend', fontsize=14)
axes[0, 0].set_xlabel('Month')
axes[0, 0].set_ylabel('Total Sales')
axes[0, 0].grid(True, alpha=0.3)
# 2. Bar plot - Sales by product
product_sales = sales_df.groupby('Product')['Sales'].mean()
axes[0, 1].bar(product_sales.index, product_sales.values, color=['#FF6B6B', '#4ECDC4', '#45B7D1'])
axes[0, 1].set_title('Average Sales by Product', fontsize=14)
axes[0, 1].set_xlabel('Product')
axes[0, 1].set_ylabel('Average Sales')
# 3. Histogram - Sales distribution
axes[1, 0].hist(sales_df['Sales'], bins=20, color='skyblue', alpha=0.7, edgecolor='black')
axes[1, 0].set_title('Sales Distribution', fontsize=14)
axes[1, 0].set_xlabel('Sales Amount')
axes[1, 0].set_ylabel('Frequency')
# 4. Scatter plot - Sales vs Customer Count
axes[1, 1].scatter(sales_df['Customer_Count'], sales_df['Sales'],
c=sales_df['Month'], cmap='viridis', alpha=0.6)
axes[1, 1].set_title('Sales vs Customer Count', fontsize=14)
axes[1, 1].set_xlabel('Customer Count')
axes[1, 1].set_ylabel('Sales')
colorbar = plt.colorbar(axes[1, 1].collections[0], ax=axes[1, 1])
colorbar.set_label('Month')
plt.tight_layout()
plt.savefig('sales_analysis.png', dpi=300, bbox_inches='tight')
plt.show()
# Advanced visualization with Seaborn
fig, axes = plt.subplots(2, 2, figsize=(15, 12))
# 1. Box plot - Sales by region
sns.boxplot(data=sales_df, x='Region', y='Sales', ax=axes[0, 0])
axes[0, 0].set_title('Sales Distribution by Region')
# 2. Heatmap - Correlation matrix
correlation_data = sales_df[['Sales', 'Customer_Count', 'Month', 'Quarter']].corr()
sns.heatmap(correlation_data, annot=True, cmap='coolwarm', center=0, ax=axes[0, 1])
axes[0, 1].set_title('Correlation Heatmap')
# 3. Violin plot - Sales by quarter
sns.violinplot(data=sales_df, x='Quarter', y='Sales', ax=axes[1, 0])
axes[1, 0].set_title('Sales Distribution by Quarter')
# 4. Pair plot data preparation and count plot
sns.countplot(data=sales_df, x='Product', hue='Region', ax=axes[1, 1])
axes[1, 1].set_title('Product Count by Region')
axes[1, 1].legend(title='Region', bbox_to_anchor=(1.05, 1), loc='upper left')
plt.tight_layout()
plt.savefig('advanced_sales_analysis.png', dpi=300, bbox_inches='tight')
plt.show()
This code provides a comprehensive demonstration of data visualization using Matplotlib and Seaborn, two powerful Python libraries commonly used for data analysis.
Matplotlib basics: Creating subplots
This section uses Matplotlib to create a grid of plots, allowing for the visualization of multiple aspects of the data in a structured layout.
- plt.style.use('seaborn-v0_8'): Applies a Seaborn-like style to the plots, improving their aesthetic appeal.
- fig, axes = plt.subplots(2, 2, figsize=(15, 12)): Creates a new figure (fig) and a 2×2 grid of subplots (axes). The figsize parameter sets the size of the entire figure in inches.
- Line plot (axes[0, 0]): axes[0, 0].plot(...) creates a line plot on the top-left subplot, which is useful for visualizing trends over a continuous variable like time. marker='o' and linewidth=2 add customizations such as markers for each data point and a thicker line.
- Bar plot (axes[0, 1]): axes[0, 1].bar(...) creates a bar plot on the top-right subplot to compare average sales across different product categories.
- Histogram (axes[1, 0]): axes[1, 0].hist(...) creates a histogram on the bottom-left subplot to show the distribution of the Sales variable, using bins=20 to define the number of intervals for grouping the data.
- Scatter plot (axes[1, 1]): axes[1, 1].scatter(...) creates a scatter plot on the bottom-right subplot to show the relationship between Sales and Customer_Count. c=sales_df['Month'] and cmap='viridis' color the data points based on a third variable (Month) using a colormap, which provides additional information.
- plt.colorbar(...): Adds a color bar to the plot to provide a key for interpreting the color-coded data points.
- plt.tight_layout(): Automatically adjusts the subplot parameters to give a tight layout, preventing labels and titles from overlapping.
- plt.savefig(...): Saves the generated figure to a file. dpi=300 sets the resolution for higher-quality output, and bbox_inches='tight' ensures that all plot elements are included without clipping.
- plt.show(): Displays the figure on the screen.
Advanced visualization with Seaborn
Seaborn is built on top of Matplotlib and offers a high-level, aesthetically pleasing interface for drawing informative statistical graphics.
- fig, axes = plt.subplots(...): A new 2×2 grid of subplots is created for the Seaborn visualizations, demonstrating that Seaborn and Matplotlib can be used together effectively.
- Box plot: sns.boxplot(...) creates a box plot on the top-left subplot, visualizing the distribution of Sales within each Region. It displays the median, quartiles, and potential outliers.
- Heatmap: sns.heatmap(...) creates a heatmap on the top-right subplot, visualizing the correlation matrix between numerical variables. annot=True adds the correlation values to the plot, and cmap='coolwarm' uses a diverging color scheme centered at zero.
- Violin plot: sns.violinplot(...) creates a violin plot on the bottom-left subplot, showing the distribution of Sales by Quarter. It is similar to a box plot but provides a more detailed view of the data’s density.
- Count plot: sns.countplot(...) creates a count plot on the bottom-right subplot to show the number of occurrences of each Product within each Region. hue='Region' creates separate bars for each region within each product category.
- plt.tight_layout(): As with the Matplotlib plots, this is used to prevent overlapping elements.
- plt.savefig(...) and plt.show(): Save the figure to a file and display it.
Basic Machine Learning
flowchart TD
A[Start ML Examples] --> B[Prepare Regression Data]
B --> C["Split Train/Test Regression Data"]
C --> D[Train Linear Regression Model]
D --> E[Predict Sales on Test Set]
E --> F["Evaluate Regression: MSE, R², Coefficients"]
F --> G[Prepare Classification Data]
G --> H[Label Encode Product, Region]
H --> I["Split Train/Test Classification Data"]
I --> J["Scale Data (StandardScaler)"]
J --> K[Train Logistic Regression]
I --> L[Train Random Forest]
K --> M[Logistic Regression Predictions]
L --> N[Random Forest Predictions]
M --> O[Evaluate Logistic Regression Accuracy]
N --> P[Evaluate Random Forest Accuracy & Report]
P --> Q["Feature Importance (Random Forest)"]
Q --> R["Summary Statistics with describe()"]
R --> S[Generate Analysis Report]
S --> T["Save Report to analysis_report.txt"]
T --> U[End]
classDiagram
class train_test_split {
+split(X, y, test_size, random_state)
}
class LinearRegression {
+fit(X_train, y_train)
+predict(X_test)
+score(X_test, y_test)
-coef_
}
class LogisticRegression {
+fit(X_train, y_train)
+predict(X_test)
+score(X_test, y_test)
}
class RandomForestClassifier {
+fit(X_train, y_train)
+predict(X_test)
+score(X_test, y_test)
-feature_importances_
}
class StandardScaler {
+fit_transform(X_train)
+transform(X_test)
}
class LabelEncoder {
+fit_transform(data)
+inverse_transform()
}
class Metrics {
+mean_squared_error()
+accuracy_score()
+classification_report()
}
train_test_split --> LinearRegression
train_test_split --> LogisticRegression
train_test_split --> RandomForestClassifier
StandardScaler --> LogisticRegression
LabelEncoder --> X_class
Metrics --> LinearRegression
Metrics --> LogisticRegression
Metrics --> RandomForestClassifier
# Install scikit-learn: pip install scikit-learn
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import mean_squared_error, accuracy_score, classification_report
from sklearn.preprocessing import StandardScaler, LabelEncoder
print("\n=== Machine Learning Examples ===")
# 1. Linear Regression Example
print("1. Linear Regression - Predicting Sales")
# Prepare data for regression
X_reg = sales_df[['Customer_Count', 'Month', 'Quarter']].values
y_reg = sales_df['Sales'].values
# Split the data
X_train_reg, X_test_reg, y_train_reg, y_test_reg = train_test_split(
X_reg, y_reg, test_size=0.2, random_state=42
)
# Train the model
reg_model = LinearRegression()
reg_model.fit(X_train_reg, y_train_reg)
# Make predictions
y_pred_reg = reg_model.predict(X_test_reg)
# Evaluate
mse = mean_squared_error(y_test_reg, y_pred_reg)
print(f"Mean Squared Error: {mse:.2f}")
print(f"R-squared Score: {reg_model.score(X_test_reg, y_test_reg):.3f}")
# Feature importance
feature_names = ['Customer_Count', 'Month', 'Quarter']
for name, coef in zip(feature_names, reg_model.coef_):
print(f"{name}: {coef:.2f}")
# 2. Classification Example
print("\n2. Classification - Predicting High/Low Sales")
# Create binary target variable
sales_median = sales_df['Sales'].median()
sales_df['High_Sales'] = (sales_df['Sales'] > sales_median).astype(int)
# Prepare data for classification
le_product = LabelEncoder()
le_region = LabelEncoder()
X_class = pd.DataFrame({
'Customer_Count': sales_df['Customer_Count'],
'Month': sales_df['Month'],
'Product': le_product.fit_transform(sales_df['Product']),
'Region': le_region.fit_transform(sales_df['Region'])
})
y_class = sales_df['High_Sales'].values
# Split and scale the data
X_train_class, X_test_class, y_train_class, y_test_class = train_test_split(
X_class, y_class, test_size=0.2, random_state=42
)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train_class)
X_test_scaled = scaler.transform(X_test_class)
# Train models
log_model = LogisticRegression(random_state=42)
rf_model = RandomForestClassifier(n_estimators=100, random_state=42)
log_model.fit(X_train_scaled, y_train_class)
rf_model.fit(X_train_class, y_train_class) # Random Forest doesn't need scaling
# Make predictions
log_pred = log_model.predict(X_test_scaled)
rf_pred = rf_model.predict(X_test_class)
# Evaluate models
print("Logistic Regression Accuracy:", accuracy_score(y_test_class, log_pred))
print("Random Forest Accuracy:", accuracy_score(y_test_class, rf_pred))
print("\nRandom Forest Classification Report:")
print(classification_report(y_test_class, rf_pred))
# Feature importance for Random Forest
print("\nFeature Importance (Random Forest):")
for name, importance in zip(X_class.columns, rf_model.feature_importances_):
print(f"{name}: {importance:.3f}")
# 3. Data Analysis Summary
print("\n=== Data Analysis Summary ===")
summary_stats = sales_df.describe()
print("Summary Statistics:")
print(summary_stats)
# Create a comprehensive analysis report
analysis_report = f"""
DATA ANALYSIS REPORT
===================
Dataset Overview:
- Total Records: {len(sales_df)}
- Date Range: {sales_df['Date'].min()} to {sales_df['Date'].max()}
- Products: {', '.join(sales_df['Product'].unique())}
- Regions: {', '.join(sales_df['Region'].unique())}
Key Insights:
- Average Sales: ${sales_df['Sales'].mean():.2f}
- Total Sales: ${sales_df['Sales'].sum():.2f}
- Best Performing Product: {product_sales.idxmax()} (${product_sales.max():.2f} avg)
- Most Active Region: {sales_df['Region'].value_counts().index[0]}
Model Performance:
- Sales Prediction R²: {reg_model.score(X_test_reg, y_test_reg):.3f}
- High/Low Sales Classification: {accuracy_score(y_test_class, rf_pred):.3f}
"""
print(analysis_report)
# Save analysis results
with open('analysis_report.txt', 'w') as f:
f.write(analysis_report)
print("Analysis report saved to 'analysis_report.txt'")PythonThis code provides a practical and comprehensive demonstration of machine learning concepts using the scikit-learn library. It covers data preparation, training different model types (regression and classification), evaluating performance, and interpreting results.
Scikit-learn workflow
The code follows a standard machine learning workflow.
- Data preparation: Features (X) and the target variable (y) are defined. Categorical data is handled using LabelEncoder to convert labels like 'A', 'B', 'C' into numerical representations that machine learning models can understand.
- Train-test split: The train_test_split() function divides the data into training and testing sets. This is a crucial step to evaluate a model’s performance on unseen data and prevent overfitting.
- Model training: The chosen model (LinearRegression, LogisticRegression, RandomForestClassifier) is instantiated and then “trained” on the training data using the .fit() method.
- Prediction: The trained model’s .predict() method is used to make predictions on the test data.
- Evaluation: Various metrics (mean_squared_error, accuracy_score, classification_report) are used to assess the model’s performance.
- Interpretation: The model’s internal parameters (reg_model.coef_, rf_model.feature_importances_) are inspected to understand which features were most influential in the predictions.
Machine learning examples
- Linear Regression
  - Purpose: Predicts a continuous target variable (Sales) based on input features (Customer_Count, Month, Quarter).
  - Evaluation:
    - Mean Squared Error (MSE): Measures the average squared difference between the actual and predicted values. A lower MSE indicates better performance.
    - R-squared Score: Represents the proportion of variance in the target variable that is predictable from the features. A value close to 1 indicates a good fit.
  - Interpretation: The model’s coefficients (reg_model.coef_) show the impact of each feature on the predicted sales.
- Classification
  - Purpose: Predicts a categorical target variable (High_Sales: 1 for high sales, 0 for low sales).
  - Data preparation:
    - Feature scaling: The StandardScaler standardizes features by removing the mean and scaling to unit variance. This is essential for models sensitive to the scale of features, like LogisticRegression; RandomForestClassifier is generally not sensitive to scaling.
  - Models:
    - Logistic Regression: A simple, linear model for binary classification.
    - Random Forest Classifier: An ensemble model that uses multiple decision trees to make more robust predictions. It often performs better than linear models on complex datasets.
  - Evaluation:
    - Accuracy Score: Measures the proportion of correctly classified instances.
    - Classification Report: Provides a detailed breakdown of precision, recall, and F1-score for each class, offering a more complete picture of performance.
  - Interpretation: The feature_importances_ attribute of the Random Forest model indicates which features were most important in determining high or low sales.
Data analysis summary and reporting
This section uses the analysis results to generate a readable summary report, which is a common task in a data science workflow.
- sales_df.describe(): Generates descriptive statistics (mean, std, min, max, quartiles) for the numerical columns, providing a quick overview of the data.
- Analysis report generation: An f-string is used to create a formatted, human-readable report summarizing the key findings, including the dataset overview, key insights from the previous analysis, and model performance metrics.
- File output: The report is saved to a text file (analysis_report.txt) for documentation or sharing.
Testing and Debugging
Testing Framework
graph TD
A[Python Testing] --> B[Unit Testing]
A --> C[Integration Testing]
A --> D[Test-Driven Development]
B --> B1[unittest]
B --> B2[pytest]
B --> B3[doctest]
C --> C1[API Testing]
C --> C2[Database Testing]
C --> C3[End-to-End Testing]
D --> D1[Write Tests First]
D --> D2[Implement Code]
D --> D3[Refactor]
Unit Testing with unittest
flowchart TD
A[Start Tests] --> B["setUp() Create Calculator Instance"]
B --> C["Test add()"]
B --> D["Test subtract()"]
B --> E["Test multiply()"]
B --> F["Test divide()"]
B --> G["Test divide_by_zero()"]
B --> H["Test power()"]
B --> I["Edge Case: Large Numbers"]
B --> J["Edge Case: Floating Point Precision"]
B --> K["Edge Case: Negative Numbers"]
C --> L[Assert Equal Results]
D --> L
E --> L
F --> L
G --> L
H --> L
I --> L
J --> L
K --> L
L --> M["tearDown()"]
M --> N[Unittest Runner Collects Results]
N --> O["Summary Printed: Tests Run, Failures, Errors"]
O --> P[End]
classDiagram
class Calculator {
+add(a, b)
+subtract(a, b)
+multiply(a, b)
+divide(a, b)
+power(base, exponent)
}
class TestCalculator {
+setUp()
+test_add()
+test_subtract()
+test_multiply()
+test_divide()
+test_divide_by_zero()
+test_power()
+tearDown()
}
class TestCalculatorEdgeCases {
+setUp()
+test_large_numbers()
+test_floating_point_precision()
+test_negative_numbers()
}
TestCalculator --> Calculator
TestCalculatorEdgeCases --> Calculator
# calculator.py - Module to test
class Calculator:
"""A simple calculator class."""
def add(self, a, b):
"""Add two numbers."""
return a + b
def subtract(self, a, b):
"""Subtract two numbers."""
return a - b
def multiply(self, a, b):
"""Multiply two numbers."""
return a * b
def divide(self, a, b):
"""Divide two numbers."""
if b == 0:
raise ValueError("Cannot divide by zero")
return a / b
def power(self, base, exponent):
"""Calculate base raised to exponent."""
return base ** exponent
# test_calculator.py - Unit tests
import unittest
import sys
import os
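# When the tests live in a separate test_calculator.py file (as the comment above
# suggests), the class under test must also be imported:
# from calculator import Calculator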
class TestCalculator(unittest.TestCase):
"""Test cases for Calculator class."""
def setUp(self):
"""Set up test fixtures before each test method."""
self.calc = Calculator()
def test_add(self):
"""Test addition operation."""
self.assertEqual(self.calc.add(2, 3), 5)
self.assertEqual(self.calc.add(-1, 1), 0)
self.assertEqual(self.calc.add(0, 0), 0)
def test_subtract(self):
"""Test subtraction operation."""
self.assertEqual(self.calc.subtract(5, 3), 2)
self.assertEqual(self.calc.subtract(0, 5), -5)
self.assertEqual(self.calc.subtract(-2, -3), 1)
def test_multiply(self):
"""Test multiplication operation."""
self.assertEqual(self.calc.multiply(3, 4), 12)
self.assertEqual(self.calc.multiply(-2, 3), -6)
self.assertEqual(self.calc.multiply(0, 100), 0)
def test_divide(self):
"""Test division operation."""
self.assertEqual(self.calc.divide(8, 2), 4)
self.assertEqual(self.calc.divide(7, 2), 3.5)
self.assertAlmostEqual(self.calc.divide(1, 3), 0.33333, places=4)
def test_divide_by_zero(self):
"""Test division by zero raises ValueError."""
with self.assertRaises(ValueError):
self.calc.divide(5, 0)
# Test the error message
with self.assertRaisesRegex(ValueError, "Cannot divide by zero"):
self.calc.divide(10, 0)
def test_power(self):
"""Test power operation."""
self.assertEqual(self.calc.power(2, 3), 8)
self.assertEqual(self.calc.power(5, 0), 1)
self.assertEqual(self.calc.power(10, -1), 0.1)
def tearDown(self):
"""Clean up after each test method."""
pass # Nothing to clean up for this simple example
class TestCalculatorEdgeCases(unittest.TestCase):
"""Test edge cases for Calculator class."""
def setUp(self):
self.calc = Calculator()
def test_large_numbers(self):
"""Test with very large numbers."""
large_num = 10**10
self.assertEqual(self.calc.add(large_num, large_num), 2 * large_num)
def test_floating_point_precision(self):
"""Test floating point operations."""
result = self.calc.add(0.1, 0.2)
self.assertAlmostEqual(result, 0.3, places=10)
def test_negative_numbers(self):
"""Test with negative numbers."""
self.assertEqual(self.calc.multiply(-5, -4), 20)
self.assertEqual(self.calc.divide(-10, -2), 5)
# Run tests
if __name__ == '__main__':
# Create a test suite
suite = unittest.TestLoader().loadTestsFromTestCase(TestCalculator)
suite.addTests(unittest.TestLoader().loadTestsFromTestCase(TestCalculatorEdgeCases))
# Run tests with detailed output
runner = unittest.TextTestRunner(verbosity=2)
result = runner.run(suite)
# Print summary
print(f"\nTests run: {result.testsRun}")
print(f"Failures: {len(result.failures)}")
print(f"Errors: {len(result.errors)}")PythonThis script provides a comprehensive example of unit testing in Python using the unittest framework, which is built into the standard library. It defines a Calculator class to be tested and a separate TestCalculator class containing the test cases.
The Calculator class (calculator.py)
- Purpose: A simple class that contains methods for basic arithmetic operations: addition, subtraction, multiplication, division, and exponentiation.
- Method (divide): The divide method includes explicit error handling. It raises a ValueError if the denominator b is 0, which is a key behavior that needs to be tested.
The TestCalculator classes (test_calculator.py)
This script contains the unit tests for the Calculator class. It demonstrates several important concepts in unit testing.
- Test Case Class: The
TestCalculatorclass inherits fromunittest.TestCase. This is the standard way to create a test case, as it provides all the necessary assertion methods. - Setup and Teardown:
setUp(self): This method is automatically called before each test method runs. It sets up a new, fresh instance of theCalculator(self.calc) for each test, ensuring that tests are independent and do not interfere with each other.tearDown(self): This method is called after each test method. It is used for cleanup, such as closing files or database connections. In this simple example, it does nothing (pass).
- Test Methods: Each method whose name starts with
test_is considered a test method by theunittestframework.self.assertEqual(a, b): Asserts thataandbare equal. Used foradd,subtract,multiply, andpowertests.self.assertAlmostEqual(a, b, places=n): Asserts thataandbare approximately equal up tondecimal places. This is essential for testing floating-point numbers due to potential precision issues.self.assertRaises(Exception): Used with awithstatement to test that a specific exception (ValueError) is raised when expected.self.assertRaisesRegex(Exception, "message"): A more specific assertion that checks if an exception is raised and if its message matches a given regular expression.
- Testing Edge Cases (
TestCalculatorEdgeCases):- Purpose: Demonstrates the importance of writing a separate test case for special scenarios like very large numbers, floating-point precision, and negative numbers.
self.assertAlmostEqual(0.1 + 0.2, 0.3): This is a classic example showing the need forassertAlmostEqualwith floating-point arithmetic.
- Organizing and Running Tests (
if __name__ == '__main__':):- Test Suite: The code in the
mainblock demonstrates how to manually build aTestSuiteby loading tests from multipleTestCaseclasses. unittest.TextTestRunner(verbosity=2): TheTestRunnerexecutes the tests. Theverbosity=2argument provides more detailed output, showing which tests passed (.), which failed (F), and which encountered an error (E).result = runner.run(suite): The runner executes the suite and returns a result object containing information about the test run, including the number of tests, failures, and errors.
- Test Suite: The code in the
- Summary Output: The script prints a summary of the test results, providing a clear overview of the test run.
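For simpler cases you usually do not need to build the suite by hand: unittest can discover and run every TestCase in the module for you. A minimal sketch, assuming the same TestCalculator and TestCalculatorEdgeCases classes from the example above:

# Simpler alternative to building a TestSuite manually:
# unittest.main() discovers every TestCase subclass in this module and runs it.
if __name__ == '__main__':
    unittest.main(verbosity=2)

The same tests can also be run from the command line with python -m unittest -v test_calculator.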
Testing with pytest
flowchart TD
A["Start Pytest"] --> B["Collect Test Classes and Functions"]
B --> C["Fixture: calc() -> Provide Calculator instance"]
B --> D["Fixture: tmp_path -> Temporary directory"]
B --> E["Fixture: database_connection -> Session-scoped DB connection"]
C --> F["Test add()"]
C --> G["Test subtract()"]
C --> H["Test multiply()"]
C --> I["Test divide()"]
C --> J["Test divide_by_zero()"]
C --> K["Test power_parametrized()"]
C --> L["Test performance (slow)"]
E --> M["Test database_operation()"]
B --> N["Test temporary_file()"]
B --> O["Test API integration (integration)"]
F --> P["Assertions"]
G --> P
H --> P
I --> P
J --> P
K --> P
L --> P
M --> P
N --> P
O --> P
P --> Q["Report test results: Passed / Failed / Skipped"]
Q --> R["End Pytest"]classDiagram
class Calculator {
+add(a, b)
+subtract(a, b)
+multiply(a, b)
+divide(a, b)
+power(base, exponent)
}
class TestCalculatorPytest {
+test_add(calc)
+test_subtract(calc)
+test_multiply(calc)
+test_divide(calc)
+test_divide_by_zero(calc)
+test_power_parametrized(calc, a, b, expected)
+test_performance(calc)
}
class OtherPytestTests {
+test_temporary_file(tmp_path)
+test_database_operation(database_connection)
+test_api_integration()
}
TestCalculatorPytest --> Calculator
OtherPytestTests --> Calculator

# Install pytest: pip install pytest
# test_calculator_pytest.py
import pytest
from calculator import Calculator
class TestCalculatorPytest:
"""Pytest test cases for Calculator."""
@pytest.fixture
def calc(self):
"""Fixture to provide Calculator instance."""
return Calculator()
def test_add(self, calc):
"""Test addition with pytest."""
assert calc.add(2, 3) == 5
assert calc.add(-1, 1) == 0
assert calc.add(0, 0) == 0
def test_subtract(self, calc):
"""Test subtraction with pytest."""
assert calc.subtract(5, 3) == 2
assert calc.subtract(0, 5) == -5
def test_multiply(self, calc):
"""Test multiplication with pytest."""
assert calc.multiply(3, 4) == 12
assert calc.multiply(-2, 3) == -6
def test_divide(self, calc):
"""Test division with pytest."""
assert calc.divide(8, 2) == 4
assert abs(calc.divide(1, 3) - 0.33333) < 0.00001
def test_divide_by_zero(self, calc):
"""Test division by zero with pytest."""
with pytest.raises(ValueError, match="Cannot divide by zero"):
calc.divide(5, 0)
@pytest.mark.parametrize("a,b,expected", [
(2, 3, 8),
(5, 0, 1),
(10, 1, 10),
(2, -1, 0.5),
])
def test_power_parametrized(self, calc, a, b, expected):
"""Test power operation with multiple parameters."""
assert calc.power(a, b) == expected
@pytest.mark.slow
def test_performance(self, calc):
"""Test performance - marked as slow."""
# This test would normally test performance
result = sum(calc.add(i, i) for i in range(10000))
assert result > 0
# Advanced pytest features
def test_temporary_file(tmp_path):
"""Test using temporary directory fixture."""
# tmp_path is a pytest fixture that provides a temporary directory
test_file = tmp_path / "test.txt"
test_file.write_text("Hello, World!")
assert test_file.read_text() == "Hello, World!"
assert test_file.exists()
@pytest.fixture(scope="session")
def database_connection():
"""Session-scoped fixture for database connection."""
# This would normally create a real database connection
connection = {"status": "connected", "type": "test_db"}
yield connection
# Cleanup code would go here
connection["status"] = "disconnected"
def test_database_operation(database_connection):
"""Test using session-scoped fixture."""
assert database_connection["status"] == "connected"
# Custom pytest markers must be registered, e.g. in pytest.ini under the
# [pytest] section, or in pyproject.toml under [tool.pytest.ini_options]:
# markers =
#     slow: marks tests as slow
#     integration: marks tests as integration tests
#     unit: marks tests as unit tests
@pytest.mark.integration
def test_api_integration():
"""Integration test example."""
# This would test actual API integration
assert True
# Conftest.py - shared fixtures
# This file would contain shared fixtures across multiple test files

This example demonstrates the use of pytest, a popular and powerful testing framework for Python. It is known for its simple, readable syntax and advanced features such as fixtures, parametrization, and test markers.
pytest test structure
- Test Naming: pytest automatically discovers test files and test functions. It looks for files named test_*.py or *_test.py and functions named test_*. This convention makes test organization straightforward.
- Assertions: pytest uses plain Python assert statements instead of special methods like self.assertEqual from unittest. This simplifies test code and makes it more readable.
Key pytest features
Fixtures
- @pytest.fixture: Decorator used to mark a function as a fixture. A fixture sets up a test environment, such as creating an object or a temporary database.
- calc(self): The calc fixture provides an instance of the Calculator class. pytest automatically injects this fixture into any test function that includes calc in its parameter list.
- Fixture Scopes: Fixtures can have different scopes (function, class, module, session). The database_connection fixture uses scope="session", meaning it runs once per test session, which is efficient for expensive setup tasks like database connections.
- yield in fixtures: Using yield in a fixture separates setup from teardown. The code before yield is the setup, and the code after yield is the teardown, which runs after the test has finished.
- tmp_path: A built-in pytest fixture that provides a path to a temporary directory unique to each test invocation.
Parametrization
- @pytest.mark.parametrize(...): This decorator lets you run a test with different sets of input data without writing multiple test functions.
- Usage: The decorator takes a string of argument names ("a,b,expected") and a list of tuples containing the values for each argument ([(2, 3, 8), (5, 0, 1), ...]). pytest runs test_power_parametrized once for each tuple of values.
Custom markers
- @pytest.mark.slow and @pytest.mark.integration: Custom markers help categorize tests.
- Usage: You can run only tests with a specific marker from the command line, for example pytest -m "slow" to run only slow tests, or pytest -m "not slow" to exclude them.
- Configuration: To prevent warnings about unknown markers, register them in a pytest.ini or pyproject.toml file.
Exception testing
- with pytest.raises(...): The pytest.raises context manager is the idiomatic way to test that a specific exception is raised.
- match="Cannot divide by zero": The optional match argument checks that the exception's message matches a given regular expression.
Shared fixtures (conftest.py)
- conftest.py: A file used to define fixtures that can be shared across multiple test files within a directory and its subdirectories. pytest automatically finds and loads fixtures defined in conftest.py. A minimal sketch follows.
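A minimal conftest.py for the example above might look like the following sketch. It moves the calc fixture out of the test class so that every test file in the directory can use it; the calculator module name is taken from the import in test_calculator_pytest.py.

# conftest.py - fixtures defined here are discovered automatically by pytest
import pytest
from calculator import Calculator

@pytest.fixture
def calc():
    """Provide a fresh Calculator instance to any test that requests 'calc'."""
    return Calculator()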
Running pytest
To run the tests, run the pytest command in your terminal from the directory containing your test files; pytest automatically discovers and runs them. A few common invocations are shown below.
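These are standard pytest command-line options; the file and marker names follow the examples above:

pytest                            # run every test discovered in the current directory
pytest -v                         # verbose output, one line per test
pytest test_calculator_pytest.py  # run a single test file
pytest -k "divide"                # run only tests whose names match the expression
pytest -m "not slow"              # deselect tests marked with @pytest.mark.slow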
Debugging Techniques
flowchart TD
A[Start Demonstration] --> B["complex_calculation(5, 3)"]
B -->|Logging, Debug Mode?| C{Debug Mode?}
C -->|Yes| D["pdb.set_trace()"]
C -->|No| E[Perform calculation steps]
E --> F[Return result]
F --> G[Print Result]
A --> H["DebugContext('data_processing')"]
H --> I[Process data in context]
I --> J[Log processed data]
J --> K[Exit DebugContext]
A --> L["PerformanceMonitor('large_calculation')"]
L --> M[Time-consuming operation]
M --> N[Log duration]
N --> O[Exit PerformanceMonitor]
A --> P["DebugContext('error_prone_operation')"]
P --> Q["risky_operation()"]
Q -->|Exception?| R{Exception?}
R -->|Yes| S[Log error, handle exception]
R -->|No| T[Return Success]
S --> U[Continue execution]
A --> V[Debugger utilities]
V --> W["print_variables(locals())"]
V --> X["trace_calls(frame)"]
V --> Y["memory_usage()"]
U --> Z[End Demonstration]
T --> Z
W --> Z
X --> Z
Y --> Z

classDiagram
class DebugContext {
-name: str
-log_level: int
-logger: Logger
+__enter__()
+__exit__(exc_type, exc_val, exc_tb)
}
class PerformanceMonitor {
-name: str
-start_time: float
+__enter__()
+__exit__(exc_type, exc_val, exc_tb)
}
class Debugger {
+print_variables(local_vars, filter_private=True)
+trace_calls(frame, event, arg)
+memory_usage()
}
class Logging {
+debug_function_calls(func)
+logger: Logger
}
class Functions {
+complex_calculation(x, y, debug_mode=False)
+risky_operation()
+demonstrate_debugging()
}
Functions --> Logging
Functions --> DebugContext
Functions --> PerformanceMonitor
Functions --> Debugger

import pdb
import time
import logging
import traceback
from functools import wraps
# 1. Logging for debugging
logging.basicConfig(
level=logging.DEBUG,
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
handlers=[
logging.FileHandler('app.log'),
logging.StreamHandler()
]
)
logger = logging.getLogger(__name__)
def debug_function_calls(func):
"""Decorator to log function calls and results."""
@wraps(func)
def wrapper(*args, **kwargs):
logger.debug(f"Calling {func.__name__} with args={args}, kwargs={kwargs}")
try:
result = func(*args, **kwargs)
logger.debug(f"{func.__name__} returned: {result}")
return result
except Exception as e:
logger.error(f"Error in {func.__name__}: {e}")
logger.error(traceback.format_exc())
raise
return wrapper
@debug_function_calls
def complex_calculation(x, y, debug_mode=False):
"""Example function with debugging features."""
logger.info(f"Starting complex calculation with x={x}, y={y}")
if debug_mode:
# Set breakpoint for debugging
pdb.set_trace()
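# Note: on Python 3.7+ the built-in breakpoint() call is generally preferred
# over pdb.set_trace(); by default it drops into the same pdb debugger.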
try:
# Some complex calculations
step1 = x * 2
logger.debug(f"Step 1 result: {step1}")
step2 = step1 + y
logger.debug(f"Step 2 result: {step2}")
if step2 == 0:
raise ValueError("Result cannot be zero")
result = 100 / step2
logger.debug(f"Final result: {result}")
return result
except Exception as e:
logger.error(f"Error in calculation: {e}")
raise
# 2. Custom debugging context manager
class DebugContext:
"""Context manager for debugging code blocks."""
def __init__(self, name, log_level=logging.DEBUG):
self.name = name
self.log_level = log_level
self.logger = logging.getLogger(f"DebugContext.{name}")
def __enter__(self):
self.logger.log(self.log_level, f"Entering {self.name}")
return self
def __exit__(self, exc_type, exc_val, exc_tb):
if exc_type:
self.logger.error(f"Exception in {self.name}: {exc_val}")
else:
self.logger.log(self.log_level, f"Exiting {self.name} successfully")
# 3. Performance debugging
class PerformanceMonitor:
"""Monitor performance of code blocks."""
def __init__(self, name):
self.name = name
self.start_time = None
def __enter__(self):
self.start_time = time.time()
logger.info(f"Starting performance monitoring for {self.name}")
return self
def __exit__(self, exc_type, exc_val, exc_tb):
end_time = time.time()
duration = end_time - self.start_time
logger.info(f"Performance monitoring for {self.name}: {duration:.4f} seconds")
# Example usage of debugging tools
def demonstrate_debugging():
"""Demonstrate various debugging techniques."""
print("=== Debugging Demonstration ===")
# 1. Basic function with logging
try:
result1 = complex_calculation(5, 3)
print(f"Result 1: {result1}")
except Exception as e:
print(f"Error: {e}")
# 2. Using debug context
with DebugContext("data_processing"):
data = [1, 2, 3, 4, 5]
processed = [x * 2 for x in data]
logger.debug(f"Processed data: {processed}")
# 3. Performance monitoring
with PerformanceMonitor("large_calculation"):
# Simulate some time-consuming operation
total = sum(x**2 for x in range(100000))
logger.info(f"Calculation result: {total}")
# 4. Exception handling with detailed logging
try:
with DebugContext("error_prone_operation"):
risky_operation()
except Exception as e:
logger.error("Caught exception in demonstration")
print(f"Handled error: {e}")
def risky_operation():
"""Function that might raise an exception."""
import random
if random.random() < 0.5:
raise ValueError("Random error for demonstration")
return "Success!"
# Advanced debugging techniques
class Debugger:
"""Custom debugger with various utilities."""
@staticmethod
def print_variables(local_vars, filter_private=True):
"""Print all local variables."""
print("=== Local Variables ===")
for name, value in local_vars.items():
if filter_private and name.startswith('_'):
continue
print(f"{name}: {value} ({type(value).__name__})")
@staticmethod
def trace_calls(frame, event, arg):
"""Trace function calls."""
if event == 'call':
code = frame.f_code
print(f"Calling: {code.co_filename}:{code.co_firstlineno} {code.co_name}")
return Debugger.trace_calls
@staticmethod
def memory_usage():
"""Get current memory usage."""
import psutil  # third-party dependency: pip install psutil
import os
process = psutil.Process(os.getpid())
memory_mb = process.memory_info().rss / 1024 / 1024
return f"Memory usage: {memory_mb:.2f} MB"
# Run debugging demonstration
if __name__ == "__main__":
demonstrate_debugging()
# Example of using the debugger utilities
debugger = Debugger()
print(debugger.memory_usage())
# Print local variables in current scope
local_var1 = "test"
local_var2 = 42
debugger.print_variables(locals())

Best Practices
Code Quality and Style
graph TD
A[Python Best Practices] --> B[Code Style]
A --> C[Documentation]
A --> D[Error Handling]
A --> E[Performance]
A --> F[Security]
B --> B1[PEP 8]
B --> B2[Type Hints]
B --> B3[Code Formatting]
C --> C1[Docstrings]
C --> C2[Comments]
C --> C3[README Files]
D --> D1[Exception Handling]
D --> D2[Input Validation]
D --> D3[Logging]
E --> E1[Profiling]
E --> E2[Optimization]
E --> E3[Memory Management]

Code Style and Standards
flowchart TD
A[Start Demonstration] --> B[Create Users]
B -->|Validation| C{Valid users?}
C -->|Yes| D[Create TaskManager]
C -->|No| E[Exit with error]
D --> F[Create Tasks]
F --> G[Update Task Status]
G --> H[Get Tasks by Status & User]
H --> I[Print Task Summary]
D --> J[Retry unreliable operation]
J --> K{Success?}
K -->|Yes| L[Print success result]
K -->|No| M[Raise exception]
D --> N[Validate Input]
N --> O{Validation passed?}
O -->|Yes| P[Continue]
O -->|No| Q[Raise ValueError]
A --> R[SecureDataHandler Demonstration]
R --> S[Store sensitive data]
S --> T[Retrieve sensitive data]
T --> U[Clear sensitive data]
E --> V[End Demonstration]
P --> V
L --> V
M --> V
U --> V

classDiagram
class User {
+name: str
+email: str
+age: int
+role: UserRole
+__post_init__()
+is_admin() bool
}
class Task {
+id: int
+title: str
+description: str
+status: Status
+assigned_to: Optional[User]
}
class TaskManager {
-_tasks: Dict[int, Task]
-_next_id: int
+create_task(title, description, assigned_to) Task
+get_task(task_id) Optional[Task]
+update_task_status(task_id, status) bool
+get_tasks_by_status(status) List[Task]
+get_user_tasks(user) List[Task]
+delete_task(task_id) bool
+get_task_summary() Dict[str,int]
+__len__() int
+__str__() str
}
class SecureDataHandler {
-_encryption_key: Optional[str]
-_sensitive_data: Dict[str,str]
+store_sensitive_data(key, value)
+retrieve_sensitive_data(key) Optional[str]
+clear_sensitive_data()
-_simple_encrypt(value) str
-_simple_decrypt(value) str
}
class PerformanceMonitor {
+performance_monitor(operation_name)
}
class Utilities {
+validate_input(value, validators)
+retry_operation(operation, max_attempts, delay)
}
UserRole <|-- User
Status <|-- Task
TaskManager --> Task
TaskManager --> User
Utilities --> TaskManager
SecureDataHandler --> Utilities

"""
Best Practices Example Module
This module demonstrates Python best practices including:
- PEP 8 compliance
- Type hints
- Proper documentation
- Error handling
- Design patterns
"""
from typing import List, Dict, Optional, Union, Callable, Any
from dataclasses import dataclass
from enum import Enum
import logging
from contextlib import contextmanager
# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
# Constants (use UPPER_CASE)
MAX_RETRY_ATTEMPTS = 3
DEFAULT_TIMEOUT = 30
API_BASE_URL = "https://api.example.com"
# Enums for better code organization
class UserRole(Enum):
"""User roles enumeration."""
ADMIN = "admin"
USER = "user"
GUEST = "guest"
class Status(Enum):
"""Status enumeration."""
PENDING = "pending"
COMPLETED = "completed"
FAILED = "failed"
# Data classes for structured data
@dataclass
class User:
"""User data class with validation."""
name: str
email: str
age: int
role: UserRole = UserRole.USER
def __post_init__(self) -> None:
"""Validate user data after initialization."""
if not self.name.strip():
raise ValueError("Name cannot be empty")
if not self._is_valid_email(self.email):
raise ValueError("Invalid email format")
if self.age < 0 or self.age > 150:
raise ValueError("Age must be between 0 and 150")
@staticmethod
def _is_valid_email(email: str) -> bool:
"""Validate email format."""
return "@" in email and "." in email.split("@")[1]
def is_admin(self) -> bool:
"""Check if user has admin privileges."""
return self.role == UserRole.ADMIN
@dataclass
class Task:
"""Task data class."""
id: int
title: str
description: str
status: Status = Status.PENDING
assigned_to: Optional[User] = None
# Main classes with proper design patterns
class TaskManager:
"""
Task management class following best practices.
This class demonstrates:
- Single Responsibility Principle
- Proper error handling
- Type hints
- Documentation
- Design patterns
"""
def __init__(self) -> None:
"""Initialize TaskManager."""
self._tasks: Dict[int, Task] = {}
self._next_id: int = 1
logger.info("TaskManager initialized")
def create_task(
self,
title: str,
description: str,
assigned_to: Optional[User] = None
) -> Task:
"""
Create a new task.
Args:
title: Task title
description: Task description
assigned_to: User assigned to the task
Returns:
Created Task object
Raises:
ValueError: If title or description is empty
"""
if not title.strip():
raise ValueError("Task title cannot be empty")
if not description.strip():
raise ValueError("Task description cannot be empty")
task = Task(
id=self._next_id,
title=title.strip(),
description=description.strip(),
assigned_to=assigned_to
)
self._tasks[self._next_id] = task
self._next_id += 1
logger.info(f"Created task {task.id}: {task.title}")
return task
def get_task(self, task_id: int) -> Optional[Task]:
"""
Get task by ID.
Args:
task_id: Task ID
Returns:
Task object or None if not found
"""
return self._tasks.get(task_id)
def update_task_status(self, task_id: int, status: Status) -> bool:
"""
Update task status.
Args:
task_id: Task ID
status: New status
Returns:
True if updated successfully, False if task not found
"""
task = self._tasks.get(task_id)
if task is None:
logger.warning(f"Task {task_id} not found for status update")
return False
old_status = task.status
task.status = status
logger.info(f"Task {task_id} status changed from {old_status.value} to {status.value}")
return True
def get_tasks_by_status(self, status: Status) -> List[Task]:
"""
Get all tasks with given status.
Args:
status: Status to filter by
Returns:
List of tasks with the specified status
"""
return [task for task in self._tasks.values() if task.status == status]
def get_user_tasks(self, user: User) -> List[Task]:
"""
Get all tasks assigned to a specific user.
Args:
user: User object
Returns:
List of tasks assigned to the user
"""
return [
task for task in self._tasks.values()
if task.assigned_to == user
]
def delete_task(self, task_id: int) -> bool:
"""
Delete a task.
Args:
task_id: Task ID
Returns:
True if deleted successfully, False if task not found
"""
if task_id in self._tasks:
del self._tasks[task_id]
logger.info(f"Deleted task {task_id}")
return True
logger.warning(f"Task {task_id} not found for deletion")
return False
def get_task_summary(self) -> Dict[str, int]:
"""
Get summary of tasks by status.
Returns:
Dictionary with status counts
"""
summary = {status.value: 0 for status in Status}
for task in self._tasks.values():
summary[task.status.value] += 1
return summary
def __len__(self) -> int:
"""Return number of tasks."""
return len(self._tasks)
def __str__(self) -> str:
"""String representation of TaskManager."""
return f"TaskManager with {len(self._tasks)} tasks"
# Utility functions with proper typing and documentation
def validate_input(
value: Any,
validators: List[Callable[[Any], bool]],
error_message: str = "Validation failed"
) -> None:
"""
Validate input using multiple validators.
Args:
value: Value to validate
validators: List of validator functions
error_message: Error message if validation fails
Raises:
ValueError: If any validator fails
"""
for validator in validators:
if not validator(value):
raise ValueError(error_message)
def retry_operation(
operation: Callable[[], Any],
max_attempts: int = MAX_RETRY_ATTEMPTS,
delay: float = 1.0
) -> Any:
"""
Retry an operation with exponential backoff.
Args:
operation: Function to retry
max_attempts: Maximum number of attempts
delay: Initial delay between retries
Returns:
Result of the operation
Raises:
Exception: Last exception if all attempts fail
"""
import time
for attempt in range(max_attempts):
try:
return operation()
except Exception as e:
if attempt == max_attempts - 1:
logger.error(f"Operation failed after {max_attempts} attempts: {e}")
raise
logger.warning(f"Attempt {attempt + 1} failed: {e}. Retrying in {delay} seconds...")
time.sleep(delay)
delay *= 2 # Exponential backoff
@contextmanager
def performance_monitor(operation_name: str):
"""
Context manager for monitoring performance.
Args:
operation_name: Name of the operation being monitored
"""
import time
start_time = time.time()
logger.info(f"Starting {operation_name}")
try:
yield
finally:
end_time = time.time()
duration = end_time - start_time
logger.info(f"{operation_name} completed in {duration:.2f} seconds")
# Example usage demonstrating best practices
def demonstrate_best_practices() -> None:
"""Demonstrate best practices with comprehensive example."""
print("=== Python Best Practices Demonstration ===")
# Create users with validation
try:
admin_user = User(
name="Alice Admin",
email="alice@example.com",
age=30,
role=UserRole.ADMIN
)
regular_user = User(
name="Bob User",
email="bob@example.com",
age=25
)
print(f"Created users: {admin_user.name}, {regular_user.name}")
except ValueError as e:
print(f"User creation error: {e}")
return
# Create task manager and demonstrate operations
with performance_monitor("TaskManager operations"):
task_manager = TaskManager()
# Create tasks
task1 = task_manager.create_task(
"Implement login feature",
"Create user authentication system",
assigned_to=regular_user
)
task2 = task_manager.create_task(
"Review code",
"Review pull requests",
assigned_to=admin_user
)
task3 = task_manager.create_task(
"Update documentation",
"Update API documentation"
)
print(f"Created {len(task_manager)} tasks")
# Update task status
task_manager.update_task_status(task1.id, Status.COMPLETED)
task_manager.update_task_status(task2.id, Status.PENDING)
# Get tasks by various criteria
completed_tasks = task_manager.get_tasks_by_status(Status.COMPLETED)
user_tasks = task_manager.get_user_tasks(regular_user)
print(f"Completed tasks: {len(completed_tasks)}")
print(f"Tasks assigned to {regular_user.name}: {len(user_tasks)}")
# Get summary
summary = task_manager.get_task_summary()
print("Task Summary:")
for status, count in summary.items():
print(f" {status}: {count}")
# Demonstrate error handling and retry mechanism
def unreliable_operation():
"""Simulate an unreliable operation."""
import random
if random.random() < 0.7: # 70% chance of failure
raise ConnectionError("Network error")
return "Operation successful"
try:
with performance_monitor("Retry operation"):
result = retry_operation(unreliable_operation, max_attempts=5)
print(f"Retry result: {result}")
except Exception as e:
print(f"Operation failed permanently: {e}")
# Demonstrate input validation
validators = [
lambda x: isinstance(x, str),
lambda x: len(x) >= 3,
lambda x: x.strip() != ""
]
try:
validate_input("Valid input", validators)
print("Input validation passed")
except ValueError as e:
print(f"Input validation failed: {e}")
print("Best practices demonstration completed!")
# Security best practices
class SecureDataHandler:
"""Example of secure data handling practices."""
def __init__(self, encryption_key: Optional[str] = None):
"""Initialize with optional encryption key."""
self._encryption_key = encryption_key
self._sensitive_data: Dict[str, str] = {}
def store_sensitive_data(self, key: str, value: str) -> None:
"""
Store sensitive data with basic security measures.
Note: This is a simplified example. In production, use proper
encryption libraries like cryptography.
"""
if not key or not value:
raise ValueError("Key and value cannot be empty")
# In production, encrypt the value here
encrypted_value = self._simple_encrypt(value)
self._sensitive_data[key] = encrypted_value
logger.info(f"Stored sensitive data for key: {key}")
def retrieve_sensitive_data(self, key: str) -> Optional[str]:
"""Retrieve and decrypt sensitive data."""
encrypted_value = self._sensitive_data.get(key)
if encrypted_value is None:
return None
# In production, decrypt the value here
return self._simple_decrypt(encrypted_value)
def _simple_encrypt(self, value: str) -> str:
"""Simple encryption (NOT for production use)."""
# This is just for demonstration - use proper encryption in production
return "".join(chr(ord(c) + 1) for c in value)
def _simple_decrypt(self, encrypted_value: str) -> str:
"""Simple decryption (NOT for production use)."""
return "".join(chr(ord(c) - 1) for c in encrypted_value)
def clear_sensitive_data(self) -> None:
"""Clear all sensitive data from memory."""
self._sensitive_data.clear()
logger.info("Cleared all sensitive data")
if __name__ == "__main__":
# Run the demonstration
demonstrate_best_practices()
# Demonstrate secure data handling
print("\n=== Security Best Practices ===")
secure_handler = SecureDataHandler()
secure_handler.store_sensitive_data("api_key", "secret_key_123")
retrieved_value = secure_handler.retrieve_sensitive_data("api_key")
print(f"Retrieved value: {retrieved_value}")
secure_handler.clear_sensitive_data()
print("Security demonstration completed!")PythonThis script provides a comprehensive demonstration of Python best practices, including strong type hinting, data classes, enums, proper error handling, object-oriented design patterns, clear documentation, context managers, and modular organization.
Best practices explained
Modularity and constants
- Purpose: Organizes code into logical, reusable units and centralizes configuration values.
- Example: The script is a self-contained module demonstrating a TaskManager and related components. Constants like MAX_RETRY_ATTEMPTS and API_BASE_URL are defined at the module level in UPPER_CASE to signal that they are fixed values.
Enums
- Purpose: Defines a set of named, constant values, making the code more readable, self-documenting, and less error-prone than using raw strings or integers.
- Example: The UserRole and Status enums are used for user roles and task statuses, providing a fixed set of valid options.
- Benefit: Comparing task.status == Status.COMPLETED is much clearer than task.status == "completed" and prevents typos from causing bugs. A small illustration follows.
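A short sketch of that benefit, using a Status enum like the one defined in the module above:

from enum import Enum

class Status(Enum):
    PENDING = "pending"
    COMPLETED = "completed"
    FAILED = "failed"

status = Status.COMPLETED

# A typo in a raw string comparison fails silently - the check is just False:
print(status.value == "complete")        # False, and the bug goes unnoticed

# A typo on an enum member fails loudly at the point of the mistake:
try:
    print(status == Status.COMPLETE)     # AttributeError: COMPLETE
except AttributeError as exc:
    print(f"Caught: {exc}")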
Data classes
- Purpose: Reduces boilerplate for classes that primarily store data by automatically generating methods such as __init__, __repr__, and __eq__.
- Example: The User and Task classes use the @dataclass decorator.
- __post_init__: The User data class uses this special method to validate data after initialization, ensuring data integrity.
- Immutability: Although frozen=True is not used here, that option could create immutable instances for added safety; a small sketch follows.
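A quick sketch of the frozen option (a hypothetical Point class, not part of the module above):

from dataclasses import dataclass

@dataclass(frozen=True)
class Point:
    x: float
    y: float

p = Point(1.0, 2.0)
try:
    p.x = 5.0                      # frozen instances reject attribute assignment
except Exception as exc:           # raises dataclasses.FrozenInstanceError
    print(f"Cannot modify: {exc}")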
Type hints
- Purpose: Annotates functions, methods, and variables with their expected types. Hints are not enforced at runtime, but they improve code readability, enable better IDE support, and let static checkers catch type-related errors before execution.
- Example: The script is heavily annotated with type hints, such as def create_task(...) -> Task: and self._tasks: Dict[int, Task].
- Optional and Union: Used when a value may be of a certain type or None, or one of several possible types.
- Callable: Used in the retry_operation utility to indicate that the operation parameter is a function. A note on the newer union syntax follows.
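A side note on syntax: the module above uses the typing module's spellings, but on Python 3.10+ the same hints can be written with built-in generics and the | union operator. A small self-contained sketch:

# On Python 3.10+, built-in generics and | replace Dict, List, Optional and Union:
def summarize(counts: dict[str, int], label: str | None = None) -> list[str]:
    """Format status counts as readable lines, with an optional heading."""
    lines = [f"{status}: {count}" for status, count in counts.items()]
    return ([label] if label else []) + lines

print(summarize({"pending": 2, "completed": 1}, label="Task Summary"))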
Documentation (docstrings)
- Purpose: Documents modules, classes, and functions to explain what they do, their parameters, and what they return. This is crucial for maintainability and collaboration.
- Example: All public components have docstrings following PEP 257, including the TaskManager class, its methods, and the utility functions.
Object-oriented design (TaskManager class)
- Purpose: Encapsulates related data and functionality, promoting code reuse and modularity.
- Single Responsibility Principle: The TaskManager class focuses solely on task management, adhering to this principle by not mixing concerns.
- Encapsulation: Private instance variables like _tasks and _next_id are used internally, and access to them is controlled through public methods (create_task, get_task, etc.).
Error handling
- Purpose: Makes the application robust by gracefully handling unexpected situations and providing informative feedback.
- Example:
  - Validation: Methods like TaskManager.create_task and User.__post_init__ validate input and raise a ValueError for invalid data.
  - Graceful failure: The retry_operation function handles exceptions during retries, logging each failure and ultimately re-raising the exception if all attempts fail.
Context managers
- Purpose: Simplifies resource management by ensuring that setup and teardown actions are always performed, even if errors occur.
- Example: The @contextmanager decorator is used to create performance_monitor, which times the execution of a code block. The with statement guarantees that the timer is stopped and logged at the end of the block.
Example usage flow
The demonstrate_best_practices() function illustrates how these components work together.
- User creation: A User object is created. The constructor internally calls __post_init__ to validate the provided name, email, and age, preventing invalid data from entering the system.
- User management: User objects are used within the system to track ownership of tasks. The code demonstrates checking whether a user is an admin.
- Task manager:
  - A TaskManager instance is created, which automatically logs its initialization.
  - Tasks are created using the create_task method, which validates the title and description and logs the creation of each task.
  - A task is assigned to a user object.
- Task operations: The script shows how to get tasks by ID and update their status.
  - Robustness: When attempting to update a non-existent task, the update_task_status method handles the case correctly and returns False, logging a warning instead of crashing.
- Task analysis: Methods for filtering tasks by status and getting a summary are called, showcasing the functionality of the TaskManager.
- Context manager: The performance_monitor context manager wraps a code block, logging its start and end times to measure performance.
- Exception handling: The example uses the retry_operation utility to handle an operation that might fail. It also uses a try-except block to handle exceptions, such as a ValueError during user creation, so the program doesn't crash.
- Output: The script prints the results of the operations, along with logs that detail the program's execution, including successful actions, warnings, and errors.
Performance Optimization
flowchart TD
A[Start Performance Demonstration] --> B[Caching Fibonacci]
B --> C{Cached vs Uncached}
C --> D[Print Timing Results]
D --> E[Generator vs List]
E --> F{Compare Memory & Speed}
F --> G[Print Results]
G --> H[String Operations]
H --> I{Slow vs Fast Concatenation}
I --> J[Print Results]
J --> K[Dictionary Lookup vs If-Elif]
K --> L{Compare Speed}
L --> M[Print Results]
M --> N[Memory Optimization]
N --> O[Demonstrate __slots__]
O --> P[Compare Memory Usage]
P --> Q[Generators vs Lists]
Q --> R[Process Sample Items]
R --> S[End Demonstration]

classDiagram
class PerformanceOptimizer {
+timing_decorator(func) Callable
+fibonacci_cached(n:int) int
+fibonacci_uncached(n:int) int
+process_with_list(data_size:int) list
+process_with_generator(data_size:int) int
+string_concatenation_slow(words:list) str
+string_concatenation_fast(words:list) str
+process_with_if_elif(value:str) str
+process_with_dict(value:str) str
}
class MemoryOptimizer {
-data: list
+demonstrate_slots()
+demonstrate_generators_memory()
}
class Utilities {
+profile_function(func:Callable)
}
PerformanceOptimizer --> Utilities
MemoryOptimizer --> Utilities

"""
Performance optimization techniques and best practices.
"""
import time
import cProfile
import pstats
from functools import lru_cache, wraps
from typing import Any, Callable
import sys
class PerformanceOptimizer:
"""Class demonstrating various performance optimization techniques."""
@staticmethod
def timing_decorator(func: Callable) -> Callable:
"""Decorator to measure function execution time."""
@wraps(func)
def wrapper(*args, **kwargs):
start_time = time.perf_counter()
result = func(*args, **kwargs)
end_time = time.perf_counter()
print(f"{func.__name__} took {end_time - start_time:.4f} seconds")
return result
return wrapper
# 1. Caching for expensive operations
@staticmethod
@lru_cache(maxsize=128)
def fibonacci_cached(n: int) -> int:
"""Cached Fibonacci calculation."""
if n <= 1:
return n
return PerformanceOptimizer.fibonacci_cached(n-1) + PerformanceOptimizer.fibonacci_cached(n-2)
@staticmethod
def fibonacci_uncached(n: int) -> int:
"""Uncached Fibonacci calculation."""
if n <= 1:
return n
return PerformanceOptimizer.fibonacci_uncached(n-1) + PerformanceOptimizer.fibonacci_uncached(n-2)
# 2. Generator vs List comprehension
@staticmethod
@timing_decorator
def process_with_list(data_size: int) -> list:
"""Process data using list comprehension."""
return [x**2 for x in range(data_size)]
@staticmethod
@timing_decorator
def process_with_generator(data_size: int) -> int:
"""Process data using generator."""
return sum(x**2 for x in range(data_size))
# 3. String operations optimization
@staticmethod
@timing_decorator
def string_concatenation_slow(words: list) -> str:
"""Slow string concatenation."""
result = ""
for word in words:
result += word + " "
return result.strip()
@staticmethod
@timing_decorator
def string_concatenation_fast(words: list) -> str:
"""Fast string concatenation."""
return " ".join(words)
# 4. Dictionary lookups vs multiple if-elif
@staticmethod
def process_with_if_elif(value: str) -> str:
"""Process using if-elif chain."""
if value == "a":
return "apple"
elif value == "b":
return "banana"
elif value == "c":
return "cherry"
elif value == "d":
return "date"
else:
return "unknown"
@staticmethod
def process_with_dict(value: str) -> str:
"""Process using dictionary lookup."""
mapping = {
"a": "apple",
"b": "banana",
"c": "cherry",
"d": "date"
}
return mapping.get(value, "unknown")
def profile_function(func: Callable) -> None:
"""Profile a function and display results."""
print(f"\nProfiling {func.__name__}:")
profiler = cProfile.Profile()
profiler.enable()
# Run the function
func()
profiler.disable()
# Display results
stats = pstats.Stats(profiler)
stats.sort_stats('cumulative')
stats.print_stats(10) # Show top 10 functions
def demonstrate_performance_optimization():
"""Demonstrate various performance optimization techniques."""
print("=== Performance Optimization Demonstration ===")
optimizer = PerformanceOptimizer()
# 1. Caching demonstration
print("\n1. Caching Benefits:")
@optimizer.timing_decorator
def test_fibonacci_cached():
return optimizer.fibonacci_cached(35)
@optimizer.timing_decorator
def test_fibonacci_uncached():
return optimizer.fibonacci_uncached(35)
print("Cached Fibonacci:")
result1 = test_fibonacci_cached()
print("Uncached Fibonacci:")
result2 = test_fibonacci_uncached()
print(f"Results match: {result1 == result2}")
# 2. Generator vs List
print("\n2. Generator vs List:")
data_size = 1000000
print("List comprehension (stores all in memory):")
optimizer.process_with_list(data_size)
print("Generator expression (processes on-demand):")
optimizer.process_with_generator(data_size)
# 3. String operations
print("\n3. String Operations:")
words = ["hello", "world", "python", "performance"] * 1000
print("Slow concatenation:")
result1 = optimizer.string_concatenation_slow(words)
print("Fast concatenation:")
result2 = optimizer.string_concatenation_fast(words)
print(f"Results match: {result1 == result2}")
# 4. Dictionary lookup vs if-elif
print("\n4. Dictionary Lookup vs If-Elif:")
test_values = ["a", "b", "c", "d", "x"] * 100000
@optimizer.timing_decorator
def test_if_elif():
return [optimizer.process_with_if_elif(v) for v in test_values]
@optimizer.timing_decorator
def test_dict_lookup():
return [optimizer.process_with_dict(v) for v in test_values]
print("If-elif chain:")
result1 = test_if_elif()
print("Dictionary lookup:")
result2 = test_dict_lookup()
print(f"Results match: {result1 == result2}")
# Memory optimization techniques
class MemoryOptimizer:
"""Techniques for memory optimization."""
def __init__(self):
self.data = []
def demonstrate_slots(self):
"""Demonstrate __slots__ for memory optimization."""
class RegularClass:
def __init__(self, x, y):
self.x = x
self.y = y
class SlottedClass:
__slots__ = ['x', 'y']
def __init__(self, x, y):
self.x = x
self.y = y
# Memory usage comparison
regular_objects = [RegularClass(i, i*2) for i in range(1000)]
slotted_objects = [SlottedClass(i, i*2) for i in range(1000)]
print("__slots__ reduces memory usage for instances")
print(f"Regular class instances: {len(regular_objects)}")
print(f"Slotted class instances: {len(slotted_objects)}")
def demonstrate_generators_memory(self):
"""Show memory efficiency of generators."""
def create_large_list():
return [x**2 for x in range(1000000)]
def create_large_generator():
return (x**2 for x in range(1000000))
print("\nMemory usage: List vs Generator")
# List stores all values in memory
large_list = create_large_list()
print(f"List created with {len(large_list)} items")
# Generator creates values on demand
large_gen = create_large_generator()
print("Generator created (no items stored in memory)")
# Process first 10 items from generator
first_ten = [next(large_gen) for _ in range(10)]
print(f"First 10 from generator: {first_ten}")
if __name__ == "__main__":
# Run the performance demonstrations
demonstrate_performance_optimization()
# Memory optimization
print("\n=== Memory Optimization ===")
memory_optimizer = MemoryOptimizer()
memory_optimizer.demonstrate_slots()
memory_optimizer.demonstrate_generators_memory()
print("\nPerformance optimization demonstration completed!")PythonThis comprehensive code demonstrates key performance and memory optimization techniques in Python. It includes practical examples using decorators for timing and caching, generators, optimized string and dictionary operations, __slots__ for class memory reduction, and profiling tools.
Performance optimization techniques
- Caching with @lru_cache
  - Concept: Memoization stores the results of expensive function calls and returns the cached result when the same inputs occur again. The @lru_cache decorator provides an easy way to apply this technique.
  - Example: fibonacci_cached vs. fibonacci_uncached. The cached version avoids re-calculating the same values, resulting in significantly faster execution for repeated or recursive calls, as the timing_decorator shows. (A cache-inspection sketch appears after this list.)
- Generators vs. List Comprehensions
  - Concept: A list comprehension builds and stores the entire list in memory at once. A generator expression, on the other hand, produces one item at a time and never stores the whole sequence.
  - Performance: For large datasets, generators consume significantly less memory. While the timing_decorator might show similar raw execution times for summing a sequence, the list comprehension's memory footprint is much larger and can lead to performance degradation or out-of-memory errors.
  - Example: process_with_list creates a full list of squares, while process_with_generator uses a generator expression to feed the squares to sum() one by one, without storing them all.
- String Operations
  - Concept: In Python, strings are immutable, so the += operator creates a new string object on every iteration. For long loops this is highly inefficient.
  - Optimization: str.join() is the recommended and fastest way to concatenate a sequence of strings; it allocates memory for the final string efficiently.
  - Example: string_concatenation_slow uses a loop with +=, while string_concatenation_fast uses " ".join(words), demonstrating the performance difference.
- Dictionary Lookups vs. if-elif chains
  - Concept: Dictionary lookups are hash-table lookups with an average time complexity of O(1) (constant time). A long if-elif chain is linear in the number of conditions (O(N)).
  - Performance: For a large number of conditions or lookups, a dictionary is significantly faster and more scalable than a long if-elif chain.
  - Example: process_with_if_elif uses a linear chain of comparisons, while process_with_dict uses a dictionary lookup, which is more efficient, as the timing_decorator demonstrates.
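Following up on the caching item above: functions wrapped with lru_cache expose cache_info() and cache_clear(), which make it easy to confirm the cache is actually being hit. A small self-contained sketch (a standalone fib function rather than the class method above):

from functools import lru_cache

@lru_cache(maxsize=128)
def fib(n: int) -> int:
    """Memoized Fibonacci, used only to illustrate cache statistics."""
    return n if n <= 1 else fib(n - 1) + fib(n - 2)

fib(35)
print(fib.cache_info())   # e.g. CacheInfo(hits=33, misses=36, maxsize=128, currsize=36)
fib.cache_clear()         # reset between benchmark runs so timings stay comparable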
cProfile and pstats
- Concept: cProfile is a built-in profiler that measures the time and frequency of function calls, helping you identify performance bottlenecks. pstats is a module for analyzing and displaying the profiling results.
- Example: The profile_function utility shows how to run a function through the profiler and display the results, which indicate which parts of the code are the most time-consuming. A brief usage sketch follows.
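Beyond the profile_function helper, the same profiler can be driven in a single call, either from code or from the shell. A short sketch (busy_work is a made-up function for illustration):

import cProfile

def busy_work() -> int:
    """A deliberately slow function to give the profiler something to measure."""
    return sum(i * i for i in range(200_000))

# Profile a single call and sort the report by cumulative time:
cProfile.run("busy_work()", sort="cumulative")

# Or profile a whole script from the shell:
#   python -m cProfile -s cumulative my_script.py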
Memory optimization techniques
__slots__ for class instances
- Concept: By default, Python classes store instance attributes in a per-instance dictionary (__dict__), which adds overhead. The __slots__ special attribute tells Python to store instance attributes in a fixed-size array instead, eliminating the __dict__ and reducing memory usage.
- Usage: It's most effective for classes with many instances and a fixed set of attributes.
- Example: demonstrate_slots compares the memory usage of a regular class with a slotted class, showing the potential memory savings. A sketch that actually measures the sizes follows.
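To see the saving rather than just the instance counts, sys.getsizeof can be compared for the two flavors. A rough sketch (exact sizes vary by Python version and platform):

import sys

class RegularPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class SlottedPoint:
    __slots__ = ("x", "y")
    def __init__(self, x, y):
        self.x = x
        self.y = y

r, s = RegularPoint(1, 2), SlottedPoint(1, 2)
# A regular instance also carries a per-instance __dict__; a slotted one does not.
print(sys.getsizeof(r) + sys.getsizeof(r.__dict__), "bytes (regular, incl. __dict__)")
print(sys.getsizeof(s), "bytes (slotted)")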
Generators for memory efficiency
- Concept: As mentioned in the performance section, generators do not store all items in memory, making them ideal for handling large datasets.
- Example: demonstrate_generators_memory illustrates the memory difference between creating a large list and a large generator for the same sequence. The generator uses a tiny, constant amount of memory regardless of the sequence size, as the sketch below shows.
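The difference is easy to observe with sys.getsizeof (exact numbers vary by Python version):

import sys

squares_list = [x ** 2 for x in range(1_000_000)]   # stores one million results
squares_gen = (x ** 2 for x in range(1_000_000))    # stores only the generator state

print(f"list:      {sys.getsizeof(squares_list):>10} bytes")
print(f"generator: {sys.getsizeof(squares_gen):>10} bytes")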
Key takeaways
- Measure first: Use tools like cProfile to identify actual bottlenecks before attempting to optimize.
- Choose the right data structure: Dictionaries are faster than if-elif chains for lookups.
- Optimize loops and string operations: Use str.join() for string concatenation instead of += in a loop.
- Use generators for large sequences: Prefer generator expressions and functions over list comprehensions when the full list isn't needed at once, to save memory.
- Apply memoization: Use @lru_cache to speed up expensive, repeatable function calls.
- Reduce class instance memory: Use __slots__ for classes where memory is a concern.
Conclusion
Congratulations! You’ve completed this comprehensive Python programming guide that takes you from beginner concepts to expert-level topics. This book has covered:
What You’ve Learned
mindmap
root((Python Mastery))
Fundamentals
Variables & Data Types
Control Flow
Functions
Error Handling
Intermediate
OOP Concepts
File Operations
Modules & Packages
Data Structures
Advanced
Generators & Iterators
Decorators
Context Managers
Metaclasses
Specializations
Web Development
Data Science
Testing & Debugging
Performance Optimization

Next Steps for Continued Learning
- Practice Projects: Build real-world applications using the concepts you’ve learned
- Open Source Contributions: Contribute to Python projects on GitHub
- Advanced Topics: Explore asyncio, multiprocessing, and advanced design patterns
- Frameworks: Dive deeper into Django, Flask, FastAPI, or data science libraries
- Community: Join Python communities, attend conferences, and keep learning
Key Takeaways
- Write Clean, Readable Code: Follow PEP 8 and use meaningful variable names
- Test Your Code: Write tests early and often
- Handle Errors Gracefully: Use proper exception handling
- Document Your Work: Write clear docstrings and comments
- Keep Learning: Python is constantly evolving with new features and libraries
Remember, becoming proficient in Python is a journey, not a destination. Keep practicing, building projects, and exploring new areas of Python development. The skills you’ve learned in this book provide a solid foundation for whatever direction your Python journey takes you.
Happy coding! 🐍