MySQL Replication – Altgr Blog

From Beginner to Expert

Introduction to MySQL Replication
Prerequisites and Setup
Understanding Replication Architecture
Binary Logging
Setting Up Master-Slave Replication
Master-Master Replication
MySQL Group Replication
Monitoring and Troubleshooting
Performance Optimization
Security Considerations
Advanced Topics
Best Practices

1. Introduction to MySQL Replication

MySQL replication is a process that enables data from one MySQL database server (the master) to be copied to one or more MySQL database servers (the slaves). This feature is essential for:

High Availability: Ensuring your database remains accessible even if one server fails
Load Distribution: Distributing read queries across multiple servers
Backup Solutions: Creating real-time backups
Data Analysis: Separating analytical workloads from production systems

Types of Replication

graph TB
    A[MySQL Replication Types] --> B[Asynchronous Replication]
    A --> C[Semi-synchronous Replication]
    A --> D[Synchronous Replication]

    B --> B1[Master-Slave]
    B --> B2[Master-Master]

    C --> C1[Enhanced Durability]
    C --> C2[Acknowledged Transactions]

    D --> D1[Group Replication]
    D --> D2[MySQL Cluster]

2. Prerequisites and Setup

System Requirements

Operating System: Linux (Ubuntu/CentOS), Windows, or macOS
MySQL Version: 5.7+ (8.0+ recommended)
Network: Stable network connection between servers
Storage: Sufficient disk space for binary logs

Installing MySQL

Ubuntu/Debian

sudo apt update
sudo apt install mysql-server
sudo mysql_secure_installation

sudo apt update
sudo apt install mysql-server
sudo mysql_secure_installation

Bash

CentOS/RHEL

sudo yum install mysql-server
sudo systemctl start mysqld
sudo systemctl enable mysqld
sudo mysql_secure_installation

sudo yum install mysql-server
sudo systemctl start mysqld
sudo systemctl enable mysqld
sudo mysql_secure_installation

Bash

Basic Configuration

# /etc/mysql/mysql.conf.d/mysqld.cnf or /etc/my.cnf

[mysqld]
server-id = 1
log-bin = mysql-bin
binlog-format = ROW
gtid-mode = ON
enforce-gtid-consistency = ON

# /etc/mysql/mysql.conf.d/mysqld.cnf or /etc/my.cnf

[mysqld]
server-id = 1
log-bin = mysql-bin
binlog-format = ROW
gtid-mode = ON
enforce-gtid-consistency = ON

Bash

3. Understanding Replication Architecture

Basic Master-Slave Architecture

graph LR
    A[Client Applications] --> B[Master Server]
    B --> C[Binary Log]
    C --> D[Slave I/O Thread]
    D --> E[Relay Log]
    E --> F[Slave SQL Thread]
    F --> G[Slave Database]

    subgraph "Master Server"
    B
    C
    end

    subgraph "Slave Server"
    D
    E
    F
    G
    end

Replication Process Flow

sequenceDiagram
    participant C as Client
    participant M as Master
    participant S as Slave

    C->>M: INSERT/UPDATE/DELETE
    M->>M: Execute Query
    M->>M: Write to Binary Log
    S->>M: Request Binary Log Events
    M->>S: Send Binary Log Events
    S->>S: Write to Relay Log
    S->>S: Execute from Relay Log
    S->>S: Update Database

Multi-Master Architecture

graph TB
    A[Master 1] <--> B[Master 2]
    A <--> C[Master 3]
    B <--> C

    A --> D[Slave 1]
    A --> E[Slave 2]
    B --> F[Slave 3]
    C --> G[Slave 4]

    style A fill:#e1f5fe
    style B fill:#e1f5fe
    style C fill:#e1f5fe
    style D fill:#f3e5f5
    style E fill:#f3e5f5
    style F fill:#f3e5f5
    style G fill:#f3e5f5

4. Binary Logging

Binary logs are essential for replication. They contain events that describe database changes.

Binary Log Configuration

[mysqld]
# Enable binary logging
log-bin = /var/log/mysql/mysql-bin

# Binary log format
binlog-format = ROW

# Log slave updates (for master-master setup)
log-slave-updates = ON

# Binary log expiration
expire_logs_days = 7

# Maximum binary log size
max_binlog_size = 100M

[mysqld]
# Enable binary logging
log-bin = /var/log/mysql/mysql-bin

# Binary log format
binlog-format = ROW

# Log slave updates (for master-master setup)
log-slave-updates = ON

# Binary log expiration
expire_logs_days = 7

# Maximum binary log size
max_binlog_size = 100M

Bash

Binary Log Formats

graph TD
    A[Binary Log Formats] --> B[STATEMENT]
    A --> C[ROW]
    A --> D[MIXED]

    B --> B1[Logs SQL statements]
    B --> B2[Smaller log size]
    B --> B3[Non-deterministic issues]

    C --> C1[Logs row changes]
    C --> C2[Larger log size]
    C --> C3[More reliable]

    D --> D1[Automatic selection]
    D --> D2[Best of both worlds]

Managing Binary Logs

-- Show binary logs
SHOW BINARY LOGS;

-- Show current binary log position
SHOW MASTER STATUS;

-- Purge old binary logs
PURGE BINARY LOGS TO 'mysql-bin.000010';
PURGE BINARY LOGS BEFORE '2023-01-01 00:00:00';

-- Show binary log events
SHOW BINLOG EVENTS IN 'mysql-bin.000001';

-- Show binary logs
SHOW BINARY LOGS;

-- Show current binary log position
SHOW MASTER STATUS;

-- Purge old binary logs
PURGE BINARY LOGS TO 'mysql-bin.000010';
PURGE BINARY LOGS BEFORE '2023-01-01 00:00:00';

-- Show binary log events
SHOW BINLOG EVENTS IN 'mysql-bin.000001';

SQL

5. Setting Up Master-Slave Replication

Step 1: Configure the Master

# /etc/mysql/mysql.conf.d/mysqld.cnf
[mysqld]
server-id = 1
log-bin = mysql-bin
binlog-format = ROW
gtid-mode = ON
enforce-gtid-consistency = ON

# /etc/mysql/mysql.conf.d/mysqld.cnf
[mysqld]
server-id = 1
log-bin = mysql-bin
binlog-format = ROW
gtid-mode = ON
enforce-gtid-consistency = ON

INI

Step 2: Create Replication User

-- On Master server
CREATE USER 'replication_user'@'%' IDENTIFIED BY 'strong_password';
GRANT REPLICATION SLAVE ON *.* TO 'replication_user'@'%';
FLUSH PRIVILEGES;

-- On Master server
CREATE USER 'replication_user'@'%' IDENTIFIED BY 'strong_password';
GRANT REPLICATION SLAVE ON *.* TO 'replication_user'@'%';
FLUSH PRIVILEGES;

SQL

Step 3: Get Master Status

-- Lock tables to get consistent snapshot
FLUSH TABLES WITH READ LOCK;

-- Get master status
SHOW MASTER STATUS;
+------------------+----------+--------------+------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000001 |      154 |              |                  |
+------------------+----------+--------------+------------------+

-- Don't unlock yet if taking a backup

-- Lock tables to get consistent snapshot
FLUSH TABLES WITH READ LOCK;

-- Get master status
SHOW MASTER STATUS;
+------------------+----------+--------------+------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000001 |      154 |              |                  |
+------------------+----------+--------------+------------------+

-- Don't unlock yet if taking a backup

SQL

Step 4: Configure the Slave

# /etc/mysql/mysql.conf.d/mysqld.cnf
[mysqld]
server-id = 2
relay-log = mysql-relay-bin
log-bin = mysql-bin
binlog-format = ROW
gtid-mode = ON
enforce-gtid-consistency = ON
read-only = ON

# /etc/mysql/mysql.conf.d/mysqld.cnf
[mysqld]
server-id = 2
relay-log = mysql-relay-bin
log-bin = mysql-bin
binlog-format = ROW
gtid-mode = ON
enforce-gtid-consistency = ON
read-only = ON

INI

Step 5: Start Replication

-- On Slave server
CHANGE MASTER TO
    MASTER_HOST = '192.168.1.100',
    MASTER_USER = 'replication_user',
    MASTER_PASSWORD = 'strong_password',
    MASTER_LOG_FILE = 'mysql-bin.000001',
    MASTER_LOG_POS = 154;

START SLAVE;

-- Check slave status
SHOW SLAVE STATUS\G

-- On Slave server
CHANGE MASTER TO
    MASTER_HOST = '192.168.1.100',
    MASTER_USER = 'replication_user',
    MASTER_PASSWORD = 'strong_password',
    MASTER_LOG_FILE = 'mysql-bin.000001',
    MASTER_LOG_POS = 154;

START SLAVE;

-- Check slave status
SHOW SLAVE STATUS\G

SQL

Step 6: Verify Replication

-- On Master: Unlock tables
UNLOCK TABLES;

-- Test replication
CREATE DATABASE test_replication;
USE test_replication;
CREATE TABLE test_table (id INT PRIMARY KEY, name VARCHAR(50));
INSERT INTO test_table VALUES (1, 'Test Data');

-- On Slave: Verify data
USE test_replication;
SELECT * FROM test_table;

-- On Master: Unlock tables
UNLOCK TABLES;

-- Test replication
CREATE DATABASE test_replication;
USE test_replication;
CREATE TABLE test_table (id INT PRIMARY KEY, name VARCHAR(50));
INSERT INTO test_table VALUES (1, 'Test Data');

-- On Slave: Verify data
USE test_replication;
SELECT * FROM test_table;

SQL

Replication Setup Flow

flowchart TD
    A[Configure Master] --> B[Create Replication User]
    B --> C[Get Master Status]
    C --> D[Configure Slave]
    D --> E[Start Replication]
    E --> F[Verify Replication]

    F --> G{Replication Working?}
    G -->|Yes| H[Setup Complete]
    G -->|No| I[Troubleshoot]
    I --> J[Check Error Logs]
    J --> K[Fix Issues]
    K --> E

6. Master-Master Replication

Master-Master replication allows both servers to act as masters and slaves simultaneously.

Architecture Overview

graph LR
    A[Application Server 1] --> B[MySQL Master 1]
    C[Application Server 2] --> D[MySQL Master 2]

    B <--> D

    subgraph "Data Center 1"
    A
    B
    end

    subgraph "Data Center 2"
    C
    D
    end

Configuration for Master-Master

Server 1 Configuration

[mysqld]
server-id = 1
log-bin = mysql-bin
binlog-format = ROW
auto-increment-increment = 2
auto-increment-offset = 1
log-slave-updates = ON
gtid-mode = ON
enforce-gtid-consistency = ON

[mysqld]
server-id = 1
log-bin = mysql-bin
binlog-format = ROW
auto-increment-increment = 2
auto-increment-offset = 1
log-slave-updates = ON
gtid-mode = ON
enforce-gtid-consistency = ON

INI

Server 2 Configuration

[mysqld]
server-id = 2
log-bin = mysql-bin
binlog-format = ROW
auto-increment-increment = 2
auto-increment-offset = 2
log-slave-updates = ON
gtid-mode = ON
enforce-gtid-consistency = ON

[mysqld]
server-id = 2
log-bin = mysql-bin
binlog-format = ROW
auto-increment-increment = 2
auto-increment-offset = 2
log-slave-updates = ON
gtid-mode = ON
enforce-gtid-consistency = ON

INI

Setup Process

-- On both servers, create replication users
CREATE USER 'replication_user'@'%' IDENTIFIED BY 'strong_password';
GRANT REPLICATION SLAVE ON *.* TO 'replication_user'@'%';
FLUSH PRIVILEGES;

-- On Server 1: Set up replication to Server 2
CHANGE MASTER TO
    MASTER_HOST = '192.168.1.102',
    MASTER_USER = 'replication_user',
    MASTER_PASSWORD = 'strong_password',
    MASTER_AUTO_POSITION = 1;

-- On Server 2: Set up replication to Server 1
CHANGE MASTER TO
    MASTER_HOST = '192.168.1.101',
    MASTER_USER = 'replication_user',
    MASTER_PASSWORD = 'strong_password',
    MASTER_AUTO_POSITION = 1;

-- Start replication on both servers
START SLAVE;

-- On both servers, create replication users
CREATE USER 'replication_user'@'%' IDENTIFIED BY 'strong_password';
GRANT REPLICATION SLAVE ON *.* TO 'replication_user'@'%';
FLUSH PRIVILEGES;

-- On Server 1: Set up replication to Server 2
CHANGE MASTER TO
    MASTER_HOST = '192.168.1.102',
    MASTER_USER = 'replication_user',
    MASTER_PASSWORD = 'strong_password',
    MASTER_AUTO_POSITION = 1;

-- On Server 2: Set up replication to Server 1
CHANGE MASTER TO
    MASTER_HOST = '192.168.1.101',
    MASTER_USER = 'replication_user',
    MASTER_PASSWORD = 'strong_password',
    MASTER_AUTO_POSITION = 1;

-- Start replication on both servers
START SLAVE;

SQL

Conflict Resolution

graph TD
    A[Write Conflict Detected] --> B{Conflict Type}

    B --> C[Primary Key Conflict]
    B --> D[Data Inconsistency]
    B --> E[Auto-increment Conflict]

    C --> F[Use auto-increment offsets]
    D --> G[Application-level resolution]
    E --> H[Configure increment settings]

    F --> I[Resume Replication]
    G --> I
    H --> I

7. MySQL Group Replication

Group Replication provides distributed state machine replication with strong coordination between servers.

Group Replication Architecture

graph TB
    A[Group Replication Cluster]

    subgraph "MySQL Group"
    B[MySQL Node 1Primary]
    C[MySQL Node 2Secondary]
    D[MySQL Node 3Secondary]
    end

    E[Application] --> B
    E --> C
    E --> D

    B <--> C
    B <--> D
    C <--> D

    style B fill:#e8f5e8
    style C fill:#fff2cc
    style D fill:#fff2cc

Configuration

[mysqld]
# Group Replication Configuration
server_id = 1
gtid_mode = ON
enforce_gtid_consistency = ON
binlog_format = ROW
log_slave_updates = ON
binlog_checksum = NONE
slave_preserve_commit_order = ON

# Group Replication specific
plugin_load_add = 'group_replication.so'
group_replication_group_name = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
group_replication_start_on_boot = off
group_replication_local_address = "127.0.0.1:33061"
group_replication_group_seeds = "127.0.0.1:33061,127.0.0.1:33062,127.0.0.1:33063"
group_replication_bootstrap_group = off

[mysqld]
# Group Replication Configuration
server_id = 1
gtid_mode = ON
enforce_gtid_consistency = ON
binlog_format = ROW
log_slave_updates = ON
binlog_checksum = NONE
slave_preserve_commit_order = ON

# Group Replication specific
plugin_load_add = 'group_replication.so'
group_replication_group_name = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
group_replication_start_on_boot = off
group_replication_local_address = "127.0.0.1:33061"
group_replication_group_seeds = "127.0.0.1:33061,127.0.0.1:33062,127.0.0.1:33063"
group_replication_bootstrap_group = off

INI

Setting Up Group Replication

-- Install the plugin
INSTALL PLUGIN group_replication SONAME 'group_replication.so';

-- Create replication user
SET SQL_LOG_BIN=0;
CREATE USER 'repl'@'%' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';
FLUSH PRIVILEGES;
SET SQL_LOG_BIN=1;

-- Configure the group replication recovery channel
CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='password' FOR CHANNEL 'group_replication_recovery';

-- Bootstrap the group (on first node only)
SET GLOBAL group_replication_bootstrap_group=ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group=OFF;

-- Join other nodes to the group
START GROUP_REPLICATION;

-- Install the plugin
INSTALL PLUGIN group_replication SONAME 'group_replication.so';

-- Create replication user
SET SQL_LOG_BIN=0;
CREATE USER 'repl'@'%' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';
FLUSH PRIVILEGES;
SET SQL_LOG_BIN=1;

-- Configure the group replication recovery channel
CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='password' FOR CHANNEL 'group_replication_recovery';

-- Bootstrap the group (on first node only)
SET GLOBAL group_replication_bootstrap_group=ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group=OFF;

-- Join other nodes to the group
START GROUP_REPLICATION;

SQL

Group Replication States

stateDiagram-v2
    [*] --> OFFLINE
    OFFLINE --> RECOVERING: START GROUP_REPLICATION
    RECOVERING --> ONLINE: Recovery Complete
    ONLINE --> ERROR: Network/Data Issues
    ERROR --> RECOVERING: Resolve Issues
    ONLINE --> OFFLINE: STOP GROUP_REPLICATION
    ERROR --> OFFLINE: STOP GROUP_REPLICATION

    ONLINE --> UNREACHABLE: Network Partition
    UNREACHABLE --> ONLINE: Network Restored
    UNREACHABLE --> OFFLINE: Manual Intervention

8. Monitoring and Troubleshooting

Key Monitoring Metrics

mindmap
  root((MySQL Replication Monitoring))
    Lag Metrics
      Seconds Behind Master
      GTID Lag
      Binary Log Position

    Health Metrics
      Slave IO Running
      Slave SQL Running
      Last Error

    Performance Metrics
      Replication Throughput
      Binary Log Size
      Relay Log Size

    System Metrics
      Network Latency
      Disk I/O
      CPU Usage

Essential Monitoring Queries

-- Check slave status
SHOW SLAVE STATUS\G

-- Check replication lag
SELECT 
    CHANNEL_NAME,
    SERVICE_STATE,
    LAST_ERROR_MESSAGE,
    LAST_ERROR_TIMESTAMP
FROM performance_schema.replication_connection_status;

-- Monitor GTID progress
SHOW GLOBAL VARIABLES LIKE 'gtid_executed';
SHOW GLOBAL VARIABLES LIKE 'gtid_purged';

-- Check binary log usage
SELECT 
    ROUND(SUM(data_length + index_length) / 1024 / 1024, 1) AS 'DB Size in MB' 
FROM information_schema.tables;

-- Monitor replication threads
SELECT * FROM performance_schema.replication_applier_status;

-- Check slave status
SHOW SLAVE STATUS\G

-- Check replication lag
SELECT 
    CHANNEL_NAME,
    SERVICE_STATE,
    LAST_ERROR_MESSAGE,
    LAST_ERROR_TIMESTAMP
FROM performance_schema.replication_connection_status;

-- Monitor GTID progress
SHOW GLOBAL VARIABLES LIKE 'gtid_executed';
SHOW GLOBAL VARIABLES LIKE 'gtid_purged';

-- Check binary log usage
SELECT 
    ROUND(SUM(data_length + index_length) / 1024 / 1024, 1) AS 'DB Size in MB' 
FROM information_schema.tables;

-- Monitor replication threads
SELECT * FROM performance_schema.replication_applier_status;

SQL

Common Issues and Solutions

Replication Lag

flowchart TD
    A[High Replication Lag] --> B{Identify Cause}

    B --> C[Network Issues]
    B --> D[Slave Hardware]
    B --> E[Large Transactions]
    B --> F[Binary Log Format]

    C --> C1[Check network latency]
    C --> C2[Increase bandwidth]

    D --> D1[Upgrade hardware]
    D --> D2[Optimize queries]

    E --> E1[Break large transactions]
    E --> E2[Use smaller batches]

    F --> F1[Consider ROW format]
    F --> F2[Optimize binlog settings]

Slave Synchronization Issues

-- Skip problematic statement (use with caution)
SET GLOBAL sql_slave_skip_counter = 1;
START SLAVE;

-- Reset slave to specific position
STOP SLAVE;
CHANGE MASTER TO MASTER_LOG_POS = position_number;
START SLAVE;

-- Reset slave completely
STOP SLAVE;
RESET SLAVE ALL;
-- Reconfigure and start replication

-- Skip problematic statement (use with caution)
SET GLOBAL sql_slave_skip_counter = 1;
START SLAVE;

-- Reset slave to specific position
STOP SLAVE;
CHANGE MASTER TO MASTER_LOG_POS = position_number;
START SLAVE;

-- Reset slave completely
STOP SLAVE;
RESET SLAVE ALL;
-- Reconfigure and start replication

SQL

Monitoring Scripts

Bash Monitoring Script

#!/bin/bash
# mysql_replication_monitor.sh

MYSQL_USER="monitor_user"
MYSQL_PASS="monitor_password"
MYSQL_HOST="localhost"

# Check slave status
SLAVE_STATUS=$(mysql -u${MYSQL_USER} -p${MYSQL_PASS} -h${MYSQL_HOST} -e "SHOW SLAVE STATUS\G")

# Extract key metrics
IO_RUNNING=$(echo "$SLAVE_STATUS" | grep "Slave_IO_Running" | awk '{print $2}')
SQL_RUNNING=$(echo "$SLAVE_STATUS" | grep "Slave_SQL_Running" | awk '{print $2}')
SECONDS_BEHIND=$(echo "$SLAVE_STATUS" | grep "Seconds_Behind_Master" | awk '{print $2}')

echo "=== MySQL Replication Status ==="
echo "IO Thread Running: $IO_RUNNING"
echo "SQL Thread Running: $SQL_RUNNING"
echo "Seconds Behind Master: $SECONDS_BEHIND"

# Alert if replication is broken
if [ "$IO_RUNNING" != "Yes" ] || [ "$SQL_RUNNING" != "Yes" ]; then
    echo "ALERT: Replication is broken!"
    exit 1
fi

# Alert if lag is too high
if [ "$SECONDS_BEHIND" -gt 60 ]; then
    echo "WARNING: High replication lag detected!"
    exit 1
fi

echo "Replication is healthy"

#!/bin/bash
# mysql_replication_monitor.sh

MYSQL_USER="monitor_user"
MYSQL_PASS="monitor_password"
MYSQL_HOST="localhost"

# Check slave status
SLAVE_STATUS=$(mysql -u${MYSQL_USER} -p${MYSQL_PASS} -h${MYSQL_HOST} -e "SHOW SLAVE STATUS\G")

# Extract key metrics
IO_RUNNING=$(echo "$SLAVE_STATUS" | grep "Slave_IO_Running" | awk '{print $2}')
SQL_RUNNING=$(echo "$SLAVE_STATUS" | grep "Slave_SQL_Running" | awk '{print $2}')
SECONDS_BEHIND=$(echo "$SLAVE_STATUS" | grep "Seconds_Behind_Master" | awk '{print $2}')

echo "=== MySQL Replication Status ==="
echo "IO Thread Running: $IO_RUNNING"
echo "SQL Thread Running: $SQL_RUNNING"
echo "Seconds Behind Master: $SECONDS_BEHIND"

# Alert if replication is broken
if [ "$IO_RUNNING" != "Yes" ] || [ "$SQL_RUNNING" != "Yes" ]; then
    echo "ALERT: Replication is broken!"
    exit 1
fi

# Alert if lag is too high
if [ "$SECONDS_BEHIND" -gt 60 ]; then
    echo "WARNING: High replication lag detected!"
    exit 1
fi

echo "Replication is healthy"

Bash

9. Performance Optimization

Optimization Strategies

graph TD
    A[Performance Optimization] --> B[Hardware Optimization]
    A --> C[Configuration Tuning]
    A --> D[Application Optimization]

    B --> B1[SSD Storage]
    B --> B2[More RAM]
    B --> B3[Better Network]

    C --> C1[Binary Log Settings]
    C --> C2[InnoDB Configuration]
    C --> C3[Replication Settings]

    D --> D1[Batch Operations]
    D --> D2[Read-Write Splitting]
    D --> D3[Connection Pooling]

Key Configuration Parameters

[mysqld]
# InnoDB settings for better replication performance
innodb_buffer_pool_size = 70% of RAM
innodb_log_file_size = 256M
innodb_log_buffer_size = 64M
innodb_flush_log_at_trx_commit = 2

# Binary log optimization
sync_binlog = 0
binlog_cache_size = 32K
max_binlog_cache_size = 2G

# Replication optimization
slave_parallel_workers = 4
slave_parallel_type = LOGICAL_CLOCK
slave_preserve_commit_order = ON
relay_log_recovery = ON

# Network optimization
max_allowed_packet = 128M
net_buffer_length = 32K

[mysqld]
# InnoDB settings for better replication performance
innodb_buffer_pool_size = 70% of RAM
innodb_log_file_size = 256M
innodb_log_buffer_size = 64M
innodb_flush_log_at_trx_commit = 2

# Binary log optimization
sync_binlog = 0
binlog_cache_size = 32K
max_binlog_cache_size = 2G

# Replication optimization
slave_parallel_workers = 4
slave_parallel_type = LOGICAL_CLOCK
slave_preserve_commit_order = ON
relay_log_recovery = ON

# Network optimization
max_allowed_packet = 128M
net_buffer_length = 32K

INI

Parallel Replication

sequenceDiagram
    participant M as Master
    participant S as Slave Coordinator
    participant W1 as Worker 1
    participant W2 as Worker 2
    participant W3 as Worker 3
    participant W4 as Worker 4

    M->>S: Binary Log Events
    S->>S: Analyze Dependencies
    S->>W1: Database A Events
    S->>W2: Database B Events
    S->>W3: Database C Events
    S->>W4: Database D Events

    par
        W1->>W1: Apply Events
    and
        W2->>W2: Apply Events
    and
        W3->>W3: Apply Events
    and
        W4->>W4: Apply Events
    end

Performance Monitoring Queries

-- Check slave worker threads
SELECT 
    CHANNEL_NAME,
    WORKER_ID,
    SERVICE_STATE,
    LAST_ERROR_MESSAGE
FROM performance_schema.replication_applier_status_by_worker;

-- Monitor replication throughput
SELECT 
    EVENT_NAME,
    COUNT_STAR,
    SUM_TIMER_WAIT/1000000000000 AS 'Total Time (sec)',
    AVG_TIMER_WAIT/1000000000000 AS 'Avg Time (sec)'
FROM performance_schema.events_statements_summary_global_by_event_name
WHERE EVENT_NAME LIKE '%Binlog%';

-- Check binary log cache hit ratio
SHOW GLOBAL STATUS LIKE 'Binlog_cache%';

-- Check slave worker threads
SELECT 
    CHANNEL_NAME,
    WORKER_ID,
    SERVICE_STATE,
    LAST_ERROR_MESSAGE
FROM performance_schema.replication_applier_status_by_worker;

-- Monitor replication throughput
SELECT 
    EVENT_NAME,
    COUNT_STAR,
    SUM_TIMER_WAIT/1000000000000 AS 'Total Time (sec)',
    AVG_TIMER_WAIT/1000000000000 AS 'Avg Time (sec)'
FROM performance_schema.events_statements_summary_global_by_event_name
WHERE EVENT_NAME LIKE '%Binlog%';

-- Check binary log cache hit ratio
SHOW GLOBAL STATUS LIKE 'Binlog_cache%';

SQL

10. Security Considerations

Security Architecture

SSL/TLS Configuration

graph TB
    subgraph "Security Layers"
    A[Network Security] --> B[Authentication]
    B --> C[Authorization]
    C --> D[Encryption]
    D --> E[Auditing]
    end

    A --> A1[Firewall Rules]
    A --> A2[VPN/Private Networks]

    B --> B1[Strong Passwords]
    B --> B2[Certificate Authentication]

    C --> C1[Least Privilege]
    C --> C2[Role-based Access]

    D --> D1[SSL/TLS]
    D --> D2[Binary Log Encryption]

    E --> E1[Connection Logging]
    E --> E2[Change Tracking]

Master Configuration

[mysqld]
# SSL Configuration
ssl-ca = /etc/mysql/ssl/ca-cert.pem
ssl-cert = /etc/mysql/ssl/server-cert.pem
ssl-key = /etc/mysql/ssl/server-key.pem
require_secure_transport = ON

[mysqld]
# SSL Configuration
ssl-ca = /etc/mysql/ssl/ca-cert.pem
ssl-cert = /etc/mysql/ssl/server-cert.pem
ssl-key = /etc/mysql/ssl/server-key.pem
require_secure_transport = ON

INI

Slave Configuration

-- Configure SSL for replication
CHANGE MASTER TO
    MASTER_HOST = '192.168.1.100',
    MASTER_USER = 'replication_user',
    MASTER_PASSWORD = 'strong_password',
    MASTER_SSL = 1,
    MASTER_SSL_CA = '/etc/mysql/ssl/ca-cert.pem',
    MASTER_SSL_CERT = '/etc/mysql/ssl/client-cert.pem',
    MASTER_SSL_KEY = '/etc/mysql/ssl/client-key.pem';

-- Configure SSL for replication
CHANGE MASTER TO
    MASTER_HOST = '192.168.1.100',
    MASTER_USER = 'replication_user',
    MASTER_PASSWORD = 'strong_password',
    MASTER_SSL = 1,
    MASTER_SSL_CA = '/etc/mysql/ssl/ca-cert.pem',
    MASTER_SSL_CERT = '/etc/mysql/ssl/client-cert.pem',
    MASTER_SSL_KEY = '/etc/mysql/ssl/client-key.pem';

SQL

Security Best Practices

-- Create dedicated replication user with minimal privileges
CREATE USER 'repl_user'@'slave_ip' IDENTIFIED BY 'complex_password';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'slave_ip';

-- Use certificate-based authentication
CREATE USER 'repl_user'@'slave_ip' IDENTIFIED WITH authentication_string 
REQUIRE SSL 
AND X509 
AND ISSUER '/C=US/ST=State/L=City/O=Organization/OU=Unit/CN=CA';

-- Enable binary log encryption (MySQL 8.0+)
SET GLOBAL binlog_encryption = ON;

-- Create dedicated replication user with minimal privileges
CREATE USER 'repl_user'@'slave_ip' IDENTIFIED BY 'complex_password';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'slave_ip';

-- Use certificate-based authentication
CREATE USER 'repl_user'@'slave_ip' IDENTIFIED WITH authentication_string 
REQUIRE SSL 
AND X509 
AND ISSUER '/C=US/ST=State/L=City/O=Organization/OU=Unit/CN=CA';

-- Enable binary log encryption (MySQL 8.0+)
SET GLOBAL binlog_encryption = ON;

SQL

Network Security

# Firewall rules for MySQL replication
# Allow MySQL port from specific IPs only
iptables -A INPUT -p tcp --dport 3306 -s 192.168.1.100 -j ACCEPT
iptables -A INPUT -p tcp --dport 3306 -j DROP

# For Group Replication (additional ports)
iptables -A INPUT -p tcp --dport 33061 -s 192.168.1.0/24 -j ACCEPT

# Firewall rules for MySQL replication
# Allow MySQL port from specific IPs only
iptables -A INPUT -p tcp --dport 3306 -s 192.168.1.100 -j ACCEPT
iptables -A INPUT -p tcp --dport 3306 -j DROP

# For Group Replication (additional ports)
iptables -A INPUT -p tcp --dport 33061 -s 192.168.1.0/24 -j ACCEPT

Bash

11. Advanced Topics

Global Transaction Identifiers (GTIDs)

graph TD
    A[GTID Components] --> B[Source UUID]
    A --> C[Transaction Number]

    B --> B1[Server Identifier]
    B --> B2[Unique per server]

    C --> C1[Sequential number]
    C --> C2[Per transaction]

    D[GTID Format] --> E["source_id:transaction_id"]
    E --> F["Example: 3E11FA47-71CA-11E1-9E33-C80AA9429562:23"]

GTID Configuration

[mysqld]
gtid_mode = ON
enforce_gtid_consistency = ON
log_slave_updates = ON

[mysqld]
gtid_mode = ON
enforce_gtid_consistency = ON
log_slave_updates = ON

INI

GTID Operations

-- Check GTID status
SHOW GLOBAL VARIABLES LIKE 'gtid%';

-- View executed GTIDs
SELECT @@global.gtid_executed;

-- Set up GTID-based replication
CHANGE MASTER TO
    MASTER_HOST = 'master_host',
    MASTER_USER = 'repl_user',
    MASTER_PASSWORD = 'password',
    MASTER_AUTO_POSITION = 1;

-- Check GTID status
SHOW GLOBAL VARIABLES LIKE 'gtid%';

-- View executed GTIDs
SELECT @@global.gtid_executed;

-- Set up GTID-based replication
CHANGE MASTER TO
    MASTER_HOST = 'master_host',
    MASTER_USER = 'repl_user',
    MASTER_PASSWORD = 'password',
    MASTER_AUTO_POSITION = 1;

SQL

Point-in-Time Recovery with Replication

timeline
    title Point-in-Time Recovery Process

    section Backup
        Daily Backup    : Full backup created

    section Failure
        Data Corruption : Problem detected

    section Recovery
        Restore Backup  : Apply full backup
        Apply Binlogs   : Replay binary logs
        Specific Point  : Stop at desired time

    section Validation
        Verify Data     : Check data integrity
        Resume Ops      : Resume operations

Recovery Process

# 1. Restore from backup
mysql < full_backup.sql

# 2. Apply binary logs up to specific point
mysqlbinlog --start-datetime="2023-01-01 09:00:00" \
            --stop-datetime="2023-01-01 10:30:00" \
            mysql-bin.000001 mysql-bin.000002 | mysql

# 3. Verify and resume replication

# 1. Restore from backup
mysql < full_backup.sql

# 2. Apply binary logs up to specific point
mysqlbinlog --start-datetime="2023-01-01 09:00:00" \
            --stop-datetime="2023-01-01 10:30:00" \
            mysql-bin.000001 mysql-bin.000002 | mysql

# 3. Verify and resume replication

Bash

Multi-Source Replication

graph TB
    A[Master 1Sales DB] --> D[Consolidated Slave]
    B[Master 2Inventory DB] --> D
    C[Master 3Customer DB] --> D

    D --> E[Data Warehouse]
    D --> F[Reporting Server]

    style A fill:#ff9999
    style B fill:#99ff99
    style C fill:#9999ff
    style D fill:#ffff99

Configuration

-- Set up multiple replication channels
CHANGE MASTER TO 
    MASTER_HOST='sales_master',
    MASTER_USER='repl_user',
    MASTER_PASSWORD='password'
    FOR CHANNEL 'sales_channel';

CHANGE MASTER TO 
    MASTER_HOST='inventory_master',
    MASTER_USER='repl_user', 
    MASTER_PASSWORD='password'
    FOR CHANNEL 'inventory_channel';

-- Start specific channels
START SLAVE FOR CHANNEL 'sales_channel';
START SLAVE FOR CHANNEL 'inventory_channel';

-- Set up multiple replication channels
CHANGE MASTER TO 
    MASTER_HOST='sales_master',
    MASTER_USER='repl_user',
    MASTER_PASSWORD='password'
    FOR CHANNEL 'sales_channel';

CHANGE MASTER TO 
    MASTER_HOST='inventory_master',
    MASTER_USER='repl_user', 
    MASTER_PASSWORD='password'
    FOR CHANNEL 'inventory_channel';

-- Start specific channels
START SLAVE FOR CHANNEL 'sales_channel';
START SLAVE FOR CHANNEL 'inventory_channel';

SQL

Delayed Replication

-- Set up delayed replication (delayed by 1 hour)
CHANGE MASTER TO 
    MASTER_HOST='master_host',
    MASTER_USER='repl_user',
    MASTER_PASSWORD='password',
    MASTER_DELAY = 3600;

-- Check delay status
SHOW SLAVE STATUS\G

-- Set up delayed replication (delayed by 1 hour)
CHANGE MASTER TO 
    MASTER_HOST='master_host',
    MASTER_USER='repl_user',
    MASTER_PASSWORD='password',
    MASTER_DELAY = 3600;

-- Check delay status
SHOW SLAVE STATUS\G

SQL

12. Best Practices

Replication Best Practices Checklist

mindmap
  root((Best Practices))
    Planning
      Capacity Planning
      Network Design
      Security Planning
      Disaster Recovery

    Configuration
      GTID Usage
      Binary Log Format
      Parallel Workers
      SSL Encryption

    Monitoring
      Lag Monitoring
      Error Alerting
      Performance Metrics
      Health Checks

    Maintenance
      Log Rotation
      Regular Testing
      Backup Verification
      Documentation

1. Planning and Design

Capacity Planning

Network Bandwidth: Ensure sufficient bandwidth for replication traffic
Storage: Plan for binary log storage (typically 10-20% of data size)
Hardware: Size slave servers appropriately for workload

Network Design

graph TB
    subgraph "Production Network"
    A[Master Server]
    B[Application Servers]
    end

    subgraph "Replication Network"
    C[Dedicated Network]
    D[Slave Servers]
    end

    A -.-> C
    C --> D
    B --> A

    style A fill:#ff9999
    style D fill:#99ff99
    style C fill:#ffff99

2. Configuration Best Practices

# Recommended production configuration
[mysqld]
# Server identification
server_id = 1

# Binary logging
log_bin = mysql-bin
binlog_format = ROW
sync_binlog = 1
expire_logs_days = 7

# GTID
gtid_mode = ON
enforce_gtid_consistency = ON

# Replication
log_slave_updates = ON
relay_log_recovery = ON
slave_parallel_workers = 4
slave_parallel_type = LOGICAL_CLOCK

# InnoDB
innodb_flush_log_at_trx_commit = 1
innodb_buffer_pool_size = 70% of RAM

# Security
require_secure_transport = ON

# Recommended production configuration
[mysqld]
# Server identification
server_id = 1

# Binary logging
log_bin = mysql-bin
binlog_format = ROW
sync_binlog = 1
expire_logs_days = 7

# GTID
gtid_mode = ON
enforce_gtid_consistency = ON

# Replication
log_slave_updates = ON
relay_log_recovery = ON
slave_parallel_workers = 4
slave_parallel_type = LOGICAL_CLOCK

# InnoDB
innodb_flush_log_at_trx_commit = 1
innodb_buffer_pool_size = 70% of RAM

# Security
require_secure_transport = ON

INI

3. Monitoring and Alerting

Key Metrics to Monitor

-- Create monitoring view
CREATE VIEW replication_health AS
SELECT 
    'Replication Status' as metric,
    CASE 
        WHEN IO_THREAD = 'Yes' AND SQL_THREAD = 'Yes' 
        THEN 'Healthy' 
        ELSE 'Broken' 
    END as status,
    SECONDS_BEHIND_MASTER as lag_seconds,
    LAST_ERROR as last_error
FROM (
    SELECT 
        VALUE as IO_THREAD
    FROM performance_schema.global_status 
    WHERE VARIABLE_NAME = 'Slave_IO_Running'
) io
CROSS JOIN (
    SELECT 
        VALUE as SQL_THREAD  
    FROM performance_schema.global_status
    WHERE VARIABLE_NAME = 'Slave_SQL_Running'
) sql;

-- Create monitoring view
CREATE VIEW replication_health AS
SELECT 
    'Replication Status' as metric,
    CASE 
        WHEN IO_THREAD = 'Yes' AND SQL_THREAD = 'Yes' 
        THEN 'Healthy' 
        ELSE 'Broken' 
    END as status,
    SECONDS_BEHIND_MASTER as lag_seconds,
    LAST_ERROR as last_error
FROM (
    SELECT 
        VALUE as IO_THREAD
    FROM performance_schema.global_status 
    WHERE VARIABLE_NAME = 'Slave_IO_Running'
) io
CROSS JOIN (
    SELECT 
        VALUE as SQL_THREAD  
    FROM performance_schema.global_status
    WHERE VARIABLE_NAME = 'Slave_SQL_Running'
) sql;

SQL

Alerting Thresholds

Replication Lag: Alert if > 60 seconds
Thread Status: Alert immediately if IO or SQL thread stops
Error Rate: Alert on any replication errors
Disk Space: Alert when binary log disk usage > 80%

4. Backup and Recovery

Backup Strategy

graph LR
    A[Master Backup] --> B[Mysqldump]
    A --> C[Physical Backup]
    A --> D[Binary Logs]

    E[Slave Backup] --> F[Read-only Backup]
    E --> G[Consistent Snapshot]

    B --> H[Point-in-time Recovery]
    C --> H
    D --> H

Backup Commands

# Consistent backup with binary log position
mysqldump --single-transaction --routines --triggers --master-data=2 \
          --all-databases > backup_$(date +%Y%m%d_%H%M%S).sql

# Binary log backup
mysqlbinlog --result-file=binlog_backup.sql mysql-bin.000001

# Consistent backup with binary log position
mysqldump --single-transaction --routines --triggers --master-data=2 \
          --all-databases > backup_$(date +%Y%m%d_%H%M%S).sql

# Binary log backup
mysqlbinlog --result-file=binlog_backup.sql mysql-bin.000001

Bash

5. Testing and Validation

Regular Testing Checklist

Failover testing
Backup restoration testing
Replication lag under load
Network partition recovery
Point-in-time recovery
Security penetration testing

Automated Testing Script

#!/bin/bash
# replication_test.sh

# Test basic connectivity
mysql -h slave_host -u test_user -p -e "SELECT 1"

# Test replication lag
CURRENT_TIME=$(date +%s)
mysql -h master_host -u test_user -p -e "INSERT INTO test.heartbeat VALUES (${CURRENT_TIME})"

sleep 5

SLAVE_TIME=$(mysql -h slave_host -u test_user -p -e "SELECT MAX(timestamp) FROM test.heartbeat" -N)
LAG=$((CURRENT_TIME - SLAVE_TIME))

if [ $LAG -gt 10 ]; then
    echo "ALERT: Replication lag is ${LAG} seconds"
    exit 1
fi

echo "Replication test passed. Lag: ${LAG} seconds"

#!/bin/bash
# replication_test.sh

# Test basic connectivity
mysql -h slave_host -u test_user -p -e "SELECT 1"

# Test replication lag
CURRENT_TIME=$(date +%s)
mysql -h master_host -u test_user -p -e "INSERT INTO test.heartbeat VALUES (${CURRENT_TIME})"

sleep 5

SLAVE_TIME=$(mysql -h slave_host -u test_user -p -e "SELECT MAX(timestamp) FROM test.heartbeat" -N)
LAG=$((CURRENT_TIME - SLAVE_TIME))

if [ $LAG -gt 10 ]; then
    echo "ALERT: Replication lag is ${LAG} seconds"
    exit 1
fi

echo "Replication test passed. Lag: ${LAG} seconds"

Bash

6. Troubleshooting Workflow

flowchart TD
    A[Replication Issue Detected] --> B{Check Slave Status}

    B --> C[IO Thread Stopped]
    B --> D[SQL Thread Stopped] 
    B --> E[High Lag]
    B --> F[Duplicate Key Error]

    C --> C1[Check network connectivity]
    C --> C2[Verify master status]
    C --> C3[Check credentials]

    D --> D1[Check for table locks]
    D --> D2[Review error logs]
    D --> D3[Check disk space]

    E --> E1[Monitor system resources]
    E --> E2[Check for large transactions]
    E --> E3[Verify network bandwidth]

    F --> F1[Identify conflicting data]
    F --> F2[Skip duplicate or fix data]
    F --> F3[Resume replication]

    C1 --> G[Resolve and Restart]
    C2 --> G
    C3 --> G
    D1 --> G
    D2 --> G
    D3 --> G
    E1 --> H[Optimize and Monitor]
    E2 --> H
    E3 --> H
    F1 --> I[Data Consistency Check]
    F2 --> I
    F3 --> I

7. Documentation Standards

Required Documentation

Network topology diagrams
Configuration file templates
Runbook procedures
Contact information
Escalation procedures
Change management process

Configuration Management

# Version control for MySQL configurations
git init mysql-configs
cd mysql-configs

# Store configurations
cp /etc/mysql/mysql.conf.d/mysqld.cnf master.cnf
cp /etc/mysql/mysql.conf.d/mysqld.cnf slave.cnf

git add *.cnf
git commit -m "Initial MySQL replication configuration"

# Version control for MySQL configurations
git init mysql-configs
cd mysql-configs

# Store configurations
cp /etc/mysql/mysql.conf.d/mysqld.cnf master.cnf
cp /etc/mysql/mysql.conf.d/mysqld.cnf slave.cnf

git add *.cnf
git commit -m "Initial MySQL replication configuration"

Bash

Conclusion

MySQL replication is a powerful feature that provides high availability, scalability, and disaster recovery capabilities. This comprehensive guide has covered:

Basic Concepts: Understanding replication fundamentals
Setup Procedures: Step-by-step configuration guides
Advanced Topics: GTID, Group Replication, and complex topologies
Operations: Monitoring, troubleshooting, and maintenance
Best Practices: Production-ready recommendations

Key Takeaways

Plan Thoroughly: Proper planning prevents performance problems
Monitor Continuously: Proactive monitoring prevents issues
Test Regularly: Regular testing ensures reliability
Document Everything: Good documentation saves time
Stay Updated: Keep up with MySQL replication improvements

Next Steps

Practice with test environments
Implement monitoring solutions
Develop disaster recovery procedures
Consider MySQL 8.0 features like Group Replication
Explore cloud-based replication solutions

For continued learning, refer to the official MySQL documentation and consider MySQL certification programs.

This guide serves as a comprehensive reference for MySQL replication. Always test configurations in development environments before applying to production systems.

Discover more from Altgr Blog

Subscribe to get the latest posts sent to your email.

Table of Contents

1. Introduction to MySQL Replication

Types of Replication

2. Prerequisites and Setup

System Requirements

Installing MySQL

Ubuntu/Debian

CentOS/RHEL

Basic Configuration

3. Understanding Replication Architecture

Basic Master-Slave Architecture

Replication Process Flow

Multi-Master Architecture

4. Binary Logging

Binary Log Configuration

Binary Log Formats

Managing Binary Logs

5. Setting Up Master-Slave Replication

Step 1: Configure the Master

Step 2: Create Replication User

Step 3: Get Master Status

Step 4: Configure the Slave

Step 5: Start Replication

Step 6: Verify Replication

Replication Setup Flow

6. Master-Master Replication

Architecture Overview

Configuration for Master-Master

Server 1 Configuration

Server 2 Configuration

Setup Process

Conflict Resolution

7. MySQL Group Replication

Group Replication Architecture

Configuration

Setting Up Group Replication

Group Replication States

8. Monitoring and Troubleshooting

Key Monitoring Metrics

Essential Monitoring Queries

Common Issues and Solutions

Replication Lag

Slave Synchronization Issues

Monitoring Scripts

Bash Monitoring Script

9. Performance Optimization

Optimization Strategies

Key Configuration Parameters

Parallel Replication

Performance Monitoring Queries

10. Security Considerations

Security Architecture

SSL/TLS Configuration

Master Configuration

Slave Configuration

Security Best Practices

Network Security

11. Advanced Topics

Global Transaction Identifiers (GTIDs)

GTID Configuration

GTID Operations

Point-in-Time Recovery with Replication

Recovery Process

Multi-Source Replication

Configuration

Delayed Replication

12. Best Practices

Replication Best Practices Checklist

1. Planning and Design

Capacity Planning

Network Design

2. Configuration Best Practices

3. Monitoring and Alerting

Key Metrics to Monitor

Alerting Thresholds

4. Backup and Recovery

Backup Strategy

Backup Commands

5. Testing and Validation

Regular Testing Checklist