Deploying Production-Ready Kubernetes with Ansible

    1. Introduction

    This guide provides a comprehensive approach to deploying and managing production-grade Kubernetes clusters using Ansible automation. We’ll cover everything from basic setup to advanced configurations including high availability, monitoring, logging, GitOps, and security hardening.

    2. System Architecture

    graph TD
        subgraph "Control Plane"
            A[API Server] --> B[etcd]
            A --> C[Controller Manager]
            A --> D[Scheduler]
        end
    
        subgraph "Worker Nodes"
            E[kubelet] --> F[Container Runtime]
            G[kube-proxy] --> H[Pod Network]
        end
    
        subgraph "Infrastructure Services"
            I[Ingress Controller]
            J[Monitoring Stack]
            K[Logging Stack]
            L[GitOps Controller]
            M[Backup System]
            N[Secrets Management]
        end
    
        subgraph "Ansible Control"
            O[Ansible Controller] --> P[Inventory]
            O --> Q[Playbooks]
            O --> R[Roles]
            O --> S[Variables]
        end
    
        O -->|"Control Plane"| A
        O -->|"Worker Nodes"| E
        O -->|"Infrastructure Services"| I

    3. Prerequisites

    Hardware Requirements

    • Control Plane Nodes: 2+ CPUs, 4GB+ RAM, 50GB+ storage
    • Worker Nodes: 4+ CPUs, 8GB+ RAM, 100GB+ storage
    • Ansible Controller: 2+ CPUs, 4GB+ RAM

    OS Requirements

    • Ubuntu 22.04 LTS or CentOS/RHEL 8+ (all nodes)
    • Python 3.8+ (Ansible controller)

    Network Requirements

    • All nodes must have unique hostnames, MAC addresses, and product_uuids
    • Disable swap on all Kubernetes nodes
    • Ensure connectivity between all nodes (TCP ports 6443, 2379-2380, 10250-10252)
    • Unique subnet for pod networking (e.g., 10.244.0.0/16)
    • Load balancer for API server (for HA setup)

    Package Requirements

    # On Ansible controller
    sudo apt update
    sudo apt install -y python3-pip
    pip3 install ansible netaddr jmespath
    Bash
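
    The playbooks in this guide rely on the kubernetes.core and community.general collections (installed explicitly in section 17). As a small sketch, assuming you keep a requirements.yml at the project root, you can pin them alongside the code:

    ---
    # requirements.yml (assumed location: project root)
    # Install with: ansible-galaxy collection install -r requirements.yml
    collections:
      - name: kubernetes.core
      - name: community.general
    YAML

    The kubernetes.core modules also need the kubernetes Python client (pip3 install kubernetes) on whichever host actually executes them (in this guide, the first control-plane node).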

    4. Ansible Project Structure

    graph TD
        A[ansible-kubernetes] --> B[inventories]
        A --> C[roles]
        A --> D[playbooks]
        A --> E[group_vars]
        A --> F[host_vars]
        A --> G[files]
    
        B --> B1[production]
        B --> B2[staging]
    
        C --> C1[common]
        C --> C2[kubernetes]
        C --> C3[containerd]
        C --> C4[networking]
        C --> C5[monitoring]
        C --> C6[logging]
        C --> C7[ingress]
        C --> C8[gitops]
        C --> C9[backup]
        C --> C10[secrets]
    
        D --> D1[site.yml]
        D --> D2[kubernetes.yml]
        D --> D3[addons.yml]

    Create Base Structure

    mkdir -p ansible-kubernetes/{inventories/{production,staging},roles,playbooks,group_vars,host_vars,files}
    cd ansible-kubernetes
    Bash

    5. Setting Up Inventory and Variables

    Inventory Structure

    ---
    all:
      children:
        kubernetes:
          children:
            control_plane:
              hosts:
                master01:
                  ansible_host: 192.168.1.101
                master02:
                  ansible_host: 192.168.1.102
                master03:
                  ansible_host: 192.168.1.103
            workers:
              hosts:
                worker01:
                  ansible_host: 192.168.1.111
                worker02:
                  ansible_host: 192.168.1.112
                worker03:
                  ansible_host: 192.168.1.113
            lb:
              hosts:
                lb01:
                  ansible_host: 192.168.1.100
                  virtual_ip: 192.168.1.200
        etcd:
          children:
            control_plane:
    YAML
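
    The inventory above defines hosts but no connection settings. A minimal sketch, assuming key-based SSH to an Ubuntu image (the user name and key path are assumptions about your environment), adds a vars block under the top-level all: key:

    ---
    all:
      vars:
        ansible_user: ubuntu                          # assumed SSH user
        ansible_ssh_private_key_file: ~/.ssh/id_rsa   # assumed key path
        ansible_become: true
    YAML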

    Group Variables

    ---
    # Kubernetes configuration
    kubernetes_version: "1.26.0"
    pod_network_cidr: "10.244.0.0/16"
    service_network_cidr: "10.96.0.0/12"
    kubernetes_api_server_port: 6443
    kubernetes_dns_domain: "cluster.local"
    
    # Container runtime configuration
    container_runtime: "containerd"
    containerd_version: "1.6.8"
    
    # CNI configuration
    cni_plugin: "calico"
    calico_version: "v3.24.5"
    
    # Control plane configuration
    control_plane_endpoint: "{{ hostvars.lb01.virtual_ip }}:{{ kubernetes_api_server_port }}"
    control_plane_endpoint_noport: "{{ hostvars.lb01.virtual_ip }}"
    
    # Add-ons configuration
    enable_dashboard: true
    enable_metrics_server: true
    enable_ingress: true
    ingress_controller: "nginx"
    nginx_ingress_version: "v1.6.4"
    
    # Monitoring configuration
    enable_monitoring: true
    prometheus_operator_version: "v0.63.0"
    grafana_admin_password: "change-me-in-production"
    
    # Logging configuration
    enable_logging: true
    logging_stack: "efk"  # Options: efk, loki
    
    # GitOps configuration
    enable_gitops: true
    gitops_tool: "argocd"  # Options: argocd, fluxcd
    argocd_version: "v2.6.3"
    
    # Backup configuration
    enable_backup: true
    backup_tool: "velero"
    velero_version: "v1.10.1"
    velero_aws_plugin_version: "v1.6.1"  # velero-plugin-for-aws release line paired with Velero 1.10.x
    backup_bucket: "k8s-backups"
    backup_provider: "aws"  # Options: aws, gcp, azure, minio
    
    # Secrets management
    secrets_management: "sealed-secrets"  # Options: sealed-secrets, vault
    sealed_secrets_version: "v0.20.5"
    
    # Security configuration
    apiserver_cert_extra_sans:
      - "kubernetes"
      - "kubernetes.default"
      - "{{ control_plane_endpoint_noport }}"
    
    # HA configuration
    ha_enabled: true
    keepalived_virtual_ip: "{{ hostvars.lb01.virtual_ip }}"
    keepalived_interface: "eth0"
    haproxy_connect_timeout: "10s"
    haproxy_client_timeout: "30s"
    haproxy_server_timeout: "30s"
    YAML
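
    Several variables referenced in later sections (the Velero cloud credentials, the Jenkins admin password, and the Alertmanager Slack/PagerDuty keys) are not defined above. One hedged approach is to keep them in an encrypted vars file; the sketch below assumes a group_vars/all/vault.yml encrypted with ansible-vault, with placeholder values and names that mirror the later references. It can also override the grafana_admin_password default shown above.

    ---
    # group_vars/all/vault.yml (encrypt with: ansible-vault encrypt group_vars/all/vault.yml)
    # Placeholder values; variable names match references in later sections.
    aws_access_key_id: "REPLACE_ME"
    aws_secret_access_key: "REPLACE_ME"
    jenkins_admin_password: "REPLACE_ME"
    alertmanager_slack_webhook: "https://hooks.slack.com/services/REPLACE/ME"
    alertmanager_pagerduty_key: "REPLACE_ME"
    grafana_admin_password: "REPLACE_ME"
    YAML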

    6. Core Kubernetes Installation

    Common Role Tasks

    ---
    - name: Update apt cache
      apt:
        update_cache: yes
        cache_valid_time: 3600
      when: ansible_os_family == "Debian"
    
    - name: Install required packages
      package:
        name:
          - apt-transport-https
          - ca-certificates
          - curl
          - gnupg
          - lsb-release
          - python3-pip
          - python3-setuptools
          - ntp
          - iptables
          - software-properties-common
        state: present
    
    - name: Configure hostnames
      hostname:
        name: "{{ inventory_hostname }}"
    
    - name: Update /etc/hosts
      lineinfile:
        path: /etc/hosts
        line: "{{ hostvars[item].ansible_host }} {{ item }}"
        state: present
      loop: "{{ groups['kubernetes'] }}"
    
    - name: Disable swap
      shell: |
        swapoff -a
        sed -i '/swap/d' /etc/fstab
      args:
        executable: /bin/bash
    
    - name: Configure kernel modules for Kubernetes
      copy:
        dest: "/etc/modules-load.d/k8s.conf"
        content: |
          overlay
          br_netfilter
    
    - name: Load kernel modules
      command: "modprobe {{ item }}"
      loop:
        - overlay
        - br_netfilter
      changed_when: false
    
    - name: Configure sysctl parameters for Kubernetes
      copy:
        dest: "/etc/sysctl.d/k8s.conf"
        content: |
          net.bridge.bridge-nf-call-iptables = 1
          net.bridge.bridge-nf-call-ip6tables = 1
          net.ipv4.ip_forward = 1
    
    - name: Apply sysctl parameters
      command: sysctl --system
      changed_when: false
    YAML

    Container Runtime Installation (Containerd)

    ---
    - name: Install containerd dependencies
      package:
        name:
          - containerd.io
        state: present
      register: pkg_result
      until: pkg_result is success
      retries: 3
      delay: 5
    
    - name: Create containerd configuration directory
      file:
        path: /etc/containerd
        state: directory
        mode: '0755'
    
    - name: Generate default containerd configuration
      shell: containerd config default > /etc/containerd/config.toml
      args:
        creates: /etc/containerd/config.toml
    
    - name: Configure containerd to use systemd cgroup driver
      replace:
        path: /etc/containerd/config.toml
        regexp: 'SystemdCgroup = false'
        replace: 'SystemdCgroup = true'
      notify: restart containerd
    
    - name: Enable and start containerd service
      systemd:
        name: containerd
        state: started
        enabled: yes
        daemon_reload: yes
    YAML
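
    The cgroup driver task above notifies a restart containerd handler that is not shown in this role; a minimal handlers file would be:

    ---
    # roles/containerd/handlers/main.yml
    - name: restart containerd
      systemd:
        name: containerd
        state: restarted
        daemon_reload: yes
    YAML

    Note also that the containerd.io package is distributed from Docker's apt repository, which is not configured anywhere in this guide. On Debian-family hosts, a sketch of the prerequisite tasks (run before the install task above) looks like this:

    ---
    - name: Add Docker apt key (provides containerd.io)
      apt_key:
        url: https://download.docker.com/linux/ubuntu/gpg
        state: present
      when: ansible_os_family == "Debian"

    - name: Add Docker repository
      apt_repository:
        repo: "deb https://download.docker.com/linux/ubuntu {{ ansible_distribution_release }} stable"
        state: present
        filename: docker
      when: ansible_os_family == "Debian"
    YAML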

    Kubernetes Components Installation

    ---
    - name: Add Kubernetes apt key
      apt_key:
        url: https://packages.cloud.google.com/apt/doc/apt-key.gpg
        state: present
      when: ansible_os_family == "Debian"
    
    - name: Add Kubernetes repository
      apt_repository:
        repo: deb https://apt.kubernetes.io/ kubernetes-xenial main
        state: present
        filename: kubernetes
      when: ansible_os_family == "Debian"
    
    - name: Install Kubernetes components
      package:
        name:
          - kubelet={{ kubernetes_version }}-00
          - kubeadm={{ kubernetes_version }}-00
          - kubectl={{ kubernetes_version }}-00
        state: present
      register: pkg_result
      until: pkg_result is success
      retries: 3
      delay: 5
    
    - name: Hold Kubernetes components
      dpkg_selections:
        name: "{{ item }}"
        selection: hold
      loop:
        - kubelet
        - kubeadm
        - kubectl
      when: ansible_os_family == "Debian"
    
    - name: Enable and start kubelet service
      systemd:
        name: kubelet
        state: started
        enabled: yes
        daemon_reload: yes
    YAML

    Kubernetes Cluster Initialization

    ---
    - name: Initialize Kubernetes cluster with kubeadm
      command: >
        kubeadm init 
        --control-plane-endpoint "{{ control_plane_endpoint }}" 
        --upload-certs 
        --pod-network-cidr={{ pod_network_cidr }} 
        --service-cidr={{ service_network_cidr }}
        --kubernetes-version {{ kubernetes_version }}
      args:
        creates: /etc/kubernetes/admin.conf
      register: kubeadm_init
      when: inventory_hostname == groups['control_plane'][0]
    
    - name: Upload certificates and capture the certificate key
      shell: kubeadm init phase upload-certs --upload-certs | tail -1
      register: kubeadm_certificate_key
      when: inventory_hostname == groups['control_plane'][0] and groups['control_plane'] | length > 1

    - name: Extract join command for control plane
      shell: echo "$(kubeadm token create --print-join-command) --certificate-key {{ kubeadm_certificate_key.stdout }}"
      register: control_plane_join_command
      when: inventory_hostname == groups['control_plane'][0] and groups['control_plane'] | length > 1
    
    - name: Extract join command for workers
      command: kubeadm token create --print-join-command
      register: worker_join_command
      when: inventory_hostname == groups['control_plane'][0]
    
    - name: Configure kubectl for root user
      shell: |
        mkdir -p /root/.kube
        cp -i /etc/kubernetes/admin.conf /root/.kube/config
        chown root:root /root/.kube/config
      args:
        creates: /root/.kube/config
      when: inventory_hostname in groups['control_plane']
    
    - name: Join other control plane nodes
      command: "{{ hostvars[groups['control_plane'][0]].control_plane_join_command.stdout }} --control-plane"
      args:
        creates: /etc/kubernetes/kubelet.conf
      when: inventory_hostname in groups['control_plane'] and inventory_hostname != groups['control_plane'][0]
    
    - name: Join worker nodes to the cluster
      command: "{{ hostvars[groups['control_plane'][0]].worker_join_command.stdout }}"
      args:
        creates: /etc/kubernetes/kubelet.conf
      when: inventory_hostname in groups['workers']
    YAML
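
    If you want to drive kubectl from the Ansible controller (the deployment script later in this guide assumes a local kubeconfig file), an optional task can pull the admin kubeconfig down after initialization; the destination path here is an assumption:

    ---
    - name: Fetch admin kubeconfig to the Ansible controller (optional)
      fetch:
        src: /etc/kubernetes/admin.conf
        dest: ./kubeconfig    # assumed local path, matching the "kubectl --kubeconfig kubeconfig" hint in section 26
        flat: yes
      when: inventory_hostname == groups['control_plane'][0]
    YAML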

    7. Networking Setup

    ---
    - name: Deploy Calico CNI
      kubernetes.core.k8s:
        state: present
        src: "https://docs.projectcalico.org/{{ calico_version }}/manifests/calico.yaml"
      when: cni_plugin == "calico" and inventory_hostname == groups['control_plane'][0]
    
    - name: Wait for all nodes to be ready
      kubernetes.core.k8s_info:
        kind: Node
      register: nodes
      until: nodes.resources | map(attribute='status.conditions') | flatten | selectattr('type', 'equalto', 'Ready') | map(attribute='status') | list | unique == ['True']
      retries: 30
      delay: 10
      when: inventory_hostname == groups['control_plane'][0]
    YAML

    8. Ingress Controller Configuration

    ---
    - name: Deploy NGINX Ingress Controller
      kubernetes.core.k8s:
        state: present
        src: "https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-{{ nginx_ingress_version }}/deploy/static/provider/baremetal/deploy.yaml"
      when: ingress_controller == "nginx" and inventory_hostname == groups['control_plane'][0]
    
    - name: Configure NodePort to LoadBalancer services for ingress
      kubernetes.core.k8s:
        state: present
        definition:
          apiVersion: v1
          kind: Service
          metadata:
            name: ingress-nginx-controller
            namespace: ingress-nginx
          spec:
            type: LoadBalancer
            ports:
              - name: http
                port: 80
                targetPort: 80
                protocol: TCP
              - name: https
                port: 443
                targetPort: 443
                protocol: TCP
            selector:
              app.kubernetes.io/component: controller
              app.kubernetes.io/instance: ingress-nginx
              app.kubernetes.io/name: ingress-nginx
      when: ingress_controller == "nginx" and inventory_hostname == groups['control_plane'][0]
    YAML

    9. Dashboard, Monitoring, and Logging

    Kubernetes Dashboard

    ---
    - name: Deploy Kubernetes Dashboard
      kubernetes.core.k8s:
        state: present
        src: https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml
      when: enable_dashboard and inventory_hostname == groups['control_plane'][0]
    
    - name: Create Dashboard Admin User
      kubernetes.core.k8s:
        state: present
        definition:
          apiVersion: v1
          kind: ServiceAccount
          metadata:
            name: admin-user
            namespace: kubernetes-dashboard
      when: enable_dashboard and inventory_hostname == groups['control_plane'][0]
    
    - name: Create ClusterRoleBinding for Dashboard Admin
      kubernetes.core.k8s:
        state: present
        definition:
          apiVersion: rbac.authorization.k8s.io/v1
          kind: ClusterRoleBinding
          metadata:
            name: admin-user
          roleRef:
            apiGroup: rbac.authorization.k8s.io
            kind: ClusterRole
            name: cluster-admin
          subjects:
          - kind: ServiceAccount
            name: admin-user
            namespace: kubernetes-dashboard
      when: enable_dashboard and inventory_hostname == groups['control_plane'][0]
    YAML
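
    To sign in to the dashboard you need a bearer token for the admin-user service account. On Kubernetes 1.24+ the simplest way is kubectl create token; the sketch below wraps it in a task and prints the result:

    ---
    - name: Generate a login token for the dashboard admin user
      command: kubectl -n kubernetes-dashboard create token admin-user
      register: dashboard_token
      changed_when: false
      when: enable_dashboard and inventory_hostname == groups['control_plane'][0]

    - name: Show the dashboard login token
      debug:
        msg: "{{ dashboard_token.stdout }}"
      when: enable_dashboard and inventory_hostname == groups['control_plane'][0]
    YAML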

    Monitoring with Prometheus and Grafana

    ---
    - name: Add Prometheus Helm repository
      kubernetes.core.helm_repository:
        name: prometheus-community
        repo_url: https://prometheus-community.github.io/helm-charts
      when: enable_monitoring and inventory_hostname == groups['control_plane'][0]
    
    - name: Create monitoring namespace
      kubernetes.core.k8s:
        state: present
        definition:
          apiVersion: v1
          kind: Namespace
          metadata:
            name: monitoring
      when: enable_monitoring and inventory_hostname == groups['control_plane'][0]
    
    - name: Deploy Prometheus stack with Grafana
      kubernetes.core.helm:
        name: kube-prometheus-stack
        chart_ref: prometheus-community/kube-prometheus-stack
        release_namespace: monitoring
        create_namespace: false
        values:
          grafana:
            adminPassword: "{{ grafana_admin_password }}"
            service:
              type: LoadBalancer
            ingress:
              enabled: true
              hosts:
                - grafana.{{ kubernetes_dns_domain }}
          prometheus:
            prometheusSpec:
              serviceMonitorSelectorNilUsesHelmValues: false
              serviceMonitorSelector: {}
              retention: 7d
              resources:
                requests:
                  cpu: 200m
                  memory: 512Mi
            service:
              type: ClusterIP
            ingress:
              enabled: true
              hosts:
                - prometheus.{{ kubernetes_dns_domain }}
          alertmanager:
            alertmanagerSpec:
              resources:
                requests:
                  cpu: 100m
                  memory: 128Mi
            service:
              type: ClusterIP
            ingress:
              enabled: true
              hosts:
                - alertmanager.{{ kubernetes_dns_domain }}
      when: enable_monitoring and inventory_hostname == groups['control_plane'][0]
    YAML
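
    Because serviceMonitorSelectorNilUsesHelmValues is false and serviceMonitorSelector is empty, this Prometheus picks up every ServiceMonitor in the cluster. As an illustration (the application name, namespace, and port name are hypothetical), scraping your own workload then only needs a resource like this:

    ---
    - name: Example ServiceMonitor for a hypothetical application
      kubernetes.core.k8s:
        state: present
        definition:
          apiVersion: monitoring.coreos.com/v1
          kind: ServiceMonitor
          metadata:
            name: my-app
            namespace: monitoring
          spec:
            selector:
              matchLabels:
                app: my-app        # must match the labels on the application's Service
            namespaceSelector:
              matchNames:
                - default          # namespace where that Service lives (assumption)
            endpoints:
              - port: metrics      # named Service port exposing /metrics
                interval: 30s
      when: enable_monitoring and inventory_hostname == groups['control_plane'][0]
    YAML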

    Logging with EFK Stack

    ---
    - name: Add Elastic Helm repository
      kubernetes.core.helm_repository:
        name: elastic
        repo_url: https://helm.elastic.co
      when: enable_logging and logging_stack == "efk" and inventory_hostname == groups['control_plane'][0]
    
    - name: Create logging namespace
      kubernetes.core.k8s:
        state: present
        definition:
          apiVersion: v1
          kind: Namespace
          metadata:
            name: logging
      when: enable_logging and inventory_hostname == groups['control_plane'][0]
    
    - name: Deploy Elasticsearch
      kubernetes.core.helm:
        name: elasticsearch
        chart_ref: elastic/elasticsearch
        release_namespace: logging
        create_namespace: false
        values:
          replicas: 3
          minimumMasterNodes: 2
          esJavaOpts: "-Xmx512m -Xms512m"
          resources:
            requests:
              cpu: "500m"
              memory: "1Gi"
            limits:
              cpu: "1000m"
              memory: "2Gi"
      when: enable_logging and logging_stack == "efk" and inventory_hostname == groups['control_plane'][0]
    
    - name: Deploy Kibana
      kubernetes.core.helm:
        name: kibana
        chart_ref: elastic/kibana
        release_namespace: logging
        create_namespace: false
        values:
          ingress:
            enabled: true
            hosts:
              - kibana.{{ kubernetes_dns_domain }}
      when: enable_logging and logging_stack == "efk" and inventory_hostname == groups['control_plane'][0]
    
    - name: Deploy Fluentd
      kubernetes.core.helm:
        name: fluentd
        chart_ref: bitnami/fluentd
        release_namespace: logging
        create_namespace: false
        values:
          elasticsearch:
            host: elasticsearch-master.logging.svc.cluster.local
            port: 9200
      when: enable_logging and logging_stack == "efk" and inventory_hostname == groups['control_plane'][0]
    YAML
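
    The Fluentd release above pulls its chart from the bitnami repository, which is never added in this role (only the elastic repository is). A task along these lines must run before the Deploy Fluentd task:

    ---
    - name: Add Bitnami Helm repository (required for the bitnami/fluentd chart)
      kubernetes.core.helm_repository:
        name: bitnami
        repo_url: https://charts.bitnami.com/bitnami
      when: enable_logging and logging_stack == "efk" and inventory_hostname == groups['control_plane'][0]
    YAML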

    10. GitOps Implementation

    ---
    - name: Create GitOps namespace
      kubernetes.core.k8s:
        state: present
        definition:
          apiVersion: v1
          kind: Namespace
          metadata:
            name: argocd
      when: enable_gitops and gitops_tool == "argocd" and inventory_hostname == groups['control_plane'][0]
    
    - name: Deploy ArgoCD
      kubernetes.core.k8s:
        state: present
        src: "https://raw.githubusercontent.com/argoproj/argo-cd/{{ argocd_version }}/manifests/install.yaml"
      when: enable_gitops and gitops_tool == "argocd" and inventory_hostname == groups['control_plane'][0]
    
    - name: Configure ArgoCD Ingress
      kubernetes.core.k8s:
        state: present
        definition:
          apiVersion: networking.k8s.io/v1
          kind: Ingress
          metadata:
            name: argocd-server-ingress
            namespace: argocd
            annotations:
              kubernetes.io/ingress.class: nginx
              nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
              nginx.ingress.kubernetes.io/ssl-passthrough: "true"
              nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
          spec:
            rules:
            - host: argocd.{{ kubernetes_dns_domain }}
              http:
                paths:
                - path: /
                  pathType: Prefix
                  backend:
                    service:
                      name: argocd-server
                      port:
                        number: 443
      when: enable_gitops and gitops_tool == "argocd" and inventory_hostname == groups['control_plane'][0]
    YAML
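
    After installation, Argo CD stores its generated admin password in the argocd-initial-admin-secret Secret; a small follow-up task can surface it so you can log in through the ingress above:

    ---
    - name: Retrieve the Argo CD initial admin password
      kubernetes.core.k8s_info:
        kind: Secret
        name: argocd-initial-admin-secret
        namespace: argocd
      register: argocd_admin_secret
      until: argocd_admin_secret.resources | length > 0
      retries: 10
      delay: 15
      when: enable_gitops and gitops_tool == "argocd" and inventory_hostname == groups['control_plane'][0]

    - name: Show the Argo CD admin password
      debug:
        msg: "{{ argocd_admin_secret.resources[0].data.password | b64decode }}"
      when: enable_gitops and gitops_tool == "argocd" and inventory_hostname == groups['control_plane'][0]
    YAML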

    11. Backup and Disaster Recovery

    ---
    - name: Create Backup namespace
      kubernetes.core.k8s:
        state: present
        definition:
          apiVersion: v1
          kind: Namespace
          metadata:
            name: velero
      when: enable_backup and backup_tool == "velero" and inventory_hostname == groups['control_plane'][0]
    
    - name: Deploy Velero credentials
      kubernetes.core.k8s:
        state: present
        definition:
          apiVersion: v1
          kind: Secret
          metadata:
            name: cloud-credentials
            namespace: velero
          type: Opaque
          stringData:
            cloud: |
              [default]
              aws_access_key_id={{ aws_access_key_id }}
              aws_secret_access_key={{ aws_secret_access_key }}
      when: enable_backup and backup_tool == "velero" and backup_provider == "aws" and inventory_hostname == groups['control_plane'][0]
    
    - name: Add Velero Helm repository
      kubernetes.core.helm_repository:
        name: vmware-tanzu
        repo_url: https://vmware-tanzu.github.io/helm-charts
      when: enable_backup and backup_tool == "velero" and inventory_hostname == groups['control_plane'][0]
    
    - name: Deploy Velero
      kubernetes.core.helm:
        name: velero
        chart_ref: vmware-tanzu/velero
        release_namespace: velero
        create_namespace: false
        values:
          configuration:
            provider: aws
            backupStorageLocation:
              name: default
              bucket: "{{ backup_bucket }}"
              config:
                region: us-east-1
            volumeSnapshotLocation:
              name: default
              config:
                region: us-east-1
          credentials:
            existingSecret: cloud-credentials
          initContainers:
            - name: velero-plugin-for-aws
              image: velero/velero-plugin-for-aws:{{ velero_aws_plugin_version }}
              volumeMounts:
                - mountPath: /target
                  name: plugins
          schedules:
            daily-backup:
              schedule: "0 1 * * *"
              template:
                ttl: 720h # 30 days
      when: enable_backup and backup_tool == "velero" and backup_provider == "aws" and inventory_hostname == groups['control_plane'][0]
    YAML
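
    Besides the daily schedule, an on-demand backup can be triggered declaratively with a Backup custom resource; the minimal sketch below assumes the default backup storage location configured above:

    ---
    - name: Create an on-demand Velero backup (example)
      kubernetes.core.k8s:
        state: present
        definition:
          apiVersion: velero.io/v1
          kind: Backup
          metadata:
            name: manual-backup-example
            namespace: velero
          spec:
            includedNamespaces:
              - "*"
            ttl: 72h0m0s
      when: enable_backup and backup_tool == "velero" and inventory_hostname == groups['control_plane'][0]
    YAML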

    12. CI/CD Integration

    ---
    - name: Create CI/CD namespace
      kubernetes.core.k8s:
        state: present
        definition:
          apiVersion: v1
          kind: Namespace
          metadata:
            name: ci
      when: inventory_hostname == groups['control_plane'][0]
    
    - name: Add Jenkins Helm repository
      kubernetes.core.helm_repository:
        name: jenkins
        repo_url: https://charts.jenkins.io
      when: inventory_hostname == groups['control_plane'][0]
    
    - name: Deploy Jenkins
      kubernetes.core.helm:
        name: jenkins
        chart_ref: jenkins/jenkins
        release_namespace: ci
        create_namespace: false
        values:
          controller:
            adminUser: admin
            adminPassword: "{{ jenkins_admin_password }}"
            ingress:
              enabled: true
              apiVersion: networking.k8s.io/v1
              hosts:
                - jenkins.{{ kubernetes_dns_domain }}
            resources:
              requests:
                cpu: "500m"
                memory: "1Gi"
              limits:
                cpu: "1000m"
                memory: "2Gi"
          agent:
            resources:
              requests:
                cpu: "500m"
                memory: "1Gi"
              limits:
                cpu: "1000m"
                memory: "2Gi"
      when: inventory_hostname == groups['control_plane'][0]
    YAML

    13. Secrets Management

    ---
    - name: Create Secrets Management namespace
      kubernetes.core.k8s:
        state: present
        definition:
          apiVersion: v1
          kind: Namespace
          metadata:
            name: sealed-secrets
      when: secrets_management == "sealed-secrets" and inventory_hostname == groups['control_plane'][0]
    
    - name: Deploy Sealed Secrets Controller
      kubernetes.core.k8s:
        state: present
        src: "https://github.com/bitnami-labs/sealed-secrets/releases/download/{{ sealed_secrets_version }}/controller.yaml"
      when: secrets_management == "sealed-secrets" and inventory_hostname == groups['control_plane'][0]
    
    - name: Wait for the Sealed Secrets controller to be ready
      kubernetes.core.k8s_info:
        kind: Deployment
        name: sealed-secrets-controller
        namespace: kube-system  # the upstream controller.yaml installs the controller into kube-system
      register: sealed_secrets_deployment
      until: sealed_secrets_deployment.resources[0].status.availableReplicas is defined and sealed_secrets_deployment.resources[0].status.availableReplicas == 1
      retries: 30
      delay: 10
      when: secrets_management == "sealed-secrets" and inventory_hostname == groups['control_plane'][0]
    YAML
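
    Sealing a secret happens client-side with the kubeseal CLI, which is installed separately. The sketch below assumes kubeseal is available on the first control-plane node and uses a throwaway literal to show the round trip from plain Secret to SealedSecret manifest:

    ---
    - name: Example - seal a secret with kubeseal
      shell: |
        kubectl create secret generic db-credentials \
          --from-literal=password='example-password' \
          --dry-run=client -o yaml \
        | kubeseal --controller-name sealed-secrets-controller \
                   --controller-namespace kube-system \
                   --format yaml > /root/db-credentials-sealed.yaml
      args:
        creates: /root/db-credentials-sealed.yaml
      when: secrets_management == "sealed-secrets" and inventory_hostname == groups['control_plane'][0]
    YAML

    The resulting file is safe to commit to Git; applying it recreates the original Secret only inside the cluster that holds the controller's private key.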

    14. High Availability Setup

    ---
    - name: Install HAProxy and Keepalived
      package:
        name:
          - haproxy
          - keepalived
        state: present
      when: inventory_hostname in groups['lb']
    
    - name: Configure HAProxy
      template:
        src: haproxy.cfg.j2
        dest: /etc/haproxy/haproxy.cfg
        owner: root
        group: root
        mode: '0644'
      notify: restart haproxy
      when: inventory_hostname in groups['lb']
    
    - name: Configure Keepalived master
      template:
        src: keepalived_master.conf.j2
        dest: /etc/keepalived/keepalived.conf
        owner: root
        group: root
        mode: '0644'
      notify: restart keepalived
      when: inventory_hostname == groups['lb'][0]
    
    - name: Configure Keepalived backup
      template:
        src: keepalived_backup.conf.j2
        dest: /etc/keepalived/keepalived.conf
        owner: root
        group: root
        mode: '0644'
      notify: restart keepalived
      when: inventory_hostname in groups['lb'] and inventory_hostname != groups['lb'][0]
    YAML
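
    The HAProxy and Keepalived tasks notify restart haproxy and restart keepalived handlers that are not shown; a minimal handlers file for this role would be:

    ---
    # roles/ha/handlers/main.yml
    - name: restart haproxy
      systemd:
        name: haproxy
        state: restarted

    - name: restart keepalived
      systemd:
        name: keepalived
        state: restarted
    YAML

    The two Keepalived templates referenced above need to define a vrrp_instance on keepalived_interface that advertises keepalived_virtual_ip, with the master configured at a higher priority than the backup.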

    HAProxy template:

    global
        log /dev/log local0
        log /dev/log local1 notice
        chroot /var/lib/haproxy
        stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
        stats timeout 30s
        user haproxy
        group haproxy
        daemon
    
    defaults
        log     global
        mode    tcp
        option  tcplog
        option  dontlognull
        timeout connect {{ haproxy_connect_timeout }}
        timeout client  {{ haproxy_client_timeout }}
        timeout server  {{ haproxy_server_timeout }}
    
    frontend kubernetes-apiserver
        bind *:{{ kubernetes_api_server_port }}
        mode tcp
        option tcplog
        default_backend kubernetes-apiserver
    
    backend kubernetes-apiserver
        mode tcp
        option tcp-check
        balance roundrobin
        {% for host in groups['control_plane'] %}
        server {{ host }} {{ hostvars[host].ansible_host }}:{{ kubernetes_api_server_port }} check fall 3 rise 2
        {% endfor %}
    Jinja2

    15. SRE Principles Integration

    ---
    - name: Create SLO namespace
      kubernetes.core.k8s:
        state: present
        definition:
          apiVersion: v1
          kind: Namespace
          metadata:
            name: slo
      when: inventory_hostname == groups['control_plane'][0]
    
    - name: Deploy SLO Operator
      kubernetes.core.k8s:
        state: present
        src: https://github.com/slok/sloth/releases/latest/download/sloth.yaml
      when: inventory_hostname == groups['control_plane'][0]
    
    - name: Create SLO for API Server
      kubernetes.core.k8s:
        state: present
        definition:
          apiVersion: sloth.slok.dev/v1
          kind: PrometheusServiceLevel
          metadata:
            name: kubernetes-api-sli
            namespace: slo
          spec:
            service: "kubernetes-api"
            labels:
              app: "kubernetes"
              component: "apiserver"
            slos:
              - name: "availability"
                objective: 99.9
                description: "API server availability SLO"
                sli:
                  events:
                    errorQuery: sum(rate(apiserver_request_total{code=~"5.."}[{{ "{{" }}.window{{ "}}" }}]))
                    totalQuery: sum(rate(apiserver_request_total[{{ "{{" }}.window{{ "}}" }}]))
                alerting:
                  pageAlert:
                    disable: false
                    labels:
                      severity: critical
                      team: platform
                    annotations:
                      summary: "High error rate on API server"
                  ticketAlert:
                    disable: false
                    labels:
                      severity: warning
                      team: platform
                    annotations:
                      summary: "Elevated error rate on API server"
      when: inventory_hostname == groups['control_plane'][0]
    
    - name: Deploy Alert Manager Config
      kubernetes.core.k8s:
        state: present
        definition:
          apiVersion: v1
          kind: ConfigMap
          metadata:
            name: alertmanager-config
            namespace: monitoring
          data:
            alertmanager.yml: |
              global:
                resolve_timeout: 5m
                slack_api_url: "{{ alertmanager_slack_webhook }}"
    
              route:
                group_by: ['job', 'alertname', 'severity']
                group_wait: 30s
                group_interval: 5m
                repeat_interval: 3h
                receiver: 'slack-notifications'
                routes:
                - match:
                    severity: critical
                  receiver: 'pagerduty-critical'
    
              receivers:
              - name: 'slack-notifications'
                slack_configs:
                - channel: '#alerts'
                  send_resolved: true
                  title: '[{{ "{{" }} .Status {{ "}}" }}] {{ "{{" }} .CommonLabels.alertname {{ "}}" }}'
                  text: >-
                    {{ "{{" }} range .Alerts {{ "}}" }}
                    *Alert:* {{ "{{" }} .Annotations.summary {{ "}}" }}
                    *Description:* {{ "{{" }} .Annotations.description {{ "}}" }}
                    *Severity:* {{ "{{" }} .Labels.severity {{ "}}" }}
                    {{ "{{" }} end {{ "}}" }}
              - name: 'pagerduty-critical'
                pagerduty_configs:
                - service_key: "{{ alertmanager_pagerduty_key }}"
                  description: '{{ "{{" }} .CommonLabels.alertname {{ "}}" }}'
                  details:
                    firing: '{{ "{{" }} .Alerts.Firing {{ "}}" }}'
      when: enable_monitoring and inventory_hostname == groups['control_plane'][0]
    YAML

    16. Main Playbook

    ---
    - name: Prepare all hosts
      hosts: all
      become: yes
      roles:
        - common
    
    - name: Configure Load Balancer
      hosts: lb
      become: yes
      roles:
        - ha
      tags:
        - lb
        - ha
    
    - name: Install Container Runtime
      hosts: kubernetes
      become: yes
      roles:
        - containerd
      tags:
        - runtime
    
    - name: Install Kubernetes Components
      hosts: kubernetes
      become: yes
      roles:
        - kubernetes
      tags:
        - kubernetes
    
    - name: Initialize Kubernetes Control Plane
      hosts: control_plane
      become: yes
      tasks:
        - include_role:
            name: kubernetes
            tasks_from: init_control_plane
      tags:
        - init
    
    - name: Configure Kubernetes Networking
      hosts: control_plane[0]
      become: yes
      roles:
        - networking
      tags:
        - networking
    
    - name: Deploy Kubernetes Addons
      hosts: control_plane[0]
      become: yes
      roles:
        - ingress
        - dashboard
        - monitoring
        - logging
        - gitops
        - backup
        - secrets
        - cicd
        - sre
      tags:
        - addons
    YAML

    17. Using the Playbook

    # Clone the repository (assuming you've set it up in Git)
    git clone https://github.com/yourusername/ansible-kubernetes.git
    cd ansible-kubernetes
    
    # Update inventory with your actual server details
    # Edit inventories/production/hosts.yml and group_vars/all.yml as needed
    
    # Install required collections
    ansible-galaxy collection install kubernetes.core community.general
    
    # Run playbook for a complete setup
    ansible-playbook -i inventories/production/hosts.yml playbooks/site.yml
    
    # Alternatively, run specific tags
    ansible-playbook -i inventories/production/hosts.yml playbooks/site.yml --tags "kubernetes,networking"
    Bash

    18. Security Hardening Best Practices

    • Use secure TLS settings for all components
    • Implement Network Policies to restrict pod-to-pod traffic (see the example after this list)
    • Enable Pod Security Standards (Restricted profile)
    • Use least-privilege RBAC configuration
    • Regularly scan images for vulnerabilities
    • Implement seccomp and AppArmor profiles
    • Encrypt etcd data at rest
    • Regular certificate rotation
    • Limit API server access to trusted networks
    • Use admission controllers for policy enforcement
    • Implement audit logging and regular review
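
    As an example of the Network Policy item above, a per-namespace default-deny policy is the usual starting point, after which required traffic is allowed back in explicitly. The namespace name below is a placeholder:

    ---
    # default-deny.yaml: apply in each application namespace (placeholder: my-app)
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: default-deny-all
      namespace: my-app
    spec:
      podSelector: {}          # selects every pod in the namespace
      policyTypes:
        - Ingress
        - Egress
    YAML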

    19. Troubleshooting

    Common issues and solutions:

    • Node NotReady: Check kubelet logs with journalctl -u kubelet
    • Pod scheduling: Check node resources with kubectl describe node
    • Network issues: Verify CNI with kubectl get pods -n kube-system
    • API server unavailable: Check control plane components with kubectl get pods -n kube-system
    • Certificate issues: Check with kubeadm certs check-expiration

    20. Integrating Terraform with Ansible for Multi-Cloud Kubernetes Deployments

    In addition to the Ansible-based deployment approach, we can enhance our Kubernetes deployment by using Terraform to provision the underlying infrastructure across different cloud providers. This section demonstrates how to integrate Terraform with our Ansible playbooks for a complete Infrastructure as Code (IaC) solution.

    21. Infrastructure as Code with Terraform

    Workflow Architecture

    graph TD
        A[Terraform] -->|Provisions Infrastructure| B[AWS/Azure/GCP VMs]
        A -->|Generates| C[Dynamic Inventory]
        C -->|Used by| D[Ansible]
        D -->|Configures| E[Kubernetes Cluster]
        D -->|Deploys| F[Add-ons & Monitoring]

    Project Structure with Terraform

    ansible-kubernetes/
    ├── terraform/
    │   ├── aws/
    │   ├── azure/
    │   ├── gcp/
    │   └── outputs.tf
    ├── inventories/
    ├── roles/
    ├── playbooks/
    └── ...
    Bash

    22. Terraform for AWS Infrastructure

    AWS Provider Configuration

    provider "aws" {
      region = var.region
    }
    
    resource "aws_vpc" "k8s_vpc" {
      cidr_block           = var.vpc_cidr
      enable_dns_hostnames = true
      enable_dns_support   = true
    
      tags = {
        Name = "kubernetes-vpc"
      }
    }
    
    resource "aws_subnet" "k8s_subnet" {
      vpc_id                  = aws_vpc.k8s_vpc.id
      cidr_block              = var.subnet_cidr
      map_public_ip_on_launch = true
      availability_zone       = "${var.region}a"
    
      tags = {
        Name = "kubernetes-subnet"
      }
    }
    
    resource "aws_security_group" "k8s_sg" {
      name        = "kubernetes-sg"
      description = "Allow Kubernetes traffic"
      vpc_id      = aws_vpc.k8s_vpc.id
    
      # SSH
      ingress {
        from_port   = 22
        to_port     = 22
        protocol    = "tcp"
        cidr_blocks = ["0.0.0.0/0"]
      }
    
      # Kubernetes API
      ingress {
        from_port   = 6443
        to_port     = 6443
        protocol    = "tcp"
        cidr_blocks = ["0.0.0.0/0"]
      }
    
      # Allow all internal traffic
      ingress {
        from_port   = 0
        to_port     = 0
        protocol    = "-1"
        cidr_blocks = [var.vpc_cidr]
      }
    
      # Allow all outbound traffic
      egress {
        from_port   = 0
        to_port     = 0
        protocol    = "-1"
        cidr_blocks = ["0.0.0.0/0"]
      }
    }
    
    # Load Balancer for HA Setup
    resource "aws_lb" "k8s_api_lb" {
      name               = "kubernetes-api-lb"
      internal           = false
      load_balancer_type = "network"
      subnets            = [aws_subnet.k8s_subnet.id]
    
      tags = {
        Name = "kubernetes-api-lb"
      }
    }
    
    resource "aws_lb_target_group" "k8s_api_tg" {
      name     = "kubernetes-api-tg"
      port     = 6443
      protocol = "TCP"
      vpc_id   = aws_vpc.k8s_vpc.id
    }
    
    resource "aws_lb_listener" "k8s_api" {
      load_balancer_arn = aws_lb.k8s_api_lb.arn
      port              = 6443
      protocol          = "TCP"
    
      default_action {
        type             = "forward"
        target_group_arn = aws_lb_target_group.k8s_api_tg.arn
      }
    }
    
    # Master Nodes
    resource "aws_instance" "k8s_master" {
      count         = var.master_count
      ami           = var.ubuntu_ami
      instance_type = var.master_instance_type
      key_name      = var.ssh_key_name
    
      subnet_id                   = aws_subnet.k8s_subnet.id
      vpc_security_group_ids      = [aws_security_group.k8s_sg.id]
      associate_public_ip_address = true
    
      root_block_device {
        volume_size = 50
        volume_type = "gp2"
      }
    
      tags = {
        Name = "k8s-master-${count.index + 1}"
        Role = "master"
      }
    }
    
    # Worker Nodes
    resource "aws_instance" "k8s_worker" {
      count         = var.worker_count
      ami           = var.ubuntu_ami
      instance_type = var.worker_instance_type
      key_name      = var.ssh_key_name
    
      subnet_id                   = aws_subnet.k8s_subnet.id
      vpc_security_group_ids      = [aws_security_group.k8s_sg.id]
      associate_public_ip_address = true
    
      root_block_device {
        volume_size = 100
        volume_type = "gp2"
      }
    
      tags = {
        Name = "k8s-worker-${count.index + 1}"
        Role = "worker"
      }
    }
    
    # Register masters with load balancer
    resource "aws_lb_target_group_attachment" "k8s_master_lb_attachment" {
      count            = var.master_count
      target_group_arn = aws_lb_target_group.k8s_api_tg.arn
      target_id        = aws_instance.k8s_master[count.index].id
      port             = 6443
    }
    HCL

    AWS Variables

    variable "region" {
      description = "AWS region"
      default     = "us-east-1"
    }
    
    variable "vpc_cidr" {
      description = "CIDR for the VPC"
      default     = "10.0.0.0/16"
    }
    
    variable "subnet_cidr" {
      description = "CIDR for the subnet"
      default     = "10.0.1.0/24"
    }
    
    variable "ubuntu_ami" {
      description = "Ubuntu 22.04 AMI"
      default     = "ami-0557a15b87f6559cf" # Update with appropriate AMI
    }
    
    variable "master_count" {
      description = "Number of master nodes"
      default     = 3
    }
    
    variable "worker_count" {
      description = "Number of worker nodes"
      default     = 3
    }
    
    variable "master_instance_type" {
      description = "Instance type for master nodes"
      default     = "t3.medium"
    }
    
    variable "worker_instance_type" {
      description = "Instance type for worker nodes"
      default     = "t3.large"
    }
    
    variable "ssh_key_name" {
      description = "SSH key name"
      default     = "kubernetes-key"
    }
    HCL

    AWS Outputs for Ansible Integration

    output "vpc_id" {
      value = aws_vpc.k8s_vpc.id
    }
    
    output "api_lb_dns_name" {
      value = aws_lb.k8s_api_lb.dns_name
    }
    
    output "master_ips" {
      value = aws_instance.k8s_master[*].public_ip
    }
    
    output "master_private_ips" {
      value = aws_instance.k8s_master[*].private_ip
    }
    
    output "worker_ips" {
      value = aws_instance.k8s_worker[*].public_ip
    }
    
    output "worker_private_ips" {
      value = aws_instance.k8s_worker[*].private_ip
    }
    HCL

    23. Terraform for Azure Infrastructure

    provider "azurerm" {
      features {}
    }
    
    resource "azurerm_resource_group" "k8s_rg" {
      name     = var.resource_group_name
      location = var.location
    }
    
    resource "azurerm_virtual_network" "k8s_vnet" {
      name                = "kubernetes-vnet"
      address_space       = ["10.0.0.0/16"]
      location            = azurerm_resource_group.k8s_rg.location
      resource_group_name = azurerm_resource_group.k8s_rg.name
    }
    
    resource "azurerm_subnet" "k8s_subnet" {
      name                 = "kubernetes-subnet"
      resource_group_name  = azurerm_resource_group.k8s_rg.name
      virtual_network_name = azurerm_virtual_network.k8s_vnet.name
      address_prefixes     = ["10.0.1.0/24"]
    }
    
    resource "azurerm_network_security_group" "k8s_nsg" {
      name                = "kubernetes-nsg"
      location            = azurerm_resource_group.k8s_rg.location
      resource_group_name = azurerm_resource_group.k8s_rg.name
    
      security_rule {
        name                       = "SSH"
        priority                   = 1001
        direction                  = "Inbound"
        access                     = "Allow"
        protocol                   = "Tcp"
        source_port_range          = "*"
        destination_port_range     = "22"
        source_address_prefix      = "*"
        destination_address_prefix = "*"
      }
    
      security_rule {
        name                       = "KubernetesAPI"
        priority                   = 1002
        direction                  = "Inbound"
        access                     = "Allow"
        protocol                   = "Tcp"
        source_port_range          = "*"
        destination_port_range     = "6443"
        source_address_prefix      = "*"
        destination_address_prefix = "*"
      }
    }
    
    # Load Balancer for HA Setup
    resource "azurerm_public_ip" "k8s_lb_ip" {
      name                = "kubernetes-lb-ip"
      location            = azurerm_resource_group.k8s_rg.location
      resource_group_name = azurerm_resource_group.k8s_rg.name
      allocation_method   = "Static"
      sku                 = "Standard"
    }
    
    resource "azurerm_lb" "k8s_lb" {
      name                = "kubernetes-lb"
      location            = azurerm_resource_group.k8s_rg.location
      resource_group_name = azurerm_resource_group.k8s_rg.name
      sku                 = "Standard"
    
      frontend_ip_configuration {
        name                 = "PublicIPAddress"
        public_ip_address_id = azurerm_public_ip.k8s_lb_ip.id
      }
    }
    
    resource "azurerm_lb_backend_address_pool" "k8s_backend_pool" {
      loadbalancer_id = azurerm_lb.k8s_lb.id
      name            = "kubernetes-backend-pool"
    }
    
    resource "azurerm_lb_rule" "k8s_lb_rule" {
      loadbalancer_id                = azurerm_lb.k8s_lb.id
      name                           = "kubernetes-api"
      protocol                       = "Tcp"
      frontend_port                  = 6443
      backend_port                   = 6443
      frontend_ip_configuration_name = "PublicIPAddress"
      backend_address_pool_ids       = [azurerm_lb_backend_address_pool.k8s_backend_pool.id]
      probe_id                       = azurerm_lb_probe.k8s_lb_probe.id
    }
    
    resource "azurerm_lb_probe" "k8s_lb_probe" {
      loadbalancer_id = azurerm_lb.k8s_lb.id
      name            = "kubernetes-api-probe"
      port            = 6443
      protocol        = "Tcp"
    }
    
    # Master Nodes
    resource "azurerm_network_interface" "k8s_master_nic" {
      count               = var.master_count
      name                = "k8s-master-nic-${count.index + 1}"
      location            = azurerm_resource_group.k8s_rg.location
      resource_group_name = azurerm_resource_group.k8s_rg.name
    
      ip_configuration {
        name                          = "internal"
        subnet_id                     = azurerm_subnet.k8s_subnet.id
        private_ip_address_allocation = "Dynamic"
        public_ip_address_id          = azurerm_public_ip.k8s_master_pip[count.index].id
      }
    }
    
    resource "azurerm_network_interface_security_group_association" "k8s_master_nsg_assoc" {
      count                     = var.master_count
      network_interface_id      = azurerm_network_interface.k8s_master_nic[count.index].id
      network_security_group_id = azurerm_network_security_group.k8s_nsg.id
    }
    
    resource "azurerm_public_ip" "k8s_master_pip" {
      count               = var.master_count
      name                = "k8s-master-ip-${count.index + 1}"
      location            = azurerm_resource_group.k8s_rg.location
      resource_group_name = azurerm_resource_group.k8s_rg.name
      allocation_method   = "Static"
      sku                 = "Standard"
    }
    
    resource "azurerm_linux_virtual_machine" "k8s_master" {
      count               = var.master_count
      name                = "k8s-master-${count.index + 1}"
      resource_group_name = azurerm_resource_group.k8s_rg.name
      location            = azurerm_resource_group.k8s_rg.location
      size                = var.master_vm_size
      admin_username      = "adminuser"
      network_interface_ids = [
        azurerm_network_interface.k8s_master_nic[count.index].id,
      ]
    
      admin_ssh_key {
        username   = "adminuser"
        public_key = file(var.ssh_public_key_path)
      }
    
      os_disk {
        caching              = "ReadWrite"
        storage_account_type = "Premium_LRS"
        disk_size_gb         = 50
      }
    
      source_image_reference {
        publisher = "Canonical"
        offer     = "0001-com-ubuntu-server-jammy"
        sku       = "22_04-lts"
        version   = "latest"
      }
    
      tags = {
        Role = "master"
      }
    }
    
    # Worker Nodes
    resource "azurerm_network_interface" "k8s_worker_nic" {
      count               = var.worker_count
      name                = "k8s-worker-nic-${count.index + 1}"
      location            = azurerm_resource_group.k8s_rg.location
      resource_group_name = azurerm_resource_group.k8s_rg.name
    
      ip_configuration {
        name                          = "internal"
        subnet_id                     = azurerm_subnet.k8s_subnet.id
        private_ip_address_allocation = "Dynamic"
        public_ip_address_id          = azurerm_public_ip.k8s_worker_pip[count.index].id
      }
    }
    
    resource "azurerm_network_interface_security_group_association" "k8s_worker_nsg_assoc" {
      count                     = var.worker_count
      network_interface_id      = azurerm_network_interface.k8s_worker_nic[count.index].id
      network_security_group_id = azurerm_network_security_group.k8s_nsg.id
    }
    
    resource "azurerm_public_ip" "k8s_worker_pip" {
      count               = var.worker_count
      name                = "k8s-worker-ip-${count.index + 1}"
      location            = azurerm_resource_group.k8s_rg.location
      resource_group_name = azurerm_resource_group.k8s_rg.name
      allocation_method   = "Static"
      sku                 = "Standard"
    }
    
    resource "azurerm_linux_virtual_machine" "k8s_worker" {
      count               = var.worker_count
      name                = "k8s-worker-${count.index + 1}"
      resource_group_name = azurerm_resource_group.k8s_rg.name
      location            = azurerm_resource_group.k8s_rg.location
      size                = var.worker_vm_size
      admin_username      = "adminuser"
      network_interface_ids = [
        azurerm_network_interface.k8s_worker_nic[count.index].id,
      ]
    
      admin_ssh_key {
        username   = "adminuser"
        public_key = file(var.ssh_public_key_path)
      }
    
      os_disk {
        caching              = "ReadWrite"
        storage_account_type = "Premium_LRS"
        disk_size_gb         = 100
      }
    
      source_image_reference {
        publisher = "Canonical"
        offer     = "0001-com-ubuntu-server-jammy"
        sku       = "22_04-lts"
        version   = "latest"
      }
    
      tags = {
        Role = "worker"
      }
    }
    HCL

    Azure Variables and Outputs

    variable "resource_group_name" {
      description = "Name of the resource group"
      default     = "kubernetes-rg"
    }
    
    variable "location" {
      description = "Azure region to deploy resources"
      default     = "eastus"
    }
    
    variable "master_count" {
      description = "Number of master nodes"
      default     = 3
    }
    
    variable "worker_count" {
      description = "Number of worker nodes"
      default     = 3
    }
    
    variable "master_vm_size" {
      description = "Size of master VMs"
      default     = "Standard_D2s_v3"
    }
    
    variable "worker_vm_size" {
      description = "Size of worker VMs"
      default     = "Standard_D4s_v3"
    }
    
    variable "ssh_public_key_path" {
      description = "Path to the SSH public key"
      default     = "~/.ssh/id_rsa.pub"
    }
    HCL
    output "resource_group_name" {
      value = azurerm_resource_group.k8s_rg.name
    }
    
    output "kubernetes_api_ip" {
      value = azurerm_public_ip.k8s_lb_ip.ip_address
    }
    
    output "master_public_ips" {
      value = azurerm_public_ip.k8s_master_pip[*].ip_address
    }
    
    output "master_private_ips" {
      value = azurerm_linux_virtual_machine.k8s_master[*].private_ip_address
    }
    
    output "worker_public_ips" {
      value = azurerm_public_ip.k8s_worker_pip[*].ip_address
    }
    
    output "worker_private_ips" {
      value = azurerm_linux_virtual_machine.k8s_worker[*].private_ip_address
    }
    HCL

    24. Terraform for GCP Infrastructure

    provider "google" {
      project = var.project_id
      region  = var.region
      zone    = var.zone
    }
    
    resource "google_compute_network" "k8s_network" {
      name                    = "kubernetes-network"
      auto_create_subnetworks = false
    }
    
    resource "google_compute_subnetwork" "k8s_subnet" {
      name          = "kubernetes-subnet"
      ip_cidr_range = "10.0.0.0/24"
      region        = var.region
      network       = google_compute_network.k8s_network.id
    }
    
    resource "google_compute_firewall" "k8s_firewall" {
      name    = "kubernetes-firewall"
      network = google_compute_network.k8s_network.name
    
      allow {
        protocol = "icmp"
      }
    
      allow {
        protocol = "tcp"
        ports    = ["22", "6443", "2379-2380", "10250-10252"]
      }
    
      source_ranges = ["0.0.0.0/0"]
    }
    
    # Load Balancer for API server
    resource "google_compute_address" "k8s_lb_ip" {
      name   = "kubernetes-lb-ip"
      region = var.region
    }
    
    resource "google_compute_http_health_check" "k8s_health_check" {
      name               = "kubernetes-health-check"
      port               = 6443
      request_path       = "/healthz"
      check_interval_sec = 5
      timeout_sec        = 5
    }
    
    resource "google_compute_target_pool" "k8s_target_pool" {
      name             = "kubernetes-target-pool"
      instances        = [for vm in google_compute_instance.k8s_master : "${var.zone}/${vm.name}"]
      health_checks    = [google_compute_http_health_check.k8s_health_check.name]
      session_affinity = "CLIENT_IP"
    }
    
    resource "google_compute_forwarding_rule" "k8s_forwarding_rule" {
      name       = "kubernetes-forwarding-rule"
      target     = google_compute_target_pool.k8s_target_pool.id
      port_range = "6443"
      ip_address = google_compute_address.k8s_lb_ip.address
      region     = var.region
    }
    
    # Master Nodes
    resource "google_compute_instance" "k8s_master" {
      count        = var.master_count
      name         = "k8s-master-${count.index + 1}"
      machine_type = var.master_machine_type
      zone         = var.zone
    
      boot_disk {
        initialize_params {
          image = "ubuntu-os-cloud/ubuntu-2204-lts"
          size  = 50
          type  = "pd-ssd"
        }
      }
    
      network_interface {
        network    = google_compute_network.k8s_network.name
        subnetwork = google_compute_subnetwork.k8s_subnet.name
        access_config {
          // Ephemeral IP
        }
      }
    
      metadata = {
        ssh-keys = "${var.ssh_username}:${file(var.ssh_public_key_path)}"
      }
    
      tags = ["kubernetes", "master"]
    }
    
    # Worker Nodes
    resource "google_compute_instance" "k8s_worker" {
      count        = var.worker_count
      name         = "k8s-worker-${count.index + 1}"
      machine_type = var.worker_machine_type
      zone         = var.zone
    
      boot_disk {
        initialize_params {
          image = "ubuntu-os-cloud/ubuntu-2204-lts"
          size  = 100
          type  = "pd-ssd"
        }
      }
    
      network_interface {
        network    = google_compute_network.k8s_network.name
        subnetwork = google_compute_subnetwork.k8s_subnet.name
        access_config {
          // Ephemeral IP
        }
      }
    
      metadata = {
        ssh-keys = "${var.ssh_username}:${file(var.ssh_public_key_path)}"
      }
    
      tags = ["kubernetes", "worker"]
    }
    HCL

    GCP Variables and Outputs

    variable "project_id" {
      description = "GCP Project ID"
      default     = "your-project-id"
    }
    
    variable "region" {
      description = "GCP region"
      default     = "us-central1"
    }
    
    variable "zone" {
      description = "GCP zone"
      default     = "us-central1-a"
    }
    
    variable "master_count" {
      description = "Number of master nodes"
      default     = 3
    }
    
    variable "worker_count" {
      description = "Number of worker nodes"
      default     = 3
    }
    
    variable "master_machine_type" {
      description = "Machine type for master nodes"
      default     = "e2-standard-2"
    }
    
    variable "worker_machine_type" {
      description = "Machine type for worker nodes"
      default     = "e2-standard-4"
    }
    
    variable "ssh_username" {
      description = "SSH username"
      default     = "ubuntu"
    }
    
    variable "ssh_public_key_path" {
      description = "Path to the SSH public key"
      default     = "~/.ssh/id_rsa.pub"
    }
    HCL
    output "lb_ip_address" {
      value = google_compute_address.k8s_lb_ip.address
    }
    
    output "master_public_ips" {
      value = google_compute_instance.k8s_master[*].network_interface.0.access_config.0.nat_ip
    }
    
    output "master_private_ips" {
      value = google_compute_instance.k8s_master[*].network_interface.0.network_ip
    }
    
    output "worker_public_ips" {
      value = google_compute_instance.k8s_worker[*].network_interface.0.access_config.0.nat_ip
    }
    
    output "worker_private_ips" {
      value = google_compute_instance.k8s_worker[*].network_interface.0.network_ip
    }
    HCL

    25. Generating Ansible Inventory from Terraform

    resource "local_file" "ansible_inventory" {
      content = templatefile("${path.module}/templates/inventory.tmpl",
        {
          master_nodes = zipmap(
            [for i in range(var.master_count) : "master${i + 1}"],
            [for i in range(var.master_count) : {
              ansible_host = aws_instance.k8s_master[i].public_ip
              private_ip   = aws_instance.k8s_master[i].private_ip
            }]
          )
          worker_nodes = zipmap(
            [for i in range(var.worker_count) : "worker${i + 1}"],
            [for i in range(var.worker_count) : {
              ansible_host = aws_instance.k8s_worker[i].public_ip
              private_ip   = aws_instance.k8s_worker[i].private_ip
            }]
          )
          lb_endpoint         = aws_lb.k8s_api_lb.dns_name
          kubernetes_api_port = "6443"
        }
      )
      filename = "${path.module}/../../inventories/aws/hosts.yml"
    }
    HCL

    Inventory template for AWS:

    all:
      children:
        kubernetes:
          children:
            control_plane:
              hosts:
                %{ for name, node in master_nodes ~}
                ${name}:
                  ansible_host: ${node.ansible_host}
                  private_ip: ${node.private_ip}
                %{ endfor ~}
            workers:
              hosts:
                %{ for name, node in worker_nodes ~}
                ${name}:
                  ansible_host: ${node.ansible_host}
                  private_ip: ${node.private_ip}
                %{ endfor ~}
        etcd:
          children:
            control_plane:
      vars:
        ansible_user: ubuntu
        ansible_ssh_private_key_file: ~/.ssh/id_rsa
        ansible_ssh_common_args: '-o StrictHostKeyChecking=no'
        control_plane_endpoint: "${lb_endpoint}:${kubernetes_api_port}"
        control_plane_endpoint_noport: "${lb_endpoint}"
    YAML

    26. Integrating Terraform with Ansible Workflow

    Create a bash script to automate the workflow:

    #!/bin/bash
    set -e
    
    CLOUD_PROVIDER=${1:-aws}
    ACTION=${2:-apply}
    
    echo "Deploying Kubernetes infrastructure on $CLOUD_PROVIDER"
    
    # Initialize and apply Terraform
    cd terraform/$CLOUD_PROVIDER
    terraform init
    terraform $ACTION -auto-approve
    
    # If we're destroying, we're done
    if [ "$ACTION" == "destroy" ]; then
        echo "Infrastructure destroyed successfully"
        exit 0
    fi
    
    # Wait for SSH to be available on all hosts
    echo "Waiting for SSH to be available..."
    sleep 30
    
    # Run Ansible playbook
    cd ../..
    ansible-playbook -i inventories/$CLOUD_PROVIDER/hosts.yml playbooks/site.yml
    
    echo "Kubernetes cluster deployed successfully on $CLOUD_PROVIDER!"
    echo "Run 'kubectl --kubeconfig kubeconfig get nodes' to check your cluster"
    Bash

    Make the script executable:

    chmod +x deploy-k8s.sh
    Bash

    27. Using the Multi-Cloud Deployment

    # Deploy on AWS
    ./deploy-k8s.sh aws apply
    
    # Deploy on Azure
    ./deploy-k8s.sh azure apply
    
    # Deploy on GCP
    ./deploy-k8s.sh gcp apply
    
    # Destroy infrastructure when done
    ./deploy-k8s.sh aws destroy
    Bash

    28. Best Practices for Multi-Cloud Kubernetes Deployments

    • State Management: Store Terraform state files in a remote backend (S3, Azure Blob, GCS)
    • Variable Encapsulation: Use Terraform modules for reusable components
    • Secrets Management: Never store credentials in your Terraform code; use environment variables or a vault
    • Network Consistency: Maintain consistent CIDR ranges across cloud providers
    • Version Pinning: Pin provider versions to avoid unexpected changes
    • Tagging Strategy: Implement a consistent tagging strategy for all resources
    • Cost Monitoring: Set up cost monitoring and alerts for each cloud provider
    • Backup Strategy: Ensure your backup strategy works across all cloud environments
    • Disaster Recovery: Test disaster recovery procedures regularly

    By combining Terraform for infrastructure provisioning with Ansible for configuration management, you have a powerful, flexible approach to deploying Kubernetes across multiple cloud providers while maintaining consistency and reliability.

