AWX: From Beginner to Expert – AWS, GCP, Azure
Edition: 2025 | Target Audience: DevOps Engineers, Cloud Architects, System Administrators
Prerequisites: Basic Linux, Cloud fundamentals, YAML syntax
Estimated Learning Time: 40-60 hours
Table of Contents
Part I: Foundations
- Prerequisites and Learning Path
- Introduction to AWX and Ansible Tower
- Installation and Environment Setup
- Core Concepts and Architecture
- Basic Operations and Workflows
Part II: Cloud Platform Integration
- AWS Integration Deep Dive
- Google Cloud Platform Integration
- Azure Integration
- Multi-Cloud Strategy and Hybrid Deployments
Part III: Advanced Topics
- Advanced Features and Customization
- Security, Compliance, and Best Practices
- Performance Optimization and Scaling
- Monitoring and Observability
Part IV: Practical Application
- Hands-on Labs and Exercises
- Real-World Projects and Case Studies
- Troubleshooting and Problem Resolution
- CI/CD Integration Patterns
Part V: Reference Materials
1. Prerequisites and Learning Path
Before You Begin
This guide assumes you have foundational knowledge in the following areas. If you’re new to any of these topics, please review the recommended resources first.
Required Prerequisites
graph TD
A[Required Knowledge] --> B[Linux Fundamentals]
A --> C[Cloud Platforms Basics]
A --> D[YAML Syntax]
A --> E[Basic Networking]
B --> F[Command Line Interface]
B --> G[File Permissions]
B --> H[System Services]
C --> I[Virtual Machines]
C --> J[Storage Concepts]
C --> K[Network Security Groups]
D --> L[Data Structures]
D --> M[Indentation Rules]
E --> N[TCP/IP Basics]
E --> O[DNS Concepts]
E --> P[Load Balancing]
Knowledge Assessment Checklist
Before proceeding, ensure you can confidently answer “yes” to these questions:
- Linux: Can you navigate directories, edit files, and manage permissions via command line?
- Cloud Basics: Do you understand concepts like VMs, networks, and storage in cloud environments?
- YAML: Can you read and write basic YAML configuration files?
- Networking: Do you understand IP addresses, ports, and basic security concepts?
- Version Control: Are you familiar with Git and basic repository operations?
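If you want a quick sanity check of the tooling behind this checklist, the commands below exercise Git, SSH, and YAML parsing. This is a minimal sketch; the YAML test assumes Python 3 with PyYAML installed:
# Verify the tools behind the checklist above
git --version                     # Git client present?
ssh -V                            # SSH client present? (prints its version to stderr)
# Parse a small YAML snippet; expected output: {'a': [1, 2], 'b': {'c': True}}
python3 -c 'import yaml; print(yaml.safe_load("a: [1, 2]\nb: {c: true}"))'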
Recommended Learning Path
graph LR
A[Complete Prerequisites] --> B[Install AWX]
B --> C[Learn Core Concepts]
C --> D[Basic Operations]
D --> E[Choose Cloud Platform]
E --> F[Advanced Features]
F --> G[Real Projects]
subgraph "Choose Your Path"
H[AWS Track]
I[GCP Track]
J[Azure Track]
K[Multi-Cloud Track]
end
E --> H
E --> I
E --> J
E --> K
System Requirements
AWX Server Requirements
| Component | Minimum | Recommended | Production |
|---|---|---|---|
| CPU | 2 cores | 4 cores | 8+ cores |
| RAM | 4 GB | 8 GB | 16+ GB |
| Storage | 20 GB | 40 GB | 100+ GB |
| Network | 1 Gbps | 1 Gbps | 10 Gbps |
| OS | RHEL 8+, Ubuntu 20.04+ | RHEL 9+, Ubuntu 22.04+ | RHEL 9+, Ubuntu 22.04+ |
Client Machine Requirements
- Modern web browser (Chrome 90+, Firefox 88+, Safari 14+)
- Terminal/SSH client
- Git client
- Text editor (VS Code recommended)
- Cloud CLI tools (AWS CLI, gcloud, Azure CLI)
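You can confirm the cloud CLIs are installed and on your PATH with their version commands:
aws --version      # AWS CLI
gcloud --version   # Google Cloud SDK
az version         # Azure CLI (prints JSON version info)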
Setting Up Your Learning Environment
Option 1: Local Development (Recommended for Learning)
# Install Docker and Docker Compose
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker $USER
# Install Docker Compose
sudo curl -L "https://github.com/docker/compose/releases/download/v2.39.4/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
# Verify installation
docker --version
docker-compose --version
Option 2: Cloud-Based Lab Environment
# cloud-lab-setup.yml
---
cloud_labs:
  aws:
    instance_type: t3.medium
    ami: ubuntu-22.04
    security_groups: ["awx-lab-sg"]
  gcp:
    machine_type: e2-medium
    image_family: ubuntu-2204-lts
    firewall_tags: ["awx-lab"]
  azure:
    vm_size: Standard_B2s
    image_offer: 0001-com-ubuntu-server-jammy
    nsg: awx-lab-nsg
2. Introduction to AWX and Ansible Tower
Evolution of Infrastructure Automation
The journey from manual server management to modern infrastructure automation has revolutionized how organizations deploy and manage their IT resources.
timeline
title Infrastructure Automation Evolution
2000s : Manual Server Management
: SSH Scripts
: Custom Tools
2010s : Configuration Management
: Puppet, Chef
: Ansible Core
2015 : Enterprise Automation
: Ansible Tower
: AWX Project
2020s : Cloud-Native Automation
: Multi-Cloud Support
: GitOps Integration
2025 : AI-Enhanced Automation
: Intelligent Workflows
: Predictive Scaling
What is AWX?
AWX is the open-source upstream project for Red Hat Ansible Automation Platform (formerly Ansible Tower). It provides a web-based user interface, REST API, and task engine that sits on top of Ansible Core, transforming command-line automation into an enterprise-ready platform.
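Everything the web UI does goes through that REST API, so you can probe a running instance directly. A minimal sketch, assuming AWX answers at https://awx.example.com with the default admin account (hostname and credentials are placeholders):
# Unauthenticated liveness/version probe
curl -s https://awx.example.com/api/v2/ping/ | jq .
# Create a personal access token, then call an authenticated endpoint
TOKEN=$(curl -s -u admin:password -X POST \
  https://awx.example.com/api/v2/tokens/ | jq -r .token)
curl -s -H "Authorization: Bearer $TOKEN" \
  https://awx.example.com/api/v2/me/ | jq .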
graph TB
subgraph "AWX Ecosystem"
A[AWX Web UI] --> B[REST API]
A --> C[Job Engine]
A --> D[Scheduler]
B --> E[CLI Integration]
B --> F[External Systems]
C --> G[Ansible Core]
C --> H[Playbook Execution]
D --> I[Cron Jobs]
D --> J[Event-Driven Automation]
G --> K[Target Infrastructure]
subgraph "Multi-Cloud Targets"
K --> L[AWS Resources]
K --> M[GCP Resources]
K --> N[Azure Resources]
K --> O[On-Premises Servers]
end
end
style A fill:#e1f5fe
style G fill:#f3e5f5
style K fill:#e8f5e8
Product Comparison Matrix
Understanding the differences between Ansible variants helps you choose the right tool for your needs:
| Feature | Ansible Core | AWX | Ansible Automation Platform |
|---|---|---|---|
| Cost | Free | Free | Commercial License |
| Interface | CLI only | Web UI + CLI + API | Web UI + CLI + API + Analytics |
| User Management | Local users | RBAC + Teams | Enterprise LDAP/SAML + RBAC |
| Job Scheduling | Cron/Manual | Built-in Scheduler | Advanced Scheduling + SLAs |
| Inventory Management | Static files | Dynamic + Static | Smart Inventory + Automation Mesh |
| Credential Management | Local files | Encrypted vault | External credential stores |
| Workflow Orchestration | Manual scripting | Visual workflows | Advanced workflows + approval gates |
| Notifications | Manual | Email/Slack/Webhooks | Enterprise integrations |
| Support | Community | Community | Red Hat Enterprise Support |
| Compliance | Manual | Basic auditing | SOC2, FedRAMP, FIPS compliance |
| High Availability | Manual setup | Manual clustering | Automated HA + Disaster Recovery |
Core Value Propositions
1. Centralized Automation Control
graph LR
A[Scattered Scripts] --> B[AWX Platform]
C[Manual Processes] --> B
D[Tribal Knowledge] --> B
B --> E[Unified Dashboard]
B --> F[Standardized Workflows]
B --> G[Knowledge Repository]
style A fill:#ffebee
style C fill:#ffebee
style D fill:#ffebee
style E fill:#e8f5e8
style F fill:#e8f5e8
style G fill:#e8f5e8
2. Multi-Cloud Orchestration
- Unified Interface: Manage AWS, GCP, Azure, and on-premises resources from a single platform
- Cloud Abstraction: Write once, deploy anywhere with cloud-agnostic playbooks
- Cost Optimization: Automated resource lifecycle management across providers
- Hybrid Strategies: Seamless integration between cloud and on-premises environments
3. Enterprise Security and Compliance
- Role-Based Access Control: Granular permissions at organization, team, and resource levels
- Credential Isolation: Secure storage and injection without exposure to users
- Audit Trails: Complete logging of all automation activities
- Compliance Ready: Support for SOX, HIPAA, PCI-DSS requirements
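These RBAC objects are ordinary API resources, so they can be provisioned from scripts as well as the UI. A hedged sketch (names and hostname are illustrative; $TOKEN is a personal access token as shown earlier):
API="https://awx.example.com/api/v2"
# Create an organization, then a team inside it
ORG_ID=$(curl -s -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -X POST "$API/organizations/" \
  -d '{"name": "Production Environment"}' | jq -r .id)
curl -s -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -X POST "$API/teams/" \
  -d "{\"name\": \"Infrastructure Team\", \"organization\": $ORG_ID}"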
Use Cases by Industry
Financial Services
use_cases:
  compliance_automation:
    - PCI-DSS compliance checking
    - SOX controls automation
    - Risk assessment workflows
  infrastructure_management:
    - Multi-region disaster recovery
    - Automated patching windows
    - Capacity planning automation
Healthcare
use_cases:
  security_automation:
    - HIPAA compliance monitoring
    - Patient data encryption
    - Access control automation
  operational_efficiency:
    - EHR system deployment
    - Medical device configuration
    - Backup and recovery automation
Technology/Startups
use_cases:
  rapid_scaling:
    - Auto-scaling infrastructure
    - CI/CD pipeline automation
    - Environment provisioning
  cost_optimization:
    - Resource right-sizing
    - Unused resource cleanup
    - Multi-cloud cost management
Architecture Deep Dive
AWX follows a microservices architecture that provides scalability, reliability, and maintainability for enterprise automation workloads.
High-Level Architecture
graph TB
subgraph "User Layer"
A[Web Browser]
B[CLI Tools]
C[API Clients]
D[Mobile Apps]
end
subgraph "AWX Platform"
E[Load Balancer]
F[Web Service]
G[Task Service]
H[Receptor Service]
end
subgraph "Data Layer"
I[PostgreSQL]
J[Redis Queue]
K[File Storage]
end
subgraph "Execution Layer"
L[Ansible Runner]
M[Execution Nodes]
N[Isolated Environments]
end
subgraph "Target Infrastructure"
O[Cloud Resources]
P[On-Premises]
Q[Containers]
R[Network Devices]
end
A --> E
B --> E
C --> E
D --> E
E --> F
E --> G
E --> H
F --> I
G --> J
G --> K
H --> M
G --> L
L --> N
M --> O
M --> P
M --> Q
M --> R
style E fill:#e3f2fd
style I fill:#fff3e0
style J fill:#fff3e0
style K fill:#fff3e0
style L fill:#f3e5f5
Component Breakdown
| Component | Purpose | Scalability | Technology |
|---|---|---|---|
| Web Service | UI, API endpoints, authentication | Horizontal | Django/Python |
| Task Service | Job execution, workflow management | Horizontal | Celery/Python |
| Receptor Service | Mesh networking, remote execution | Horizontal | Go |
| PostgreSQL | Metadata, configuration, audit logs | Vertical + Read Replicas | PostgreSQL 12+ |
| Redis | Task queue, session storage, caching | Horizontal (Cluster) | Redis 6+ |
| File Storage | Playbooks, logs, artifacts | Network storage | NFS/S3/Azure Blob |
Execution Flow
sequenceDiagram
participant User
participant WebUI
participant API
participant TaskService
participant Queue
participant Runner
participant Target
User->>WebUI: Launch Job Template
WebUI->>API: POST /api/v2/job_templates/N/launch/
API->>TaskService: Create Job Instance
TaskService->>Queue: Queue Execution Task
Queue->>Runner: Dequeue and Execute
Runner->>Target: Run Ansible Playbook
Target-->>Runner: Task Results
Runner-->>TaskService: Job Status Updates
TaskService-->>API: Real-time Updates
API-->>WebUI: WebSocket Updates
WebUI-->>User: Live Job Output
AWX vs Alternatives Comparison
AWX vs Jenkins
| Aspect | AWX | Jenkins |
|---|---|---|
| Primary Focus | Infrastructure automation | CI/CD pipelines |
| Learning Curve | Moderate (Ansible knowledge) | Steep (Plugin ecosystem) |
| Configuration | YAML playbooks | Groovy scripts/UI |
| State Management | Idempotent operations | Build-based |
| Multi-Cloud | Native support | Plugin-dependent |
| Agent Requirements | Agentless (SSH) | Agent-based |
AWX vs Terraform
| Aspect | AWX | Terraform |
|---|---|---|
| Approach | Procedural automation | Declarative infrastructure |
| State Management | Stateless (idempotent) | Stateful |
| Configuration | Imperative tasks | Declarative resources |
| Multi-Cloud | Unified playbooks | Provider-specific |
| Application Deployment | Full lifecycle | Infrastructure only |
| Learning Curve | Moderate | Moderate to High |
At a glance, the automation controller model ties job, inventory, and credential management together across clouds:
flowchart TD
B[Automation Controller]
B --> E[Job Management]
B --> F[Inventory Management]
B --> G[Credential Management]
E --> H[Ansible Playbooks]
F --> I[Cloud Resources]
G --> J[Authentication]
H --> K[AWS EC2]
H --> L[GCP Compute]
H --> M[Azure VMs]
Decision Framework: Choosing the Right Tool
flowchart TD
A[Start: Need Automation?] --> B{Team Size}
B -->|1-5 People| C{Budget}
B -->|6-50 People| D{Compliance Needs}
B -->|50+ People| E[Enterprise Platform]
C -->|Free| F[Ansible Core]
C -->|Minimal Budget| G[AWX]
D -->|Basic| G
D -->|Strict| H[Ansible Automation Platform]
E --> H
F --> I[CLI-based automation<br/>Manual scheduling<br/>Local credential storage]
G --> J[Web UI + API<br/>Job scheduling<br/>RBAC + Teams<br/>Audit logging]
H --> K[Enterprise features<br/>Professional support<br/>Advanced compliance<br/>High availability]
style F fill:#e8f5e8
style G fill:#e3f2fd
style H fill:#fff3e0
Migration Paths
From Ansible Core to AWX
migration_steps:
  preparation:
    - Inventory playbooks and roles
    - Document current automation workflows
    - Identify credential management needs
  installation:
    - Deploy AWX infrastructure
    - Configure authentication (LDAP/local)
    - Set up organizations and teams
  migration:
    - Import existing playbooks as projects
    - Create job templates from ad-hoc commands
    - Migrate inventories (static to dynamic)
    - Configure credentials securely
  optimization:
    - Implement workflow templates
    - Set up notifications and monitoring
    - Establish CI/CD integration
From AWX to Ansible Automation Platform
upgrade_considerations:
  feature_gaps:
    - Smart inventory → Automation mesh
    - Basic workflows → Advanced workflow designer
    - Community support → Enterprise support
  technical_requirements:
    - Data migration planning
    - License procurement
    - Support contract setup
  business_justification:
    - Compliance requirements (SOC2, FedRAMP)
    - SLA requirements
    - Enterprise integrations needed
Key Benefits Deep Dive
1. Centralized Management
graph TB
subgraph "Before AWX"
A1[Dev Team Scripts]
A2[Ops Team Tools]
A3[Security Scripts]
A4[Cloud Scripts]
end
subgraph "After AWX"
B1[Unified Platform]
B1 --> B2[Standardized Workflows]
B1 --> B3[Shared Credentials]
B1 --> B4[Common Inventory]
B1 --> B5[Audit Trail]
end
A1 --> B1
A2 --> B1
A3 --> B1
A4 --> B1
style A1 fill:#ffebee
style A2 fill:#ffebee
style A3 fill:#ffebee
style A4 fill:#ffebee
style B1 fill:#e8f5e8
2. Operational Efficiency Metrics
| Metric | Before AWX | After AWX | Improvement |
|---|---|---|---|
| Deployment Time | 4-6 hours | 15-30 minutes | 85% reduction |
| Error Rate | 15-20% | 2-5% | 75% reduction |
| Rollback Time | 2-4 hours | 5-15 minutes | 90% reduction |
| Team Onboarding | 2-4 weeks | 3-5 days | 80% reduction |
| Compliance Audits | 40-60 hours | 4-8 hours | 85% reduction |
3. Installation and Environment Setup
Installation Strategy Decision Matrix
Choosing the right installation method depends on your specific requirements:
graph TD
A[Choose Installation Method] --> B{Purpose}
B -->|Learning/Development| C[Docker Compose]
B -->|Production/Team| D{Infrastructure}
B -->|High Availability| E[Kubernetes/OpenShift]
D -->|Single Server| F[Docker Compose]
D -->|Multi-Server| G[Kubernetes]
D -->|Cloud Native| H[Managed Kubernetes]
C --> I[Quick Setup<br/>Resource Efficient<br/>Easy Backup]
F --> J[Production Ready<br/>SSL Termination<br/>External Database]
G --> K[Scalable<br/>High Availability<br/>Rolling Updates]
H --> L[Cloud Integration<br/>Managed Services<br/>Auto-scaling]
E --> M[Enterprise Grade<br/>Multi-Region<br/>Disaster Recovery]
Environment Planning
Deployment Scenarios Comparison
| Scenario | CPU | RAM | Storage | Network | Use Case |
|---|---|---|---|---|---|
| Development | 2 cores | 4 GB | 20 GB | 100 Mbps | Learning, testing |
| Small Team | 4 cores | 8 GB | 50 GB | 1 Gbps | < 50 managed nodes |
| Medium Enterprise | 8 cores | 16 GB | 100 GB | 1 Gbps | 50-500 managed nodes |
| Large Enterprise | 16+ cores | 32+ GB | 500+ GB | 10 Gbps | 500+ managed nodes |
| Multi-Region | Distributed | Distributed | Distributed | Dedicated | Global infrastructure |
Network Requirements
network_requirements:
  inbound_ports:
    - port: 80
      description: "HTTP (redirect to HTTPS)"
      required: true
    - port: 443
      description: "HTTPS for web interface"
      required: true
    - port: 22
      description: "SSH administration of the AWX host"
      required: true
  outbound_ports:
    - port: 443
      description: "HTTPS for cloud APIs, Git repos"
      required: true
    - port: 22
      description: "SSH to managed nodes"
      required: true
    - port: 5432
      description: "PostgreSQL (if external)"
      required: false
  firewall_rules:
    - name: "AWX Web Access"
      source: "0.0.0.0/0"
      destination: "AWX Server"
      ports: [80, 443]
    - name: "SSH to Managed Nodes"
      source: "AWX Server"
      destination: "Managed Infrastructure"
      ports: [22]
Installation Methods
Method 1: Docker Compose (Recommended for Learning)
Prerequisites Check Script:
#!/bin/bash
# prerequisites-check.sh
echo "=== AWX Prerequisites Check ==="
# Check Docker
if command -v docker &> /dev/null; then
echo "✓ Docker: $(docker --version)"
docker_running=$(docker info >/dev/null 2>&1 && echo "running" || echo "not running")
echo " Status: $docker_running"
else
echo "✗ Docker: Not installed"
fi
# Check Docker Compose
if command -v docker-compose &> /dev/null; then
echo "✓ Docker Compose: $(docker-compose --version)"
else
echo "✗ Docker Compose: Not installed"
fi
# Check Git
if command -v git &> /dev/null; then
echo "✓ Git: $(git --version)"
else
echo "✗ Git: Not installed"
fi
# Check system resources
echo "✓ System Resources:"
echo " CPU Cores: $(nproc)"
echo " RAM: $(free -h | awk '/^Mem:/ {print $2}')"
echo " Disk Space: $(df -h / | awk 'NR==2 {print $4}')"
# Check ports
echo "✓ Port Check:"
for port in 80 443; do
if netstat -tuln | grep -q ":$port "; then
echo " Port $port: In use"
else
echo " Port $port: Available"
fi
done
Quick Installation:
# Clone AWX repository
git clone https://github.com/ansible/awx.git
cd awx
# Create environment configuration
cat > .env << EOF
# Basic configuration
COMPOSE_PROJECT_NAME=awx
AWX_ADMIN_USER=admin
AWX_ADMIN_PASSWORD=SecurePassword123!
# Database settings
POSTGRES_PASSWORD=PostgresPassword123!
POSTGRES_USER=awx
POSTGRES_DB=awx
# Secret key (generate with: openssl rand -base64 30)
SECRET_KEY=$(openssl rand -base64 30)
# Host configuration
HOST_PORT=80
HOST_PORT_SSL=443
DOCKER_COMPOSE_DIR=/opt/awx
# Optional: Custom settings
DEFAULT_PROJECT_PATH=/var/lib/awx/projects
PROJECT_DATA_DIR=/var/lib/awx/projects
EOF
# Deploy AWX
make docker-compose-build
make docker-compose
Advanced Docker Compose Configuration:
# docker-compose.override.yml
version: '3.8'
services:
  web:
    environment:
      - AWX_ADMIN_USER=admin
      - AWX_ADMIN_PASSWORD=${AWX_ADMIN_PASSWORD}
    volumes:
      - awx_projects:/var/lib/awx/projects:rw
      - awx_logs:/var/log/tower:rw
    networks:
      - awx_network
  task:
    volumes:
      - awx_projects:/var/lib/awx/projects:rw
      - awx_logs:/var/log/tower:rw
    networks:
      - awx_network
  postgres:
    environment:
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_DB=awx
      - POSTGRES_USER=awx
    volumes:
      - awx_postgres:/var/lib/postgresql/data:rw
    networks:
      - awx_network
    ports:
      - "5432:5432" # Expose for external access
  redis:
    networks:
      - awx_network
    ports:
      - "6379:6379" # Expose for monitoring
volumes:
  awx_postgres:
    driver: local
  awx_projects:
    driver: local
  awx_logs:
    driver: local
networks:
  awx_network:
    driver: bridge
Method 2: Kubernetes Deployment
Prerequisites:
# Install kubectl
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
# Install AWX Operator
kubectl apply -f https://raw.githubusercontent.com/ansible/awx-operator/devel/deploy/awx-operator.yaml
# Verify operator installation
kubectl get pods -n awx
Production Kubernetes Deployment:
# awx-production.yaml
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx-production
  namespace: awx
spec:
  # Service configuration
  service_type: LoadBalancer
  service_labels:
    app: awx-production
  # Ingress configuration
  ingress_type: ingress
  ingress_class_name: nginx
  hostname: awx.company.com
  ingress_tls_secret: awx-tls-secret
  # Database configuration
  postgres_storage_class: fast-ssd
  postgres_storage_requirements:
    requests:
      storage: 100Gi
  postgres_resource_requirements:
    requests:
      cpu: "2"
      memory: "4Gi"
    limits:
      cpu: "4"
      memory: "8Gi"
  # AWX configuration
  web_resource_requirements:
    requests:
      cpu: "1"
      memory: "2Gi"
    limits:
      cpu: "2"
      memory: "4Gi"
  task_resource_requirements:
    requests:
      cpu: "1"
      memory: "2Gi"
    limits:
      cpu: "2"
      memory: "4Gi"
  # Storage configuration
  projects_persistence: true
  projects_storage_class: standard
  projects_storage_size: 50Gi
  # Security configuration
  admin_user: admin
  admin_password_secret: awx-admin-password
  secret_key_secret: awx-secret-key
  # High availability
  replicas: 2
  # Additional configuration
  extra_settings: |
    LOGGING['handlers']['console']['level'] = 'INFO'
    INSIGHTS_URL_BASE = 'https://console.redhat.com'
    AUTOMATION_ANALYTICS_URL = 'https://console.redhat.com'
Method 3: Cloud Provider Managed Services
AWS EKS Deployment:
# aws-eks-awx.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: awx-eks-config
data:
  aws-region: us-west-2
  storage-class: gp3
  load-balancer-type: nlb
---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx-eks
spec:
  service_type: LoadBalancer
  service_annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
  postgres_storage_class: gp3
  projects_storage_class: efs
  # AWS-specific optimizations
  task_resource_requirements:
    requests:
      cpu: "500m"
      memory: "1Gi"
    limits:
      cpu: "2"
      memory: "4Gi"
Installation Verification
Health Check Script
#!/bin/bash
# awx-health-check.sh
AWX_URL="http://localhost"
ADMIN_USER="admin"
ADMIN_PASSWORD="password"
echo "=== AWX Health Check ==="
# Check web service
echo "1. Checking web interface..."
if curl -s -o /dev/null -w "%{http_code}" "$AWX_URL" | grep -q "200\|302"; then
echo " ✓ Web interface accessible"
else
echo " ✗ Web interface not accessible"
fi
# Check API
echo "2. Checking API..."
API_STATUS=$(curl -s -o /dev/null -w "%{http_code}" "$AWX_URL/api/v2/ping/")
if [ "$API_STATUS" = "200" ]; then
echo " ✓ API responding"
else
echo " ✗ API not responding (HTTP $API_STATUS)"
fi
# Check database connectivity
echo "3. Checking database..."
if docker exec awx_postgres psql -U awx -d awx -c "SELECT 1;" > /dev/null 2>&1; then
echo " ✓ Database accessible"
else
echo " ✗ Database connection failed"
fi
# Check Redis
echo "4. Checking Redis..."
if docker exec awx_redis redis-cli ping | grep -q "PONG"; then
echo " ✓ Redis responding"
else
echo " ✗ Redis connection failed"
fi
# Check authentication
echo "5. Testing authentication..."
TOKEN=$(curl -s -X POST "$AWX_URL/api/v2/tokens/" \
-H "Content-Type: application/json" \
-d "{\"username\":\"$ADMIN_USER\",\"password\":\"$ADMIN_PASSWORD\"}" | \
jq -r '.token // empty')
if [ -n "$TOKEN" ]; then
echo " ✓ Authentication successful"
else
echo " ✗ Authentication failed"
fi
echo "Health check complete."BashCommon Installation Issues and Solutions
Issue Resolution Matrix
| Problem | Symptoms | Cause | Solution |
|---|---|---|---|
| Port conflicts | Service won’t start | Ports 80/443 in use | Find the conflicting process with sudo ss -tulpn, then stop it or change HOST_PORT/HOST_PORT_SSL |
| Memory issues | Container crashes | Insufficient RAM | Increase Docker memory limit or add swap |
| Database errors | Migration failures | PostgreSQL issues | Check logs: docker logs awx_postgres |
| Permission errors | File access denied | Volume mount issues | Fix ownership: sudo chown -R 1000:1000 /opt/awx |
| Network issues | Can’t reach managed hosts | Firewall/routing | Test connectivity: ansible all -m ping |
| SSL certificate | Browser warnings | Self-signed cert | Configure proper SSL or add exception |
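For the first row in particular, the usual fix sequence looks like this; nginx is just a hypothetical culprit, and HOST_PORT/HOST_PORT_SSL come from the .env file created during the quick installation:
# Identify what is holding ports 80/443
sudo ss -tulpn | grep -E ':80 |:443 '
# Option A: stop the conflicting service
sudo systemctl stop nginx
# Option B: move AWX to alternate ports and restart
sed -i 's/^HOST_PORT=.*/HOST_PORT=8080/' .env
sed -i 's/^HOST_PORT_SSL=.*/HOST_PORT_SSL=8443/' .env
docker-compose up -d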
Troubleshooting Commands
# View all AWX containers
docker ps | grep awx
# Check container logs
docker logs awx_web
docker logs awx_task
docker logs awx_postgres
docker logs awx_redis
# Access AWX database
docker exec -it awx_postgres psql -U awx -d awx
# Check AWX configuration
docker exec -it awx_web awx-manage shell
# Monitor resource usage
docker stats
# Test connectivity to managed hosts
docker exec -it awx_task ansible all -m ping -i /tmp/inventory
Backup and Disaster Recovery
Backup Strategy
graph TB
A[AWX Backup Components] --> B[Database Backup]
A --> C[File System Backup]
A --> D[Configuration Backup]
B --> E[PostgreSQL Dump]
B --> F[Point-in-Time Recovery]
C --> G[Project Files]
C --> H[Log Files]
C --> I[SSL Certificates]
D --> J[Environment Variables]
D --> K[Custom Settings]
D --> L[Inventory Files]
Database Backup Script
#!/bin/bash
# awx-backup.sh
BACKUP_DIR="/backup/awx/$(date +%Y%m%d_%H%M%S)"
mkdir -p "$BACKUP_DIR"
echo "Starting AWX backup to $BACKUP_DIR"
# Database backup
echo "Backing up PostgreSQL database..."
docker exec awx_postgres pg_dump -U awx awx > "$BACKUP_DIR/awx_database.sql"
# Projects backup
echo "Backing up project files..."
docker run --rm -v awx_projects:/source -v "$BACKUP_DIR":/backup \
alpine tar czf /backup/projects.tar.gz -C /source .
# Configuration backup
echo "Backing up configuration..."
cp .env "$BACKUP_DIR/"
docker-compose config > "$BACKUP_DIR/docker-compose.yml"
# Create backup manifest
cat > "$BACKUP_DIR/manifest.json" << EOF
{
"backup_date": "$(date -Iseconds)",
"awx_version": "$(docker exec awx_web awx-manage version)",
"database_size": "$(du -h $BACKUP_DIR/awx_database.sql | cut -f1)",
"projects_size": "$(du -h $BACKUP_DIR/projects.tar.gz | cut -f1)"
}
EOF
echo "Backup completed: $BACKUP_DIR"BashAutomated Backup with Retention
#!/bin/bash
# automated-backup.sh
BACKUP_BASE="/backup/awx"
RETENTION_DAYS=30
# Create daily backup
/usr/local/bin/awx-backup.sh
# Cleanup old backups
find "$BACKUP_BASE" -type d -mtime +$RETENTION_DAYS -exec rm -rf {} \;
# Upload to cloud storage (optional)
aws s3 sync "$BACKUP_BASE" s3://company-awx-backups/ --delete
Disaster Recovery Procedures
# disaster-recovery-plan.yml
recovery_procedures:
  complete_restore:
    steps:
      - "Deploy fresh AWX instance"
      - "Stop all services"
      - "Restore database from backup"
      - "Restore project files"
      - "Restore configuration"
      - "Start services and verify"
    commands: |
      # Stop services
      docker-compose down
      # Restore database
      docker-compose up -d postgres
      docker exec -i awx_postgres psql -U awx awx < /backup/awx_database.sql
      # Restore projects
      docker run --rm -v awx_projects:/target -v /backup:/source \
        alpine tar xzf /source/projects.tar.gz -C /target
      # Start all services
      docker-compose up -d
  partial_restore:
    database_only: |
      docker exec -i awx_postgres psql -U awx awx < /backup/awx_database.sql
      docker-compose restart web task
    projects_only: |
      docker run --rm -v awx_projects:/target -v /backup:/source \
        alpine tar xzf /source/projects.tar.gz -C /target
Legacy Docker Compose Installation (AWX 17 and earlier)
Older AWX releases shipped an Ansible-based installer; the steps below are kept only as a reference for maintaining such installs.
# Clone AWX repository
git clone https://github.com/ansible/awx.git
cd awx
# Create inventory file
cat > installer/inventory << EOF
[all:vars]
docker_compose_dir=/opt/awx
host_port=80
host_port_ssl=443
docker_compose=true
create_preload_data=True
admin_user=admin
admin_password=password
secret_key=awxsecret
EOF
# Install AWX
cd installer
ansible-playbook -i inventory install.yml
Kubernetes Installation
# awx-deployment.yaml
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx-demo
spec:
  service_type: nodeport
  nodeport_port: 30080
  admin_user: admin
  admin_password_secret: awx-admin-password
Initial Configuration
sequenceDiagram
participant User
participant AWX
participant Database
participant Redis
User->>AWX: Access Web Interface
AWX->>Database: Initialize Schema
AWX->>Redis: Setup Task Queue
AWX->>User: Display Login Page
User->>AWX: Admin Credentials
AWX->>User: Dashboard Access
4. Core Concepts and Architecture
Conceptual Framework
Understanding AWX requires grasping how its components work together to transform individual Ansible playbooks into enterprise-grade automation workflows.
mindmap
root((AWX Platform))
Organizations
Teams
Users
RBAC
Resources
Projects
Inventories
Credentials
Templates
Execution
Jobs
Workflows
Schedules
Management
Notifications
Logging
Monitoring
Component Relationship Map
graph TB
subgraph "Organizational Layer"
A[Organization] --> B[Team A]
A --> C[Team B]
B --> D[User 1]
B --> E[User 2]
C --> F[User 3]
end
subgraph "Resource Layer"
G[Projects] --> H[Git Repository]
I[Inventories] --> J[Static Hosts]
I --> K[Dynamic Sources]
L[Credentials] --> M[SSH Keys]
L --> N[Cloud Tokens]
L --> O[Vault Secrets]
end
subgraph "Execution Layer"
P[Job Templates] --> Q[Playbook Selection]
P --> R[Inventory Assignment]
P --> S[Credential Mapping]
T[Workflows] --> U[Template Chain]
T --> V[Conditional Logic]
end
A --> G
A --> I
A --> L
G --> P
I --> P
L --> P
P --> T
style A fill:#e1f5fe
style P fill:#f3e5f5
style T fill:#e8f5e8
Deep Dive: Key Components
1. Organizations – Tenant Isolation
Organizations provide multi-tenancy, allowing complete separation of resources and permissions.
# Organization structure example
organization:
  name: "Production Environment"
  description: "Production infrastructure automation"
  teams:
    - name: "Infrastructure Team"
      permissions: ["admin"]
      members: ["devops-lead", "senior-sre"]
    - name: "Development Team"
      permissions: ["execute", "read"]
      members: ["dev-user1", "dev-user2"]
  resources:
    projects: ["web-app-deployment", "database-management"]
    inventories: ["production-servers", "staging-environment"]
    credentials: ["aws-production", "ssh-keys"]
Organization Design Patterns:
graph LR
subgraph "By Environment"
A1[Development Org]
A2[Staging Org]
A3[Production Org]
end
subgraph "By Business Unit"
B1[Finance IT Org]
B2[Marketing IT Org]
B3[Engineering Org]
end
subgraph "By Geographic Region"
C1[US East Org]
C2[EU West Org]
C3[APAC Org]
end
2. Projects – Source Code Management
Projects link AWX to your automation code repositories and manage playbook lifecycle.
# Advanced project configuration
project:
  name: "Multi-Cloud Infrastructure"
  scm_type: "git"
  scm_url: "https://github.com/company/ansible-playbooks.git"
  scm_branch: "main"
  scm_clean: true
  scm_delete_on_update: true
  scm_update_on_launch: true
  scm_update_cache_timeout: 300
  # Webhook integration
  webhook_service: "github"
  webhook_credential: "github-webhook-token"
  # Custom environment
  custom_virtualenv: "/var/lib/awx/venv/ansible-custom"
  # Execution environment
  execution_environment: "quay.io/ansible/creator-base:latest"
# Playbooks available in this repository
playbooks:
  infrastructure:
    - "aws/ec2-provision.yml"
    - "aws/vpc-setup.yml"
    - "gcp/compute-deploy.yml"
    - "azure/vm-creation.yml"
  applications:
    - "apps/web-server-setup.yml"
    - "apps/database-configuration.yml"
    - "apps/load-balancer-config.yml"
  maintenance:
    - "maintenance/system-updates.yml"
    - "maintenance/log-rotation.yml"
    - "maintenance/backup-procedures.yml"
Project Synchronization Flow:
sequenceDiagram
participant AWX
participant Git
participant Webhook
participant Job
Note over AWX,Job: Manual Sync
AWX->>Git: Fetch latest changes
Git-->>AWX: Updated playbooks
AWX->>AWX: Update project files
Note over AWX,Job: Automatic Sync
Webhook->>AWX: Repository updated
AWX->>Git: Fetch changes
Git-->>AWX: New playbooks
AWX->>Job: Trigger dependent jobs
3. Inventories – Target Management
Inventories define what systems your automation will manage, supporting both static and dynamic sources.
Static Inventory Example:
# production-inventory.ini
[webservers]
web01.company.com ansible_host=10.0.1.10
web02.company.com ansible_host=10.0.1.11
web03.company.com ansible_host=10.0.1.12
[databases]
db01.company.com ansible_host=10.0.2.10
db02.company.com ansible_host=10.0.2.11
[loadbalancers]
lb01.company.com ansible_host=10.0.3.10
# Group variables
[webservers:vars]
http_port=80
https_port=443
app_user=webapp
[databases:vars]
db_port=5432
backup_schedule="0 2 * * *"
[all:vars]
ansible_user=ubuntu
ansible_ssh_private_key_file=/var/lib/awx/.ssh/id_rsa
environment=production
Dynamic Inventory Architecture:
graph TB
subgraph "AWX Dynamic Inventory System"
A[AWX Inventory Management] --> B[Inventory Sources]
B --> C[Cloud Providers]
B --> D[Virtualization Platforms]
B --> E[Container Orchestrators]
B --> F[Network Devices]
B --> G[Custom Sources]
C --> H[AWS EC2]
C --> I[GCP Compute Engine]
C --> J[Azure VMs]
C --> K[DigitalOcean]
C --> L[OpenStack]
D --> M[VMware vCenter]
D --> N[Red Hat Virtualization]
D --> O[Proxmox]
E --> P[Kubernetes]
E --> Q[OpenShift]
E --> R[Docker Swarm]
F --> S[Cisco Devices]
F --> T[Juniper Networks]
F --> U[F5 Load Balancers]
G --> V[Custom Scripts]
G --> W[External APIs]
G --> X[Database Sources]
end
subgraph "Processing Layer"
Y[Plugin Execution] --> Z[Data Transformation]
Z --> AA[Group Generation]
AA --> BB[Variable Assignment]
BB --> CC[Smart Inventory Filtering]
end
subgraph "Caching & Performance"
DD[Cache Management]
EE[Update Schedules]
FF[Sync Optimization]
end
B --> Y
CC --> DD
DD --> EE
EE --> FF
style A fill:#e1f5fe
style CC fill:#f3e5f5
style DD fill:#e8f5e8
Dynamic Inventory Deep Dive
Core Concepts and Benefits
Dynamic inventory revolutionizes infrastructure management by automatically discovering and categorizing your infrastructure resources. Unlike static inventory files that require manual maintenance, dynamic inventory adapts to your changing infrastructure in real-time.
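You can watch this discovery happen outside AWX, too: point ansible-inventory at a plugin configuration file and inspect what it returns. A minimal sketch, assuming the amazon.aws collection and boto3 are installed and AWS credentials are in the environment (the file name must end in aws_ec2.yml for plugin auto-detection):
cat > demo.aws_ec2.yml << 'EOF'
plugin: amazon.aws.aws_ec2
regions:
  - us-east-1
filters:
  instance-state-name: running
EOF
# Show discovered hosts as a group tree, then with full variables
ansible-inventory -i demo.aws_ec2.yml --graph
ansible-inventory -i demo.aws_ec2.yml --list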
Key Benefits:
mindmap
root((Dynamic Inventory Benefits))
Automation
Auto-discovery
Real-time updates
Reduced manual effort
Accuracy
Always current
Eliminates drift
Prevents errors
Scalability
Handles growth
Multi-cloud support
Unlimited resources
Integration
Cloud-native
API-driven
Event-triggered
Multi-Cloud Dynamic Inventory Strategy
Unified Multi-Cloud Configuration:
# multi-cloud-inventory.yml
---
# AWS Configuration
aws_inventory:
  plugin: amazon.aws.aws_ec2
  regions:
    - us-east-1
    - us-west-2
    - eu-west-1
  # Advanced filtering
  filters:
    instance-state-name: running
    "tag:Environment": ["production", "staging"]
    "tag:Managed": "AWX"
  # Grouping strategies
  keyed_groups:
    # Group by environment
    - prefix: env
      key: tags.Environment | default('untagged')
    # Group by application
    - prefix: app
      key: tags.Application | default('unknown')
    # Group by instance type family
    - prefix: family
      key: instance_type.split('.')[0]
    # Group by availability zone
    - prefix: az
      key: placement.availability_zone
    # Group by VPC
    - prefix: vpc
      key: vpc_id
  # Host naming and variables
  hostnames:
    - tag:Name
    - private-ip-address
    - public-ip-address
  compose:
    ansible_host: public_ip_address | default(private_ip_address)
    ansible_user: >-
      'ec2-user' if platform.startswith('amzn') or platform.startswith('rhel')
      else 'ubuntu' if platform.startswith('ubuntu')
      else 'admin' if platform.startswith('debian')
      else 'centos' if platform.startswith('centos')
      else 'root'
    cloud_provider: "'aws'"
    region: placement.region
    cost_center: tags.CostCenter | default('unknown')
    backup_enabled: tags.BackupEnabled | default(false)
---
# GCP Configuration
gcp_inventory:
  plugin: google.cloud.gcp_compute
  projects:
    - production-project-123
    - staging-project-456
  zones:
    - us-central1-a
    - us-central1-b
    - us-west1-a
    - europe-west1-b
  # Authentication
  auth_kind: serviceaccount
  service_account_file: /path/to/service-account.json
  # Filtering
  filters:
    - status = RUNNING
    - labels.managed = awx
    - labels.environment = (production OR staging)
  # Grouping
  keyed_groups:
    - prefix: gcp_env
      key: labels.environment | default('untagged')
    - prefix: gcp_app
      key: labels.application | default('unknown')
    - prefix: gcp_zone
      key: zone.split('/')[-1]
    - prefix: gcp_type
      key: machineType.split('/')[-1].split('-')[0]
  # Variables
  compose:
    ansible_host: networkInterfaces[0].accessConfigs[0].natIP | default(networkInterfaces[0].networkIP)
    ansible_user: "'ubuntu'"
    cloud_provider: "'gcp'"
    machine_type: machineType.split('/')[-1]
    zone_name: zone.split('/')[-1]
    project_id: "'{{ gcp_project }}'"
---
# Azure Configuration
azure_inventory:
  plugin: azure.azcollection.azure_rm
  # Subscription and authentication
  include_vm_resource_groups:
    - production-rg
    - staging-rg
    - development-rg
  # Conditional groups based on tags
  conditional_groups:
    azure_prod: tags.environment == "production"
    azure_staging: tags.environment == "staging"
    azure_web: tags.role == "webserver"
    azure_db: tags.role == "database"
    azure_monitoring: tags.monitoring == "enabled"
  # Grouping strategies
  keyed_groups:
    - prefix: azure_env
      key: tags.environment | default('untagged')
    - prefix: azure_location
      key: location
    - prefix: azure_size
      key: properties.hardwareProfile.vmSize
    - prefix: azure_os
      key: properties.storageProfile.osDisk.osType
  # Host configuration
  hostnames:
    - public_ipv4_addresses
    - private_ipv4_addresses
    - name
  compose:
    ansible_host: public_ipv4_addresses[0] | default(private_ipv4_addresses[0])
    ansible_user: "'azureuser'"
    cloud_provider: "'azure'"
    vm_size: properties.hardwareProfile.vmSize
    os_type: properties.storageProfile.osDisk.osType
    resource_group: resourceGroup
Advanced Smart Inventory Patterns
Smart Inventory Configuration Examples:
# Smart inventory filters for different scenarios
smart_inventories:
  # Production web tier across all clouds
  production_web_tier:
    name: "Production Web Servers"
    description: "All production web servers across AWS, GCP, and Azure"
    filter: >
      (variables__cloud_provider="aws" and group_names__contains="env_production" and group_names__contains="app_web") or
      (variables__cloud_provider="gcp" and group_names__contains="gcp_env_production" and group_names__contains="gcp_app_web") or
      (variables__cloud_provider="azure" and group_names__contains="azure_env_production" and group_names__contains="azure_web")
    variables:
      deployment_tier: "web"
      monitoring_enabled: true
      backup_schedule: "daily"
  # High-performance compute instances
  compute_intensive:
    name: "High-Performance Compute"
    description: "Instances optimized for compute-intensive workloads"
    filter: >
      (variables__instance_type__startswith="c5" or
       variables__instance_type__startswith="c6i" or
       variables__machine_type__startswith="c2-" or
       variables__machine_type__startswith="n2-highcpu" or
       variables__vm_size__startswith="F") and
      variables__status="running"
    variables:
      performance_tier: "high"
      monitoring_interval: 30
      cost_optimization: false
  # Multi-region disaster recovery
  disaster_recovery_targets:
    name: "Disaster Recovery Infrastructure"
    description: "Infrastructure designated for disaster recovery"
    filter: >
      variables__tags__DR="enabled" or
      variables__labels__disaster_recovery="true" or
      variables__backup_enabled=true
    variables:
      backup_retention: "30days"
      replication_enabled: true
      priority: "high"
  # Cost optimization candidates
  cost_optimization_targets:
    name: "Cost Optimization Candidates"
    description: "Resources that could benefit from cost optimization"
    filter: >
      (variables__instance_type__startswith="m5.large" or
       variables__instance_type__startswith="m5.xlarge") and
      variables__cpu_utilization__lt=20 and
      variables__age__gt=30
    variables:
      optimization_candidate: true
      recommended_action: "downsize"
      estimated_savings: "calculated"
  # Security compliance group
  compliance_required:
    name: "Compliance-Required Systems"
    description: "Systems requiring special security compliance"
    filter: >
      variables__tags__Compliance="PCI-DSS" or
      variables__tags__Compliance="HIPAA" or
      variables__tags__Compliance="SOC2" or
      variables__labels__compliance="required"
    variables:
      security_scan_frequency: "daily"
      patch_window: "emergency-only"
      audit_logging: "enhanced"
      encryption_required: true
Custom Dynamic Inventory Scripts
Advanced Custom Inventory Script Example:
#!/usr/bin/env python3
"""
Advanced Multi-Source Dynamic Inventory Script
Combines data from multiple sources including cloud providers, CMDBs, and monitoring systems
"""
import json
import sys
import argparse
import requests
import boto3
from google.cloud import compute_v1
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient
import yaml
import os
from datetime import datetime, timedelta
class MultiSourceInventory:
def __init__(self):
self.inventory = {
'_meta': {
'hostvars': {}
}
}
self.config = self.load_config()
def load_config(self):
"""Load configuration from YAML file"""
config_path = os.environ.get('INVENTORY_CONFIG', 'inventory_config.yml')
try:
with open(config_path, 'r') as f:
return yaml.safe_load(f)
except FileNotFoundError:
return self.default_config()
def default_config(self):
"""Default configuration if file not found"""
return {
'sources': {
'aws': {'enabled': True, 'regions': ['us-east-1', 'us-west-2']},
'gcp': {'enabled': True, 'projects': []},
'azure': {'enabled': True, 'subscription_ids': []},
'cmdb': {'enabled': False, 'url': '', 'token': ''},
'monitoring': {'enabled': False, 'prometheus_url': ''}
},
'enrichment': {
'add_monitoring_data': True,
'add_cost_data': True,
'add_security_scan_results': False
},
'filtering': {
'exclude_stopped': True,
'minimum_uptime_hours': 1,
'include_tags': ['Managed=AWX']
}
}
def get_aws_instances(self):
"""Fetch AWS EC2 instances"""
if not self.config['sources']['aws']['enabled']:
return
for region in self.config['sources']['aws']['regions']:
try:
ec2 = boto3.client('ec2', region_name=region)
# Build filters
filters = [
{'Name': 'instance-state-name', 'Values': ['running']}
]
for tag_filter in self.config['filtering']['include_tags']:
key, value = tag_filter.split('=')
filters.append({
'Name': f'tag:{key}',
'Values': [value]
})
response = ec2.describe_instances(Filters=filters)
for reservation in response['Reservations']:
for instance in reservation['Instances']:
self.process_aws_instance(instance, region)
except Exception as e:
print(f"Error fetching AWS instances in {region}: {e}", file=sys.stderr)
def process_aws_instance(self, instance, region):
"""Process individual AWS instance"""
instance_id = instance['InstanceId']
# Extract tags
tags = {tag['Key']: tag['Value'] for tag in instance.get('Tags', [])}
# Determine hostname
hostname = (
tags.get('Name') or
instance.get('PublicIpAddress') or
instance.get('PrivateIpAddress') or
instance_id
)
# Build host variables
hostvars = {
'ansible_host': instance.get('PublicIpAddress') or instance.get('PrivateIpAddress'),
'ansible_user': self.determine_ssh_user(instance.get('Platform', 'linux')),
'instance_id': instance_id,
'instance_type': instance['InstanceType'],
'cloud_provider': 'aws',
'region': region,
'availability_zone': instance['Placement']['AvailabilityZone'],
'vpc_id': instance['VpcId'],
'subnet_id': instance['SubnetId'],
'private_ip': instance.get('PrivateIpAddress'),
'public_ip': instance.get('PublicIpAddress'),
'launch_time': instance['LaunchTime'].isoformat(),
'state': instance['State']['Name'],
'tags': tags,
'architecture': instance.get('Architecture', 'x86_64'),
'hypervisor': instance.get('Hypervisor', 'xen'),
'monitoring': instance.get('Monitoring', {}).get('State', 'disabled'),
'security_groups': [sg['GroupName'] for sg in instance.get('SecurityGroups', [])]
}
# Add enrichment data
if self.config['enrichment']['add_monitoring_data']:
hostvars.update(self.get_monitoring_data(instance_id, 'aws'))
if self.config['enrichment']['add_cost_data']:
hostvars.update(self.get_cost_data(instance_id, 'aws'))
# Store host variables
self.inventory['_meta']['hostvars'][hostname] = hostvars
# Add to groups
self.add_to_groups(hostname, hostvars)
def get_gcp_instances(self):
"""Fetch GCP Compute Engine instances"""
if not self.config['sources']['gcp']['enabled']:
return
try:
client = compute_v1.InstancesClient()
for project in self.config['sources']['gcp']['projects']:
# List all zones in project
zones_client = compute_v1.ZonesClient()
zones = zones_client.list(project=project)
for zone in zones:
try:
instances = client.list(project=project, zone=zone.name)
for instance in instances:
if instance.status == 'RUNNING':
self.process_gcp_instance(instance, project, zone.name)
except Exception as e:
print(f"Error fetching GCP instances in {zone.name}: {e}", file=sys.stderr)
except Exception as e:
print(f"Error initializing GCP client: {e}", file=sys.stderr)
def process_gcp_instance(self, instance, project, zone):
"""Process individual GCP instance"""
name = instance.name
# Extract labels (GCP's version of tags)
labels = dict(instance.labels) if instance.labels else {}
# Get network interfaces
network_interface = instance.network_interfaces[0] if instance.network_interfaces else {}
internal_ip = network_interface.network_i_p if network_interface else None
external_ip = None
if network_interface and network_interface.access_configs:
external_ip = network_interface.access_configs[0].nat_i_p
# Build host variables
hostvars = {
'ansible_host': external_ip or internal_ip,
'ansible_user': 'ubuntu', # or determine based on image
'instance_name': name,
'machine_type': instance.machine_type.split('/')[-1],
'cloud_provider': 'gcp',
'project': project,
'zone': zone,
'region': '-'.join(zone.split('-')[:-1]),
'internal_ip': internal_ip,
'external_ip': external_ip,
'creation_timestamp': instance.creation_timestamp,
'status': instance.status,
'labels': labels,
'self_link': instance.self_link,
'can_ip_forward': instance.can_ip_forward,
'scheduling': {
'automatic_restart': instance.scheduling.automatic_restart,
'on_host_maintenance': instance.scheduling.on_host_maintenance,
'preemptible': instance.scheduling.preemptible
}
}
# Add disk information
if instance.disks:
hostvars['disks'] = []
for disk in instance.disks:
disk_info = {
'device_name': disk.device_name,
'boot': disk.boot,
'auto_delete': disk.auto_delete,
'type': disk.type_,
'mode': disk.mode
}
hostvars['disks'].append(disk_info)
# Store host variables
self.inventory['_meta']['hostvars'][name] = hostvars
# Add to groups
self.add_to_groups(name, hostvars)
def get_azure_instances(self):
"""Fetch Azure Virtual Machines"""
if not self.config['sources']['azure']['enabled']:
return
try:
credential = DefaultAzureCredential()
for subscription_id in self.config['sources']['azure']['subscription_ids']:
compute_client = ComputeManagementClient(credential, subscription_id)
# Get all VMs in subscription
vms = compute_client.virtual_machines.list_all()
for vm in vms:
if vm.instance_view and vm.instance_view.statuses:
# Check if VM is running
power_state = next(
(status.display_status for status in vm.instance_view.statuses
if status.code.startswith('PowerState/')),
'Unknown'
)
if 'running' in power_state.lower():
self.process_azure_instance(vm, subscription_id)
except Exception as e:
print(f"Error fetching Azure instances: {e}", file=sys.stderr)
def process_azure_instance(self, vm, subscription_id):
"""Process individual Azure instance"""
name = vm.name
# Extract tags
tags = dict(vm.tags) if vm.tags else {}
# Build host variables
hostvars = {
'ansible_host': None, # Will be filled by network interface lookup
'ansible_user': 'azureuser',
'vm_name': name,
'vm_size': vm.hardware_profile.vm_size,
'cloud_provider': 'azure',
'subscription_id': subscription_id,
'resource_group': vm.id.split('/')[4], # Extract from resource ID
'location': vm.location,
'vm_id': vm.vm_id,
'tags': tags,
'os_type': vm.storage_profile.os_disk.os_type.value if vm.storage_profile.os_disk.os_type else 'Unknown',
'provisioning_state': vm.provisioning_state
}
# Store host variables
self.inventory['_meta']['hostvars'][name] = hostvars
# Add to groups
self.add_to_groups(name, hostvars)
def add_to_groups(self, hostname, hostvars):
"""Add host to appropriate groups based on its properties"""
# Cloud provider group
cloud_group = f"{hostvars['cloud_provider']}_instances"
self.add_host_to_group(hostname, cloud_group)
# Environment group (from tags/labels)
env = (
hostvars.get('tags', {}).get('Environment') or
hostvars.get('tags', {}).get('environment') or
hostvars.get('labels', {}).get('environment') or
'untagged'
)
self.add_host_to_group(hostname, f"env_{env}")
# Application group
app = (
hostvars.get('tags', {}).get('Application') or
hostvars.get('tags', {}).get('application') or
hostvars.get('labels', {}).get('application') or
'unknown'
)
self.add_host_to_group(hostname, f"app_{app}")
# Region group
region = hostvars.get('region', 'unknown')
self.add_host_to_group(hostname, f"region_{region}")
# Instance type/size group
instance_type = (
hostvars.get('instance_type') or
hostvars.get('machine_type') or
hostvars.get('vm_size') or
'unknown'
)
self.add_host_to_group(hostname, f"type_{instance_type}")
# Add custom groups based on configuration
self.add_custom_groups(hostname, hostvars)
def add_host_to_group(self, hostname, group_name):
"""Add host to specified group"""
if group_name not in self.inventory:
self.inventory[group_name] = {'hosts': []}
if hostname not in self.inventory[group_name]['hosts']:
self.inventory[group_name]['hosts'].append(hostname)
def add_custom_groups(self, hostname, hostvars):
"""Add hosts to custom groups based on complex logic"""
# High availability group
if any([
hostvars.get('tags', {}).get('HighAvailability') == 'true',
hostvars.get('labels', {}).get('high_availability') == 'true',
'ha' in hostvars.get('instance_type', '').lower()
]):
self.add_host_to_group(hostname, 'high_availability')
# Monitoring-enabled group
if any([
hostvars.get('monitoring') == 'enabled',
hostvars.get('tags', {}).get('Monitoring') == 'enabled',
hostvars.get('labels', {}).get('monitoring') == 'enabled'
]):
self.add_host_to_group(hostname, 'monitoring_enabled')
# Cost-sensitive group
cost_sensitive_types = ['t2.', 't3.', 'f1-micro', 'g1-small', 'Standard_B']
if any(cs_type in hostvars.get('instance_type', '') or
cs_type in hostvars.get('machine_type', '') or
cs_type in hostvars.get('vm_size', '') for cs_type in cost_sensitive_types):
self.add_host_to_group(hostname, 'cost_optimized')
def determine_ssh_user(self, platform):
"""Determine SSH user based on platform"""
platform_users = {
'amazon': 'ec2-user',
'rhel': 'ec2-user',
'centos': 'centos',
'ubuntu': 'ubuntu',
'debian': 'admin',
'suse': 'ec2-user',
'windows': 'Administrator'
}
for platform_key, user in platform_users.items():
if platform_key in platform.lower():
return user
return 'root' # Default fallback
def get_monitoring_data(self, instance_id, cloud_provider):
"""Fetch monitoring data for instance"""
if not self.config['enrichment']['add_monitoring_data']:
return {}
try:
# Simulate monitoring data - replace with actual monitoring system calls
monitoring_data = {
'cpu_utilization_avg': 45.2,
'memory_utilization_avg': 62.8,
'disk_utilization_avg': 23.1,
'network_in_bytes': 1024000,
'network_out_bytes': 2048000,
'last_monitoring_update': datetime.now().isoformat()
}
return monitoring_data
except Exception:
return {}
def get_cost_data(self, instance_id, cloud_provider):
"""Fetch cost data for instance"""
if not self.config['enrichment']['add_cost_data']:
return {}
try:
# Simulate cost data - replace with actual cost management API calls
cost_data = {
'monthly_cost_estimate': 89.50,
'daily_cost_average': 2.98,
'cost_center': 'IT-Infrastructure',
'cost_optimization_score': 7.2,
'last_cost_update': datetime.now().isoformat()
}
return cost_data
except Exception:
return {}
def run(self):
"""Main execution method"""
# Fetch from all enabled sources
self.get_aws_instances()
self.get_gcp_instances()
self.get_azure_instances()
# Apply post-processing filters
self.apply_filters()
return self.inventory
def apply_filters(self):
"""Apply post-processing filters"""
# Remove hosts that don't meet minimum uptime
min_uptime = self.config['filtering'].get('minimum_uptime_hours', 0)
if min_uptime > 0:
cutoff_time = datetime.now() - timedelta(hours=min_uptime)
hosts_to_remove = []
for hostname, hostvars in self.inventory['_meta']['hostvars'].items():
launch_time_str = hostvars.get('launch_time') or hostvars.get('creation_timestamp')
if launch_time_str:
try:
launch_time = datetime.fromisoformat(launch_time_str.replace('Z', '+00:00'))
if launch_time > cutoff_time:
hosts_to_remove.append(hostname)
except ValueError:
pass # Skip if time parsing fails
# Remove hosts from inventory
for hostname in hosts_to_remove:
del self.inventory['_meta']['hostvars'][hostname]
# Remove from all groups
for group_name, group_data in self.inventory.items():
if group_name != '_meta' and 'hosts' in group_data:
if hostname in group_data['hosts']:
group_data['hosts'].remove(hostname)
def main():
"""Main function"""
parser = argparse.ArgumentParser(description='Multi-source dynamic inventory')
parser.add_argument('--list', action='store_true',
help='List all hosts (default behavior)')
parser.add_argument('--host', metavar='HOST',
help='Get variables for specific host')
args = parser.parse_args()
inventory = MultiSourceInventory()
if args.host:
# Return variables for specific host
result = inventory.run()
host_vars = result['_meta']['hostvars'].get(args.host, {})
print(json.dumps(host_vars, indent=2, default=str))
else:
# Return full inventory
result = inventory.run()
print(json.dumps(result, indent=2, default=str))
if __name__ == '__main__':
main()
Configuration File for Custom Script (inventory_config.yml):
# inventory_config.yml
sources:
  aws:
    enabled: true
    regions:
      - us-east-1
      - us-west-2
      - eu-west-1
  gcp:
    enabled: true
    projects:
      - production-project-123456
      - staging-project-789012
  azure:
    enabled: true
    subscription_ids:
      - "12345678-1234-1234-1234-123456789012"
      - "87654321-4321-4321-4321-210987654321"
  cmdb:
    enabled: false
    url: "https://cmdb.company.com/api/v1/hosts"
    token: "{{ cmdb_api_token }}"
  monitoring:
    enabled: true
    prometheus_url: "https://prometheus.company.com"
    grafana_url: "https://grafana.company.com"
enrichment:
  add_monitoring_data: true
  add_cost_data: true
  add_security_scan_results: false
  add_compliance_status: true
filtering:
  exclude_stopped: true
  minimum_uptime_hours: 1
  include_tags:
    - "Managed=AWX"
    - "Environment=production"
    - "Environment=staging"
  exclude_tags:
    - "DoNotManage=true"
    - "Temporary=true"
grouping:
  custom_groups:
    database_servers:
      criteria:
        - "tags.Role=database"
        - "labels.role=database"
        - "app_*database*"
    web_servers:
      criteria:
        - "tags.Role=webserver"
        - "labels.role=web"
        - "app_*web*"
    monitoring_targets:
      criteria:
        - "tags.Monitoring=enabled"
        - "labels.monitoring=true"
performance:
  cache_duration_minutes: 15
  parallel_execution: true
  max_workers: 10
  timeout_seconds: 300
Performance Optimization Strategies
Caching and Update Strategies:
graph TD
A[Dynamic Inventory Request] --> B{Cache Valid?}
B -->|Yes| C[Return Cached Data]
B -->|No| D[Fetch Fresh Data]
D --> E[Parallel Source Queries]
E --> F[AWS API]
E --> G[GCP API]
E --> H[Azure API]
E --> I[Custom Sources]
F --> J[Merge Results]
G --> J
H --> J
I --> J
J --> K[Apply Filters]
K --> L[Update Cache]
L --> M[Return Fresh Data]
N[Background Refresh] --> O[Scheduled Updates]
O --> P[Incremental Sync]
P --> L
style C fill:#e8f5e8
style M fill:#e8f5e8
AWX Inventory Source Configuration for Performance:
# High-performance inventory source configuration
inventory_source_config:
  name: "Multi-Cloud Production Inventory"
  source: "scm" # Use SCM for custom script
  source_script: "dynamic_inventory.py"
  # Performance settings
  update_on_launch: false # Disable for faster job starts
  update_cache_timeout: 900 # 15 minutes
  overwrite: true
  overwrite_vars: false
  # Scheduling for background updates
  custom_virtualenv: "/var/lib/awx/venv/multi-cloud"
  timeout: 300
  verbosity: 1
  # Environment variables
  source_vars: |
    INVENTORY_CONFIG: /var/lib/awx/projects/inventory_config.yml
    AWS_DEFAULT_REGION: us-east-1
    GOOGLE_APPLICATION_CREDENTIALS: /var/lib/awx/gcp-service-account.json
    AZURE_SUBSCRIPTION_ID: "{{ azure_subscription_id }}"
  # Update schedule
  update_on_project_update: true
  custom_cron_schedule: "*/15 * * * *" # Every 15 minutes
Smart Inventory Examples:
# Smart inventory filters
smart_inventories:
  production_web_servers:
    filter: >
      group_names__contains="webservers" and
      variables__environment="production" and
      variables__status="active"
  aws_instances_by_type:
    filter: >
      variables__cloud_provider="aws" and
      variables__instance_type__startswith="t3"
  outdated_systems:
    filter: >
      variables__last_updated__lt="2024-01-01" or
      variables__os_version__lt="20.04"
4. Credentials – Security Management
Credentials securely store authentication material and inject it into automation jobs without exposing sensitive data to users.
Credential Type Matrix:
| Type | Use Case | Fields | Security Features |
|---|---|---|---|
| Machine | SSH/WinRM access | Username, SSH key, Password | Encrypted storage, No user access |
| SCM | Git repository access | Username, Token/Password | Token rotation, Webhook secrets |
| Cloud | AWS/GCP/Azure APIs | Access keys, Service accounts | IAM role assumption, Temporary tokens |
| Vault | External secret stores | Vault token, Namespace | Dynamic secrets, Lease management |
| Network | Network device access | Username, Privilege escalation | Connection pooling, Session management |
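Credentials in the matrix above are created through the same API as everything else; the secret fields are write-only and come back masked on reads. A hedged sketch for a Machine credential (the numeric type ID, organization ID, and key path are placeholders; look up the Machine type ID under /api/v2/credential_types/ first):
# Build the JSON body with jq so the private key is escaped correctly
jq -n --rawfile key ~/.ssh/id_rsa \
  '{name: "ssh-deployment-key", organization: 1, credential_type: 1,
    inputs: {username: "ubuntu", ssh_key_data: $key}}' |
curl -s -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -X POST -d @- https://awx.example.com/api/v2/credentials/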
Advanced Credential Configuration:
# AWS credential with role assumption
aws_credential:
  name: "AWS Production Account"
  credential_type: "Amazon Web Services"
  inputs:
    username: "AKIA..."
    password: "secret_access_key"
    security_token: "optional_session_token"
  # Role assumption for cross-account access
  metadata:
    assume_role_arn: "arn:aws:iam::123456789012:role/AWXExecutionRole"
    external_id: "unique-external-id"
# Vault integration credential
vault_credential:
  name: "HashiCorp Vault"
  credential_type: "HashiCorp Vault Secret Lookup"
  inputs:
    url: "https://vault.company.com"
    token: "hvs.AAAA..."
    namespace: "production"
    auth_method: "aws"
5. Job Templates – Execution Blueprints
Job templates combine projects, inventories, and credentials into reusable automation units.
Job Template Configuration Levels:
graph TD
A[Job Template] --> B[Basic Configuration]
A --> C[Advanced Options]
A --> D[Survey Specification]
A --> E[Access Control]
B --> F[Project Selection]
B --> G[Playbook Choice]
B --> H[Inventory Assignment]
B --> I[Credential Mapping]
C --> J[Execution Environment]
C --> K[Instance Groups]
C --> L[Timeout Settings]
C --> M[Verbosity Level]
D --> N[Runtime Variables]
D --> O[User Input Forms]
D --> P[Validation Rules]
E --> Q[Team Permissions]
E --> R[User Access]
E --> S[Execution Rights]
Production Job Template Example:
job_template:
name: "Multi-Cloud Infrastructure Deployment"
description: "Deploy standardized infrastructure across AWS, GCP, and Azure"
# Core configuration
project: "infrastructure-as-code"
playbook: "site.yml"
inventory: "cloud-inventory"
# Credentials
credentials:
- "aws-production"
- "gcp-service-account"
- "azure-subscription"
- "ssh-deployment-key"
# Execution settings
execution_environment: "custom-cloud-ee"
instance_groups: ["cloud-workers"]
job_type: "run"
verbosity: 1
timeout: 3600
forks: 50
# Advanced options
become_enabled: true
allow_simultaneous: false
use_fact_cache: true
webhook_service: "github"
# Survey for runtime customization
survey_enabled: true
survey_spec:
- name: "target_environment"
type: "multiplechoice"
choices: ["development", "staging", "production"]
required: true
- name: "instance_count"
type: "integer"
min_value: 1
max_value: 10
default: 3
- name: "enable_monitoring"
type: "boolean"
default: true
6. Workflows – Complex Orchestration
Workflows chain multiple job templates together with conditional logic and approval gates.
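Workflows run asynchronously, so callers typically launch them and then poll for a terminal state. A small sketch, reusing the AWX_URL/HEADERS placeholders from the job-template example above and a hypothetical workflow job ID:
import time
import requests

def wait_for_workflow(awx_url, headers, workflow_job_id, poll_seconds=10):
    """Poll a workflow job until it reaches a terminal state."""
    terminal = {"successful", "failed", "error", "canceled"}
    while True:
        resp = requests.get(
            f"{awx_url}/api/v2/workflow_jobs/{workflow_job_id}/",
            headers=headers,
            verify=False,
        )
        resp.raise_for_status()
        status = resp.json()["status"]
        if status in terminal:
            return status
        time.sleep(poll_seconds)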
Workflow Design Patterns:
graph TD
A[Infrastructure Deployment Workflow] --> B[Pre-deployment Checks]
B --> C{Environment Ready?}
C -->|No| D[Fix Issues]
C -->|Yes| E[Deploy Infrastructure]
D --> B
E --> F[Deploy Applications]
F --> G{Deployment Success?}
G -->|No| H[Rollback Infrastructure]
G -->|Yes| I[Run Tests]
H --> J[Notify Failure]
I --> K{Tests Pass?}
K -->|No| H
K -->|Yes| L[Update DNS]
L --> M[Enable Monitoring]
M --> N[Notify Success]
style C fill:#fff9c4
style G fill:#fff9c4
style K fill:#fff9c4
Project Definition Example:
# project-structure.yaml
project:
name: "Cloud Infrastructure"
scm_type: "git"
scm_url: "https://github.com/company/ansible-playbooks.git"
scm_branch: "main"
playbooks:
- aws-ec2-provision.yml
- gcp-compute-setup.yml
- azure-vm-deployment.yml
3. Inventories
Collections of hosts that can be targeted by jobs.
[aws_instances]
web-server-1 ansible_host=10.0.1.10
web-server-2 ansible_host=10.0.1.11
[gcp_instances]
app-server-1 ansible_host=10.1.1.10
app-server-2 ansible_host=10.1.1.11
[azure_instances]
db-server-1 ansible_host=10.2.1.10
db-server-2 ansible_host=10.2.1.11
[webservers:children]
aws_instances
gcp_instances
[databases:children]
azure_instances
4. Job Templates
Reusable definitions for running Ansible playbooks.
graph LR
A[Job Template] --> B[Project]
A --> C[Inventory]
A --> D[Credentials]
A --> E[Playbook]
F[Job Launch] --> A
A --> G[Job Execution]
G --> H[Results]
4. Basic Operations {#basic-operations}
Creating Your First Project
sequenceDiagram
participant User
participant AWX
participant Git
User->>AWX: Create New Project
AWX->>Git: Clone Repository
Git-->>AWX: Playbook Files
AWX->>AWX: Sync Project
AWX-->>User: Project Ready
Step-by-Step Project Creation
- Navigate to Projects
- Click “Add” Button
- Fill Project Details:
name: "Multi-Cloud Infrastructure"
organization: "Default"
scm_type: "Git"
scm_url: "https://github.com/your-org/playbooks.git"
scm_branch: "main"
scm_clean: true
scm_update_on_launch: true
Creating Inventories
Static Inventory
[production]
web1.example.com
web2.example.com
db1.example.com
[staging]
stage-web.example.com
stage-db.example.com
[all:vars]
ansible_user=ubuntu
ansible_ssh_private_key_file=/path/to/key
Dynamic Inventory
#!/usr/bin/env python3
# aws_dynamic_inventory.py
import argparse
import json
import boto3

def get_aws_instances():
    ec2 = boto3.client('ec2')
    inventory = {'aws': {'hosts': []}, '_meta': {'hostvars': {}}}
    # Filter server-side: only running instances
    pages = ec2.get_paginator('describe_instances').paginate(
        Filters=[{'Name': 'instance-state-name', 'Values': ['running']}]
    )
    for page in pages:
        for reservation in page['Reservations']:
            for instance in reservation['Instances']:
                # Not every instance has a public IP; fall back to the private one
                host = instance.get('PublicIpAddress') or instance.get('PrivateIpAddress')
                if not host:
                    continue
                inventory['aws']['hosts'].append(host)
                inventory['_meta']['hostvars'][host] = {
                    'instance_id': instance['InstanceId'],
                    'instance_type': instance['InstanceType'],
                    'region': instance['Placement']['AvailabilityZone'][:-1]
                }
    return inventory

if __name__ == '__main__':
    # Inventory scripts must answer --list (and tolerate --host <name>)
    parser = argparse.ArgumentParser()
    parser.add_argument('--list', action='store_true')
    parser.add_argument('--host')
    args = parser.parse_args()
    print(json.dumps({} if args.host else get_aws_instances()))
Credential Management
graph TD
A[Credential Types] --> B[Machine]
A --> C[Source Control]
A --> D[Cloud Providers]
A --> E[Network]
B --> F[SSH Keys]
C --> G[Git Tokens]
D --> H[AWS Keys]
D --> I[GCP Service Account]
D --> J[Azure Credentials]
E --> K[Network Device Auth]
Running Your First Job
sequenceDiagram
participant User
participant AWX
participant Target
User->>AWX: Launch Job Template
AWX->>AWX: Validate Parameters
AWX->>Target: Execute Playbook
Target-->>AWX: Task Results
AWX->>AWX: Log Output
AWX-->>User: Job Completion
5. Basic Operations and Workflows {#basic-operations-and-workflows}
Your First AWX Automation Journey
This section provides hands-on exercises to build practical skills progressively. Each lab builds upon previous knowledge and introduces new concepts.
Lab Environment Setup
Before starting the labs, ensure you have:
graph TD
A[Lab Prerequisites] --> B[AWX Instance Running]
A --> C[Test Infrastructure]
A --> D[Git Repository]
A --> E[Cloud Accounts]
B --> F[Web UI Accessible]
B --> G[Admin Access]
C --> H[Linux VMs]
C --> I[Network Connectivity]
D --> J[Sample Playbooks]
D --> K[SSH Keys]
E --> L[AWS/GCP/Azure]
E --> M[API Credentials]
Lab 1: Foundation Setup (Beginner)
Creating Your First Organization
Objective: Set up organizational structure for multi-team collaboration.
Step-by-Step Process:
- Access AWX Web Interface
URL: https://your-awx-server
Username: admin
Password: your-admin-password
- Create Organization
organization:
name: "Learning Lab Organization"
description: "Hands-on learning environment for AWX"
max_hosts: 100
custom_virtualenv: ""YAML- Verification Script
# Verify organization creation via API
curl -k -H "Authorization: Bearer $AWX_TOKEN" \
"https://your-awx-server/api/v2/organizations/" | \
jq '.results[] | select(.name=="Learning Lab Organization")'
Setting Up Teams and Users
Team Structure Design:
graph TB
A[Learning Lab Organization] --> B[Infrastructure Team]
A --> C[Application Team]
A --> D[Security Team]
B --> E[Lead Admin]
B --> F[DevOps Engineer]
C --> G[App Developer]
C --> H[QA Engineer]
D --> I[Security Analyst]
D --> J[Compliance Officer]
style A fill:#e1f5fe
style B fill:#f3e5f5
style C fill:#e8f5e8
style D fill:#fff3e0
Implementation Commands:
# Create teams via AWX CLI (awx command-line tool)
awx teams create \
--name "Infrastructure Team" \
--organization "Learning Lab Organization" \
--description "Manages infrastructure automation"
awx teams create \
--name "Application Team" \
--organization "Learning Lab Organization" \
--description "Handles application deployment"BashLab 2: Project and Inventory Management (Beginner)
Creating Your First Project
Project Repository Structure:
ansible-lab-project/
├── playbooks/
│ ├── site.yml
│ ├── webserver-setup.yml
│ └── database-config.yml
├── roles/
│ ├── common/
│ ├── webserver/
│ └── database/
├── inventories/
│ ├── production/
│ └── staging/
├── group_vars/
├── host_vars/
└── requirements.yml
Sample Playbook (playbooks/webserver-setup.yml):
---
- name: Setup Web Server
hosts: webservers
become: yes
vars:
http_port: 80
max_clients: 200
tasks:
- name: Install Apache
package:
name: "{{ apache_package }}"
state: present
vars:
apache_package: "{{ 'httpd' if ansible_os_family == 'RedHat' else 'apache2' }}"
- name: Start Apache service
service:
name: "{{ apache_service }}"
state: started
enabled: true
vars:
apache_service: "{{ 'httpd' if ansible_os_family == 'RedHat' else 'apache2' }}"
- name: Configure firewall
firewalld:
port: "{{ http_port }}/tcp"
permanent: true
state: enabled
immediate: true
when: ansible_os_family == "RedHat"
- name: Deploy custom index page
template:
src: index.html.j2
dest: /var/www/html/index.html
mode: '0644'
notify: restart apache
handlers:
- name: restart apache
service:
name: "{{ apache_service }}"
state: restarted
Dynamic Inventory Configuration via AWX UI
The AWX web interface makes dynamic inventory setup largely point-and-click. This section provides step-by-step guidance for configuring cloud-based dynamic inventories.
UI Navigation and Setup Overview
graph TD
A[AWX Dashboard] --> B[Inventories Section]
B --> C[Create New Inventory]
C --> D[Add Inventory Source]
D --> E[Choose Cloud Provider]
E --> F[AWS EC2]
E --> G[Google Compute Engine]
E --> H[Microsoft Azure RM]
E --> I[VMware vCenter]
F --> J[Configure AWS Credentials]
G --> K[Configure GCP Credentials]
H --> L[Configure Azure Credentials]
J --> M[Set Regions & Filters]
K --> N[Set Projects & Zones]
L --> O[Set Subscriptions & Resource Groups]
M --> P[Configure Grouping & Variables]
N --> P
O --> P
P --> Q[Test & Sync Inventory]
Q --> R[Schedule Updates]
style A fill:#e1f5fe
style Q fill:#e8f5e8
style R fill:#f3e5f5
Step-by-Step AWS EC2 Dynamic Inventory Setup
Step 1: Navigate to Inventories
- Access AWX Dashboard
- Open your browser and navigate to https://your-awx-server
- Log in with your credentials
- From the main dashboard, click “Inventories” in the left navigation panel
- Create New Inventory
Navigation Path: Dashboard → Inventories → [+ Add] → Inventory
Step 2: Create Base Inventory
Basic Information:
Name: AWS Production Inventory
Description: Dynamic inventory for AWS EC2 instances in production environment
Organization: Production Org
Instance Groups: [Leave default or select specific groups]
Variables: {}
Advanced Options:
# Inventory Variables (Optional)
inventory_variables:
environment: production
cloud_provider: aws
monitoring_enabled: true
backup_retention_days: 30
Step 3: Add AWS EC2 Inventory Source
- Navigate to Sources Tab
- After creating the inventory, click on the “Sources” tab
- Click “Add” button to create a new inventory source
- Configure Source Details Basic Configuration:
Name: AWS EC2 Production Source
Description: Discovers EC2 instances in production regions
Source: Amazon EC2
Credential: [Select your AWS credential]
- Source Variables Configuration:
# Complete AWS EC2 source configuration
plugin: amazon.aws.aws_ec2
# Regional Configuration
regions:
- us-east-1
- us-west-2
- eu-west-1
# Instance Filtering
filters:
instance-state-name:
- running
- pending
"tag:Environment":
- production
- staging
"tag:Managed":
- AWX
- Ansible
# Advanced Grouping Strategy
keyed_groups:
# Group by environment
- prefix: env
key: tags.Environment | default('untagged')
separator: "_"
# Group by application
- prefix: app
key: tags.Application | default('unknown')
separator: "_"
# Group by instance type family
- prefix: family
key: instance_type.split('.')[0]
separator: "_"
# Group by availability zone
- prefix: az
key: placement.availability_zone
separator: "_"
# Group by VPC
- prefix: vpc
key: vpc_id
separator: "_"
# Group by security groups
- prefix: sg
key: security_groups[0].group_name
separator: "_"
# Conditional Groups
groups:
# Production web servers
production_web: >
tags.Environment == "production" and
tags.Role == "webserver"
# Database servers
database_servers: >
tags.Role == "database" or
"db" in tags.Name.lower()
# High-availability instances
high_availability: >
tags.HighAvailability == "true" or
instance_type.startswith(("c5", "m5", "r5"))
# Cost-optimized instances
cost_optimized: >
instance_type.startswith(("t3", "t4g", "a1"))
# Host Naming Strategy
hostnames:
- tag:Name
- private-ip-address
- public-ip-address
- instance-id
# Host Variables Composition
compose:
# Primary connection details
ansible_host: public_ip_address | default(private_ip_address)
ansible_user: >
'ec2-user' if (image.name | default('') | lower).startswith(('amzn', 'rhel'))
else 'ubuntu' if (image.name | default('') | lower).startswith('ubuntu')
else 'admin' if (image.name | default('') | lower).startswith('debian')
else 'centos' if (image.name | default('') | lower).startswith('centos')
else 'root'
# Infrastructure metadata
cloud_provider: "'aws'"
aws_region: placement.region
availability_zone: placement.availability_zone
vpc_id: vpc_id
subnet_id: subnet_id
# Instance details
instance_family: instance_type.split('.')[0]
instance_size: instance_type.split('.')[1]
architecture: architecture
hypervisor: hypervisor
# Networking
private_ip: private_ip_address
public_ip: public_ip_address | default('')
private_dns: private_dns_name
public_dns: public_dns_name | default('')
# Tags and metadata
environment: tags.Environment | default('development')
application: tags.Application | default('unknown')
cost_center: tags.CostCenter | default('unassigned')
owner: tags.Owner | default('unknown')
# Operational metadata
launch_date: launch_time
monitoring_enabled: tags.Monitoring | default('false') | bool
backup_enabled: tags.BackupEnabled | default('false') | bool
auto_scaling: tags.AutoScaling | default('false') | bool
# Security
security_groups: security_groups | map(attribute='group_name') | list
iam_instance_profile: iam_instance_profile.arn | default('')
# Storage
root_device_type: root_device_type
root_device_name: root_device_name
# Include/Exclude Patterns
include_filters:
- "tag:Managed"
exclude_filters:
- "tag:DoNotManage"
- "instance-state-name: terminated"
- "instance-state-name: stopped"
# Caching and Performance
cache: true
cache_plugin: memory
cache_timeout: 3600
cache_connection: /tmp/aws_inventory_cache
# Strict host key checking
strict: false
Step 4: Update and Sync Options
Update Configuration:
☑ Overwrite: Yes (Replace inventory content)
☑ Overwrite Variables: No (Preserve custom variables)
☑ Update on Launch: No (For better performance)
☐ Update on Project Update: Optional
Sync Schedule:
Custom Schedule: */30 * * * * (Every 30 minutes)
Cache Timeout: 1800 seconds (30 minutes)
Verbosity: 1 (Normal)
Timeout: 300 seconds
Step-by-Step Google Cloud Platform Setup
Step 1: Create GCP Inventory
- Basic Inventory Creation
Name: GCP Production Inventory
Description: Google Cloud Compute Engine instances
Organization: Production Org
Step 2: Configure GCP Inventory Source
Source Configuration:
Name: GCP Compute Engine Source
Description: Auto-discovery of GCP compute instances
Source: Google Compute Engine
Credential: [Select GCP Service Account credential]
Source Variables:
# GCP Compute Engine configuration
plugin: google.cloud.gcp_compute
# Project and Zone Configuration
projects:
- production-project-123456
- staging-project-789012
- development-project-345678
zones:
- us-central1-a
- us-central1-b
- us-central1-c
- us-west1-a
- us-west1-b
- europe-west1-b
- europe-west1-c
# Authentication
auth_kind: serviceaccount
service_account_file: /var/lib/awx/gcp-service-account.json
# Instance Filtering
filters:
- status = RUNNING
- labels.managed = awx
- labels.environment = (production OR staging OR development)
# Grouping Configuration
keyed_groups:
# Group by environment
- prefix: gcp_env
key: labels.environment | default('untagged')
separator: "_"
# Group by application
- prefix: gcp_app
key: labels.application | default('unknown')
separator: "_"
# Group by zone
- prefix: gcp_zone
key: zone.split('/')[-1]
separator: "_"
# Group by machine type family
- prefix: gcp_family
key: machineType.split('/')[-1].split('-')[0]
separator: "_"
# Group by network
- prefix: gcp_network
key: networkInterfaces[0].network.split('/')[-1]
separator: "_"
# Conditional Groups
groups:
# Production instances
gcp_production: labels.environment == "production"
# Preemptible instances
gcp_preemptible: scheduling.preemptible == true
# High-memory instances
gcp_high_memory: machineType.split('/')[-1].startswith(('n2-highmem', 'n1-highmem', 'm1-'))
# SSD instances
gcp_ssd_instances: >
disks | selectattr('boot', 'equalto', true) |
selectattr('type', 'search', 'pd-ssd') | list | length > 0
# Host Variables
hostnames:
- name
- public_ip
- private_ip
compose:
# Connection details
ansible_host: networkInterfaces[0].accessConfigs[0].natIP | default(networkInterfaces[0].networkIP)
ansible_user: "'ubuntu'"
# GCP-specific metadata
cloud_provider: "'gcp'"
gcp_project: name.split('/')[1]
gcp_zone: zone.split('/')[-1]
gcp_region: zone.split('/')[-1][:-2]
# Instance details
machine_type: machineType.split('/')[-1]
machine_family: machineType.split('/')[-1].split('-')[0]
cpu_count: machineType.split('/')[-1] | regex_replace('.*-(\d+).*', '\1') | int
# Networking
internal_ip: networkInterfaces[0].networkIP
external_ip: networkInterfaces[0].accessConfigs[0].natIP | default('')
network_name: networkInterfaces[0].network.split('/')[-1]
subnet_name: networkInterfaces[0].subnetwork.split('/')[-1]
# Metadata
environment: labels.environment | default('development')
application: labels.application | default('unknown')
cost_center: labels.cost_center | default('unassigned')
# Operational
creation_timestamp: creationTimestamp
preemptible: scheduling.preemptible | default(false)
automatic_restart: scheduling.automaticRestart | default(true)
# Storage
boot_disk_type: disks[0].type.split('/')[-1]
boot_disk_size: disks[0].diskSizeGb | int
# Service account
service_account_email: serviceAccounts[0].email | default('')
# Performance optimization
cache: true
cache_plugin: jsonfile
cache_connection: /tmp/gcp_inventory_cache
cache_timeout: 1800
Step-by-Step Microsoft Azure Setup
Step 1: Create Azure Inventory
Basic Configuration:
Name: Azure Production Inventory
Description: Azure Virtual Machines across all resource groups
Organization: Production Org
Step 2: Configure Azure Source
Source Configuration:
Name: Azure Resource Manager Source
Description: Auto-discovery of Azure VMs
Source: Microsoft Azure Resource Manager
Credential: [Select Azure credential]
Source Variables:
# Azure Resource Manager configuration
plugin: azure.azcollection.azure_rm
# Subscription and Resource Group filtering
include_vm_resource_groups:
- production-rg
- staging-rg
- development-rg
- shared-services-rg
# Authentication method
auth_source: auto # Uses credential from AWX
# Host filtering
filters:
- powerstate == "running"
- tags.managed == "awx"
# Grouping Strategy
keyed_groups:
# Group by environment tag
- prefix: azure_env
key: tags.environment | default('untagged')
separator: "_"
# Group by location
- prefix: azure_location
key: location
separator: "_"
# Group by VM size family
- prefix: azure_size_family
key: properties.hardwareProfile.vmSize.split('_')[0]
separator: "_"
# Group by OS type
- prefix: azure_os
key: properties.storageProfile.osDisk.osType
separator: "_"
# Group by resource group
- prefix: azure_rg
key: resourceGroup
separator: "_"
# Conditional Groups
conditional_groups:
# Production instances
azure_production: tags.environment == "production"
# Web servers
azure_web: tags.role == "webserver" or "web" in name.lower()
# Database servers
azure_database: tags.role == "database" or "db" in name.lower()
# High availability VMs
azure_ha: tags.high_availability == "true"
# Managed disks
azure_managed_disk: properties.storageProfile.osDisk.managedDisk != None
# Host naming
hostnames:
- name
- public_ipv4_addresses
- private_ipv4_addresses
# Host variables composition
compose:
# Connection details
ansible_host: public_ipv4_addresses[0] | default(private_ipv4_addresses[0])
ansible_user: "'azureuser'"
# Azure-specific metadata
cloud_provider: "'azure'"
azure_location: location
azure_resource_group: resourceGroup
azure_subscription: id.split('/')[2]
# VM details
vm_size: properties.hardwareProfile.vmSize
vm_size_family: properties.hardwareProfile.vmSize.split('_')[0]
vm_id: properties.vmId
# Operating system
os_type: properties.storageProfile.osDisk.osType
os_disk_name: properties.storageProfile.osDisk.name
image_publisher: properties.storageProfile.imageReference.publisher | default('')
image_offer: properties.storageProfile.imageReference.offer | default('')
# Networking
private_ip: private_ipv4_addresses[0] | default('')
public_ip: public_ipv4_addresses[0] | default('')
network_interface_names: properties.networkProfile.networkInterfaces | map(attribute='id') | map('regex_replace', '.*/(.+)', '\1') | list
# Tags and metadata
environment: tags.environment | default('development')
application: tags.application | default('unknown')
cost_center: tags.cost_center | default('unassigned')
owner: tags.owner | default('unknown')
# Operational
provisioning_state: properties.provisioningState
power_state: powerstate
availability_zone: zones[0] | default('')
# Storage
managed_disk: properties.storageProfile.osDisk.managedDisk != None
disk_size_gb: properties.storageProfile.osDisk.diskSizeGB | default(0)
# Performance settings
parallel: true
cache: true
cache_plugin: jsonfile
cache_connection: /tmp/azure_inventory_cache
cache_timeout: 1800
strict: false
UI Testing and Validation
Step 1: Test Inventory Source
- Manual Sync Test
- Navigate to your inventory source
- Click the “Sync” button (refresh icon)
- Monitor the sync job in real-time
- Check for any errors or warnings
- Validation Checklist
☑ Sync completed successfully
☑ Expected number of hosts discovered
☑ Groups created correctly
☑ Host variables populated
☑ No authentication errors
☑ Performance acceptable (< 5 minutes for < 1000 hosts)
Step 2: Verify Host Discovery
Navigation Path: Inventories → [Your Inventory] → Hosts
Verification Steps:
- Host Count Validation
- Compare discovered hosts with actual cloud resources
- Verify filtering is working correctly
- Check for any missing critical systems
- Group Structure Validation
Expected Groups:
├── env_production
├── env_staging
├── app_web
├── app_database
├── family_t3 (AWS)
├── gcp_family_n1 (GCP)
├── azure_size_family_Standard (Azure)
└── [Custom conditional groups]
- Host Variables Validation
- Click on individual hosts
- Verify variables are populated correctly
- Check ansible_host is accessible
- Validate cloud-specific metadata
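For a scriptable version of the host-count check, the AWX API can be compared against the cloud provider directly. A sketch assuming placeholder URL/token values, an inventory ID of 7, and a single us-east-1 region (extend the region list to match your source configuration):
import boto3
import requests

AWX_URL = "https://your-awx-server"
HEADERS = {"Authorization": "Bearer your-api-token"}

# Host count as AWX sees it
awx_count = requests.get(
    f"{AWX_URL}/api/v2/inventories/7/hosts/", headers=HEADERS, verify=False
).json()["count"]

# Running-instance count as AWS reports it
ec2 = boto3.client("ec2", region_name="us-east-1")
pages = ec2.get_paginator("describe_instances").paginate(
    Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
)
aws_count = sum(
    len(reservation["Instances"])
    for page in pages
    for reservation in page["Reservations"]
)
print(f"AWX inventory: {awx_count} hosts, AWS reports {aws_count} running instances")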
Advanced UI Features and Tips
Visual UI Walkthrough for Dynamic Inventory
AWX Dashboard Layout for Inventory Management:
graph TB
subgraph "AWX Web Interface Layout"
A[Top Navigation Bar] --> B[User Menu]
A --> C[Search Bar]
A --> D[Notifications]
E[Left Sidebar] --> F[Dashboard]
E --> G[Jobs]
E --> H[Schedules]
E --> I[Templates]
E --> J[Inventories]
E --> K[Projects]
E --> L[Credentials]
E --> M[Settings]
N[Main Content Area] --> O[Inventory List]
N --> P[Inventory Details]
N --> Q[Source Configuration]
R[Action Buttons] --> S[Add/Create]
R --> T[Edit/Modify]
R --> U[Delete/Remove]
R --> V[Sync/Refresh]
end
style J fill:#e1f5fe
style N fill:#f3e5f5
style V fill:#e8f5e8
Step-by-Step UI Navigation Guide
1. Accessing Inventory Management
🖱️ Click Path: Main Dashboard → Left Sidebar → "Inventories"
📍 URL Pattern: https://your-awx-server/#/inventories
🔍 Look For: Grid view with inventory list, search bar, and filter options
Visual Indicators:
- Inventory Icon: 📊 Grid/table icon in left sidebar
- Active State: Highlighted in blue when selected
- Badge Numbers: Show count of inventories in organization
2. Creating New Inventory – UI Elements
Button Location: Top right of inventory list
Button Text: "+ Add" (with dropdown arrow)
Dropdown Options:
├── Inventory (regular inventory)
├── Smart Inventory (filtered view)
└── Constructed Inventory (advanced)
Form Fields Visual Layout:
┌─ Inventory Creation Form ────────────────────┐
│ Name: [Text Input - Required] │
│ Description: [Text Area - Optional] │
│ Organization: [Dropdown - Auto-populated] │
│ Instance Groups: [Multi-select dropdown] │
│ Variables: [YAML/JSON Editor with syntax] │
│ │
│ [Cancel] [Save] buttons at bottom │
└──────────────────────────────────────────────┘
Smart Inventory Creation via UI
Navigation Flow:
sequenceDiagram
participant User
participant UI
participant Backend
User->>UI: Click "Inventories" in sidebar
UI->>User: Display inventory list page
User->>UI: Click "+ Add" → "Smart Inventory"
UI->>User: Show smart inventory form
User->>UI: Fill form with filter criteria
UI->>Backend: Validate filter syntax
Backend->>UI: Return validation result
UI->>User: Show preview of matching hosts
User->>UI: Click "Save"
UI->>Backend: Create smart inventory
Backend->>UI: Confirm creation
UI->>User: Redirect to new smart inventory
Smart Inventory Form Layout:
┌─ Smart Inventory Configuration ──────────────┐
│ Basic Information: │
│ ├── Name: [Required Text Field] │
│ ├── Description: [Optional Text Area] │
│ └── Organization: [Dropdown] │
│ │
│ Smart Host Filter: [Advanced Text Editor] │
│ ┌─ Filter Builder ─────────────────────────┐ │
│ │ Field: [Dropdown] Operator: [Dropdown] │ │
│ │ Value: [Text Input] [+ Add Condition] │ │
│ │ │ │
│ │ Preview Results: [Button] │ │
│ │ Matching Hosts: [Count Display] │ │
│ └──────────────────────────────────────────┘ │
│ │
│ [Test Filter] [Save] [Cancel] │
└──────────────────────────────────────────────┘
Example Smart Inventory Filters with UI Display:
- Production Web Servers Filter:
UI Display: Visual filter builder
Condition 1: [group__name] [contains] [env_production]
Operator: [AND]
Condition 2: [group__name] [contains] [app_web]
Generated Filter:
group__name__contains="env_production" and group__name__contains="app_web"
Preview Results: 24 hosts found
- High-Performance Instances Filter:
UI Builder:
Condition 1: [variables__instance_type] [starts with] [c5]
Operator: [OR]
Condition 2: [variables__instance_type] [starts with] [m5]
Operator: [OR]
Condition 3: [variables__machine_type] [starts with] [n2-highcpu]
Preview Results: 42 hosts found
Inventory Source Management UI
Source Configuration Tabs Layout:
┌─ Inventory Source Details ──────────────────────────┐
│ [Details] [Schedules] [Notifications] [Jobs] │
│ │
│ Details Tab Content: │
│ ├── Basic Information │
│ ├── Source Configuration │
│ ├── Update Options │
│ └── Advanced Settings │
│ │
│ Action Buttons: │
│ [🔄 Sync Now] [✏️ Edit] [📅 Schedule] [❌ Delete] │
└─────────────────────────────────────────────────────┘
Source Variables Editor Features:
┌─ Source Variables YAML Editor ───────────────┐
│ 📝 Editor Features: │
│ ├── Syntax highlighting │
│ ├── Auto-completion │
│ ├── Error checking │
│ ├── Line numbers │
│ └── Fold/unfold sections │
│ │
│ 🔧 Helper Tools: │
│ ├── [Format YAML] button │
│ ├── [Validate Syntax] button │
│ ├── [Load Template] dropdown │
│ └── [Documentation] link │
│ │
│ Status Indicators: │
│ ✅ Valid YAML syntax │
│ ⚠️ Warning: Unknown plugin parameter │
│ ❌ Error: Invalid format │
└──────────────────────────────────────────────┘
Real-Time Sync Monitoring
Job Progress Display:
┌─ Inventory Sync Job Progress ────────────────┐
│ Job #1234: AWS EC2 Production Source │
│ Status: Running ⟳ │
│ Started: 2025-09-22 14:30:15 │
│ Elapsed: 00:02:34 │
│ │
│ Progress Bar: ████████████░░░░░░░░ 65% │
│ │
│ Current Task: Fetching instances from us-west-2 │
│ │
│ Real-time Output: │
│ ┌─ Console Output ─────────────────────────┐ │
│ │ [14:30:15] Starting inventory sync... │ │
│ │ [14:30:16] Authenticating with AWS... │ │
│ │ [14:30:17] Fetching us-east-1 instances..│ │
│ │ [14:30:45] Found 156 instances │ │
│ │ [14:31:02] Fetching us-west-2 instances..│ │
│ │ [14:32:15] Found 89 instances │ │
│ │ [14:32:30] Processing groups... │ │
│ │ [14:32:49] Sync completed successfully │ │
│ └──────────────────────────────────────────┘ │
│ │
│ Results Summary: │
│ ├── Total Hosts: 245 │
│ ├── New Hosts: 12 │
│ ├── Updated Hosts: 233 │
│ ├── Groups Created: 18 │
│ └── Duration: 00:02:49 │
└──────────────────────────────────────────────┘
Host and Group Visualization
Inventory Tree View:
📊 AWS Production Inventory
├── 📁 Groups (18)
│ ├── 🌍 env_production (156 hosts)
│ ├── 🌍 env_staging (89 hosts)
│ ├── 🖥️ app_web (98 hosts)
│ ├── 🗄️ app_database (24 hosts)
│ ├── ⚡ family_t3 (134 hosts)
│ ├── ⚡ family_m5 (67 hosts)
│ ├── 📍 az_us_east_1a (89 hosts)
│ ├── 📍 az_us_east_1b (67 hosts)
│ └── 📍 az_us_west_2a (89 hosts)
│
├── 🖥️ Hosts (245)
│ ├── 🟢 web-01.prod.aws (online)
│ ├── 🟢 web-02.prod.aws (online)
│ ├── 🟢 db-01.prod.aws (online)
│ ├── 🔴 app-05.staging.aws (offline)
│ └── ... (241 more hosts)
│
└── 📈 Statistics
├── Last Sync: 2 minutes ago
├── Success Rate: 98.2%
├── Avg Sync Time: 2m 34s
└── Cache Hit Rate: 87%
Host Detail View:
┌─ Host Details: web-01.prod.aws ──────────────┐
│ 🔗 Connection Info: │
│ ├── ansible_host: 54.123.45.67 │
│ ├── ansible_user: ec2-user │
│ └── ansible_port: 22 │
│ │
│ ☁️ Cloud Metadata: │
│ ├── cloud_provider: aws │
│ ├── instance_id: i-1234567890abcdef0 │
│ ├── instance_type: t3.medium │
│ ├── region: us-east-1 │
│ └── availability_zone: us-east-1a │
│ │
│ 🏷️ Tags: │
│ ├── Environment: production │
│ ├── Application: web │
│ ├── Owner: devops-team │
│ └── CostCenter: engineering │
│ │
│ 👥 Group Memberships: │
│ ├── env_production │
│ ├── app_web │
│ ├── family_t3 │
│ └── az_us_east_1a │
│ │
│ 📊 Additional Variables: (23 total) │
│ [View All Variables] [Run Ad Hoc Commands] │
└──────────────────────────────────────────────┘
Inventory Source Scheduling UI
Schedule Creation Wizard:
graph LR
A[Schedule Type] --> B[Frequency Selection]
B --> C[Time Settings]
C --> D[Advanced Options]
D --> E[Review & Save]
subgraph "Schedule Types"
F[Simple Interval]
G[Cron Expression]
H[Complex Schedule]
end
subgraph "Frequency Options"
I[Every N Minutes]
J[Hourly]
K[Daily]
L[Weekly]
M[Monthly]
end
Schedule Configuration Interface:
┌─ Inventory Sync Schedule ────────────────────┐
│ Schedule Name: [Hourly Production Sync] │
│ Description: [Sync every hour during business hours] │
│ │
│ 📅 Schedule Type: │
│ ○ Simple Interval ● Complex Schedule │
│ │
│ ⏰ Timing Configuration: │
│ ├── Start Date: [2025-09-22] [14:00:00] │
│ ├── End Date: [2026-09-22] [14:00:00] │
│ ├── Timezone: [America/New_York ▼] │
│ └── Repeat: [Every 1 Hour] │
│ │
│ 📋 Days of Operation: │
│ ☑️ Monday ☑️ Tuesday ☑️ Wednesday │
│ ☑️ Thursday ☑️ Friday ☐ Saturday │
│ ☐ Sunday │
│ │
│ ⚙️ Advanced Options: │
│ ├── Timeout: [300] seconds │
│ ├── Verbosity: [Normal ▼] │
│ ├── ☑️ Update on Launch │
│ └── ☐ Prompt for Variables │
│ │
│ [Test Schedule] [Save] [Cancel] │
└──────────────────────────────────────────────┘
Performance Monitoring
Inventory Sync Dashboard:
┌─ Inventory Performance Dashboard ────────────┐
│ 📊 Key Metrics (Last 30 Days): │
│ ┌─────────────────────────────────────────┐ │
│ │ Avg Sync Time │ Success Rate │ │
│ │ 2m 34s │ 98.2% │ │
│ │ ▲ 15s improvement│ ▲ 2.1% improvement │ │
│ └─────────────────────────────────────────┘ │
│ │
│ 📈 Sync Time Trend: │
│ ┌─ Graph ─────────────────────────────────┐ │
│ │ ● │ │
│ │ ● ● │ │
│ │ ● ● │ │
│ │ ●●● │ │
│ │ ●●● │ │
│ │ ────────────────────────────────────── │ │
│ │ Day 1 Day 15 Day 30 │ │
│ └─────────────────────────────────────────┘ │
│ │
│ 🚨 Recent Issues: │
│ ├── 🟡 GCP quota limit warning (resolved) │
│ ├── 🟢 All systems operational │
│ └── 🔵 Performance optimization applied │
│ │
│ 📋 Sync History (Last 10): │
│ ├── #1256 ✅ 2m 18s ago (Success) │
│ ├── #1255 ✅ 1h 18s ago (Success) │
│ ├── #1254 ❌ 2h 22s ago (Failed - timeout) │
│ └── #1253 ✅ 3h 19s ago (Success) │
└──────────────────────────────────────────────┘
Troubleshooting UI Tools
Diagnostic Information Panel:
┌─ Inventory Source Diagnostics ───────────────┐
│ 🔍 Connection Test Results: │
│ ├── ✅ Credential Authentication: Passed │
│ ├── ✅ Network Connectivity: Passed │
│ ├── ⚠️ API Rate Limits: 85% utilized │
│ └── ✅ Permissions Check: Passed │
│ │
│ 📊 Resource Discovery Summary: │
│ ├── AWS Regions: 3 configured, 3 accessible │
│ ├── EC2 Instances: 245 discovered │
│ ├── GCP Projects: 2 configured, 2 accessible│
│ ├── GCE Instances: 189 discovered │
│ ├── Azure Subscriptions: 1 configured │
│ └── Azure VMs: 67 discovered │
│ │
│ ⚙️ Configuration Validation: │
│ ├── ✅ YAML Syntax: Valid │
│ ├── ✅ Plugin Parameters: Valid │
│ ├── ⚠️ Filters: 2 deprecated parameters │
│ └── ✅ Grouping Logic: Valid │
│ │
│ 🔧 Suggested Optimizations: │
│ ├── Enable caching for 40% speed improvement│
│ ├── Use parallel processing │
│ └── Optimize region selection │
│ │
│ [Run Full Diagnostic] [Export Report] │
└──────────────────────────────────────────────┘
Monitoring Sync Performance
Navigation: Jobs → [Filter by Inventory Sync]
Key Metrics to Monitor:
- Sync duration (should be < 5 minutes)
- Number of hosts discovered
- Error rate (should be 0%)
- Memory usage during sync
- Optimization Tips
# Add to source variables for better performance
performance_optimizations:
cache: true
cache_timeout: 1800 # 30 minutes
parallel: true # Enable parallel processing
strict: false # Ignore SSL warnings
timeout: 300 # 5 minute timeout
Troubleshooting Common UI Issues
Issue 1: Credential Authentication Failures
Symptoms:
- Sync jobs fail with authentication errors
- No hosts discovered despite having instances
UI Diagnostic Steps:
- Navigate to Credentials → [Your Cloud Credential]
- Test the credential by creating a simple job template
- Check credential permissions in cloud provider console
- Verify credential format matches requirements
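It can also help to smoke-test the keys outside AWX entirely. A minimal boto3 sketch, assuming the same access key pair the AWX credential uses is exported as AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY:
import boto3

sts = boto3.client("sts")
print(sts.get_caller_identity()["Arn"])  # fails immediately on invalid keys

ec2 = boto3.client("ec2", region_name="us-east-1")
ec2.describe_instances(MaxResults=5)     # verifies the ec2:DescribeInstances permission
print("Credentials and EC2 read access look OK")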
Resolution:
# AWS credential requirements
required_permissions:
- ec2:DescribeInstances
- ec2:DescribeRegions
- ec2:DescribeAvailabilityZones
- ec2:DescribeTags
# GCP credential requirements
required_scopes:
- https://www.googleapis.com/auth/compute.readonly
- https://www.googleapis.com/auth/cloud-platform.read-only
# Azure credential requirements
required_permissions:
- Microsoft.Compute/virtualMachines/read
- Microsoft.Resources/subscriptions/resourceGroups/read
Issue 2: Slow Inventory Sync Performance
UI Monitoring:
- Check job output for timing information
- Monitor system resources during sync
- Review inventory source configuration
Optimization via UI:
# Update source variables for better performance
optimization_settings:
# Reduce scope
regions: ["us-east-1"] # Limit to essential regions
filters:
- "instance-state-name: running" # Only running instances
# Enable caching
cache: true
cache_timeout: 3600
# Parallel processing
parallel: true
max_workers: 10
Issue 3: Incorrect Host Grouping
UI Verification Steps:
- Navigate to Inventories → [Your Inventory] → Groups
- Check group membership
- Verify group variables
- Test with simple playbook
Debug Configuration:
# Add debug variables to troubleshoot grouping
debug_grouping:
verbosity: 3 # Increase verbosity
strict: true # Enable strict mode for better error reporting
# Add debug compose variables
debug_info:
raw_tags: tags
raw_labels: labels
group_logic_result: >
"env_" + (tags.Environment | default('untagged'))YAMLThis comprehensive UI guide provides practical, step-by-step instructions for setting up and managing dynamic inventories across all major cloud providers using the AWX web interface, including troubleshooting and optimization strategies.
Lab 3: Credential Management and Security (Intermediate)
Setting Up Secure Credential Storage
Credential Hierarchy Strategy:
graph TB
A[Credential Management Strategy] --> B[External Vault Integration]
A --> C[AWX Native Credentials]
A --> D[Temporary Credentials]
B --> E[HashiCorp Vault]
B --> F[AWS Secrets Manager]
B --> G[Azure Key Vault]
C --> H[SSH Private Keys]
C --> I[Cloud API Keys]
C --> J[Database Passwords]
D --> K[Session Tokens]
D --> L[Assumed Roles]
D --> M[Short-lived Certificates]
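Behind a Vault lookup credential, AWX resolves secrets at job runtime instead of storing them. The equivalent lookup with the hvac Python client looks roughly like this — the URL, token, namespace, and secret path are placeholders, and a KV v2 mount is assumed:
import hvac

client = hvac.Client(
    url="https://vault.company.com",
    token="hvs.AAAA...",
    namespace="production",
)
secret = client.secrets.kv.v2.read_secret_version(path="aws/awx")
print(secret["data"]["data"])  # the stored key/value pairs
HashiCorp Vault Integration Example: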
# Vault credential type configuration
vault_credential:
name: "Production Vault"
credential_type: "HashiCorp Vault Secret Lookup"
inputs:
url: "https://vault.company.com"
token: "{{ vault_token }}"
namespace: "production"
auth_method: "kubernetes"
role: "awx-automation"
injectors:
env:
VAULT_ADDR: "{{ url }}"
VAULT_TOKEN: "{{ token }}"
VAULT_NAMESPACE: "{{ namespace }}"
Custom Credential Type for Cloud Provider:
# custom_cloud_credential.py
CUSTOM_CREDENTIAL_TYPE = {
'name': 'Multi-Cloud Credential',
'description': 'Unified credential for multiple cloud providers',
'kind': 'cloud',
'inputs': {
'fields': [
{
'id': 'cloud_provider',
'label': 'Cloud Provider',
'type': 'string',
'choices': ['aws', 'gcp', 'azure']
},
{
'id': 'credentials_json',
'label': 'Credentials JSON',
'type': 'string',
'secret': True,
'multiline': True
},
{
'id': 'region',
'label': 'Default Region',
'type': 'string'
}
]
},
'injectors': {
'env': {
'CLOUD_PROVIDER': '{{ cloud_provider }}',
'CLOUD_REGION': '{{ region }}',
'CLOUD_CREDENTIALS': '{{ credentials_json }}'
},
'file': {
'template.credentials': '{{ credentials_json }}'
}
}
}
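Registering a definition like this is one POST against the credential types endpoint. A sketch with placeholder URL/token values, sending the CUSTOM_CREDENTIAL_TYPE dict defined above:
import requests

AWX_URL = "https://your-awx-server"
HEADERS = {"Authorization": "Bearer your-api-token"}

resp = requests.post(
    f"{AWX_URL}/api/v2/credential_types/",
    headers=HEADERS,
    json={
        "name": CUSTOM_CREDENTIAL_TYPE["name"],
        "description": CUSTOM_CREDENTIAL_TYPE["description"],
        "kind": CUSTOM_CREDENTIAL_TYPE["kind"],
        "inputs": CUSTOM_CREDENTIAL_TYPE["inputs"],
        "injectors": CUSTOM_CREDENTIAL_TYPE["injectors"],
    },
    verify=False,
)
resp.raise_for_status()
print("Created credential type id:", resp.json()["id"])
Lab 4: Job Templates and Workflow Creation (Intermediate)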
Building a Multi-Stage Deployment Workflow
Workflow Design:
graph TD
A[Start Deployment] --> B[Pre-deployment Validation]
B --> C{Validation Passed?}
C -->|No| D[Send Failure Notification]
C -->|Yes| E[Deploy Infrastructure]
E --> F[Configure Security Groups]
F --> G[Deploy Application]
G --> H[Run Integration Tests]
H --> I{Tests Passed?}
I -->|No| J[Rollback Deployment]
I -->|Yes| K[Update Load Balancer]
J --> L[Send Rollback Notification]
K --> M[Enable Monitoring]
M --> N[Send Success Notification]
style C fill:#fff9c4
style I fill:#fff9c4
Job Template Configuration:
# Infrastructure deployment job template
infrastructure_deployment:
name: "Deploy Infrastructure"
project: "infrastructure-automation"
playbook: "infrastructure/deploy.yml"
inventory: "cloud-dynamic"
credentials:
- "aws-production"
- "ssh-deployment-key"
# Runtime configuration
extra_vars:
environment: "{{ deployment_environment }}"
instance_type: "{{ instance_type | default('t3.medium') }}"
vpc_cidr: "10.0.0.0/16"
# Execution settings
timeout: 3600
forks: 10
verbosity: 1
become_enabled: true
# Survey for runtime input
survey_enabled: true
survey_spec:
- name: "deployment_environment"
description: "Target deployment environment"
type: "multiplechoice"
choices: ["development", "staging", "production"]
required: true
- name: "instance_count"
description: "Number of instances to deploy"
type: "integer"
min_value: 1
max_value: 10
default: 3
- name: "enable_backup"
description: "Enable automated backups"
type: "boolean"
default: true
Advanced Workflow Templates
Conditional Workflow Logic:
# Workflow template with complex logic
deployment_workflow:
name: "Production Deployment Pipeline"
nodes:
- name: "pre_deployment_checks"
job_template: "Pre-deployment Validation"
success_nodes: ["infrastructure_deployment"]
failure_nodes: ["notify_failure"]
- name: "infrastructure_deployment"
job_template: "Deploy Infrastructure"
success_nodes: ["security_configuration"]
failure_nodes: ["cleanup_partial_deployment"]
- name: "security_configuration"
job_template: "Configure Security"
success_nodes: ["application_deployment"]
failure_nodes: ["rollback_infrastructure"]
- name: "application_deployment"
job_template: "Deploy Application"
success_nodes: ["integration_tests"]
failure_nodes: ["rollback_all"]
- name: "integration_tests"
job_template: "Run Integration Tests"
success_nodes: ["production_cutover"]
failure_nodes: ["rollback_all"]
- name: "production_cutover"
job_template: "Switch Traffic to New Deployment"
success_nodes: ["enable_monitoring", "notify_success"]
failure_nodes: ["emergency_rollback"]
# Failure handling nodes
- name: "rollback_all"
job_template: "Complete Rollback"
always_nodes: ["notify_failure"]
# Notification nodes
- name: "notify_success"
job_template: "Send Success Notification"
- name: "notify_failure"
job_template: "Send Failure Alert"YAMLLab 5: Advanced Automation Patterns (Advanced)
Self-Healing Infrastructure
Monitoring-Driven Automation:
# Self-healing playbook
---
- name: Infrastructure Health Check and Remediation
hosts: all
gather_facts: true
vars:
health_check_url: "http://{{ ansible_host }}:8080/health"
max_remediation_attempts: 3
tasks:
- name: Check service health
uri:
url: "{{ health_check_url }}"
method: GET
timeout: 10
register: health_check
ignore_errors: true
- name: Identify unhealthy services
set_fact:
service_unhealthy: "{{ health_check.status != 200 }}"
- name: Attempt service restart
service:
name: "{{ item }}"
state: restarted
loop:
- application-service
- monitoring-agent
when: service_unhealthy
register: restart_result
- name: Verify service recovery
uri:
url: "{{ health_check_url }}"
method: GET
timeout: 30
register: recovery_check
retries: 5
delay: 10
when: service_unhealthy
- name: Escalate to operations team
uri:
url: "{{ slack_webhook_url }}"
method: POST
body_format: json
body:
text: "ALERT: Service {{ inventory_hostname }} failed health check and automatic remediation"
channel: "#operations"
username: "AWX Automation"
when:
- service_unhealthy
- recovery_check.status != 200
Event-Driven Automation
Webhook-Triggered Workflows:
# Event-driven automation configuration
event_driven_automation:
# Git push triggers
git_webhook:
source: "GitHub"
events: ["push", "pull_request"]
target_workflow: "CI/CD Pipeline"
filters:
branches: ["main", "develop"]
paths: ["infrastructure/", "applications/"]
# Monitoring alerts
monitoring_webhook:
source: "Prometheus AlertManager"
events: ["alert"]
target_workflow: "Incident Response"
filters:
severity: ["critical", "warning"]
components: ["infrastructure", "application"]
# Cloud events
cloud_webhook:
source: "AWS EventBridge"
events: ["EC2 State Change", "Auto Scaling"]
target_workflow: "Infrastructure Adjustment"
filters:
regions: ["us-east-1", "us-west-2"]
instance_types: ["t3.*", "m5.*"]YAMLLab Exercise Templates
Exercise 1: Basic Automation
Objective: Deploy a simple web application across multiple servers.
Requirements:
- Create project from Git repository
- Configure static inventory with 3 web servers
- Set up SSH credentials
- Create job template for application deployment
- Execute and verify deployment
Success Criteria:
- All servers respond to HTTP requests
- Application logs show successful startup
- Load balancer health checks pass
Exercise 2: Multi-Cloud Deployment
Objective: Deploy identical infrastructure across AWS and GCP.
Requirements:
- Configure cloud credentials for both providers
- Create dynamic inventories for each cloud
- Build unified deployment playbook
- Implement cross-cloud networking
- Set up monitoring and alerting
Success Criteria:
- Infrastructure deployed in both clouds
- Applications communicate across clouds
- Monitoring dashboards show both environments
Exercise 3: Compliance Automation
Objective: Implement automated security compliance checking.
Requirements:
- Create compliance checking playbooks
- Set up scheduled job execution
- Configure compliance reporting
- Implement remediation workflows
- Set up audit trail logging
Success Criteria:
- Compliance violations automatically detected
- Remediation actions executed successfully
- Compliance reports generated regularly
- Audit logs maintain complete history
6. AWS Integration Deep Dive {#aws-integration}
AWS Credential Setup
# AWS Credential Configuration
credential_type: "Amazon Web Services"
access_key: "AKIA..."
secret_key: "..."
region: "us-east-1"YAMLAWS Dynamic Inventory
# aws_ec2.yml
plugin: amazon.aws.aws_ec2
regions:
- us-east-1
- us-west-2
keyed_groups:
- prefix: tag
key: tags
- prefix: instance_type
key: instance_type
- prefix: aws_region
key: placement.region
hostnames:
- ip-address
- dns-name
compose:
ansible_host: public_ip_address
AWS Playbook Examples
EC2 Instance Provisioning
# provision-ec2.yml
---
- name: Provision EC2 Instances
hosts: localhost
gather_facts: false
vars:
region: us-east-1
instance_type: t3.micro
image_id: ami-0c55b159cbfafe1d0
key_name: my-keypair
security_group: web-sg
tasks:
- name: Create security group
amazon.aws.ec2_group:
name: "{{ security_group }}"
description: Web server security group
region: "{{ region }}"
rules:
- proto: tcp
ports:
- 80
- 443
- 22
cidr_ip: 0.0.0.0/0
tags:
Environment: production
- name: Launch EC2 instances
amazon.aws.ec2_instance:
name: "web-server-{{ item }}"
image_id: "{{ image_id }}"
instance_type: "{{ instance_type }}"
key_name: "{{ key_name }}"
security_group: "{{ security_group }}"
region: "{{ region }}"
tags:
Environment: production
Role: webserver
wait: true
loop: "{{ range(1, 4) | list }}"
register: ec2_instances
- name: Add instances to inventory
add_host:
name: "{{ item.public_ip_address }}"
groups: webservers
ansible_host: "{{ item.public_ip_address }}"
loop: "{{ ec2_instances.results | map(attribute='instances') | flatten }}"YAMLVPC and Networking
# aws-vpc-setup.yml
---
- name: Create AWS VPC Infrastructure
hosts: localhost
gather_facts: false
vars:
vpc_cidr: "10.0.0.0/16"
public_subnet_cidr: "10.0.1.0/24"
private_subnet_cidr: "10.0.2.0/24"
tasks:
- name: Create VPC
amazon.aws.ec2_vpc_net:
name: production-vpc
cidr_block: "{{ vpc_cidr }}"
region: "{{ aws_region }}"
tags:
Environment: production
state: present
register: vpc
- name: Create Internet Gateway
amazon.aws.ec2_vpc_igw:
vpc_id: "{{ vpc.vpc.id }}"
region: "{{ aws_region }}"
tags:
Name: production-igw
Environment: production
state: present
register: igw
- name: Create public subnet
amazon.aws.ec2_vpc_subnet:
vpc_id: "{{ vpc.vpc.id }}"
cidr: "{{ public_subnet_cidr }}"
region: "{{ aws_region }}"
map_public: yes
tags:
Name: public-subnet
Type: public
state: present
register: public_subnet
AWS Service Integration Patterns
graph TD
A[AWX] --> B[AWS Services]
B --> C[EC2]
B --> D[VPC]
B --> E[S3]
B --> F[RDS]
B --> G[ELB]
B --> H[CloudFormation]
C --> I[Instance Management]
D --> J[Network Configuration]
E --> K[Storage Operations]
F --> L[Database Setup]
G --> M[Load Balancing]
H --> N[Infrastructure as Code]
7. Google Cloud Platform Integration {#gcp-integration}
GCP Authentication Setup
// gcp-service-account.json
{
"type": "service_account",
"project_id": "your-project-id",
"private_key_id": "...",
"private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
"client_email": "awx-automation@your-project.iam.gserviceaccount.com",
"client_id": "...",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://oauth2.googleapis.com/token"
}
GCP Dynamic Inventory
# gcp_compute.yml
plugin: google.cloud.gcp_compute
projects:
- your-project-id
zones:
- us-central1-a
- us-central1-b
auth_kind: serviceaccount
service_account_file: /path/to/service-account.json
keyed_groups:
- prefix: gcp
key: labels
- prefix: status
key: status
hostnames:
- public_ip
- private_ip
compose:
ansible_host: networkInterfaces[0].accessConfigs[0].natIP
GCP Playbook Examples
Compute Engine Instance Creation
# gcp-compute-provision.yml
---
- name: Create GCP Compute Instances
hosts: localhost
gather_facts: false
vars:
project_id: your-project-id
zone: us-central1-a
machine_type: e2-micro
image_family: ubuntu-2004-lts
image_project: ubuntu-os-cloud
tasks:
- name: Create a disk
google.cloud.gcp_compute_disk:
name: "{{ item }}-disk"
size_gb: 20
source_image: "projects/{{ image_project }}/global/images/family/{{ image_family }}"
zone: "{{ zone }}"
project: "{{ project_id }}"
auth_kind: serviceaccount
service_account_file: "{{ gcp_service_account_file }}"
state: present
loop:
- web-server-1
- web-server-2
register: disks
- name: Create instances
google.cloud.gcp_compute_instance:
name: "{{ item.item }}"
machine_type: "projects/{{ project_id }}/zones/{{ zone }}/machineTypes/{{ machine_type }}"
disks:
- auto_delete: true
boot: true
source: "{{ item.source }}"
network_interfaces:
- network: "projects/{{ project_id }}/global/networks/default"
access_configs:
- name: External NAT
type: ONE_TO_ONE_NAT
zone: "{{ zone }}"
project: "{{ project_id }}"
auth_kind: serviceaccount
service_account_file: "{{ gcp_service_account_file }}"
tags:
items:
- http-server
- https-server
state: present
loop: "{{ disks.results }}"YAMLGKE Cluster Management
# gke-cluster-setup.yml
---
- name: Create GKE Cluster
hosts: localhost
gather_facts: false
vars:
cluster_name: production-cluster
location: us-central1
node_count: 3
tasks:
- name: Create GKE cluster
google.cloud.gcp_container_cluster:
name: "{{ cluster_name }}"
location: "{{ location }}"
project: "{{ project_id }}"
initial_node_count: "{{ node_count }}"
node_config:
machine_type: e2-medium
disk_size_gb: 100
oauth_scopes:
- https://www.googleapis.com/auth/devstorage.read_only
- https://www.googleapis.com/auth/logging.write
- https://www.googleapis.com/auth/monitoring
auth_kind: serviceaccount
service_account_file: "{{ gcp_service_account_file }}"
state: present
register: cluster
- name: Get cluster credentials
shell: |
gcloud container clusters get-credentials {{ cluster_name }} \
--location {{ location }} \
--project {{ project_id }}
delegate_to: localhost
GCP Service Integration
graph TD
A[AWX + GCP] --> B[Compute Engine]
A --> C[GKE]
A --> D[Cloud Storage]
A --> E[Cloud SQL]
A --> F[VPC Network]
A --> G[Cloud Functions]
B --> H[VM Lifecycle]
C --> I[K8s Management]
D --> J[File Operations]
E --> K[Database Setup]
F --> L[Network Config]
G --> M[Serverless Deploy]
8. Azure Integration {#azure-integration}
Azure Authentication
# Azure Credential Setup
credential_type: "Microsoft Azure Resource Manager"
subscription_id: "12345678-1234-1234-1234-123456789012"
tenant_id: "87654321-4321-4321-4321-210987654321"
client_id: "abcdefgh-abcd-abcd-abcd-abcdefghijkl"
client_secret: "your-client-secret"
Azure Dynamic Inventory
# azure_rm.yml
plugin: azure.azcollection.azure_rm
include_vm_resource_groups:
- production-rg
- staging-rg
auth_source: credential_file
keyed_groups:
- prefix: azure
key: tags
- prefix: azure_location
key: location
- prefix: azure_size
key: properties.hardwareProfile.vmSize
hostnames:
- public_ipv4_addresses
- private_ipv4_addresses
Azure Playbook Examples
Virtual Machine Deployment
# azure-vm-deployment.yml
---
- name: Create Azure Virtual Machines
hosts: localhost
gather_facts: false
vars:
resource_group: production-rg
location: East US
vm_size: Standard_B2s
admin_username: azureuser
tasks:
- name: Create resource group
azure.azcollection.azure_rm_resourcegroup:
name: "{{ resource_group }}"
location: "{{ location }}"
tags:
Environment: production
- name: Create virtual network
azure.azcollection.azure_rm_virtualnetwork:
resource_group: "{{ resource_group }}"
name: production-vnet
address_prefixes: "10.0.0.0/16"
tags:
Environment: production
- name: Create subnet
azure.azcollection.azure_rm_subnet:
resource_group: "{{ resource_group }}"
name: production-subnet
address_prefix: "10.0.1.0/24"
virtual_network: production-vnet
- name: Create public IP addresses
azure.azcollection.azure_rm_publicipaddress:
resource_group: "{{ resource_group }}"
allocation_method: Static
name: "{{ item }}-public-ip"
tags:
Environment: production
loop:
- web-vm-1
- web-vm-2
register: public_ips
- name: Create Network Security Group
azure.azcollection.azure_rm_securitygroup:
resource_group: "{{ resource_group }}"
name: production-nsg
rules:
- name: SSH
protocol: Tcp
destination_port_range: 22
access: Allow
priority: 1001
direction: Inbound
- name: HTTP
protocol: Tcp
destination_port_range: 80
access: Allow
priority: 1002
direction: Inbound
- name: Create virtual machines
azure.azcollection.azure_rm_virtualmachine:
resource_group: "{{ resource_group }}"
name: "{{ item.item }}"
vm_size: "{{ vm_size }}"
admin_username: "{{ admin_username }}"
ssh_password_enabled: false
ssh_public_keys:
- path: "/home/{{ admin_username }}/.ssh/authorized_keys"
key_data: "{{ ssh_public_key }}"
network_interfaces: "{{ item.item }}-nic"
image:
offer: UbuntuServer
publisher: Canonical
sku: 18.04-LTS
version: latest
tags:
Environment: production
Role: webserver
loop: "{{ public_ips.results }}"YAMLAzure Kubernetes Service (AKS)
# aks-cluster-setup.yml
---
- name: Create AKS Cluster
hosts: localhost
gather_facts: false
vars:
cluster_name: production-aks
node_count: 3
tasks:
- name: Create AKS cluster
azure.azcollection.azure_rm_aks:
name: "{{ cluster_name }}"
resource_group: "{{ resource_group }}"
location: "{{ location }}"
kubernetes_version: "1.21.2"
node_resource_group: "{{ cluster_name }}-nodes-rg"
agent_pool_profiles:
- name: default
count: "{{ node_count }}"
vm_size: Standard_D2s_v3
os_type: Linux
mode: System
service_principal:
client_id: "{{ azure_client_id }}"
client_secret: "{{ azure_client_secret }}"
tags:
Environment: production
state: present
register: aks_cluster
- name: Get AKS credentials
azure.azcollection.azure_rm_aks_info:
name: "{{ cluster_name }}"
resource_group: "{{ resource_group }}"
show_kubeconfig: user
register: aks_info
- name: Save kubeconfig
copy:
content: "{{ aks_info.aks[0].kube_config }}"
dest: ~/.kube/config-{{ cluster_name }}
mode: '0600'
Azure Service Integration Architecture
graph TD
A[AWX + Azure] --> B[Virtual Machines]
A --> C[AKS]
A --> D[Storage Accounts]
A --> E[SQL Database]
A --> F[Virtual Networks]
A --> G[Functions]
A --> H[App Service]
B --> I[VM Lifecycle]
C --> J[Container Orchestration]
D --> K[Blob/File Storage]
E --> L[Database Management]
F --> M[Network Security]
G --> N[Serverless Computing]
H --> O[Web App Deployment]
9. Advanced Features {#advanced-features}
Workflow Templates
graph TD
A[Workflow Start] --> B{Environment Check}
B -->|Production| C[Backup Database]
B -->|Staging| D[Skip Backup]
C --> E[Deploy Application]
D --> E
E --> F{Deployment Success}
F -->|Yes| G[Run Tests]
F -->|No| H[Rollback]
G --> I{Tests Pass}
I -->|Yes| J[Notify Success]
I -->|No| H
H --> K[Restore Backup]
K --> L[Notify Failure]
Multi-Cloud Deployment Workflow
# multi-cloud-workflow.yml
---
- name: Multi-Cloud Infrastructure Deployment
hosts: localhost
gather_facts: false
tasks:
- name: Deploy AWS Infrastructure
include_tasks: aws-infrastructure.yml
tags: aws
- name: Deploy GCP Infrastructure
include_tasks: gcp-infrastructure.yml
tags: gcp
when: deploy_gcp | default(false)
- name: Deploy Azure Infrastructure
include_tasks: azure-infrastructure.yml
tags: azure
when: deploy_azure | default(false)
- name: Configure Cross-Cloud Networking
include_tasks: cross-cloud-networking.yml
when: multi_cloud_networking | default(false)
- name: Deploy Applications
include_tasks: application-deployment.yml
tags: applications
Custom Credential Types
# custom_credential_type.py
CUSTOM_CREDENTIAL_TYPE = {
'name': 'HashiCorp Vault',
'description': 'Custom credential type for Vault integration',
'kind': 'cloud',
'inputs': {
'fields': [
{
'id': 'vault_url',
'label': 'Vault URL',
'type': 'string',
'help_text': 'The URL of the Vault server'
},
{
'id': 'vault_token',
'label': 'Vault Token',
'type': 'string',
'secret': True,
'help_text': 'Authentication token for Vault'
},
{
'id': 'vault_namespace',
'label': 'Vault Namespace',
'type': 'string',
'help_text': 'Vault namespace (for Vault Enterprise)'
}
]
},
'injectors': {
'env': {
'VAULT_ADDR': '{{ vault_url }}',
'VAULT_TOKEN': '{{ vault_token }}',
'VAULT_NAMESPACE': '{{ vault_namespace }}'
}
}
}
Smart Inventory
# smart_inventory_filter.yml
kind: smart
host_filter: >
(instance_filters__name__icontains="web" and
variables__environment="production") or
(labels__role="database" and
variables__cloud_provider="aws")
Job Scheduling and Notifications
sequenceDiagram
participant Scheduler
participant AWX
participant Notification
participant Teams/Slack
Scheduler->>AWX: Trigger Scheduled Job
AWX->>AWX: Execute Playbook
AWX->>Notification: Job Status Update
alt Job Success
Notification->>Teams/Slack: Success Message
else Job Failure
Notification->>Teams/Slack: Failure Alert
Notification->>Teams/Slack: Error Details
end
Survey Specifications
{
"name": "Cloud Deployment Survey",
"description": "Configure cloud deployment parameters",
"spec": [
{
"question_name": "Cloud Provider",
"question_description": "Select the target cloud provider",
"required": true,
"type": "multiplechoice",
"variable": "cloud_provider",
"choices": ["aws", "gcp", "azure"],
"default": "aws"
},
{
"question_name": "Environment",
"question_description": "Target environment",
"required": true,
"type": "multiplechoice",
"variable": "environment",
"choices": ["development", "staging", "production"],
"default": "development"
},
{
"question_name": "Instance Count",
"question_description": "Number of instances to deploy",
"required": true,
"type": "integer",
"variable": "instance_count",
"min": 1,
"max": 10,
"default": 2
}
]
}
10. Security and Best Practices {#security}
Security Architecture
graph TD
A[User Access] --> B[Authentication]
B --> C[LDAP/SAML/OAuth]
B --> D[Local Users]
C --> E[Role-Based Access]
D --> E
E --> F[Organization Level]
E --> G[Team Level]
E --> H[Resource Level]
F --> I[Admin Rights]
G --> J[Team Permissions]
H --> K[Object Permissions]
L[Credential Management] --> M[Encrypted Storage]
L --> N[Credential Injection]
L --> O[External Secret Stores]
RBAC Configuration
# rbac-configuration.yml
organizations:
- name: "Production Org"
teams:
- name: "Infrastructure Team"
permissions:
- "use" # credentials
- "read" # inventories
- "execute" # job templates
members:
- "devops-user1"
- "devops-user2"
- name: "Development Team"
permissions:
- "read" # limited access
members:
- "dev-user1"
- "dev-user2"
custom_roles:
- name: "Cloud Administrator"
permissions:
- "add_host"
- "change_host"
- "delete_host"
- "use_credential"
- "execute_jobtemplate"
- name: "Read Only"
permissions:
- "read_jobtemplate"
- "read_inventory"YAMLVault Integration
# vault-integration.yml
---
- name: Retrieve Secrets from Vault
hosts: localhost
gather_facts: false
vars:
vault_url: "https://vault.company.com"
vault_path: "secret/data/aws"
tasks:
- name: Read AWS credentials from Vault
uri:
url: "{{ vault_url }}/v1/{{ vault_path }}"
method: GET
headers:
X-Vault-Token: "{{ vault_token }}"
return_content: yes
register: vault_response
no_log: true
- name: Set AWS credentials
set_fact:
aws_access_key: "{{ vault_response.json.data.data.access_key }}"
aws_secret_key: "{{ vault_response.json.data.data.secret_key }}"
      no_log: true
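If the community.hashi_vault collection (and the hvac Python library) is available on the control node, the same read collapses into a single lookup. A sketch; the KV v2 path layout is an assumption:
# vault-lookup.yml (alternative sketch; requires community.hashi_vault)
---
- name: Read secrets via the hashi_vault lookup
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Fetch the AWS access key from Vault
      set_fact:
        aws_access_key: >-
          {{ lookup('community.hashi_vault.hashi_vault',
                    'secret/data/aws:access_key',
                    url='https://vault.company.com',
                    token=vault_token) }}
      no_log: true
Security Best Practices Checklist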
graph LR
A[Security Checklist] --> B[Access Control]
A --> C[Credential Security]
A --> D[Network Security]
A --> E[Audit Logging]
B --> F[✓ RBAC Configured]
B --> G[✓ MFA Enabled]
B --> H[✓ Session Timeout]
C --> I[✓ Encrypted Storage]
C --> J[✓ Credential Rotation]
C --> K[✓ External Vaults]
D --> L[✓ TLS/HTTPS]
D --> M[✓ Network Isolation]
D --> N[✓ Firewall Rules]
E --> O[✓ Activity Logs]
E --> P[✓ SIEM Integration]
E --> Q[✓ Compliance Reports]
10. Troubleshooting {#troubleshooting}
Common Issues and Solutions
graph TD
A[AWX Issues] --> B[Job Failures]
A --> C[Connectivity Problems]
A --> D[Performance Issues]
A --> E[Authentication Errors]
B --> F[Check Playbook Syntax]
B --> G[Verify Inventory]
B --> H[Review Credentials]
C --> I[Network Connectivity]
C --> J[Firewall Rules]
C --> K[DNS Resolution]
D --> L[Resource Limits]
D --> M[Database Performance]
D --> N[Worker Capacity]
E --> O[Credential Validation]
E --> P[Permission Issues]
E --> Q[Token Expiration]
Diagnostic Commands
# AWX Container Logs
docker logs awx_web
docker logs awx_task
# Database Connectivity
docker exec -it awx_postgres psql -U awx -d awx
# Redis Queue Status
docker exec -it awx_redis redis-cli
> KEYS *
> LLEN default
# System Resources
docker stats
df -h
free -m
# Network Debugging
ping target-host
telnet target-host 22
nslookup target-host
Log Analysis Playbook
# log-analysis.yml
---
- name: AWX Log Analysis
hosts: awx_servers
gather_facts: true
tasks:
- name: Collect AWX logs
      shell: |
        # docker logs writes to stderr as well, so capture both streams
        docker logs awx_web --tail 1000 > /tmp/awx_web.log 2>&1
        docker logs awx_task --tail 1000 > /tmp/awx_task.log 2>&1
- name: Check for common errors
lineinfile:
path: /tmp/awx_web.log
regexp: "{{ item }}"
state: absent
check_mode: yes
register: error_check
loop:
- "ERROR"
- "CRITICAL"
- "Connection refused"
- "Authentication failed"
- name: Generate error report
template:
src: error-report.j2
dest: /tmp/awx-error-report.html
      when: error_check is changed
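The final task references an error-report.j2 template that is not shown. A minimal sketch, driven by the registered error_check results:
{# templates/error-report.j2 (minimal sketch) #}
<html>
  <body>
    <h1>AWX Error Report ({{ inventory_hostname }})</h1>
    <ul>
    {% for result in error_check.results %}
      {% if result.changed | default(false) %}
      <li>Pattern "{{ result.item }}" matched in /tmp/awx_web.log</li>
      {% endif %}
    {% endfor %}
    </ul>
  </body>
</html>
Performance Monitoring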
# performance-monitoring.yml
---
- name: AWX Performance Monitoring
hosts: localhost
gather_facts: false
tasks:
- name: Check job queue length
uri:
url: "http://awx-server/api/v2/jobs/?status=pending"
method: GET
headers:
Authorization: "Bearer {{ awx_token }}"
register: pending_jobs
- name: Monitor system resources
uri:
url: "http://awx-server/api/v2/metrics/"
method: GET
headers:
Authorization: "Bearer {{ awx_token }}"
register: metrics
- name: Alert if queue is too long
mail:
to: admin@company.com
subject: "AWX Queue Alert"
body: "Pending jobs: {{ pending_jobs.json.count }}"
      when: pending_jobs.json.count > 10
11. Real-World Projects {#projects}
Project 1: Multi-Cloud Web Application Deployment
graph TB
subgraph "Multi-Cloud Architecture"
A[Load Balancer] --> B[AWS Region]
A --> C[GCP Region]
A --> D[Azure Region]
B --> E[AWS ALB]
C --> F[GCP Load Balancer]
D --> G[Azure Load Balancer]
E --> H[EC2 Instances]
F --> I[GCE Instances]
G --> J[Azure VMs]
H --> K[RDS Database]
I --> L[Cloud SQL]
J --> M[Azure SQL]
end
Master Playbook
# multi-cloud-webapp.yml
---
- name: Deploy Multi-Cloud Web Application
hosts: localhost
gather_facts: false
vars:
app_name: "webapp"
    app_version: "latest"  # extra vars passed at launch override this default
regions:
aws: us-east-1
gcp: us-central1
azure: eastus
tasks:
- name: Deploy to AWS
include_tasks: aws-webapp-deploy.yml
vars:
cloud_provider: aws
region: "{{ regions.aws }}"
when: deploy_aws | default(true)
- name: Deploy to GCP
include_tasks: gcp-webapp-deploy.yml
vars:
cloud_provider: gcp
region: "{{ regions.gcp }}"
when: deploy_gcp | default(false)
- name: Deploy to Azure
include_tasks: azure-webapp-deploy.yml
vars:
cloud_provider: azure
region: "{{ regions.azure }}"
when: deploy_azure | default(false)
- name: Configure Global Load Balancing
include_tasks: global-lb-config.yml
when: global_lb | default(false)
- name: Run Health Checks
include_tasks: health-checks.yml
- name: Update DNS Records
      include_tasks: dns-update.yml
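The included task files are not shown; as one example, health-checks.yml might look like the sketch below, with per-cloud hostnames as placeholders:
# health-checks.yml (minimal sketch of the included task file)
---
- name: Verify application health in each active cloud
  uri:
    url: "https://{{ item }}/health"
    method: GET
    status_code: 200
  register: health
  until: health.status == 200
  retries: 5
  delay: 15
  loop:
    - "webapp-aws.company.com"      # placeholder endpoints
    - "webapp-gcp.company.com"
    - "webapp-azure.company.com"
Project 2: Kubernetes Multi-Cloud Management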
# k8s-multi-cloud.yml
---
- name: Multi-Cloud Kubernetes Management
hosts: localhost
gather_facts: false
tasks:
- name: Deploy EKS Cluster
include_tasks: eks-cluster.yml
when: deploy_eks | default(false)
- name: Deploy GKE Cluster
include_tasks: gke-cluster.yml
when: deploy_gke | default(false)
- name: Deploy AKS Cluster
include_tasks: aks-cluster.yml
when: deploy_aks | default(false)
- name: Configure Multi-Cloud Service Mesh
include_tasks: service-mesh-config.yml
when: service_mesh | default(false)
- name: Deploy Applications
include_tasks: k8s-app-deploy.yml
loop: "{{ applications }}"YAMLProject 3: Disaster Recovery Automation
sequenceDiagram
participant Monitor
participant AWX
participant Primary
participant Secondary
participant DNS
Monitor->>AWX: Trigger DR Workflow
AWX->>Primary: Check Service Status
Primary-->>AWX: Service Down
AWX->>Secondary: Activate Backup Site
Secondary-->>AWX: Services Started
AWX->>DNS: Update Records
DNS-->>AWX: Records Updated
AWX->>Monitor: DR Complete
# disaster-recovery.yml
---
- name: Disaster Recovery Automation
hosts: localhost
gather_facts: false
vars:
primary_region: us-east-1
dr_region: us-west-2
tasks:
- name: Check primary site health
uri:
url: "https://{{ primary_site_url }}/health"
method: GET
timeout: 10
register: primary_health
ignore_errors: true
- name: Activate disaster recovery
block:
- name: Start DR instances
amazon.aws.ec2_instance:
instance_ids: "{{ dr_instance_ids }}"
state: started
region: "{{ dr_region }}"
- name: Update load balancer
amazon.aws.elb_application_lb:
name: "{{ app_lb_name }}"
state: present
scheme: internet-facing
listeners:
- Protocol: HTTP
Port: 80
DefaultActions:
- Type: forward
TargetGroupArn: "{{ dr_target_group_arn }}"
- name: Update DNS records
          amazon.aws.route53:
            state: present
            overwrite: true
            zone: "{{ dns_zone }}"
            record: "{{ app_domain }}"
            type: A
            alias: true
            alias_hosted_zone_id: "{{ dr_lb_zone_id }}"
            value: "{{ dr_lb_dns_name }}"
- name: Send notification
mail:
to: "{{ notification_emails }}"
subject: "DR Activated for {{ app_name }}"
body: |
Disaster recovery has been activated.
Primary site: {{ primary_site_url }} (DOWN)
DR site: {{ dr_site_url }} (ACTIVE)
      # uri reports status -1 when the connection fails outright
      when: primary_health.status | default(-1) != 200
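The health check only pays off if it runs continuously. One option is an AWX schedule; a sketch using the awx.awx collection, assuming the playbook is wired to a job template named Disaster Recovery Automation:
# schedule-dr-check.yml (hedged sketch)
---
- name: Run the DR health check on a schedule
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Trigger the check every five minutes
      awx.awx.schedule:
        name: "DR Health Check"
        unified_job_template: "Disaster Recovery Automation"
        rrule: "DTSTART:20250101T000000Z RRULE:FREQ=MINUTELY;INTERVAL=5"
        state: present
Project 4: CI/CD Pipeline Integration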
# cicd-integration.yml
---
- name: CI/CD Pipeline with AWX
hosts: localhost
gather_facts: false
tasks:
- name: Checkout code
git:
repo: "{{ git_repo }}"
dest: /tmp/app-code
version: "{{ git_branch | default('main') }}"
- name: Build application
shell: |
cd /tmp/app-code
docker build -t {{ app_name }}:{{ build_number }} .
    - name: Push to registry
      community.docker.docker_image:
        name: "{{ app_name }}"
        tag: "{{ build_number }}"
        repository: "{{ docker_registry }}/{{ app_name }}"
        push: true
        source: local  # the image was built locally in the previous task
- name: Deploy to staging
k8s:
state: present
definition:
apiVersion: apps/v1
kind: Deployment
metadata:
name: "{{ app_name }}"
namespace: staging
spec:
replicas: 2
selector:
matchLabels:
app: "{{ app_name }}"
template:
metadata:
labels:
app: "{{ app_name }}"
spec:
containers:
- name: "{{ app_name }}"
image: "{{ docker_registry }}/{{ app_name }}:{{ build_number }}"
ports:
- containerPort: 8080
    - name: Run tests
      uri:
        url: "http://{{ staging_url }}/api/health"
        method: GET
        status_code: 200
      register: staging_health
      # retries only take effect when paired with an explicit until condition
      until: staging_health.status == 200
      retries: 5
      delay: 10
Chapter 17: CI/CD Integration Patterns {#cicd}
Jenkins Integration with AWX
AWX seamlessly integrates with Jenkins to create powerful CI/CD pipelines that combine application deployment with infrastructure automation. This integration enables GitOps workflows and automated deployment strategies.
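Everything that follows hinges on an AWX API token, which Jenkins stores as the awx-api-token credential. One way to mint it is via the /api/v2/tokens/ endpoint; in this sketch the service-account name and AWX URL are placeholders, and the token is returned only once, at creation time.
# create-jenkins-token.yml (hedged sketch)
---
- name: Mint an AWX API token for Jenkins
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Create a write-scoped token
      uri:
        url: "https://awx.company.com/api/v2/tokens/"
        method: POST
        user: "jenkins-svc"                  # placeholder service account
        password: "{{ awx_service_password }}"
        force_basic_auth: true
        body_format: json
        body:
          description: "Jenkins CI/CD integration"
          scope: "write"
        status_code: 201
      register: awx_token_response
      no_log: true
    - name: Remind where the token belongs
      debug:
        msg: "Store awx_token_response.json.token as the 'awx-api-token' credential in Jenkins"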
Architecture Overview
graph TB
subgraph "CI/CD Pipeline Architecture"
A[Git Repository] --> B[Jenkins Pipeline]
B --> C[Build & Test]
C --> D[Container Registry]
D --> E[AWX Job Template]
E --> F[Ansible Playbooks]
F --> G[Target Infrastructure]
H[GitOps Repository] --> I[Config Changes]
I --> J[Webhook Trigger]
J --> E
K[Monitoring] --> L[Alerts]
L --> M[Auto-remediation]
M --> E
end
subgraph "AWX Components"
N[Projects]
O[Job Templates]
P[Workflows]
Q[Inventories]
end
E --> N
E --> O
E --> P
E --> Q
style B fill:#326ce5
style E fill:#ee0000
style H fill:#f39c12
Jenkins AWX Plugin Configuration
1. Configure the AWX Connection in Jenkins
// Jenkinsfile - AWX connection setup
pipeline {
agent any
tools {
ansible 'ansible-latest'
}
environment {
AWX_URL = 'https://awx.company.com'
AWX_TOKEN = credentials('awx-api-token')
DOCKER_REGISTRY = 'registry.company.com'
}
stages {
stage('Setup AWX Connection') {
steps {
script {
// Configure AWX connection
awxConfig = [
url: env.AWX_URL,
token: env.AWX_TOKEN,
trustSelfSignedCerts: true
]
}
}
}
}
}
2. Jenkins Pipeline with AWX Integration
// Complete Jenkins Pipeline
pipeline {
agent any
parameters {
choice(
name: 'ENVIRONMENT',
choices: ['development', 'staging', 'production'],
description: 'Target deployment environment'
)
choice(
name: 'DEPLOYMENT_TYPE',
choices: ['rolling', 'blue-green', 'canary'],
description: 'Deployment strategy'
)
booleanParam(
name: 'RUN_TESTS',
defaultValue: true,
description: 'Run integration tests'
)
}
environment {
APP_NAME = 'webapp'
BUILD_NUMBER = "${env.BUILD_NUMBER}"
GIT_COMMIT_SHORT = "${env.GIT_COMMIT[0..7]}"
}
stages {
stage('Code Checkout') {
steps {
checkout scm
script {
env.VERSION = sh(
script: "git describe --tags --always",
returnStdout: true
).trim()
}
}
}
stage('Build Application') {
steps {
script {
def buildImage = docker.build(
"${DOCKER_REGISTRY}/${APP_NAME}:${VERSION}"
)
buildImage.push()
buildImage.push('latest')
}
}
}
stage('Security Scan') {
steps {
script {
sh """
trivy image --exit-code 1 --severity HIGH,CRITICAL \
${DOCKER_REGISTRY}/${APP_NAME}:${VERSION}
"""
}
}
}
stage('Deploy to Environment') {
steps {
script {
def jobResult = triggerAWXJob([
jobTemplate: "Deploy ${APP_NAME}",
inventory: "${ENVIRONMENT}-inventory",
extraVars: [
app_name: env.APP_NAME,
app_version: env.VERSION,
environment: params.ENVIRONMENT,
deployment_type: params.DEPLOYMENT_TYPE,
build_number: env.BUILD_NUMBER,
git_commit: env.GIT_COMMIT_SHORT,
docker_image: "${DOCKER_REGISTRY}/${APP_NAME}:${VERSION}"
],
credential: 'awx-ssh-key',
waitForCompletion: true,
throwOnFailure: true
])
echo "AWX Job ID: ${jobResult.id}"
echo "Job Status: ${jobResult.status}"
}
}
}
stage('Integration Tests') {
when {
expression { params.RUN_TESTS }
}
steps {
script {
triggerAWXJob([
jobTemplate: "Integration Tests",
inventory: "${ENVIRONMENT}-inventory",
extraVars: [
target_environment: params.ENVIRONMENT,
app_version: env.VERSION
],
waitForCompletion: true,
throwOnFailure: true
])
}
}
}
stage('Smoke Tests') {
steps {
script {
triggerAWXJob([
jobTemplate: "Smoke Tests",
inventory: "${ENVIRONMENT}-inventory",
extraVars: [
app_url: "https://${APP_NAME}-${ENVIRONMENT}.company.com",
expected_version: env.VERSION
],
waitForCompletion: true,
throwOnFailure: true
])
}
}
}
}
post {
success {
script {
// Update deployment status
triggerAWXJob([
jobTemplate: "Update Deployment Status",
extraVars: [
deployment_status: "success",
app_name: env.APP_NAME,
version: env.VERSION,
environment: params.ENVIRONMENT,
build_url: env.BUILD_URL
],
waitForCompletion: false
])
// Send notifications
slackSend(
channel: '#deployments',
color: 'good',
message: "✅ ${APP_NAME} v${VERSION} deployed to ${ENVIRONMENT} successfully!"
)
}
}
failure {
script {
triggerAWXJob([
jobTemplate: "Deployment Rollback",
inventory: "${ENVIRONMENT}-inventory",
extraVars: [
app_name: env.APP_NAME,
environment: params.ENVIRONMENT,
rollback_reason: "Build ${BUILD_NUMBER} failed"
],
waitForCompletion: true
])
slackSend(
channel: '#deployments',
color: 'danger',
message: "❌ ${APP_NAME} deployment to ${ENVIRONMENT} failed! Rollback initiated."
)
}
}
always {
// Archive deployment artifacts
archiveArtifacts artifacts: 'deployment-logs/**', allowEmptyArchive: true
// Cleanup
sh "docker rmi ${DOCKER_REGISTRY}/${APP_NAME}:${VERSION} || true"
}
}
}
GitOps Workflow Implementation
1. GitOps Repository Structure
gitops-configs/
├── environments/
│ ├── development/
│ │ ├── apps/
│ │ │ ├── webapp/
│ │ │ │ ├── deployment.yml
│ │ │ │ ├── service.yml
│ │ │ │ └── configmap.yml
│ │ │ └── api/
│ │ └── infrastructure/
│ │ ├── networking.yml
│ │ └── storage.yml
│ ├── staging/
│ └── production/
├── ansible/
│ ├── playbooks/
│ │ ├── deploy-app.yml
│ │ ├── rollback.yml
│ │ └── gitops-sync.yml
│ └── roles/
│ ├── kubernetes-deploy/
│ ├── config-management/
│ └── monitoring/
└── jenkins/
├── Jenkinsfile.deploy
├── Jenkinsfile.gitops
└── shared-libraries/
2. GitOps Synchronization Playbook
# ansible/playbooks/gitops-sync.yml
---
- name: GitOps Configuration Sync
hosts: kubernetes_masters
gather_facts: false
vars:
gitops_repo: "https://github.com/company/gitops-configs.git"
gitops_branch: "{{ environment | default('main') }}"
sync_timeout: 300
tasks:
- name: Clone GitOps repository
git:
repo: "{{ gitops_repo }}"
dest: "/tmp/gitops-{{ environment }}"
version: "{{ gitops_branch }}"
force: yes
delegate_to: localhost
run_once: true
- name: Validate Kubernetes manifests
      shell: |
        kubectl apply --dry-run=client -f {{ item }}
loop: "{{ config_files }}"
delegate_to: localhost
run_once: true
- name: Apply infrastructure configurations
k8s:
state: present
src: "/tmp/gitops-{{ environment }}/environments/{{ environment }}/infrastructure/"
wait: true
wait_timeout: "{{ sync_timeout }}"
delegate_to: localhost
run_once: true
- name: Apply application configurations
k8s:
state: present
src: "/tmp/gitops-{{ environment }}/environments/{{ environment }}/apps/"
wait: true
wait_timeout: "{{ sync_timeout }}"
delegate_to: localhost
run_once: true
- name: Verify deployment status
k8s_info:
api_version: apps/v1
kind: Deployment
namespace: "{{ item.namespace }}"
name: "{{ item.name }}"
register: deployment_status
loop: "{{ applications }}"
delegate_to: localhost
run_once: true
- name: Wait for rollout completion
shell: |
kubectl rollout status deployment/{{ item.name }} -n {{ item.namespace }} --timeout=300s
loop: "{{ applications }}"
delegate_to: localhost
run_once: true
- name: Record deployment event
uri:
url: "{{ monitoring_webhook }}"
method: POST
body_format: json
body:
event_type: "deployment"
environment: "{{ environment }}"
          timestamp: "{{ lookup('pipe', 'date -u +%Y-%m-%dT%H:%M:%SZ') }}"  # facts are disabled, so ansible_date_time is unavailable
          applications: "{{ applications }}"
          git_commit: "{{ lookup('env', 'GIT_COMMIT') | default('unknown', true) }}"
status: "success"
delegate_to: localhost
      run_once: true
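The playbook expects config_files, applications, and monitoring_webhook to arrive as extra vars from the AWX job template. An illustrative payload (all values are assumptions):
# extra_vars passed by the AWX job template (illustrative)
environment: staging
config_files:
  - /tmp/gitops-staging/environments/staging/apps/webapp/deployment.yml
  - /tmp/gitops-staging/environments/staging/apps/webapp/service.yml
applications:
  - name: webapp
    namespace: staging
  - name: api
    namespace: staging
monitoring_webhook: "https://monitoring.company.com/events"
3. Webhook-Triggered GitOps Pipeline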
// jenkins/Jenkinsfile.gitops
pipeline {
agent any
triggers {
// GitHub webhook trigger
githubPush()
}
parameters {
string(
name: 'CHANGED_FILES',
defaultValue: '',
description: 'Comma-separated list of changed files'
)
}
environment {
KUBECONFIG = credentials('kubeconfig')
}
stages {
stage('Detect Changes') {
steps {
script {
def changedFiles = env.CHANGED_FILES?.split(',') ?: []
def affectedEnvironments = []
changedFiles.each { file ->
if (file.startsWith('environments/')) {
def envMatch = file =~ /environments\/([^\/]+)\//
if (envMatch) {
affectedEnvironments << envMatch[0][1]
}
}
}
env.AFFECTED_ENVIRONMENTS = affectedEnvironments.unique().join(',')
echo "Affected environments: ${env.AFFECTED_ENVIRONMENTS}"
}
}
}
stage('Validate Configurations') {
steps {
script {
env.AFFECTED_ENVIRONMENTS.split(',').each { environment ->
if (environment) {
triggerAWXJob([
jobTemplate: "Validate GitOps Config",
extraVars: [
environment: environment,
git_branch: env.BRANCH_NAME,
validation_mode: true
],
waitForCompletion: true,
throwOnFailure: true
])
}
}
}
}
}
stage('Apply Changes') {
steps {
script {
env.AFFECTED_ENVIRONMENTS.split(',').each { environment ->
if (environment) {
def deploymentJob = triggerAWXJob([
jobTemplate: "GitOps Sync",
inventory: "${environment}-kubernetes",
extraVars: [
environment: environment,
git_branch: env.BRANCH_NAME,
git_commit: env.GIT_COMMIT,
triggered_by: "GitOps Webhook",
sync_mode: "apply"
],
waitForCompletion: true,
throwOnFailure: true
])
echo "GitOps sync completed for ${environment}"
echo "Job ID: ${deploymentJob.id}"
}
}
}
}
}
stage('Verify Deployment') {
steps {
script {
env.AFFECTED_ENVIRONMENTS.split(',').each { environment ->
if (environment) {
triggerAWXJob([
jobTemplate: "Post-Deployment Verification",
inventory: "${environment}-kubernetes",
extraVars: [
environment: environment,
verification_timeout: 600
],
waitForCompletion: true,
throwOnFailure: true
])
}
}
}
}
}
}
post {
success {
script {
if (env.AFFECTED_ENVIRONMENTS) {
slackSend(
channel: '#gitops',
color: 'good',
message: """✅ GitOps sync completed successfully!
Environments: ${env.AFFECTED_ENVIRONMENTS}
Commit: ${env.GIT_COMMIT[0..7]}
Branch: ${env.BRANCH_NAME}"""
)
}
}
}
failure {
script {
if (env.AFFECTED_ENVIRONMENTS) {
slackSend(
channel: '#gitops',
color: 'danger',
message: """❌ GitOps sync failed!
Environments: ${env.AFFECTED_ENVIRONMENTS}
Check: ${env.BUILD_URL}"""
)
}
}
}
}
}
AWX Job Template Configuration for Jenkins Integration
1. Deploy Application Job Template
# AWX Job Template: Deploy Application
name: "Deploy Application"
description: "Deploy application via Jenkins CI/CD pipeline"
job_type: "run"
inventory: "Dynamic - {{ environment }}"
project: "Application Deployment"
playbook: "deploy-app.yml"
credential: "SSH Key - Deployment"
verbosity: 1
extra_vars: |
# These variables are passed from Jenkins
app_name: "{{ app_name }}"
app_version: "{{ app_version }}"
environment: "{{ environment }}"
deployment_type: "{{ deployment_type | default('rolling') }}"
docker_image: "{{ docker_image }}"
build_number: "{{ build_number }}"
git_commit: "{{ git_commit }}"
# Environment-specific configurations
replicas: "{% if environment == 'production' %}5{% elif environment == 'staging' %}3{% else %}1{% endif %}"
resources:
requests:
memory: "{% if environment == 'production' %}512Mi{% else %}256Mi{% endif %}"
cpu: "{% if environment == 'production' %}500m{% else %}250m{% endif %}"
limits:
memory: "{% if environment == 'production' %}1Gi{% else %}512Mi{% endif %}"
cpu: "{% if environment == 'production' %}1000m{% else %}500m{% endif %}"
survey_enabled: true
survey_spec:
- variable: environment
question_name: "Target Environment"
question_description: "Select the deployment environment"
required: true
type: "multiplechoice"
choices: ["development", "staging", "production"]
- variable: deployment_type
question_name: "Deployment Strategy"
question_description: "Choose deployment strategy"
required: true
type: "multiplechoice"
choices: ["rolling", "blue-green", "canary"]
default: "rolling"
- variable: skip_tests
question_name: "Skip Tests"
question_description: "Skip post-deployment tests"
required: false
type: "boolean"
    default: false
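The same template definition can live in version control and be applied with the awx.awx collection. A sketch; organization and inventory names are assumptions, and parameter support varies by collection version:
# manage-job-template.yml (hedged sketch)
---
- name: Keep the deploy template in code
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Ensure the Deploy Application template exists
      awx.awx.job_template:
        name: "Deploy Application"
        description: "Deploy application via Jenkins CI/CD pipeline"
        job_type: run
        organization: "Production Org"       # assumption
        project: "Application Deployment"
        playbook: "deploy-app.yml"
        inventory: "staging-inventory"       # assumption
        credentials:
          - "SSH Key - Deployment"
        survey_enabled: true
        state: present
2. Application Deployment Playbook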
# playbooks/deploy-app.yml
---
- name: Deploy Application to Kubernetes
hosts: kubernetes_masters[0]
gather_facts: false
vars:
namespace: "{{ app_name }}-{{ environment }}"
    deployment_timestamp: "{{ lookup('pipe', 'date +%s') }}"  # gather_facts is false, so ansible_date_time is unavailable
pre_tasks:
- name: Validate required variables
assert:
that:
- app_name is defined
- app_version is defined
- docker_image is defined
- environment is defined
fail_msg: "Required deployment variables are missing"
- name: Create namespace if not exists
k8s:
name: "{{ namespace }}"
api_version: v1
kind: Namespace
state: present
tasks:
- name: Deploy application based on strategy
include_tasks: "deployment-strategies/{{ deployment_type }}.yml"
- name: Apply configuration maps
k8s:
state: present
definition:
apiVersion: v1
kind: ConfigMap
metadata:
name: "{{ app_name }}-config"
namespace: "{{ namespace }}"
labels:
app: "{{ app_name }}"
version: "{{ app_version }}"
environment: "{{ environment }}"
data:
environment: "{{ environment }}"
app_version: "{{ app_version }}"
build_number: "{{ build_number | default('unknown') }}"
git_commit: "{{ git_commit | default('unknown') }}"
- name: Apply service configuration
k8s:
state: present
definition:
apiVersion: v1
kind: Service
metadata:
name: "{{ app_name }}"
namespace: "{{ namespace }}"
labels:
app: "{{ app_name }}"
spec:
selector:
app: "{{ app_name }}"
ports:
- port: 80
targetPort: 8080
protocol: TCP
type: ClusterIP
- name: Apply ingress configuration
k8s:
state: present
definition:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: "{{ app_name }}"
namespace: "{{ namespace }}"
annotations:
nginx.ingress.kubernetes.io/rewrite-target: /
cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
tls:
- hosts:
- "{{ app_name }}-{{ environment }}.company.com"
secretName: "{{ app_name }}-tls"
rules:
- host: "{{ app_name }}-{{ environment }}.company.com"
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: "{{ app_name }}"
port:
number: 80
post_tasks:
- name: Wait for deployment rollout
k8s_info:
api_version: apps/v1
kind: Deployment
name: "{{ app_name }}"
namespace: "{{ namespace }}"
wait: true
wait_condition:
type: Progressing
status: "True"
reason: NewReplicaSetAvailable
wait_timeout: 600
- name: Verify pods are running
k8s_info:
api_version: v1
kind: Pod
namespace: "{{ namespace }}"
label_selectors:
- app={{ app_name }}
register: pod_status
- name: Run health checks
uri:
url: "https://{{ app_name }}-{{ environment }}.company.com/health"
method: GET
status_code: 200
retries: 10
delay: 30
when: not skip_tests | default(false)
- name: Record deployment metrics
uri:
url: "{{ prometheus_pushgateway_url }}/metrics/job/deployment"
method: POST
body: |
deployment_total{app="{{ app_name }}",environment="{{ environment }}",version="{{ app_version }}"} 1
deployment_timestamp{app="{{ app_name }}",environment="{{ environment }}"} {{ deployment_timestamp }}
headers:
Content-Type: "text/plain"
      ignore_errors: true
Rolling Deployment Strategy
# deployment-strategies/rolling.yml
---
- name: Rolling deployment strategy
k8s:
state: present
definition:
apiVersion: apps/v1
kind: Deployment
metadata:
name: "{{ app_name }}"
namespace: "{{ namespace }}"
labels:
app: "{{ app_name }}"
version: "{{ app_version }}"
environment: "{{ environment }}"
annotations:
deployment.kubernetes.io/revision: "{{ build_number | default('1') }}"
spec:
replicas: "{{ replicas | int }}"
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 1
selector:
matchLabels:
app: "{{ app_name }}"
template:
metadata:
labels:
app: "{{ app_name }}"
version: "{{ app_version }}"
environment: "{{ environment }}"
spec:
containers:
- name: "{{ app_name }}"
image: "{{ docker_image }}"
ports:
- containerPort: 8080
env:
- name: ENVIRONMENT
value: "{{ environment }}"
- name: VERSION
value: "{{ app_version }}"
livenessProbe:
httpGet:
path: /health
port: 8080
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
httpGet:
path: /ready
port: 8080
initialDelaySeconds: 5
periodSeconds: 5
              resources: "{{ resources }}"
Blue-Green Deployment Strategy
# deployment-strategies/blue-green.yml
---
- name: Get blue deployment
  k8s_info:
    api_version: apps/v1
    kind: Deployment
    name: "{{ app_name }}-blue"
    namespace: "{{ namespace }}"
  register: blue_deployment
  ignore_errors: true
- name: Get green deployment
  k8s_info:
    api_version: apps/v1
    kind: Deployment
    name: "{{ app_name }}-green"
    namespace: "{{ namespace }}"
  register: green_deployment
  ignore_errors: true
- name: Determine target color
  set_fact:
    target_color: "{% if blue_deployment.resources | default([]) | length > 0 %}green{% else %}blue{% endif %}"
    inactive_color: "{% if blue_deployment.resources | default([]) | length > 0 %}blue{% else %}green{% endif %}"
- name: Deploy to target environment
k8s:
state: present
definition:
apiVersion: apps/v1
kind: Deployment
metadata:
name: "{{ app_name }}-{{ target_color }}"
namespace: "{{ namespace }}"
labels:
app: "{{ app_name }}"
color: "{{ target_color }}"
version: "{{ app_version }}"
spec:
replicas: "{{ replicas | int }}"
selector:
matchLabels:
app: "{{ app_name }}"
color: "{{ target_color }}"
template:
metadata:
labels:
app: "{{ app_name }}"
color: "{{ target_color }}"
version: "{{ app_version }}"
spec:
containers:
- name: "{{ app_name }}"
image: "{{ docker_image }}"
ports:
- containerPort: 8080
resources: "{{ resources }}"
- name: Wait for new deployment
k8s_info:
api_version: apps/v1
kind: Deployment
name: "{{ app_name }}-{{ target_color }}"
namespace: "{{ namespace }}"
wait: true
wait_condition:
type: Available
status: "True"
wait_timeout: 600
- name: Update service to point to new deployment
k8s:
state: present
definition:
apiVersion: v1
kind: Service
metadata:
name: "{{ app_name }}"
namespace: "{{ namespace }}"
spec:
selector:
app: "{{ app_name }}"
color: "{{ target_color }}"
ports:
- port: 80
targetPort: 8080
- name: Remove old deployment
k8s:
state: absent
api_version: apps/v1
kind: Deployment
name: "{{ app_name }}-{{ inactive_color }}"
namespace: "{{ namespace }}"
  when: not keep_previous_version | default(false)
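The survey offers canary as a third strategy, but no deployment-strategies/canary.yml was shown. A minimal sketch follows; the canary_replicas variable and the track label scheme are assumptions, not part of the templates above:
# deployment-strategies/canary.yml (minimal sketch)
---
- name: Deploy canary subset alongside the stable release
  k8s:
    state: present
    definition:
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: "{{ app_name }}-canary"
        namespace: "{{ namespace }}"
        labels:
          app: "{{ app_name }}"
          track: canary
      spec:
        replicas: "{{ canary_replicas | default(1) | int }}"
        selector:
          matchLabels:
            app: "{{ app_name }}"
            track: canary
        template:
          metadata:
            labels:
              # pods keep the shared app label, so the existing Service
              # routes a replica-weighted share of traffic to the canary
              app: "{{ app_name }}"
              track: canary
              version: "{{ app_version }}"
          spec:
            containers:
              - name: "{{ app_name }}"
                image: "{{ docker_image }}"
                ports:
                  - containerPort: 8080
- name: Promote the canary by rolling the new image out everywhere
  include_tasks: rolling.yml
  when: canary_promote | default(false)
Jenkins Shared Library for AWX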
// vars/triggerAWXJob.groovy
def call(Map config) {
def awxUrl = env.AWX_URL ?: 'https://awx.company.com'
def awxToken = env.AWX_TOKEN
if (!awxToken) {
error "AWX_TOKEN environment variable is required"
}
    def jobTemplateId = getJobTemplateId(config.jobTemplate)
    def launchData = [
        extra_vars: config.extraVars ?: [:],
        job_tags: config.tags ?: '',
        skip_tags: config.skipTags ?: '',
        limit: config.limit ?: '',
        verbosity: config.verbosity ?: 0
    ]
    // Inventory is optional; resolve it only when the caller provides one
    if (config.inventory) {
        launchData.inventory = getInventoryId(config.inventory)
    }
if (config.credential) {
launchData.credential = getCredentialId(config.credential)
}
def response = httpRequest(
httpMode: 'POST',
url: "${awxUrl}/api/v2/job_templates/${jobTemplateId}/launch/",
requestBody: groovy.json.JsonOutput.toJson(launchData),
contentType: 'APPLICATION_JSON',
customHeaders: [
[name: 'Authorization', value: "Bearer ${awxToken}"]
],
validResponseCodes: '200:299'
)
def jobData = readJSON text: response.content
def jobId = jobData.id
echo "AWX Job launched: ${jobId}"
echo "Job URL: ${awxUrl}/#/jobs/playbook/${jobId}"
if (config.waitForCompletion) {
return waitForJobCompletion(jobId, config.throwOnFailure ?: false)
}
return [id: jobId, status: 'running']
}
def getJobTemplateId(String templateName) {
    return lookupResourceId('job_templates', templateName, 'Job template')
}

// getInventoryId and getCredentialId were referenced above but never defined;
// all three lookups share the same name-to-id query against the AWX API
def getInventoryId(String inventoryName) {
    return lookupResourceId('inventories', inventoryName, 'Inventory')
}

def getCredentialId(String credentialName) {
    return lookupResourceId('credentials', credentialName, 'Credential')
}

def lookupResourceId(String resource, String name, String label) {
    def response = httpRequest(
        httpMode: 'GET',
        url: "${env.AWX_URL}/api/v2/${resource}/?name=${URLEncoder.encode(name, 'UTF-8')}",
        customHeaders: [
            [name: 'Authorization', value: "Bearer ${env.AWX_TOKEN}"]
        ],
        validResponseCodes: '200:299'
    )
    def data = readJSON text: response.content
    if (data.count == 0) {
        error "${label} '${name}' not found"
    }
    return data.results[0].id
}
def waitForJobCompletion(int jobId, boolean throwOnFailure) {
def maxAttempts = 120 // 10 minutes with 5-second intervals
def attempts = 0
while (attempts < maxAttempts) {
def response = httpRequest(
httpMode: 'GET',
url: "${env.AWX_URL}/api/v2/jobs/${jobId}/",
customHeaders: [
[name: 'Authorization', value: "Bearer ${env.AWX_TOKEN}"]
],
validResponseCodes: '200:299'
)
def jobData = readJSON text: response.content
def status = jobData.status
if (status in ['successful', 'failed', 'error', 'canceled']) {
echo "AWX Job ${jobId} completed with status: ${status}"
if (throwOnFailure && status != 'successful') {
error "AWX Job ${jobId} failed with status: ${status}"
}
return [id: jobId, status: status, result: jobData]
}
echo "AWX Job ${jobId} status: ${status} (attempt ${attempts + 1}/${maxAttempts})"
sleep 5
attempts++
}
if (throwOnFailure) {
error "AWX Job ${jobId} timed out"
}
return [id: jobId, status: 'timeout']
}
This Jenkins shared library, combined with the pipeline and GitOps patterns above, gives teams complete CI/CD building blocks for automated infrastructure and application management.
One final task completes Project 4's cicd-integration.yml pipeline: the production promotion step, gated on the staging test result.
- name: Deploy to production
k8s:
state: present
definition:
apiVersion: apps/v1
kind: Deployment
metadata:
name: "{{ app_name }}"
namespace: production
spec:
replicas: 5
selector:
matchLabels:
app: "{{ app_name }}"
template:
metadata:
labels:
app: "{{ app_name }}"
spec:
containers:
- name: "{{ app_name }}"
image: "{{ docker_registry }}/{{ app_name }}:{{ build_number }}"
ports:
- containerPort: 8080
  when: staging_tests_passed | default(false)
Conclusion
This comprehensive guide has covered AWX/Ansible Tower from basic concepts to advanced multi-cloud automation scenarios. Key takeaways include:
Architecture Summary
graph TB
A[AWX Platform] --> B[Multi-Cloud Support]
A --> C[Enterprise Features]
A --> D[Security & RBAC]
B --> E[AWS Integration]
B --> F[GCP Integration]
B --> G[Azure Integration]
C --> H[Workflow Automation]
C --> I[Job Scheduling]
C --> J[Inventory Management]
D --> K[Credential Security]
D --> L[Access Control]
D --> M[Audit Logging]
Best Practices Summary
- Security First: Always use encrypted credentials and RBAC
- Infrastructure as Code: Version control all playbooks
- Testing: Implement staging environments and testing workflows
- Monitoring: Set up comprehensive logging and alerting
- Documentation: Maintain clear documentation for all automation
Next Steps
- Practice with the provided examples
- Set up a lab environment
- Integrate with your existing infrastructure
- Explore advanced features like custom credential types
- Join the Ansible community for support and updates
This guide provides the foundation for becoming proficient with AWX/Ansible Tower across all major cloud platforms. Continue practicing and building on these concepts to become a cloud automation expert.