Deployment, Provisioning, and Automation

AWS Certified CloudOps Engineer - Associate (SOA-C03) — Domain 3: Deployment, Provisioning and Automation

Exam Code: SOA-C03  |  Level: Associate
Domain Weight: 22%  |  Total Domains: 5  |  Passing Score: 720/1000


Table of Contents

  1. AWS CloudFormation — Deep Dive
  2. AWS Systems Manager — Complete Reference
  3. EC2 Instances — Provisioning and Purchasing
  4. Containers and Serverless
  5. Messaging and Decoupling
  6. AWS Organizations — Account Governance
  7. Exam Tips & Quick Reference

1. AWS CloudFormation — Deep Dive

CloudFormation is the primary infrastructure-as-code service for AWS. The exam tests template structure, dependency management, update controls, and multi-account deployment via StackSets.

1.1 Template Structure

AWSTemplateFormatVersion: "2010-09-09"
Description: "Template description"

Parameters:   # Inputs at deploy time — enables reuse across environments
Mappings:     # Lookup tables (e.g., region → AMI ID)
Conditions:   # Conditionally create resources
Transform:    # SAM/macros transformations
Resources:    # REQUIRED — all AWS resources defined here
Outputs:      # Values to export for cross-stack references or display

Parameters — Enabling Template Reuse

Parameters allow the same template to deploy across dev/staging/prod with different values:

Parameters:
  EnvironmentType:
    Type: String
    AllowedValues: [dev, staging, prod]
    Default: dev
  InstanceType:
    Type: String
    Default: t3.micro
    AllowedValues: [t3.micro, t3.small, t3.medium, m5.large]

Exam Tip: "Write a single CloudFormation template that can be reused for multiple environments" → use Parameters — not nested stacks, not user data, not stack policies.

Mappings — Region-Specific Values

AMI IDs are region-specific — the same AMI ID does not exist in both us-east-1 and eu-west-1. Use Mappings with !FindInMap to resolve the correct AMI per region:

Mappings:
  RegionMap:
    us-east-1:
      AMI: ami-0abc123
    eu-west-1:
      AMI: ami-0def456

Resources:
  WebServer:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: !FindInMap [RegionMap, !Ref "AWS::Region", AMI]

Exam Tip: "CloudFormation template works in us-east-1 but fails in eu-west-1 with 'AMI does not exist'" → add AMI IDs to the Mappings section per region, and run aws ec2 copy-image to copy the AMI to the new region first.
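
The lookup that !FindInMap performs can be pictured as a two-level dictionary access. The Python sketch below is an illustration only, not CloudFormation's implementation; REGION_MAP, find_in_map, and resolve_ami are hypothetical names, and the AMI IDs are the placeholders from the example above.

```python
# Hypothetical mapping mirroring the RegionMap example; AMI IDs are placeholders.
REGION_MAP = {
    "us-east-1": {"AMI": "ami-0abc123"},
    "eu-west-1": {"AMI": "ami-0def456"},
}


def find_in_map(mapping: dict, top_key: str, second_key: str) -> str:
    """Two-level lookup, loosely modelling !FindInMap [Map, TopKey, SecondKey]."""
    try:
        return mapping[top_key][second_key]
    except KeyError as exc:
        # A region with no entry fails here instead of producing a wrong AMI ID.
        raise KeyError(f"no mapping entry for {top_key!r}/{second_key!r}") from exc


def resolve_ami(region: str) -> str:
    """Return the AMI ID registered for the given region."""
    return find_in_map(REGION_MAP, region, "AMI")
```

A region that is missing from the map raises an error immediately, which is the analogue of the template failing fast rather than launching with an AMI ID that does not exist in that region.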

1.2 Resource Dependencies

DependsOn Attribute

Use DependsOn when CloudFormation cannot infer the dependency automatically — for example, an EC2 instance that connects to RDS on startup must wait for the DB to be created:

WebServer:
  Type: AWS::EC2::Instance
  DependsOn: DatabaseInstance
  Properties: ...

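Note: Changing the order of resources in the template does NOT affect creation order. CloudFormation builds a dependency graph from references between resources (Ref, Fn::GetAtt, Fn::Sub) and from DependsOn; position in the template file is irrelevant. Where two resources have no reference between them, you must add DependsOn explicitly.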
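
Why file order is irrelevant can be shown with a topological sort. This is a minimal sketch using Python's standard graphlib module, assuming a two-resource template; creation_order and the two dictionaries are hypothetical names, and the real service also creates independent resources in parallel.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def creation_order(resources: dict[str, set[str]]) -> list[str]:
    """Map of logical ID -> logical IDs it depends on; returns a valid creation order."""
    return list(TopologicalSorter(resources).static_order())


# The same two resources written in opposite order in the "template".
web_first = {"WebServer": {"DatabaseInstance"}, "DatabaseInstance": set()}
db_first = {"DatabaseInstance": set(), "WebServer": {"DatabaseInstance"}}
```

Both dictionaries yield the database before the web server, because only the dependency edge decides the order.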

CreationPolicy — Waiting for Bootstrap Completion

For Auto Scaling groups and EC2 instances that run initialization scripts, use CreationPolicy to wait for cfn-signal before marking the resource as complete:

WebServerGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  CreationPolicy:
    ResourceSignal:
      Count: 3
      Timeout: PT4H  # ISO 8601 duration
  Properties:
    MinSize: "3"
    MaxSize: "6"

The user data script must call cfn-signal at the end:

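#!/bin/bash
# Assumes this script is wrapped in Fn::Sub so that ${AWS::StackName}
# and ${AWS::Region} are substituted before the instance boots.
yum install -y httpd
# ... configure application ...

# Report the exit status of the last command to CloudFormation (0 = success).
/opt/aws/bin/cfn-signal -e $? \
  --stack ${AWS::StackName} \
  --resource WebServerGroup \
  --region ${AWS::Region}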

Exam Tip: "Stack creation fails because Auto Scaling group wait condition is not receiving required signals" → run cfn-signal at the completion of the user data script. If cfn-signal is not called, CloudFormation waits until timeout then rolls back.
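
The interaction between Count, Timeout, and cfn-signal can be modelled in a few lines. The sketch below is a simplified illustration, not CloudFormation's logic: duration_seconds handles only the PT#H#M#S subset of ISO 8601, and creation_outcome is a hypothetical helper that takes signal arrival times in seconds.

```python
import re

_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")


def duration_seconds(value: str) -> int:
    """Convert a PT#H#M#S duration such as PT4H or PT5M into seconds."""
    match = _DURATION.match(value)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {value!r}")
    hours, minutes, seconds = (int(group or 0) for group in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def creation_outcome(required: int, signals: list[tuple[float, bool]], timeout: str) -> str:
    """signals holds (seconds since creation started, success flag) pairs."""
    deadline = duration_seconds(timeout)
    successes = 0
    for elapsed, ok in sorted(signals):
        if elapsed > deadline:
            break  # signals after the timeout are too late to count
        if not ok:
            return "CREATE_FAILED"  # a failure signal fails the resource
        successes += 1
        if successes >= required:
            return "CREATE_COMPLETE"
    return "CREATE_FAILED"  # too few signals before the timeout
```

With Count: 3, two success signals are not enough: the resource stays in progress until the timeout and then fails, which triggers the rollback described above.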

1.3 Stack Update Policies

Stack Policies — Preventing Accidental Updates

A stack policy is a JSON document applied to a stack that controls which resources can be updated:

{
  "Statement": [
    { "Effect": "Allow", "Principal": "*", "Action": "Update:*", "Resource": "*" },
    { "Effect": "Deny", "Principal": "*", "Action": "Update:*", "Resource": "LogicalResourceId/ProductionDatabase" }
  ]
}

Note: Stack policies protect specific resources within a stack. IAM policies protect actions on CloudFormation as a service. They serve different purposes.
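
How the two statements in the stack policy combine can be sketched as a small evaluator. This is an illustrative model under simplifying assumptions (string Action and Resource values, shell-style wildcards); POLICY and update_allowed are hypothetical names, not part of any AWS SDK.

```python
from fnmatch import fnmatchcase

# The stack policy from the example above, as a Python dictionary.
POLICY = {
    "Statement": [
        {"Effect": "Allow", "Principal": "*", "Action": "Update:*", "Resource": "*"},
        {"Effect": "Deny", "Principal": "*", "Action": "Update:*",
         "Resource": "LogicalResourceId/ProductionDatabase"},
    ]
}


def update_allowed(policy: dict, logical_id: str, action: str = "Update:Modify") -> bool:
    """Explicit Deny wins; with no matching Allow the resource stays protected."""
    resource = f"LogicalResourceId/{logical_id}"
    allowed = False
    for statement in policy["Statement"]:
        if not fnmatchcase(action, statement["Action"]):
            continue
        if not fnmatchcase(resource, statement["Resource"]):
            continue
        if statement["Effect"] == "Deny":
            return False
        allowed = True
    return allowed
```

The default of False reflects that once a stack policy is set, resources without a matching Allow are protected.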

AutoScalingRollingUpdate Policy

Controls how instances in an Auto Scaling group are replaced during stack updates:

WebServerGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  UpdatePolicy:
    AutoScalingRollingUpdate:
      MaxBatchSize: 2
      MinInstancesInService: 3
      PauseTime: PT5M
      WaitOnResourceSignals: true

AutoScalingReplacingUpdate creates a completely new Auto Scaling group and switches traffic after it is healthy — full capacity maintained, fast rollback, but double capacity cost temporarily.

1.4 Resource Management

Deletion Policies

Control what happens to resources when a stack is deleted:

Policy | Behavior
Delete (default) | Resource is deleted with the stack
Retain | Resource persists after stack deletion
Snapshot | Takes a final snapshot before deleting (RDS, EBS, ElastiCache)

Exam Tip: "Delete CloudFormation stack without deleting the DynamoDB table" → add DeletionPolicy: Retain to the DynamoDB resource.

DELETE_FAILED Stack State

Stack gets stuck in DELETE_FAILED when a resource cannot be deleted. Common causes: S3 bucket contains objects, or a security group has dependencies from other resources.

Fix: delete the stack again but specify resources to retain:

aws cloudformation delete-stack \
  --stack-name my-stack \
  --retain-resources ProblematicSecurityGroup S3BucketWithData

Dynamic References to Secrets

Instead of hardcoding credentials, use dynamic references:

Database:
  Type: AWS::RDS::DBInstance
  Properties:
    MasterUserPassword: "{{resolve:secretsmanager:MyDBSecret:SecretString:password}}"
    # Or for SSM Parameter Store:
    # SomeParameter: "{{resolve:ssm:/my/parameter/name}}"
    # Or for SSM SecureString:
    # SomeSecret: "{{resolve:ssm-secure:/my/secret/name}}"
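
The structure of a dynamic reference string can be made explicit with a small parser. This is a sketch for illustration only: parse_dynamic_reference and the field names are hypothetical, and it assumes the secret is referenced by name rather than by ARN (an ARN contains colons and would need different splitting).

```python
import re

_DYNAMIC_REF = re.compile(r"^\{\{resolve:(ssm-secure|ssm|secretsmanager):(.+)\}\}$")

# Hypothetical field names for the colon-separated segments of each service.
_FIELDS = {
    "ssm": ["parameter_name", "version"],
    "ssm-secure": ["parameter_name", "version"],
    "secretsmanager": ["secret_id", "secret_string", "json_key",
                       "version_stage", "version_id"],
}


def parse_dynamic_reference(value: str) -> dict:
    """Split a {{resolve:service:reference-key}} string into named parts."""
    match = _DYNAMIC_REF.match(value)
    if match is None:
        raise ValueError(f"not a dynamic reference: {value!r}")
    service, reference_key = match.groups()
    parts = reference_key.split(":")
    fields = _FIELDS[service]
    if len(parts) > len(fields):
        raise ValueError(f"too many segments for {service}: {value!r}")
    return {"service": service, **dict(zip(fields, parts))}
```

Trailing segments such as the version are optional, so the result contains only the fields that were present in the string.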

1.5 StackSets

StackSets deploy the same CloudFormation template across multiple accounts and regions from a single management account.

Management Account
       │
       ▼
  StackSets ──► Account A (us-east-1, eu-west-1)
                Account B (us-east-1, eu-west-1)
                Account C (us-east-1, eu-west-1)

Exam Tip: "Enable AWS Config in all accounts of the organization and in all AWS Regions in the most operationally efficient way" → CloudFormation StackSets from management account with SERVICE_MANAGED permission model. SERVICE_MANAGED uses roles that Organizations creates on your behalf, and with automatic deployment enabled, new accounts added to the target OU automatically get the StackSet applied.

When a StackSet stack instance shows OUTDATED, common causes are: the template is creating a global resource with a name collision across regions, the service is not available in the target region, or there is a permission issue in the target account.
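
A StackSet produces one stack instance per account and region pair, which is why a fixed name for a global resource (an IAM role, for example) collides across regions within the same account. The sketch below is an illustration with hypothetical names (stack_instances, colliding_names, and the config-role name template); it does not call any AWS API.

```python
from collections import Counter
from itertools import product


def stack_instances(accounts: list[str], regions: list[str]) -> list[tuple[str, str]]:
    """One stack instance per (account, region) pair."""
    return list(product(accounts, regions))


def colliding_names(instances: list[tuple[str, str]], name_template: str) -> list[tuple[str, str]]:
    """Return (account, name) pairs that more than one stack instance would create."""
    counts = Counter(
        (account, name_template.format(region=region)) for account, region in instances
    )
    return sorted(key for key, count in counts.items() if count > 1)
```

Including the region in the name, as in "config-role-{region}", removes the collision in this model.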

1.6 Nested Stacks

Nested stacks reuse common template components across multiple parent templates. The parent passes parameters to child stacks and uses !GetAtt to read child stack outputs.

Feature | Nested Stacks | StackSets
Purpose | Reuse components within one account/region | Deploy same template across multiple accounts/regions
Scope | Single account, single region | Multi-account, multi-region
Use case | Modular templates | Org-wide governance

2. AWS Systems Manager — Complete Reference

2.1 Session Manager

Session Manager provides browser-based or CLI interactive shell access to EC2 instances and on-premises servers with no SSH port (22) required, no SSH key pairs, and no bastion hosts. Session activity can be logged to CloudWatch Logs or S3 once session logging is configured.

Prerequisites: SSM Agent installed and running, IAM instance profile with AmazonSSMManagedInstanceCore, outbound HTTPS (443) from instance to SSM endpoints, and for private subnet instances without NAT: VPC interface endpoints for SSM.

Controlling Access by Instance Tags

Use IAM policy conditions on tags to restrict which instances a user can access:

{
  "Effect": "Allow",
  "Action": "ssm:StartSession",
  "Resource": "*",
  "Condition": {
    "StringEquals": { "ssm:resourceTag/Environment": "Development" }
  }
}
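
The effect of the tag condition can be sketched as a check of the statement against an instance's tags. This is a simplified model, not IAM's evaluation engine; STATEMENT, TAG_PREFIX, and can_start_session are hypothetical names, and only the StringEquals operator is handled.

```python
TAG_PREFIX = "ssm:resourceTag/"

# An Allow statement with the same tag condition as the policy above.
STATEMENT = {
    "Effect": "Allow",
    "Action": "ssm:StartSession",
    "Resource": "*",
    "Condition": {"StringEquals": {"ssm:resourceTag/Environment": "Development"}},
}


def can_start_session(statement: dict, instance_tags: dict) -> bool:
    """True only if every tag condition matches the instance's tags."""
    if statement.get("Effect") != "Allow":
        return False
    conditions = statement.get("Condition", {}).get("StringEquals", {})
    for key, expected in conditions.items():
        if not key.startswith(TAG_PREFIX):
            raise ValueError(f"unsupported condition key: {key!r}")
        if instance_tags.get(key[len(TAG_PREFIX):]) != expected:
            return False
    return True
```

An instance with no Environment tag fails the condition, so untagged instances are not reachable through this statement.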

2.2 SSM Run Command

Run Command executes commands or scripts on many instances simultaneously without SSH, targeting by instance ID, tag, resource group, or all managed instances. Documents include AWS-RunShellScript (Linux) and AWS-RunPowerShellScript (Windows). Output is stored in S3 or CloudWatch Logs.

Run Command vs. State Manager

Feature | Run Command | State Manager
Execution | One-time, on-demand | Continuous, on schedule
Purpose | Ad-hoc task execution | Enforce desired state
Good for | Patching, security response, one-off tasks | Ensure consistent config, agent health

2.3 SSM Patch Manager

Patch Manager Architecture

Patch Baseline → defines which patches to approve/reject
Patch Group   → set of instances (tagged with "Patch Group: <name>")
Maintenance Window → when patching occurs (days, times, duration)
Run Command   → executes patching using AWS-RunPatchBaseline document

Instances are assigned to patch groups via the tag key Patch Group (exactly this spelling, including the space). Each patch group can have its own patch baseline — for example, development baseline auto-approves patches 2 days after release while production baseline auto-approves after 5 days.

Exam Tip: "Dev instances patched 2 days after release; prod patched 5 days after release; 2-hour maintenance window for all" → two patch groups + two patch baselines + one maintenance window.
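
The two-baseline, one-window arrangement can be worked through with dates. The sketch below is illustrative arithmetic only: BASELINES holds the auto-approval delays from the example, and approval_date, approved_patches, and the patch names in the usage are hypothetical.

```python
from datetime import date, timedelta

# Auto-approval delay in days per patch group, taken from the example above.
BASELINES = {"dev": 2, "prod": 5}


def approval_date(released: date, patch_group: str) -> date:
    """Date on which a patch becomes approved for the given patch group."""
    return released + timedelta(days=BASELINES[patch_group])


def approved_patches(releases: dict[str, date], patch_group: str, window_day: date) -> list[str]:
    """Patches the shared maintenance window would install for this group."""
    return sorted(
        name for name, released in releases.items()
        if approval_date(released, patch_group) <= window_day
    )
```

For a window on 6 January with patches released on 1 and 4 January, the dev group installs both while prod installs only the first, even though both groups share the same window.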

2.4 Parameter Store vs. Secrets Manager

Feature | Parameter Store | Secrets Manager
Cost | Free (standard) | $0.40/secret/month + API charges
Secret rotation | Manual only | Built-in rotation with Lambda
RDS integration | None | Native RDS password rotation
Encryption | Optional (SecureString with KMS) | Always encrypted with KMS
Max value size | 4 KB standard, 8 KB advanced | 64 KB
Cross-account sharing | Advanced-tier parameters only (via AWS RAM) | Yes (resource policy)

Requirement | Use
Store and auto-rotate RDS credentials every 30 days | Secrets Manager
Store CloudWatch agent JSON configuration for fleet | Parameter Store
Store non-secret application configuration values | Parameter Store

3. EC2 Instances — Provisioning and Purchasing

3.1 AMI Management

Pre-Baking (Golden AMI) Pattern

User data scripts that install and configure software take 5–20 minutes, making scale-out responses slow. The solution is to pre-bake AMIs: launch a base instance, install all software, create an AMI from the configured instance, and use this golden AMI in launch templates. New instances launch with software already installed and are ready in under 2 minutes.

Use EC2 Image Builder to automate the golden AMI creation pipeline: source image → build steps (install software) → test steps → distribute to regions.

3.2 EC2 Placement Groups

Group Type | Layout | Max Instances per AZ | Use Case
Cluster | Packed close together in a single AZ; up to 10 Gbps per flow between instances | No fixed limit (subject to capacity) | HPC, MPI, tight low-latency communication
Spread | Each instance on a different rack | 7 | Small deployments needing maximum fault isolation
Partition | Groups of instances per partition; each partition on separate hardware | Up to 7 partitions; hundreds of instances | Large distributed systems (Hadoop, Cassandra, Kafka)

Exam Tip: "10 EC2 instances, highly available, distinct underlying hardware" → Spread placement group spanning multiple AZs. "HPC application, minimum latency between nodes" → Cluster placement group in a single subnet.
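
The reason 10 instances force a multi-AZ spread placement group is simple arithmetic on the limit of 7 running instances per AZ. A minimal sketch, with hypothetical helper names and AZ labels:

```python
import math

SPREAD_LIMIT_PER_AZ = 7  # running instances per AZ in one spread placement group


def min_azs_for_spread(instance_count: int) -> int:
    """Smallest number of AZs that can hold the instances in one spread group."""
    if instance_count < 1:
        raise ValueError("instance_count must be positive")
    return math.ceil(instance_count / SPREAD_LIMIT_PER_AZ)


def spread_across_azs(instance_count: int, azs: list[str]) -> dict[str, int]:
    """Distribute instances as evenly as possible across the given AZs."""
    if len(azs) < min_azs_for_spread(instance_count):
        raise ValueError("not enough AZs for a spread placement group")
    base, extra = divmod(instance_count, len(azs))
    return {az: base + (1 if index < extra else 0) for index, az in enumerate(azs)}
```

Ten instances need at least two AZs, and an even split keeps every AZ below the limit of seven.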

3.3 EC2 Purchasing Options

Option | Commitment | Best For
Standard Reserved Instances | 1 or 3 years; specific family/size/region | Stable workloads; can sell on Marketplace
Convertible Reserved Instances | 1 or 3 years; exchangeable for different family/OS | When instance type may need to change mid-term
Compute Savings Plans | $/hour commitment; any EC2 + Lambda + Fargate; any region | Region migrations during commitment period
EC2 Instance Savings Plans | $/hour; specific family and region | Predictable workloads needing higher discount
Spot Instances | None; up to 90% discount; interruptible | Fault-tolerant batch, rendering, testing
Spot Blocks (retired) | 1–6 hour defined duration; not interruptible | Legacy option for batch jobs < 6 hours; closed to new customers in July 2021 and unsupported since the end of 2022

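Exam Tip: "Batch jobs run < 2 hours; if a job fails it must restart from beginning" → older question banks answer Spot Blocks (defined duration), because regular Spot could be interrupted mid-job and force a full restart. Spot Blocks have since been retired, so if they are not among the options and the job cannot tolerate interruption, choose On-Demand capacity.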

Spot Fleet Allocation Strategies

Strategy | Behavior | Use Case
lowest-price | Selects cheapest pool(s) | Maximum cost savings; higher interruption risk
capacity-optimized | Selects pools with most available capacity | Minimum interruption risk
diversified | Distributes across all pools | Balance of cost and reliability
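
The difference between the three strategies can be illustrated with a toy allocator. This is a sketch under loud assumptions: the pool names, prices, and the "capacity" scores are invented for the example (real spare capacity is internal to AWS and not visible to customers), and allocate is a hypothetical function, not an EC2 API.

```python
def allocate(pools: list[dict], target: int, strategy: str) -> dict[str, int]:
    """Return instances per pool for a target capacity under one strategy."""
    if strategy == "lowest-price":
        best = min(pools, key=lambda pool: pool["price"])
        return {best["name"]: target}
    if strategy == "capacity-optimized":
        best = max(pools, key=lambda pool: pool["capacity"])
        return {best["name"]: target}
    if strategy == "diversified":
        base, extra = divmod(target, len(pools))
        return {
            pool["name"]: base + (1 if index < extra else 0)
            for index, pool in enumerate(pools)
        }
    raise ValueError(f"unknown strategy: {strategy!r}")


# Hypothetical Spot capacity pools (instance type + AZ combinations).
POOLS = [
    {"name": "pool-a", "price": 0.03, "capacity": 10},
    {"name": "pool-b", "price": 0.05, "capacity": 80},
    {"name": "pool-c", "price": 0.04, "capacity": 40},
]
```

In this model lowest-price concentrates everything in the cheapest pool, which is why a single reclaim event can take out the whole fleet, while diversified limits the impact of any one pool.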

3.4 Elastic Beanstalk Deployment Policies

Policy | Downtime | Capacity During Update | Rollback Speed
All at once | Yes | Reduced (all updating) | Redeploy (same speed as deploy)
Rolling | No | Reduced by batch size | Rolling back each batch
Rolling with additional batch | No | Full (extra instances launched first) | Roll back batches
Immutable | No | Full (new ASG created alongside old) | Fast (terminate the new ASG; original instances untouched)

Exam Tip: "Maintain full capacity during deployment" → Immutable or Rolling with additional batch. Both maintain full capacity, but Immutable allows faster rollback: the new instances live in a temporary Auto Scaling group that is simply terminated, leaving the original instances untouched.
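
The capacity column of the deployment-policy table reduces to a small calculation. The sketch below is a simplified model that assumes a fixed fleet size and batch size; min_in_service is a hypothetical helper, and the policy strings follow the Elastic Beanstalk DeploymentPolicy option values.

```python
def min_in_service(policy: str, fleet_size: int, batch_size: int) -> int:
    """Lowest number of instances serving traffic at any point in a deployment."""
    if policy == "AllAtOnce":
        return 0  # every instance is updated at the same time
    if policy == "Rolling":
        return fleet_size - batch_size  # one batch is out of service at a time
    if policy in ("RollingWithAdditionalBatch", "Immutable"):
        return fleet_size  # extra instances are launched before any are replaced
    raise ValueError(f"unknown policy: {policy!r}")
```

For a fleet of four instances with a batch size of two, only the last two policies keep all four instances in service throughout.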


4. Containers and Serverless

4.1 Amazon ECS — Task Networking

Network Mode | ENI Assignment | Security Group | Use Case
awsvpc | Per-task ENI | Per-task security group | Production; micro-segmentation
bridge | Shared host ENI | Host-level security group | Simple apps; port mapping
host | Shares host ENI directly | Host security group | Maximum performance

Exam Tip: "Monitor traffic flows between ECS tasks" → requires awsvpc network mode so each task gets its own ENI that can have VPC Flow Logs.

4.2 Lambda in VPC

Lambda functions run outside your VPC by default and have internet access but no access to VPC resources (RDS, ElastiCache, etc.). Placing Lambda in a VPC gives it access to VPC resources but removes direct internet access — just like any other private subnet resource.

To restore internet access for Lambda in a VPC: place Lambda in a private subnet and route through a NAT gateway in a public subnet.

Lambda → RDS connection problem: Lambda creates many short-lived connections and can exhaust the RDS max_connections parameter. Fix: deploy RDS Proxy between Lambda and RDS.

4.3 EKS Configuration

The kubeconfig file is required for kubectl to communicate with an EKS cluster. It maps the cluster API server endpoint, cluster CA certificate, and AWS IAM authentication credentials.

aws eks update-kubeconfig \
  --region us-east-1 \
  --name my-cluster

Exam Tip: "CloudOps Engineer needs to manage EKS cluster using kubectl. What must be configured?" → The kubeconfig file.


5. Messaging and Decoupling

5.1 Amazon SQS — Complete Reference

Standard vs. FIFO Queues

Feature | Standard | FIFO
Message ordering | Best-effort only | Guaranteed strict ordering (within a message group)
Delivery | At-least-once (duplicates possible) | Exactly-once processing
Throughput | Nearly unlimited | 300 TPS (3,000 with batching)
Naming | Any name | Must end in .fifo

Migrating from Standard to FIFO

You cannot convert an existing Standard queue to FIFO. You must: create a new FIFO queue (with .fifo suffix), enable content-based deduplication (or have producers send a MessageDeduplicationId), update the application to include MessageGroupId in all messages, update consumers to read from the new queue, then drain and delete the old queue.
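
What content-based deduplication and MessageGroupId do for the new queue can be modelled in memory. This is a toy model, not the SQS API: FifoQueueModel is a hypothetical class, timestamps are passed in as plain seconds, and it relies on two documented behaviours (the deduplication ID is a SHA-256 hash of the body, and the deduplication interval is 5 minutes).

```python
import hashlib

DEDUP_WINDOW_SECONDS = 300  # 5-minute deduplication interval


class FifoQueueModel:
    """In-memory model of a FIFO queue with content-based deduplication."""

    def __init__(self) -> None:
        self._accepted_at: dict[str, float] = {}  # dedup ID -> time accepted
        self.groups: dict[str, list[str]] = {}    # MessageGroupId -> ordered bodies

    def send(self, body: str, group_id: str, now: float) -> bool:
        """Return True if the message is enqueued, False if dropped as a duplicate."""
        dedup_id = hashlib.sha256(body.encode("utf-8")).hexdigest()
        last = self._accepted_at.get(dedup_id)
        if last is not None and now - last < DEDUP_WINDOW_SECONDS:
            return False
        self._accepted_at[dedup_id] = now
        self.groups.setdefault(group_id, []).append(body)
        return True
```

A retry with the same body inside the window is dropped, while ordering is preserved separately for each message group.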

SQS Priority Queue Pattern

To process alarms before status updates: create two separate queues (high-priority and low-priority). The application polls the high-priority queue first and only polls the low-priority queue when the high-priority queue is empty.
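
The polling rule for the two-queue pattern can be sketched with in-memory queues standing in for SQS. The names next_message and drain are hypothetical, and a real consumer would call ReceiveMessage on each queue instead of popping from a deque.

```python
from collections import deque
from typing import Optional


def next_message(high: deque, low: deque) -> Optional[str]:
    """Take from the high-priority queue first; fall back to low only when it is empty."""
    if high:
        return high.popleft()
    if low:
        return low.popleft()
    return None


def drain(high: deque, low: deque) -> list[str]:
    """Consume both queues and return messages in processing order."""
    processed = []
    while (message := next_message(high, low)) is not None:
        processed.append(message)
    return processed
```

Because the high-priority queue is checked before every receive, an alarm that arrives while status updates are being processed is picked up on the very next poll.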


6. AWS Organizations — Account Governance

6.1 Service Control Policies (SCPs)

Key Concept: SCPs set the maximum permissions for accounts in an organization. They apply to all users and roles including the root user of member accounts. SCPs cannot grant permissions — they only restrict. An explicit Deny in an SCP overrides any Allow in IAM policies.

Common SCP patterns:

Restrict to approved regions:

{
  "Effect": "Deny",
  "Action": "*",
  "Resource": "*",
  "Condition": {
    "StringNotEquals": { "aws:RequestedRegion": ["us-east-1", "us-west-2"] }
  }
}

Require tags on resource creation:

{
  "Effect": "Deny",
  "Action": ["ec2:RunInstances"],
  "Resource": "arn:aws:ec2:*:*:instance/*",
  "Condition": { "Null": { "aws:RequestTag/CostCenter": "true" } }
}

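SCPs can be attached at the Root (all accounts), OU level (all accounts in that OU), or individual account level. The effective permission set is the intersection of the SCPs at every level: an action must be allowed by the Root, by each parent OU, and by the account itself, and an explicit Deny at any level blocks it.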
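
How SCPs at several levels combine with IAM can be sketched as two checks. This is a simplified model, not the AWS policy evaluation engine: each level is reduced to lists of allowed and denied action patterns, conditions are ignored, and scp_permits, is_authorized, and LEVELS are hypothetical names.

```python
from fnmatch import fnmatchcase


def _matches(action: str, patterns: list[str]) -> bool:
    return any(fnmatchcase(action, pattern) for pattern in patterns)


def scp_permits(action: str, levels: list[dict]) -> bool:
    """levels runs from Root down to the account; every level must allow the action."""
    for level in levels:
        if _matches(action, level.get("deny", [])):
            return False  # explicit Deny at any level wins
        if not _matches(action, level.get("allow", [])):
            return False  # not allowed at this level, so filtered out
    return True


def is_authorized(action: str, levels: list[dict], iam_allowed: list[str]) -> bool:
    """SCPs only filter; an IAM policy must still grant the action."""
    return scp_permits(action, levels) and _matches(action, iam_allowed)


# Hypothetical hierarchy: Root allows everything, the OU narrows it down.
LEVELS = [
    {"allow": ["*"], "deny": []},
    {"allow": ["ec2:*", "s3:*"], "deny": ["ec2:TerminateInstances"]},
]
```

The last check shows why an SCP never grants access by itself: an action that passes every SCP is still denied when no IAM policy allows it.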

6.2 AWS Control Tower

Account Factory

Automates new account creation with pre-configured baselines: IAM Identity Center SSO, centralized logging, CloudTrail, Config, and guardrails. Every new account starts compliant.

Guardrails

Preventive guardrails (SCPs) prevent non-compliant actions. Detective guardrails (Config rules) detect and report non-compliance after the fact.

6.3 AWS Service Catalog

When a portfolio is shared with another AWS account, the recipient can view and launch products from the imported portfolio and add products to their own local portfolio, but cannot modify products in the imported portfolio — all changes must be made in the original admin account.

Exam Tip: "Most efficient way to create replica of existing infrastructure in a new AWS account" → Share the Service Catalog portfolio with the new account → import it into the new account.

TagOptions Library enforces consistent tagging at product launch by defining allowed tag keys and values that are required or automatically applied when a user launches a product.


7. Exam Tips & Quick Reference

Scenario-to-Answer Mapping

Scenario Keyword / Requirement | Correct Answer
"Same template across environments" | Use Parameters in CloudFormation template
"CloudFormation fails: AMI not found in region" | Use Mappings section with AMI IDs per region
"Wait for bootstrap before stack completes" | CreationPolicy + cfn-signal in user data
"cfn-signal not received, stack fails" | Add cfn-signal call at end of user data script
"Stack DELETE_FAILED due to dependencies" | Delete again, specify resources to retain
"Prevent accidental updates to specific resources" | CloudFormation stack policy with Deny
"Apply same CloudFormation to all org accounts" | CloudFormation StackSets with SERVICE_MANAGED
"Store/auto-rotate RDS credentials in template" | AWS::SecretsManager::Secret + RotationSchedule
"Access to EC2 without SSH; centrally logged" | SSM Session Manager
"Run commands on fleet by OS tag" | SSM Run Command with tag-based targets
"Continuously enforce CloudWatch agent config" | SSM State Manager association
"Automated patching with different schedules dev/prod" | Two patch baselines + two patch groups + one maintenance window
"Moving regions; need committed discount" | Compute Savings Plans (not Standard RIs)
"Change instance type mid-commitment" | Convertible RIs
"Batch job < 2 hrs; cannot be interrupted" | Spot Blocks (defined duration) in legacy questions; the feature is retired, so On-Demand otherwise
"Maintain full capacity during Beanstalk deployment" | Immutable or Rolling with additional batch
"kubectl won't connect to EKS" | Configure kubeconfig file
"Lambda → RDS too many connections" | Deploy RDS Proxy
"Migrate from SQS Standard to FIFO" | Create new FIFO queue; cannot convert existing
"Prioritize alarm messages over status messages" | Two SQS queues; application polls alarm queue first
"Replica of infra in new account via Service Catalog" | Share portfolio → import in new account
"Prevent specific service in all org accounts" | SCP at organization root
"New account provisioning with guardrails" | AWS Control Tower Account Factory
"Tag enforcement when launching products" | Service Catalog TagOptions Library

Common Traps

  • Changing resource order in a template has no effect: CloudFormation uses a dependency graph. Creation order follows references between resources (Ref, Fn::GetAtt); where no reference exists, DependsOn is the only way to enforce it.
  • StackSets vs. nested stacks: StackSets are for multi-account/multi-region deployment. Nested stacks are for modular template reuse within one account. These are frequently confused.
  • Compute vs. Instance Savings Plans: If the scenario involves migrating regions or changing instance families during the commitment period, the answer is always Compute Savings Plans.
  • SCP cannot grant permissions: SCPs only cap what IAM policies can allow. A principal still needs an IAM policy that grants the action; an SCP alone never provides access.

Key Terms — Domain 3

Term | One-Line Definition
CloudFormation Mappings | Lookup table section used to resolve region-specific values like AMI IDs
CreationPolicy | CloudFormation attribute that waits for cfn-signal before marking a resource COMPLETE
Stack Policy | JSON document protecting specific resources in a stack from accidental updates
StackSets | CloudFormation feature for deploying templates across multiple accounts and regions
DeletionPolicy: Retain | Causes a resource to persist after the CloudFormation stack is deleted
Spot Blocks | Retired Spot Instance option with a guaranteed defined duration (1–6 hours)
Immutable deployment | Beanstalk deployment that creates a new ASG alongside the old one
Compute Savings Plans | Cost commitment that applies to any EC2, Lambda, or Fargate in any region
SCP | Service Control Policy setting maximum permissions for all principals in a member account
Account Factory | Control Tower feature that automates new account creation with compliant baselines

End of Domain 3. Continue to Domain 4: Security and Compliance →

