AWS Certified Developer – Associate (DVA-C02)

Domain 1: Development with AWS Services

Exam Code: DVA-C02  |  Level: Associate
Domain Weight: 32%  |  Total Domains: 4  |  Passing Score: 720/1000


Table of Contents

  1. AWS Lambda
  2. Amazon API Gateway
  3. Amazon DynamoDB
  4. Amazon S3
  5. Amazon SQS
  6. Amazon SNS
  7. Amazon EventBridge
  8. AWS Step Functions
  9. Amazon Kinesis
  10. AWS SAM — Serverless Application Model
  11. AWS SDK — Patterns & Error Handling
  12. Exam Tips & Quick Reference

1. AWS Lambda

Lambda is the backbone of serverless development on AWS. It runs code in response to events without you provisioning or managing servers. You are billed per request and per GB-second of compute.

1.1 Core Concepts

Concept | Detail
Runtime | Node.js, Python, Java, Go, .NET, Ruby, or bring your own via a Custom Runtime (bootstrap file)
Handler | The function entry point. Format: filename.method_name (e.g., index.handler)
Memory | 128 MB – 10,240 MB (in 1 MB increments). CPU power scales linearly with memory allocation
Timeout | 1 second – 15 minutes (900 seconds). Default is 3 seconds
Package Size | 50 MB (zipped), 250 MB (unzipped). Up to 10 GB via container image
Temp Storage | /tmp: 512 MB to 10,240 MB. Persists within the same execution environment
Environment Variables | Key-value pairs; can be encrypted with KMS. Max 4 KB total

┌─────────────────────────────────────────────────────────────────┐
│                     Lambda Execution Model                       │
│                                                                   │
│  Event Source ──► Lambda Service ──► Execution Environment       │
│                        │                    │                     │
│                   (trigger,             ┌───────────┐            │
│                   routing)              │  Init Phase│ ◄── Cold  │
│                                         │  (INIT)    │    Start  │
│                                         └─────┬──────┘           │
│                                               │                  │
│                                         ┌─────▼──────┐           │
│                                         │ Invoke Phase│          │
│                                         │ (handler)  │           │
│                                         └─────┬──────┘           │
│                                               │                  │
│                                         ┌─────▼──────┐           │
│                                         │  Shutdown  │           │
│                                         │  Phase     │           │
│                                         └────────────┘           │
└─────────────────────────────────────────────────────────────────┘

Critical Concept: Code placed outside the handler function runs during the Init Phase. This includes database connections, SDK clients, and configuration loading. Reusing these across invocations (within the same warm execution environment) significantly reduces latency and cost.
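
A minimal sketch of the init-versus-invoke split, using only the standard library. The names load_config, CONFIG, and INIT_COUNT are hypothetical stand-ins for a real SDK client or database connection; the counter exists only to make reuse visible.

```python
import time

INIT_COUNT = 0  # hypothetical counter, only here to make reuse observable


def load_config():
    """Stand-in for expensive setup (SDK client, DB connection, config fetch)."""
    global INIT_COUNT
    INIT_COUNT += 1
    return {"table_name": "example-table", "loaded_at": time.time()}


# Module scope: runs once per execution environment, during the Init phase.
CONFIG = load_config()


def handler(event, context):
    # Handler scope: runs on every invocation and reuses CONFIG while warm.
    return {"table": CONFIG["table_name"], "init_runs": INIT_COUNT}
```

Calling handler repeatedly in the same process reports init_runs as 1 each time, which mirrors what happens across warm invocations.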

1.2 Invocation Models

Lambda has three fundamental invocation models. The exam frequently tests which model a given event source uses.

┌─────────────────────────────────────────────────────────────────────────┐
│               Lambda Invocation Model Decision Tree                      │
│                                                                           │
│  ┌───────────────────┐   ┌───────────────────┐   ┌───────────────────┐  │
│  │   SYNCHRONOUS     │   │   ASYNCHRONOUS    │   │  EVENT SOURCE     │  │
│  │   (Push)          │   │   (Fire & Forget) │   │  MAPPING (Poll)   │  │
│  ├───────────────────┤   ├───────────────────┤   ├───────────────────┤  │
│  │ • API Gateway     │   │ • S3              │   │ • SQS             │  │
│  │ • ALB             │   │ • SNS             │   │ • DynamoDB Stream │  │
│  │ • CloudFront      │   │ • EventBridge     │   │ • Kinesis         │  │
│  │ • Cognito         │   │ • SES             │   │ • Kafka (MSK)     │  │
│  │ • SDK (Req/Resp)  │   │ • CloudWatch Logs │   │ • SQS FIFO        │  │
│  ├───────────────────┤   ├───────────────────┤   ├───────────────────┤  │
│  │ Caller WAITS for  │   │ Lambda retries    │   │ Lambda polls the  │  │
│  │ response. Errors  │   │ 2x on failure.    │   │ source. Managed   │  │
│  │ returned to caller│   │ DLQ supported     │   │ by Lambda service │  │
│  └───────────────────┘   └───────────────────┘   └───────────────────┘  │
└─────────────────────────────────────────────────────────────────────────┘

Model | Who Handles Retries | DLQ Support | Example Sources
Synchronous | Caller | No | API Gateway, ALB, SDK
Asynchronous | Lambda (2 retries) | Yes (SQS or SNS DLQ) | S3, SNS, EventBridge
Event Source Mapping | Lambda (configurable) | Yes (SQS: DLQ on the source queue; streams: bisect batch on error, on-failure destination) | SQS, Kinesis, DynamoDB Streams

Exam Trap: For SQS → Lambda, the Lambda service internally polls SQS using long polling. The consumer is Lambda's event source mapping, not your code.
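
Because the event source mapping hands your function a batch, the handler only has to process records and report which ones failed. The sketch below assumes ReportBatchItemFailures is enabled in the mapping's function response types; the process function and the order_id field are hypothetical.

```python
import json


def process(body):
    """Hypothetical business logic: reject messages without an order_id."""
    if "order_id" not in body:
        raise ValueError("missing order_id")


def handler(event, context):
    failures = []
    for record in event["Records"]:
        try:
            process(json.loads(record["body"]))
        except Exception:
            # Report only the failed message; the rest of the batch is treated as done.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Returning an empty batchItemFailures list signals that the whole batch succeeded.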

1.3 Execution Environment & Lifecycle

Cold Start occurs when Lambda provisions a new execution environment. The Init Phase includes downloading the code package, starting the runtime, and running initialization code (outside the handler). A warm start reuses an existing environment, skipping the Init Phase.

Phase | What Happens | Duration Impact
Init (Cold Start) | Download code, start runtime, run init code | Adds 100 ms to several seconds
Invoke | Execute the handler function | Your business logic time
Shutdown | Environment frozen or destroyed | No billing

Strategies to Minimize Cold Starts:

  • Use Provisioned Concurrency — pre-warms a set number of execution environments. Eliminates cold starts. Billed even when idle.
  • Reduce deployment package size (fewer dependencies to load).
  • Choose runtimes with faster startup: Node.js and Python start faster than Java.
  • Keep initialization code lean.

Key Concept: /tmp storage persists within an execution environment across multiple invocations of the same warm instance. Do NOT store sensitive data in /tmp unless it is short-lived. Use it for caching large files or compiled artifacts between warm invocations.

1.4 Concurrency & Throttling

┌──────────────────────────────────────────────────────────────────────┐
│                     Lambda Concurrency Model                          │
│                                                                        │
│  Account Default: 1,000 concurrent executions (soft limit, regional)  │
│                                                                        │
│  ┌─────────────────────────────────────────────────────────────┐      │
│  │                    Account Concurrency Pool                  │      │
│  │                                                               │      │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐   │      │
│  │  │ Reserved     │  │ Provisioned  │  │   Unreserved     │   │      │
│  │  │ Concurrency  │  │ Concurrency  │  │   Concurrency    │   │      │
│  │  │ (guaranteed) │  │ (pre-warmed) │  │ (shared pool)    │   │      │
│  │  └──────────────┘  └──────────────┘  └──────────────────┘   │      │
│  └─────────────────────────────────────────────────────────────┘      │
│                                                                        │
│  Throttle: 429 TooManyRequestsException                                │
│  Scaling rate: +1,000 concurrent executions per 10 s, per function    │
└──────────────────────────────────────────────────────────────────────┘

Concurrency Type | Purpose | Cost
Reserved Concurrency | Guarantees capacity for a function; caps max concurrency | No extra cost
Provisioned Concurrency | Pre-warms environments; eliminates cold starts | Billed for as long as it is configured, even when idle
Unreserved | Default pool shared by all functions | Standard Lambda pricing

Critical: Setting Reserved Concurrency to 0 effectively disables a Lambda function (no invocations allowed). Use this to throttle non-critical functions during incidents.

Concurrency Calculation:

Concurrent Executions = (Invocations per second) × (Average Duration in seconds)
Example: 100 req/s × 0.5s duration = 50 concurrent executions needed
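
The same estimate as a small helper, rounding up because a fractional execution still occupies a whole environment. The function name is illustrative.

```python
import math


def required_concurrency(requests_per_second, avg_duration_seconds):
    """Concurrent executions = arrival rate x average duration, rounded up."""
    return math.ceil(requests_per_second * avg_duration_seconds)


print(required_concurrency(100, 0.5))  # 50, matching the example above
```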

1.5 Lambda Layers & Destinations

Lambda Layers package reusable code (shared libraries, custom runtimes, configuration) separately from function code. Up to 5 layers per function. Total unzipped size (function + layers) must not exceed 250 MB.

Lambda Destinations (for asynchronous invocations) send the result of a function execution to a target — regardless of success or failure. More powerful than DLQs because they capture both success and failure with full event context.

Destination Target | Supports
SQS | On success, on failure
SNS | On success, on failure
EventBridge | On success, on failure
Another Lambda | On success, on failure

Exam Tip: DLQ only captures failures. Lambda Destinations capture both success and failure events with full event context. Prefer Destinations for modern serverless architectures.


2. Amazon API Gateway

API Gateway is a fully managed service that acts as the front door for your backend services. It handles authentication, throttling, caching, and monitoring at the edge.

2.1 REST API vs HTTP API vs WebSocket API

Feature | REST API | HTTP API | WebSocket API
Latency | Higher | Up to ~60% lower | Persistent connection
Cost | Higher | Up to ~70% cheaper | Per message + connection
Auth | Lambda Authorizer, Cognito, IAM, API Key | Lambda Authorizer, Cognito, IAM, JWT | Lambda Authorizer, IAM
Usage Plans | Yes | No | No
Request Validation | Yes | No | No
Private Integration | Yes | Yes | No
Canary Deployments | Yes | No | No
Use Case | Feature-rich public APIs | Low-latency microservices | Real-time apps (chat, gaming)

Exam Tip: If the question says "lower cost", "lower latency", or "simpler Lambda proxy" — choose HTTP API. If it requires request validation, usage plans, or API keys — choose REST API.

2.2 Integration Types

Integration Type | Description | Use Case
Lambda Proxy | Full request (headers, params, body) forwarded as event. Lambda must format the HTTP response. | Most common; maximum flexibility
Lambda Custom | API Gateway transforms request/response using mapping templates (Velocity Template Language). | Data transformation before Lambda
AWS Service | Directly integrate with AWS services (DynamoDB, SQS, SNS); no Lambda needed. | Reduce hops; lower latency
HTTP | Forward request to an HTTP endpoint (external URL). | Third-party APIs, on-prem
Mock | API Gateway returns a hardcoded response without any backend. | Testing; development stubs

AWS Service Integration Example (SQS):

Client ──► API Gateway ──► SQS Queue (no Lambda!)
           (transforms      (direct integration)
            to SQS action)

Critical Concept: AWS Service Integration (e.g., API Gateway → DynamoDB directly) is a powerful cost and latency optimization. The exam may present a scenario asking you to eliminate Lambda and test whether you recognize direct integration as a valid option.

2.3 Stages, Deployments & Canary

A deployment is a snapshot of your API configuration. A stage is a named reference to a deployment (e.g., dev, staging, prod). You must deploy explicitly: changes to the API are NOT live until they are deployed to a stage.

Stage Variables act like environment variables for stages. They allow a single API definition to route to different Lambda aliases or HTTP endpoints per stage.

# Stage Variable Example: Lambda Alias Routing
# In integration URI:
arn:aws:lambda:us-east-1:123456789012:function:MyFunc:${stageVariables.lambdaAlias}

# dev stage → lambdaAlias = dev
# prod stage → lambdaAlias = prod

Canary Deployments on REST APIs allow you to route a percentage of traffic to a new API deployment while the rest goes to the current stable version. Configure the canary percentage in the stage settings. This is the API Gateway counterpart to weighted traffic shifting on a Lambda alias; the two are configured independently.

2.4 Throttling, Caching & CORS

Throttling:

Level | Default Limit | Configuration
Account-level | 10,000 RPS, 5,000 burst | AWS Support to increase
Stage-level | Inherits account default | Set in Stage Settings
Method-level | Inherits stage default | Granular per-method override
Usage Plan | Customer-defined via API Key | Set RPS and monthly quota

When throttled: HTTP 429 Too Many Requests.

Caching:

  • Enable at the stage level. Cache capacity: 0.5 GB – 237 GB.
  • TTL: 0–3600 seconds. Default: 300 seconds.
  • Cache is keyed on method and URL. Add query strings or headers as cache keys.
  • Clients can invalidate cache by passing Cache-Control: max-age=0 (requires IAM permission).

CORS (Cross-Origin Resource Sharing):

  • For Lambda Proxy integration: Lambda function must return the Access-Control-Allow-Origin header.
  • For non-proxy integration: Enable CORS in the API Gateway console (adds an OPTIONS method).
  • Browser sends a preflight OPTIONS request before the actual request. OPTIONS must return 200.

Common Trap: CORS errors appear in the browser. If you enable CORS on API Gateway but your Lambda still returns the response without the CORS header in proxy integration, the browser will still reject the response.
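
A minimal sketch of a Lambda proxy response that carries the CORS header itself. The origin value and the response body are hypothetical; the statusCode / headers / body shape is what proxy integration expects.

```python
import json

ALLOWED_ORIGIN = "https://app.example.com"  # hypothetical front-end origin


def handler(event, context):
    body = {"message": "ok", "path": event.get("path")}
    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": "application/json",
            # With proxy integration, the function must add this header itself.
            "Access-Control-Allow-Origin": ALLOWED_ORIGIN,
        },
        "body": json.dumps(body),  # proxy integration expects a string body
    }
```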


3. Amazon DynamoDB

DynamoDB is a fully managed, serverless, key-value and document NoSQL database designed for single-digit millisecond performance at any scale.

3.1 Data Model & Key Design

The Primary Key uniquely identifies every item in a table. Two forms exist:

┌─────────────────────────────────────────────────────────────────────┐
│                     DynamoDB Key Design                              │
│                                                                       │
│  Option 1: Simple Primary Key (Partition Key only)                   │
│  ┌─────────────────────────────────────────┐                         │
│  │  PK (Partition Key)  │  Attributes...   │                         │
│  │  UserID = "U-001"    │  name, email...  │                         │
│  └─────────────────────────────────────────┘                         │
│  → PK must be unique per item. Best for simple lookups.              │
│                                                                       │
│  Option 2: Composite Primary Key (Partition Key + Sort Key)          │
│  ┌──────────────┬───────────────────┬──────────────┐                 │
│  │  PK           │  SK               │ Attributes   │                 │
│  │  (Partition)  │  (Sort)           │              │                 │
│  │  OrderID-001  │  2024-01-15       │  amount...   │                 │
│  │  OrderID-001  │  2024-02-20       │  amount...   │                 │
│  │  OrderID-001  │  2024-03-10       │  amount...   │                 │
│  └──────────────┴───────────────────┴──────────────┘                 │
│  → PK + SK must be unique. Multiple items per PK sorted by SK.       │
└─────────────────────────────────────────────────────────────────────┘

Partition Key Design Best Practices:

  • High cardinality — many distinct values prevent hot partitions.
  • Avoid sequential IDs if they lead to monotonic writes to one partition.
  • Use techniques like write sharding: append a random suffix (UserID#1, UserID#2) and fan-out reads.

Key Concept: DynamoDB distributes data across partitions based on the partition key hash. All items with the same partition key are stored on the same partition and sorted by the sort key. A "hot partition" (too many reads or writes against one PK) can cause throttling even if the table's overall capacity is adequate.
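
Write sharding can be sketched in a few lines. SHARD_COUNT and both helper names are assumptions for illustration; the suffixes here start at 0, but any fixed, known set of suffixes works.

```python
import random

SHARD_COUNT = 4  # hypothetical; size this to the write rate of the hottest key


def sharded_key(base_key, shard_count=SHARD_COUNT):
    """Pick a random suffix so writes for one logical key spread over partitions."""
    return f"{base_key}#{random.randrange(shard_count)}"


def all_shard_keys(base_key, shard_count=SHARD_COUNT):
    """Every physical key to Query when reading the logical key back (fan-out)."""
    return [f"{base_key}#{n}" for n in range(shard_count)]
```

The trade-off is on the read side: fetching everything for one logical key now takes one Query per suffix, with the results merged by the application.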

3.2 Read/Write Capacity & On-Demand Mode

Mode | How it Works | Best For
Provisioned (with Auto Scaling) | Set RCU/WCU manually or with auto-scaling. Predictable, cheaper at scale. | Steady, predictable traffic
On-Demand | No capacity planning. Pay per request. Scales instantly. | Unpredictable spikes, new apps

Capacity Unit Definitions:

Unit | Strongly Consistent | Eventually Consistent | Transactional
1 RCU | 1 read/s of an item ≤ 4 KB | 2 reads/s of items ≤ 4 KB | 0.5 reads/s of items ≤ 4 KB (2 RCU per read)
1 WCU | 1 write/s of an item ≤ 1 KB | Same cost (writes have no consistency mode) | 0.5 writes/s of items ≤ 1 KB (2 WCU per write)

RCU Calculation Example:

Read: 10 items/second, each item is 10 KB, strongly consistent
→ Each item requires: CEIL(10 KB / 4 KB) = 3 RCU per item
→ Total: 10 items/s × 3 RCU = 30 RCU needed

WCU Calculation Example:

Write: 20 items/second, each item is 3.5 KB
→ Each item requires: CEIL(3.5 KB / 1 KB) = 4 WCU per item
→ Total: 20 items/s × 4 WCU = 80 WCU needed
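
The two worked examples generalize to a pair of helpers. This is a sketch of the sizing arithmetic only; the function names and the mode strings are illustrative.

```python
import math


def rcu_needed(reads_per_second, item_kb, mode="strong"):
    """RCUs for a read rate; mode is 'strong', 'eventual', or 'transactional'."""
    units_per_read = math.ceil(item_kb / 4)  # reads are metered in 4 KB blocks
    multiplier = {"strong": 1, "eventual": 0.5, "transactional": 2}[mode]
    return math.ceil(reads_per_second * units_per_read * multiplier)


def wcu_needed(writes_per_second, item_kb, transactional=False):
    """WCUs for a write rate; transactional writes cost double."""
    units_per_write = math.ceil(item_kb / 1)  # writes are metered in 1 KB blocks
    return writes_per_second * units_per_write * (2 if transactional else 1)


print(rcu_needed(10, 10))   # 30, the RCU example above
print(wcu_needed(20, 3.5))  # 80, the WCU example above
```

Note that rounding happens per item first, then the per-item cost is multiplied by the request rate.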

3.3 Indexes — GSI & LSI

Feature | Local Secondary Index (LSI) | Global Secondary Index (GSI)
Partition Key | Same as base table | Any attribute (different PK)
Sort Key | Different from base table | Any attribute
Creation | At table creation only (cannot add later) | Anytime (before or after table creation)
Consistency | Strongly or eventually consistent reads | Eventually consistent reads only
Capacity | Shares RCU/WCU with base table | Has its own RCU/WCU
Per-table limit | Max 5 LSIs | Max 20 GSIs
Scope | Same partition as base table | Across all partitions

Critical Exam Trap: LSIs must be created with the table — you cannot add them later. GSIs can be added at any time. Strongly consistent reads are only possible on the base table and LSIs, NOT on GSIs.

Sparse Indexes (GSI Best Practice):
Create a GSI on an attribute that only some items have. Only those items appear in the index. This is efficient for queries like "all pending orders" when most orders are COMPLETED: set the indexed attribute only on pending orders and remove it when the order completes.

3.4 DynamoDB Streams & TTL

DynamoDB Streams capture a time-ordered sequence of item-level modifications (INSERT, MODIFY, REMOVE) in a table. Retention: 24 hours.

Stream View Type | What's Included
KEYS_ONLY | Only the key attributes of the modified item
NEW_IMAGE | The entire item after the modification
OLD_IMAGE | The entire item before the modification
NEW_AND_OLD_IMAGES | Both pre- and post-modification images

DynamoDB Table ──► Streams ──► Lambda ──► OpenSearch / DynamoDB (cross-region) / SNS
                   (24h        (event source mapping;
                   retention)   max 2 concurrent
                                readers per shard)

Time to Live (TTL):

  • Designate any Number attribute as the TTL attribute. Store value as Unix epoch timestamp.
  • DynamoDB deletes expired items automatically, typically within 48 hours of expiry (the exact timing is not guaranteed).
  • Deletes do NOT consume WCU.
  • Deletions appear in Streams as REMOVE events (use this to trigger cleanup logic).

Exam Tip: TTL deletions are NOT guaranteed to be instant. Applications reading expired items before deletion must filter on the TTL attribute in their code.
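
A minimal sketch of both halves of that advice: writing the expiry as epoch seconds and filtering out items that have expired but not yet been deleted. The attribute name expires_at and both helper names are hypothetical.

```python
import time

TTL_ATTRIBUTE = "expires_at"  # hypothetical attribute name configured as TTL


def expiry_epoch(ttl_seconds, now=None):
    """Unix epoch (whole seconds) at which an item should expire."""
    now = time.time() if now is None else now
    return int(now) + ttl_seconds


def drop_expired(items, now=None):
    """Filter out items whose TTL has passed but that DynamoDB has not deleted yet."""
    now = time.time() if now is None else now
    # Items without the TTL attribute never expire, so they are always kept.
    return [item for item in items if item.get(TTL_ATTRIBUTE, float("inf")) > now]
```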

3.5 Transactions & Conditional Writes

DynamoDB Transactions allow all-or-nothing (ACID) operations across multiple items and tables.

API | Description
TransactWriteItems | Up to 100 write operations atomically
TransactGetItems | Up to 100 read operations atomically

Transactions consume 2x the normal capacity units (RCU/WCU).

Conditional Writes:

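# Only update if the item's version matches (optimistic locking)
import boto3
from botocore.exceptions import ClientError

# 'Products' is a placeholder table name
table = boto3.resource('dynamodb').Table('Products')

try:
    table.update_item(
        Key={'PK': 'item-001'},
        UpdateExpression='SET price = :newprice, version = :newver',
        ConditionExpression='version = :currentver',
        ExpressionAttributeValues={
            ':newprice': 99,
            ':newver': 2,
            ':currentver': 1
        }
    )
except ClientError as err:
    # A version mismatch surfaces as ConditionalCheckFailedException
    if err.response['Error']['Code'] != 'ConditionalCheckFailedException':
        raise
    print('Item was modified by another writer; re-read and retry')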

Common DynamoDB API Operations:

Operation | Description
PutItem | Create or fully replace an item
UpdateItem | Add/modify/remove attributes on an existing item without replacing it
DeleteItem | Remove an item
GetItem | Read a single item by primary key (most efficient single-item read)
Query | Read items with the same PK and optional SK filter (efficient)
Scan | Read ALL items in the table and optionally filter (expensive)
BatchGetItem | Up to 100 items across tables in one API call
BatchWriteItem | Up to 25 PutItem or DeleteItem operations in one API call

Critical: Scan reads every item in the table before applying filters. Always prefer Query (requires knowing the partition key). Use Parallel Scan only for full-table ETL operations with adequate capacity.

3.6 DynamoDB Accelerator (DAX)

DAX is an in-memory, DynamoDB-compatible caching layer that provides microsecond read latency for cached data (vs. single-digit milliseconds for DynamoDB directly).

Application ──► DAX Cluster (in-memory cache) ──► DynamoDB
                      │
                 Cache Hit → return immediately
                 Cache Miss → fetch from DynamoDB → cache → return

Feature | Detail
Latency | Microseconds for cache hits (vs. single-digit milliseconds for DynamoDB)
Write behavior | Write-through: writes go to DynamoDB AND DAX simultaneously
Suitable for | Read-heavy workloads; repeated reads of same data
NOT suitable for | Write-heavy workloads; strongly consistent reads; financial/transactional data
VPC | Deployed inside your VPC
TTL | Item cache TTL and query cache TTL both default to 5 minutes

Exam Tip: DAX does NOT serve strongly consistent reads from its cache; it passes them through to DynamoDB, so they gain nothing from DAX. If the question requires the latest data from DynamoDB at all times, DAX is NOT the answer. Use DAX for caching eventually-consistent reads.


4. Amazon S3

S3 is the foundational object storage service. For the DVA-C02 exam, focus on access patterns, pre-signed URLs, events, and lifecycle management.

4.1 Storage Classes

Storage Class | Availability | Min Storage Duration | Retrieval Time | Use Case
S3 Standard | 99.99% | None | Immediate | Active data
S3 Intelligent-Tiering | 99.9% | None | Immediate | Changing access patterns
S3 Standard-IA | 99.9% | 30 days | Immediate | Infrequent, rapid access
S3 One Zone-IA | 99.5% | 30 days | Immediate | Non-critical infrequent
S3 Glacier Instant | 99.9% | 90 days | Milliseconds | Archive, quarterly access
S3 Glacier Flexible | 99.99% | 90 days | Minutes–hours | Long-term archive
S3 Glacier Deep Archive | 99.99% | 180 days | 12–48 hours | Compliance/long-term

4.2 Object Lifecycle, Versioning & Replication

Versioning:

  • Enabled at the bucket level. Once enabled, cannot be disabled — only suspended.
  • Each PUT/DELETE creates a new version. DELETE adds a delete marker (does not remove versions).
  • To permanently delete, you must delete the specific version ID.
  • Versioning is required for Replication and MFA Delete.

Lifecycle Rules:

  • Transition actions: Move objects between storage classes after N days.
  • Expiration actions: Delete objects or old versions after N days.
  • Can filter rules by prefix or tags.

Replication:

Feature | Same-Region Replication (SRR) | Cross-Region Replication (CRR)
Use case | Log aggregation, dev/prod same region | Compliance, lower latency globally
Versioning required | Yes (both buckets) | Yes (both buckets)
Existing objects | NOT replicated automatically (use S3 Batch Replication) | NOT replicated automatically
Delete markers | Optional to replicate | Optional to replicate

Critical: Replication only applies to new objects after replication is enabled. Existing objects must be replicated manually via S3 Batch Replication. Delete markers are NOT replicated by default.

4.3 Pre-signed URLs & Access Patterns

Pre-signed URLs allow temporary access to a private S3 object without the requester needing AWS credentials. The URL carries the permissions of the IAM entity that generated it, limited to the specific operation and object key it was signed for, until it expires.

import boto3

s3_client = boto3.client('s3')

# Generate a pre-signed URL for a GET (download)
download_url = s3_client.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'my-bucket', 'Key': 'my-key'},
    ExpiresIn=3600  # 1 hour
)

# Generate a pre-signed URL for PUT (upload)
upload_url = s3_client.generate_presigned_url(
    'put_object',
    Params={'Bucket': 'my-bucket', 'Key': 'upload-key'},
    ExpiresIn=900   # 15 minutes
)

S3 Multipart Upload:

  • Recommended for objects > 100 MB. Required for objects > 5 GB.
  • Upload parts independently and in parallel.
  • Parts must be between 5 MB and 5 GB (the last part can be smaller than 5 MB).
  • Use lifecycle rules to abort incomplete multipart uploads after N days.
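
The part-size rules can be checked with a small planner. This is a sketch of the arithmetic only, not an upload client; it also applies S3's limit of 10,000 parts per upload, which is not listed above. The 8 MiB default part size and the function name are assumptions.

```python
import math

MIB = 1024 * 1024
MIN_PART = 5 * MIB   # minimum size for every part except the last
MAX_PARTS = 10_000   # S3 limit on parts per multipart upload


def plan_parts(object_size, part_size=8 * MIB):
    """Return (part_size, part_count) that satisfies the multipart limits."""
    part_size = max(part_size, MIN_PART)
    # Grow the part size if the object would otherwise need too many parts.
    part_size = max(part_size, math.ceil(object_size / MAX_PARTS))
    return part_size, math.ceil(object_size / part_size)
```

For a 100 MiB object with the default part size this yields 13 parts, the last one smaller than the rest.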

S3 Transfer Acceleration:

  • Routes uploads through CloudFront edge locations to the S3 bucket via AWS backbone.
  • Enabled per bucket. Generates a separate endpoint (bucket.s3-accelerate.amazonaws.com).

4.4 S3 Events & Notifications

S3 can trigger events on object creation, removal, replication, and lifecycle transitions. Event destinations:

Destination | Notes
SQS | Decouple event processing
SNS | Fan-out to multiple consumers
Lambda | Asynchronous invocation from S3
EventBridge | Advanced filtering, multiple targets, archive

Key Concept: S3 event notifications vs. EventBridge: S3 notifications are simpler but can filter only on key prefix and suffix. EventBridge rules for S3 can match on more event fields (for example object key and size) and can route to 20+ AWS services. For complex routing, prefer EventBridge.
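
Whichever destination you choose, a consumer of classic S3 event notifications ends up walking the Records array. A minimal sketch of a Lambda handler that extracts bucket and key; the handler's return value is illustrative.

```python
from urllib.parse import unquote_plus


def handler(event, context):
    """Collect (bucket, key) pairs from an S3 event notification."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded (spaces become '+'), so decode them.
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects
```

Forgetting to decode the key is a common cause of "NoSuchKey" errors when the function then tries to read the object.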


5. Amazon SQS

SQS is a fully managed message queue service for decoupling microservices and distributed systems.

5.1 Standard vs FIFO Queues

Feature | Standard Queue | FIFO Queue
Throughput | Nearly unlimited | 300 msg/s (3,000 with batching)
Delivery | At-least-once (may deliver duplicates) | Exactly-once processing
Ordering | Best-effort (not guaranteed) | Strict FIFO within a message group
Deduplication | Manual (application handles) | Built-in (5-minute deduplication window)
Queue Name | Any name | Must end with .fifo
Use Case | High-throughput, order doesn't matter | Financial transactions, ordered events

FIFO Message Groups:
Use MessageGroupId to parallelize processing within a FIFO queue. Messages in the same group are processed in order. Messages in different groups can be processed in parallel.
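
A sketch of how a producer might build the arguments for a FIFO send. The returned dict is shaped like keyword arguments for boto3's sqs.send_message (QueueUrl omitted); the customer_id field and the choice of a SHA-256 content hash as deduplication ID are assumptions.

```python
import hashlib
import json


def fifo_message(order):
    """Build send_message keyword arguments for one order event."""
    body = json.dumps(order, sort_keys=True)
    return {
        "MessageBody": body,
        # Same customer -> same group -> strict ordering for that customer.
        "MessageGroupId": f"customer-{order['customer_id']}",
        # Identical bodies within the 5-minute window are treated as duplicates.
        "MessageDeduplicationId": hashlib.sha256(body.encode()).hexdigest(),
    }
```

Grouping by customer keeps each customer's events in order while letting different customers be processed in parallel.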

5.2 Visibility Timeout, DLQ & Polling

Visibility Timeout:
When a consumer reads a message, it becomes invisible to other consumers for the visibility timeout duration. If the consumer fails to delete the message within the timeout, the message reappears in the queue for redelivery.

Default: 30 seconds
Min: 0 seconds
Max: 12 hours

Consumer picks message → message hidden for 30s → Consumer deletes it (success)
                                                 → Timeout expires (no delete) → message reappears

Best Practice: Set the queue's visibility timeout to at least 6× your Lambda function's timeout. If your Lambda timeout is 5 minutes, set the visibility timeout to 30+ minutes.

Dead-Letter Queue (DLQ):
After a message fails processing N times (maxReceiveCount), it's moved to the DLQ. Configure maxReceiveCount on the source queue's redrive policy, not on the DLQ itself.

Setting | Description
maxReceiveCount | Number of receives before moving to DLQ (1–1000)
messageRetentionPeriod | How long messages stay in queue (60s – 14 days, default 4 days)
DLQ Type | Standard queue can use a standard DLQ. FIFO queue must use a FIFO DLQ.
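
A sketch of the attributes you would set on the source queue, applying the six-times rule to the Lambda function timeout and attaching the redrive policy. The dict is shaped for the Attributes parameter of SQS SetQueueAttributes; the function name and the default maxReceiveCount of 5 are assumptions.

```python
import json


def queue_attributes(dlq_arn, lambda_timeout_seconds, max_receive_count=5):
    """Attributes for the SOURCE queue; SQS expects every value as a string."""
    return {
        "VisibilityTimeout": str(6 * lambda_timeout_seconds),
        # The redrive policy lives on the source queue and points at the DLQ.
        "RedrivePolicy": json.dumps(
            {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": max_receive_count}
        ),
    }
```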

Polling Modes:

Mode | Behavior | Recommendation
Short Polling | Returns immediately, even if queue is empty. Costs more (empty receives billed). | Avoid
Long Polling | Waits up to 20 seconds for messages. Reduces empty responses, lowers cost. | Always preferred

Key Setting: ReceiveMessageWaitTimeSeconds > 0 enables long polling. Set to maximum of 20 seconds.

Message Size: Max 256 KB. For larger payloads, use the S3 Extended Client Library — store payload in S3, send a pointer in the SQS message.


6. Amazon SNS

SNS is a fully managed pub/sub messaging service. Publishers send messages to a topic; subscribers receive messages from that topic.

6.1 Topics, Subscriptions & Fan-out Pattern

Supported Subscriber Protocols:
SQS, Lambda, HTTP/HTTPS, Email, Email-JSON, SMS, Mobile Push (APNs, FCM), Kinesis Data Firehose.

Fan-out Pattern:
One SNS topic publishes to multiple SQS queues simultaneously. This decouples the publisher from multiple independent consumers.

                              ┌──────────────────────┐
                              │  SQS Queue A          │──► Consumer A (Order Processing)
S3 Event ──► SNS Topic ───────┤                       │
                              ├──────────────────────┤
                              │  SQS Queue B          │──► Consumer B (Inventory Update)
                              ├──────────────────────┤
                              │  SQS Queue C          │──► Consumer C (Analytics)
                              └──────────────────────┘

Best Practice: SNS → SQS fan-out is preferred over SNS → Lambda fan-out because SQS provides buffering, retry, and rate control for downstream Lambdas.

6.2 Message Filtering & FIFO Topics

Message Filtering (Subscription Filter Policies):
Any subscription can define a filter policy: a JSON document that specifies which messages the subscriber wants to receive, matched against message attributes by default.

// Filter policy: only receive messages with type = "order" AND priority = "high"
{
  "type": ["order"],
  "priority": ["high"]
}

Without a filter policy, a subscriber receives all messages from the topic.
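
The matching rule (AND across keys, OR within each value list) can be modelled in a few lines. This is a deliberately simplified sketch that handles exact string matches only; real SNS filter policies also support operators such as prefix, numeric, and anything-but matching.

```python
def matches(filter_policy, message_attributes):
    """True when every policy key is present and its value is in the allowed list."""
    return all(
        message_attributes.get(name) in allowed
        for name, allowed in filter_policy.items()
    )


policy = {"type": ["order"], "priority": ["high"]}
print(matches(policy, {"type": "order", "priority": "high"}))  # True
print(matches(policy, {"type": "order", "priority": "low"}))   # False
```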

SNS FIFO Topics:

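  • Strictly ordered delivery when the subscriber is an SQS FIFO queue.
  • Deduplication: 5-minute window.
  • Throughput: 300 msg/s (same as SQS FIFO).
  • Cannot fan out to Lambda, HTTP, or Email. Subscribers are SQS queues: FIFO queues to preserve ordering (standard queues are also accepted, without the ordering guarantee).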

7. Amazon EventBridge

EventBridge is a serverless event bus that connects applications using events. It supersedes CloudWatch Events and adds significant power.

7.1 Event Bus, Rules & Targets

Three Types of Event Buses:

Type | Description
Default Event Bus | Receives events from AWS services (EC2 state changes, S3 events, etc.)
Partner Event Bus | Receives events from SaaS partners (Zendesk, Datadog, Shopify, etc.)
Custom Event Bus | Receives events from your own applications via PutEvents API

Event Pattern Matching:
Rules filter events using pattern matching on event fields (source, detail-type, detail, region, account). Supports exact match, prefix match, numeric range, and anything-but conditions.

// Example: Catch all EC2 instance state changes to "stopped"
{
  "source": ["aws.ec2"],
  "detail-type": ["EC2 Instance State-change Notification"],
  "detail": {
    "state": ["stopped"]
  }
}
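
To see how such a pattern is evaluated, here is a simplified matcher that covers nested fields and exact values only. It is a teaching sketch, not EventBridge's implementation; prefix, numeric, and anything-but operators are left out, and the function name is hypothetical.

```python
def matches_pattern(pattern, event):
    """Nested exact-match check: dicts recurse, lists hold the allowed values."""
    for field, expected in pattern.items():
        actual = event.get(field)
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches_pattern(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True
```

Fields that the pattern does not mention (such as the instance ID in detail) are ignored, which is why a short pattern can match a large event.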

Targets (up to 5 per rule): Lambda, SQS, SNS, Kinesis, Step Functions, API Gateway, CloudWatch Logs, CodePipeline, EC2 Run Command, and more.

EventBridge Pipes: Point-to-point integration between a source (SQS, DynamoDB Streams, Kinesis) and a target with optional enrichment (Lambda or Step Functions) in the middle.

7.2 EventBridge vs SNS vs SQS

Dimension | EventBridge | SNS | SQS
Model | Event Bus (pub/sub with routing) | Pub/Sub | Queue (point-to-point)
Filtering | Rich pattern matching on event body | Attribute-based filter policies | No filtering
Consumers | Up to 5 targets per rule | Up to 12.5M subscribers | Single consumer per message
Schema Registry | Yes (auto-discovers event schemas) | No | No
Replay | Yes (Archive and Replay) | No | No
SaaS Integration | Yes (Partner Event Bus) | No | No
Throughput | Soft limit: 10,000 events/s | High | Unlimited (Standard)
Use case | Complex routing, SaaS, serverless choreography | Fan-out, notifications | Decoupling, buffering, DLQ

8. AWS Step Functions

Step Functions orchestrates distributed applications and microservices as visual, auditable workflows called State Machines.

8.1 State Machine Types

Type | Billing | Max Duration | Scale | Use Case
Standard | Per state transition | Up to 1 year | Execution history capped at 25,000 events | Long-running, auditable, human approval
Express | Per execution + duration | Up to 5 minutes | High throughput | IoT, high-volume streaming, short jobs

Express Subtypes:

Aspect | Synchronous Express | Asynchronous Express
Caller waits | Yes (returns result inline) | No (fire and forget)
Audit history | CloudWatch Logs | CloudWatch Logs

Exam Tip: Standard workflows guarantee exactly-once execution. Asynchronous Express workflows are at-least-once, and Synchronous Express workflows are at-most-once. For financial transactions, use Standard.

8.2 State Types & Error Handling

State Type | Purpose
Task | Execute work: Lambda, SNS, SQS, DynamoDB, ECS, Glue, etc.
Choice | Branching logic based on input values
Wait | Pause execution for a set time or until a timestamp
Parallel | Execute multiple branches simultaneously; waits for all to complete
Map | Iterate over an array and apply the same states to each element
Pass | Pass input directly to output; useful for injecting static data
Succeed | Successfully end the workflow
Fail | End the workflow with failure

Error Handling in Step Functions:

{
  "Type": "Task",
  "Resource": "arn:aws:lambda:...",
  "Retry": [
    {
      "ErrorEquals": ["Lambda.ServiceException", "Lambda.TooManyRequestsException"],
      "IntervalSeconds": 2,
      "MaxAttempts": 3,
      "BackoffRate": 2
    }
  ],
  "Catch": [
    {
      "ErrorEquals": ["States.ALL"],
      "Next": "ErrorHandlerState",
      "ResultPath": "$.error"
    }
  ]
}
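
The Retry fields above translate into a concrete wait schedule: the first retry waits IntervalSeconds, and each later retry multiplies the previous wait by BackoffRate. The helper below is a minimal sketch of that arithmetic; the function name retry_delays is hypothetical, and it ignores any optional delay cap or jitter settings.

```python
def retry_delays(interval_seconds, max_attempts, backoff_rate):
    """Seconds waited before each retry: interval * backoff_rate ** retry_index."""
    return [interval_seconds * backoff_rate ** index for index in range(max_attempts)]


# Values taken from the Retry block above
delays = retry_delays(interval_seconds=2, max_attempts=3, backoff_rate=2)

# Worst case: the task runs once plus MaxAttempts retries.
total_runs = 1 + len(delays)
total_wait = sum(delays)
```

With the values shown, the retries wait 2, 4, and 8 seconds, so the task runs at most four times before the Catch block routes the error to ErrorHandlerState.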

Wait for Callback (Task Token Pattern):
Step Functions pauses until an external system calls SendTaskSuccess or SendTaskFailure with the task token. Ideal for human approval workflows or long-running third-party API calls.

Step Functions sends task token → External system/human does work → Calls SendTaskSuccess → Workflow continues
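
The external side of this pattern can be sketched as a small function that returns the token once the work is done. The code assumes sfn_client is a boto3 Step Functions client (for example boto3.client("stepfunctions")) passed in by the caller; the function name complete_approval, the output fields, and the error name ApprovalRejected are hypothetical.

```python
import json


def complete_approval(sfn_client, task_token, approved, reviewer):
    """Resume a paused workflow by handing its task token back to Step Functions.

    sfn_client is assumed to expose send_task_success and send_task_failure,
    as the boto3 Step Functions client does.
    """
    if approved:
        sfn_client.send_task_success(
            taskToken=task_token,
            # output must be a JSON string; it becomes the task's result
            output=json.dumps({"approved": True, "reviewer": reviewer}),
        )
    else:
        sfn_client.send_task_failure(
            taskToken=task_token,
            error="ApprovalRejected",
            cause=f"Rejected by {reviewer}",
        )
```

Passing the client in as a parameter keeps the function easy to exercise with a stub. On the failure path, the error name can be matched by a Retry or Catch rule in the waiting state.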

9. Amazon Kinesis

Kinesis handles real-time streaming data at scale. The DVA-C02 exam focuses primarily on Kinesis Data Streams.

9.1 Kinesis Data Streams

┌──────────────────────────────────────────────────────────────────────┐
│                       Kinesis Data Streams                            │
│                                                                        │
│  Producers                Shards                  Consumers           │
│  ┌──────────┐     ┌──────────────────────┐     ┌──────────────────┐  │
│  │ App/IoT  │────►│ Shard 1              │────►│ Lambda           │  │
│  │ Logs     │────►│ Shard 2              │────►│ Kinesis Analytics│  │
│  │ Metrics  │────►│ Shard N              │────►│ Firehose         │  │
│  └──────────┘     └──────────────────────┘     └──────────────────┘  │
│                                                                        │
│  • 1 shard = 1 MB/s write, 2 MB/s read, up to 1,000 records/s       │
│  • Retention: 24h (default), up to 365 days                           │
│  • Shard: unit of capacity — add shards to scale (reshard)           │
└──────────────────────────────────────────────────────────────────────┘

| Feature | Detail |
| --- | --- |
| Shard capacity (write) | 1 MB/s or 1,000 records/s per shard |
| Shard capacity (read) | 2 MB/s per shard (shared across consumers). Enhanced fan-out: 2 MB/s per consumer per shard |
| Retention | Default 24 hours; extended up to 365 days (additional cost) |
| Record size | Max 1 MB |
| Partition key | Determines shard routing. Use high-cardinality keys to avoid hot shards |
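
The effect of partition key cardinality can be illustrated with a short simulation. Kinesis hashes each partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains that value. The sketch below assumes the hash space is split evenly across shards, which is a simplification (real streams can have uneven ranges after resharding); shard_for_key and the device-style keys are hypothetical.

```python
import hashlib
from collections import Counter

HASH_SPACE = 2 ** 128  # an MD5 digest is a 128-bit value


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Map a partition key to a shard index, assuming evenly split hash ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    range_size = HASH_SPACE // shard_count
    # Clamp so the last shard also covers any remainder of the hash space.
    return min(hash_value // range_size, shard_count - 1)


# High-cardinality keys (hypothetical device IDs) spread across all shards...
spread = Counter(shard_for_key(f"device-{i}", 4) for i in range(10_000))

# ...while one constant key sends every record to a single shard (a hot shard).
hot = Counter(shard_for_key("all-devices", 4) for _ in range(10_000))
```

In this simulation the constant key concentrates all 10,000 records on one shard, which would hit that shard's 1 MB/s or 1,000 records/s write limit regardless of how many shards the stream has.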

Kinesis Consumer Types:

| Consumer Type | Pull or Push | Limit | Use Case |
| --- | --- | --- | --- |
| Classic (GetRecords) | Pull | 2 MB/s shared per shard | Multiple consumers sharing a shard's bandwidth |
| Enhanced Fan-out | Push (HTTP/2) | 2 MB/s per consumer per shard | Multiple independent consumers needing full bandwidth |

Critical: Enhanced Fan-out provides dedicated throughput per consumer per shard. If 3 Lambda functions read from the same shard as standard consumers, they share that shard's 2 MB/s. With Enhanced Fan-out, each function gets its own 2 MB/s.
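
The per-shard limits can be turned into a rough sizing calculation. The sketch below assumes every consumer reads the full stream at the ingest rate and uses only the per-shard figures quoted in these notes (1 MB/s and 1,000 records/s write, 2 MB/s read); the function name shards_needed and the workload numbers are hypothetical.

```python
import math

WRITE_MB_S_PER_SHARD = 1.0
WRITE_RECORDS_PER_SHARD = 1000
READ_MB_S_PER_SHARD = 2.0


def shards_needed(ingest_mb_s, records_s, consumers, enhanced_fan_out):
    """Estimate the shard count needed to satisfy both write and read limits."""
    write_shards = max(
        math.ceil(ingest_mb_s / WRITE_MB_S_PER_SHARD),
        math.ceil(records_s / WRITE_RECORDS_PER_SHARD),
    )
    # Standard consumers share each shard's 2 MB/s; enhanced fan-out
    # consumers each get their own 2 MB/s, so only one reader counts.
    readers_sharing = 1 if enhanced_fan_out else consumers
    read_shards = math.ceil(ingest_mb_s * readers_sharing / READ_MB_S_PER_SHARD)
    return max(1, write_shards, read_shards)


# Hypothetical workload: 5 MB/s, 3,000 records/s, 3 consumers
standard = shards_needed(5, 3000, consumers=3, enhanced_fan_out=False)
fan_out = shards_needed(5, 3000, consumers=3, enhanced_fan_out=True)
```

Under these assumptions the standard-consumer setup is read-bound and needs 8 shards, while enhanced fan-out is write-bound and needs 5. That is the trade-off behind paying extra for enhanced fan-out.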

Kinesis vs SQS Decision:

| Requirement | Choose |
| --- | --- |
| Multiple consumers reading the same data | Kinesis |
| Real-time analytics / time-ordered stream | Kinesis |
| Replay historical data | Kinesis |
| Simple decoupling, one consumer | SQS |
| Guaranteed at-least-once delivery | SQS |
| Large volume, order doesn't matter | SQS Standard |

9.2 Kinesis Firehose & Analytics

Kinesis Data Firehose:

  • Fully managed delivery stream — no shards, no consumers to manage.
  • Batches, compresses, transforms, and delivers data to: S3, Redshift, OpenSearch, Splunk, HTTP endpoints.
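  • Near-real-time: buffers incoming data and delivers when either the time or the size threshold is reached, whichever comes first. The classic exam figures are a 60-second minimum interval and a 1 MB minimum size; newer configurations allow shorter intervals.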
  • Can invoke Lambda for data transformation before delivery.
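
A transformation Lambda for Firehose receives a batch of base64-encoded records and must return one entry per incoming recordId with a result of Ok, Dropped, or ProcessingFailed. The handler below is a minimal sketch of that contract; the transformation itself (upper-casing a "level" field) and the field names in the payload are hypothetical.

```python
import base64
import json


def lambda_handler(event, context):
    """Firehose transformation sketch: normalize a hypothetical 'level' field."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("expected a JSON object")
            payload["level"] = str(payload.get("level", "info")).upper()
            # Newline-delimit records so they remain separable once batched in S3.
            encoded = base64.b64encode((json.dumps(payload) + "\n").encode("utf-8"))
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": encoded.decode("utf-8"),
            })
        except ValueError:
            # Unparseable input: return the original data marked as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Every recordId from the input appears exactly once in the response, which is what allows Firehose to tell transformed, dropped, and failed records apart.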

Kinesis Data Analytics:

  • Run standard SQL or Apache Flink queries on streaming data in real time.
  • Sources: Kinesis Data Streams, Kinesis Firehose.
  • Outputs: Kinesis Data Streams, Kinesis Firehose, Lambda.

10. AWS SAM — Serverless Application Model

SAM is an open-source framework that extends CloudFormation with simplified syntax for serverless resources. It transforms SAM templates into CloudFormation templates during deployment.

10.1 SAM Template Anatomy

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31      # ← Mandatory: signals SAM transform
Description: My Serverless App

Globals:                                    # ← Apply settings to all functions
  Function:
    Runtime: python3.12
    Timeout: 30
    MemorySize: 256
    Environment:
      Variables:
        TABLE_NAME: !Ref MyTable

Resources:

  # SAM Resource Type: AWS::Serverless::Function
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: handler.lambda_handler       # resolved relative to CodeUri, i.e. src/handler.py
      CodeUri: src/
      Events:
        ApiEvent:
          Type: Api
          Properties:
            RestApiId: !Ref MyApi            # attach to MyApi below; omit to use SAM's implicit API
            Path: /users
            Method: GET
        SQSEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt MyQueue.Arn      # MyQueue (AWS::SQS::Queue) must be defined elsewhere in this template
            BatchSize: 10

  # SAM Resource Type: AWS::Serverless::Api
  MyApi:
    Type: AWS::Serverless::Api
    Properties:
      StageName: prod
      Auth:
        DefaultAuthorizer: MyCognitoAuthorizer
        Authorizers:
          MyCognitoAuthorizer:
            UserPoolArn: !GetAtt MyUserPool.Arn   # MyUserPool (Cognito) must be defined elsewhere in this template

  # SAM Resource Type: AWS::Serverless::SimpleTable (DynamoDB)
  MyTable:
    Type: AWS::Serverless::SimpleTable
    Properties:
      PrimaryKey:
        Name: userId
        Type: String
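
For orientation, a handler module that a function like MyFunction could point at might look like the sketch below. It assumes the file lives at src/handler.py, reads the TABLE_NAME variable set under Globals, and tells the two event sources apart by the shape of the event. The response fields and the local default table name are hypothetical.

```python
import json
import os

# Assumption: TABLE_NAME comes from the template's Globals section;
# the fallback only keeps local runs from failing.
TABLE_NAME = os.environ.get("TABLE_NAME", "local-table")


def lambda_handler(event, context):
    # The SQS event source mapping delivers a batch of messages under "Records".
    if "Records" in event:
        bodies = [
            record["body"]
            for record in event["Records"]
            if record.get("eventSource") == "aws:sqs"
        ]
        return {"processed": len(bodies)}

    # Otherwise treat the event as an API Gateway proxy request (GET /users).
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"table": TABLE_NAME, "path": event.get("path")}),
    }
```

In practice each event source often gets its own function; a single handler is used here only because the template attaches both events to MyFunction.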

SAM Resource Types (Shorthand vs CloudFormation):

| SAM Type | Expands To |
| --- | --- |
| AWS::Serverless::Function | Lambda Function + IAM Role + Event Source Mappings |
| AWS::Serverless::Api | API Gateway REST API + Deployment + Stage |
| AWS::Serverless::HttpApi | API Gateway HTTP API |
| AWS::Serverless::SimpleTable | DynamoDB Table |
| AWS::Serverless::StateMachine | Step Functions State Machine |
| AWS::Serverless::Application | SAR (Serverless Application Repository) reference |
| AWS::Serverless::LayerVersion | Lambda Layer |

10.2 SAM CLI Commands

| Command | Purpose |
| --- | --- |
| sam init | Scaffold a new serverless application from a template |
| sam build | Build the application locally (installs dependencies into .aws-sam/) |
| sam local invoke | Invoke a Lambda function locally in a Docker container |
| sam local start-api | Start a local API Gateway instance for testing |
| sam local start-lambda | Start a local Lambda endpoint for SDK testing |
| sam local generate-event | Generate sample event payloads (S3, SQS, API GW, etc.) |
| sam validate | Validate the SAM template |
| sam deploy | Deploy to AWS (creates/updates CloudFormation stack) |
| sam deploy --guided | Interactive deployment wizard (creates samconfig.toml) |
| sam logs | Tail CloudWatch Logs for a deployed Lambda function |
| sam sync | Hot-swap code changes without a full CloudFormation deployment |

Key Concept: sam deploy --guided creates a samconfig.toml file that saves your deployment preferences. Subsequent sam deploy commands use this config without the --guided flag.


11. AWS SDK — Patterns & Error Handling

11.1 Exponential Backoff & Retry Logic

The AWS SDKs automatically retry transient errors (network issues, throttling, and 5xx responses) using exponential backoff. Understanding this retry mechanism is critical for the exam.

Retryable Errors (Throttling and Transient Server Errors):

  • ProvisionedThroughputExceededException (DynamoDB)
  • ThrottlingException
  • RequestLimitExceeded
  • HTTP 429, HTTP 503 (Service Unavailable)
  • HTTP 500 (Internal Server Error) — retryable

Non-Retryable Errors (Client Errors):

  • ValidationException
  • AccessDeniedException
  • ResourceNotFoundException
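  • HTTP 400 (Bad Request): generally non-retryable. The exception is throttling: DynamoDB returns ProvisionedThroughputExceededException with an HTTP 400 status, and it is retryable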

Exponential Backoff Formula:

Wait Time = min(cap, base × 2^(attempt - 1)) + random jitter

Example (base = 1s, attempts numbered from 1):
Attempt 1: wait 1s + jitter
Attempt 2: wait 2s + jitter
Attempt 3: wait 4s + jitter
Attempt 4: wait 8s + jitter

"Full Jitter": randomize the entire wait to spread out request storms
Wait = random(0, min(cap, base × 2^(attempt - 1)))

Exam Tip: If you see ProvisionedThroughputExceededException from DynamoDB, the solution is exponential backoff — not to immediately increase capacity. Throttling often indicates a burst that can be absorbed by retrying.
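
A minimal sketch of the full-jitter strategy is shown below. Attempts are numbered from 1 so that the upper bound for the first retry equals the base delay, matching the 1s, 2s, 4s, 8s example. ThrottledError is a hypothetical stand-in for whatever retryable exception the real client raises, and the default base, cap, and max_attempts values are illustrative only.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a retryable throttling error."""


def full_jitter_delay(attempt, base=1.0, cap=20.0, rng=random):
    """Wait before retry number `attempt` (1-based), using full jitter."""
    upper_bound = min(cap, base * 2 ** (attempt - 1))
    return rng.uniform(0, upper_bound)


def call_with_backoff(operation, max_attempts=5, sleep=time.sleep):
    """Call operation(), retrying on ThrottledError with full-jitter backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts:
                raise  # retries exhausted: surface the error to the caller
            sleep(full_jitter_delay(attempt))
```

The sleep function is injectable so the retry loop can be exercised without real delays. The SDKs already apply their own retries, so a wrapper like this matters mainly for calls made outside the SDK or once the SDK's retries are exhausted.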

11.2 Pagination Patterns

Many AWS API calls return paginated results. Always implement pagination in production code.

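import boto3

# DynamoDB Scan with pagination (Python)
# Low-level client: the built-in paginator follows LastEvaluatedKey for you
dynamodb = boto3.client('dynamodb')
paginator = dynamodb.get_paginator('scan')
pages = paginator.paginate(TableName='MyTable')
for page in pages:
    for item in page['Items']:
        process(item)  # process() is a placeholder for your own logic

# Manual pagination using ExclusiveStartKey (resource API)
table = boto3.resource('dynamodb').Table('MyTable')
response = table.scan()
items = response['Items']
# Keep scanning until DynamoDB stops returning LastEvaluatedKey
while 'LastEvaluatedKey' in response:
    response = table.scan(ExclusiveStartKey=response['LastEvaluatedKey'])
    items.extend(response['Items'])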

Common Mistake: Calling Scan or Query without handling LastEvaluatedKey retrieves only the first page of results (up to 1 MB of data). Always check for LastEvaluatedKey and pass it as ExclusiveStartKey until it is no longer returned.


12. Exam Tips & Quick Reference

Scenario-to-Answer Mapping

| Scenario Keyword / Requirement | Correct Answer |
| --- | --- |
| "Decouple services; one publisher, many subscribers" | SNS Fan-out (SNS → multiple SQS) |
| "Ensure exactly-once processing, preserve order" | SQS FIFO Queue |
| "Pause a workflow and wait for human approval" | Step Functions Task Token pattern |
| "Cache DynamoDB reads; reduce read load" | DAX (eventually consistent only) |
| "Real-time stream; multiple consumers replay data" | Kinesis Data Streams |
| "Deliver stream data to S3 with transformation" | Kinesis Firehose + Lambda transform |
| "Deploy serverless app with infrastructure as code" | AWS SAM |
| "API Gateway returns 429" | Throttling: increase limits or implement exponential backoff |
| "Lambda times out during SQS processing" | Increase Lambda timeout AND SQS visibility timeout |
| "Store shared dependencies across Lambda functions" | Lambda Layers |
| "Eliminate Lambda cold starts for critical endpoint" | Provisioned Concurrency |
| "Route S3 events to multiple services with rich filtering" | S3 → EventBridge (not S3 native events) |
| "API Gateway → DynamoDB without Lambda" | AWS Service Integration |
| "Allow user to upload directly to S3 securely" | Pre-signed URL (PUT) |
| "Step Functions: high throughput, short duration" | Express Workflow |
| "Step Functions: long running, human approval" | Standard Workflow |
| "Send the result of a successful asynchronous Lambda invocation to SQS" | Lambda Destinations (not DLQ) |
| "Read all items in DynamoDB; large dataset" | Parallel Scan with multiple workers |

Common Traps

  • Lambda timeout vs SQS visibility timeout: Lambda timeout is the max execution time. SQS visibility timeout is how long the message stays hidden. If Lambda timeout > visibility timeout, a second consumer may receive and process the same message concurrently. Always set visibility timeout ≥ Lambda timeout × 6.
  • GSI vs LSI: LSI must be created at table creation. GSI can be added anytime. GSI only supports eventual consistency. Don't confuse them.
  • SQS DLQ vs Lambda DLQ vs Lambda Destinations: SQS DLQ is on the queue. Lambda async DLQ is on the Lambda function. Lambda Destinations replaces Lambda DLQ for async invocations and supports both success and failure routing.
  • Kinesis standard vs enhanced fan-out: Standard consumers share 2 MB/s per shard. Enhanced fan-out gives each consumer their own 2 MB/s per shard but costs more.
  • SAM Transform: Transform: AWS::Serverless-2016-10-31 is mandatory in SAM templates. Without it, CloudFormation does not recognize SAM resource types.
  • EventBridge vs CloudWatch Events: EventBridge is the evolution of CloudWatch Events. Both use the same underlying API, but EventBridge adds custom buses, SaaS partners, schema registry, and pipes. For new development, always use EventBridge.

Key Terms — Domain 1

| Term | One-Line Definition |
| --- | --- |
| Cold Start | The latency added when Lambda initializes a new execution environment from scratch |
| Provisioned Concurrency | Pre-warmed Lambda execution environments that eliminate cold starts |
| Reserved Concurrency | A guaranteed concurrency limit for one function; also caps its maximum |
| Visibility Timeout | Time an SQS message is hidden from other consumers after being received |
| Dead-Letter Queue (DLQ) | Destination for messages that failed processing after maxReceiveCount attempts |
| Partition Key | The attribute that determines which DynamoDB partition stores the data |
| Sort Key | The secondary key that enables range queries within a partition |
| RCU / WCU | Read/Write Capacity Units, the billing and throughput units for DynamoDB |
| Pre-signed URL | A time-limited URL that grants temporary access to a private S3 object |
| Fan-out Pattern | One SNS message triggers multiple SQS queues in parallel |
| Event Source Mapping | Lambda's built-in polling mechanism for SQS, Kinesis, and DynamoDB Streams |
| Shard | The fundamental unit of capacity in Kinesis Data Streams |
| Enhanced Fan-out | Kinesis feature giving each consumer dedicated 2 MB/s per shard bandwidth |
| Task Token | A unique identifier in Step Functions that pauses a workflow until the token is returned |
| SAM Transform | CloudFormation macro that expands SAM shorthand into full CloudFormation resources |

End of Domain 1. Continue to Domain 2: Security →
