Domain 1: Development with AWS Services
Topic 1 of 4 · Study notes
AWS Certified Developer – Associate (DVA-C02)
Exam Code: DVA-C02 | Level: Associate
Domain Weight: 32% | Total Domains: 4 | Passing Score: 720/1000
Table of Contents
- AWS Lambda
- Amazon API Gateway
- Amazon DynamoDB
- Amazon S3
- Amazon SQS
- Amazon SNS
- Amazon EventBridge
- AWS Step Functions
- Amazon Kinesis
- AWS SAM — Serverless Application Model
- AWS SDK — Patterns & Error Handling
- Exam Tips & Quick Reference
1. AWS Lambda
Lambda is the backbone of serverless development on AWS. It runs code in response to events without you provisioning or managing servers. You are billed per request and per GB-second of compute.
1.1 Core Concepts
| Concept | Detail |
|---|---|
| Runtime | Node.js, Python, Java, Go, .NET, Ruby — or bring your own via Custom Runtime (bootstrap file) |
| Handler | The function entry point. Format: filename.method_name (e.g., index.handler) |
| Memory | 128 MB – 10,240 MB (in 1 MB increments). CPU power scales linearly with memory allocation |
| Timeout | 1 second – 15 minutes (900 seconds). Default is 3 seconds |
| Package Size | 50 MB (zipped), 250 MB (unzipped). Up to 10 GB via container image |
| Temp Storage | /tmp — 512 MB to 10,240 MB. Persists within the same execution environment |
| Environment Variables | Key-value pairs; can be encrypted with KMS. Max 4 KB total |
┌─────────────────────────────────────────────────────────────────┐
│ Lambda Execution Model │
│ │
│ Event Source ──► Lambda Service ──► Execution Environment │
│ │ │ │
│ (trigger, ┌───────────┐ │
│ routing) │ Init Phase│ ◄── Cold │
│ │ (INIT) │ Start │
│ └─────┬──────┘ │
│ │ │
│ ┌─────▼──────┐ │
│ │ Invoke Phase│ │
│ │ (handler) │ │
│ └─────┬──────┘ │
│ │ │
│ ┌─────▼──────┐ │
│ │ Shutdown │ │
│ │ Phase │ │
│ └────────────┘ │
└─────────────────────────────────────────────────────────────────┘
Critical Concept: Code placed outside the handler function runs during the Init Phase. This includes database connections, SDK clients, and configuration loading. Reusing these across invocations (within the same warm execution environment) significantly reduces latency and cost.
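The init-versus-handler split can be sketched in a few lines. This is a minimal illustration, not a real SDK setup: `create_client` and `INIT_COUNT` are hypothetical stand-ins for an expensive initialization step such as opening a database connection.

```python
import time

INIT_COUNT = 0  # counts how many times initialization ran (illustration only)


def create_client():
    """Hypothetical stand-in for an expensive setup step such as building an SDK client."""
    global INIT_COUNT
    INIT_COUNT += 1
    return {"created_at": time.time()}


# Module-level code: runs once per execution environment, during the Init phase.
CLIENT = create_client()


def handler(event, context):
    # Every warm invocation reuses CLIENT instead of rebuilding it.
    return {"init_count": INIT_COUNT, "echo": event}
```

Calling `handler` repeatedly in the same process leaves `INIT_COUNT` at 1, which mirrors what happens across warm invocations of one execution environment.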
1.2 Invocation Models
Lambda has three fundamental invocation models. The exam frequently tests which model a given event source uses.
┌─────────────────────────────────────────────────────────────────────────┐
│ Lambda Invocation Model Decision Tree │
│ │
│ ┌───────────────────┐ ┌───────────────────┐ ┌───────────────────┐ │
│ │ SYNCHRONOUS │ │ ASYNCHRONOUS │ │ EVENT SOURCE │ │
│ │ (Push) │ │ (Fire & Forget) │ │ MAPPING (Poll) │ │
│ ├───────────────────┤ ├───────────────────┤ ├───────────────────┤ │
│ │ • API Gateway │ │ • S3 │ │ • SQS │ │
│ │ • ALB │ │ • SNS │ │ • DynamoDB Stream │ │
│ │ • CloudFront │ │ • EventBridge │ │ • Kinesis │ │
│ │ • Cognito │ │ • SES │ │ • Kafka (MSK) │ │
│ │ • SDK (RequestResponse)│ • CloudWatch Logs│ │ • SQS FIFO │ │
│ ├───────────────────┤ ├───────────────────┤ ├───────────────────┤ │
│ │ Caller WAITS for │ │ Lambda retries │ │ Lambda polls the │ │
│ │ response. Errors │ │ 2x on failure. │ │ source. Managed │ │
│ │ returned to caller│ │ DLQ supported │ │ by Lambda service │ │
│ └───────────────────┘ └───────────────────┘ └───────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
| Model | Who Handles Retries | DLQ Support | Example Sources |
|---|---|---|---|
| Synchronous | Caller | No | API Gateway, ALB, SDK |
| Asynchronous | Lambda (2 retries) | Yes (SQS or SNS DLQ) | S3, SNS, EventBridge |
| Event Source Mapping | Lambda (configurable) | Yes (bisect on error, DLQ) | SQS, Kinesis, DynamoDB Streams |
Exam Trap: For SQS → Lambda, the Lambda service internally polls SQS using long polling. The consumer is Lambda's event source mapping, not your code.
1.3 Execution Environment & Lifecycle
Cold Start occurs when Lambda provisions a new execution environment. The Init Phase includes downloading the code package, starting the runtime, and running initialization code (outside the handler). A warm start reuses an existing environment, skipping the Init Phase.
| Phase | What Happens | Duration Impact |
|---|---|---|
| Init (Cold Start) | Download code, start runtime, run init code | Adds 100ms–several seconds |
| Invoke | Execute the handler function | Your business logic time |
| Shutdown | Environment frozen or destroyed | No billing |
Strategies to Minimize Cold Starts:
- Use Provisioned Concurrency — pre-warms a set number of execution environments. Eliminates cold starts. Billed even when idle.
- Reduce deployment package size (fewer dependencies to load).
- Choose runtimes with faster startup: Node.js and Python start faster than Java.
- Keep initialization code lean.
Key Concept: /tmp storage persists within an execution environment across multiple invocations of the same warm instance. Do NOT store sensitive data in /tmp unless it is short-lived. Use it for caching large files or compiled artifacts between warm invocations.
1.4 Concurrency & Throttling
┌──────────────────────────────────────────────────────────────────────┐
│ Lambda Concurrency Model │
│ │
│ Account Default: 1,000 concurrent executions (soft limit, regional) │
│ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ Account Concurrency Pool │ │
│ │ │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐ │ │
│ │ │ Reserved │ │ Provisioned │ │ Unreserved │ │ │
│ │ │ Concurrency │ │ Concurrency │ │ Concurrency │ │ │
│ │ │ (guaranteed) │ │ (pre-warmed) │ │ (shared pool) │ │ │
│ │ └──────────────┘ └──────────────┘ └──────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
│ Throttle: 429 TooManyRequestsException │
│ Burst Limit: 3,000 (us-east-1, us-west-2), 1,000 (other regions) │
└──────────────────────────────────────────────────────────────────────┘
| Concurrency Type | Purpose | Cost |
|---|---|---|
| Reserved Concurrency | Guarantees capacity for a function; caps max concurrency | No extra cost |
| Provisioned Concurrency | Pre-warms environments; eliminates cold starts | Billed for the configured amount while enabled, even when idle |
| Unreserved | Default pool shared by all functions | Standard Lambda pricing |
Critical: Setting Reserved Concurrency to 0 effectively disables a Lambda function (no invocations allowed). Use this to throttle non-critical functions during incidents.
Concurrency Calculation:
Concurrent Executions = (Invocations per second) × (Average Duration in seconds)
Example: 100 req/s × 0.5s duration = 50 concurrent executions needed
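The formula above can be checked with a small helper. This is a sketch only; the function name is made up for illustration, and rounding up is an assumption so that fractional results still reserve a whole execution environment.

```python
import math


def required_concurrency(requests_per_second: float, avg_duration_seconds: float) -> int:
    """Estimate concurrent executions: arrival rate x average duration, rounded up."""
    return math.ceil(requests_per_second * avg_duration_seconds)


print(required_concurrency(100, 0.5))    # the example from the notes
print(required_concurrency(2000, 0.75))  # a hypothetical heavier workload
```

Comparing the result against the account limit of 1,000 shows quickly whether a quota increase or reserved concurrency is worth planning for.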
1.5 Lambda Layers & Destinations
Lambda Layers package reusable code (shared libraries, custom runtimes, configuration) separately from function code. Up to 5 layers per function. Total unzipped size (function + layers) must not exceed 250 MB.
Lambda Destinations (for asynchronous invocations) send the result of a function execution to a target — regardless of success or failure. More powerful than DLQs because they capture both success and failure with full event context.
| Destination Target | Supports |
|---|---|
| SQS | On success, on failure |
| SNS | On success, on failure |
| EventBridge | On success, on failure |
| Another Lambda | On success, on failure |
Exam Tip: DLQ only captures failures. Lambda Destinations capture both success and failure events with full event context. Prefer Destinations for modern serverless architectures.
2. Amazon API Gateway
API Gateway is a fully managed service that acts as the front door for your backend services. It handles authentication, throttling, caching, and monitoring at the edge.
2.1 REST API vs HTTP API vs WebSocket API
| Feature | REST API | HTTP API | WebSocket API |
|---|---|---|---|
| Latency | Higher | ~60% lower | Persistent connection |
| Cost | Higher | ~70% cheaper | Per message + connection |
| Auth | Lambda Authorizer, Cognito, IAM, API Key | Lambda Authorizer, Cognito, IAM, JWT | Lambda Authorizer, IAM |
| Usage Plans | Yes | No | No |
| Request Validation | Yes | No | No |
| Private Integration | Yes | Yes | No |
| Canary Deployments | Yes | No | No |
| Use Case | Feature-rich public APIs | Low-latency microservices | Real-time apps (chat, gaming) |
Exam Tip: If the question says "lower cost", "lower latency", or "simpler Lambda proxy" — choose HTTP API. If it requires request validation, usage plans, or API keys — choose REST API.
2.2 Integration Types
| Integration Type | Description | Use Case |
|---|---|---|
| Lambda Proxy | Full request (headers, params, body) forwarded as event. Lambda must format the HTTP response. | Most common; maximum flexibility |
| Lambda Custom | API Gateway transforms request/response using mapping templates (Velocity Template Language). | Data transformation before Lambda |
| AWS Service | Directly integrate with AWS services (DynamoDB, SQS, SNS) — no Lambda needed. | Reduce hops; lower latency |
| HTTP | Forward request to an HTTP endpoint (external URL). | Third-party APIs, on-prem |
| Mock | API Gateway returns a hardcoded response without any backend. | Testing; development stubs |
AWS Service Integration Example (SQS):
Client ──► API Gateway ──► SQS Queue (no Lambda!)
(transforms (direct integration)
to SQS action)
Critical Concept: AWS Service Integration (e.g., API Gateway → DynamoDB directly) is a powerful cost and latency optimization. The exam may present a scenario asking you to eliminate Lambda and test whether you recognize direct integration as a valid option.
2.3 Stages, Deployments & Canary
A deployment is a snapshot of your API configuration. A stage is a named reference to a deployment (e.g., dev, staging, prod). Deployments must be made explicit — changes to the API are NOT live until deployed.
Stage Variables act like environment variables for stages. They allow a single API definition to route to different Lambda aliases or HTTP endpoints per stage.
# Stage Variable Example: Lambda Alias Routing
# In integration URI:
arn:aws:lambda:us-east-1:123456789012:function:MyFunc:${stageVariables.lambdaAlias}
# dev stage → lambdaAlias = dev
# prod stage → lambdaAlias = prod
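Canary Deployments on REST APIs allow you to route a percentage of traffic to a new API deployment while the rest goes to the current stable version. Configure the canary percentage in the stage settings. The idea parallels Lambda alias traffic shifting (weighted aliases), but the two are configured independently: the canary splits traffic at the API stage, while alias weights split traffic between Lambda versions.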
2.4 Throttling, Caching & CORS
Throttling:
| Level | Default Limit | Configuration |
|---|---|---|
| Account-level | 10,000 RPS, 5,000 burst | AWS Support to increase |
| Stage-level | Inherits account default | Set in Stage Settings |
| Method-level | Inherits stage default | Granular per-method override |
| Usage Plan | Customer-defined via API Key | Set RPS and monthly quota |
When throttled: HTTP 429 Too Many Requests.
Caching:
- Enable at the stage level. Cache capacity: 0.5 GB – 237 GB.
- TTL: 0–3600 seconds. Default: 300 seconds.
- Cache is keyed on method and URL. Add query strings or headers as cache keys.
- Clients can invalidate cache by passing Cache-Control: max-age=0 (requires IAM permission).
CORS (Cross-Origin Resource Sharing):
- For Lambda Proxy integration: Lambda function must return the Access-Control-Allow-Origin header.
- For non-proxy integration: Enable CORS in the API Gateway console (adds an OPTIONS method).
- Browser sends a preflight OPTIONS request before the actual request. OPTIONS must return 200.
Common Trap: CORS errors appear in the browser. If you enable CORS on API Gateway but your Lambda still returns the response without the CORS header in proxy integration, the browser will still reject the response.
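For the proxy-integration case, a handler that sets the CORS header itself might look like the sketch below. The origin value is a hypothetical placeholder, and the response body is illustrative.

```python
import json

ALLOWED_ORIGIN = "https://example.com"  # hypothetical front-end origin


def handler(event, context):
    """Lambda proxy integration: the function builds the full HTTP response."""
    body = {"message": "ok"}
    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": "application/json",
            # Without this header the browser rejects the response.
            "Access-Control-Allow-Origin": ALLOWED_ORIGIN,
        },
        "body": json.dumps(body),
    }
```

The same header should also be returned on error responses, otherwise the browser reports a CORS failure instead of the real error.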
3. Amazon DynamoDB
DynamoDB is a fully managed, serverless, key-value and document NoSQL database designed for single-digit millisecond performance at any scale.
3.1 Data Model & Key Design
The Primary Key uniquely identifies every item in a table. Two forms exist:
┌─────────────────────────────────────────────────────────────────────┐
│ DynamoDB Key Design │
│ │
│ Option 1: Simple Primary Key (Partition Key only) │
│ ┌─────────────────────────────────────────┐ │
│ │ PK (Partition Key) │ Attributes... │ │
│ │ UserID = "U-001" │ name, email... │ │
│ └─────────────────────────────────────────┘ │
│ → PK must be unique per item. Best for simple lookups. │
│ │
│ Option 2: Composite Primary Key (Partition Key + Sort Key) │
│ ┌──────────────┬───────────────────┬──────────────┐ │
│ │ PK │ SK │ Attributes │ │
│ │ (Partition) │ (Sort) │ │ │
│ │ OrderID-001 │ 2024-01-15 │ amount... │ │
│ │ OrderID-001 │ 2024-02-20 │ amount... │ │
│ │ OrderID-001 │ 2024-03-10 │ amount... │ │
│ └──────────────┴───────────────────┴──────────────┘ │
│ → PK + SK must be unique. Multiple items per PK sorted by SK. │
└─────────────────────────────────────────────────────────────────────┘
Partition Key Design Best Practices:
- High cardinality — many distinct values prevent hot partitions.
- Avoid sequential IDs if they lead to monotonic writes to one shard.
- Use techniques like write sharding: append a random suffix (UserID#1, UserID#2) and fan-out reads.
Key Concept: DynamoDB distributes data across partitions based on the partition key hash. All items with the same partition key are stored on the same partition and sorted by the sort key. A "hot partition" (too many writes to one PK) will throttle your table even if overall capacity is adequate.
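Write sharding can be sketched as two helpers: one that picks a suffixed key for each write, and one that lists every suffixed key a reader has to query. The shard count of 4 and the function names are assumptions chosen for illustration.

```python
import random

SHARD_COUNT = 4  # hypothetical number of suffixes per logical key


def sharded_key(base_key: str, shard_count: int = SHARD_COUNT) -> str:
    """Pick a random suffix so writes for one logical key spread across partitions."""
    return f"{base_key}#{random.randint(1, shard_count)}"


def all_shard_keys(base_key: str, shard_count: int = SHARD_COUNT) -> list[str]:
    """Every key a reader must query (fan-out) to see all items for base_key."""
    return [f"{base_key}#{n}" for n in range(1, shard_count + 1)]
```

The trade-off is visible in the second helper: spreading writes over N suffixes means a full read needs N queries whose results are merged by the application.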
3.2 Read/Write Capacity & On-Demand Mode
| Mode | How it Works | Best For |
|---|---|---|
| Provisioned (with Auto Scaling) | Set RCU/WCU manually or with auto-scaling. Predictable, cheaper at scale. | Steady, predictable traffic |
| On-Demand | No capacity planning. Pay per request. Scales instantly. | Unpredictable spikes, new apps |
Capacity Unit Definitions:
| Unit | Strongly Consistent | Eventually Consistent | Transactional |
|---|---|---|---|
| 1 RCU | 1 read per second of an item ≤ 4 KB | 2 reads per second of items ≤ 4 KB | 0.5 reads per second of items ≤ 4 KB (2 RCU per read) |
| 1 WCU | 1 write per second of an item ≤ 1 KB | Same as strongly consistent | 0.5 writes per second of items ≤ 1 KB (2 WCU per write) |
RCU Calculation Example:
Read: 10 items/second, each item is 10 KB, strongly consistent
→ Each item requires: CEIL(10 KB / 4 KB) = 3 RCU per item
→ Total: 10 items/s × 3 RCU = 30 RCU needed
WCU Calculation Example:
Write: 20 items/second, each item is 3.5 KB
→ Each item requires: CEIL(3.5 KB / 1 KB) = 4 WCU per item
→ Total: 20 items/s × 4 WCU = 80 WCU needed
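Both calculations can be reproduced with a short sketch built from the capacity unit table. The function names and the `mode` strings are made up for this example.

```python
import math


def rcu_needed(items_per_second: int, item_kb: float, mode: str = "strong") -> int:
    """RCUs for a read workload. mode: 'strong', 'eventual', or 'transactional'."""
    units_per_item = math.ceil(item_kb / 4)  # reads are metered in 4 KB steps
    multiplier = {"strong": 1.0, "eventual": 0.5, "transactional": 2.0}[mode]
    return math.ceil(items_per_second * units_per_item * multiplier)


def wcu_needed(items_per_second: int, item_kb: float, transactional: bool = False) -> int:
    """WCUs for a write workload; writes are metered in 1 KB steps."""
    units_per_item = math.ceil(item_kb / 1)
    return items_per_second * units_per_item * (2 if transactional else 1)
```

Note that the item size is rounded up per item before multiplying by the request rate, which is the step exam questions most often try to trip you on.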
3.3 Indexes — GSI & LSI
| Feature | Local Secondary Index (LSI) | Global Secondary Index (GSI) |
|---|---|---|
| Partition Key | Same as base table | Any attribute (different PK) |
| Sort Key | Different from base table | Any attribute |
| Creation | At table creation only (cannot add later) | Anytime (before or after table creation) |
| Consistency | Strongly or eventually consistent reads | Eventually consistent reads only |
| Capacity | Shares RCU/WCU with base table | Has its own RCU/WCU |
| Per-table limit | Max 5 LSIs | Max 20 GSIs |
| Scope | Same partition as base table | Across all partitions |
Critical Exam Trap: LSIs must be created with the table — you cannot add them later. GSIs can be added at any time. Strongly consistent reads are only possible on the base table and LSIs, NOT on GSIs.
Sparse Indexes (GSI Best Practice):
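Create a GSI on an attribute that only some items have. Only items that carry the index key attribute appear in the index. This is efficient for queries like "all pending orders": set the indexed attribute (for example, status=PENDING) only on pending orders and remove it once an order completes, so the index stays small even when most orders are COMPLETED.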
3.4 DynamoDB Streams & TTL
DynamoDB Streams capture a time-ordered sequence of item-level modifications (INSERT, MODIFY, REMOVE) in a table. Retention: 24 hours.
| Stream View Type | What's Included |
|---|---|
| KEYS_ONLY | Only the key attributes of the modified item |
| NEW_IMAGE | The entire item after the modification |
| OLD_IMAGE | The entire item before the modification |
| NEW_AND_OLD_IMAGES | Both pre- and post-modification images |
DynamoDB Table ──► Streams (24h retention) ──► Lambda (polls each shard) ──► OpenSearch (formerly Elasticsearch) / DynamoDB (cross-region) / SNS
Note: at most 2 consumers should read from the same stream shard at the same time.
Time to Live (TTL):
- Designate any Number attribute as the TTL attribute. Store value as Unix epoch timestamp.
- DynamoDB automatically deletes expired items within 48 hours (not guaranteed to the second).
- Deletes do NOT consume WCU.
- Deletions appear in Streams as REMOVE events (use this to trigger cleanup logic).
Exam Tip: TTL deletions are NOT guaranteed to be instant. Applications reading expired items before deletion must filter on the TTL attribute in their code.
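Writing the TTL value and filtering not-yet-deleted items can be sketched as follows. The attribute name `expires_at` and both helper names are hypothetical, and items are plain dictionaries rather than real DynamoDB responses.

```python
import time
from typing import Optional


def ttl_epoch(seconds_from_now: int, now: Optional[float] = None) -> int:
    """Unix epoch (seconds) to store in the table's TTL attribute."""
    base = time.time() if now is None else now
    return int(base) + seconds_from_now


def drop_expired(items: list, ttl_attr: str = "expires_at", now: Optional[float] = None) -> list:
    """Filter out items whose TTL has passed but which have not been deleted yet."""
    current = time.time() if now is None else now
    # Items without the TTL attribute never expire, so they are always kept.
    return [item for item in items if item.get(ttl_attr, float("inf")) > current]
```

The `now` parameter exists only to make the helpers testable; production code would normally let it default to the current time.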
3.5 Transactions & Conditional Writes
DynamoDB Transactions allow all-or-nothing (ACID) operations across multiple items and tables.
| API | Description |
|---|---|
| TransactWriteItems | Up to 100 write operations atomically |
| TransactGetItems | Up to 100 read operations atomically |
Transactions consume 2x the normal capacity units (RCU/WCU).
Conditional Writes:
import boto3

# Table name is a placeholder for illustration
table = boto3.resource('dynamodb').Table('my-table')

# Only update if the item's version matches (Optimistic Locking)
table.update_item(
    Key={'PK': 'item-001'},
    UpdateExpression='SET price = :newprice, version = :newver',
    ConditionExpression='version = :currentver',
    ExpressionAttributeValues={
        ':newprice': 99,
        ':newver': 2,
        ':currentver': 1,
    },
)
# Raises ConditionalCheckFailedException if version doesn't match
Common DynamoDB API Operations:
| Operation | Description |
|---|---|
| PutItem | Create or fully replace an item |
| UpdateItem | Add/modify/remove attributes on an existing item without replacing it |
| DeleteItem | Remove an item |
| GetItem | Read a single item by primary key (most efficient single-item read) |
| Query | Read items with the same PK and optional SK filter (efficient) |
| Scan | Read ALL items in the table and optionally filter (expensive) |
| BatchGetItem | Up to 100 items across tables in one API call |
| BatchWriteItem | Up to 25 PutItem or DeleteItem operations in one API call |
Critical: Scan reads every item in the table before applying filters. Always prefer Query (requires knowing the partition key). Use Parallel Scan only for full-table ETL operations with adequate capacity.
3.6 DynamoDB Accelerator (DAX)
DAX is an in-memory, DynamoDB-compatible caching layer that provides microsecond read latency for cached data (vs. single-digit milliseconds for DynamoDB directly).
Application ──► DAX Cluster (in-memory cache) ──► DynamoDB
│
Cache Hit → return immediately
Cache Miss → fetch from DynamoDB → cache → return
| Feature | Detail |
|---|---|
| Latency | Microseconds for cache hits (vs. single-digit milliseconds for DynamoDB) |
| Write behavior | Write-through: writes go to DynamoDB AND DAX simultaneously |
| Suitable for | Read-heavy workloads; repeated reads of same data |
| NOT suitable for | Write-heavy workloads; strongly consistent reads; financial/transactional data |
| VPC | Deployed inside your VPC |
| TTL | Item cache TTL (default 5 min), Query cache TTL (default 5 min) |
Exam Tip: DAX does NOT cache strongly consistent reads; it passes them through to DynamoDB, so they gain nothing from the cache. If the question requires the latest data from DynamoDB at all times, DAX is NOT the answer. Use DAX for caching eventually consistent reads.
4. Amazon S3
S3 is the foundational object storage service. For the DVA-C02 exam, focus on access patterns, pre-signed URLs, events, and lifecycle management.
4.1 Storage Classes
| Storage Class | Availability | Min Storage Duration | Retrieval Time | Use Case |
|---|---|---|---|---|
| S3 Standard | 99.99% | None | Immediate | Active data |
| S3 Intelligent-Tiering | 99.9% | None | Immediate | Changing access patterns |
| S3 Standard-IA | 99.9% | 30 days | Immediate | Infrequent, rapid access |
| S3 One Zone-IA | 99.5% | 30 days | Immediate | Non-critical infrequent |
| S3 Glacier Instant | 99.9% | 90 days | Milliseconds | Archive, quarterly access |
| S3 Glacier Flexible | 99.99% | 90 days | Minutes–hours | Long-term archive |
| S3 Glacier Deep Archive | 99.99% | 180 days | 12–48 hours | Compliance/long-term |
4.2 Object Lifecycle, Versioning & Replication
Versioning:
- Enabled at the bucket level. Once enabled, cannot be disabled — only suspended.
- Each PUT/DELETE creates a new version. DELETE adds a delete marker (does not remove versions).
- To permanently delete, you must delete the specific version ID.
- Versioning is required for Replication and MFA Delete.
Lifecycle Rules:
- Transition actions: Move objects between storage classes after N days.
- Expiration actions: Delete objects or old versions after N days.
- Can filter rules by prefix or tags.
Replication:
| Feature | Same-Region Replication (SRR) | Cross-Region Replication (CRR) |
|---|---|---|
| Use case | Log aggregation, dev/prod same region | Compliance, lower latency globally |
| Versioning required | Yes (both buckets) | Yes (both buckets) |
| Existing objects | NOT replicated automatically (use S3 Batch Replication) | NOT replicated automatically |
| Delete markers | Optional to replicate | Optional to replicate |
Critical: Replication only applies to new objects after replication is enabled. Existing objects must be replicated manually via S3 Batch Replication. Delete markers are NOT replicated by default.
4.3 Pre-signed URLs & Access Patterns
Pre-signed URLs allow temporary access to a private S3 object without requiring AWS credentials. The requester inherits the permissions of the IAM entity that generated the URL.
import boto3

s3_client = boto3.client('s3')

# Generate a pre-signed URL for a GET (download)
download_url = s3_client.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'my-bucket', 'Key': 'my-key'},
    ExpiresIn=3600,  # 1 hour
)

# Generate a pre-signed URL for PUT (upload)
upload_url = s3_client.generate_presigned_url(
    'put_object',
    Params={'Bucket': 'my-bucket', 'Key': 'upload-key'},
    ExpiresIn=900,  # 15 minutes
)
S3 Multipart Upload:
- Recommended for objects > 100 MB. Required for objects > 5 GB.
- Upload parts independently and in parallel.
- Parts must be between 5 MB and 5 GB (last part can be any size).
- Use lifecycle rules to abort incomplete multipart uploads after N days.
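The part-size limits above can be turned into a small planning helper. This is a sketch under stated assumptions: the 100 MB preferred part size is arbitrary, sizes use binary megabytes, and the 10,000-part ceiling is the S3 per-upload limit, which the bullets above do not mention.

```python
import math

MB = 1024 * 1024
MIN_PART = 5 * MB          # minimum part size (except the last part)
MAX_PART = 5 * 1024 * MB   # maximum part size
MAX_PARTS = 10_000         # S3 limit on parts per multipart upload


def plan_parts(object_size: int, preferred_part: int = 100 * MB) -> tuple[int, int]:
    """Return (part_size, part_count) for a multipart upload of object_size bytes."""
    # Grow the part size if the preferred size would need more than MAX_PARTS parts.
    part_size = max(preferred_part, MIN_PART, math.ceil(object_size / MAX_PARTS))
    if part_size > MAX_PART:
        raise ValueError("object too large for a single multipart upload")
    return part_size, math.ceil(object_size / part_size)
```

For example, `plan_parts(1024 * MB)` keeps the preferred 100 MB parts and yields 11 parts, the last one smaller than the rest.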
S3 Transfer Acceleration:
- Routes uploads through CloudFront edge locations to the S3 bucket via AWS backbone.
- Enabled per bucket. Generates a separate endpoint (bucket.s3-accelerate.amazonaws.com).
4.4 S3 Events & Notifications
S3 can trigger events on object creation, removal, replication, and lifecycle transitions. Event destinations:
| Destination | Notes |
|---|---|
| SQS | Decouple event processing |
| SNS | Fan-out to multiple consumers |
| Lambda | Asynchronous invocation from S3 |
| EventBridge | Advanced filtering, multiple targets, archive |
Key Concept: S3 event notifications vs. EventBridge: S3 notifications are simpler but limited in filtering. EventBridge for S3 supports filtering on object metadata, prefix, suffix, tags, and can route to 20+ AWS services. For complex routing, always prefer EventBridge.
5. Amazon SQS
SQS is a fully managed message queue service for decoupling microservices and distributed systems.
5.1 Standard vs FIFO Queues
| Feature | Standard Queue | FIFO Queue |
|---|---|---|
| Throughput | Unlimited | 300 msg/s (3,000 with batching) |
| Delivery | At-least-once (may deliver duplicates) | Exactly-once processing |
| Ordering | Best-effort (not guaranteed) | Strict FIFO within a message group |
| Deduplication | Manual (application handles) | Built-in (5-minute deduplication window) |
| Queue Name | Any name | Must end with .fifo |
| Use Case | High-throughput, order doesn't matter | Financial transactions, ordered events |
FIFO Message Groups:
Use MessageGroupId to parallelize processing within a FIFO queue. Messages in the same group are processed in order. Messages in different groups can be processed in parallel.
5.2 Visibility Timeout, DLQ & Polling
Visibility Timeout:
When a consumer reads a message, it becomes invisible to other consumers for the visibility timeout duration. If the consumer fails to delete the message within the timeout, the message reappears in the queue for redelivery.
Default: 30 seconds
Min: 0 seconds
Max: 12 hours
Consumer picks message → message hidden for 30s → Consumer deletes it (success)
→ Timeout expires (no delete) → message reappears
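Best Practice: Set the queue's visibility timeout to at least 6× your Lambda function's timeout (not its average processing time), so a message is not redelivered while an invocation and its retries are still in flight. If your Lambda timeout is 5 minutes, set the visibility timeout to 30+ minutes.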
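The sizing rule can be expressed as a one-line helper. This sketch assumes the multiplier applies to the function timeout, as in the 5-minute example, and caps the result at the 12-hour SQS maximum; the function name is made up for illustration.

```python
MAX_VISIBILITY_TIMEOUT = 12 * 60 * 60  # SQS maximum: 12 hours, in seconds


def recommended_visibility_timeout(function_timeout_seconds: int, factor: int = 6) -> int:
    """Visibility timeout sized as a multiple of the Lambda timeout, capped at the SQS max."""
    return min(function_timeout_seconds * factor, MAX_VISIBILITY_TIMEOUT)


print(recommended_visibility_timeout(300))  # 5-minute function timeout, in seconds
```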
Dead-Letter Queue (DLQ):
After a message fails processing N times (maxReceiveCount), it's moved to the DLQ. Configure maxReceiveCount on the source queue's redrive policy, not on the DLQ itself.
| Setting | Description |
|---|---|
| maxReceiveCount | Number of receives before moving to DLQ (1–1000) |
| messageRetentionPeriod | How long messages stay in queue (60s – 14 days, default 4 days) |
| DLQ Type | Standard queue can use a standard DLQ. FIFO queue must use a FIFO DLQ. |
Polling Modes:
| Mode | Behavior | Recommendation |
|---|---|---|
| Short Polling | Returns immediately, even if queue is empty. Costs more (empty receives billed). | Avoid |
| Long Polling | Waits up to 20 seconds for messages. Reduces empty responses, lowers cost. | Always preferred |
Key Setting: ReceiveMessageWaitTimeSeconds > 0 enables long polling. Set it to the maximum of 20 seconds.
Message Size: Max 256 KB. For larger payloads, use the S3 Extended Client Library — store payload in S3, send a pointer in the SQS message.
6. Amazon SNS
SNS is a fully managed pub/sub messaging service. Publishers send messages to a topic; subscribers receive messages from that topic.
6.1 Topics, Subscriptions & Fan-out Pattern
Supported Subscriber Protocols:
SQS, Lambda, HTTP/HTTPS, Email, Email-JSON, SMS, Mobile Push (APNS, GCM), Kinesis Data Firehose.
Fan-out Pattern:
One SNS topic publishes to multiple SQS queues simultaneously. This decouples the publisher from multiple independent consumers.
┌──────────────────────┐
│ SQS Queue A │──► Consumer A (Order Processing)
S3 Event ──► SNS Topic ───────┤ │
├──────────────────────┤
│ SQS Queue B │──► Consumer B (Inventory Update)
├──────────────────────┤
│ SQS Queue C │──► Consumer C (Analytics)
└──────────────────────┘
Best Practice: SNS → SQS fan-out is preferred over SNS → Lambda fan-out because SQS provides buffering, retry, and rate control for downstream Lambdas.
6.2 Message Filtering & FIFO Topics
Message Filtering (Subscription Filter Policies):
Each SQS or Lambda subscriber can define a filter policy — a JSON document that specifies which messages it wants to receive based on message attributes.
// Filter policy: only receive messages with type = "order" AND priority = "high"
{
"type": ["order"],
"priority": ["high"]
}
Without a filter policy, a subscriber receives all messages from the topic.
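How a filter policy selects messages can be illustrated with a simplified matcher. It models only exact string matching on message attributes (real SNS policies support more operators), and the function name is an assumption for this sketch.

```python
def matches_filter_policy(policy: dict, attributes: dict) -> bool:
    """Simplified check: every policy key must be present and match one allowed value.

    Only exact string matching is modeled; real SNS policies support more operators.
    """
    for key, allowed_values in policy.items():
        if attributes.get(key) not in allowed_values:
            return False
    return True


policy = {"type": ["order"], "priority": ["high"]}
print(matches_filter_policy(policy, {"type": "order", "priority": "high"}))
print(matches_filter_policy(policy, {"type": "order", "priority": "low"}))
```

Keys are combined with AND and the values inside each list with OR, so an empty policy in this sketch matches every message.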
SNS FIFO Topics:
- Strictly ordered delivery to SQS FIFO queues only.
- Deduplication: 5-minute window.
- Throughput: 300 msg/s (same as SQS FIFO).
- Cannot fan-out to Lambda, HTTP, or Email — only to SQS FIFO queues.
7. Amazon EventBridge
EventBridge is a serverless event bus that connects applications using events. It supersedes CloudWatch Events and adds significant power.
7.1 Event Bus, Rules & Targets
Three Types of Event Buses:
| Type | Description |
|---|---|
| Default Event Bus | Receives events from AWS services (EC2 state changes, S3 events, etc.) |
| Partner Event Bus | Receives events from SaaS partners (Zendesk, Datadog, Shopify, etc.) |
| Custom Event Bus | Receives events from your own applications via PutEvents API |
Event Pattern Matching:
Rules filter events using pattern matching on event fields (source, detail-type, detail, region, account). Supports exact match, prefix match, numeric range, and anything-but conditions.
// Example: Catch all EC2 instance state changes to "stopped"
{
"source": ["aws.ec2"],
"detail-type": ["EC2 Instance State-change Notification"],
"detail": {
"state": ["stopped"]
}
}
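The way a rule pattern selects events can be illustrated with a simplified matcher. It handles only nested fields and exact-value lists (no prefix, numeric, or anything-but conditions), and the function name and sample event are made up for this sketch.

```python
def matches_pattern(pattern: dict, event: dict) -> bool:
    """Simplified matcher: nested dicts recurse, lists mean 'any of these exact values'."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches_pattern(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


pattern = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped"]},
}
event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
}
print(matches_pattern(pattern, event))
```

Fields that appear in the event but not in the pattern are ignored, which is why the extra `instance-id` does not prevent a match.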
Targets (up to 5 per rule): Lambda, SQS, SNS, Kinesis, Step Functions, API Gateway, CloudWatch Logs, CodePipeline, EC2 Run Command, and more.
EventBridge Pipes: Point-to-point integration between a source (SQS, DynamoDB Streams, Kinesis) and a target with optional enrichment (Lambda or Step Functions) in the middle.
7.2 EventBridge vs SNS vs SQS
| Dimension | EventBridge | SNS | SQS |
|---|---|---|---|
| Model | Event Bus (pub/sub with routing) | Pub/Sub | Queue (point-to-point) |
| Filtering | Rich pattern matching on event body | Attribute-based filter policies | No filtering |
| Consumers | Up to 5 targets per rule | Up to 12.5M subscribers | Single consumer per message |
| Schema Registry | Yes (auto-discovers event schemas) | No | No |
| Replay | Yes (Archive and Replay) | No | No |
| SaaS Integration | Yes (Partner Event Bus) | No | No |
| Throughput | Soft limit: 10,000 events/s | High | Unlimited (Standard) |
| Use case | Complex routing, SaaS, serverless choreography | Fan-out, notifications | Decoupling, buffering, DLQ |
8. AWS Step Functions
Step Functions orchestrates distributed applications and microservices as visual, auditable workflows called State Machines.
8.1 State Machine Types
| Type | Billing | Duration | Invocation Depth | Use Case |
|---|---|---|---|---|
| Standard | Per state transition | Up to 1 year | Up to 25 million | Long-running, auditable, human approval |
| Express | Per execution + duration | Up to 5 minutes | High throughput | IoT, high-volume streaming, short jobs |
Express Subtypes:
| | Synchronous Express | Asynchronous Express |
|---|---|---|
| Caller waits | Yes (returns result inline) | No (fire and forget) |
| Audit history | CloudWatch Logs | CloudWatch Logs |
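Exam Tip: Standard workflows provide exactly-once workflow execution. Express workflows provide at-least-once execution (Asynchronous Express) or at-most-once execution (Synchronous Express), so their steps should be idempotent. For financial transactions, use Standard.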
8.2 State Types & Error Handling
| State Type | Purpose |
|---|---|
| Task | Execute work — Lambda, SNS, SQS, DynamoDB, ECS, Glue, etc. |
| Choice | Branching logic based on input values |
| Wait | Pause execution for a set time or until a timestamp |
| Parallel | Execute multiple branches simultaneously; waits for all to complete |
| Map | Iterate over an array and apply the same states to each element |
| Pass | Pass input directly to output; useful for injecting static data |
| Succeed | Successfully end the workflow |
| Fail | End the workflow with failure |
Error Handling in Step Functions:
{
  "Type": "Task",
  "Resource": "arn:aws:lambda:...",
  "Retry": [
    {
      "ErrorEquals": ["Lambda.ServiceException", "Lambda.TooManyRequestsException"],
      "IntervalSeconds": 2,
      "MaxAttempts": 3,
      "BackoffRate": 2
    }
  ],
  "Catch": [
    {
      "ErrorEquals": ["States.ALL"],
      "Next": "ErrorHandlerState",
      "ResultPath": "$.error"
    }
  ]
}
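The wait times implied by the Retry block can be worked out with a short sketch: each retry waits IntervalSeconds multiplied by BackoffRate raised to the number of previous retries. The helper name is made up, and jitter and maximum-delay settings are not modeled.

```python
def retry_delays(interval_seconds: float, max_attempts: int, backoff_rate: float) -> list:
    """Wait before each retry: interval * backoff_rate ** (retry_number - 1)."""
    return [interval_seconds * backoff_rate ** n for n in range(max_attempts)]


delays = retry_delays(interval_seconds=2, max_attempts=3, backoff_rate=2)
print(delays, "total wait:", sum(delays))
```

With the values from the example, the retries wait 2, 4, and 8 seconds; if the third retry also fails, the Catch block routes the error to ErrorHandlerState.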
Wait for Callback (Task Token Pattern):
Step Functions pauses until an external system calls SendTaskSuccess or SendTaskFailure with the task token. Ideal for human approval workflows or long-running third-party API calls.
Step Functions sends task token → External system/human does work → Calls SendTaskSuccess → Workflow continues
9. Amazon Kinesis
Kinesis handles real-time streaming data at scale. The DVA-C02 exam focuses primarily on Kinesis Data Streams.
9.1 Kinesis Data Streams
┌──────────────────────────────────────────────────────────────────────┐
│ Kinesis Data Streams │
│ │
│ Producers Shards Consumers │
│ ┌──────────┐ ┌──────────────────────┐ ┌──────────────────┐ │
│ │ App/IoT │────►│ Shard 1 │────►│ Lambda │ │
│ │ Logs │────►│ Shard 2 │────►│ Kinesis Analytics│ │
│ │ Metrics │────►│ Shard N │────►│ Firehose │ │
│ └──────────┘ └──────────────────────┘ └──────────────────┘ │
│ │
│ • 1 shard = 1 MB/s write, 2 MB/s read, up to 1,000 records/s │
│ • Retention: 24h (default), up to 365 days │
│ • Shard: unit of capacity — add shards to scale (reshard) │
└──────────────────────────────────────────────────────────────────────┘
| Feature | Detail |
|---|---|
| Shard capacity (write) | 1 MB/s or 1,000 records/s per shard |
| Shard capacity (read) | 2 MB/s per shard (shared across consumers). Enhanced fan-out: 2 MB/s per consumer per shard |
| Retention | Default 24 hours; extended up to 365 days (additional cost) |
| Record size | Max 1 MB |
| Partition key | Determines shard routing. Use high-cardinality keys to avoid hot shards |
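The per-shard limits in the table translate into a simple sizing calculation. This sketch assumes shared-throughput (classic) consumers and takes the largest of the three constraints; the function name and sample workloads are hypothetical.

```python
import math


def shards_needed(write_mb_per_s: float, records_per_s: float, read_mb_per_s: float) -> int:
    """Smallest shard count that satisfies all three per-shard limits."""
    by_write_bytes = math.ceil(write_mb_per_s / 1)      # 1 MB/s write per shard
    by_write_records = math.ceil(records_per_s / 1000)  # 1,000 records/s per shard
    by_read_bytes = math.ceil(read_mb_per_s / 2)        # 2 MB/s shared read per shard
    return max(1, by_write_bytes, by_write_records, by_read_bytes)


print(shards_needed(write_mb_per_s=5, records_per_s=3000, read_mb_per_s=8))
```

A stream of many small records is often limited by the records-per-second cap rather than by bytes, which is why all three constraints are checked.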
Kinesis Consumer Types:
| Consumer Type | Pull or Push | Limit | Use Case |
|---|---|---|---|
| Classic (GetRecords) | Pull | 2 MB/s shared per shard | Multiple consumers sharing a shard's bandwidth |
| Enhanced Fan-out | Push (HTTP/2) | 2 MB/s per consumer per shard | Multiple independent consumers needing full bandwidth |
Critical: Enhanced Fan-out provides dedicated throughput per consumer per shard. If 3 Lambda functions all read from the same shard as standard consumers, they share 2 MB/s. With Enhanced Fan-out, each gets its own 2 MB/s.
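The arithmetic behind this comparison is small enough to write down. The following sketch gives a best-case upper bound per consumer for one shard, assuming consumers read evenly; the function name is illustrative.

```python
def read_mb_per_s_per_consumer(consumers, enhanced_fan_out):
    """Best-case read throughput each consumer gets from a single shard."""
    if consumers < 1:
        raise ValueError("at least one consumer is required")
    per_shard = 2.0  # MB/s read limit per shard
    # Standard consumers split the shard's 2 MB/s; enhanced fan-out does not.
    return per_shard if enhanced_fan_out else per_shard / consumers


print(round(read_mb_per_s_per_consumer(3, enhanced_fan_out=False), 2))  # 0.67
print(read_mb_per_s_per_consumer(3, enhanced_fan_out=True))             # 2.0
```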
Kinesis vs SQS Decision:
| Requirement | Choose |
|---|---|
| Multiple consumers reading the same data | Kinesis |
| Real-time analytics / time-ordered stream | Kinesis |
| Replay historical data | Kinesis |
| Simple decoupling, one consumer | SQS |
| Guaranteed at-least-once delivery | SQS |
| Large volume, order doesn't matter | SQS Standard |
9.2 Kinesis Firehose & Analytics
Kinesis Data Firehose (since renamed Amazon Data Firehose):
- Fully managed delivery stream — no shards, no consumers to manage.
- Batches, compresses, transforms, and delivers data to: S3, Redshift, OpenSearch, Splunk, HTTP endpoints.
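- Near-real-time: buffered data is flushed when the buffer interval or the buffer size is reached, whichever comes first (classically a 60-second minimum interval or 1 MB minimum size; newer configurations allow shorter intervals).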
- Can invoke Lambda for data transformation before delivery.
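The Lambda transformation step follows a fixed contract: Firehose sends a batch of base64-encoded records, and the function must return every recordId with a result of Ok, Dropped, or ProcessingFailed. A minimal sketch of such a handler is shown below; the JSON payload shape, the `level` field, and the rule that drops DEBUG records are assumptions made for illustration.

```python
import base64
import json


def _encode(obj):
    """Serialize to newline-delimited JSON, then base64-encode it."""
    return base64.b64encode((json.dumps(obj) + "\n").encode("utf-8")).decode("ascii")


def lambda_handler(event, context):
    results = []
    for record in event["records"]:
        record_id = record["recordId"]
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("expected a JSON object")
        except ValueError:
            # Undecodable input: hand the original data back marked as failed.
            results.append({"recordId": record_id, "result": "ProcessingFailed",
                            "data": record["data"]})
            continue
        if payload.get("level") == "DEBUG":  # hypothetical filter rule
            results.append({"recordId": record_id, "result": "Dropped",
                            "data": record["data"]})
            continue
        payload["processed"] = True
        results.append({"recordId": record_id, "result": "Ok",
                        "data": _encode(payload)})
    return {"records": results}
```

Every incoming recordId appears exactly once in the response, which is what lets Firehose match transformed records to the originals.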
Kinesis Data Analytics (since renamed Amazon Managed Service for Apache Flink):
- Run standard SQL or Apache Flink queries on streaming data in real time.
- Sources: Kinesis Data Streams, Kinesis Firehose.
- Outputs: Kinesis Data Streams, Kinesis Firehose, Lambda.
10. AWS SAM — Serverless Application Model
SAM is an open-source framework that extends CloudFormation with simplified syntax for serverless resources. It transforms SAM templates into CloudFormation templates during deployment.
10.1 SAM Template Anatomy
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31   # ← Mandatory: signals SAM transform
Description: My Serverless App
Globals:                                 # ← Apply settings to all functions
  Function:
    Runtime: python3.12
    Timeout: 30
    MemorySize: 256
    Environment:
      Variables:
        TABLE_NAME: !Ref MyTable
Resources:
  # SAM Resource Type: AWS::Serverless::Function
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: handler.lambda_handler    # Resolved relative to CodeUri (src/handler.py)
      CodeUri: src/
      Events:
        ApiEvent:
          Type: Api
          Properties:
            RestApiId: !Ref MyApi        # Attach to MyApi instead of the implicit API
            Path: /users
            Method: GET
        SQSEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt MyQueue.Arn   # MyQueue is assumed to be defined elsewhere
            BatchSize: 10
  # SAM Resource Type: AWS::Serverless::Api
  MyApi:
    Type: AWS::Serverless::Api
    Properties:
      StageName: prod
      Auth:
        DefaultAuthorizer: MyCognitoAuthorizer
        Authorizers:
          MyCognitoAuthorizer:
            UserPoolArn: !GetAtt MyUserPool.Arn   # MyUserPool is assumed to be defined elsewhere
  # SAM Resource Type: AWS::Serverless::SimpleTable (DynamoDB)
  MyTable:
    Type: AWS::Serverless::SimpleTable
    Properties:
      PrimaryKey:
        Name: userId
        Type: String
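For orientation, a handler module that could sit behind the function in this template might look like the sketch below. It assumes the file lives at src/handler.py, that the API uses Lambda proxy integration, and that one function serves both event sources only because the template wires it that way. The response bodies are placeholders.

```python
import json
import os

# Injected through the Globals section of the template.
TABLE_NAME = os.environ.get("TABLE_NAME", "")


def lambda_handler(event, context):
    # SQS event source mappings deliver a batch under "Records".
    if "Records" in event:
        bodies = [r["body"] for r in event["Records"]
                  if r.get("eventSource") == "aws:sqs"]
        # The return value is for illustration; raising an exception is
        # what causes the batch to be retried.
        return {"processed": len(bodies)}

    # API Gateway proxy events carry the HTTP method and path.
    if event.get("httpMethod") == "GET" and event.get("path") == "/users":
        return {
            "statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"table": TABLE_NAME, "users": []}),
        }

    return {"statusCode": 404, "body": json.dumps({"message": "Not found"})}
```

An event produced by sam local generate-event can be fed to this handler through sam local invoke to check both branches before deploying.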
SAM Resource Types (Shorthand vs CloudFormation):
| SAM Type | Expands To |
|---|---|
| AWS::Serverless::Function | Lambda Function + IAM Role + Event Source Mappings |
| AWS::Serverless::Api | API Gateway REST API + Deployment + Stage |
| AWS::Serverless::HttpApi | API Gateway HTTP API |
| AWS::Serverless::SimpleTable | DynamoDB Table |
| AWS::Serverless::StateMachine | Step Functions State Machine |
| AWS::Serverless::Application | SAR (Serverless Application Repository) reference |
| AWS::Serverless::LayerVersion | Lambda Layer |
10.2 SAM CLI Commands
| Command | Purpose |
|---|---|
| sam init | Scaffold a new serverless application from a template |
| sam build | Build the application locally (installs dependencies into .aws-sam/) |
| sam local invoke | Invoke a Lambda function locally in a Docker container |
| sam local start-api | Start a local API Gateway instance for testing |
| sam local start-lambda | Start a local Lambda endpoint for SDK testing |
| sam local generate-event | Generate sample event payloads (S3, SQS, API GW, etc.) |
| sam validate | Validate the SAM template |
| sam deploy | Deploy to AWS (creates/updates CloudFormation stack) |
| sam deploy --guided | Interactive deployment wizard (creates samconfig.toml) |
| sam logs | Tail CloudWatch Logs for a deployed Lambda function |
| sam sync | Hot-swap code changes without a full CloudFormation deployment |
Key Concept: sam deploy --guided creates a samconfig.toml file that saves your deployment preferences. Subsequent sam deploy commands use this config without the --guided flag.
11. AWS SDK — Patterns & Error Handling
11.1 Exponential Backoff & Retry Logic
The AWS SDKs automatically retry transient errors (network issues, throttling). Even so, understanding the retry mechanism is critical for the exam.
Throttling and Transient Errors (Retryable):
- ProvisionedThroughputExceededException (DynamoDB)
- ThrottlingException
- RequestLimitExceeded
- HTTP 429, HTTP 503 (Service Unavailable)
- HTTP 500 (Internal Server Error): retryable
Non-Retryable Errors (Client Errors):
- ValidationException
- AccessDeniedException
- ResourceNotFoundException
- HTTP 400 (Bad Request): generally non-retryable
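The two lists above can be turned into a small classification helper. This is a sketch, not how any SDK decides internally: the set contents simply mirror the lists, and `is_retryable` is an illustrative name. The error code is checked before the status code because DynamoDB reports throttling with HTTP 400.

```python
RETRYABLE_CODES = {
    "ProvisionedThroughputExceededException",
    "ThrottlingException",
    "RequestLimitExceeded",
}
NON_RETRYABLE_CODES = {
    "ValidationException",
    "AccessDeniedException",
    "ResourceNotFoundException",
}
RETRYABLE_STATUS = {429, 500, 503}


def is_retryable(error_code, http_status):
    """Decide whether a failed call is worth retrying."""
    if error_code in RETRYABLE_CODES:
        return True
    if error_code in NON_RETRYABLE_CODES:
        return False
    # Unknown error code: fall back to the HTTP status.
    return http_status in RETRYABLE_STATUS


print(is_retryable("ProvisionedThroughputExceededException", 400))  # True
print(is_retryable("ValidationException", 400))                     # False
print(is_retryable("InternalServerError", 500))                     # True
```

With boto3, the two inputs are available on a caught ClientError as `err.response["Error"]["Code"]` and `err.response["ResponseMetadata"]["HTTPStatusCode"]`.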
Exponential Backoff Formula:
Wait Time = min(cap, base × 2^(attempt - 1)) + random jitter
Example (base = 1s, attempts counted from 1):
Attempt 1: wait 1s + jitter
Attempt 2: wait 2s + jitter
Attempt 3: wait 4s + jitter
Attempt 4: wait 8s + jitter
"Full Jitter": randomize the whole wait to spread out request storms
Wait = random(0, min(cap, base × 2^(attempt - 1)))
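The backoff schedule and the full-jitter variant can be sketched in a few lines. Attempts are counted from 1 so that attempt 1 waits up to `base` seconds, matching the example schedule. The 20-second cap, the five-attempt limit, and all function names are assumptions for illustration; the SDKs ship their own retry logic.

```python
import random
import time


def backoff_ceiling(attempt, base=1.0, cap=20.0):
    """Upper bound on the wait before retrying after the given attempt."""
    return min(cap, base * 2 ** (attempt - 1))


def full_jitter_delay(attempt, base=1.0, cap=20.0, rng=random.random):
    """Full jitter: pick a wait uniformly between 0 and the ceiling."""
    return rng() * backoff_ceiling(attempt, base, cap)


def call_with_backoff(operation, is_retryable, max_attempts=5, sleep=time.sleep):
    """Call operation(), retrying retryable failures with full jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:
            # Give up on the last attempt or on a non-retryable error.
            if attempt == max_attempts or not is_retryable(exc):
                raise
            sleep(full_jitter_delay(attempt))


print([backoff_ceiling(n) for n in range(1, 7)])  # [1.0, 2.0, 4.0, 8.0, 16.0, 20.0]
```

The `sleep` and `rng` parameters exist only so the behavior can be checked without real waiting or randomness.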
Exam Tip: If you see ProvisionedThroughputExceededException from DynamoDB, the solution is exponential backoff, not an immediate capacity increase. Throttling often indicates a burst that can be absorbed by retrying.
11.2 Pagination Patterns
Many AWS API calls return paginated results. Always implement pagination in production code.
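# DynamoDB Scan with pagination (Python)
import boto3

def process(item):
    """Placeholder for per-item work (hypothetical)."""
    print(item)

# Option 1: built-in paginator on the low-level client
dynamodb = boto3.client('dynamodb')
paginator = dynamodb.get_paginator('scan')
pages = paginator.paginate(TableName='MyTable')
for page in pages:
    for item in page['Items']:
        process(item)  # items arrive in DynamoDB JSON, e.g. {'userId': {'S': 'u1'}}

# Option 2: manual pagination using ExclusiveStartKey on the Table resource
table = boto3.resource('dynamodb').Table('MyTable')
response = table.scan()
items = response['Items']
while 'LastEvaluatedKey' in response:
    response = table.scan(ExclusiveStartKey=response['LastEvaluatedKey'])
    items.extend(response['Items'])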
Common Mistake: Calling Scan or Query without handling LastEvaluatedKey will only retrieve the first page (up to 1 MB) of results. Always check for and use LastEvaluatedKey to iterate all pages.
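One way to make this mistake hard to repeat is to wrap the LastEvaluatedKey loop in a generator that works for both Scan and Query. The sketch below is an illustration; `iter_items` is a made-up helper, and `fetch_page` is assumed to be a bound method such as `table.scan` or `table.query` from a boto3 Table resource.

```python
def iter_items(fetch_page, **kwargs):
    """Yield every item from a Scan or Query, following LastEvaluatedKey."""
    while True:
        response = fetch_page(**kwargs)
        yield from response.get("Items", [])
        last_key = response.get("LastEvaluatedKey")
        if last_key is None:
            break  # no more pages
        # Resume the next request where this page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

Usage would look like `for item in iter_items(table.scan): ...`. Because items are yielded page by page, the whole result set never has to be held in memory at once.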
12. Exam Tips & Quick Reference
Scenario-to-Answer Mapping
| Scenario Keyword / Requirement | Correct Answer |
|---|---|
| "Decouple services; one publisher, many subscribers" | SNS Fan-out (SNS → multiple SQS) |
| "Ensure exactly-once processing, preserve order" | SQS FIFO Queue |
| "Pause a workflow and wait for human approval" | Step Functions Task Token pattern |
| "Cache DynamoDB reads; reduce read load" | DAX (eventually consistent only) |
| "Real-time stream; multiple consumers replay data" | Kinesis Data Streams |
| "Deliver stream data to S3 with transformation" | Kinesis Firehose + Lambda transform |
| "Deploy serverless app with infrastructure as code" | AWS SAM |
| "API Gateway returns 429" | Throttling — increase limits or implement exponential backoff |
| "Lambda times out during SQS processing" | Increase Lambda timeout AND SQS visibility timeout |
| "Store shared dependencies across Lambda functions" | Lambda Layers |
| "Eliminate Lambda cold starts for critical endpoint" | Provisioned Concurrency |
| "Route S3 events to multiple services with rich filtering" | S3 → EventBridge (not S3 native events) |
| "API Gateway → DynamoDB without Lambda" | AWS Service Integration |
| "Allow user to upload directly to S3 securely" | Pre-signed URL (PUT) |
| "Step Functions — high throughput, short duration" | Express Workflow |
| "Step Functions — long running, human approval" | Standard Workflow |
| "Route the result of a successful async Lambda invocation to SQS" | Lambda Destinations (not DLQ) |
| "Read all items in DynamoDB; large dataset" | Parallel Scan with multiple workers |
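The Parallel Scan answer in the last row relies on the Segment and TotalSegments parameters of Scan: each worker scans one segment and paginates it independently. A minimal sketch, assuming `table` behaves like a boto3 Table resource; the worker count of 4 and the function names are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_segment(table, segment, total_segments):
    """Scan one segment of the table to completion."""
    items = []
    kwargs = {"Segment": segment, "TotalSegments": total_segments}
    while True:
        response = table.scan(**kwargs)
        items.extend(response.get("Items", []))
        if "LastEvaluatedKey" not in response:
            return items
        kwargs["ExclusiveStartKey"] = response["LastEvaluatedKey"]


def parallel_scan(table, total_segments=4):
    """Scan all segments concurrently and merge the results."""
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [
            pool.submit(scan_segment, table, segment, total_segments)
            for segment in range(total_segments)
        ]
        return [item for future in futures for item in future.result()]
```

In real code it is safer to give each worker its own boto3 resource, since resource objects are not documented as thread-safe. A parallel scan also consumes read capacity faster, so it can throttle other traffic on the table.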
Common Traps
- Lambda timeout vs SQS visibility timeout: Lambda timeout is the max execution time. SQS visibility timeout is how long the message stays hidden. If Lambda timeout > visibility timeout, a second consumer may receive and process the same message concurrently. Always set visibility timeout ≥ Lambda timeout × 6.
- GSI vs LSI: LSI must be created at table creation. GSI can be added anytime. GSI only supports eventual consistency. Don't confuse them.
- SQS DLQ vs Lambda DLQ vs Lambda Destinations: SQS DLQ is on the queue. Lambda async DLQ is on the Lambda function. Lambda Destinations replaces Lambda DLQ for async invocations and supports both success and failure routing.
- Kinesis standard vs enhanced fan-out: Standard consumers share 2 MB/s per shard. Enhanced fan-out gives each consumer their own 2 MB/s per shard but costs more.
- SAM Transform: Transform: AWS::Serverless-2016-10-31 is mandatory in SAM templates. Without it, CloudFormation does not recognize SAM resource types.
- EventBridge vs CloudWatch Events: EventBridge is the evolution of CloudWatch Events. Both use the same underlying API, but EventBridge adds custom buses, SaaS partners, schema registry, and pipes. For new development, always use EventBridge.
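The visibility timeout rule from the first trap (visibility timeout at least 6 times the Lambda timeout) is easy to check mechanically. The sketch below is an illustration with made-up function names; the factor of 6 comes from the rule stated above.

```python
def min_visibility_timeout(lambda_timeout_s, factor=6):
    """Smallest SQS visibility timeout that satisfies the rule."""
    return lambda_timeout_s * factor


def check_queue_config(lambda_timeout_s, visibility_timeout_s):
    """Return 'OK' or a short recommendation string."""
    required = min_visibility_timeout(lambda_timeout_s)
    if visibility_timeout_s < required:
        return f"Raise visibility timeout to at least {required}s"
    return "OK"


print(check_queue_config(lambda_timeout_s=30, visibility_timeout_s=30))
# Raise visibility timeout to at least 180s
print(check_queue_config(lambda_timeout_s=30, visibility_timeout_s=180))
# OK
```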
Key Terms — Domain 1
| Term | One-Line Definition |
|---|---|
| Cold Start | The latency added when Lambda initializes a new execution environment from scratch |
| Provisioned Concurrency | Pre-warmed Lambda execution environments that eliminate cold starts |
| Reserved Concurrency | A guaranteed concurrency limit for one function; also caps its maximum |
| Visibility Timeout | Time an SQS message is hidden from other consumers after being received |
| Dead-Letter Queue (DLQ) | Destination for messages that failed processing after maxReceiveCount attempts |
| Partition Key | The attribute that determines which DynamoDB partition stores the data |
| Sort Key | The secondary key that enables range queries within a partition |
| RCU / WCU | Read/Write Capacity Units — the billing and throughput units for DynamoDB |
| Pre-signed URL | A time-limited URL that grants temporary access to a private S3 object |
| Fan-out Pattern | One SNS message triggers multiple SQS queues in parallel |
| Event Source Mapping | Lambda's built-in polling mechanism for SQS, Kinesis, and DynamoDB Streams |
| Shard | The fundamental unit of capacity in Kinesis Data Streams |
| Enhanced Fan-out | Kinesis feature giving each consumer dedicated 2 MB/s per shard bandwidth |
| Task Token | A unique identifier in Step Functions that pauses a workflow until the token is returned |
| SAM Transform | CloudFormation macro that expands SAM shorthand into full CloudFormation resources |
End of Domain 1.