AWS Certified Developer – Associate (DVA-C02)

Domain 1: Development with AWS Services

Exam Code: DVA-C02 | Level: Associate
Domain Weight: 32% | Total Domains: 4 | Passing Score: 720/1000

Amazon S3
Amazon DynamoDB
Amazon ElastiCache
- 3.1 Redis vs Memcached
- 3.2 Caching Strategies
AWS Lambda
Amazon Kinesis
Amazon SQS
Amazon SNS
- 7.1 Topics, Subscriptions & Fan-out
- 7.2 SNS FIFO & Message Filtering
Amazon API Gateway
Amazon ECS & ECR
- 9.1 Launch Types & Task Definitions
- 9.2 IAM, Storage & Auto Scaling
Amazon EC2, ELB & ASG
- 10.1 IMDS, Security Groups & ELB Types
- 10.2 ASG Scaling Policies
Amazon RDS & Aurora
- 11.1 RDS vs Aurora
- 11.2 RDS Proxy & Integrations
Amazon EFS
- 12.1 Performance Modes & Storage Tiers
AWS Step Functions
- 13.1 State Types & Error Handling
Amazon Cognito
- 14.1 User Pools vs Identity Pools
AWS AppSync
Exam Tips & Quick Reference

1. Amazon S3

S3 is the foundational object storage service. For DVA-C02, focus on access patterns, encryption, events, versioning, pre-signed URLs, and lifecycle management.

1.1 Core Facts & Storage Classes

Concept	Detail
Max object size	5 TB
Single PUT limit	5 GB
Multipart Upload	Required > 5 GB, recommended > 100 MB
Durability	11 nines (99.999999999%) across all storage classes
Default bucket limit	100 per account (soft limit, raiseable)
Bucket names	Globally unique across all AWS accounts and regions

Storage Class	Retrieval	Min Duration	Use Case
S3 Standard	Instant	None	Frequently accessed active data
S3 Intelligent-Tiering	Instant/minutes	None	Unknown or changing access patterns
S3 Standard-IA	Instant	30 days	Infrequent but rapid access required
S3 One Zone-IA	Instant	30 days	Recreatable infrequent data
S3 Glacier Instant	Milliseconds	90 days	Quarterly access, instant retrieval
S3 Glacier Flexible	Minutes to 12 hours (expedited, standard, or bulk)	90 days	Long-term archive
S3 Glacier Deep Archive	12–48 hours	180 days	Compliance, multi-year retention

1.2 Versioning, Replication & Lifecycle

Versioning:

Enabled at the bucket level. Once enabled, cannot be fully disabled — only suspended.
Deleting a versioned object adds a delete marker — versions are not permanently deleted.
To permanently delete: delete the specific version ID.
Files uploaded before versioning was enabled have version null.

Replication:

Feature	Same-Region (SRR)	Cross-Region (CRR)
Use case	Log aggregation, dev/prod mirroring	Compliance, global latency
Versioning required	Yes (both buckets)	Yes (both buckets)
Existing objects	NOT replicated — use S3 Batch Replication	NOT replicated — use S3 Batch Replication
Delete markers	Optional to replicate	Optional to replicate
Chaining	A→B→C does NOT replicate A objects to C	Same rule

Critical: Replication only applies to new objects after enabling. Delete markers are NOT replicated by default. Deletes with a specific version ID are never replicated.

Lifecycle Rules:

Transition actions: Move objects between storage classes after N days.
Expiration actions: Delete objects or old versions after N days.
Filter by prefix or object tags.
Use a lifecycle rule to abort incomplete multipart uploads after N days (saves cost).
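
The rule types above map onto a single lifecycle configuration document. The following is a minimal sketch, not a recommended policy: the `logs/` prefix, the rule ID, and every day count are assumptions chosen for illustration. The dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts, and `apply_lifecycle` expects an S3 client that the caller has already created.

```python
def build_lifecycle_config(prefix="logs/", ia_days=30, glacier_days=90,
                           expire_days=365, abort_mpu_days=7):
    """Build a lifecycle configuration dict (all values are illustrative)."""
    return {
        "Rules": [
            {
                "ID": "tier-then-expire",        # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},    # filter by key prefix
                # Transition actions: move between storage classes after N days
                "Transitions": [
                    {"Days": ia_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_days, "StorageClass": "GLACIER"},
                ],
                # Expiration action: delete current versions after N days
                "Expiration": {"Days": expire_days},
                # Clean up multipart uploads that were never completed
                "AbortIncompleteMultipartUpload": {
                    "DaysAfterInitiation": abort_mpu_days
                },
            }
        ]
    }


def apply_lifecycle(s3_client, bucket, config):
    """Send the configuration; s3_client is e.g. boto3.client('s3')."""
    return s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )
```

Building the document separately from the API call keeps the rule easy to inspect and unit test before it is applied to a real bucket.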

1.3 Pre-signed URLs, CORS & Access Patterns

Pre-signed URLs:
The requester inherits the permissions of the IAM entity that generated the URL.

import boto3

s3_client = boto3.client('s3')

# Generate a pre-signed URL for GET (download)
download_url = s3_client.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'my-bucket', 'Key': 'private-file.pdf'},
    ExpiresIn=3600   # 1 hour
)

# Generate a pre-signed URL for PUT (upload directly to S3)
upload_url = s3_client.generate_presigned_url(
    'put_object',
    Params={'Bucket': 'my-bucket', 'Key': 'upload-key'},
    ExpiresIn=900    # 15 minutes
)

Method	Max Expiry
AWS Console	720 minutes (12 hours)
AWS CLI	604,800 seconds (7 days)
SDK (IAM User)	7 days
SDK (IAM Role / STS)	Until the STS token expires

CORS (Cross-Origin Resource Sharing):

Configure on the target bucket — the bucket that holds the assets being loaded cross-origin.
Required when a web page served from one origin makes the browser load resources from a bucket on a different origin.
A 403 error on a cross-origin request typically means the CORS config is missing.

[{
  "AllowedOrigins": ["https://www.example.com"],
  "AllowedMethods": ["GET", "PUT"],
  "AllowedHeaders": ["*"],
  "MaxAgeSeconds": 3000
}]

1.4 S3 Events, Access Points & Performance

Event Notifications:

Destination	Notes
SQS	Requires a resource policy on the SQS queue
SNS	Requires a resource policy on the SNS topic
Lambda	Requires a resource-based policy on Lambda
EventBridge	Advanced filtering, 18+ targets, archive and replay

Key Concept: S3 native events are simpler but limited. Use EventBridge for complex routing, filtering by object metadata or prefix, or when you need to send the event to multiple targets simultaneously.

Performance Optimization:

S3 supports 3,500 PUT/s and 5,500 GET/s per prefix. Multiple prefixes multiply throughput.
Randomize key prefixes to distribute requests across S3 partitions and avoid hot partitions.
Use Transfer Acceleration (routes through CloudFront edge locations) for long-distance uploads.
Use Byte-Range Fetches for parallel downloads of large files.
Use S3 Select to run SQL queries directly on CSV/JSON/Parquet in S3 — up to 400% faster and 80% cheaper than downloading the full file.
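
To make the byte-range idea concrete, the sketch below splits an object into inclusive ranges that could each be fetched by a separate worker. The 8 MiB default part size is an assumption for illustration, not an AWS recommendation, and the function only produces the `Range` header strings; it does not call S3.

```python
def byte_ranges(object_size, part_size=8 * 1024 * 1024):
    """Return HTTP Range header values that together cover object_size bytes."""
    if object_size <= 0 or part_size <= 0:
        raise ValueError("object_size and part_size must be positive")
    ranges = []
    start = 0
    while start < object_size:
        # HTTP byte ranges are inclusive on both ends
        end = min(start + part_size, object_size) - 1
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges
```

Each returned string can be passed as the `Range` parameter of a `get_object` call, and the parts are then concatenated in order.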

S3 Access Points:

Each access point has its own DNS hostname and access policy.
VPC origin: restrict an access point to a specific VPC (requires VPC Endpoint).
Simplify bucket policies when multiple teams or prefixes need different permissions.

S3 Object Lambda:

Transform objects on retrieval without copying or modifying the stored object.
Flow: caller → S3 Object Lambda Access Point → Lambda function → supporting S3 Access Point → bucket. The Lambda function returns the transformed object to the caller.
Use cases: redact PII, convert XML to JSON, add watermarks, resize images.

Common Trap: Enabling access logging on a bucket and setting the logging destination to the same bucket creates an infinite logging loop. Log storage grows exponentially. Always log to a separate bucket.

2. Amazon DynamoDB

DynamoDB is a fully managed, serverless, key-value and document NoSQL database. It delivers single-digit millisecond performance at any scale. The DVA-C02 exam tests DynamoDB heavily — expect capacity calculations, index decisions, and stream/TTL behavior.

2.1 Data Model & Key Design

┌─────────────────────────────────────────────────────────────────────┐
│                     DynamoDB Key Design                              │
│                                                                       │
│  Option 1: Simple Primary Key (Partition Key only)                   │
│  ┌──────────────────────────────────────────────┐                    │
│  │  PK (Partition Key)   │  Attributes...        │                    │
│  │  UserID = "U-001"     │  name, email, role    │                    │
│  └──────────────────────────────────────────────┘                    │
│  PK must be unique per item.                                          │
│                                                                       │
│  Option 2: Composite Primary Key (Partition Key + Sort Key)          │
│  ┌──────────────┬───────────────────┬───────────────────┐            │
│  │  PK           │  SK               │  Attributes       │            │
│  │  OrderID-001  │  2024-01-15       │  amount = 99.00   │            │
│  │  OrderID-001  │  2024-02-20       │  amount = 149.00  │            │
│  │  OrderID-001  │  2024-03-10       │  amount = 49.00   │            │
│  └──────────────┴───────────────────┴───────────────────┘            │
│  PK + SK must be unique. Multiple items share the same PK.           │
└─────────────────────────────────────────────────────────────────────┘

Partition Key Design Best Practices:

Use high-cardinality attributes (e.g., UserID, OrderID, DeviceID) — many distinct values prevent hot partitions.
Avoid low-cardinality keys (e.g., Status with 3 values) — all traffic concentrates on a few partitions.
For write sharding on low-cardinality keys: append a random suffix (Candidate_A#3) and fan-out reads across all suffixes.

Critical Concept: DynamoDB distributes data across partitions based on the partition key hash. A hot partition — one partition key receiving disproportionate traffic — will throttle writes/reads even if overall provisioned capacity is sufficient.
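
Write sharding can be sketched in a few lines. This is an illustration under stated assumptions: the shard count of 10 and the `#` separator are arbitrary choices, and the helper names are hypothetical.

```python
import random

SHARD_COUNT = 10  # illustrative; size it to the write rate you need to spread


def sharded_key(base_key, shard_count=SHARD_COUNT):
    """Partition key for a write: append a random suffix to spread load."""
    return f"{base_key}#{random.randrange(shard_count)}"


def all_shard_keys(base_key, shard_count=SHARD_COUNT):
    """Every partition key a reader must query and merge (fan-out read)."""
    return [f"{base_key}#{i}" for i in range(shard_count)]
```

Writes call `sharded_key("Candidate_A")`, while a reader that needs the total queries each key from `all_shard_keys("Candidate_A")` and combines the results. The trade-off is cheaper, evenly spread writes in exchange for more read requests.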

2.2 Read/Write Capacity & Capacity Math

Mode	How It Works	Best For
Provisioned (with Auto Scaling)	Set RCU/WCU. Auto Scaling adjusts within min/max. Cheaper at scale.	Steady, predictable traffic
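On-Demand	No capacity planning. Scales automatically with traffic. ~2.5x more expensive. Throttling is rare but still possible if traffic more than doubles the previous peak within 30 minutes.	Unpredictable spikes, new apps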

Capacity modes can be switched once every 24 hours.

Capacity Unit Definitions:

Unit	Strongly Consistent	Eventually Consistent	Transactional
1 RCU	1 read/s for items ≤ 4 KB	2 reads/s for items ≤ 4 KB	2× the cost of strongly consistent
1 WCU	1 write/s for items ≤ 1 KB	Same as strongly consistent	2× the cost of standard write

Worked Examples:

RCU Example — Strongly Consistent:
  Read 10 items/second, each item is 10 KB
  → Each item: CEIL(10 KB / 4 KB) = 3 RCUs per item
  → Total: 10 × 3 = 30 RCUs needed

RCU Example — Eventually Consistent:
  Read 16 items/second, each item is 12 KB
  → Each item: CEIL(12 KB / 4 KB) = 3 RCUs (strong) → 1.5 RCUs (eventual)
  → Total: 16 × 1.5 = 24 RCUs needed

WCU Example:
  Write 20 items/second, each item is 3.5 KB
  → Each item: CEIL(3.5 KB / 1 KB) = 4 WCUs per item
  → Total: 20 × 4 = 80 WCUs needed

Transactional WCU Example:
  Write 10 items/second, each item is 5 KB (transactional)
  → Each item: CEIL(5 KB / 1 KB) = 5 WCUs × 2 (transactional) = 10 WCUs
  → Total: 10 × 10 = 100 WCUs needed
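
The same arithmetic can be written as two small helpers, which is a convenient way to check practice questions. This is a sketch of the rounding rules stated above; the function names are hypothetical.

```python
import math


def rcus(items_per_second, item_kb, consistency="strong"):
    """RCUs needed: round each item up to 4 KB blocks, then apply the mode factor."""
    blocks_per_item = math.ceil(item_kb / 4)
    factor = {"strong": 1.0, "eventual": 0.5, "transactional": 2.0}[consistency]
    return math.ceil(items_per_second * blocks_per_item * factor)


def wcus(items_per_second, item_kb, transactional=False):
    """WCUs needed: round each item up to 1 KB blocks; transactions cost double."""
    blocks_per_item = math.ceil(item_kb)
    return items_per_second * blocks_per_item * (2 if transactional else 1)
```

Calling `rcus(10, 10)`, `rcus(16, 12, "eventual")`, `wcus(20, 3.5)`, and `wcus(10, 5, transactional=True)` reproduces the four worked examples: 30, 24, 80, and 100.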

2.3 Indexes — GSI & LSI

Feature	Local Secondary Index (LSI)	Global Secondary Index (GSI)
Partition Key	Same as base table	Any attribute (different PK allowed)
Sort Key	Different from base table	Any attribute
Creation	At table creation only — cannot add later	Anytime — before or after table creation
Consistency	Strongly or eventually consistent	Eventually consistent reads only
Capacity	Shares RCU/WCU with base table	Has its own provisioned RCU/WCU
Per-table limit	Max 5 LSIs	Max 20 GSIs

Critical Exam Trap: If GSI write throughput is insufficient, the main table write operations will be throttled — even if the main table has adequate WCU. Monitor and provision GSI WCU appropriately.

Critical Exam Trap: LSIs must be created at table creation — there is no way to add an LSI to an existing table. GSIs can be added at any time.

Sparse Indexes: Create a GSI on an attribute that only some items have. Items without that attribute are not included in the index. Efficient for queries like "all orders in PENDING status" if most orders are COMPLETED.

2.4 Write Patterns & Transactions

API Operations:

Operation	Description
`PutItem`	Create or fully replace an item. Same PK = overwrite.
`UpdateItem`	Modify specific attributes without replacing. Creates item if missing. Used for atomic counters.
`DeleteItem`	Remove an item. Supports conditional delete.
`GetItem`	Read a single item by primary key. Add `ConsistentRead=true` for strongly consistent.
`Query`	Read items with same PK and optional SK filter. Efficient. 1 MB per call, paginated.
`Scan`	Read ALL items, then optionally filter. Expensive. Avoid for production reads.
`BatchGetItem`	Up to 100 items across tables in one API call.
`BatchWriteItem`	Up to 25 PutItem or DeleteItem operations. Cannot UpdateItem.

Critical: Scan reads every item before applying filters. FilterExpression does NOT reduce the RCU consumed — you pay for all data read regardless of how much is filtered out. Always prefer Query.

Conditional Writes and Optimistic Locking:

import boto3

# 'Products' is a hypothetical table name used for illustration
table = boto3.resource('dynamodb').Table('Products')

# Optimistic locking: update only if the stored version matches
table.update_item(
    Key={'PK': 'item-001'},
    UpdateExpression='SET price = :p, version = :newver',
    ConditionExpression='version = :currentver',
    ExpressionAttributeValues={
        ':p': 99,
        ':newver': 2,
        ':currentver': 1  # must match current value in DynamoDB
    }
)
# Raises ConditionalCheckFailedException if version does not match

Pattern	API	Idempotent	Use Case
Atomic Counter	`UpdateItem` with ADD	No	Page views, vote tallies
Optimistic Locking	`UpdateItem` with `ConditionExpression`	Yes	Concurrent item updates
Conditional Write	`PutItem`/`UpdateItem` with `attribute_not_exists(pk)`	Yes	Safe insert, no overwrite

Transactions:

# TransactWriteItems — all succeed or all fail
dynamodb.transact_write_items(
    TransactItems=[
        {'Put': {'TableName': 'Orders', 'Item': {...}}},
        {'Update': {'TableName': 'Inventory', 'Key': {...}, 'UpdateExpression': 'SET stock = stock - :q', ...}}
    ]
)

TransactWriteItems: up to 100 write operations atomically.
TransactGetItems: up to 100 read operations atomically.
Transactions consume 2× the normal RCU/WCU.

2.5 DynamoDB Streams, TTL & DAX

DynamoDB Streams:
Captures a time-ordered log of item-level modifications (INSERT, MODIFY, REMOVE). Retention: 24 hours.

Stream View Type	What Is Included
`KEYS_ONLY`	Only the key attributes of the modified item
`NEW_IMAGE`	The entire item after modification
`OLD_IMAGE`	The entire item before modification
`NEW_AND_OLD_IMAGES`	Both pre- and post-modification images

DynamoDB Table → Streams (24h) → Lambda (Event Source Mapping) → downstream processing
                                                                  (search indexing, notifications, replication)

Exam Trap: Enabling Streams alone does NOT trigger Lambda. You must also configure an Event Source Mapping connecting the stream to the Lambda function.

Time to Live (TTL):

Designate a Number attribute as the TTL attribute. Store values as Unix epoch timestamps.
DynamoDB automatically deletes expired items, typically within 48 hours — not guaranteed to the second.
TTL deletes consume no WCU.
TTL deletions appear in Streams as REMOVE events — use this to trigger cleanup or archival workflows.
Expired items that have not yet been deleted will still appear in queries and scans. Filter them in application code.
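
Two pieces of application code usually accompany TTL: computing the epoch value to store, and filtering out items that have expired but have not been deleted yet. A minimal sketch, assuming a hypothetical TTL attribute named `expires_at`:

```python
import time


def ttl_epoch(seconds_from_now, now=None):
    """Unix epoch timestamp (in seconds) to store in the TTL attribute."""
    now = time.time() if now is None else now
    return int(now) + int(seconds_from_now)


def drop_expired(items, ttl_attr="expires_at", now=None):
    """Keep items with no TTL attribute or with a TTL still in the future."""
    now = int(time.time() if now is None else now)
    return [item for item in items
            if ttl_attr not in item or item[ttl_attr] > now]
```

The `now` parameter exists only to make the helpers testable; production callers would omit it.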

DynamoDB Accelerator (DAX):

Application → DAX Cluster (in-memory, microsecond reads) → DynamoDB
              Cache Hit  → return immediately
              Cache Miss → fetch from DynamoDB → populate cache → return

Feature	Detail
Latency	Microseconds on a cache hit vs. single-digit milliseconds for DynamoDB
Consistency	Serves eventually consistent reads from the cache. Strongly consistent reads are passed through to DynamoDB and are not cached
Write behavior	Write-through: writes go to DynamoDB first, then DAX
Use case	Read-heavy workloads with repeated reads of same data; hot key mitigation
NOT suitable for	Strongly consistent reads, write-heavy workloads, financial/transactional data
Default TTL	Item cache: 5 minutes. Query/Scan cache: 5 minutes

Exam Tip: DAX is NOT the answer when the question requires strongly consistent reads or always-fresh data. Use DAX only for eventually consistent, read-heavy workloads.

3. Amazon ElastiCache

ElastiCache is a managed in-memory data store that reduces database load and delivers sub-millisecond latency. Requires application code changes — it is not a transparent add-on.

3.1 Redis vs Memcached

Feature	Redis	Memcached
Multi-AZ / Failover	Yes (automatic failover)	No
Read Replicas	Yes (up to 5 per shard)	No
Data Persistence	Yes (AOF / RDB snapshots)	No
Data Structures	Strings, Hashes, Lists, Sets, Sorted Sets, Bitmaps	Simple key-value only
Pub/Sub	Yes	No
Transactions	Yes	No
Multi-threaded	No (single-threaded)	Yes
Node failure impact	Failover to replica; data survives	Complete data loss
Choose when	HA, persistence, leaderboards, pub/sub, sessions	Pure cache, max throughput, no HA needed

3.2 Caching Strategies

Lazy Loading (Cache-Aside):

Read Request:
  1. Check cache
  2a. Cache Hit  → return data immediately
  2b. Cache Miss → query database → write to cache → return data

Write Request:
  1. Write to database only
  (cache is NOT updated; data becomes stale until TTL expires)

Pros: Only requested data is cached. Node failures are non-fatal (just slower reads temporarily).
Cons: Cache miss incurs 3 round trips (cache check + DB query + cache write). Data can be stale.

Write-Through:

Write Request:
  1. Write to database
  2. Write the same data to the cache as part of the same operation

Pros: Cache is always fresh. Reads are fast.
Cons: Extra write latency. Cache churn if data is written but rarely read.

Best Practice: Combine Lazy Loading as the foundation with Write-Through for hot or frequently-read keys. Always set a TTL to prevent stale data buildup.
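
The two strategies can be sketched with in-process stand-ins. Everything here is an assumption for illustration: `Cache` is a tiny dictionary-backed substitute for Redis or Memcached, `db` is a plain dict standing in for the database, and the 300-second TTL is arbitrary.

```python
import time


class Cache:
    """Tiny in-process stand-in for a cache with per-key TTL."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None or entry[1] <= time.monotonic():
            self._data.pop(key, None)   # treat expired entries as misses
            return None
        return entry[0]

    def set(self, key, value, ttl):
        self._data[key] = (value, time.monotonic() + ttl)


def read_lazy(key, cache, db, ttl=300):
    """Lazy loading: check cache, fall back to the database, then populate."""
    value = cache.get(key)
    if value is None:                   # cache miss
        value = db.get(key)             # query the database
        if value is not None:
            cache.set(key, value, ttl)  # write to cache for later reads
    return value


def write_through(key, value, cache, db, ttl=300):
    """Write-through: update the database and the cache together."""
    db[key] = value
    cache.set(key, value, ttl)
```

A write that bypasses `write_through` and updates only `db` leaves the cached value stale until its TTL expires, which is the lazy-loading downside described above.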

4. AWS Lambda

Lambda runs code without provisioning or managing servers. Billed per request and per GB-second of compute duration.

4.1 Core Configuration & Limits

Setting	Value
Memory	128 MB – 10,240 MB (in 1 MB increments)
vCPU	Scales linearly with memory. 1 full vCPU at 1,769 MB
Timeout	Default 3 seconds. Maximum 900 seconds (15 minutes)
Deployment package (zip)	50 MB compressed, 250 MB unzipped
Container image	Up to 10 GB from Amazon ECR
/tmp storage	512 MB default, configurable up to 10,240 MB
Layers	Up to 5 layers per function. Total unzipped size ≤ 250 MB
Default concurrency	1,000 concurrent executions per account per region (soft limit)
Environment variables	Max 4 KB total. Encrypted at rest with KMS

4.2 Invocation Types

┌──────────────────────────────────────────────────────────────────────────┐
│               Lambda Invocation Models                                    │
│                                                                            │
│  SYNCHRONOUS              ASYNCHRONOUS            EVENT SOURCE MAPPING    │
│  (Push — caller waits)    (Fire & Forget)          (Pull — Lambda polls)  │
│  ─────────────────────    ─────────────────────    ──────────────────────  │
│  API Gateway              S3                       SQS                     │
│  ALB                      SNS                      SQS FIFO                │
│  CloudFront               EventBridge              Kinesis Data Streams    │
│  Cognito                  SES                      DynamoDB Streams        │
│  SDK (RequestResponse)    CloudFormation           Amazon MSK (Kafka)      │
│                           CloudWatch Logs                                  │
│                                                                            │
│  Error handling:          Lambda retries 2×        Lambda polls; managed   │
│  returned to caller       DLQ or Destinations      by Lambda service       │
└──────────────────────────────────────────────────────────────────────────┘

Model	Who Handles Retries	DLQ Support	Notes
Synchronous	Caller	No	Errors returned immediately to caller
Asynchronous	Lambda (2 retries with delays)	Yes (SQS or SNS)	Lambda Destinations for success + failure
Event Source Mapping	Lambda service (configurable)	Yes (bisect batch, DLQ)	Lambda polls SQS/Kinesis/DynamoDB Streams

Exam Trap: For SQS → Lambda, the Lambda service internally uses long polling to read from SQS. The consumer is Lambda's event source mapping — your code does not poll SQS directly.

4.3 Execution Environment & Performance

Lambda Execution Lifecycle:
  ┌──────────────┐    ┌────────────────┐    ┌───────────────┐
  │  INIT Phase  │ → │  INVOKE Phase  │ → │ SHUTDOWN Phase│
  │  (Cold Start)│    │  (your handler)│    │               │
  │              │    │                │    │               │
  │ Download code│    │ Execute handler│    │ Environment   │
  │ Start runtime│    │                │    │ frozen or     │
  │ Run init code│    │                │    │ destroyed     │
  └──────────────┘    └────────────────┘    └───────────────┘
       ↑
  Code OUTSIDE handler runs here.
  Put DB connections, SDK clients, config loading here.
  They persist across warm invocations — reuse them.

Cold Start vs Warm Start:

Cold Start: Lambda provisions a new execution environment. Adds 100ms to several seconds depending on runtime and package size.
Warm Start: Lambda reuses an existing environment. The Init Phase is skipped.
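
The difference between the Init and Invoke phases shows up in where code is placed. In this sketch, `_create_client` is a hypothetical stand-in for an expensive setup step such as opening a database connection; the counter exists only to show that it runs once per execution environment.

```python
import time

INIT_COUNT = 0


def _create_client():
    """Stand-in for expensive setup (DB connection, SDK client, config load)."""
    global INIT_COUNT
    INIT_COUNT += 1
    return {"created_at": time.time()}


# Module scope: runs once per execution environment, during the Init phase.
CLIENT = _create_client()


def handler(event, context):
    # Invoke phase: every warm invocation reuses the same CLIENT object.
    return {"client_created_at": CLIENT["created_at"], "echo": event}
```

Calling `handler` repeatedly in the same process leaves `INIT_COUNT` at 1, which mirrors how a warm environment skips initialization.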

Performance Best Practices:

Practice	Why
Initialize DB connections outside the handler	Connection persists across warm invocations — eliminates connection overhead on every call
Increase memory allocation	CPU scales linearly with memory — the only way to add more CPU
Use Provisioned Concurrency	Pre-warms execution environments, so requests served within the provisioned amount avoid cold starts
Reduce deployment package size	Smaller packages download and initialize faster
Avoid Spring Framework in Java	Spring's startup time is a major cold start contributor

4.4 Concurrency & Throttling

Account Concurrency Pool (default: 1,000 per region)
  ┌─────────────────────────────────────────────────────────────┐
  │                                                               │
  │  ┌────────────────────┐  ┌────────────────────────────────┐  │
  │  │ Reserved           │  │ Unreserved (shared pool)       │  │
  │  │ Concurrency        │  │                                │  │
  │  │ (per function cap) │  │ All other functions share this │  │
  │  └────────────────────┘  └────────────────────────────────┘  │
  │                                                               │
  │  Provisioned Concurrency (subset of Reserved):                │
  │  Pre-warms environments → no cold starts                      │
  └─────────────────────────────────────────────────────────────┘

Concurrency Type	Purpose	Cost Implication
Reserved Concurrency	Guarantees a capacity floor for one function; also caps its maximum	No extra cost
Provisioned Concurrency	Pre-warms N execution environments; eliminates cold starts	Billed per hour even when idle
Unreserved	Default shared pool across all functions	Standard Lambda pricing

Critical: Setting Reserved Concurrency to 0 disables the function — no invocations are allowed. Use this to temporarily disable a non-critical function during incidents.

Concurrency Estimation:

Concurrent Executions = (Invocations per second) × (Average Duration in seconds)
Example: 100 req/s × 2s average = 200 concurrent executions needed
If account limit is 1,000, this function alone could use 200 of that pool.
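
The estimate is simple enough to encode directly. A small sketch with hypothetical helper names:

```python
import math


def required_concurrency(requests_per_second, avg_duration_seconds):
    """Concurrent executions = request rate × average duration, rounded up."""
    return math.ceil(requests_per_second * avg_duration_seconds)


def remaining_pool(account_limit, *function_concurrency):
    """Concurrency left in the regional pool after the given functions."""
    return account_limit - sum(function_concurrency)
```

For the example above, `required_concurrency(100, 2)` gives 200, and `remaining_pool(1000, 200)` leaves 800 for every other function in the region.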

4.5 Versions, Aliases & Layers

Versions:

$LATEST is the mutable working copy. All changes go here first.
Publishing creates an immutable snapshot: V1, V2, V3. Code and configuration are frozen.
Each version has its own ARN: arn:aws:lambda:region:account:function:FunctionName:3

Aliases:

Named pointer to a specific version. Mutable — can be updated without changing the ARN.
Cannot point to another alias — only to numbered versions or $LATEST.
Supports weighted traffic routing for canary deployments.
API Gateway stage variables + Lambda aliases = deploy to different environments without changing API Gateway.

PROD alias → 90% traffic to V2 + 10% traffic to V3  (canary release)
DEV  alias → $LATEST
TEST alias → V1
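
The PROD canary above corresponds to an alias whose primary version is 2 with an additional weight on version 3. The sketch below only builds the request parameters in the shape that boto3's `update_alias` accepts; the function name and the example values are assumptions for illustration.

```python
def canary_alias_params(function_name, alias, stable_version,
                        canary_version, canary_weight):
    """Parameters for shifting a fraction of alias traffic to a new version."""
    if not 0.0 <= canary_weight <= 1.0:
        raise ValueError("canary_weight must be between 0.0 and 1.0")
    return {
        "FunctionName": function_name,
        "Name": alias,
        # Primary version receives the remaining (1 - canary_weight) traffic
        "FunctionVersion": str(stable_version),
        "RoutingConfig": {
            "AdditionalVersionWeights": {str(canary_version): canary_weight}
        },
    }
```

With a Lambda client in hand, `lambda_client.update_alias(**canary_alias_params("my-function", "PROD", 2, 3, 0.10))` would apply the 90/10 split, where `my-function` is a placeholder name.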

Lambda Layers:

Package shared libraries, custom runtimes, or configuration data separately from function code.
Up to 5 layers per function. Total unzipped size (function + all layers) ≤ 250 MB.
Layers are immutable once published. Update by publishing a new layer version.

Lambda Destinations:
Available for asynchronous invocations. Captures both success and failure with full event context.

Destination Target	Supported For
SQS	On success and on failure
SNS	On success and on failure
EventBridge	On success and on failure
Another Lambda	On success and on failure

Exam Tip: DLQ only captures failures. Lambda Destinations capture both success and failure events with the full event payload. Prefer Destinations for new serverless architectures.

4.6 Lambda in VPC & Error Handling

Lambda in VPC:

Default (no VPC config):
  Lambda runs in AWS-managed VPC → can reach internet → CANNOT reach your VPC resources (RDS, ElastiCache)

With VPC config:
  Lambda creates an ENI in your specified subnets → can reach VPC resources
  But: Lambda in a public subnet does NOT get a public IP → no internet access
  Solution: Lambda in private subnet + NAT Gateway in public subnet → internet access
  Alternative: VPC Endpoints for S3 and DynamoDB (no NAT needed, cheaper)

Error Handling:

Invocation Type	Retry Behavior	DLQ/Destination
Asynchronous	2 retries with 1-min then 2-min wait	DLQ on Lambda function OR Lambda Destinations
Kinesis/DynamoDB Streams	Retries until success or record expiry; shard pauses	`BisectBatchOnFunctionError`, `MaximumRetryAttempts`, `DestinationConfig`
SQS	Failed batch returned to queue; individual messages re-queued	DLQ configured on the SQS queue — NOT on Lambda

Critical Exam Trap: For SQS → Lambda, configure the DLQ on the SQS queue itself, not on the Lambda function. Lambda's DLQ setting only applies to asynchronous (non-ESM) invocations.

5. Amazon Kinesis

Kinesis is the platform for real-time streaming data. DVA-C02 focuses primarily on Kinesis Data Streams.

5.1 Kinesis Data Streams

┌──────────────────────────────────────────────────────────────────────┐
│                       Kinesis Data Streams                            │
│                                                                        │
│  Producers                Shards                  Consumers           │
│  ┌──────────┐     ┌──────────────────────┐     ┌──────────────────┐  │
│  │ App/IoT  │────►│ Shard 1              │────►│ Lambda           │  │
│  │ Logs     │────►│ Shard 2              │────►│ Kinesis Analytics│  │
│  │ Metrics  │────►│ Shard N              │────►│ Firehose         │  │
│  └──────────┘     └──────────────────────┘     └──────────────────┘  │
│                                                                        │
│  • 1 shard = 1 MB/s or 1,000 records/s write                          │
│  • 1 shard = 2 MB/s read (shared among standard consumers)            │
│  • Enhanced Fan-out: 2 MB/s per consumer per shard (dedicated)        │
│  • Retention: 24 hours default, up to 365 days                        │
│  • Records are ordered per shard; same partition key = same shard     │
└──────────────────────────────────────────────────────────────────────┘

Feature	Detail
Shard write capacity	1 MB/s or 1,000 records/s per shard
Shard read capacity	2 MB/s shared across all standard consumers per shard
Enhanced Fan-out read	2 MB/s dedicated per consumer per shard (push via HTTP/2)
Max record size	1 MB
Retention	Default 24 hours; up to 365 days (additional cost)
Ordering	Guaranteed per shard. Same partition key → same shard → ordered. Not across shards.
Scaling	Provisioned mode: manual, call `UpdateShardCount`. On-demand mode: capacity adjusts automatically.
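
The per-shard limits in the table translate into a quick sizing estimate for provisioned streams. This is a rough sketch that assumes standard (shared-throughput) consumers, so `read_mb_per_s` is the combined read rate of all of them; the function name is hypothetical.

```python
import math


def shards_needed(write_mb_per_s, records_per_s, read_mb_per_s):
    """Smallest shard count that satisfies all three per-shard limits."""
    return max(
        1,
        math.ceil(write_mb_per_s / 1.0),   # 1 MB/s write per shard
        math.ceil(records_per_s / 1000),   # 1,000 records/s write per shard
        math.ceil(read_mb_per_s / 2.0),    # 2 MB/s shared read per shard
    )
```

Whichever limit is hit first determines the shard count, so a stream of many tiny records can need more shards than its byte rate alone suggests.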

ShardIterator Types:

Type	Behavior
`TRIM_HORIZON`	Start from the oldest record available in the shard
`LATEST`	Start from new records only (ignores existing data)
`AT_SEQUENCE_NUMBER`	Start at a specific sequence number
`AT_TIMESTAMP`	Start from a specific timestamp

KCL (Kinesis Client Library):

Manages shard iterators, checkpointing, and load balancing automatically.
Maximum: 1 KCL worker per shard. Additional workers beyond shard count will be idle.
Checkpoints are stored in a DynamoDB table — throttled DynamoDB = broken checkpoints.

Lambda + Kinesis:

Lambda uses Event Source Mapping. One invocation per shard (up to 10 batches per shard with parallelization factor).
A single bad record blocks its shard until the batch succeeds or the record expires. Fix: enable BisectBatchOnFunctionError=true and set MaximumRetryAttempts.
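
The poison-record settings are applied on the event source mapping itself. The sketch below builds parameters in the shape that boto3's `create_event_source_mapping` accepts; the retry count, batch size, and starting position are illustrative assumptions, and the ARNs are supplied by the caller.

```python
def kinesis_esm_params(stream_arn, function_name, failure_destination_arn,
                       max_retries=3, batch_size=100):
    """Event source mapping settings that stop one bad record blocking a shard."""
    return {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        "StartingPosition": "TRIM_HORIZON",   # begin at the oldest record
        "BatchSize": batch_size,
        # Split a failing batch in two to isolate the bad record
        "BisectBatchOnFunctionError": True,
        # Give up after a bounded number of retries
        "MaximumRetryAttempts": max_retries,
        # Send details of discarded records to an SQS queue or SNS topic
        "DestinationConfig": {
            "OnFailure": {"Destination": failure_destination_arn}
        },
    }
```

The result can be passed as `lambda_client.create_event_source_mapping(**params)`. Bounding retries trades strict completeness for forward progress, so the failure destination should be monitored.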

5.2 Kinesis Firehose & Analytics

Kinesis Data Firehose:

Fully managed delivery stream — no shards or consumer code required.
Buffers, compresses, transforms, and delivers data.
Destinations: S3, Redshift, OpenSearch, Splunk, HTTP endpoints.
Near-real-time: minimum 60 second buffer or 1 MB buffer (whichever triggers first).
Can invoke a Lambda function for data transformation before delivery.
No replay capability — once delivered, stream data is not retained.

Kinesis Data Analytics (now Amazon Managed Service for Apache Flink):

Run standard SQL or Apache Flink queries on streaming data in real time.
Sources: Kinesis Data Streams, Kinesis Firehose.

5.3 Kinesis vs SQS vs SNS

Requirement	Best Choice
Multiple independent consumers reading the same data	Kinesis Data Streams
Replay historical stream data	Kinesis Data Streams
Real-time analytics on ordered time-series data	Kinesis Data Streams
Simple point-to-point decoupling, one consumer per message	SQS
Guaranteed at-least-once delivery with retry	SQS
Fan-out to multiple consumers simultaneously	SNS → multiple SQS queues
Exactly-once ordered delivery	SQS FIFO

6. Amazon SQS

SQS is a fully managed message queue for decoupling and scaling microservices.

6.1 Standard vs FIFO Queues

Feature	Standard Queue	FIFO Queue
Throughput	Unlimited	300 msg/s; 3,000 msg/s with batching
Delivery guarantee	At-least-once (duplicates possible)	Exactly-once processing
Ordering	Best-effort (not guaranteed)	Strict FIFO within a message group
Deduplication	Application must handle	Built-in (5-minute deduplication window)
Queue name	Any name	Must end with `.fifo`
DLQ type	Must use Standard DLQ	Must use FIFO DLQ
Use case	High throughput, order unimportant	Financial transactions, ordered events

Message Size:

Native SQS maximum: 256 KB.
For larger payloads: use the SQS Extended Client Library — store payload in S3, send an S3 pointer in the SQS message.

6.2 Visibility Timeout, DLQ & Polling

Visibility Timeout:

Timeline:
  t=0s:   Consumer receives message. Message becomes INVISIBLE.
  t=30s:  Default visibility timeout expires.
             → Consumer deleted message (success) → message permanently removed
             → Consumer did NOT delete → message becomes VISIBLE AGAIN → redelivered

Settings:
  Default:  30 seconds
  Minimum:  0 seconds
  Maximum:  12 hours

Best Practice: Set visibility timeout to at least 6× your Lambda function timeout.
If Lambda timeout = 5 minutes, set visibility timeout to 30+ minutes.

Extending Visibility Mid-Processing:
Call ChangeMessageVisibility to extend the timeout before it expires if processing takes longer than expected.

Dead-Letter Queue (DLQ):
After a message fails maxReceiveCount times, it is moved to the DLQ.

Setting	Description
`maxReceiveCount`	Number of receives before moving to DLQ (1–1000). Set on the source queue's redrive policy.
`messageRetentionPeriod`	How long messages stay in queue (60 seconds – 14 days; default 4 days)
DLQ type	Standard queue DLQ → Standard. FIFO queue DLQ → must also be FIFO.

Critical Exam Trap: For Lambda + SQS Event Source Mapping, configure the DLQ on the SQS queue — not on the Lambda function. Lambda's function-level DLQ only applies to asynchronous (non-ESM) invocations.

Polling Modes:

Mode	Behavior	Recommendation
Short Polling	Returns immediately even if empty. Queries a random subset of SQS servers.	Avoid — empty responses are billed
Long Polling	Waits up to 20 seconds for messages. Queries all SQS servers.	Always preferred — reduces cost and empty responses

Enable long polling: set ReceiveMessageWaitTimeSeconds > 0 (up to 20 seconds).
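
The visibility timeout, redrive policy, and long-polling settings are all queue attributes, so they can be assembled together. A minimal sketch, assuming a Lambda consumer and the 6× guideline above; the default `maxReceiveCount` of 5 is an arbitrary illustrative choice and the function name is hypothetical.

```python
import json


def source_queue_attributes(dlq_arn, lambda_timeout_seconds,
                            max_receive_count=5, wait_seconds=20):
    """Attributes for a source queue consumed by Lambda, with a DLQ attached."""
    visibility = 6 * lambda_timeout_seconds
    if visibility > 12 * 60 * 60:
        raise ValueError("visibility timeout cannot exceed 12 hours")
    if not 0 <= wait_seconds <= 20:
        raise ValueError("wait_seconds must be between 0 and 20")
    return {
        # SQS attribute values are passed as strings
        "VisibilityTimeout": str(visibility),
        "ReceiveMessageWaitTimeSeconds": str(wait_seconds),  # long polling
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": dlq_arn,
            "maxReceiveCount": max_receive_count,
        }),
    }
```

The dictionary can be passed as `Attributes` to `sqs_client.set_queue_attributes` or `create_queue`. Note that the redrive policy lives on the source queue, which matches the exam trap above.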

6.3 Fan-out & FIFO Deep Dive

Fan-out Pattern (SNS → SQS):

S3 Event → SNS Topic ──→ SQS Queue A → Consumer A (Order Processing)
                     ├──→ SQS Queue B → Consumer B (Inventory Update)
                     └──→ SQS Queue C → Consumer C (Analytics)

Problem solved: S3 can only send a native event to ONE destination.
Solution: S3 → SNS → multiple SQS queues, so each consumer processes the event independently and in parallel.

SQS FIFO — Message Groups:
Use MessageGroupId to control parallel processing within a FIFO queue. Messages in the same group are processed in strict order by one consumer. Messages in different groups can be processed in parallel.

Single MessageGroupId → single active consumer → throughput bottleneck.
Multiple MessageGroupIds → parallel processing, each group remains ordered internally.

Delay Queue:

Hides a new message for N seconds after it is published (producer-side delay).
Default: 0 seconds. Maximum: 15 minutes.
Different from Visibility Timeout: delay is applied at publish time, not at receive time.

7. Amazon SNS

SNS is a fully managed pub/sub messaging service. Publishers send to a topic; all subscribers receive a copy.

7.1 Topics, Subscriptions & Fan-out

Supported Subscriber Protocols: SQS, Lambda, HTTP/HTTPS, Email, Email-JSON, SMS, Mobile Push (APNs, FCM), Kinesis Firehose.

SNS Message Format: JSON (not XML). Contains: MessageId, Subject, Message, UnsubscribeURL, Timestamp.

Fan-out Pattern:
One SNS topic publishes to multiple SQS queues simultaneously, allowing independent parallel consumers.

Best Practice: SNS → SQS fan-out is preferred over SNS → Lambda fan-out because SQS provides buffering, retry, rate control, and a DLQ for downstream Lambda functions.

SNS FIFO Topics:

Strictly ordered delivery to SQS FIFO queues. SQS standard queues can also subscribe, but without ordering or deduplication guarantees. FIFO topics cannot fan out to Lambda, HTTP, or Email.
Built-in deduplication with a 5-minute window.
Throughput: 300 msg/s (3,000 with batching) — same as SQS FIFO.

Message Filtering (Subscription Filter Policies):
Each subscriber can define a JSON filter policy to receive only matching messages based on message attributes.

// This SQS queue only receives messages where type is "order" AND priority is "high"
{
  "type": ["order"],
  "priority": ["high"]
}

Without a filter policy, a subscriber receives every message published to the topic.
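
The matching rule can be sketched as a small function. This is a simplified model covering exact string matches only; real SNS filter policies also support operators such as prefix and numeric matching, which are ignored here. The function name is hypothetical.

```python
def matches_filter_policy(policy: dict, attributes: dict) -> bool:
    """Simplified model of exact-match filtering.

    Every key in the policy must be present in the message attributes (AND),
    and the attribute value must equal one of the listed values (OR).
    """
    for key, allowed_values in policy.items():
        if attributes.get(key) not in allowed_values:
            return False
    return True


policy = {"type": ["order"], "priority": ["high"]}

print(matches_filter_policy(policy, {"type": "order", "priority": "high"}))
print(matches_filter_policy(policy, {"type": "order", "priority": "low"}))
print(matches_filter_policy(policy, {"type": "order"}))
```

Only the first message would be delivered to the subscriber; the other two fail the `priority` condition.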

8. Amazon API Gateway

API Gateway is the managed front door for backend services. It handles authentication, throttling, caching, monitoring, and request/response transformation.

8.1 API Types & Endpoint Types

Feature	REST API	HTTP API	WebSocket API
Cost	Higher	~70% cheaper than REST	Per-message + connection
Latency	Higher	~60% lower than REST	Persistent connection
Usage Plans / API Keys	Yes	No	No
Request Validation	Yes	No	No
Mapping Templates	Yes	No	No
Canary Deployments	Yes	No	No
Resource Policies	Yes	No	No
Use case	Feature-rich public APIs	Low-latency Lambda proxy	Real-time bidirectional (chat, gaming)

Exam Tip: If the question mentions lower cost, lower latency, or simpler Lambda proxy — choose HTTP API. If it requires request validation, usage plans, API keys, or canary deployments — choose REST API.

Endpoint Types:

Type	How Traffic Routes	ACM Certificate Region
Edge-Optimized (default)	Through CloudFront edge locations globally	Must be in us-east-1
Regional	Directly to the region; add CloudFront manually for caching	Same region as the API
Private	Via Interface VPC Endpoint only; add a resource policy	N/A

Exam Trap: For Edge-Optimized APIs, the ACM certificate must be in us-east-1 regardless of the API's deployed region. This is one of the most commonly tested API Gateway gotchas.

8.2 Integration Types

Type	Behavior	Use Case
Lambda Proxy	Full request forwarded as-is. Lambda must format the full HTTP response. No mapping templates.	Most common; maximum flexibility
Lambda Custom	API GW transforms request/response via Velocity mapping templates.	Data transformation before Lambda
AWS Service	Directly integrates with DynamoDB, SQS, SNS etc. No Lambda needed.	Reduce hops; lower cost
HTTP Proxy	Forwards request to an HTTP endpoint unchanged.	Third-party APIs, on-premises
Mock	API GW returns a hardcoded response without any backend call.	Testing, development stubs

Lambda Proxy — Event and Response:

# Event received by Lambda from API Gateway Proxy integration
{
    "httpMethod": "GET",
    "path": "/users/123",
    "pathParameters": {"userId": "123"},
    "queryStringParameters": {"sort": "asc"},
    "headers": {"Authorization": "Bearer ..."},
    "body": None,
    "requestContext": {"stage": "prod"}
}

# Lambda MUST return this structure for API Gateway to forward correctly
{
    "statusCode": 200,
    "headers": {"Content-Type": "application/json"},
    "body": '{"message": "Hello"}',
    "isBase64Encoded": False
}

A 502 Bad Gateway from API Gateway almost always means the Lambda function returned a malformed response: a missing statusCode, a body that is not a string, or an unhandled exception.
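
A minimal handler that always returns a well-formed proxy response might look like the sketch below. The business logic (echoing a `userId` path parameter) is invented for illustration; the point is the response shape and the catch-all that turns failures into a valid 500 instead of a 502.

```python
import json


def lambda_handler(event, context):
    """Minimal proxy-integration handler (illustrative only)."""
    try:
        # pathParameters can be None when the route has no path variables.
        params = event.get("pathParameters") or {}
        user_id = params.get("userId", "unknown")
        status, payload = 200, {"userId": user_id}
    except Exception as exc:  # convert failures into a well-formed 500
        status, payload = 500, {"error": str(exc)}
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        # body must be a string, so serialize the payload explicitly
        "body": json.dumps(payload),
        "isBase64Encoded": False,
    }


response = lambda_handler({"pathParameters": {"userId": "123"}}, None)
print(response["statusCode"], response["body"])
```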

8.3 Stages, Throttling & Caching

Stages and Deployments:

Changes to the API definition are NOT live until you create a deployment to a stage.
Stage URL: https://api-id.execute-api.region.amazonaws.com/stage-name
Stage Variables act as environment variables for a stage. Use them to route to different Lambda aliases or HTTP endpoints per environment.

# Lambda ARN with stage variable — no API Gateway changes needed when Lambda alias updates
arn:aws:lambda:us-east-1:123456789012:function:MyFunc:${stageVariables.lambdaAlias}

Throttling:

Level	Default	Override
Account-level	10,000 RPS, 5,000 burst	Request increase via AWS Support
Stage-level	Inherits account default	Set in Stage Settings
Method-level	Inherits stage default	Per-method override in Stage Settings
Usage Plan (per API Key)	Customer-defined	Set RPS rate and monthly quota

When throttled: HTTP 429 Too Many Requests.
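
API Gateway throttling follows a token bucket model, where the rate is the steady refill and the burst is the bucket size. The sketch below is a toy simulation with deliberately small, hypothetical limits; it is not the service's implementation.

```python
class TokenBucket:
    """Toy token bucket: `rate` tokens refill per second, capped at `burst`."""

    def __init__(self, rate: float, burst: int):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now: float) -> int:
        """Return the HTTP status a request arriving at time `now` would get."""
        # Refill based on elapsed time, never exceeding the burst capacity.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return 200
        return 429


# Small hypothetical limits so the effect is easy to see.
bucket = TokenBucket(rate=2, burst=3)
statuses = [bucket.allow(now=0.0) for _ in range(5)]
print(statuses)               # the burst absorbs the first 3, the rest are throttled
print(bucket.allow(now=1.0))  # one second later, tokens have refilled
```

Clients that receive 429 should retry with exponential backoff rather than immediately, otherwise they keep draining the bucket.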

Caching:

Enabled at the stage level. Capacity: 0.5 GB – 237 GB.
TTL: 0–3600 seconds (default: 300 seconds).
Client cache invalidation: send Cache-Control: max-age=0 header (requires IAM permission).

Usage Plans and API Keys Setup Order:

Create API → require API key on methods → deploy to stage.
Generate or import API keys.
Create usage plan (define throttle rate and monthly quota).
Associate the API stage with the usage plan (apiStages), then link each API key to the plan via CreateUsagePlanKey.

Exam Trap: A newly created API key returns 403 Forbidden until CreateUsagePlanKey has been called to associate the key with a usage plan. Creating the key alone is not sufficient.
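
The setup order can be sketched against an object that behaves like boto3's `apigateway` client. The helper name, plan name, and limits are placeholders, and a small recording stub stands in for the real client so the example runs without AWS access.

```python
def link_key_to_stage(client, api_id: str, stage: str, key_name: str) -> dict:
    """Create an API key and a usage plan, then associate them.

    `client` is expected to behave like boto3's "apigateway" client;
    the plan name and limits below are placeholders.
    """
    key = client.create_api_key(name=key_name, enabled=True)
    plan = client.create_usage_plan(
        name=f"{key_name}-plan",
        apiStages=[{"apiId": api_id, "stage": stage}],
        throttle={"rateLimit": 100.0, "burstLimit": 200},
        quota={"limit": 10000, "period": "MONTH"},
    )
    # Without this call the key exists but requests using it get 403.
    client.create_usage_plan_key(
        usagePlanId=plan["id"], keyId=key["id"], keyType="API_KEY"
    )
    return {"keyId": key["id"], "usagePlanId": plan["id"]}


class RecordingClient:
    """Stand-in for the real client so the sketch runs without AWS access."""

    def __init__(self):
        self.calls = []

    def create_api_key(self, **kwargs):
        self.calls.append("create_api_key")
        return {"id": "key-1"}

    def create_usage_plan(self, **kwargs):
        self.calls.append("create_usage_plan")
        return {"id": "plan-1"}

    def create_usage_plan_key(self, **kwargs):
        self.calls.append("create_usage_plan_key")
        return {"id": kwargs["keyId"]}


stub = RecordingClient()
result = link_key_to_stage(stub, api_id="abc123", stage="prod", key_name="partner-a")
print(stub.calls)
```

With real credentials, the stub would be replaced by `boto3.client("apigateway")`; the API must already be deployed to the stage with API keys required on its methods.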

8.4 Authorizers & Security

Authorizer	Mechanism	Best For
IAM	Caller signs request with SigV4	AWS internal services, EC2, Lambda callers
Lambda (Custom)	Lambda function validates token/headers and returns IAM policy	Third-party JWT, OAuth, custom auth logic
Cognito	API GW validates Cognito JWT automatically — no Lambda needed	Mobile/web app users, social login

Exam Tip for Cognito Authorizer: Cognito handles authentication (who you are). Authorization (what you can do) must be implemented in your backend code — API GW does not enforce method-level permissions based on Cognito user attributes.

Error Codes to Know:

Code	Cause
400	Malformed request or failed request validation
403	IAM authorization denied or WAF rule blocked the request
429	Throttling — account, stage, or method limit exceeded
502	Backend returned malformed response (Lambda format error)
504	Integration timeout: backend exceeded the 29-second default integration limit

8.5 WebSocket & CORS

WebSocket API:

Maintains persistent bidirectional connections between clients and the backend.
Built-in routes: $connect (on connection open), $disconnect (on close), $default (no matching route).
Custom routes matched by a route selection expression on incoming JSON (e.g., $request.body.action).
Server-to-client push: POST to https://{api-id}.execute-api.{region}.amazonaws.com/{stage}/@connections/{connectionId}.
Store connectionId in DynamoDB to enable server-initiated pushes.
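
A minimal sketch of the connection bookkeeping, assuming a single Lambda function handles all routes. An in-memory set stands in for the DynamoDB table, and the API ID and connection ID are made up.

```python
# In-memory stand-in for a DynamoDB table of active connections.
connections = set()


def websocket_handler(event, context):
    """Route $connect / $disconnect / other messages by routeKey."""
    ctx = event["requestContext"]
    route, connection_id = ctx["routeKey"], ctx["connectionId"]
    if route == "$connect":
        connections.add(connection_id)
    elif route == "$disconnect":
        connections.discard(connection_id)
    # Any other route ($default or a custom action) needs no bookkeeping here.
    return {"statusCode": 200}


def connections_url(api_id: str, region: str, stage: str, connection_id: str) -> str:
    """Build the @connections callback URL used for server-to-client pushes."""
    return (
        f"https://{api_id}.execute-api.{region}.amazonaws.com/"
        f"{stage}/@connections/{connection_id}"
    )


websocket_handler(
    {"requestContext": {"routeKey": "$connect", "connectionId": "abc123"}}, None
)
print(connections)
print(connections_url("a1b2c3", "us-east-1", "prod", "abc123"))
```

In a real deployment the set would not survive between invocations, which is why the notes recommend a DynamoDB table for the connection IDs.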

CORS:

For Lambda Proxy integration: Lambda must return Access-Control-Allow-Origin and other CORS headers in its response.
For non-proxy integrations: enable CORS in the API Gateway console (creates an OPTIONS method automatically).
The browser sends a preflight OPTIONS request before the actual method. OPTIONS must return 200 with correct headers.
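
For the Lambda Proxy case, one possible shape of a CORS-aware handler is sketched below. The allowed origin, methods, and headers are assumptions chosen for the example, and the sketch assumes the OPTIONS method is also routed to this function.

```python
import json

# Hypothetical allowed origin; "*" is also common for public APIs.
CORS_HEADERS = {
    "Access-Control-Allow-Origin": "https://app.example.com",
    "Access-Control-Allow-Methods": "GET,POST,OPTIONS",
    "Access-Control-Allow-Headers": "Content-Type,Authorization",
}


def lambda_handler(event, context):
    # Answer the browser's preflight without running business logic.
    if event.get("httpMethod") == "OPTIONS":
        return {"statusCode": 200, "headers": CORS_HEADERS, "body": ""}
    return {
        "statusCode": 200,
        # The actual response must carry the CORS headers too.
        "headers": {**CORS_HEADERS, "Content-Type": "application/json"},
        "body": json.dumps({"message": "Hello"}),
    }


preflight = lambda_handler({"httpMethod": "OPTIONS"}, None)
actual = lambda_handler({"httpMethod": "GET"}, None)
print(preflight["statusCode"], actual["headers"]["Access-Control-Allow-Origin"])
```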

9. Amazon ECS & ECR

9.1 Launch Types & Task Definitions

Feature	EC2 Launch Type	Fargate Launch Type
Infrastructure	You provision and manage EC2 instances	Serverless — AWS manages all infrastructure
Scaling	Two layers: tasks + EC2 instances	Task level only
OS access	Yes	No
Use when	Need OS-level control, maximize cost efficiency	No server management, simpler operations

Task Definitions:

JSON blueprint defining how containers run: image URI, port mappings, CPU, memory, IAM role, environment variables, volumes, logging.
Up to 10 containers per task definition.
Environment variables are defined in the task definition's environment parameter — not in the service definition.
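
To make the blueprint concrete, here is a minimal task definition expressed as a Python dictionary and printed as JSON. The family name, role ARN, image URI, and values are all hypothetical; only a few of the available fields are shown.

```python
import json

task_definition = {
    "family": "orders-api",  # hypothetical name
    "taskRoleArn": "arn:aws:iam::123456789012:role/orders-task-role",
    "containerDefinitions": [
        {
            "name": "web",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/orders:latest",
            "cpu": 256,
            "memory": 512,
            "portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
            # Environment variables live here, inside the container definition.
            "environment": [{"name": "STAGE", "value": "prod"}],
        }
    ],
}

# A task definition may list at most 10 containers.
container_count = len(task_definition["containerDefinitions"])
print(container_count <= 10)
print(json.dumps(task_definition, indent=2))
```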

9.2 IAM, Storage & Auto Scaling

IAM — Two Separate Roles:

EC2 Instance Profile → Used by the ECS Agent on the EC2 host
                       (pull images from ECR, publish logs to CloudWatch)

ECS Task Role        → Used by your application code inside the container
                       (access S3, DynamoDB, Secrets Manager, etc.)

ECS_ENABLE_TASK_IAM_ROLE=true must be set in the ECS Agent config on EC2 launch type hosts.
One task role per task definition for least privilege.

Critical Exam Trap: EC2 Instance Profile and ECS Task Role are entirely separate. The task role is what your application uses. The instance profile is what the ECS daemon uses.

Storage Options:

Bind Mounts: Share data between containers in the same task. Ephemeral — lost when task stops.
EFS: Persistent shared storage across tasks and AZs. Fargate + EFS = fully serverless persistent storage.
S3 cannot be mounted as a file system in ECS. Access S3 via the SDK only.

X-Ray on ECS:

EC2 launch type: X-Ray daemon as a sidecar container (one per task) or daemon on the EC2 host.
Fargate: sidecar container only — no daemon on the host.
Task role requires: xray:PutTraceSegments, xray:PutTelemetryRecords.

10. Amazon EC2, ELB & ASG

10.1 IMDS, Security Groups & ELB Types

Instance Metadata Service (IMDS):

URL (reachable only from within the instance): http://169.254.169.254/latest/meta-data/
IMDSv2 (recommended): two-step — PUT to get session token, then GET with token. Enforced via HttpTokens=required in the Launch Template.
Can retrieve: instance-id, public-ipv4, local-ipv4, IAM role name (NOT the IAM policy document).
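
The two-step IMDSv2 exchange can be sketched with the standard library. The function names are hypothetical; the header names and token path follow the IMDSv2 protocol. `get_metadata` only works when run on an EC2 instance, so the example below builds the requests without sending them.

```python
import urllib.request

IMDS = "http://169.254.169.254/latest"


def token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Step 1: PUT request that asks IMDS for a session token."""
    return urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def metadata_request(path: str, token: str) -> urllib.request.Request:
    """Step 2: GET request that presents the token."""
    return urllib.request.Request(
        f"{IMDS}/meta-data/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def get_metadata(path: str) -> str:
    """Only works on an EC2 instance; elsewhere the address is unreachable."""
    with urllib.request.urlopen(token_request(), timeout=2) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(metadata_request(path, token), timeout=2) as resp:
        return resp.read().decode()


req = token_request()
print(req.get_method(), req.full_url)
```

On an instance, `get_metadata("instance-id")` would return the instance ID. With HttpTokens=required, a plain GET without the token header is rejected.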

Security Groups:

Stateful — return traffic is automatically allowed.
Allow rules only — no explicit DENY rules (use NACLs for deny).
Connection Timeout = Security Group is blocking traffic. Connection Refused = application issue.
To restrict EC2 access to ALB only: configure EC2 SG inbound rule to allow traffic from the ALB Security Group (not a CIDR range).

ELB Type Comparison:

Type	OSI Layer	Static IP	SNI Support	Cross-Zone Default	Key Use Case
ALB	Layer 7 (HTTP)	No (DNS only)	Yes	On — no extra charge	Path/host/query routing, microservices
NLB	Layer 4 (TCP/UDP)	Yes (Elastic IP per AZ)	Yes	Off — charged if enabled	Ultra-high throughput, static IP, UDP
GWLB	Layer 3 (IP)	No	No	Off — charged if enabled	3rd-party security appliances (GENEVE port 6081)
CLB	Layer 4+7	No	No	Off — no extra charge	Legacy only

Deregistration Delay:

ALB/NLB: called Deregistration Delay (default 300s, range 0–3600s).
CLB: called Connection Draining.
Allows in-flight requests to complete before the instance is removed from the target group.
If file uploads fail during scale-in events, increase the Deregistration Delay.

10.2 ASG Scaling Policies

Policy	How It Works	Best For
Target Tracking	Set a target metric value. AWS automatically adds/removes instances to maintain it.	Simplest; most commonly used
Step Scaling	CloudWatch Alarm triggers. Different actions based on alarm magnitude.	Fine-grained control
Scheduled Scaling	Pre-configure capacity changes at specific times.	Predictable recurring traffic patterns
Predictive Scaling	ML-based. Forecasts load from historical patterns. Scales proactively.	Recurring patterns with unknown timing

Scaling Cooldown (default 300s): after a simple scaling activity, the ASG waits for the cooldown to expire before starting another one. Target tracking and step scaling policies rely on instance warm-up instead.
Instance Refresh: rolls out a new AMI across the fleet. Set MinHealthyPercentage to control capacity during rollout.
Lifecycle Hooks: pause instance in Pending:Wait (launch) or Terminating:Wait (terminate) for custom logic.

11. Amazon RDS & Aurora

11.1 RDS vs Aurora

Feature	RDS (MySQL/PostgreSQL)	Aurora
Supported engines	MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, IBM Db2	MySQL-compatible and PostgreSQL-compatible only
Read Replicas	Up to 15 (MySQL, MariaDB, PostgreSQL); up to 5 (Oracle, SQL Server)	Up to 15
Replica lag	Seconds	Sub-10 milliseconds
Failover time	1–2 minutes	Under 30 seconds
Storage	Manual or auto-scaling	Automatically grows in 10 GB increments up to 128 TB
HA copies	2 copies (Multi-AZ)	6 copies across 3 AZs
Backtrack	No	Yes (Aurora MySQL only): rewind the cluster in place to an earlier point in time without restoring from a backup
Choose when	Need Oracle, SQL Server, MariaDB, or cost optimization	High availability and performance on MySQL/PostgreSQL

Multi-AZ vs Read Replicas:

Dimension	Multi-AZ	Read Replicas
Purpose	High availability and disaster recovery	Read scaling
Replication	Synchronous	Asynchronous
Standby serves reads	No — standby is passive	Yes — update connection string to use replica endpoint
Failover	Automatic (single DNS endpoint)	Manual promotion

11.2 RDS Proxy & Integrations

RDS Proxy:

Fully managed connection pool between application and RDS/Aurora.
Critical for Lambda + RDS: Lambda can create thousands of short-lived connections that overwhelm the database. RDS Proxy pools and reuses connections.
Reduces failover time by up to 66%.
Deployed inside a VPC — never publicly accessible.
Can enforce IAM authentication; database credentials are stored in Secrets Manager.

Integration Patterns:

ECS/Fargate: Pass RDS connection string as an environment variable. Store credentials in Secrets Manager.
Elastic Beanstalk: For development, RDS can be inside the EB environment. For production, always create RDS separately — RDS inside EB is deleted when the environment is deleted.
Lambda: Always use RDS Proxy to avoid connection exhaustion. Lambda must be in the same VPC as RDS.

12. Amazon EFS

EFS is a fully managed NFS (NFSv4.1) file system that can be mounted by multiple EC2 instances simultaneously across multiple AZs.

12.1 Performance Modes & Storage Tiers

Performance Modes (set at creation — cannot change later):

General Purpose (default): low latency; suitable for web servers, CMS, development environments.
Max I/O: higher aggregate throughput; suitable for big data, media processing, highly parallel workloads.

Throughput Modes (can change after creation):

Bursting: throughput scales with the amount of data stored.
Provisioned: set a fixed throughput regardless of storage size.
Elastic (recommended): automatically adjusts throughput based on workload. Best for unpredictable traffic.

Storage Tiers:

Standard: frequently accessed files.
EFS Infrequent Access (EFS-IA): lower storage cost; retrieval fee applies. Use lifecycle policy to move files automatically.
Archive: for files accessed only a few times per year.

EFS vs EBS vs S3:

Dimension	EFS	EBS	S3
Access	Multiple EC2 instances across AZs simultaneously	Single EC2 instance (Multi-Attach only on io1/io2, within one AZ)	HTTP/HTTPS from anywhere
OS compatibility	Linux only (POSIX)	Linux and Windows	Any OS, any client
Latency	Low milliseconds	Single-digit milliseconds	100–200 milliseconds
Mount as file system	Yes	Yes	No — accessed via SDK/CLI only
Lambda access	Yes (must be in VPC, use Access Point)	No	Yes (directly)
Use case	Shared content, CMS, home directories	OS volumes, database storage	Object storage, backups, static assets

13. AWS Step Functions

Step Functions orchestrates distributed applications as visual, auditable workflows called State Machines.

13.1 State Types & Error Handling

State Machine Types:

Type	Max Duration	Execution Model	Use Case
Standard	Up to 1 year	Exactly-once	Long-running, auditable, human approval workflows
Express	Up to 5 minutes	At-least-once	High-volume, short-duration, IoT processing

State Types:

State	Purpose
Task	Execute work via Lambda, SNS, SQS, DynamoDB, ECS, Step Functions, and more
Choice	Conditional branching based on input values
Wait	Pause execution for a set duration or until a specific timestamp
Parallel	Execute multiple branches concurrently; waits for all branches to complete
Map	Iterate over an array, applying the same states to each element
Pass	Passes input directly to output; injects static data
Succeed	Ends execution successfully
Fail	Ends execution with a failure

Error Handling:

{
  "Type": "Task",
  "Resource": "arn:aws:lambda:...",
  "Retry": [{
    "ErrorEquals": ["Lambda.TooManyRequestsException"],
    "IntervalSeconds": 2,
    "MaxAttempts": 3,
    "BackoffRate": 2
  }],
  "Catch": [{
    "ErrorEquals": ["States.ALL"],
    "Next": "ErrorHandlerState",
    "ResultPath": "$.error"
  }]
}

Exam Tip: ResultPath: "$.error" preserves the original input and appends error details as a new key. Use this to pass both the original event data and error information to the error handler state.
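
Two behaviors of the state above can be worked through in a few lines: the wait before each retry grows as IntervalSeconds × BackoffRate^(attempt − 1), and ResultPath merges the error into the original input. Both functions are simplified models with hypothetical names, and the input and error values are invented.

```python
def retry_delays(interval_seconds: int, max_attempts: int, backoff_rate: float):
    """Wait before each retry: interval * backoff_rate ** (attempt - 1)."""
    return [interval_seconds * backoff_rate ** n for n in range(max_attempts)]


def apply_result_path(state_input: dict, error_info: dict, key: str = "error") -> dict:
    """Model ResultPath "$.error": keep the input, add the error under one key."""
    return {**state_input, key: error_info}


delays = retry_delays(2, 3, 2)
print(delays)  # seconds waited before retries 1, 2 and 3

merged = apply_result_path(
    {"orderId": "A-1"},
    {"Error": "States.TaskFailed", "Cause": "example cause"},
)
print(merged)
```

With the values from the Retry block (IntervalSeconds 2, MaxAttempts 3, BackoffRate 2) the waits are 2, 4, and 8 seconds, after which the Catch block takes over.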

Wait for Callback (Task Token Pattern):
Step Functions pauses and sends a task token to an external system (via SQS, Lambda, or SNS). The workflow resumes only when the external system calls SendTaskSuccess or SendTaskFailure with the token. Used for human approval workflows and long-running third-party integrations.

14. Amazon Cognito

Cognito provides authentication and authorization for web and mobile applications. Two distinct services work together.

14.1 User Pools vs Identity Pools

┌─────────────────────────────────────────────────────────────────────┐
│                  Cognito Full-Stack Flow                              │
│                                                                       │
│  Mobile App                                                           │
│     │                                                                 │
│     ▼                                                                 │
│  Cognito User Pool (CUP) ────────► JWT Token (ID + Access + Refresh) │
│     │                                                                 │
│     ▼                                                                 │
│  Cognito Identity Pool (CIP) ───► STS AssumeRoleWithWebIdentity      │
│     │                                                                 │
│     ▼                                                                 │
│  Temporary AWS Credentials (AccessKey + SecretKey + SessionToken)    │
│     │                                                                 │
│     ▼                                                                 │
│  Direct SDK access to S3, DynamoDB, API Gateway, and more            │
└─────────────────────────────────────────────────────────────────────┘

Feature	User Pools (CUP)	Identity Pools (CIP)
Purpose	Authentication — who are you?	Authorization — what AWS resources can you access?
Output	JWT tokens (ID token, Access token, Refresh token)	Temporary AWS credentials via STS
Access controls	Your APIs, ALB, API Gateway	AWS services directly (S3, DynamoDB, Kinesis)
Guest access	No	Yes — unauthenticated identities
Federation	Facebook, Google, SAML, OIDC	Cognito User Pools, SAML, social IdPs, developer-authenticated

Fine-Grained Access Control with Identity Pools:

// IAM policy: restrict each user to their own S3 prefix
{
  "Effect": "Allow",
  "Action": "s3:GetObject",
  "Resource": "arn:aws:s3:::my-bucket/${cognito-identity.amazonaws.com:sub}/*"
}

// IAM policy: restrict each user to their own DynamoDB rows
"Condition": {
  "ForAllValues:StringEquals": {
    "dynamodb:LeadingKeys": ["${cognito-identity.amazonaws.com:sub}"]
  }
}
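
How the policy variable scopes access can be illustrated with a simplified model. This is not IAM's evaluation engine; it only substitutes the variable and treats a trailing `/*` as a prefix match. The identity ID and function names are made up.

```python
POLICY_VARIABLE = "${cognito-identity.amazonaws.com:sub}"


def resolve_resource(template: str, identity_id: str) -> str:
    """Substitute the policy variable with the caller's identity ID."""
    return template.replace(POLICY_VARIABLE, identity_id)


def can_get_object(template: str, identity_id: str, object_arn: str) -> bool:
    """Simplified check: treat a trailing /* as 'anything under this prefix'."""
    resource = resolve_resource(template, identity_id)
    return resource.endswith("/*") and object_arn.startswith(resource[:-1])


template = "arn:aws:s3:::my-bucket/${cognito-identity.amazonaws.com:sub}/*"
alice = "us-east-1:11111111-2222-3333-4444-555555555555"  # made-up identity ID

own = can_get_object(template, alice, f"arn:aws:s3:::my-bucket/{alice}/photo.jpg")
other = can_get_object(template, alice, "arn:aws:s3:::my-bucket/someone-else/photo.jpg")
print(own, other)
```

The same single policy serves every user because the variable resolves to a different prefix for each identity.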

Lambda Triggers (User Pools):

Trigger	When It Fires	Common Use
Pre Sign-Up	Before registration completes	Block unwanted registrations
Pre Authentication	Before login	Custom validation logic
Post Confirmation	After email/phone verified	Send welcome email
Pre Token Generation	Before JWT is issued	Add or suppress claims in the token
Migrate User	On first login for unknown users	Silently migrate from legacy user store

Critical: The Cognito Hosted UI custom domain requires an ACM certificate in us-east-1, regardless of the User Pool's region.

15. AWS AppSync

AppSync is a fully managed GraphQL API service. It differs from API Gateway in that it uses GraphQL instead of REST, and it provides built-in real-time data subscriptions via WebSocket.

Feature	API Gateway	AppSync
Protocol	REST, HTTP, WebSocket	GraphQL
Real-time	WebSocket API	Built-in subscriptions
Resolvers	Mapping templates (Velocity)	Direct resolvers to DynamoDB, Lambda, RDS, HTTP
Use case	Standard REST APIs	GraphQL APIs, real-time data sync, offline apps

16. Exam Tips & Quick Reference

Scenario-to-Answer Mapping

Scenario Keyword	Correct Answer
One publisher, multiple independent consumers	SNS → multiple SQS queues (fan-out)
Exactly-once processing, strict ordering	SQS FIFO
Multiple consumers reading the same stream, replay possible	Kinesis Data Streams
Deliver stream data to S3 with transformation	Kinesis Firehose + Lambda transform
Real-time bidirectional communication	API Gateway WebSocket API
Cache DynamoDB reads, reduce read load	DAX (eventually consistent only)
Pause workflow, wait for human approval	Step Functions Standard Workflow + Task Token
Pre-warm Lambda to eliminate cold starts	Provisioned Concurrency
Cap a Lambda function at 0 to disable it	Reserved Concurrency = 0
Lambda takes too long; SQS message reprocessed	Increase visibility timeout to 6× Lambda timeout
Lambda + RDS connection errors at scale	Use RDS Proxy for connection pooling
Route S3 events to multiple services with rich filtering	S3 → EventBridge (not S3 native notifications)
Allow user to upload directly to S3	Pre-signed URL (PUT operation)
Deploy serverless app as infrastructure as code	AWS SAM
Share libraries across multiple Lambda functions	Lambda Layers
API Gateway returning 504	Backend exceeded 29-second integration timeout
API Gateway returning 502	Lambda returned malformed response
API Gateway returning 429	Throttling — increase limits or use exponential backoff
New API key returning 403	Call `CreateUsagePlanKey` to link key to usage plan
Multiple EC2 instances need shared file system	Amazon EFS
Store and auto-rotate database credentials	AWS Secrets Manager
Low-cardinality DynamoDB partition key causing throttling	Write sharding with random suffix
DynamoDB streams not triggering Lambda	Enable streams AND create Event Source Mapping
ECS containers need shared persistent storage (Fargate)	EFS — Fargate + EFS = serverless persistent storage

Common Traps

Lambda timeout vs SQS visibility timeout: Lambda timeout is the max execution time. SQS visibility timeout is how long the message stays hidden from other consumers. Set SQS visibility timeout to at least 6× Lambda timeout to prevent duplicate processing.
GSI vs LSI: LSI must be created at table creation — cannot be added later. GSI can be added anytime. GSI throttling also throttles the main table. GSIs do not support strongly consistent reads.
SQS DLQ placement: For Lambda + SQS Event Source Mapping, the DLQ goes on the SQS queue — not on the Lambda function.
Lambda Destinations vs DLQ: DLQ captures failures only. Lambda Destinations capture both success and failure with full event context.
S3 logging to itself: Logging a bucket to itself creates an infinite loop. Always log to a separate bucket.
DAX and strong consistency: DAX does not support strongly consistent reads. Use direct DynamoDB if latest data is always required.
SAM Transform header: Transform: AWS::Serverless-2016-10-31 is mandatory. Without it, CloudFormation does not recognize SAM resource types.
API Gateway Edge-Optimized cert region: ACM certificate for an Edge-Optimized API must always be in us-east-1.
Cognito CUP vs CIP: User Pools produce JWT tokens (for your API). Identity Pools produce temporary AWS credentials (for AWS services). A JWT alone cannot call S3 or DynamoDB — you need Identity Pools to exchange it for credentials.

Key Terms — Domain 1

Term	One-Line Definition
Cold Start	Latency added when Lambda initializes a new execution environment
Provisioned Concurrency	Pre-warmed Lambda environments that eliminate cold starts
Reserved Concurrency	Guaranteed capacity for one function; also caps its maximum
Visibility Timeout	Duration an SQS message is hidden from other consumers after receipt
Dead-Letter Queue (DLQ)	Destination for messages that failed processing after maxReceiveCount attempts
Partition Key	Attribute that determines which DynamoDB partition stores an item
Sort Key	Enables range queries within a partition; with PK forms composite primary key
RCU / WCU	Read/Write Capacity Units — billing and throughput units for DynamoDB
Pre-signed URL	Time-limited URL granting temporary access to a private S3 object
Fan-out Pattern	One SNS topic delivers each message to multiple SQS queues, which are consumed independently in parallel
Event Source Mapping	Lambda's built-in polling mechanism for SQS, Kinesis, and DynamoDB Streams
Shard	Unit of capacity in Kinesis Data Streams (1 MB/s write, 2 MB/s read)
Enhanced Fan-out	Kinesis feature providing dedicated 2 MB/s per consumer per shard
Task Token	Unique identifier in Step Functions that pauses workflow until returned
Sparse Index	A GSI built on an attribute not present in all items — only those items appear

End of Domain 1: Development with AWS Services. Continue to Domain 2: Security →
