EKS Ephemeral Lab

Production-style AWS/EKS DevOps learning platform.

Timeline: Part-time weekends (12 hrs/week) | Budget: $250/month | Progress: Weeks 0-15 ✅ (Weeks 14, 16-18 skipped)

Quick Start

make up      # Create infrastructure + configure kubectl
make down    # Destroy everything

Completed (Weeks 0-15)

Week	Topic	Status
0	AWS Setup, Billing, Terraform State	✅
1	VPC Foundation	✅
2	EKS Cluster	✅
3	GitOps (Argo CD)	✅
4	AWS Load Balancer Controller	✅
5	ExternalDNS	✅
6	TLS (cert-manager)	✅
7	CI/CD Build (ECR, GitHub Actions)	✅
8	CI/CD Deploy (GitOps flow)	✅
9	Stateful: DynamoDB	✅
10	Observability: Metrics (Container Insights)	✅
11	Observability: Logs & Traces	✅
12	Scaling: Karpenter	✅
13	Security & Policy Enforcement	✅
14	Async: SQS/SNS Workers	⏭️
15	Resilience & Chaos	✅
16	EKS Upgrade	⏭️
17	Multi-Region & DR	⏭️
18	Cost Optimization & Wrap-Up	⏭️

State Backend: S3 ryan-eks-lab-tfstate + DynamoDB eks-lab-tfstate-lock

Week Details

Week 10 – Observability: Metrics ✅

Goal: CloudWatch Container Insights (originally planned: AMP + AMG + ADOT)

AMP cost ~$20/day, too expensive for a $250/month budget. Switched to Container Insights via the amazon-cloudwatch-observability addon.

Note: Before descoping, hit an ADOT bug where relabel_configs couldn't construct __address__ for custom app metrics. See docs/week10-guestbook-metrics-investigation.md.

Cost: ~$3-5/month

Week 11 – Observability: Logs & Traces ✅

Goal: Centralized logging and distributed tracing

Fluent Bit via amazon-cloudwatch-observability addon → CloudWatch Logs
Structured JSON logging in guestbook app (user, action, trace_id, client_ip)
X-Ray tracing configured via CloudWatch Agent OTLP endpoints
Log retention set to 7 days (Terraform-managed)
Security review: GuardDuty findings triage, CloudTrail review
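
A minimal sketch of the structured JSON logging described above, assuming a Python app and the standard `logging` module. Only the four field names (user, action, trace_id, client_ip) come from this README; the logger name, the `JsonFormatter` class, and the sample values are hypothetical and not taken from the guestbook source.

```python
import json
import logging

# Field names come from the list above; everything else is illustrative.
CONTEXT_FIELDS = ("user", "action", "trace_id", "client_ip")


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Values passed via `extra=` become attributes on the record;
        # fields that were not supplied are emitted as null.
        for field in CONTEXT_FIELDS:
            payload[field] = getattr(record, field, None)
        return json.dumps(payload)


def build_logger(name: str = "guestbook") -> logging.Logger:
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


if __name__ == "__main__":
    log = build_logger()
    log.info(
        "entry created",
        extra={
            "user": "alice",            # hypothetical sample values
            "action": "sign",
            "trace_id": "abc123",
            "client_ip": "203.0.113.7",
        },
    )
```

Emitting one JSON object per line keeps the fields individually addressable once the lines land in CloudWatch Logs, for example when filtering by trace_id.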

Cost: ~$5/session (CloudWatch Logs ~$0.50/GB)

Week 12 – Scaling: Karpenter ✅

Goal: Automatic node provisioning

Create Karpenter IAM role (Pod Identity, not IRSA)
Install Karpenter Helm chart (v1.0.8)
Create NodePool (Spot + On-Demand, t4g/m6g/c6g Graviton families)
Create EC2NodeClass (AL2023, 20GB gp3, IMDSv2)
SQS queue for Spot interruption handling
Consolidation policy for cost optimization
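
A hedged sketch of what the NodePool and EC2NodeClass above might look like, written as Python dicts and printed as JSON (which is also valid YAML). Field names follow the Karpenter v1 API as I understand it and should be checked against the installed chart version. The resource names, node role name, discovery tag, and `consolidateAfter` value are assumptions, not values from this repo.

```python
import json

# Hypothetical names; the lab's real values live in its Terraform/Helm config.
NODE_CLASS_NAME = "default"
NODE_ROLE_NAME = "eks-lab-karpenter-node"               # assumption
DISCOVERY_TAG = {"karpenter.sh/discovery": "eks-lab"}   # assumption

ec2_node_class = {
    "apiVersion": "karpenter.k8s.aws/v1",
    "kind": "EC2NodeClass",
    "metadata": {"name": NODE_CLASS_NAME},
    "spec": {
        "role": NODE_ROLE_NAME,
        "amiSelectorTerms": [{"alias": "al2023@latest"}],        # AL2023
        "subnetSelectorTerms": [{"tags": DISCOVERY_TAG}],
        "securityGroupSelectorTerms": [{"tags": DISCOVERY_TAG}],
        "blockDeviceMappings": [
            {
                "deviceName": "/dev/xvda",
                "ebs": {"volumeSize": "20Gi", "volumeType": "gp3"},  # 20GB gp3
            }
        ],
        "metadataOptions": {"httpTokens": "required"},           # IMDSv2 only
    },
}

node_pool = {
    "apiVersion": "karpenter.sh/v1",
    "kind": "NodePool",
    "metadata": {"name": "default"},
    "spec": {
        "template": {
            "spec": {
                "nodeClassRef": {
                    "group": "karpenter.k8s.aws",
                    "kind": "EC2NodeClass",
                    "name": NODE_CLASS_NAME,
                },
                "requirements": [
                    # Allow both Spot and On-Demand capacity.
                    {"key": "karpenter.sh/capacity-type", "operator": "In",
                     "values": ["spot", "on-demand"]},
                    # Graviton families listed above.
                    {"key": "karpenter.k8s.aws/instance-family", "operator": "In",
                     "values": ["t4g", "m6g", "c6g"]},
                    {"key": "kubernetes.io/arch", "operator": "In",
                     "values": ["arm64"]},
                ],
            }
        },
        # Consolidation: remove or replace empty/underused nodes.
        "disruption": {
            "consolidationPolicy": "WhenEmptyOrUnderutilized",
            "consolidateAfter": "1m",   # assumption
        },
    },
}


def render() -> str:
    """Return both manifests as a JSON array for inspection."""
    return json.dumps([ec2_node_class, node_pool], indent=2)


if __name__ == "__main__":
    print(render())
```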

Cost: ~$4/session (potential Spot savings)

Week 13 – Security & Policy Enforcement ✅

Goal: Admission control and secrets management

Install Kyverno
Create policies: no :latest, require limits, no privileged, require labels
Add Trivy to CI pipeline (fail on HIGH/CRITICAL)
Install External Secrets Operator
Sync secret from Secrets Manager → K8s
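
As an illustration of the "no :latest" policy above, here is a sketch of a Kyverno ClusterPolicy expressed as a Python dict, modeled on Kyverno's published disallow-latest-tag sample. The policy and rule names are assumptions, not the names used in this repo. The `image_allowed` helper is a hypothetical local approximation of the pattern for quick checks; it is not Kyverno's matching engine.

```python
import json
from fnmatch import fnmatch

DISALLOWED_PATTERN = "*:latest"

disallow_latest = {
    "apiVersion": "kyverno.io/v1",
    "kind": "ClusterPolicy",
    "metadata": {"name": "disallow-latest-tag"},     # assumed name
    "spec": {
        "validationFailureAction": "Enforce",        # reject, not just audit
        "background": True,
        "rules": [
            {
                "name": "validate-image-tag",        # assumed name
                "match": {"any": [{"resources": {"kinds": ["Pod"]}}]},
                "validate": {
                    "message": "Using the ':latest' image tag is not allowed.",
                    # "!" negates the pattern: image must NOT match *:latest.
                    "pattern": {
                        "spec": {"containers": [{"image": "!" + DISALLOWED_PATTERN}]}
                    },
                },
            }
        ],
    },
}


def image_allowed(image: str) -> bool:
    """Local approximation: True if the image does not end in ':latest'."""
    return not fnmatch(image, DISALLOWED_PATTERN)


if __name__ == "__main__":
    print(json.dumps(disallow_latest, indent=2))
    for img in ("nginx:latest", "nginx:1.27"):
        print(img, "allowed" if image_allowed(img) else "blocked")
```

Note that an image with no tag at all also resolves to latest; this pattern alone does not catch that case, so a second rule requiring an explicit tag would be needed.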

Cost: ~$5/session (Secrets Manager ~$0.40/secret/month)

Week 14 – Async: SQS/SNS Workers ⏭️ SKIPPED

Goal: Event-driven architecture

Status: Skipped - Already familiar with SQS/SNS patterns; doesn't add meaningful functionality to guestbook app.

Cost: $0 (skipped)

Week 15 – Resilience & Chaos ✅

Goal: Understand failure modes

Add PodDisruptionBudgets (minAvailable: 2 for guestbook)
Manual chaos: delete pods, drain nodes (runbook)
AWS FIS experiment: terminate EC2 instance
Document runbooks (node failure, crashloop debugging)
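
To make the PodDisruptionBudget item concrete, here is a small sketch: a PDB manifest built as a Python dict, plus the arithmetic for how many pods a voluntary disruption (such as a node drain) may evict. `minAvailable: 2` comes from this README; the `app: guestbook` label selector and the resource name are assumptions.

```python
import json


def pdb_manifest(name: str, app_label: str, min_available: int) -> dict:
    """Build a policy/v1 PodDisruptionBudget for pods labeled app=<app_label>."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            "minAvailable": min_available,
            "selector": {"matchLabels": {"app": app_label}},  # assumed label
        },
    }


def allowed_disruptions(healthy_pods: int, min_available: int) -> int:
    """Pods that may be evicted voluntarily without violating the budget."""
    return max(0, healthy_pods - min_available)


if __name__ == "__main__":
    print(json.dumps(pdb_manifest("guestbook", "guestbook", 2), indent=2))
    for replicas in (2, 3, 4):
        print(f"{replicas} healthy pods -> "
              f"{allowed_disruptions(replicas, 2)} evictions allowed")
```

With minAvailable: 2, a deployment running exactly 2 replicas allows zero evictions, so a node drain will block until another replica becomes healthy elsewhere. Running at least 3 replicas leaves room for one eviction at a time.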

Cost: ~$12/session

Week 16 – EKS Upgrade ⏭️ SKIPPED

Goal: Safe upgrade procedures

Status: Skipped - New role doesn't require EKS; lab objectives met.

Cost: $0 (skipped)

Week 17 – Multi-Region & DR ⏭️ SKIPPED

Goal: Basic disaster recovery

Status: Skipped - New role doesn't require EKS; lab objectives met.

Cost: $0 (skipped)

Week 18 – Cost Optimization & Wrap-Up ⏭️ SKIPPED

Goal: Production-ready cost controls and documentation

Status: Skipped - New role doesn't require EKS; lab objectives met.

Cost: $0 (skipped)

Success Criteria (Week 18)

✅ Spin up full platform in <30 min
✅ Deploy via GitOps, zero manual kubectl
✅ Automatic HTTPS + DNS
✅ Metrics, logs, traces observable
✅ Graceful failure handling
✅ Security policies enforced
✅ DB + queue connectivity
✅ Destroy cleanly with one command
✅ Understand and explain every component

Repo Structure

infra/           # Terraform
k8s/             # Helm/manifests
  argocd/        # Argo CD config + Applications
  guestbook/     # Sample app
scripts/         # up.sh, down.sh
docs/            # Week-specific notes

Tagging Convention

All resources tagged with: project, env, owner, created_at, ttl_hours
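
A sketch of how the tagging convention could be built and checked in code. The five tag keys come from this README; the timestamp format, the helper names, and the sample values are assumptions for illustration, and the lab's actual tags are set in Terraform.

```python
from datetime import datetime, timedelta, timezone

REQUIRED_TAGS = ("project", "env", "owner", "created_at", "ttl_hours")
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"   # assumed format (UTC)


def build_tags(project, env, owner, ttl_hours, now=None):
    """Return the full tag set, with created_at stamped in UTC."""
    now = now or datetime.now(timezone.utc)
    return {
        "project": project,
        "env": env,
        "owner": owner,
        "created_at": now.strftime(TIMESTAMP_FORMAT),
        "ttl_hours": str(ttl_hours),   # tag values are strings
    }


def missing_tags(tags):
    """List required tag keys that are absent from a resource's tags."""
    return [key for key in REQUIRED_TAGS if key not in tags]


def is_expired(tags, now=None):
    """True once created_at + ttl_hours has passed."""
    now = now or datetime.now(timezone.utc)
    created = datetime.strptime(tags["created_at"], TIMESTAMP_FORMAT)
    created = created.replace(tzinfo=timezone.utc)
    return now >= created + timedelta(hours=int(tags["ttl_hours"]))


if __name__ == "__main__":
    tags = build_tags("eks-lab", "dev", "ryan", ttl_hours=8)  # sample values
    print(tags)
    print("missing:", missing_tags(tags))
    print("expired:", is_expired(tags))
```

A check like `is_expired` is one way a cleanup job could find resources that outlived their session, which matters for an ephemeral lab on a fixed monthly budget.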

Security Baseline

MFA on root, IAM user for daily work
Security Hub, GuardDuty, Config enabled
IRSA for pod-to-AWS access (Karpenter uses EKS Pod Identity instead)
HTTPS for public endpoints
No 0.0.0.0/0 except documented ALB

