Skip to content

Latest commit

 

History

History
106 lines (84 loc) · 4.67 KB

File metadata and controls

106 lines (84 loc) · 4.67 KB

Secrets & Identity (Implementation Details)*

  1. Secrets Management
    → Tools: HashiCorp Vault, AWS Secrets Manager, K8s SealedSecrets
    → Ban .env in repos; enable pre-commit secret scanning (gitleaks, truffleHog)
    → Auto-rotate keys every 30–90 days via Terraform/Ansible
    → Verify: No secrets in Git history, rotation logs in SIEM

  2. Least Privilege IAM
    → Per-job/per-service identities with exact resource scoping
    → CI/CD runners: READ-only to prod by default
    → Time-boxed elevation with full audit trail
    → Verify: Policy analyzer shows no wildcards, elevation logs reviewed weekly

  3. Zero-Trust Architecture
    → Every service call authenticates + authorizes + logs
    → Deny-by-default network policies (K8s NetworkPolicy, AWS Security Groups)
    → No "trusted internal subnet" assumptions
    → Verify: Network policy audit, failed auth alerts trigger

Encryption & Data Protection

  1. End-to-End Encryption
    → TLS 1.3 in transit (including internal services), AES-256/KMS at rest
    → Automated cert expiry checks, HSTS on edges
    → Verify: SSL Labs A+ rating, KMS key rotation enabled

  2. Field-Level Protection
    → Tokenize/pseudonymize PII at ingest (not in warehouse)
    → Envelope encryption (KMS + pgcrypto) for high-risk columns
    → Never load raw PII into dev/staging environments
    → Verify: PII scanner finds zero unmasked records in non-prod

  3. Database Security Controls
    → Enable TDE (Transparent Data Encryption)
    → Row-Level Security (Postgres RLS, Snowflake) for user/tenant scoping
    → Separate read/write roles, no shared admin accounts
    → Verify: Query logs show RLS enforcement, admin access <1% of queries

Container & Runtime Security

  1. Runtime Hardening
    → Non-root user, read-only filesystem, drop Linux capabilities
    → Seccomp/AppArmor profiles, no shell in prod images
    → Block metadata abuse (AWS IMDSv2 with hop limit=1)
    → Verify: Image scan shows no root processes, metadata endpoint unreachable

  2. Service Mesh & Network Isolation
    → Mutual TLS between all services (Istio, Linkerd, or Envoy sidecars)
    → K8s NetworkPolicy so only approved pods reach databases
    → Verify: mTLS metrics in Prometheus, unauthorized connections blocked

Supply Chain & CI/CD

  1. Supply Chain Security
    → Generate SBOMs for all images/dependencies (Syft)
    → Scan in CI (Trivy, Grype, pip-audit), fail builds on critical CVEs
    → Sign images (cosign), enforce signature verification at deploy
    → Pin base images to SHA256 digests
    → Verify: Zero unsigned images in prod, SBOM in artifact registry

  2. CI/CD Isolation
    → Runners have READ-only prod access by default
    → Separate deploy pipelines for dev/staging/prod
    → Require code review + approval for prod deploys
    → Verify: Deployment logs show approver identity, no direct commits to prod

Monitoring & Incident Response

  1. Detection & Alerting
    → Alert on: failed Vault access, unusual S3 reads, anomalous SQL, IAM policy changes
    → Runtime sensors (Falco) for container escape patterns
    → Verify: Weekly alert drills, <5 min mean time to detect (MTTD)

  2. Log Hygiene
    → Structured logs with auto-redaction (tokens, passwords, PII)
    → Encrypted at rest, short retention (90 days max for sensitive logs)
    → Verify: Log sample contains zero secrets, PII redaction rate >99%

  3. Data Egress Control
    → Restrict outbound to approved domains/S3 buckets (egress firewall)
    → DLP rules on exports, watermark every extract
    → Data catalog tracks owner + expiry for all extracts
    → Verify: Blocked egress attempts logged, catalog coverage >95%

Backup & Recovery

  1. Disaster-Proof Backups
    → Cross-account/region storage with different KMS keys
    → Immutability/Object Lock enabled where available
    → Restore drills quarterly with pass/fail criteria
    → Verify: Latest backup <24 hours old, last restore drill passed

Culture & Process

  1. Incident Readiness
    → Written runbooks for common scenarios
    → Break-glass roles (short TTL, full audit)
    → Key revocation playbook, 24-hour secret rotation SLA
    → Verify: Tabletop exercise quarterly, runbook updated after each incident

  2. Environment Separation
    → Prod: full encryption + audit, dedicated AWS account/VPC
    → Dev: synthetic or masked data only, separate network
    → Test: speed over security, zero prod credentials
    → Verify: No prod data in dev (DLP scan), network segmentation tested