Cloud-native architectures favor managed control planes, ephemeral workloads, and event-driven patterns that fundamentally change security models compared to traditional infrastructure. Security engineers design cloud-native applications with identity-bound access controls, policy-driven security enforcement, and minimal attack surfaces that leverage platform security capabilities while addressing unique cloud-native threats.
The Cloud-Native Security Paradigm Shift
The transition to cloud-native architectures requires fundamental changes to security approaches:
- Infrastructure model: Long-lived servers → Ephemeral functions and containers
- Access control: Network-based perimeters → Identity-based zero-trust access
- Application architecture: Monolithic applications → Distributed microservices
- Security enforcement: Immutable infrastructure with policy-driven controls
- Visibility: Comprehensive observability across distributed system behavior
Serverless and Function Security
Least Privilege IAM per Function
Serverless functions require granular permission management to minimize blast radius during compromise. Follow these principles:
Permission Scoping Best Practices:
- Grant minimal IAM permissions required for specific function purpose
- Use separate execution roles per function (avoid shared roles)
- Scope permissions to individual resources (specific S3 buckets, DynamoDB tables, SQS queues)
- Avoid wildcard permissions that create excessive blast radius
- Implement resource-based policies for defense-in-depth
Advanced Permission Controls:
- Permission boundaries: Cap the maximum permissions a role can ever be granted, which limits privilege escalation
- Regular audits: Identify and remove unused permissions over time
- Dual authorization: Require access to be allowed by both the function's execution role and the target resource's policy
Learn more about AWS IAM best practices and Azure Function security.
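As a rough illustration of the scoping rules above, the following sketch flags `Allow` statements that use wildcard actions or resources in an IAM-style policy document. The function name and both sample policies are hypothetical; a real audit would also need to consider conditions, `NotAction`, and resource-based policies.

```python
from typing import Any


def find_wildcard_statements(policy: dict[str, Any]) -> list[str]:
    """Return one finding per wildcard action or resource in Allow statements."""
    findings: list[str] = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a policy may hold a single statement object
        statements = [statements]
    for index, stmt in enumerate(statements):
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        resources = stmt.get("Resource", [])
        actions = [actions] if isinstance(actions, str) else actions
        resources = [resources] if isinstance(resources, str) else resources
        label = stmt.get("Sid", f"statement[{index}]")
        for action in actions:
            if action == "*" or action.endswith(":*"):
                findings.append(f"{label}: wildcard action {action!r}")
        for resource in resources:
            if resource == "*":
                findings.append(f"{label}: wildcard resource")
    return findings


# Hypothetical execution-role policy scoped to one table (no findings expected).
scoped_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadOrdersTable",
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:Query"],
            "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/Orders",
        }
    ],
}

# Hypothetical over-broad policy (two findings expected).
broad_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "TooBroad", "Effect": "Allow", "Action": "s3:*", "Resource": "*"}
    ],
}
```

A check like this could run in CI against each function's role definition, failing the build when the returned list is non-empty.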
Event Source Protection
Serverless functions trigger from multiple event sources, each requiring specific security controls:
| Event Source Type | Security Controls | Threat Mitigation |
|---|---|---|
| Message Queues (SQS, Service Bus) | Resource policies, encryption at rest/transit | Unauthorized event injection |
| Storage Buckets (S3, Blob Storage) | Bucket policies, event filtering | Malicious object uploads |
| API Gateways | Authentication, rate limiting, WAF | DDoS, injection attacks |
| Message Topics (SNS, Event Grid) | Subscription policies, message validation | Event spoofing |
Event Source Security Implementation:
- Access control: Restrict which principals can publish events via resource policies
- Encryption: Protect event data in transit and at rest
- Validation: Verify event authenticity and structure before processing
- Rate limiting: Prevent resource exhaustion from event floods
- Dead letter queues: Capture failed events for investigation without blocking processing
Reference AWS Lambda event source security and OWASP Serverless Top 10 for comprehensive guidance.
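The validation step can be sketched as follows, assuming the publisher attaches an HMAC-SHA256 signature computed with a shared key. The required field names and the signing scheme are assumptions made for illustration; managed event sources often provide their own authenticity mechanisms, which should be preferred where available.

```python
import hashlib
import hmac
import json

# Hypothetical minimal schema every event must satisfy.
REQUIRED_FIELDS = {"event_id", "source", "payload"}


def sign_event(body: bytes, key: bytes) -> str:
    """Compute the hex HMAC-SHA256 signature a trusted publisher would attach."""
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def validate_event(body: bytes, signature: str, key: bytes) -> dict:
    """Check authenticity first, then structure; raise ValueError on any failure."""
    expected = sign_event(body, key)
    # compare_digest runs in constant time to avoid leaking match position.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        raise ValueError("signature mismatch")
    try:
        event = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ValueError("body is not valid JSON") from exc
    if not isinstance(event, dict):
        raise ValueError("event must be a JSON object")
    missing = REQUIRED_FIELDS - event.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return event
```

Verifying the signature before parsing keeps untrusted input away from downstream logic; events that raise `ValueError` are natural candidates for a dead letter queue.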
Secrets Management
Why Environment Variables Are Insufficient:
| Storage Method | Encryption | Access Control | Rotation | Audit Logging | Exposure Risk |
|---|---|---|---|---|---|
| Environment Variables | ❌ | Limited | Manual redeploy | ❌ | High (visible in logs, console) |
| Secrets Manager | ✅ | Fine-grained | Automatic | ✅ | Low (encrypted, audited) |
Secrets Management Best Practices:
- Retrieve secrets at runtime from AWS Secrets Manager, Azure Key Vault, or Google Secret Manager
- Avoid environment variables for sensitive data (visible to anyone with function read permissions)
- Implement fine-grained access control with separate permissions for reading vs. managing secrets
- Enable automatic rotation without function redeployment
- Cache secrets within function execution contexts to reduce API calls
- Set reasonable refresh intervals to balance security and performance
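The caching and refresh-interval practices above might look like the sketch below. `SecretCache` and its `fetch` callable are hypothetical names; in a real function, `fetch` would wrap a call to the secrets service, and the cache instance would live at module scope so it survives across warm invocations.

```python
import time
from typing import Callable


class SecretCache:
    """Cache secret values in memory and refetch them after a TTL expires."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch          # hypothetical callable that reads from the secret store
        self._ttl = ttl_seconds      # refresh interval: shorter picks up rotations sooner
        self._clock = clock          # injectable so the TTL logic can be tested
        self._entries: dict[str, tuple[float, str]] = {}

    def get(self, name: str) -> str:
        now = self._clock()
        cached = self._entries.get(name)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]         # still fresh: no API call
        value = self._fetch(name)    # missing or expired: fetch and store
        self._entries[name] = (now, value)
        return value
```

A shorter TTL reduces the window in which a rotated secret is stale, at the cost of more calls to the secret store.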
Managed Service Security
Private Connectivity
Eliminate public internet exposure through private networking configurations:
Private Connectivity Options:
| Connectivity Type | Implementation | Use Case | Security Benefit |
|---|---|---|---|
| Private Endpoints | AWS PrivateLink, Azure Private Link | Managed services (S3, DynamoDB) | Private IP within VPC, no internet exposure |
| VPC-Native Services | RDS in VPC, ElastiCache | Databases, caches | Network-level access controls (security groups, NACLs) |
| Service Endpoints | VPC endpoints | Regional services | Traffic stays within cloud backbone |
| VPN/Bastion Access | Site-to-Site VPN, bastion hosts | Administrative access | Controlled entry points |
Implementation Priorities:
- Disable public endpoints wherever possible
- Deploy VPC-native services for databases and caches
- Implement private endpoints for managed services
- Apply service-specific firewall rules restricting access to specific IP ranges
- Use IP allowlisting + authentication for defense-in-depth when public endpoints are required
Audit Logging and Monitoring
Enable comprehensive audit logging across all managed services to detect security incidents and maintain compliance:
Critical Audit Log Categories:
- Data access logs: Query patterns, object access, read/write operations
- Authentication logs: Login attempts, credential usage, MFA events
- Configuration changes: Schema modifications, permission updates, lifecycle policies
- Administrative operations: Service configuration, backup/restore, scaling events
Logging Infrastructure:
- Enable service-specific audit logs
- Centralize log aggregation for cross-service correlation and long-term retention
- Configure monitoring alerts for anomalous patterns indicating compromise or data exfiltration
- Implement log retention policies meeting compliance requirements (GDPR, SOC 2, HIPAA)
Multi-Tenancy and Data Isolation
Managed services use varying tenancy models with different security implications:
Tenancy Model Comparison:
| Tenancy Model | Isolation Level | Cost | Use Case | Risk Profile |
|---|---|---|---|---|
| Multi-tenant shared | Logical (row/column level) | Low | Non-sensitive workloads | Higher (shared infrastructure) |
| Multi-tenant with per-tenant keys | Cryptographic | Medium | Moderate sensitivity | Medium (cryptographic boundaries) |
| Dedicated instances | Physical | High | Highly sensitive data | Lower (isolated infrastructure) |
| Single-tenant | Complete | Highest | Regulated data (HIPAA, PCI-DSS) | Lowest (no sharing) |
Data Isolation Techniques:
- Row-level security (RLS): Enforce tenant filters in the database so that application vulnerabilities cannot expose another tenant's rows
- Column-level encryption: Protect sensitive fields with per-tenant encryption keys
- Per-tenant encryption keys: Enable cryptographic isolation with separate key management
- Network isolation: Dedicated VPCs or subnets per tenant
Service Selection Strategy:
- Classify data sensitivity using regulatory requirements and business impact
- Match tenancy model to data classification (dedicated for PII/PHI, multi-tenant for public data)
- Implement defense-in-depth with multiple isolation layers
- Balance security requirements with cost constraints
Integrated CSPM, CWPP, and CIEM
Cloud-Native Application Protection Platforms (CNAPPs) unify CSPM, CWPP, and CIEM into a single platform:
CNAPP Component Breakdown:
| Component | Full Name | Primary Function | Key Capabilities |
|---|---|---|---|
| CSPM | Cloud Security Posture Management | Configuration assessment | Detect misconfigurations, compliance monitoring, policy enforcement |
| CWPP | Cloud Workload Protection Platform | Runtime protection | Vulnerability scanning, malware detection, behavioral monitoring |
| CIEM | Cloud Infrastructure Entitlement Management | Identity analysis | Excessive privilege detection, unused permission identification |
CSPM Detection Examples:
- Public storage buckets (S3, Blob Storage)
- Overly permissive security groups
- Disabled audit logging
- Unencrypted data stores
- Non-compliant resource configurations
Integration Benefits:
- Attack path analysis: Correlate configuration weaknesses + excessive permissions + vulnerable workloads
- Risk-based prioritization: Focus remediation on highest-impact issues
- Unified visibility: Single pane of glass across infrastructure, workloads, and identities
- Automated remediation: Policy-driven fixes for common misconfigurations
Leading CNAPP platforms include Wiz, Prisma Cloud, Orca Security, and native cloud solutions like AWS Security Hub.
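To make the attack path and prioritization ideas above concrete, here is a toy sketch that correlates three finding types per resource. The data model and the scoring (one point per finding type) are assumptions for illustration only; commercial platforms use much richer graph-based analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceFindings:
    resource_id: str
    publicly_exposed: bool         # configuration weakness (CSPM-style finding)
    has_known_vulnerability: bool  # vulnerable workload (CWPP-style finding)
    excessive_permissions: bool    # over-privileged identity (CIEM-style finding)


def risk_score(findings: ResourceFindings) -> int:
    """Count how many finding types apply to the resource (0 to 3)."""
    return sum(
        (
            findings.publicly_exposed,
            findings.has_known_vulnerability,
            findings.excessive_permissions,
        )
    )


def has_attack_path(findings: ResourceFindings) -> bool:
    """A resource is treated as an attack path only when all three findings combine."""
    return risk_score(findings) == 3


def prioritize(items: list[ResourceFindings]) -> list[ResourceFindings]:
    """Order resources by descending score, breaking ties by resource ID."""
    return sorted(items, key=lambda f: (-risk_score(f), f.resource_id))
```

The point of the correlation is that a resource with all three findings usually deserves attention before several resources that each have only one.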
Infrastructure-as-Code Scanning
Shift security left by detecting misconfigurations before deployment:
IaC Scanning Tools:
| Tool | Supported IaC Formats | Key Features | Integration |
|---|---|---|---|
| Checkov | Terraform, CloudFormation, Kubernetes, ARM | 1000+ policies, custom policies | CI/CD, IDE, pre-commit |
| tfsec | Terraform | Fast scanning, custom checks | GitHub Actions, GitLab CI |
| Terrascan | Terraform, Kubernetes, Helm, Dockerfiles | 500+ policies, compliance frameworks | CI/CD pipelines |
| Snyk IaC | Terraform, CloudFormation, Kubernetes | Developer-first, fix suggestions | IDE, SCM, CI/CD |
Common Misconfiguration Detection:
- Unencrypted storage (S3, EBS, databases)
- Public network access (0.0.0.0/0 ingress rules)
- Missing audit logging
- Excessive IAM permissions (wildcard actions/resources)
- Non-compliant encryption algorithms
- Missing backup configurations
Implementation Workflow:
- Integrate IaC scanning as CI/CD gates to prevent deployment of insecure infrastructure
- Configure custom policies for organization-specific requirements (tagging, instance types, regions)
- Provide developer feedback through pull request comments and IDE integrations
- Set enforcement levels: Block critical issues, warn on medium/low severity
- Track remediation metrics to measure security posture improvement
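The gate and enforcement-level steps above can be sketched with a single custom check. The resource shape mirrors Terraform security group attribute names (`ingress`, `from_port`, `cidr_blocks`) but is assumed to be already parsed into a dictionary; the severity assignments are illustrative assumptions, not the defaults of any scanner listed above.

```python
from typing import Any

# Assumed enforcement level: only these severities fail the pipeline.
BLOCKING_SEVERITIES = {"critical"}
# Assumed set of administrative ports that should never be open to the internet.
ADMIN_PORTS = {22, 3389}


def check_security_group(resource: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (severity, message) pairs for ingress rules open to 0.0.0.0/0."""
    issues: list[tuple[str, str]] = []
    for rule in resource.get("ingress", []):
        if "0.0.0.0/0" in rule.get("cidr_blocks", []):
            port = rule.get("from_port")
            severity = "critical" if port in ADMIN_PORTS else "medium"
            issues.append((severity, f"ingress from 0.0.0.0/0 on port {port}"))
    return issues


def gate_passes(issues: list[tuple[str, str]]) -> bool:
    """Block on critical issues; lower severities are reported but do not fail the gate."""
    return not any(severity in BLOCKING_SEVERITIES for severity, _ in issues)
```

In a pipeline, the issue list would be posted as pull request feedback, and `gate_passes` would determine the job's exit status.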
Configuration drift creates security gaps when infrastructure deviates from IaC-defined state:
Drift Detection Strategies:
| Approach | Detection Method | Remediation | Use Case |
|---|---|---|---|
| Continuous scanning | Compare actual vs. desired state | Manual review + approval | Production environments |
| Scheduled reconciliation | Periodic Terraform plan/apply | Automatic for low-risk | Development environments |
| Event-driven detection | CloudWatch/EventBridge triggers | Immediate alerting | Critical resources |
| GitOps-based | ArgoCD, Flux drift detection | Automatic sync | Kubernetes workloads |
Auto-Remediation Best Practices:
- Classify drift severity:
  - Critical: Security group changes, IAM policy modifications → Immediate alert + manual review
  - High: Encryption settings, logging configuration → Approval workflow
  - Low: Tags, descriptions → Automatic remediation
- Implement approval workflows for high-impact changes to prevent service disruptions
- Track drift metrics by team/service to identify process issues requiring workflow improvements
- Use drift as a signal for security incidents (unauthorized changes) vs. operational issues (emergency fixes)
Tools: Terraform Cloud drift detection, AWS Config, Azure Policy
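The severity classification above can be expressed as a small lookup, sketched below. The attribute prefixes and action names are hypothetical, and routing unknown attributes to the approval workflow is an assumed conservative default rather than something the guidance prescribes.

```python
# (attribute prefix, severity, remediation action); names are illustrative.
DRIFT_RULES = [
    ("security_group", "critical", "alert_and_manual_review"),
    ("iam_policy", "critical", "alert_and_manual_review"),
    ("encryption", "high", "approval_workflow"),
    ("logging", "high", "approval_workflow"),
    ("tags", "low", "auto_remediate"),
    ("description", "low", "auto_remediate"),
]

# Assumption: anything unclassified is handled like a high-severity change.
DEFAULT_CLASSIFICATION = ("high", "approval_workflow")


def classify_drift(attribute: str) -> tuple[str, str]:
    """Map a drifted attribute path to (severity, action) using prefix rules."""
    for prefix, severity, action in DRIFT_RULES:
        if attribute.startswith(prefix):
            return severity, action
    return DEFAULT_CLASSIFICATION
```

Keeping the rules as data makes it straightforward to review and extend them alongside the infrastructure code.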
Supply Chain Security
Image and Function Signing
Cryptographic signing ensures artifact integrity and provenance throughout the deployment pipeline:
Signing Technologies:
| Technology | Artifact Type | Key Features | Adoption |
|---|---|---|---|
| Docker Content Trust | Container images | Built into Docker, Notary-based | Mature |
| Sigstore/Cosign | Container images, binaries | Keyless signing, transparency log | Growing |
| Notary | Container images | TUF-based, registry integration | Mature |
| AWS Signer | Lambda functions, IoT code | Managed code signing | AWS-specific |
| Azure Code Signing | Functions, executables | Managed certificates | Azure-specific |
Implementation Steps:
- Sign artifacts in CI/CD pipelines using build system credentials
- Store signatures in registries alongside artifacts
- Enforce signature verification via admission controllers (Kubernetes) or deployment policies
- Rotate signing keys regularly with automated key management
- Audit signature verification failures as potential supply chain attacks
Benefits:
- Prove artifacts were built through approved pipelines
- Detect tampering between build and deployment
- Prevent deployment of unsigned or invalidly signed artifacts
Provenance Attestations
Document the complete build process to verify artifact authenticity:
SLSA Framework Levels:
| SLSA Level | Requirements | Build Isolation | Provenance | Use Case |
|---|---|---|---|---|
| SLSA 1 | Build process documented | None | Basic metadata | Initial adoption |
| SLSA 2 | Version control + build service | Minimal | Authenticated provenance | Standard deployments |
| SLSA 3 | Hardened build platform | Strong | Unforgeable provenance | Production workloads |
| SLSA 4 | Two-party review | Hermetic | Comprehensive audit trail | Critical infrastructure |
Provenance Attestation Contents:
- Source information: Repository URL, commit hash, branch
- Build parameters: Compiler flags, dependencies, build environment
- Security validations: SAST/DAST results, vulnerability scans, test coverage
- Builder identity: CI/CD system, build timestamp, builder credentials
- Cryptographic binding: Hash of source code → hash of artifact
Implementation with SLSA:
- Generate provenance during CI/CD builds using SLSA GitHub Generator or similar tools
- Store attestations alongside artifacts in registries
- Verify provenance before deployment using policy engines
- Enforce minimum SLSA levels for production (recommend SLSA 3+)
- Audit provenance for compliance and incident investigation
Learn more: SLSA specification, in-toto attestation framework
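The cryptographic binding between an artifact and its provenance can be sketched as below. The statement layout follows the general in-toto Statement shape (`subject` entries carrying a `sha256` digest), but the predicate is deliberately simplified, its `predicateType` is a placeholder, and signature verification of the attestation envelope is omitted. All function names, URLs, and values are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_statement(
    artifact_name: str, artifact: bytes, repo_url: str, commit: str, builder_id: str
) -> dict:
    """Create a simplified provenance statement binding source info to the artifact digest."""
    return {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [{"name": artifact_name, "digest": {"sha256": sha256_hex(artifact)}}],
        # Placeholder type: this predicate is NOT the full SLSA provenance schema.
        "predicateType": "https://example.com/simplified-provenance/v1",
        "predicate": {
            "source": {"repository": repo_url, "commit": commit},
            "builder": {"id": builder_id},
        },
    }


def verify_statement(statement: dict, artifact: bytes, trusted_builders: set[str]) -> bool:
    """Accept only if the artifact digest matches a subject and the builder is trusted."""
    digest = sha256_hex(artifact)
    digest_matches = any(
        subject.get("digest", {}).get("sha256") == digest
        for subject in statement.get("subject", [])
    )
    builder_id = statement.get("predicate", {}).get("builder", {}).get("id")
    return digest_matches and builder_id in trusted_builders
```

In practice the statement would be wrapped in a signed envelope, and a policy engine would verify that signature before trusting any field inside it.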
Registry Allowlisting
Prevent supply chain attacks by controlling artifact sources:
Registry Security Strategy:
- Allowlist approved registries:
  - Private registries (ECR, ACR, GCR, Harbor)
  - Approved public registries (Docker Hub verified publishers)
  - Internal artifact repositories (JFrog Artifactory, Nexus)
- Implement image promotion workflows:
  - Scan public images for vulnerabilities
  - Sign approved images
  - Copy to private registry
  - Deploy only from private registry
- Enforce registry policies via admission controllers or deployment gates
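A registry allowlist check of the kind an admission controller or deployment gate might apply can be sketched as follows. The allowlisted hostnames are hypothetical, and the parsing follows the common convention that an image reference without an explicit registry host resolves to Docker Hub.

```python
# Hypothetical allowlist of private registries.
ALLOWED_REGISTRIES = frozenset(
    {
        "123456789012.dkr.ecr.us-east-1.amazonaws.com",
        "harbor.internal.example.com",
    }
)


def registry_of(image: str) -> str:
    """Extract the registry host from an image reference, defaulting to docker.io."""
    first, separator, _rest = image.partition("/")
    # The first path component is a registry only if it looks like a host.
    if separator and ("." in first or ":" in first or first == "localhost"):
        return first.lower()
    return "docker.io"


def is_allowed(image: str, allowed: frozenset[str] = ALLOWED_REGISTRIES) -> bool:
    """Return True only when the image comes from an allowlisted registry."""
    return registry_of(image) in allowed
```

Because unqualified references such as `nginx:latest` resolve to Docker Hub, they are rejected unless `docker.io` is explicitly allowlisted, which pushes teams toward the promotion workflow described above.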
Runtime Policy Enforcement
Enforce security requirements during workload execution to complement build-time and deployment-time controls:
Runtime Security Layers:
| Layer | Technology | Enforcement Capabilities | Detection Capabilities |
|---|---|---|---|
| Admission Control | OPA, Kyverno | Image sources, security contexts, resource limits | Policy violations before deployment |
| Runtime Security | Falco, Aqua, Sysdig | System call filtering, network policies | Anomalous behavior, unexpected connections |
| Service Mesh | Istio, Linkerd | mTLS, authorization policies | Traffic anomalies, unauthorized access |
| eBPF-based | Cilium, Tetragon | Network policies, process restrictions | Kernel-level visibility |
Runtime Policy Examples:
- Prevent privileged containers: Block `privileged: true` in pod specs
- Restrict system calls: Allow only necessary syscalls via seccomp profiles
- Enforce network policies: Deny egress to public internet except approved endpoints
- Detect anomalous behavior: Alert on unexpected file modifications or network connections
- Require security contexts: Enforce non-root users, read-only root filesystems
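Several of the policies above reduce to checks on the pod specification. The sketch below shows the logic an admission policy might encode, written in plain Python over a pod spec dictionary rather than in the policy language of OPA or Kyverno. The function name and messages are hypothetical; the field names (`securityContext`, `privileged`, `runAsNonRoot`, `readOnlyRootFilesystem`) are standard Kubernetes fields.

```python
from typing import Any


def pod_violations(pod_spec: dict[str, Any]) -> list[str]:
    """Return policy violations for every container and init container in a pod spec."""
    violations: list[str] = []
    pod_ctx = pod_spec.get("securityContext") or {}
    containers = list(pod_spec.get("containers", [])) + list(
        pod_spec.get("initContainers", [])
    )
    for container in containers:
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext") or {}
        if ctx.get("privileged") is True:
            violations.append(f"{name}: privileged containers are not allowed")
        # Container-level runAsNonRoot overrides the pod-level setting.
        run_as_non_root = ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot"))
        if run_as_non_root is not True:
            violations.append(f"{name}: must set runAsNonRoot: true")
        if ctx.get("readOnlyRootFilesystem") is not True:
            violations.append(f"{name}: must set readOnlyRootFilesystem: true")
    return violations
```

An admission webhook would reject the pod when this list is non-empty and return the messages to the user.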
Observability and Monitoring
Distributed Tracing
Cloud-native applications span multiple services, requiring distributed tracing for visibility:
Distributed Tracing Platforms:
| Platform | Key Features | Integration | Best For |
|---|---|---|---|
| Jaeger | CNCF project, OpenTelemetry native | Kubernetes, microservices | Open-source deployments |
| Zipkin | Mature, simple architecture | Spring Boot, polyglot | Lightweight tracing |
| AWS X-Ray | Managed, AWS-native | Lambda, ECS, API Gateway | AWS-centric architectures |
| Google Cloud Trace | Managed, GCP-native | Cloud Run, GKE, App Engine | GCP environments |
| Datadog APM | Commercial, comprehensive | Multi-cloud, hybrid | Enterprise observability |
Security-Relevant Trace Attributes:
- Authentication context: User identity, session ID, authentication method
- Authorization decisions: Policy evaluation results, granted/denied permissions
- Data access patterns: Resources accessed, query parameters, response sizes
- Service dependencies: Call graphs revealing lateral movement paths
- Error conditions: Exceptions, failed authentications, authorization failures
Sampling Strategies:
| Strategy | Sample Rate | Use Case | Trade-off |
|---|---|---|---|
| Head-based sampling | Fixed % (e.g., 1%) | High-volume services | May miss rare events |
| Tail-based sampling | Dynamic based on outcome | Error-focused tracing | Higher processing cost |
| Priority sampling | 100% for errors/security events | Security investigation | Balanced cost/coverage |
Implement OpenTelemetry for vendor-neutral instrumentation across polyglot environments.
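The priority sampling strategy from the table can be sketched as a decision function: keep every trace that carries an error or security event, and sample the rest at a fixed rate. This is a standalone illustration, not the OpenTelemetry sampler API, and it assumes the decision is made once the trace outcome is known (for example, in a collector). The function and parameter names are hypothetical.

```python
import hashlib


def should_keep_trace(
    trace_id: str,
    is_error: bool,
    is_security_event: bool,
    base_rate: float = 0.01,
) -> bool:
    """Keep all error/security traces; keep a deterministic fraction of the rest."""
    if is_error or is_security_event:
        return True
    # Hash the trace ID into [0, 1) so every service makes the same decision
    # for the same trace without coordination.
    digest = hashlib.sha256(trace_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < base_rate
```

Deriving the decision from the trace ID, rather than from a random number, keeps traces complete: either all spans of a routine trace are kept or none are.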
Structured Logging
Implement consistent log schemas for automated analysis and correlation:
Structured Logging Best Practices:
```json
{
  "timestamp": "2025-10-19T14:32:15Z",
  "request_id": "req-abc123",
  "service": "payment-api",
  "level": "INFO",
  "user_id": "user-456",
  "event_type": "authorisation_decision",
  "resource": "arn:aws:s3:::sensitive-bucket/data.csv",
  "action": "s3:GetObject",
  "result": "DENIED",
  "policy": "data-access-policy-v2",
  "reason": "insufficient_permissions"
}
```
Critical Security Log Events:
- Authentication: Login attempts, MFA challenges, credential validation
- Authorization: Policy decisions, permission grants/denials, role assumptions
- Data access: Object reads/writes, query executions, export operations
- Configuration changes: IAM policy updates, security group modifications, encryption settings
- Error conditions: Exceptions, failed validations, rate limit violations
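One way to emit logs in a consistent JSON schema is a custom formatter for Python's standard `logging` module, sketched below. The `fields` attribute passed through `extra` is a convention chosen for this example, not a standard logging feature, and the output keys are modeled loosely on the sample entry in this section.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).strftime(
                "%Y-%m-%dT%H:%M:%SZ"
            ),
            "service": record.name,
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Structured fields supplied by the caller via extra={"fields": {...}}.
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry)


def build_logger(service: str) -> logging.Logger:
    """Create (or reuse) a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(service)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output through the root logger
    return logger


if __name__ == "__main__":
    log = build_logger("payment-api")
    log.info(
        "access denied",
        extra={"fields": {"request_id": "req-abc123", "action": "s3:GetObject", "result": "DENIED"}},
    )
```

Emitting one JSON object per line keeps the output easy for log aggregation pipelines to parse without custom patterns.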
Policy Decision Logging
Log authorization decisions with sufficient context for security investigations and compliance:
Policy Decision Log Schema:
| Field | Description | Example |
|---|---|---|
| Subject | Identity making request | user:alice@example.com |
| Resource | Target resource | arn:aws:dynamodb:us-east-1:123456789012:table/Users |
| Action | Requested operation | dynamodb:GetItem |
| Result | Grant or deny | ALLOW / DENY |
| Policy | Applicable policy | user-data-access-policy |
| Reason | Evaluation explanation | matched_condition: department=engineering |
| Context | Additional attributes | IP address, time, MFA status |
Use Cases:
- Security investigations: Reconstruct access patterns during incident response
- Compliance auditing: Demonstrate access controls for SOC 2, ISO 27001, HIPAA
- Policy debugging: Identify why access was unexpectedly granted or denied
- Anomaly detection: Detect unusual access patterns indicating compromise
Implement policy decision logging with Open Policy Agent or cloud-native authorization services.
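A decision record following the schema above could be produced as in the sketch below. The single rule (allow when `department` is `engineering`) is taken from the table's example reason and stands in for a real policy engine; the function names and the fallback reason string are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class PolicyDecision:
    subject: str
    resource: str
    action: str
    result: str  # "ALLOW" or "DENY"
    policy: str
    reason: str
    context: dict[str, Any] = field(default_factory=dict)


def evaluate(
    subject: str, resource: str, action: str, attributes: dict[str, Any]
) -> PolicyDecision:
    """Toy policy: allow only callers whose department attribute is 'engineering'."""
    if attributes.get("department") == "engineering":
        result, reason = "ALLOW", "matched_condition: department=engineering"
    else:
        result, reason = "DENY", "no_matching_condition"
    return PolicyDecision(
        subject=subject,
        resource=resource,
        action=action,
        result=result,
        policy="user-data-access-policy",
        reason=reason,
        context=dict(attributes),
    )


def to_log_line(decision: PolicyDecision) -> str:
    """Serialize the decision as one JSON line with stable key order."""
    return json.dumps(asdict(decision), sort_keys=True)
```

Logging both allows and denies, together with the reason, is what makes the later use cases (investigation, auditing, policy debugging) possible.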
Conclusion
Cloud-native security requires rethinking traditional security models to address ephemeral workloads, event-driven architectures, and managed services. Security engineers design cloud-native applications with identity-based access controls, comprehensive observability, and supply chain security that leverages platform capabilities while addressing unique cloud-native threats.
Success requires treating security as integral to cloud-native architecture rather than an afterthought, with security controls embedded in development workflows, deployment pipelines, and runtime environments. Organizations that invest in cloud-native security fundamentals build resilient applications that scale securely with business growth.