Multi-Cloud: Strategy and Reality
Multi-cloud is the strategy of using services from two or more cloud providers (such as AWS, Azure, and GCP) to distribute workloads, reduce vendor lock-in risk, and optimize costs. For a platform team, designing a multi-cloud internal developer platform (IDP) means building abstractions that hide the differences between providers, so developers can deploy without worrying about the underlying cloud.
However, multi-cloud is not a silver bullet. It introduces significant complexity in infrastructure management, networking, security, and monitoring. In this article, we will analyze when multi-cloud makes sense, which patterns to adopt, and how to mitigate vendor lock-in without falling into the trap of excessive complexity.
What You'll Learn
- Multi-cloud strategies: active-active, active-passive, best-of-breed
- Kubernetes as an abstraction layer for portability
- Provider-agnostic IaC: patterns and trade-offs
- Cross-cloud secrets management with HashiCorp Vault
- Multi-cloud cost management and optimization
- Service mesh for cross-cloud networking
Multi-Cloud Strategies
There is no single multi-cloud strategy. Organizations adopt different approaches based on their needs:
- Active-Active: workloads actively distributed across multiple cloud providers. Maximum resilience but maximum complexity
- Active-Passive: one primary cloud provider with failover to a secondary. Good balance between resilience and complexity
- Best-of-Breed: using the best service from each provider (e.g., AWS for compute, GCP for ML, Azure for enterprise). Optimizes capabilities but complicates governance
- Hybrid: a combination of public cloud and on-premises infrastructure. Common in regulated sectors (finance, healthcare)
# Multi-cloud strategy: active-passive with failover
multi-cloud-architecture:
  primary: AWS (eu-west-1)
  secondary: GCP (europe-west1)
  strategy: active-passive

  workload-distribution:
    primary-aws:
      services:
        - all production workloads
        - primary databases (RDS PostgreSQL)
        - primary cache (ElastiCache Redis)
        - primary message queue (MSK Kafka)
      traffic: 100% (normal operations)
    secondary-gcp:
      services:
        - read replicas (Cloud SQL)
        - DR databases (standby)
        - static assets (Cloud CDN)
        - batch processing (BigQuery)
      traffic: 0% (failover only)

  failover:
    trigger: "Primary region unavailable for > 5 minutes"
    rto: "15 minutes"
    rpo: "5 minutes"
    steps:
      1: "DNS failover (Route53 health check)"
      2: "Promote GCP read replicas to primary"
      3: "Scale up GCP compute"
      4: "Verify service health"
      5: "Notify teams"

  data-replication:
    databases: "Async replication, 5-minute lag"
    object-storage: "Cross-cloud sync (rclone)"
    secrets: "Vault replication"
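The failover trigger and RPO in the configuration above can be expressed as two small checks. The following is a minimal sketch, not a production controller: the names FailoverPolicy, should_fail_over, and rpo_at_risk are hypothetical, and the sketch assumes a separate health checker records when the primary first started failing.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class FailoverPolicy:
    """Thresholds copied from the failover section of the configuration."""
    outage_threshold: timedelta = timedelta(minutes=5)  # trigger
    rpo: timedelta = timedelta(minutes=5)               # max tolerated data loss


def should_fail_over(policy: FailoverPolicy,
                     first_failure_at: Optional[datetime],
                     now: datetime) -> bool:
    """Return True once the primary has been down longer than the threshold.

    first_failure_at is None while the primary is healthy (assumption: an
    external health checker sets and clears this timestamp).
    """
    if first_failure_at is None:
        return False
    return now - first_failure_at > policy.outage_threshold


def rpo_at_risk(policy: FailoverPolicy, replication_lag: timedelta) -> bool:
    """Return True when the replica is further behind than the RPO allows."""
    return replication_lag > policy.rpo


if __name__ == "__main__":
    policy = FailoverPolicy()
    outage_start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    check_time = outage_start + timedelta(minutes=6)
    print("fail over:", should_fail_over(policy, outage_start, check_time))
    print("RPO at risk:", rpo_at_risk(policy, timedelta(minutes=7)))
```

Note that a steady 5-minute replication lag sits exactly at the 5-minute RPO, so any additional delay during an incident would breach it. Alerting on lag well below the RPO is one way to keep some headroom.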
Kubernetes as an Abstraction Layer
Kubernetes is the most effective abstraction layer for multi-cloud portability. A containerized application running on Kubernetes can be deployed on EKS (AWS), GKE (GCP), AKS (Azure), or any Kubernetes cluster with minimal changes.
However, portability is not automatic. Areas where cloud specificity emerges include:
- Storage: StorageClass and PersistentVolume differ per provider
- Networking: LoadBalancer, Ingress Controller, and DNS have provider-specific implementations
- IAM: IRSA (AWS), Workload Identity (GCP), and Workload Identity (Azure, the successor to the deprecated AAD Pod Identity) for pod authentication
- Managed services: managed databases, caches, and message queues are the primary lock-in vector
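One way a platform team can contain these differences is to keep them in a single lookup table and render provider-specific manifests from it. The sketch below is illustrative only: ProviderProfile, storage_class, and service_account are hypothetical helpers, and the CSI driver names and ServiceAccount annotation keys are the ones we understand each provider to use at the time of writing, so verify them against current provider documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderProfile:
    storage_provisioner: str  # CSI driver referenced by the StorageClass
    identity_annotation: str  # ServiceAccount annotation for workload identity


# Assumed values; check each provider's documentation before relying on them.
PROFILES = {
    "aws": ProviderProfile("ebs.csi.aws.com", "eks.amazonaws.com/role-arn"),
    "gcp": ProviderProfile("pd.csi.storage.gke.io", "iam.gke.io/gcp-service-account"),
    "azure": ProviderProfile("disk.csi.azure.com", "azure.workload.identity/client-id"),
}


def storage_class(provider: str, name: str = "platform-default") -> dict:
    """Build a StorageClass manifest with the provider's block-storage driver."""
    profile = PROFILES[provider]  # raises KeyError for unsupported providers
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": profile.storage_provisioner,
        "volumeBindingMode": "WaitForFirstConsumer",
    }


def service_account(provider: str, name: str, cloud_identity: str) -> dict:
    """Build a ServiceAccount bound to a cloud identity (role ARN, GSA email, client ID)."""
    profile = PROFILES[provider]
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,
            "annotations": {profile.identity_annotation: cloud_identity},
        },
    }


if __name__ == "__main__":
    for provider in PROFILES:
        print(provider, storage_class(provider)["provisioner"])
```

Developers then request "a default StorageClass" or "a service account for this workload", and only the table changes when a cluster moves to another provider.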
The Portability Trade-Off
100% portability is an unrealistic and expensive goal. Every abstraction layer added for portability introduces complexity and potential performance loss. The optimal strategy is portability where it matters: abstract compute (Kubernetes), but use native managed services for databases and storage where performance and management benefits outweigh lock-in risk.
Cross-Cloud Secrets Management
Secrets management in a multi-cloud environment is particularly critical. Each cloud provider has its own secret management service (AWS Secrets Manager, GCP Secret Manager, Azure Key Vault), and developers should not have to interact with each one separately.
HashiCorp Vault is one of the most widely adopted solutions for centralizing secrets management in multi-cloud environments:
- Centralization: a single management point for all secrets, regardless of cloud
- Dynamic secrets: on-demand generation of temporary credentials (database, AWS IAM, certificates)
- Auto-rotation: automatic secret rotation without application downtime
- Audit: complete log of every secret access for compliance
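Dynamic secrets and rotation only avoid downtime if the application refreshes its credentials before the lease runs out. The sketch below shows that client-side pattern under stated assumptions: LeasedSecret is a hypothetical helper, and the fetch callable stands in for a real request to Vault that returns a credential together with its lease duration in seconds.

```python
import time
from typing import Callable, Optional, Tuple


class LeasedSecret:
    """Cache a short-lived credential and renew it shortly before it expires."""

    def __init__(self,
                 fetch: Callable[[], Tuple[str, float]],
                 refresh_margin: float = 0.2,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch            # returns (credential, ttl_seconds)
        self._margin = refresh_margin  # fraction of the TTL kept as safety margin
        self._clock = clock            # injectable for testing
        self._value: Optional[str] = None
        self._refresh_at = 0.0

    def get(self) -> str:
        """Return a valid credential, fetching a new one when the lease is nearly over."""
        now = self._clock()
        if self._value is None or now >= self._refresh_at:
            value, ttl = self._fetch()
            self._value = value
            # Refresh once (1 - margin) of the lease has elapsed.
            self._refresh_at = now + ttl * (1.0 - self._margin)
        return self._value


if __name__ == "__main__":
    counter = {"n": 0}

    def demo_fetch() -> Tuple[str, float]:
        # Placeholder for a call to the secrets backend (assumption).
        counter["n"] += 1
        return f"credential-{counter['n']}", 3600.0

    db_password = LeasedSecret(demo_fetch)
    print(db_password.get())
```

Because the old credential stays valid until its lease ends, renewing at 80% of the TTL gives in-flight requests time to finish before the switch.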
# Vault: multi-cloud secrets configuration
vault:
  storage:
    type: raft
    ha_enabled: true
    nodes:
      - vault-0.vault.svc (primary)
      - vault-1.vault.svc (follower)
      - vault-2.vault.svc (follower)

  auth-methods:
    # AWS authentication
    aws:
      path: auth/aws
      config:
        access_key: