elastic-pvc is a lean Kubernetes controller that expands PersistentVolumeClaims when filesystem usage crosses a threshold.
Its target use case is handling large disk spills on Spark-on-EKS deployments.
elastic-pvc polls kubelet stats on every node and checks PVC filesystem usage. When free space drops below a configurable threshold, it patches the PVC to request more storage. The AWS EBS CSI driver handles the actual volume resize (ec2:ModifyVolume) and online filesystem expansion.
No Prometheus or external metrics system required.
See Architecture for more details.
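The per-PVC decision described above can be sketched as follows. This is a minimal illustration of the arithmetic, not the controller's actual code; `parse_percent_or_bytes` and `next_request` are hypothetical names, and the real controller reads capacity and free space from kubelet stats:

```python
# Sketch of the expansion decision elastic-pvc makes for each managed PVC.
# Helper names are hypothetical; only the arithmetic mirrors the description.

def parse_percent_or_bytes(value, reference):
    """Resolve a '20%'-style value against a reference size, or pass bytes through."""
    if value.endswith("%"):
        return reference * float(value[:-1]) / 100.0
    return float(value)

def next_request(capacity, available, threshold="20%", increase="50%", limit=500 * 2**30):
    """Return the new requested size in bytes, or None if no resize is needed."""
    trigger = parse_percent_or_bytes(threshold, capacity)
    if available >= trigger:
        return None  # enough free space, nothing to do
    grown = capacity + parse_percent_or_bytes(increase, capacity)
    return min(grown, limit)  # never exceed elastic-pvc.io/storage-limit

GiB = 2**30
# A 100Gi volume with only 10Gi free is below the 20% threshold, so it grows by 50%:
print(next_request(100 * GiB, 10 * GiB) / GiB)  # -> 150.0
```

Once `next_request` yields a new size, the controller patches `spec.resources.requests.storage` and the CSI driver takes over.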
- EKS cluster with the AWS EBS CSI Driver installed
- A StorageClass with `allowVolumeExpansion: true`
```bash
helm install elastic-pvc deploy/helm/elastic-pvc/ \
  --namespace elastic-pvc --create-namespace \
  --set storageClass.enabled=true
```

Add these annotations to PVCs you want elastic-pvc to manage:
```yaml
metadata:
  annotations:
    elastic-pvc.io/storage-limit: "500Gi"  # max size (required)
    elastic-pvc.io/threshold: "20%"        # expand when free space < 20% (optional, default: 20%)
    elastic-pvc.io/increase: "50%"         # grow by 50% each time (optional, default: 50%)
```

The StorageClass must opt in:
```yaml
metadata:
  annotations:
    elastic-pvc.io/enabled: "true"
```

| Annotation | Required | Default | Description |
|---|---|---|---|
| `elastic-pvc.io/storage-limit` | Yes | - | Maximum size the PVC can grow to (e.g., `500Gi`) |
| `elastic-pvc.io/threshold` | No | `20%` | Free-space threshold that triggers expansion. Percentage or absolute (e.g., `10Gi`) |
| `elastic-pvc.io/increase` | No | `50%` | How much to grow each time. Percentage of current capacity or absolute |
| `elastic-pvc.io/cooldown` | No | `5m` | Minimum interval between resizes for this PVC (e.g., `10m`, `1h`). Overrides global `--resize-cooldown` |
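Put together, a managed PVC might look like this. The claim name, StorageClass name, and starting size are illustrative:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: spark-spill                      # name assumed
  annotations:
    elastic-pvc.io/storage-limit: "500Gi"
    elastic-pvc.io/threshold: "15%"
    elastic-pvc.io/cooldown: "10m"
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: spark-local-ebs      # any expandable, opted-in StorageClass
  resources:
    requests:
      storage: 100Gi                     # starting size; elastic-pvc grows it as needed
```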
| Flag | Default | Description |
|---|---|---|
| `--interval` | `1m` | How often to check PVC usage |
| `--max-resizes-per-cycle` | `10` | Maximum resize operations per reconciliation cycle |
| `--resize-cooldown` | `5m` | Minimum interval between resizes for the same PVC |
| `--metrics-addr` | `:8080` | Prometheus metrics endpoint |
| `--health-addr` | `:8081` | Health/readiness probe endpoint |
elastic-pvc includes rate limiting to prevent EBS API exhaustion during burst scenarios:
- **Per-cycle limit**: Only `--max-resizes-per-cycle` PVCs are resized per reconciliation cycle. PVCs with the lowest available space are prioritized.
- **Per-PVC cooldown**: After a resize, each PVC enters a cooldown period (default: 5 minutes) before it can be resized again. This can be overridden per-PVC with the `elastic-pvc.io/cooldown` annotation.
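The two limits compose as a simple filter-sort-cap over resize candidates. A minimal sketch with hypothetical names, matching the behavior described above:

```python
# Filter out PVCs still in cooldown, prioritize the ones with the least
# free space, and cap the batch at the per-cycle limit.
from datetime import datetime, timedelta

def pick_resizes(candidates, last_resized, now, max_per_cycle=10,
                 cooldown=timedelta(minutes=5)):
    """candidates: list of (pvc_name, available_bytes). Returns names to resize."""
    eligible = [
        (name, avail) for name, avail in candidates
        if now - last_resized.get(name, datetime.min) >= cooldown
    ]
    eligible.sort(key=lambda c: c[1])  # lowest available space first
    return [name for name, _ in eligible[:max_per_cycle]]

now = datetime(2024, 1, 1, 12, 0)
last = {"pvc-a": now - timedelta(minutes=2)}  # resized 2 minutes ago: still cooling down
picked = pick_resizes([("pvc-a", 1), ("pvc-b", 50), ("pvc-c", 5)], last, now, max_per_cycle=2)
print(picked)  # -> ['pvc-c', 'pvc-b']
```

Note that `pvc-a` has the least free space but is skipped by the cooldown filter; it becomes eligible again in a later cycle.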
Rate-limiting metrics are exposed at `/metrics`:

- `elastic_pvc_rate_limited_total{reason="cooldown"}` - resizes skipped due to cooldown
- `elastic_pvc_rate_limited_total{reason="per_cycle_limit"}` - resizes deferred due to per-cycle limit
- `elastic_pvc_resizes_total` - successful resize operations
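For reference, a scrape of the metrics endpoint would include lines like these (counter values are illustrative):

```text
elastic_pvc_resizes_total 12
elastic_pvc_rate_limited_total{reason="cooldown"} 3
elastic_pvc_rate_limited_total{reason="per_cycle_limit"} 1
```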
For Spark on EKS, configure on-demand executor PVCs:

```properties
spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local.options.claimName=OnDemand
spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local.options.storageClass=spark-local-ebs
spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local.options.sizeLimit=150Gi
spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local.mount.path=/tmp/spark
spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local.mount.readOnly=false
spark.local.dir=/tmp/spark
```

Each executor gets a fresh EBS-backed PVC. elastic-pvc watches them and grows them as Spark spills data to disk. When the executor terminates, `reclaimPolicy: Delete` cleans up the EBS volume.
AWS allows up to four Elastic Volume modifications per volume in a rolling 24-hour window. Once a volume hits this limit, further resize attempts will fail until the window rolls over. The --resize-cooldown flag (and its per-PVC override elastic-pvc.io/cooldown) helps mitigate this by spacing out resize operations on the same volume, reducing the chance of exhausting the four-modification quota. However, if the threshold and increase values are set too aggressively, a single PVC can still hit the AWS limit within 24 hours.
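A back-of-envelope check shows how quickly aggressive settings can consume the quota. The helper below is illustrative, not part of elastic-pvc:

```python
# Count how many resize operations it takes a PVC to reach its storage limit,
# growing by a fixed percentage each time.
def resizes_until_limit(start_gib, increase_pct, limit_gib):
    size, count = start_gib, 0
    while size < limit_gib:
        size = min(size * (1 + increase_pct / 100), limit_gib)
        count += 1
    return count

# A 100Gi PVC growing 50% per resize toward a 500Gi cap:
# 100 -> 150 -> 225 -> 337.5 -> 500, i.e. exactly four modifications.
print(resizes_until_limit(100, 50, 500))  # -> 4
```

In other words, a single heavy-spill PVC with these settings can use up the entire four-modification window in one burst, so the cooldown should be long enough to spread those resizes over more than 24 hours if you need headroom.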
The EC2 ModifyVolume API is subject to standard AWS API rate limits. In clusters with many PVCs resizing concurrently, API throttling can cause resize requests to fail or be delayed. The --max-resizes-per-cycle flag bounds concurrent modifications per reconciliation cycle, but operators should monitor for throttling errors in clusters with hundreds of managed PVCs.
Automatic storage expansion means EBS costs can grow without manual approval. In large clusters with many short-lived PVCs (e.g., Spark executors), rapid expansion across hundreds of volumes can lead to significant cost increases. The elastic-pvc.io/storage-limit annotation caps the maximum size each PVC can reach, providing an upper bound on per-volume cost. Set this annotation thoughtfully — it is the primary safeguard against unbounded storage growth.
elastic-pvc is not the only way to handle storage for spill-heavy workloads. Depending on your instance types and workload patterns, these alternatives may be a better fit.
For clusters already using storage-optimized instances (r6id, c6id, i3, etc.), local NVMe volumes offer significantly higher I/O throughput than EBS. Karpenter's instanceStorePolicy: RAID0 makes setup straightforward. The trade-off is fixed capacity — if spill size is unpredictable, NVMe alone may not be enough. A hybrid approach combining NVMe with elastic-pvc is also possible.
See NVMe Instance Store Alternative for setup details and trade-offs.
A Karpenter blueprint for dynamic EBS volume sizing uses a DaemonSet to detect instance type and resize root EBS volumes accordingly. This approach sizes volumes at node launch rather than reacting to usage at runtime. It is useful when different instance types need different root volume sizes, but does not handle PVC-level expansion for per-pod storage.
```bash
make build         # build binary
make test          # run tests
make lint          # fmt + vet
make docker-build  # build container image
```