Designing Stateful Workloads in EKS: Using Node Affinity to Avoid EBS ZoneMismatch Errors
EKS Admin
While making some improvements to my EKS cluster recently, I revisited how Grafana was scheduled across nodes.
The cluster runs a mixture of on-demand and spot instances, which is a common cost optimisation strategy. However, workloads running on spot instances can disappear at any time when AWS reclaims the capacity.
Because of this, I wanted to improve how Grafana behaved when nodes were reclaimed and pods needed to be rescheduled.
During this work I ran into an issue that highlights an important design consideration when running stateful workloads in Kubernetes using AWS EBS volumes.
The Error That Exposed the Problem
After Grafana was restarted because its spot instance was reclaimed, the replacement pod failed to start and remained in the Pending state.
Inspecting the pod events revealed the following:
```
AttachVolume.Attach failed for volume "grafana" : rpc error: code = Internal desc =
Could not attach volume "vol-xxxxxxxxxxxx" to node "i-xxxxxxxxxxxx":
could not attach volume "vol-xxxxxxxxxxxx" to node "i-xxxxxxxxxxxx":
InvalidVolume.ZoneMismatch: The volume 'vol-xxxxxxxxxxxx' is not in the same
availability zone as instance 'i-xxxxxxxxxxxx'
```

This can typically be seen with:

```
kubectl describe pod prometheus-grafana-xxxxx -n prometheus
```

or:

```
kubectl get events -n prometheus
```

The key clue here is the message:

```
InvalidVolume.ZoneMismatch
```

This indicates that Kubernetes attempted to attach an EBS volume that exists in one availability zone to a node running in another availability zone.
Because my cluster spans multiple AZs, the scheduler had placed the replacement pod onto a node that could not attach the existing volume.
Understanding EBS Volume Attachments
The first thing to understand about EBS is that a volume lives in a single availability zone. The second is how it attaches to nodes.
Unlike many network-attached storage systems, EBS does not support multi-node read/write access.
A typical EBS volume supports:
```
ReadWriteOnce
```

This means the volume can only be attached to one node at a time.
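This access mode is declared on the PersistentVolumeClaim. A minimal sketch of an EBS-backed claim (the claim name and storage class here are illustrative, not taken from my cluster):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana              # illustrative name
spec:
  accessModes:
    - ReadWriteOnce          # single-node attachment, matching EBS semantics
  storageClassName: gp3      # assumed EBS-backed StorageClass
  resources:
    requests:
      storage: 10Gi
```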
In contrast, traditional network storage solutions often support:
```
ReadWriteMany
```

where multiple nodes can mount the same volume simultaneously.
Because EBS behaves more like a block device attached to a single machine, Kubernetes must schedule the pod onto the node where the volume can attach.
This is why the AZ mismatch caused the workload to fail.
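Kubernetes actually records this constraint on the PersistentVolume object itself: dynamically provisioned EBS volumes carry a node affinity pinning them to their zone. A simplified sketch of what such a PV looks like (the name and values are illustrative, and the exact topology key varies with the storage driver and version):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-grafana                        # illustrative name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  csi:
    driver: ebs.csi.aws.com
    volumeHandle: vol-xxxxxxxxxxxx        # the underlying EBS volume ID
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.kubernetes.io/zone   # key may differ by driver version
              operator: In
              values:
                - eu-west-2a              # the zone the volume lives in
```

The scheduler is supposed to honour this, but only the pod's own scheduling constraints are under our direct control, which is what the rest of this post focuses on.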
Why Not Use EFS Instead?
One alternative would be Amazon EFS.
EFS behaves more like traditional network-attached storage and supports:
```
ReadWriteMany
```

meaning multiple pods across multiple nodes can mount the same filesystem simultaneously.
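For completeness, a sketch of what an EFS-backed StorageClass looks like with the EFS CSI driver (the class name and filesystem ID are placeholders you would replace with your own):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-shared               # illustrative name
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap       # dynamic provisioning via EFS access points
  fileSystemId: fs-xxxxxxxx      # placeholder: your EFS filesystem ID
  directoryPerms: "700"
```

Because EFS is a regional filesystem reachable from any AZ, claims against a class like this avoid the zone-pinning problem entirely.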
However, there is a trade-off.
EFS is typically more expensive than EBS and has different performance characteristics.
Since Grafana only requires a single writer volume, using EBS remains the more cost-effective option for this workload.
So the architecture decision here was deliberate:

EBS for cost efficiency

single-attach (ReadWriteOnce) storage as a consequence

careful scheduling to respect that constraint
Designing Scheduling Rules for Stateful Workloads
Given these constraints, we need to ensure Kubernetes schedules pods in a way that respects how the storage behaves.
Kubernetes provides several mechanisms for controlling where pods are placed:
node selectors
node affinity
taints and tolerations
topology-aware scheduling
For stateful workloads using EBS-backed volumes, it’s important to influence the scheduler so that pods are placed onto nodes that can successfully attach the volume.
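One storage-side complement to these scheduling mechanisms is topology-aware volume binding. A StorageClass with volumeBindingMode: WaitForFirstConsumer delays provisioning until the pod is actually scheduled, so a new volume is created in whatever zone the scheduler chose. This does not move an existing volume, but it prevents new workloads from hitting the same mismatch. A sketch, assuming the EBS CSI driver (the class name is illustrative):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-topology-aware            # illustrative name
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer   # provision in the zone where the pod lands
parameters:
  type: gp3
```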
Using Node Affinity
Node affinity allows us to define rules about which nodes a pod should run on, based on node labels.
A simplified example looks like this:
```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
                - eu-west-2a
```

Breaking this down:
requiredDuringScheduling
The rule must be satisfied when Kubernetes schedules the pod.
IgnoredDuringExecution
If the rule stops being true later, the pod will not be evicted.
This allows us to guide the scheduler without forcing unnecessary restarts of running workloads.
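The pod name prometheus-grafana-xxxxx suggests a Helm-based install, and if you deploy Grafana via the kube-prometheus-stack chart, the same affinity can be supplied through chart values rather than by editing the Deployment directly. A sketch, assuming that chart (the zone value is illustrative):

```yaml
# values.yaml fragment (assuming the kube-prometheus-stack chart,
# which passes this through to the Grafana subchart's pod spec)
grafana:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                  - eu-west-2a   # illustrative: the zone holding the EBS volume
```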
The Bigger Architectural Lesson
This issue highlights an important distinction in Kubernetes.
Stateless workloads can generally run anywhere in the cluster.
Stateful workloads, however, often have infrastructure constraints that influence where they can run.
When using EBS-backed storage, those constraints include:
availability zones
single-node attachment
storage cost considerations
Understanding these constraints helps design scheduling policies that prevent runtime failures like the ZoneMismatch error.
Closing Thoughts
This issue surfaced while improving the resilience of Grafana in a cluster running spot instances.
What started as a simple scheduling tweak turned into a deeper look at how stateful workloads interact with cloud storage primitives.
These kinds of debugging exercises are valuable because they highlight how Kubernetes scheduling, cloud infrastructure, and storage architecture all interact with each other.
In the next post, I’ll walk through how I migrated Grafana’s storage from the legacy in-tree AWS EBS driver to the modern EBS CSI driver, while safely preserving the existing data using Kubernetes volume snapshots.