Migrate a rack to a new node pool
Move a ScyllaDB rack from one Kubernetes node pool to another without downtime, by adding a new rack on the target node pool and gradually migrating data away from the old rack.
When to use this procedure
Common reasons to migrate a rack to a new node pool:
Upgrading to a different instance type (e.g. larger machines, newer generation).
Moving to nodes with different storage configuration.
Replacing a node pool that uses a deprecated OS image.
Switching from a shared node pool to a dedicated one.
Prerequisites
The new node pool must already exist with appropriate labels, taints, and instance types. See Set up dedicated node pools for setup.
A NodeConfig resource targeting the new node pool must be applied (if your cluster uses tuning, RAID, or filesystem configuration). See Configure nodes for details.
The replication factor of your keyspaces must be large enough to tolerate the temporary imbalance during migration. For example, with RF=3 you can safely have one rack temporarily empty.
The new rack must be in the same datacenter as the old rack.
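The node pool prerequisite can be spot-checked from the command line before starting. The following is a minimal sketch, assuming the new pool carries the label pool=scylladb-new used later in this procedure; the helper names count_nonempty and pool_has_nodes are hypothetical.

```shell
#!/bin/sh
# Count non-empty lines on stdin (used to count resources listed by kubectl).
count_nonempty() {
  awk 'NF { n++ } END { print n + 0 }'
}

# Succeed only if at least one node matches the given label selector.
# The selector (for example pool=scylladb-new) is an assumption taken
# from the example in this procedure.
pool_has_nodes() {
  selector="$1"
  n=$(kubectl get nodes -l "$selector" -o name | count_nonempty)
  [ "$n" -gt 0 ]
}

# Usage (not run automatically):
#   pool_has_nodes pool=scylladb-new || echo "no nodes match the selector" >&2
#   kubectl get nodeconfigs.scylla.scylladb.com
```

Listing the NodeConfig resources only shows that they exist; whether one of them actually targets the new pool still has to be confirmed by inspecting its placement.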
Procedure
The migration follows a gradual scale-up / scale-down pattern: add nodes to the new rack one at a time, then remove nodes from the old rack one at a time.
Suppose you have a ScyllaCluster with a rack us-east-1a on the old node pool and you want to migrate it to a new node pool labelled pool: scylladb-new.
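To make the scale-up / scale-down pattern concrete, the sketch below prints the member counts of both racks after each step for a rack of a given size. The function name migration_plan is hypothetical and the script only prints a plan; it does not touch the cluster.

```shell
#!/bin/sh
# Print the (old rack, new rack) member counts after each migration step
# for a rack of $1 members: grow the new rack first, then shrink the old one.
migration_plan() {
  total="$1"
  new=1
  while [ "$new" -le "$total" ]; do
    echo "old=$total new=$new"
    new=$((new + 1))
  done
  old=$((total - 1))
  while [ "$old" -ge 0 ]; do
    echo "old=$old new=$total"
    old=$((old - 1))
  done
}

migration_plan 3
```

For three members the plan has six steps. It peaks at six nodes in total once the new rack is fully scaled up, so the new node pool needs room for all three members before any old node is removed.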
Step 1: Add the new rack with zero members
Add a new rack to the spec that targets the new node pool.
Set members: 0 initially, so that no pods are created yet.
apiVersion: scylla.scylladb.com/v1
kind: ScyllaCluster
metadata:
  name: scylla
  namespace: scylla
spec:
  datacenter:
    name: us-east-1
    racks:
    - name: us-east-1a # old rack
      members: 3
      storage:
        capacity: 500Gi
      resources:
        limits:
          cpu: 4
          memory: 8Gi
      placement:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: pool
                operator: In
                values:
                - scylladb-old
        tolerations:
        - key: role
          operator: Equal
          value: scylladb
          effect: NoSchedule
    - name: us-east-1b # new rack, starts at 0
      members: 0
      storage:
        capacity: 500Gi
      resources:
        limits:
          cpu: 4
          memory: 8Gi
      placement:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: pool
                operator: In
                values:
                - scylladb-new
        tolerations:
        - key: role
          operator: Equal
          value: scylladb
          effect: NoSchedule
Apply the change:
kubectl -n scylla apply -f scyllacluster.yaml
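Because the manifest is long and indentation mistakes are easy to make, it can help to validate it against the API server before persisting it. The sketch below uses kubectl's server-side dry run; the wrapper name apply_checked is hypothetical.

```shell
#!/bin/sh
# Validate the manifest against the API server without persisting it,
# then apply it only if validation succeeded.
apply_checked() {
  manifest="$1"
  kubectl -n scylla apply --dry-run=server -f "$manifest" || return 1
  kubectl -n scylla apply -f "$manifest"
}

# Usage (not run automatically):
#   apply_checked scyllacluster.yaml
```

A server-side dry run sends the object through API server validation without storing it, so a malformed rack definition should surface before the real apply.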
Step 2: Scale up the new rack one node at a time
Increase the new rack’s members by 1.
Each new node joins the cluster, takes ownership of a token range, and begins streaming data from existing nodes.
kubectl -n scylla patch scyllacluster/scylla --type=json \
-p='[{"op": "replace", "path": "/spec/datacenter/racks/1/members", "value": 1}]'
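The JSON patch addresses the rack by its position in the racks list, so a wrong index silently scales a different rack. As a hedged sketch, the index can be looked up by rack name first; the helper names rack_index and members_patch are hypothetical.

```shell
#!/bin/sh
# Print the zero-based index of rack $1 within the list of rack names
# given as the remaining arguments; fail if the rack is not found.
rack_index() {
  want="$1"; shift
  i=0
  for name in "$@"; do
    if [ "$name" = "$want" ]; then
      echo "$i"
      return 0
    fi
    i=$((i + 1))
  done
  return 1
}

# Build the JSON patch that sets members to $2 for the rack at index $1.
members_patch() {
  printf '[{"op": "replace", "path": "/spec/datacenter/racks/%s/members", "value": %s}]' "$1" "$2"
}

# Usage (not run automatically):
#   names=$(kubectl -n scylla get scyllacluster/scylla \
#     -o jsonpath='{.spec.datacenter.racks[*].name}')
#   idx=$(rack_index us-east-1b $names) || exit 1
#   kubectl -n scylla patch scyllacluster/scylla --type=json -p="$(members_patch "$idx" 1)"
```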
Wait for the new node to be ready:
kubectl -n scylla wait --timeout=15m --for='condition=Available' scyllaclusters.scylla.scylladb.com/scylla
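A cluster that is already serving traffic may report Available while a new node is still joining, in which case waiting on that condition alone can return early. The sketch below waits on three conditions in turn. It assumes the ScyllaCluster resource also reports Progressing and Degraded conditions, which this page does not state; the function name wait_rolled_out is hypothetical.

```shell
#!/bin/sh
# Wait until the ScyllaCluster has finished rolling out and is healthy.
# Assumption: the resource reports Progressing, Degraded and Available
# conditions; adjust the list if your Operator version differs.
wait_rolled_out() {
  cluster="$1"
  timeout="$2"
  for cond in Progressing=False Degraded=False Available=True; do
    kubectl -n scylla wait --timeout="$timeout" --for="condition=$cond" \
      "scyllaclusters.scylla.scylladb.com/$cluster" || return 1
  done
}

# Usage (not run automatically):
#   wait_rolled_out scylla 15m
```

The same helper can be reused with a longer timeout after each scale-down step.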
Verify the cluster state:
kubectl -n scylla exec -it scylla-us-east-1-us-east-1a-0 -c scylla -- nodetool status
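When scripting this check, the status output can be filtered for nodes that are not yet Up/Normal (UN). This is a minimal sketch that assumes the usual nodetool status layout, in which each node line starts with a two-letter status/state code; the function name count_not_un is hypothetical.

```shell
#!/bin/sh
# Read `nodetool status` output on stdin and print the number of nodes
# whose status/state code is anything other than UN (Up/Normal).
count_not_un() {
  awk '$1 ~ /^[UD][NLJM]$/ && $1 != "UN" { n++ } END { print n + 0 }'
}

# Usage (not run automatically; note there is no -it when piping):
#   kubectl -n scylla exec scylla-us-east-1-us-east-1a-0 -c scylla -- \
#     nodetool status | count_not_un
```

A result of 0 means every listed node is UN, which is the state to reach before adding the next node.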
Repeat this step until the new rack has the same number of members as the old rack.
Step 3: Scale down the old rack one node at a time
Decrease the old rack’s members by 1.
The Operator decommissions the highest-ordinal node, streaming its data to the remaining nodes before deleting the pod.
kubectl -n scylla patch scyllacluster/scylla --type=json \
-p='[{"op": "replace", "path": "/spec/datacenter/racks/0/members", "value": 2}]'
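Since the highest-ordinal node is the one that gets decommissioned, its pod name can be derived from the current member count and watched while the step runs. The sketch assumes pods are named cluster-datacenter-rack-ordinal, as in the pod names used in this procedure; the function name last_pod_name is hypothetical.

```shell
#!/bin/sh
# Print the name of the pod with the highest ordinal in a rack.
# Assumption: pods are named <cluster>-<datacenter>-<rack>-<ordinal>.
last_pod_name() {
  cluster="$1"; dc="$2"; rack="$3"; members="$4"
  [ "$members" -gt 0 ] || return 1
  echo "$cluster-$dc-$rack-$((members - 1))"
}

# Usage (not run automatically): watch the node that the next
# scale-down step removes from a rack that currently has 3 members.
#   kubectl -n scylla get pod "$(last_pod_name scylla us-east-1 us-east-1a 3)" --watch
```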
Wait for the decommission to finish:
kubectl -n scylla wait --timeout=30m --for='condition=Available' scyllaclusters.scylla.scylladb.com/scylla
Repeat until the old rack has 0 members.
Step 4: Remove the old rack
Once the old rack has 0 members, remove its definition from the spec:
apiVersion: scylla.scylladb.com/v1
kind: ScyllaCluster
metadata:
  name: scylla
  namespace: scylla
spec:
  datacenter:
    name: us-east-1
    racks:
    - name: us-east-1b # only the new rack remains
      members: 3
      storage:
        capacity: 500Gi
      resources:
        limits:
          cpu: 4
          memory: 8Gi
      placement:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: pool
                operator: In
                values:
                - scylladb-new
        tolerations:
        - key: role
          operator: Equal
          value: scylladb
          effect: NoSchedule
Warning
Do not remove a rack from the spec while it still has members. The Operator rejects this change through validation.
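A small guard can confirm that the old rack is declared with zero members before the trimmed manifest is applied. This is a sketch only: it selects the rack by name with a kubectl JSONPath filter, and the function name rack_is_empty is hypothetical.

```shell
#!/bin/sh
# Succeed only if the named rack is declared with zero members
# in the ScyllaCluster spec.
rack_is_empty() {
  rack="$1"
  members=$(kubectl -n scylla get scyllacluster/scylla \
    -o jsonpath="{.spec.datacenter.racks[?(@.name==\"$rack\")].members}")
  [ "$members" = "0" ]
}

# Usage (not run automatically):
#   rack_is_empty us-east-1a && kubectl -n scylla apply -f scyllacluster.yaml
```

The guard inspects the declared spec only, so it complements rather than replaces waiting for the last decommission to finish.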
Step 5: Run a repair
After migration, run a repair to ensure data consistency across the new token ranges:
kubectl -n scylla exec -it scylla-us-east-1-us-east-1b-0 -c scylla -- nodetool repair -pr
Or use a ScyllaDB Manager repair task if Manager is configured.
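The -pr option limits a repair to the primary token ranges of the node it runs on, so covering the whole ring means running it on every node rather than on a single pod. The sketch below loops over the pods of one rack, assuming the cluster-datacenter-rack-ordinal pod naming used in this procedure; the function names rack_pods and repair_rack are hypothetical.

```shell
#!/bin/sh
# Print the pod names of a rack, one per line, for ordinals 0..members-1.
# Assumption: pods are named <cluster>-<datacenter>-<rack>-<ordinal>.
rack_pods() {
  cluster="$1"; dc="$2"; rack="$3"; members="$4"
  i=0
  while [ "$i" -lt "$members" ]; do
    echo "$cluster-$dc-$rack-$i"
    i=$((i + 1))
  done
}

# Run a primary-range repair on each node of the rack, one at a time,
# stopping at the first failure.
repair_rack() {
  for pod in $(rack_pods "$@"); do
    kubectl -n scylla exec "$pod" -c scylla -- nodetool repair -pr || return 1
  done
}

# Usage (not run automatically):
#   repair_rack scylla us-east-1 us-east-1b 3
```

If the datacenter contains other racks, their nodes need the same primary-range repair for full coverage.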