Caution

You're viewing documentation for an unstable version of ScyllaDB Operator. Switch to the latest stable version.

Pod disruption budgets

This page explains how ScyllaDB Operator uses Kubernetes PodDisruptionBudgets (PDBs) to protect ScyllaDB availability during voluntary disruptions.

What a PDB does

A PodDisruptionBudget limits how many pods matching a selector can be voluntarily evicted at the same time. Voluntary disruptions include:

  • Kubernetes node drains (maintenance, upgrades).

  • Cluster autoscaler scale-down.

  • Manual pod evictions via the Eviction API.

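PDBs do not protect against involuntary disruptions such as hardware failures or OOM kills. A pod that is unavailable because of an involuntary disruption still counts against the budget, so it can cause further voluntary evictions to be refused.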

ScyllaDB cluster PDB

The Operator creates one PDB per ScyllaDB datacenter with:

spec:
  maxUnavailable: 1

This ensures that at most one ScyllaDB node can be voluntarily disrupted at a time across the entire datacenter. The Kubernetes API server blocks eviction requests that would violate this budget.
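
As a rough illustration of how such a budget is evaluated, the following Python sketch models the arithmetic behind maxUnavailable. The Budget class and its field names are hypothetical and exist only for this example; the real check is performed by Kubernetes, not by code like this.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """Simplified model of a PDB that uses maxUnavailable (illustrative only)."""

    expected_pods: int    # pods the selector is expected to match
    max_unavailable: int  # value taken from the PDB spec

    def disruptions_allowed(self, healthy_pods: int) -> int:
        # The budget requires at least this many healthy pods at all times.
        desired_healthy = self.expected_pods - self.max_unavailable
        return max(0, healthy_pods - desired_healthy)

    def can_evict(self, healthy_pods: int) -> bool:
        # Evicting a healthy pod is allowed only while budget remains.
        return self.disruptions_allowed(healthy_pods) > 0


budget = Budget(expected_pods=3, max_unavailable=1)
print(budget.can_evict(healthy_pods=3))  # True: all ScyllaDB nodes are healthy
print(budget.can_evict(healthy_pods=2))  # False: one node is already down
```

With three nodes and maxUnavailable: 1, the first eviction is accepted and any further eviction is refused until the missing node is healthy again.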

Excluding cleanup Jobs

The PDB selector uses the same labels as the ScyllaDB pods but adds a matchExpressions entry that excludes pods carrying the batch.kubernetes.io/job-name label. Kubernetes automatically adds this label to every pod created by a Job. As a result, cleanup Job pods (see Automatic data cleanup) do not count toward the PDB budget and cannot block node drains.
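
The effect of this selector can be sketched in Python. The function below is a simplified stand-in for Kubernetes label-selector matching, and it assumes the exclusion is expressed with the DoesNotExist operator, which this page does not state explicitly. The example.com/cluster label and the pod names are hypothetical; only batch.kubernetes.io/job-name comes from the text above.

```python
JOB_NAME_LABEL = "batch.kubernetes.io/job-name"


def selector_matches(pod_labels, match_labels, must_not_exist):
    """Return True if pod_labels satisfy a simplified label selector."""
    # matchLabels: every key/value pair must be present on the pod.
    for key, value in match_labels.items():
        if pod_labels.get(key) != value:
            return False
    # matchExpressions with DoesNotExist (assumed): the key must be absent.
    return all(key not in pod_labels for key in must_not_exist)


# Hypothetical labels, for illustration only.
scylla_labels = {"example.com/cluster": "demo"}
scylla_pod = dict(scylla_labels)
cleanup_pod = {**scylla_labels, JOB_NAME_LABEL: "cleanup-demo"}

for name, labels in [("scylla pod", scylla_pod), ("cleanup pod", cleanup_pod)]:
    covered = selector_matches(labels, scylla_labels, [JOB_NAME_LABEL])
    print(f"{name}: covered by PDB = {covered}")
```

The cleanup pod shares the ScyllaDB labels yet falls outside the selector, so it neither consumes the budget nor is protected by it.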

Operator and webhook PDBs

The Operator deployment and the webhook server deployment each have their own PDB:

Component         PDB spec          When created
----------------  ----------------  ----------------------------------------
scylla-operator   minAvailable: 1   When running with more than one replica
webhook-server    minAvailable: 1   When running with more than one replica

These PDBs ensure that at least one Operator pod and one webhook pod remain available during node drains, preventing a complete loss of the control plane during cluster maintenance.
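
The arithmetic of a minAvailable budget suggests why these PDBs are tied to the replica count. The sketch below is an inference from general PDB semantics, not a description of the Operator's code, and the function name is hypothetical.

```python
def disruptions_allowed(healthy_replicas: int, min_available: int = 1) -> int:
    """Voluntary disruptions a minAvailable budget permits (simplified)."""
    # Every healthy replica beyond the required minimum may be evicted.
    return max(0, healthy_replicas - min_available)


for replicas in (1, 2, 3):
    print(replicas, "replica(s) ->", disruptions_allowed(replicas), "allowed")
```

With a single replica, a minAvailable: 1 budget allows zero voluntary disruptions, so a node drain could never evict that pod. With two or more replicas, at least one pod can be evicted while another keeps serving.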

PDB interaction with operations

Rolling updates

The Operator uses a partition-based rollout strategy for StatefulSets. During an upgrade:

  1. All StatefulSets are partitioned at their current replica count, preventing any pod from restarting.

  2. The partition is decremented by one, allowing a single pod to pick up the new template and restart.

  3. The controller waits for the restarted pod to become ready before decrementing the partition again.

  4. Only one rack makes progress per reconciliation cycle.

This one-at-a-time rollout naturally respects the maxUnavailable: 1 PDB because at most one pod is unavailable during each step.
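
The steps above can be sketched as a small simulation of a single StatefulSet. It relies on the StatefulSet rule that only pods with an ordinal greater than or equal to the partition pick up a new template. The function name is hypothetical, readiness is assumed to succeed immediately, and step 4 (one rack per reconciliation cycle) is not modelled.

```python
def simulate_rollout(replicas: int) -> list[int]:
    """Return the order in which pod ordinals restart during the rollout."""
    updated = [False] * replicas
    partition = replicas  # step 1: no ordinal is >= partition, nothing restarts
    order = []
    while partition > 0:
        partition -= 1  # step 2: unblock exactly one more pod
        # Pods at or above the partition that still run the old template.
        restarting = [i for i in range(partition, replicas) if not updated[i]]
        assert len(restarting) == 1  # never more than one pod down at once
        ordinal = restarting[0]
        order.append(ordinal)
        # Step 3: a real controller waits here until the pod is ready.
        updated[ordinal] = True
    return order


print(simulate_rollout(3))  # [2, 1, 0]
```

The internal assertion captures the property that matters for the PDB: each decrement of the partition exposes exactly one pod to a restart.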

Scale-down

When scaling down, the Operator decommissions one member at a time. The SidecarController drives the decommission process inside each pod. Because only one pod is being removed at a time, the PDB is not violated.

Node replacement

Node replacement follows a similar pattern: one node is replaced at a time. While a replacement is in progress, the node being replaced counts as unavailable, so the PDB causes eviction requests for additional ScyllaDB pods (for example, from a concurrent node drain) to be rejected.

Kubernetes node drains

When a Kubernetes node is drained (for example, during a Kubernetes upgrade), the drain process evicts pods through the Eviction API, which respects PDBs. If the ScyllaDB cluster already has one node unavailable (due to a concurrent operation or failure), the PDB blocks further evictions until the first node recovers.
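
The blocking behaviour over time can be illustrated with a toy simulation. It assumes the drain retries a refused eviction once per tick; the function, the pod names, and the tick-based recovery are all hypothetical and do not reflect real timing.

```python
def drain_pod(target, ready, recovers_at, max_unavailable=1, max_ticks=10):
    """Try to evict `target` once per tick and return the per-tick outcomes.

    ready:       pod name -> whether the pod is currently ready
    recovers_at: pod name -> tick at which an unready pod becomes ready
    """
    ready = dict(ready)  # work on a copy so the caller's state is untouched
    outcomes = []
    for tick in range(max_ticks):
        # Pods that were down become ready once their recovery tick arrives.
        for pod, when in recovers_at.items():
            if tick >= when:
                ready[pod] = True
        unavailable = sum(1 for is_ready in ready.values() if not is_ready)
        if unavailable < max_unavailable:
            ready[target] = False  # eviction accepted, the drain can proceed
            outcomes.append("evicted")
            return outcomes
        outcomes.append("blocked")  # budget exhausted, retry on the next tick
    return outcomes


state = {"scylla-0": False, "scylla-1": True, "scylla-2": True}
print(drain_pod("scylla-1", state, recovers_at={"scylla-0": 2}))
# ['blocked', 'blocked', 'evicted']
```

The eviction of scylla-1 is refused while scylla-0 is down and succeeds on the first attempt after scylla-0 recovers.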

Related pages

  • StatefulSets and racks: rolling update strategy and partition-based rollout.

  • Understand: reconciliation model.

© 2026, ScyllaDB. All rights reserved. | Terms of Service | Privacy Policy | ScyllaDB, and ScyllaDB Cloud, are registered trademarks of ScyllaDB, Inc.
Last updated on 22 May 2026.