Caution
You're viewing documentation for an unstable version of ScyllaDB Operator. Switch to the latest stable version.
Set up a GKE cluster for ScyllaDB¶
This guide provisions a Google Kubernetes Engine (GKE) cluster suitable for running ScyllaDB. At the end, you will have:
A regional GKE cluster with a system node pool for infrastructure workloads.
A dedicated node pool with local NVMe SSDs, static CPU manager policy, and ScyllaDB labels and taints, spread across 3 zones.
Tip
All the steps in this guide are also available as a single executable script:
examples/gke/setup-gke-cluster.sh.
Set GCP_PROJECT and GCP_REGION, then run the script to provision everything in one go.
Prerequisites¶
A Google Cloud project with the Kubernetes Engine API enabled.
A GCP account with the Kubernetes Engine Admin (roles/container.admin) IAM role on the project.
The gcloud CLI installed and configured (gcloud init).
kubectl installed.
Sufficient quota for n2-highmem-16 instances with local SSDs in your target region. See ScyllaDB cloud instance recommendations for GCE for recommended machine types and system requirements for minimum specifications.
Set environment variables¶
The rest of the guide refers to the variables defined here.
Set your GCP project and region — these have no defaults and must be provided:
export GCP_PROJECT="<your-project>" # e.g. my-scylla-project
export GCP_REGION="<your-region>" # e.g. us-central1
The remaining variables have sensible defaults and can be copied as-is. Override any value before running if needed:
# Cluster name.
export GKE_CLUSTER_NAME="${GKE_CLUSTER_NAME:-scylladb-demo}"
# Availability zones — one per ScyllaDB rack.
export GKE_ZONE_1="${GKE_ZONE_1:-${GCP_REGION}-a}"
export GKE_ZONE_2="${GKE_ZONE_2:-${GCP_REGION}-b}"
export GKE_ZONE_3="${GKE_ZONE_3:-${GCP_REGION}-c}"
# System node pool machine type.
export SYSTEM_MACHINE_TYPE="${SYSTEM_MACHINE_TYPE:-n2-standard-8}"
export SYSTEM_NODE_COUNT="${SYSTEM_NODE_COUNT:-2}"
# Dedicated ScyllaDB node pool. n2-highmem-16 provides 16 vCPU and 128 GiB RAM.
# See https://docs.scylladb.com/manual/stable/getting-started/cloud-instance-recommendations.html#google-compute-engine-gce
export SCYLLA_MACHINE_TYPE="${SCYLLA_MACHINE_TYPE:-n2-highmem-16}"
# Node count is per zone. With 3 zones this gives 3 nodes total.
export SCYLLA_NODE_COUNT="${SCYLLA_NODE_COUNT:-1}"
export SCYLLA_LOCAL_SSD_COUNT="${SCYLLA_LOCAL_SSD_COUNT:-4}"
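Because GCP_PROJECT and GCP_REGION have no defaults, it can help to fail early when either is empty, and to confirm how many ScyllaDB nodes the per-zone count produces. The following is a minimal sketch; check_required and total_scylla_nodes are hypothetical helper names and are not part of the guide's script:

```shell
# Hypothetical helpers (assumptions, not part of examples/gke/setup-gke-cluster.sh).

# check_required NAME...: return non-zero if any named variable is unset or empty.
check_required() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable called ${name}, POSIX sh compatible.
    eval "value=\${${name}:-}"
    if [ -z "${value}" ]; then
      echo "error: ${name} must be set" >&2
      missing=1
    fi
  done
  return "${missing}"
}

# total_scylla_nodes PER_ZONE ZONES: the node count is per zone, so multiply.
total_scylla_nodes() {
  echo $(( $1 * $2 ))
}
```

With the defaults above, running check_required GCP_PROJECT GCP_REGION followed by total_scylla_nodes "${SCYLLA_NODE_COUNT}" 3 should print 3 once both required variables are set.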
Create a temporary directory¶
Create a temporary directory for configuration files used in this guide:
TMPDIR="$(mktemp -d)"
trap 'rm -rf "${TMPDIR}"' EXIT
Create the kubelet configuration¶
The dedicated ScyllaDB node pool requires static CPU manager policy for CPU pinning.
Create a systemconfig.yaml that will be passed to the node pool:
cat > "${TMPDIR}/systemconfig.yaml" <<EOF
kubeletConfig:
  cpuManagerPolicy: static
EOF
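In this file, cpuManagerPolicy has to be nested under kubeletConfig, so its indentation matters. A rough textual check, assuming a hypothetical helper named verify_systemconfig (this is a grep-based sketch, not a YAML parser), could look like this:

```shell
# Hypothetical helper (an assumption, not part of the guide's script).
# verify_systemconfig FILE: succeed only if "cpuManagerPolicy: static"
# appears as an indented child of a top-level "kubeletConfig:" key.
verify_systemconfig() {
  file="$1"
  # The top-level key must start at column 0.
  grep -q '^kubeletConfig:[[:space:]]*$' "${file}" || return 1
  # The policy must be indented, i.e. a child key rather than a sibling.
  grep -Eq '^[[:space:]]+cpuManagerPolicy:[[:space:]]*static[[:space:]]*$' "${file}"
}
```

Calling verify_systemconfig "${TMPDIR}/systemconfig.yaml" before creating the node pool catches a flattened or mistyped file early.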
Create the GKE cluster¶
Create a regional GKE cluster with a system node pool for infrastructure workloads (cert-manager, ScyllaDB Operator, etc.). The system pool is placed in a single zone to keep costs down:
gcloud container clusters create "${GKE_CLUSTER_NAME}" \
  --project="${GCP_PROJECT}" \
  --region="${GCP_REGION}" \
  --node-locations="${GKE_ZONE_1}" \
  --cluster-version="latest" \
  --machine-type="${SYSTEM_MACHINE_TYPE}" \
  --num-nodes="${SYSTEM_NODE_COUNT}" \
  --disk-type='pd-ssd' --disk-size='20' \
  --image-type='UBUNTU_CONTAINERD' \
  --no-enable-autoupgrade \
  --no-enable-autorepair
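The create command normally blocks until the cluster is provisioned, so an extra status check is only a safeguard, for example when resuming after an interrupted session. A minimal sketch, assuming a hypothetical helper named wait_for_cluster that polls the status field reported by gcloud container clusters describe:

```shell
# Hypothetical helper (an assumption, not part of the guide's script).
# wait_for_cluster NAME [ATTEMPTS] [DELAY_SECONDS]: poll until status is RUNNING.
# Relies on GCP_PROJECT and GCP_REGION being set as described earlier.
wait_for_cluster() {
  name="$1"
  attempts="${2:-30}"
  delay="${3:-10}"
  status="UNKNOWN"
  i=0
  while [ "${i}" -lt "${attempts}" ]; do
    # A failed describe call is treated as an unknown status and retried.
    status="$(gcloud container clusters describe "${name}" \
      --project="${GCP_PROJECT}" \
      --region="${GCP_REGION}" \
      --format='value(status)')" || status="UNKNOWN"
    if [ "${status}" = "RUNNING" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "${delay}"
  done
  echo "cluster ${name} did not reach RUNNING (last status: ${status})" >&2
  return 1
}
```

For example, wait_for_cluster "${GKE_CLUSTER_NAME}" returns successfully as soon as the cluster reports RUNNING, and gives up after the default 30 attempts otherwise.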
Create the dedicated ScyllaDB node pool¶
This pool is dedicated to ScyllaDB workloads. Each node has local NVMe SSDs for storage and the static CPU manager policy enabled via the system config file created earlier. The pool spans 3 zones so each ScyllaDB rack can be placed in a separate zone for fault tolerance:
gcloud container node-pools create 'scyllaclusters' \
  --project="${GCP_PROJECT}" \
  --region="${GCP_REGION}" \
  --cluster="${GKE_CLUSTER_NAME}" \
  --node-locations="${GKE_ZONE_1},${GKE_ZONE_2},${GKE_ZONE_3}" \
  --node-version="latest" \
  --machine-type="${SCYLLA_MACHINE_TYPE}" \
  --num-nodes="${SCYLLA_NODE_COUNT}" \
  --disk-type='pd-ssd' --disk-size='20' \
  --local-nvme-ssd-block="count=${SCYLLA_LOCAL_SSD_COUNT}" \
  --image-type='UBUNTU_CONTAINERD' \
  --system-config-from-file="${TMPDIR}/systemconfig.yaml" \
  --no-enable-autoupgrade \
  --no-enable-autorepair \
  --node-labels='scylla.scylladb.com/node-type=scylla' \
  --node-taints='scylla-operator.scylladb.com/dedicated=scyllaclusters:NoSchedule'
Note
The --local-nvme-ssd-block flag provisions raw NVMe block devices that NodeConfig will configure as a RAID0 array.
Do not use --ephemeral-storage-local-ssd, which makes the SSDs managed by the kubelet and unavailable for direct RAID setup.
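To confirm that the pool was created with the intended settings, the pool description can be compared against the requested values. The sketch below is an illustration under assumptions: pool_field and check_scylla_pool are hypothetical helpers, and the field paths config.kubeletConfig.cpuManagerPolicy and config.localNvmeSsdBlockConfig.localSsdCount are taken from the GKE node pool API, so check them against the output of a plain describe call before relying on them:

```shell
# Hypothetical helpers (assumptions, not part of the guide's script).

# pool_field FIELD: print one field of the 'scyllaclusters' node pool description.
pool_field() {
  gcloud container node-pools describe 'scyllaclusters' \
    --project="${GCP_PROJECT}" \
    --region="${GCP_REGION}" \
    --cluster="${GKE_CLUSTER_NAME}" \
    --format="value($1)"
}

# check_scylla_pool EXPECTED_SSD_COUNT: verify CPU manager policy and local SSD count.
check_scylla_pool() {
  expected_ssds="$1"
  policy="$(pool_field 'config.kubeletConfig.cpuManagerPolicy')"
  ssds="$(pool_field 'config.localNvmeSsdBlockConfig.localSsdCount')"
  if [ "${policy}" != "static" ]; then
    echo "unexpected CPU manager policy: '${policy}'" >&2
    return 1
  fi
  if [ "${ssds}" != "${expected_ssds}" ]; then
    echo "unexpected local SSD count: '${ssds}'" >&2
    return 1
  fi
}
```

A call such as check_scylla_pool "${SCYLLA_LOCAL_SSD_COUNT}" then fails with a short message if either setting differs from what was requested.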
Get cluster credentials¶
gcloud container clusters get-credentials "${GKE_CLUSTER_NAME}" \
  --project="${GCP_PROJECT}" \
  --region="${GCP_REGION}"
Verify connectivity and node readiness:
kubectl get nodes -L scylla.scylladb.com/node-type
Example expected output:
NAME                                             STATUS   ROLES    AGE   VERSION   NODE-TYPE
gke-scylladb-demo-default-pool-xxxxxxxx-xxxx     Ready    <none>   10m   v1.32.1
gke-scylladb-demo-default-pool-xxxxxxxx-xxxx     Ready    <none>   10m   v1.32.1
gke-scylladb-demo-scyllaclusters-xxxxxxxx-xxxx   Ready    <none>   5m    v1.32.1   scylla
gke-scylladb-demo-scyllaclusters-xxxxxxxx-xxxx   Ready    <none>   5m    v1.32.1   scylla
gke-scylladb-demo-scyllaclusters-xxxxxxxx-xxxx   Ready    <none>   5m    v1.32.1   scylla
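The same check can be scripted by counting the labelled nodes whose status is exactly Ready. This is a minimal sketch; count_ready_scylla_nodes is a hypothetical helper, and it assumes the default column layout of kubectl get nodes, where STATUS is the second column:

```shell
# Hypothetical helper (an assumption, not part of the guide's script).
# count_ready_scylla_nodes: print how many nodes carrying the ScyllaDB label
# report a STATUS of exactly "Ready" (cordoned or NotReady nodes are excluded).
count_ready_scylla_nodes() {
  kubectl get nodes -l 'scylla.scylladb.com/node-type=scylla' --no-headers \
    | awk '$2 == "Ready" { n++ } END { print n + 0 }'
}
```

With the default of one node per zone across 3 zones, the printed count should be 3; more generally it should equal SCYLLA_NODE_COUNT multiplied by the number of zones.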
Clean up¶
Delete the GKE cluster (this also deletes all node pools):
gcloud container clusters delete "${GKE_CLUSTER_NAME}" \
  --project="${GCP_PROJECT}" \
  --region="${GCP_REGION}" \
  --quiet
Next steps¶
Follow the Reference deployment: GKE for a complete ScyllaDB deployment on this cluster.