Caution
You're viewing documentation for an unstable version of Scylla Operator. Switch to the latest stable version.
ScyllaDB Monitoring overview
Architecture
ScyllaDB exposes its metrics in the Prometheus format.
Scylla Operator provides the ScyllaDBMonitoring custom resource
that allows you to set up a complete monitoring stack for your ScyllaDB clusters based on the following components:
Prometheus for metrics collection and alerting (along with scraping and alerting rules targeting ScyllaDB instances).
Grafana for metrics visualization (with pre-configured dashboards for ScyllaDB).
flowchart BT
%% Styles
classDef scyllaVibe fill:#57d1e5,stroke:#4f5d75,color:#000;
classDef prometheusVibe fill:#e6512d,stroke:#ff9800,color:#fff;
classDef grafanaVibe fill:#f5a612,stroke:#ff9800,color:#fff;
%% Nodes
SC[ScyllaCluster]
class SC scyllaVibe
SDM[ScyllaDBMonitoring]
class SDM scyllaVibe
Prometheus
class Prometheus prometheusVibe
GrafanaDeployment
class GrafanaDeployment grafanaVibe
ServiceMonitor
class ServiceMonitor prometheusVibe
PrometheusRule
class PrometheusRule prometheusVibe
OtherGrafanaResources
class OtherGrafanaResources grafanaVibe
PrometheusOperator(Prometheus Operator)
class PrometheusOperator prometheusVibe
Operator(👷 Human Operator)
%% Grouping
subgraph K8sCluster[Kubernetes Cluster]
direction TB
MS
SC
PrometheusOperator
Prometheus
end
subgraph MS[Monitoring Stack]
SDM
subgraph PrometheusStack[Prometheus]
ServiceMonitor
PrometheusRule
end
subgraph GrafanaStack[Grafana]
direction LR
GrafanaDeployment[Deployment]
OtherGrafanaResources["`_Other objects..._`"]
end
end
%% Relationships between nodes
SDM -->|selects| SC
SDM -->|creates| PrometheusStack
SDM -->|creates| GrafanaStack
Prometheus -->|scrapes| SC
Operator -->|creates| Prometheus
Operator -->|creates| SDM
PrometheusOperator -->|reconciles| PrometheusStack
PrometheusOperator -->|reconciles| Prometheus
Note
The ScyllaDBMonitoring CRD is still in its v1alpha1 version, yet it is considered stable and ready for production use, with
the following caveats:
spec.components.grafana.exposeOptions and spec.components.prometheus.exposeOptions are deprecated and will be removed in the next API version.
The Managed mode is deprecated; the External mode for Prometheus is likely to be the only supported mode in the next API version.
Prometheus
For deploying and/or configuring Prometheus, Scylla Operator relies on Prometheus Operator.
Scylla Operator supports two modes of operation for ScyllaDBMonitoring regarding Prometheus deployment:
Managed and External. You can choose the mode that best fits your needs by setting the spec.components.prometheus.mode field in the ScyllaDBMonitoring resource.
Depending on the mode chosen, Scylla Operator may deploy and manage a Prometheus instance for you, or it can be configured to use an existing Prometheus instance (managed by the Prometheus Operator) in your cluster.
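As an illustration of where the mode is set, the following is a minimal sketch of a ScyllaDBMonitoring manifest, not a complete or authoritative configuration. The resource name, namespace, and the label used in endpointsSelector are assumptions made for this example; check them against your own ScyllaCluster and the ScyllaDBMonitoring API reference.

```yaml
# Illustrative sketch only; names and labels below are assumptions.
apiVersion: scylla.scylladb.com/v1alpha1
kind: ScyllaDBMonitoring
metadata:
  name: example            # hypothetical name
  namespace: scylla        # hypothetical namespace
spec:
  # Selects the ScyllaDB endpoints to monitor. The label key and value are
  # assumed here; verify them against the labels on your cluster's Services.
  endpointsSelector:
    matchLabels:
      scylla/cluster: my-scylla-cluster   # hypothetical ScyllaCluster name
  components:
    prometheus:
      # "External" reuses an existing Prometheus managed by Prometheus Operator.
      # "Managed" (the default, deprecated) lets Scylla Operator deploy one.
      mode: External
    # In External mode, spec.components.grafana.datasources must also be set
    # so that Grafana can reach Prometheus; its schema is not shown here.
```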
The following Prometheus Operator resources are created by Scylla Operator when you deploy a ScyllaDBMonitoring resource:
Prometheus - the Prometheus instance itself (it may be omitted in External mode).
ServiceMonitor - the resource that defines how to scrape metrics from ScyllaDB nodes.
PrometheusRule - the resource that defines alerting rules for Prometheus.
The Prometheus version used in the deployment is tied to the version of Scylla Operator. You can find the exact version used in the
config.yaml file under the operator.prometheusVersion key.
External
The External mode plugs into an existing Prometheus that is managed by Prometheus Operator (and therefore can be configured by ServiceMonitor).
In the External mode, Scylla Operator does not deploy a Prometheus resource and instead expects a Prometheus instance to be running in the cluster already. It still creates the ServiceMonitor and PrometheusRule resources
that the existing Prometheus Operator in your cluster uses to configure your Prometheus instance. This mode is useful if you already have a Prometheus instance
deployed in your cluster and you want to use it for monitoring your ScyllaDB clusters. If you don’t have Prometheus deployed in your cluster, you need to deploy one before proceeding.
When using this mode, you need to ensure that the existing Prometheus instance is configured to discover (select) the
ServiceMonitor and PrometheusRule resources created by Scylla Operator, so that it scrapes the ScyllaDB targets and loads the alerting rules they define. Please refer to the Setting up ScyllaDB Monitoring guide for more details.
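To make the selection requirement concrete, here is a hedged sketch of a Prometheus resource (Prometheus Operator API) that picks up every ServiceMonitor and PrometheusRule in its own namespace. The name, namespace, and service account are hypothetical, and the service account is assumed to already exist with the RBAC permissions Prometheus needs. In Prometheus Operator, an empty selector ({}) matches all objects, while an unset namespace selector restricts discovery to the Prometheus resource's own namespace. You may prefer a narrower matchLabels selector; the labels that Scylla Operator puts on the resources it creates are not shown here, so inspect them in your cluster first.

```yaml
# Illustrative sketch only; names below are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus         # hypothetical name
  namespace: scylla        # assumed to be the namespace of the ScyllaDBMonitoring
spec:
  # Hypothetical ServiceAccount; it must already exist with suitable RBAC.
  serviceAccountName: prometheus
  # An empty selector matches all ServiceMonitors in this namespace.
  serviceMonitorSelector: {}
  # An empty selector matches all PrometheusRules in this namespace.
  ruleSelector: {}
```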
Please note that in this mode, ScyllaDBMonitoring has to be configured so that Grafana can access the Prometheus instance.
You can configure Grafana datasources in the spec.components.grafana.datasources field of the ScyllaDBMonitoring resource.
Please refer to the ScyllaDBMonitoring API reference for details.
Managed
Note
This mode is deprecated and will be removed in a future version. Instead, please deploy your own Prometheus and use the External mode.
In the Managed mode, Scylla Operator deploys a Prometheus instance for you. This is the default mode.
When you create a ScyllaDBMonitoring resource, Scylla Operator creates a Prometheus
resource (from the Prometheus Operator) in the same namespace as the ScyllaDBMonitoring resource.
This Prometheus instance is configured to scrape metrics from the ScyllaDB nodes selected by the ScyllaDBMonitoring resource and
to evaluate alerting rules for ScyllaDB (using the ServiceMonitor and PrometheusRule CRs).
Grafana
For deploying Grafana, Scylla Operator doesn’t use any third-party operator. Instead, it manages the Grafana deployment directly. It preconfigures Grafana with dashboards from scylla-monitoring.
The Grafana image used in the deployment is tied to the version of Scylla Operator. You can find the exact image used in the
config.yaml file under the operator.grafanaImage key.
Exposing Grafana
Scylla Operator creates a ClusterIP Service named <scyllaDBMonitoringName>-grafana for each ScyllaDBMonitoring.
You can access it from outside the cluster using your preferred method:
Port forwarding with the kubectl port-forward command for temporary access.
Using Ingress or Gateway API resources (e.g., HTTPRoute) for production access.
You can learn more about exposing Grafana in the Exposing Grafana guide.