
ScyllaDB Monitoring overview

Architecture

ScyllaDB exposes its metrics in the Prometheus format. ScyllaDB Operator provides the ScyllaDBMonitoring custom resource that allows you to set up a complete monitoring stack for your ScyllaDB clusters based on the following components:

Prometheus for metrics collection and alerting (along with scraping and alerting rules targeting ScyllaDB instances).
Grafana for metrics visualization (with pre-configured dashboards for ScyllaDB).

        flowchart BT

%% Styles
classDef scyllaVibe fill:#57d1e5,stroke:#4f5d75,color:#000;
classDef prometheusVibe fill:#e6512d,stroke:#ff9800,color:#fff;
classDef grafanaVibe fill:#f5a612,stroke:#ff9800,color:#fff;

%% Nodes
SC[ScyllaCluster]
class SC scyllaVibe

SDM[ScyllaDBMonitoring]
class SDM scyllaVibe

Prometheus
class Prometheus prometheusVibe

GrafanaDeployment
class GrafanaDeployment grafanaVibe

ServiceMonitor
class ServiceMonitor prometheusVibe

PrometheusRule
class PrometheusRule prometheusVibe

OtherGrafanaResources
class OtherGrafanaResources grafanaVibe

PrometheusOperator(Prometheus Operator)
class PrometheusOperator prometheusVibe

Operator(👷 Human Operator)

%% Grouping
subgraph K8sCluster[Kubernetes Cluster]
direction TB
MS
SC
PrometheusOperator
Prometheus
end
subgraph MS[Monitoring Stack]
SDM
subgraph PrometheusStack[Prometheus]
ServiceMonitor
PrometheusRule
end

subgraph GrafanaStack[Grafana]
direction LR
GrafanaDeployment[Deployment]
OtherGrafanaResources["`_Other objects..._`"]
end
end

%% Relationships between nodes
SDM -->|selects| SC
SDM -->|creates| PrometheusStack
SDM -->|creates| GrafanaStack
Prometheus -->|scrapes| SC

Operator -->|creates| Prometheus
Operator -->|creates| SDM

PrometheusOperator -->|reconciles| PrometheusStack
PrometheusOperator -->|reconciles| Prometheus

Note

The ScyllaDBMonitoring CRD is still in its v1alpha1 version, yet it is considered stable and ready for production use, with the following caveats:

spec.components.grafana.exposeOptions and spec.components.prometheus.exposeOptions are deprecated and will be removed in the next API version,
the Managed mode is deprecated; the External mode for Prometheus is likely to be the only supported mode in the next API version.

Prometheus

For deploying and/or configuring Prometheus, ScyllaDB Operator relies on Prometheus Operator.

ScyllaDB Operator supports two modes of operation for ScyllaDBMonitoring regarding Prometheus deployment: Managed and External. You can choose the mode that best fits your needs by setting the spec.components.prometheus.mode field in the ScyllaDBMonitoring resource.

Depending on the mode chosen, ScyllaDB Operator may deploy and manage a Prometheus instance for you, or it can be configured to use an existing Prometheus instance (managed by the Prometheus Operator) in your cluster.
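As a rough illustration, the mode could be selected with a fragment like the one below. This is a minimal sketch rather than a complete manifest: the resource name and namespace are hypothetical, and other fields (such as the selection of the ScyllaDB cluster to monitor) are left out, so consult the ScyllaDBMonitoring API reference for the full schema.

```yaml
# Sketch only: a partial ScyllaDBMonitoring resource showing the mode field.
apiVersion: scylla.scylladb.com/v1alpha1
kind: ScyllaDBMonitoring
metadata:
  name: example        # hypothetical name
  namespace: scylla    # hypothetical namespace
spec:
  components:
    prometheus:
      # External: reuse a Prometheus instance you deploy yourself.
      # Managed (deprecated): let ScyllaDB Operator create the Prometheus resource.
      mode: External
```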

The following Prometheus Operator resources are created by ScyllaDB Operator when you deploy a ScyllaDBMonitoring resource:

Prometheus - the Prometheus instance itself (created only in Managed mode; in External mode you provide it yourself).
ServiceMonitor - the resource that defines how to scrape metrics from ScyllaDB nodes.
PrometheusRule - the resource that defines alerting rules for Prometheus.

The Prometheus version used in the deployment is tied to the version of ScyllaDB Operator. You can find the exact version in the config.yaml file under the operator.prometheusVersion key.

External

The External mode plugs into an existing Prometheus instance that is managed by Prometheus Operator (and can therefore be configured through ServiceMonitor resources).

In the External mode, ScyllaDB Operator does not deploy a Prometheus resource; it expects a Prometheus instance to be running in the cluster already. It still creates the ServiceMonitor and PrometheusRule resources, which the existing Prometheus Operator in your cluster uses to configure your Prometheus instance. This mode is useful if you already have a Prometheus instance deployed in your cluster and want to use it to monitor your ScyllaDB clusters. If you don’t have Prometheus deployed in your cluster, you need to deploy one before proceeding.

When using this mode, you need to ensure that the existing Prometheus instance is configured to select the ServiceMonitor and PrometheusRule resources created by ScyllaDB Operator, so that it scrapes the ScyllaDB nodes and loads the alerting rules. Please refer to the Setting up ScyllaDB Monitoring guide for more details.
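For illustration, a Prometheus resource that picks up these objects might look roughly like the sketch below. It relies on Prometheus Operator's selector semantics, where an empty selector matches every object in the Prometheus resource's own namespace. The resource name, namespace, and service account are hypothetical, and the sketch assumes Prometheus runs in the same namespace as the ScyllaDBMonitoring resource and that suitable RBAC already exists. In a shared Prometheus you would more likely use label-based selectors, as described in the setup guide.

```yaml
# Sketch only: an externally managed Prometheus that selects all
# ServiceMonitor and PrometheusRule objects in its own namespace.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus                  # hypothetical name
  namespace: scylla                 # assumed to match the ScyllaDBMonitoring namespace
spec:
  serviceAccountName: prometheus    # hypothetical; needs RBAC to discover scrape targets
  # An empty selector ({}) matches all objects; leaving the field unset matches none.
  serviceMonitorSelector: {}
  ruleSelector: {}
```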

Please note that in this mode, ScyllaDBMonitoring has to be configured so that Grafana can access the Prometheus instance. You can configure Grafana datasources in the spec.components.grafana.datasources field of the ScyllaDBMonitoring resource. Please refer to the ScyllaDBMonitoring API reference for details.

Managed

Note

This mode is deprecated and will be removed in a future version. Please deploy your own Prometheus and use the External mode instead.

In the Managed mode, ScyllaDB Operator deploys a Prometheus instance for you. This is the default mode. When you create a ScyllaDBMonitoring resource, ScyllaDB Operator creates a Prometheus resource (a Prometheus Operator custom resource) in the same namespace as the ScyllaDBMonitoring resource. This Prometheus instance is configured, through ServiceMonitor and PrometheusRule resources, to scrape metrics from the ScyllaDB nodes selected by the ScyllaDBMonitoring resource and to evaluate alerting rules for ScyllaDB.

Grafana

For deploying Grafana, ScyllaDB Operator doesn’t use any third-party operator. Instead, it manages the Grafana deployment directly. It preconfigures Grafana with dashboards from scylla-monitoring.

The Grafana image used in the deployment is tied to the version of ScyllaDB Operator. You can find the exact image in the config.yaml file under the operator.grafanaImage key.

Exposing Grafana

ScyllaDB Operator creates a ClusterIP Service named <scyllaDBMonitoringName>-grafana for each ScyllaDBMonitoring. You can access it from outside the cluster using your preferred method:

Port forwarding with the kubectl port-forward command, for temporary access.
Using Ingress or Gateway API resources (e.g., HTTPRoute) for production access.

You can learn more about exposing Grafana in the Exposing Grafana guide.
