Version: Next

Collecting metrics using Prometheus

LitmusChaos supports real-time monitoring of chaos events and metrics through its native chaos exporter. These events and metrics can be exported to any time-series database (TSDB), overlaid on application performance graphs, and used to build additional visualizations of chaos testing statistics. To help you configure your monitoring infrastructure for Litmus chaos events and metrics, both service endpoint and service monitor setups are provided.

Before you begin

The following are required before integrating Prometheus in litmus 2.0:

Prometheus deployment with scrape job

Clone the litmus repo

git clone https://github.com/litmuschaos/litmus.git
cd litmus/monitoring

Create monitoring namespace on the cluster

kubectl create ns monitoring

Deploy prometheus components

kubectl -n monitoring apply -f utils/prometheus/prometheus-scrape-configuration/

Deploy metrics exporters

kubectl -n monitoring apply -f utils/metrics-exporters/node-exporter/
kubectl -n monitoring apply -f utils/metrics-exporters/kube-state-metrics/

Deploy chaos-exporter only if the cluster is not connected to the Litmus 2.0 control plane through a Litmus chaos delegate (the exporter is already installed as part of the chaos delegate bundle)

kubectl -n litmus apply -f utils/metrics-exporters/litmus-metrics/chaos-exporter/

Sample scrape job

- job_name: 'chaos-exporter'
  static_configs:
    - targets: ['chaos-exporter.litmus.svc.cluster.local:8080']
  relabel_configs:
    - target_label: instance
      replacement: 'chaos-exporter-service'
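
Before pointing Prometheus at the exporter, it can help to confirm that the exporter is serving metrics. The following is a minimal sketch, not part of the Litmus tooling. It assumes the service name chaos-exporter and port 8080 taken from the scrape target above, the default /metrics path (the scrape job sets no other), and that exported metric names start with litmuschaos_. The function name litmus_metrics is hypothetical.

```shell
# Keep only LitmusChaos sample lines from Prometheus text output read on stdin.
# HELP/TYPE comment lines and metrics from other sources are dropped.
# Assumption: exporter metric names start with "litmuschaos_".
litmus_metrics() {
  grep '^litmuschaos_' || true   # do not fail when nothing matches
}

# Usage (needs cluster access, so it is left commented out):
#   kubectl -n litmus port-forward svc/chaos-exporter 8080:8080 &
#   curl -s http://localhost:8080/metrics | litmus_metrics
```

An empty result usually means no chaos experiment has produced results yet, or that the exporter is running in a different namespace than assumed here.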

Prometheus operator with service monitor

Clone the litmus repo

git clone https://github.com/litmuschaos/litmus.git
cd litmus/monitoring

Create monitoring namespace on the cluster

kubectl create ns monitoring

Create the operator to instantiate all CRDs

kubectl -n monitoring apply -f utils/prometheus/prometheus-operator/

Deploy monitoring components

kubectl -n monitoring apply -f utils/metrics-exporters-with-service-monitors/node-exporter/
kubectl -n monitoring apply -f utils/metrics-exporters-with-service-monitors/kube-state-metrics/

Deploy chaos-exporter only if the cluster is not connected to the Litmus 2.0 control plane through a Litmus chaos delegate (the exporter is already installed as part of the chaos delegate bundle)

kubectl -n litmus apply -f utils/metrics-exporters-with-service-monitors/litmus-metrics/chaos-exporter/

Deploy prometheus instance and all the service monitors for targets

kubectl -n monitoring apply -f utils/prometheus/prometheus-configuration/

Sample service monitor

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: chaos-exporter
  labels:
    k8s-app: chaos-exporter
  namespace: litmus
spec:
  jobLabel: app
  selector:
    matchLabels:
      app: chaos-exporter
  namespaceSelector:
    matchNames:
      - litmus
  endpoints:
    - port: tcp
      interval: 1s
      metricRelabelings:
        - targetLabel: instance
          replacement: 'chaos-exporter-service'

Prometheus community version (helm) - kube-prometheus-stack with pod monitor

Clone the litmus repo

git clone https://github.com/litmuschaos/litmus.git
cd litmus/monitoring

Deploy chaos-exporter only if the cluster is not connected to the Litmus 2.0 control plane through a Litmus chaos delegate (the exporter is already installed as part of the chaos delegate bundle)

kubectl -n litmus apply -f utils/metrics-exporters/litmus-metrics/chaos-exporter/

Create monitoring namespace on the cluster

kubectl create ns monitoring

Install prometheus via helm

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prom prometheus-community/kube-prometheus-stack --namespace monitoring

Create the pod monitor as specified

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: chaos-exporter-monitor
  namespace: monitoring
  labels:
    release: prom  # matches the Helm release name used in the install step
spec:
  selector:
    matchLabels:
      app: chaos-exporter
  namespaceSelector:
    matchNames:
      - litmus
  podMetricsEndpoints:
    - port: tcp
      interval: 1s
      metricRelabelings:
        - targetLabel: instance
          replacement: 'chaos-exporter-service'
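
The step above gives the manifest but no command for creating it. One way to apply it is sketched below, assuming the PodMonitor manifest has been saved locally. The file name chaos-exporter-podmonitor.yaml and the function name apply_pod_monitor are hypothetical; the object name and namespace come from the manifest's metadata.

```shell
# Hypothetical file name: save the PodMonitor manifest under this name first,
# or set POD_MONITOR_FILE to wherever you stored it.
POD_MONITOR_FILE="${POD_MONITOR_FILE:-chaos-exporter-podmonitor.yaml}"

apply_pod_monitor() {
  # Create or update the PodMonitor from the saved manifest,
  # then confirm the object exists in the monitoring namespace.
  kubectl apply -f "$POD_MONITOR_FILE" &&
    kubectl -n monitoring get podmonitor chaos-exporter-monitor
}

# Usage (needs cluster access):
#   apply_pod_monitor
```

If the PodMonitor exists but no target appears in Prometheus, the labels on the PodMonitor are a common place to look, since the Prometheus instance only selects monitors whose labels match its selector.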

Prometheus alertmanager for generating alerts for experiment results

Prometheus alerts can be routed to Alertmanager based on chaos experiment verdicts, probe success percentage, and related metadata carried by the litmuschaos_experiment_verdict metric from chaos-exporter.

Link: https://github.com/litmuschaos/tutorials/issues/6
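
As an illustration of a verdict-based alert, the sketch below writes a small Prometheus rule file. Only the metric name litmuschaos_experiment_verdict comes from this page. The label chaosresult_verdict, its value "Fail", the alert name, the group name, the severity, and the file name are all assumptions to check against the exporter's actual /metrics output before use.

```shell
# Write a hypothetical rule file into the current directory.
cat > chaos-alert-rules.yaml <<'EOF'
groups:
  - name: chaos-experiments
    rules:
      - alert: ChaosExperimentFailed
        # Assumption: the exporter reports the verdict in a label named
        # chaosresult_verdict. Verify the label name in your metrics output.
        expr: litmuschaos_experiment_verdict{chaosresult_verdict="Fail"} == 1
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "A chaos experiment reported a Fail verdict"
EOF
```

How the rules are loaded depends on the setup: a plain Prometheus deployment references the file under rule_files in its configuration, while the operator-based setups expect the same groups content inside a PrometheusRule resource.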

promProbe for prometheus metrics

The promProbe allows users to run Prometheus queries and match the resulting output against specific conditions. The intent behind this probe is to allow users to define metrics-based SLOs in a declarative way and determine the experiment verdict based on its success. The probe runs the query on a Prometheus server defined by the endpoint, and checks whether the output satisfies the specified criteria.
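
To get a feel for what such a check involves, the sketch below runs a query by hand against the Prometheus HTTP API and compares the result with an expected value. It is a manual approximation, not the probe's implementation. The function names, the endpoint, and the query in the usage note are hypothetical, and prom_query_value requires curl and jq.

```shell
# Compare a numeric value against an expected one.
# Usage: check_criteria VALUE CRITERIA EXPECTED
# CRITERIA is one of: == != < <= > >=
check_criteria() {
  awk -v a="$1" -v op="$2" -v b="$3" 'BEGIN {
    a += 0; b += 0                      # force numeric comparison
    if      (op == "==") ok = (a == b)
    else if (op == "!=") ok = (a != b)
    else if (op == "<")  ok = (a <  b)
    else if (op == "<=") ok = (a <= b)
    else if (op == ">")  ok = (a >  b)
    else if (op == ">=") ok = (a >= b)
    else ok = 0                         # unknown operator counts as failure
    exit (ok ? 0 : 1)
  }'
}

# Run an instant query and print the value of the first returned sample.
# Usage: prom_query_value ENDPOINT QUERY
prom_query_value() {
  curl -s -G "$1/api/v1/query" --data-urlencode "query=$2" |
    jq -r '.data.result[0].value[1]'
}

# Usage (hypothetical endpoint and query, needs a reachable Prometheus):
#   v=$(prom_query_value http://localhost:9090 'avg(up)')
#   check_criteria "$v" ">=" 1 && echo "criteria met"
```

Running the query by hand like this is a quick way to validate the endpoint and the query before wiring them into a probe definition.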

Learn more about promProbe in its documentation.
