Added more prometheus documentation
parent ed508e7b9b
commit 646a390c9a
2 changed files with 124 additions and 1 deletion
@@ -10,6 +10,7 @@ This way, the metrics will be scraped into the configured prometheus and correctl
As an example, let's look at ServiceMonitor for bodhi:

::

  apiVersion: monitoring.coreos.com/v1
  kind: ServiceMonitor
  metadata:
@@ -30,6 +31,7 @@ machinery at our disposal, see `Matcher <https://v1-17.docs.kubernetes.io/docs/r
To manage alerting, you can create an alerting rule:

::

  apiVersion: monitoring.coreos.com/v1
  kind: PrometheusRule
  metadata:
@@ -79,3 +81,32 @@ As part of the proof of concept we have instrumented Bodhi application,
to collect data through prometheus_client python library:
https://github.com/fedora-infra/bodhi/pull/4079

Notes on alerting
-----------------

To be notified of alerts, you need to be subscribed to receivers that
have been configured in Alertmanager.

The rules you want to alert on can be configured in the namespace of your application.
For example:

::

  apiVersion: monitoring.coreos.com/v1
  kind: PrometheusRule
  metadata:
    labels:
      monitoring-key: cpe
    name: prometheus-application-monitoring-rules
  spec:
    groups:
    - name: general.rules
      rules:
      - alert: AlertBodhi500Status
        annotations:
          summary: Alerting on too many server errors
        expr: (100*sum(rate(pyramid_request_count{namespace="bodhi", path_info_pattern=~".*[^healthz]", status="500"}[20m]))/sum(rate(pyramid_request_count{namespace="bodhi", path_info_pattern=~".*[^healthz]"}[20m])))>1
        labels:
          severity: high

This rule would alert if more than 1% of responses, measured over 20-minute windows, had a 500 status code.
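The alert expression combines a regex label matcher with a percentage threshold. Both can be checked offline with a rough Python sketch. It assumes that Prometheus regex matchers are fully anchored, which ``re.fullmatch`` emulates here. The helper names and the sample paths (``/updates/``, ``/compose``) are hypothetical, chosen only for illustration. Treating a zero request rate as 0% is a simplification of how PromQL handles missing data.

```python
import re

# Pattern copied verbatim from the rule's path_info_pattern matcher.
PATTERN = r".*[^healthz]"


def label_matches(pattern: str, value: str) -> bool:
    """Emulate a fully anchored regex label matcher with re.fullmatch."""
    return re.fullmatch(pattern, value) is not None


def error_percentage(error_rate: float, total_rate: float) -> float:
    """Percentage of failed requests; 0.0 without traffic (a simplification)."""
    if total_rate == 0:
        return 0.0
    return 100 * error_rate / total_rate


def should_alert(error_rate: float, total_rate: float, threshold: float = 1.0) -> bool:
    """Mirror the '> 1' comparison at the end of the rule expression."""
    return error_percentage(error_rate, total_rate) > threshold


if __name__ == "__main__":
    # Hypothetical path values, used only to show how the matcher behaves.
    for path in ("/updates/", "/healthz", "/compose"):
        print(path, label_matches(PATTERN, path))
    # 2 errors per 100 requests is 2%, which is above the 1% threshold.
    print(should_alert(2.0, 100.0))
```

Note that ``[^healthz]`` is a negated character class, not a negated word: it rejects any path whose last character is one of ``h``, ``e``, ``a``, ``l``, ``t`` or ``z``. So a path such as ``/compose`` would be excluded along with ``/healthz``. If the intent is only to skip the health check, a negative matcher such as ``path_info_pattern!~".*healthz"`` may express it more directly.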