Few typos in prometheus_for_ops

Signed-off-by: Pierre-Yves Chibon <pingou@pingoured.fr>
Pierre-Yves Chibon 2021-04-19 11:55:21 +02:00
parent 1727112869
commit 0788095004


Notes on operator deployment
----------------------------

The operator pattern is often used with Kubernetes and OpenShift for more complex deployments.
Instead of applying all of the configuration to deploy your services, you deploy a special,
smaller service called an operator, which has the necessary permissions to deploy and configure the complex service.
Once the operator is running, instead of configuring the service itself with service-specific config-maps,
you create operator-specific Kubernetes objects, so-called CRDs.
The deployment of the operator in question was done by configuring the CRDs, roles and rolebindings, and the operator setup:
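
For illustration, once such an operator is installed, the service is described through a
custom resource rather than raw deployment objects. A minimal sketch, assuming the upstream
prometheus-operator API (the resource name is illustrative, not the actual deployment)::

  apiVersion: monitoring.coreos.com/v1
  kind: Prometheus
  metadata:
    name: prometheus
    namespace: application-monitoring
  spec:
    replicas: 1
    serviceAccountName: prometheus
    serviceMonitorSelector: {}

The operator watches for objects of this kind and creates the actual statefulset,
services and configuration from them.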
that can serve as persistent storage.
For the persistent volume to work for this purpose, it
**needs to have a POSIX-compliant filesystem**, and the NFS we currently have configured is not.
This is discussed in the `operational aspects <https://prometheus.io/docs/prometheus/latest/storage/#operational-aspects>`_
of the Prometheus documentation.
The easiest supported way to have a POSIX-compliant `filesystem is to setup local-storage <https://docs.openshift.com/container-platform/3.11/install_config/configuring_local.html>`_
in the cluster.
In 4.x versions of OpenShift `there is a local-storage-operator <https://docs.openshift.com/container-platform/4.7/storage/persistent_storage/persistent-storage-local.html>`_ for this purpose.
This is the simplest way to have working persistence, but it prevents us from having multiple instances
across OpenShift nodes, as the pod is using the underlying filesystem on the node.
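
With the local-storage-operator, the disks to use are themselves declared through a CRD.
A minimal sketch following the OpenShift documentation (the device path and names are illustrative)::

  apiVersion: local.storage.openshift.io/v1
  kind: LocalVolume
  metadata:
    name: local-disks
    namespace: openshift-local-storage
  spec:
    storageClassDevices:
      - storageClassName: local-sc
        volumeMode: Filesystem
        fsType: xfs
        devicePaths:
          - /dev/vdb

The operator then creates a persistent volume, backed by the node-local disk, for each listed device.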
To ask the operator to create a persisted Prometheus, you specify it in the operator's configuration, e.g.:
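
A sketch of what such a storage (and retention) section might look like, assuming a
prometheus-operator style spec (the storage class and size are illustrative)::

  spec:
    retention: 24h
    storage:
      volumeClaimTemplate:
        spec:
          storageClassName: local-sc
          resources:
            requests:
              storage: 40Gi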
By default retention is set to 24 hours and can be overridden.

Notes on long term storage
--------------------------

Usually, Prometheus itself is set up to store its metrics for a shorter amount of time,
and it is expected that for long-term storage and analysis there is some other storage solution,
such as InfluxDB or TimescaleDB.
We are currently running a POC that synchronizes Prometheus with TimescaleDB (running on PostgreSQL)
through a middleware service called `promscale <https://github.com/timescale/promscale>`_ .
Promscale just needs access to an appropriate PostgreSQL database,
and can be configured through the PROMSCALE_DB_PASSWORD and PROMSCALE_DB_HOST environment variables.
By default it will ensure the database has TimescaleDB installed and configures the database
automatically.
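
In a Kubernetes deployment this typically means setting the environment on the promscale
container; a sketch (the secret and host names are illustrative)::

  env:
    - name: PROMSCALE_DB_HOST
      value: postgres.application-monitoring.svc
    - name: PROMSCALE_DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: promscale-db
          key: password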
We set up Prometheus with a directive to use the promscale service as a backend:
https://github.com/timescale/promscale
::
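
  # Hypothetical sketch of the Prometheus side: promscale exposes
  # remote-write/remote-read endpoints on port 9201 by default;
  # the service name here is illustrative, not the actual deployment.
  remote_write:
    - url: "http://promscale.application-monitoring.svc:9201/write"
  remote_read:
    - url: "http://promscale.application-monitoring.svc:9201/read"
      read_recent: true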

Notes on auxiliary services
---------------------------

As Prometheus is primarily targeted at collecting metrics from
services that have been instrumented to expose them, if your service is not instrumented,
or it is not a service, i.e. a batch-job, you need an adapter to help you with the metrics collection.
There are two services that help with this.

* `blackbox exporter <https://github.com/prometheus/blackbox_exporter>`_ to monitor services that have not been instrumented, by querying their public API
* `push gateway <https://prometheus.io/docs/practices/pushing/#should-i-be-using-the-pushgateway>`_ that helps collect information from batch-jobs

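A batch-job pushes its metrics to the gateway, and Prometheus then scrapes the gateway like
any other target; a sketch of such a scrape config, assuming the gateway runs as a service
in the cluster (the target address is illustrative)::

  scrape_configs:
    - job_name: pushgateway
      # keep the job/instance labels pushed by the batch-jobs
      # instead of overwriting them with the gateway's own
      honor_labels: true
      static_configs:
        - targets: ['pushgateway.application-monitoring.svc:9091']
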
Maintaining the push-gateway can be relegated to the application developer,
as it is lightweight, and by collecting metrics from the namespace it is running in,
of the prometheus definition:
  name: blackbox
We can then instruct what is to be monitored through the configmap-blackbox; you can find `relevant examples <https://github.com/prometheus/blackbox_exporter/blob/master/example.yml>`_ in the project repo.
Because the blackbox exporter is in the same pod, we need to use the additional-scrape-config to add it in.
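
Such an additional scrape config follows the usual probe pattern from the blackbox exporter
documentation; a sketch (the probed target is illustrative)::

  scrape_configs:
    - job_name: blackbox-http
      metrics_path: /probe
      params:
        module: [http_2xx]
      static_configs:
        - targets:
            - https://example.org
      relabel_configs:
        # pass the real target as the ?target= parameter of the probe
        - source_labels: [__address__]
          target_label: __param_target
        - source_labels: [__param_target]
          target_label: instance
        # scrape the exporter itself, which lives in the same pod
        - target_label: __address__
          replacement: 127.0.0.1:9115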

Notes on alerting
-----------------

manage the forwarding of these alerts.
  serverName: alertmanager-service.application-monitoring.svc
We already have alertmanager running and configured by the alertmanager-operator.
Alertmanager itself is really simplistic, with a simple UI and API that allow for silencing an
alert for a given amount of time.
It is expected that the actual user interaction happens elsewhere,
either through services like OpsGenie, or through e.g. `integration with zabbix <https://devopy.io/setting-up-zabbix-alertmanager-integration/>`_.
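
Such integrations are usually wired in as webhook receivers in the alertmanager
configuration; a sketch (the receiver name, host and port are illustrative)::

  route:
    receiver: zabbix
  receivers:
    - name: zabbix
      webhook_configs:
        - url: 'http://zabbix-alertmanager.application-monitoring.svc:9095/alerts'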
More of a build-it-yourself solution is to use e.g. https://karma-dashboard.io/,