Monitoring / Metrics
========================

As an ARC team initiative we want to investigate Prometheus and Zabbix
as our new monitoring and metrics solutions, by:

 -  Installing a Zabbix server in a VM and hooking up the staging dist-git to it with an agent
 -  Installing Prometheus in our OpenShift cluster and collecting metrics for a selected project in a self-service fashion

Prior POCs/deployments
----------------------

Fabian Arrotin deployed and uses Zabbix in the CentOS infrastructure.
 - https://github.com/CentOS/ansible-role-zabbix-server

Adam Saleh deployed a Prometheus POC for the CoreOS team.
 - https://pagure.io/centos-infra/issue/112

David Kirwan was part of the development team of https://github.com/integr8ly/application-monitoring-operator/ and did some POC work around the Prometheus Pushgateway in the CentOS OpenShift cluster.

Investigation
-------------

In the process we want to be able to answer the questions posed in the latest mailing list thread and, by the end, have a setup that can lead directly into migrating us away from Nagios. The questions (mostly from Kevin):

 -  How can we provision both of them automatically from Ansible?
 -  Can we get Zabbix to pull from Prometheus?
 -  Can Zabbix handle our number of machines?
 -  How flexible is the alerting?