diff --git a/docs/initiatives.rst b/docs/initiatives.rst
index 87dd0b5..7d1e1bf 100644
--- a/docs/initiatives.rst
+++ b/docs/initiatives.rst
@@ -5,3 +5,4 @@ Initiatives
    :maxdepth: 2
 
    datanommer_datagrepper/index
+   monitoring_metrics/index
diff --git a/docs/monitoring_metrics/index.rst b/docs/monitoring_metrics/index.rst
new file mode 100644
index 0000000..f5315cb
--- /dev/null
+++ b/docs/monitoring_metrics/index.rst
@@ -0,0 +1,29 @@
+Monitoring / Metrics
+========================
+
+As an ARC team initiative, we want to investigate Prometheus and Zabbix
+as our new monitoring and metrics solutions by:
+
+ - Installing a Zabbix server in a VM and hooking up the staging dist-git to it with an agent
+ - Installing Prometheus in our OpenShift and collecting metrics for a selected project in a self-service fashion
+
+Prior POCs/deployments
+----------------------
+
+Fabian Arrotin deployed and uses Zabbix in the CentOS infrastructure.
+ - https://github.com/CentOS/ansible-role-zabbix-server
+
+Adam Saleh has deployed a POC Prometheus instance for the CoreOS team.
+ - https://pagure.io/centos-infra/issue/112
+
+David Kirwan was part of the development team of https://github.com/integr8ly/application-monitoring-operator/ and did some POC work around the Prometheus Pushgateway in the CentOS OpenShift.
+
+Investigation
+-------------
+
+In the process, we want to answer the questions posed in the latest mailing list thread and, by the end, have a setup that can lead directly into migrating us away from Nagios. The questions (mostly from Kevin):
+
+ - How can we provision both of them automatically from Ansible?
+ - Can we get Zabbix to pull from Prometheus?
+ - Can Zabbix handle our number of machines?
+ - How flexible is the alerting?