Monitoring / Metrics with Prometheus
====================================

We are using Zabbix 5.0 (LTS) server with a PostgreSQL database. We started with manual
configuration in a test VM and then automated it for deployment; the Ansible roles
`zabbix-server` and `zabbix-agent` are the results of this PoC work. Please follow the
FAQ to see how to access the staging deployment of Zabbix.

zabbix-server
-------------

This role is ready at a basic level, but more work will be needed as the complexity of
the monitoring increases. At the current level, it

- installs the packages needed for the server
- configures the Zabbix, Apache and PostgreSQL configuration files
- configures the web UI
- configures Kerberos authentication

While these basics are good enough for a PoC, the role is not ready for production until
we have done the following:

- add inventory files for groups and users and have zabbix-cli restore them in case
  of a fresh installation
- audit the network configuration (see Common challenges)
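The inventory idea can be prototyped before any zabbix-cli integration exists. The sketch below is only an illustration under assumptions: instead of zabbix-cli it targets the Zabbix 5.0 JSON-RPC API method `usergroup.create`, the inventory layout, group names and token placeholder are hypothetical, users are left out, and nothing is actually sent to a server.

```python
import json

# Hypothetical inventory; in practice this would be loaded from a file kept
# next to the Ansible role. The group names are placeholders.
INVENTORY = {"usergroups": ["sysadmins", "developers"]}


def build_usergroup_requests(inventory, auth_token):
    """Return one JSON-RPC 2.0 request body per user group in the inventory."""
    requests = []
    for request_id, name in enumerate(inventory["usergroups"], start=1):
        requests.append({
            "jsonrpc": "2.0",
            "method": "usergroup.create",  # Zabbix API method for user groups
            "params": {"name": name},
            "auth": auth_token,            # token from a prior user.login call
            "id": request_id,
        })
    return requests


if __name__ == "__main__":
    # Print the request bodies only; posting them to the API is left out.
    for body in build_usergroup_requests(INVENTORY, "<api-token>"):
        print(json.dumps(body))
```

Keeping the inventory as plain data means the same file could later be fed to zabbix-cli or to an Ansible task, whichever the role ends up using.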

zabbix-agent
------------

This role is ready to be used, and the existing templates are good enough to gather basic
information. However, the specifics of what common data should be collected from all
agent nodes need to be discussed widely and set in a template. Besides the common metrics,
one can also export custom metrics using zabbix-sender (see FAQ).
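As a rough illustration of exporting a custom metric, the sketch below builds a `zabbix_sender` command line for a single value. The server name, host name and item key are placeholders rather than values from this deployment, and it assumes a trapper item with that key already exists on the host.

```python
import shlex


def build_sender_command(server, host, key, value, port=10051):
    """Build the argument list for a single-value zabbix_sender call."""
    return [
        "zabbix_sender",
        "-z", server,      # Zabbix server (or proxy) to send to
        "-p", str(port),   # trapper port, 10051 by default
        "-s", host,        # host name as registered in Zabbix
        "-k", key,         # key of a trapper item on that host
        "-o", str(value),  # value to send
    ]


if __name__ == "__main__":
    # All names below are placeholders for illustration only.
    cmd = build_sender_command(
        "zabbix.example.org", "node01", "custom.queue.length", 42
    )
    print(shlex.join(cmd))
    # To actually send the value: subprocess.run(cmd, check=True)
```

Building the arguments as a list rather than a shell string avoids quoting problems when the value contains spaces.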

Common challenges
-----------------

We lack experience with SELinux policies and network configuration, so we are not very
confident in those parts. A veteran sysadmin would be needed to audit them.