Review nagios SOP

Signed-off-by: Michal Konečný <mkonecny@redhat.com>
Michal Konečný 2021-09-07 14:19:29 +02:00
parent b3bba37f97
commit 09ebb5d459
2 changed files with 12 additions and 10 deletions

@@ -28,23 +28,25 @@ nagios (noc01)::
The nagios configuration on noc01 should only monitor general host
statistics ansible status, uptime, apache status (up/down), SSH etc.
+
-The configurations are found in nagios ansible module:
-ansible/roles/nagios
+The configurations are found in nagios ansible roles:
+* https://pagure.io/fedora-infra/ansible/blob/main/f/roles/nagios_client[ansible/roles/nagios_client]
+* https://pagure.io/fedora-infra/ansible/blob/main/f/roles/nagios_server[ansible/roles/nagios_server]
nagios-external (noc02)::
The nagios configuration on noc02 is located outside of our main
datacenter and should monitor our user websites/applications
(fedoraproject.org, FAS, PackageDB, Bodhi/Updates).
+
-The configurations are found in nagios ansible role: roles/nagios
+The configurations are found in nagios ansible roles:
+* https://pagure.io/fedora-infra/ansible/blob/main/f/roles/nagios_client[ansible/roles/nagios_client]
+* https://pagure.io/fedora-infra/ansible/blob/main/f/roles/nagios_server[ansible/roles/nagios_server]
[NOTE]
.Note
====
Production and staging instances through SSH: Please make sure you are
into 'sysadmin' and 'sysadmin-noc' FAS groups before trying to access
these hosts.
-See SSH Access SOP
+See xref:sshaccess.adoc[SSH Access SOP]
====
=== NRPE
@@ -58,7 +60,7 @@ https://assets.nagios.com/downloads/nagioscore/docs/nrpe/NRPE.pdf
== Understanding the Messages
-=== General:
+=== General
Nagios notifications are generally easy to read, and follow this
consistent format:
@@ -70,7 +72,7 @@ consistent format:
Reading the message will provide extra information on what is wrong.
-=== Disk Space Warning/Critical:
+=== Disk Space Warning/Critical
Disk space warnings normally include the following information:
@@ -83,5 +85,5 @@ not the inode usage and is a sign that more diskspace is required.
=== Further Reading
-* Ansible SOP
-* Outages SOP
+* xref:ansible.adoc[Ansible SOP]
+* xref:outage.adoc[Outages SOP]