Review nagios SOP

Signed-off-by: Michal Konečný <mkonecny@redhat.com>
Michal Konečný 2021-09-07 14:19:29 +02:00
parent b3bba37f97
commit 09ebb5d459
2 changed files with 12 additions and 10 deletions

@@ -74,7 +74,7 @@
** xref:mirrormanager.adoc[MirrorManager Infrastructure - SOP]
** xref:mirrormanager-S3-EC2-netblocks.adoc[AWS Mirrors - SOP]
** xref:mote.adoc[mote - SOP]
** xref:nagios.adoc[nagios - SOP in review ]
** xref:nagios.adoc[Fedora Infrastructure Nagios - SOP]
** xref:netapp.adoc[netapp - SOP in review ]
** xref:new-hosts.adoc[new-hosts - SOP in review ]
** xref:nonhumanaccounts.adoc[nonhumanaccounts - SOP in review ]

@@ -28,23 +28,25 @@ nagios (noc01)::
The nagios configuration on noc01 should only monitor general host
statistics: ansible status, uptime, apache status (up/down), SSH, etc.
+
The configurations are found in nagios ansible module:
ansible/roles/nagios
The configurations are found in nagios ansible roles:
* https://pagure.io/fedora-infra/ansible/blob/main/f/roles/nagios_client[ansible/roles/nagios_client]
* https://pagure.io/fedora-infra/ansible/blob/main/f/roles/nagios_server[ansible/roles/nagios_server]
nagios-external (noc02)::
The nagios configuration on noc02 is located outside of our main
datacenter and should monitor our user websites/applications
(fedoraproject.org, FAS, PackageDB, Bodhi/Updates).
+
The configurations are found in nagios ansible role: roles/nagios
The configurations are found in nagios ansible roles:
* https://pagure.io/fedora-infra/ansible/blob/main/f/roles/nagios_client[ansible/roles/nagios_client]
* https://pagure.io/fedora-infra/ansible/blob/main/f/roles/nagios_server[ansible/roles/nagios_server]
[NOTE]
.Note
====
Production and staging instances through SSH: Please make sure you are
in the 'sysadmin' and 'sysadmin-noc' FAS groups before trying to access
these hosts.
See SSH Access SOP
See xref:sshaccess.adoc[SSH Access SOP]
====
=== NRPE
@@ -58,7 +60,7 @@ https://assets.nagios.com/downloads/nagioscore/docs/nrpe/NRPE.pdf
== Understanding the Messages
=== General:
=== General
Nagios notifications are generally easy to read, and follow this
consistent format:
@@ -70,7 +72,7 @@ consistent format:
Reading the message will provide extra information on what is wrong.
=== Disk Space Warning/Critical:
=== Disk Space Warning/Critical
Disk space warnings normally include the following information:
@@ -83,5 +85,5 @@ not the inode usage and is a sign that more diskspace is required.
=== Further Reading
* Ansible SOP
* Outages SOP
* xref:ansible.adoc[Ansible SOP]
* xref:outage.adoc[Outages SOP]