Review nagios SOP

Signed-off-by: Michal Konečný <mkonecny@redhat.com>
Michal Konečný 2021-09-07 14:19:29 +02:00
parent b3bba37f97
commit 09ebb5d459
2 changed files with 12 additions and 10 deletions

View file

@@ -74,7 +74,7 @@
 ** xref:mirrormanager.adoc[MirrorManager Infrastructure - SOP]
 ** xref:mirrormanager-S3-EC2-netblocks.adoc[AWS Mirrors - SOP]
 ** xref:mote.adoc[mote - SOP]
-** xref:nagios.adoc[nagios - SOP in review ]
+** xref:nagios.adoc[Fedora Infrastructure Nagios - SOP]
 ** xref:netapp.adoc[netapp - SOP in review ]
 ** xref:new-hosts.adoc[new-hosts - SOP in review ]
 ** xref:nonhumanaccounts.adoc[nonhumanaccounts - SOP in review ]

View file

@@ -28,23 +28,25 @@ nagios (noc01)::
 The nagios configuration on noc01 should only monitor general host
 statistics ansible status, uptime, apache status (up/down), SSH etc.
 +
-The configurations are found in nagios ansible module:
-ansible/roles/nagios
+The configurations are found in nagios ansible roles:
+* https://pagure.io/fedora-infra/ansible/blob/main/f/roles/nagios_client[ansible/roles/nagios_client]
+* https://pagure.io/fedora-infra/ansible/blob/main/f/roles/nagios_server[ansible/roles/nagios_server]
 nagios-external (noc02)::
 The nagios configuration on noc02 is located outside of our main
 datacenter and should monitor our user websites/applications
 (fedoraproject.org, FAS, PackageDB, Bodhi/Updates).
 +
-The configurations are found in nagios ansible role: roles/nagios
+The configurations are found in nagios ansible roles:
+* https://pagure.io/fedora-infra/ansible/blob/main/f/roles/nagios_client[ansible/roles/nagios_client]
+* https://pagure.io/fedora-infra/ansible/blob/main/f/roles/nagios_server[ansible/roles/nagios_server]
 [NOTE]
-.Note
 ====
 Production and staging instances through SSH: Please make sure you are
 into 'sysadmin' and 'sysadmin-noc' FAS groups before trying to access
 these hosts.
-See SSH Access SOP
+See xref:sshaccess.adoc[SSH Access SOP]
 ====
 === NRPE
@@ -58,7 +60,7 @@ https://assets.nagios.com/downloads/nagioscore/docs/nrpe/NRPE.pdf
 == Understanding the Messages
-=== General:
+=== General
 Nagios notifications are generally easy to read, and follow this
 consistent format:
@@ -70,7 +72,7 @@ consistent format:
 Reading the message will provide extra information on what is wrong.
-=== Disk Space Warning/Critical:
+=== Disk Space Warning/Critical
 Disk space warnings normally include the following information:
@@ -83,5 +85,5 @@ not the inode usage and is a sign that more diskspace is required.
 === Further Reading
-* Ansible SOP
-* Outages SOP
+* xref:ansible.adoc[Ansible SOP]
+* xref:outage.adoc[Outages SOP]