From 09ebb5d45963473f06ac337397592834449655f7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michal=20Kone=C4=8Dn=C3=BD?=
Date: Tue, 7 Sep 2021 14:19:29 +0200
Subject: [PATCH] Review nagios SOP
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Michal Konečný
---
 modules/sysadmin_guide/nav.adoc          |  2 +-
 modules/sysadmin_guide/pages/nagios.adoc | 20 +++++++++++---------
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/modules/sysadmin_guide/nav.adoc b/modules/sysadmin_guide/nav.adoc
index b60e44f..710eb13 100644
--- a/modules/sysadmin_guide/nav.adoc
+++ b/modules/sysadmin_guide/nav.adoc
@@ -74,7 +74,7 @@
 ** xref:mirrormanager.adoc[MirrorManager Infrastructure - SOP]
 ** xref:mirrormanager-S3-EC2-netblocks.adoc[AWS Mirrors - SOP]
 ** xref:mote.adoc[mote - SOP]
-** xref:nagios.adoc[nagios - SOP in review ]
+** xref:nagios.adoc[Fedora Infrastructure Nagios - SOP]
 ** xref:netapp.adoc[netapp - SOP in review ]
 ** xref:new-hosts.adoc[new-hosts - SOP in review ]
 ** xref:nonhumanaccounts.adoc[nonhumanaccounts - SOP in review ]
diff --git a/modules/sysadmin_guide/pages/nagios.adoc b/modules/sysadmin_guide/pages/nagios.adoc
index a29c189..2853b4b 100644
--- a/modules/sysadmin_guide/pages/nagios.adoc
+++ b/modules/sysadmin_guide/pages/nagios.adoc
@@ -28,23 +28,25 @@ nagios (noc01)::
 The nagios configuration on noc01 should only monitor general host
 statistics ansible status, uptime, apache status (up/down), SSH etc.
 +
-The configurations are found in nagios ansible module:
-ansible/roles/nagios
+The configurations are found in nagios ansible roles:
+* https://pagure.io/fedora-infra/ansible/blob/main/f/roles/nagios_client[ansible/roles/nagios_client]
+* https://pagure.io/fedora-infra/ansible/blob/main/f/roles/nagios_server[ansible/roles/nagios_server]
 nagios-external (noc02)::
 The nagios configuration on noc02 is located outside of our main
 datacenter and should monitor our user websites/applications
 (fedoraproject.org, FAS, PackageDB, Bodhi/Updates).
 +
-The configurations are found in nagios ansible role: roles/nagios
+The configurations are found in nagios ansible roles:
+* https://pagure.io/fedora-infra/ansible/blob/main/f/roles/nagios_client[ansible/roles/nagios_client]
+* https://pagure.io/fedora-infra/ansible/blob/main/f/roles/nagios_server[ansible/roles/nagios_server]

 [NOTE]
-.Note
 ====
 Production and staging instances through SSH: Please make sure you are
 into 'sysadmin' and 'sysadmin-noc' FAS groups before trying to access
 these hosts.

-See SSH Access SOP
+See xref:sshaccess.adoc[SSH Access SOP]
 ====

 === NRPE
@@ -58,7 +60,7 @@ https://assets.nagios.com/downloads/nagioscore/docs/nrpe/NRPE.pdf

 == Understanding the Messages

-=== General:
+=== General

 Nagios notifications are generally easy to read, and follow this
 consistent format:
@@ -70,7 +72,7 @@ consistent format:

 Reading the message will provide extra information on what is wrong.

-=== Disk Space Warning/Critical:
+=== Disk Space Warning/Critical

 Disk space warnings normally include the following information:

@@ -83,5 +85,5 @@ not the inode usage and is a sign that more diskspace is required.

 === Further Reading

-* Ansible SOP
-* Outages SOP
+* xref:ansible.adoc[Ansible SOP]
+* xref:outage.adoc[Outages SOP]