Commit graph

278 commits

Author SHA1 Message Date
Pavel Raiskup
d0f7c7ca30 copr: use again a deterministic nrpe UID
It was notoriously colliding with other system users like copr-signer
and others.

Revert "copr: test without nrpe_client_uid specified"

This reverts commit 435b71a695.
2022-11-22 10:54:00 +01:00
Pavel Raiskup
435b71a695 copr: test without nrpe_client_uid specified
Revert "copr: define nrpe_client_uid=500"

This reverts commit fa5cd7344c.
2022-11-22 10:41:26 +01:00
Pavel Raiskup
fa5cd7344c copr: define nrpe_client_uid=500 2022-11-22 10:37:15 +01:00
Pavel Raiskup
baa6a0dff0 nagios_client: typo s/null/omit/ 2022-11-22 10:25:54 +01:00
Pavel Raiskup
2627babd44 nagios_client: precreate nrpe client
With a specific UID if {{ nrpe_client_uid }} is defined.
2022-11-22 10:16:14 +01:00
Kevin Fenzi
18eecec303 nagios: adjust redhat.com email check
Right now there's often a backlog due to it not responding to the fedora
side. So, let's bump up the checks a bit so they do not alert all the
time.

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-10-12 12:45:48 -07:00
Aurélien Bompard
8962731dbc Don't use datetime.fromtimestamp yet
Signed-off-by: Aurélien Bompard <aurelien@bompard.org>
2022-05-24 18:37:27 +02:00
Aurélien Bompard
e979a1955e Update the datanommer Nagios check to query datagrepper directly
Signed-off-by: Aurélien Bompard <aurelien@bompard.org>
2022-05-24 16:17:14 +00:00
Pavel Raiskup
120acfb3e7 copr-be: really setup the copr-be storage warning to 12%
The templates got de-synced.
2022-04-23 23:54:23 +02:00
Pavel Raiskup
3186e413d6 nagios/copr: monitor inodes (and one additional volume) 2022-02-08 22:53:43 +01:00
Mikolaj Izdebski
26c38caafa nagios: Remove check for supybot fedmsg plugin
Zodbot no longer has fedmsg plugin installed - supybot-fedmsg package
is not installed on value02 (RHEL 8) and supybot-fedmsg upstream
project on GitHub has been archived.
2021-11-03 22:49:21 +00:00
Jakub Kadlcik
9a8acc79ae nagios: enable disk monitoring for copr instances
I think that / monitoring should work by default just by
setting `nrpe: true` because of

    define service {
      hostgroup_name        all, !mincheckgrp
      service_description   Disk_Space_/
      check_command         check_by_nrpe!check_disk_/
      use                   disktemplate
    }
2021-08-09 11:45:53 +00:00
Pavel Raiskup
29fb33bbb7 copr-be: test remaining results storage space 2021-07-28 13:51:16 +02:00
Pavel Raiskup
92ff0683f5 nrpe: check_disk order (almost) alphabetically
Without this, it was hard to tell if check_disk.cfg.j2 mirrors
nrpe.cfg.j2.
2021-07-28 13:41:26 +02:00
Michael Scherer
3b8504f293 Fix mention of Freenode 2021-07-02 11:17:20 +02:00
seddikalaouiismaili
ac9750d6a0 correct output message for nagios check 2021-06-07 23:48:23 +00:00
d9fc78b0e4 nagios: remove MBSProducer check from mbs-backend 2021-05-21 18:58:14 +00:00
9006cf784e nagios: remove unused check_datanommer_faf 2021-05-21 18:57:09 +00:00
Kevin Fenzi
d890a9fbf4 bugzilla2fedmsg: drop checks against vm as it has moved to openshift
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-05-19 12:00:49 -07:00
Kevin Fenzi
740109a295 nagios_client / check_systemd_units: remove old debugging output
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-03-25 14:25:17 -07:00
Kevin Fenzi
cebb78ed82 nagios_client: the check_systemd_units is in scripts, not script
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-03-25 13:58:20 -07:00
seddikalaouiismaili
eae91f0d2b install nrpe check for systemd units 2021-03-25 20:16:48 +00:00
Pierre-Yves Chibon
a32dabc92e nagios_client: install the pagure systemd checks on all pagure instances
Signed-off-by: Pierre-Yves Chibon <pingou@pingoured.fr>
2021-02-12 12:37:26 +01:00
seddikalaouiismaili
890dd31cb0 script to monitor systemd units on pagure 2021-02-12 11:34:57 +00:00
Pierre-Yves Chibon
65c85dd5ec nagios: Fix the check_supybot_pugin
Signed-off-by: Pierre-Yves Chibon <pingou@pingoured.fr>
2020-10-06 17:03:57 +02:00
Pierre-Yves Chibon
342c056ae4 nagios_client: Fix the check_ipa_replication plugin
It looks like the data it retrieves is in bytes and thus needs to be
decoded into a unicode string so we can use it as a regular string
in our code later.

Fixes https://pagure.io/fedora-infrastructure/issue/9372

Signed-off-by: Pierre-Yves Chibon <pingou@pingoured.fr>
2020-10-06 10:46:45 +02:00
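The decoding step this commit message describes can be sketched as follows. This is a minimal illustration, not the actual check_ipa_replication code: the helper name `to_text`, the sample input, and the choice of UTF-8 are all assumptions.

```python
def to_text(data, encoding="utf-8"):
    """Return `data` as a str, decoding it first if it arrived as bytes.

    Hypothetical helper: under Python 3, data read from a subprocess or a
    network library is often bytes, and str methods and regular expressions
    built from str patterns will not accept it until it is decoded.
    """
    if isinstance(data, bytes):
        # errors="replace" keeps the check from crashing on stray bytes.
        return data.decode(encoding, errors="replace")
    return data


# Hypothetical sample value, only to show the before/after types.
raw = b"replication status: 0"
line = to_text(raw)
print(type(raw).__name__, "->", type(line).__name__)
```

Passing values through such a helper once, right where they are read, lets the rest of the script treat them as regular strings.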
Kevin Fenzi
0c85e1a2e1 nagios / client / check_ipa_replication: 2to3 on the script.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-10-05 16:59:22 -07:00
Pierre-Yves Chibon
f91a80046b Wipe everything that is to do with pdc-backend from our ansible repo
Signed-off-by: Pierre-Yves Chibon <pingou@pingoured.fr>
2020-10-05 18:57:52 +00:00
Mark O'Brien
32330de141 update nagios client checks to py3 2020-10-05 15:17:25 +01:00
Kevin Fenzi
5d03d396e6 nagios / client: fix typo from rsyslog PR
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-10-02 12:00:28 -07:00
seddikalaouiismaili
e785293064 add check for rsyslogd 2020-10-02 18:50:29 +00:00
Kevin Fenzi
bfb55039d6 nagios / client: fix ip address for batcave01 vpn endpoint
We used the 192.168.20 network while migrating from phx2 to iad2, but we
should no longer be using it anywhere. Change it to 192.168.1.41, which
is the current correct IP.

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-10-02 09:06:10 -07:00
Mark O'Brien
d62e9c6e05 nagios file exists in 2 places so updating second file 2020-09-25 16:24:49 +01:00
Stephen Smoogen
8d58708305 remove 10.5.126 ips from nrpe to try and figure out why host was not connecting 2020-09-23 17:08:17 -04:00
a54f002051 nagios: increase threshold for mailqueue check
Signed-off-by: Francois Andrieu <darknao@fedoraproject.org>
2020-08-20 21:01:45 +02:00
655d167750 nagios: fine tune mail queue check on bastion01
Signed-off-by: Francois Andrieu <darknao@fedoraproject.org>
2020-08-20 21:01:16 +02:00
Stephen Smoogen
340c388f99 alter check_postfix_queue to have higher allowances for red hat backlog 2020-08-20 14:57:13 -04:00
Rick Elrod
dcc53bd63b add crl check to nagios + nrpe + facl perms for nrpe
Signed-off-by: Rick Elrod <relrod@redhat.com>
2020-08-06 15:32:09 -05:00
Rick Elrod
e566487ec0 Add crl nextUpdate check script for nagios, but not used anywhere yet
Signed-off-by: Rick Elrod <relrod@redhat.com>
2020-08-06 09:29:53 -05:00
Francois Andrieu
e1b6248a4c nagios: add check_postfix_redhat to bastion01 2020-07-22 19:41:11 +00:00
Francois Andrieu
da5599c3a8 check_postfix_queue.py: fix args type 2020-07-22 19:41:11 +00:00
Francois Andrieu
bf533dbca7 nagios_client: check postfix queue length for all or specific domain (ie. redhat.com) 2020-07-22 19:41:11 +00:00
Francois Andrieu
30d704f486 nagios_client: make check_raid python 2/3 compatible
Signed-off-by: Francois Andrieu <naolwen@gmail.com>
2020-07-09 17:28:42 +02:00
Kevin Fenzi
5a7245bf26 iptables / nagios_client/server: clean up more phx2 ips for iad2
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-06-30 14:51:43 -07:00
Stephen Smoogen
7d5ab8fcfd remove rabbitmq from phx2 2020-06-08 15:37:18 -04:00
Kevin Fenzi
9464ab8090 nagios_client: have clients check batcave01.iad2 not batcave01.phx2 for connectivity.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-06-04 15:11:56 -07:00
Stephen Smoogen
89f91a9642 Clean up nagios to deal with dropped services and that servicegroups can NOT end with a , while every other nagios group can. 2020-05-21 13:22:26 -04:00
Rick Elrod
0135fc1102 nagios: Add script and check for checking that a timestamp within a file is within a delta of now, and then use this for alerting when websites stop building
Signed-off-by: Rick Elrod <relrod@redhat.com>
2020-04-24 21:34:26 +02:00
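The kind of check this commit describes (a timestamp stored in a file must lie within a delta of the current time) might look roughly like the sketch below. It is not the script from the commit: the function names and the file format (a single Unix epoch value) are assumptions. The return codes follow the usual Nagios plugin convention of 0 for OK and 2 for CRITICAL.

```python
import time


def timestamp_is_fresh(path, max_age_seconds, now=None):
    """Return True if the epoch timestamp stored in `path` is within
    `max_age_seconds` of `now` (defaults to the current time).

    Assumption: the file holds one Unix epoch value, e.g. "1587756866".
    """
    if now is None:
        now = time.time()
    with open(path) as handle:
        stamp = float(handle.read().strip())
    return abs(now - stamp) <= max_age_seconds


def check(path, max_age_seconds, now=None):
    """Map the freshness result to a Nagios-style (exit code, message)."""
    if timestamp_is_fresh(path, max_age_seconds, now=now):
        return 0, "OK: timestamp in %s is recent" % path
    return 2, "CRITICAL: timestamp in %s is older than %ss" % (path, max_age_seconds)
```

A wrapper script would print the message and exit with the returned code so that NRPE can report the state.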
Rick Elrod
1a3e03e527 always set perms on nrpe files because sometimes they default badly
Signed-off-by: Rick Elrod <relrod@redhat.com>
2020-04-24 21:34:22 +02:00
Rick Elrod
4e71722d8a nagios/fedmsg: make yesterday's changes remain py2 backwards compatible
Signed-off-by: Rick Elrod <relrod@redhat.com>
2020-04-24 21:34:22 +02:00