Pavel Raiskup
d0f7c7ca30
copr: use again a deterministic nrpe UID
...
It was notoriously colliding with other system users like copr-signer
and others.
Revert "copr: test without nrpe_client_uid specified"
This reverts commit 435b71a695.
2022-11-22 10:54:00 +01:00
Pavel Raiskup
435b71a695
copr: test without nrpe_client_uid specified
...
Revert "copr: define nrpe_client_uid=500"
This reverts commit fa5cd7344c.
2022-11-22 10:41:26 +01:00
Pavel Raiskup
fa5cd7344c
copr: define nrpe_client_uid=500
2022-11-22 10:37:15 +01:00
Pavel Raiskup
baa6a0dff0
nagios_client: typo s/null/omit/
2022-11-22 10:25:54 +01:00
Pavel Raiskup
2627babd44
nagios_client: precreate nrpe client
...
With a specific UID if {{ nrpe_client_uid }} is defined.
2022-11-22 10:16:14 +01:00
Kevin Fenzi
18eecec303
nagios: adjust redhat.com email check
...
Right now there's often a backlog due to it not responding to the fedora
side. So, let's bump up the checks a bit so they do not alert all the
time.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-10-12 12:45:48 -07:00
Aurélien Bompard
8962731dbc
Don't use datetime.fromtimestamp yet
...
Signed-off-by: Aurélien Bompard <aurelien@bompard.org>
2022-05-24 18:37:27 +02:00
Aurélien Bompard
e979a1955e
Update the datanommer Nagios check to query datagrepper directly
...
Signed-off-by: Aurélien Bompard <aurelien@bompard.org>
2022-05-24 16:17:14 +00:00
Pavel Raiskup
120acfb3e7
copr-be: really setup the copr-be storage warning to 12%
...
The templates got de-synced.
2022-04-23 23:54:23 +02:00
Pavel Raiskup
3186e413d6
nagios/copr: monitor inodes (and one additional volume)
2022-02-08 22:53:43 +01:00
Mikolaj Izdebski
26c38caafa
nagios: Remove check for supybot fedmsg plugin
...
Zodbot no longer has fedmsg plugin installed - supybot-fedmsg package
is not installed on value02 (RHEL 8) and supybot-fedmsg upstream
project on GitHub has been archived.
2021-11-03 22:49:21 +00:00
Jakub Kadlcik
9a8acc79ae
nagios: enable disk monitoring for copr instances
...
I think that / monitoring should work by default just by
setting `nrpe: true` because of
    define service {
        hostgroup_name       all, !mincheckgrp
        service_description  Disk_Space_/
        check_command        check_by_nrpe!check_disk_/
        use                  disktemplate
    }
2021-08-09 11:45:53 +00:00
Pavel Raiskup
29fb33bbb7
copr-be: test remaining results storage space
2021-07-28 13:51:16 +02:00
Pavel Raiskup
92ff0683f5
nrpe: check_disk order (almost) alphabetically
...
Without this, it was hard to tell if check_disk.cfg.j2 mirrors
nrpe.cfg.j2.
2021-07-28 13:41:26 +02:00
Michael Scherer
3b8504f293
Fix mention of Freenode
2021-07-02 11:17:20 +02:00
seddikalaouiismaili
ac9750d6a0
correct output message for nagios check
2021-06-07 23:48:23 +00:00
d9fc78b0e4
nagios: remove MBSProducer check from mbs-backend
2021-05-21 18:58:14 +00:00
9006cf784e
nagios: remove unused check_datanommer_faf
2021-05-21 18:57:09 +00:00
Kevin Fenzi
d890a9fbf4
bugzilla2fedmsg: drop checks against vm as it has moved to openshift
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-05-19 12:00:49 -07:00
Kevin Fenzi
740109a295
nagios_client / check_systemd_units: remove old debugging output
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-03-25 14:25:17 -07:00
Kevin Fenzi
cebb78ed82
nagios_client: the check_systemd_units is in scripts, not script
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-03-25 13:58:20 -07:00
seddikalaouiismaili
eae91f0d2b
install nrpe check for systemd units
2021-03-25 20:16:48 +00:00
Pierre-Yves Chibon
a32dabc92e
nagios_client: install the pagure systemd checks on all pagure instances
...
Signed-off-by: Pierre-Yves Chibon <pingou@pingoured.fr>
2021-02-12 12:37:26 +01:00
seddikalaouiismaili
890dd31cb0
script to monitor systemd units on pagure
2021-02-12 11:34:57 +00:00
Pierre-Yves Chibon
65c85dd5ec
nagios: Fix the check_supybot_plugin
...
Signed-off-by: Pierre-Yves Chibon <pingou@pingoured.fr>
2020-10-06 17:03:57 +02:00
Pierre-Yves Chibon
342c056ae4
nagios_client: Fix the check_ipa_replication plugin
...
It looks like the data it retrieves is in bytes and thus needs to be
decoded into a unicode string so we can use it as a regular string
in our code later.
Fixes https://pagure.io/fedora-infrastructure/issue/9372
Signed-off-by: Pierre-Yves Chibon <pingou@pingoured.fr>
2020-10-06 10:46:45 +02:00
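The bytes-versus-string issue described in the commit above can be illustrated with a minimal sketch. This is not the actual check_ipa_replication code: the helper name `run_and_decode` and the sample command are hypothetical, and UTF-8 is an assumed encoding.

```python
import subprocess
import sys


def run_and_decode(cmd):
    """Run a command and return its stdout as str rather than bytes.

    Hypothetical helper for illustration; the encoding is an assumption.
    """
    raw = subprocess.check_output(cmd)  # bytes on Python 3
    return raw.decode("utf-8", errors="replace")


# Stand-in for the real data source: a child Python process printing a status line.
output = run_and_decode([sys.executable, "-c", "print('replica: in sync')"])

# After decoding, ordinary str operations work; on the raw bytes,
# `"in sync" in raw` would raise TypeError on Python 3.
in_sync = "in sync" in output
```

Decoding once, right where the data is read, keeps the rest of a check free of bytes/str mixing.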
Kevin Fenzi
0c85e1a2e1
nagios / client / check_ipa_replication: 2to3 on the script.
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-10-05 16:59:22 -07:00
Pierre-Yves Chibon
f91a80046b
Wipe everything that is to do with pdc-backend from our ansible repo
...
Signed-off-by: Pierre-Yves Chibon <pingou@pingoured.fr>
2020-10-05 18:57:52 +00:00
Mark O'Brien
32330de141
update nagios client checks to py3
2020-10-05 15:17:25 +01:00
Kevin Fenzi
5d03d396e6
nagios / client: fix typo from rsyslog PR
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-10-02 12:00:28 -07:00
seddikalaouiismaili
e785293064
add check for rsyslogd
2020-10-02 18:50:29 +00:00
Kevin Fenzi
bfb55039d6
nagios / client: fix ip address for batcave01 vpn endpoint
...
We used the 192.168.20 network while migrating from phx2 to iad2, but we
should no longer be using it anywhere. Change it to 192.168.1.41, which
is the current correct IP.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-10-02 09:06:10 -07:00
Mark O'Brien
d62e9c6e05
nagios file exists in 2 places so updating second file
2020-09-25 16:24:49 +01:00
Stephen Smoogen
8d58708305
remove 10.5.126 ips from nrpe to try and figure out why host was not connecting
2020-09-23 17:08:17 -04:00
a54f002051
nagios: increase threshold for mailqueue check
...
Signed-off-by: Francois Andrieu <darknao@fedoraproject.org>
2020-08-20 21:01:45 +02:00
655d167750
nagios: fine tune mail queue check on bastion01
...
Signed-off-by: Francois Andrieu <darknao@fedoraproject.org>
2020-08-20 21:01:16 +02:00
Stephen Smoogen
340c388f99
alter check_postfix_queue to have higher allowances for red hat backlog
2020-08-20 14:57:13 -04:00
Rick Elrod
dcc53bd63b
add crl check to nagios + nrpe + facl perms for nrpe
...
Signed-off-by: Rick Elrod <relrod@redhat.com>
2020-08-06 15:32:09 -05:00
Rick Elrod
e566487ec0
Add crl nextUpdate check script for nagios, but not used anywhere yet
...
Signed-off-by: Rick Elrod <relrod@redhat.com>
2020-08-06 09:29:53 -05:00
Francois Andrieu
e1b6248a4c
nagios: add check_postfix_redhat to bastion01
2020-07-22 19:41:11 +00:00
Francois Andrieu
da5599c3a8
check_postfix_queue.py: fix args type
2020-07-22 19:41:11 +00:00
Francois Andrieu
bf533dbca7
nagios_client: check postfix queue length for all or specific domain (ie. redhat.com)
2020-07-22 19:41:11 +00:00
Francois Andrieu
30d704f486
nagios_client: make check_raid python 2/3 compatible
...
Signed-off-by: Francois Andrieu <naolwen@gmail.com>
2020-07-09 17:28:42 +02:00
Kevin Fenzi
5a7245bf26
iptables / nagios_client/server: clean up more phx2 ips for iad2
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-06-30 14:51:43 -07:00
Stephen Smoogen
7d5ab8fcfd
remove rabbitmq from phx2
2020-06-08 15:37:18 -04:00
Kevin Fenzi
9464ab8090
nagios_client: have clients check batcave01.iad2 not batcave01.phx2 for connectivity.
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-06-04 15:11:56 -07:00
Stephen Smoogen
89f91a9642
Clean up nagios to deal with dropped services and with the fact that servicegroups can NOT end with a comma, while every other nagios group can.
2020-05-21 13:22:26 -04:00
Rick Elrod
0135fc1102
nagios: Add script and check for checking that a timestamp within a file is within a delta of now, and then use this for alerting when websites stop building
...
Signed-off-by: Rick Elrod <relrod@redhat.com>
2020-04-24 21:34:26 +02:00
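The idea of checking that a timestamp stored in a file is within a delta of now can be sketched as follows. This is an illustration only, not the script added by the commit above: the function name `check_timestamp`, the file format (a single Unix epoch value), and the messages are assumptions. The return codes follow the usual Nagios plugin convention (0 = OK, 2 = CRITICAL).

```python
import time

OK, CRITICAL = 0, 2  # conventional Nagios plugin exit codes


def check_timestamp(path, max_delta, now=None):
    """Return (status, message) for the timestamp stored in `path`.

    Assumes the file holds one Unix epoch value in seconds; `now` can be
    injected so the check can be exercised without depending on the clock.
    """
    now = time.time() if now is None else now
    with open(path) as fh:
        stamp = float(fh.read().strip())
    age = abs(now - stamp)
    if age > max_delta:
        return CRITICAL, "CRITICAL: timestamp is %d seconds from now" % age
    return OK, "OK: timestamp is %d seconds from now" % age
```

A wrapper would typically print the message and call `sys.exit(status)`, so that NRPE can report the result back to the Nagios server.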
Rick Elrod
1a3e03e527
always set perms on nrpe files because sometimes they default badly
...
Signed-off-by: Rick Elrod <relrod@redhat.com>
2020-04-24 21:34:22 +02:00
Rick Elrod
4e71722d8a
nagios/fedmsg: make yesterday's changes remain py2 backwards compatible
...
Signed-off-by: Rick Elrod <relrod@redhat.com>
2020-04-24 21:34:22 +02:00