Kevin Fenzi
c88e89d96b
retrace: fix ssl check
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-02-20 15:06:29 -08:00
Kevin Fenzi
b388a003b4
nagios: add checks for ssl certs on fcos and ocp4 endpoints, change to just checking proxy01
Add checks for ssl certs on fcos openshift endpoints.
Add checks for ocp4 wildcard certs.
Change check to only use proxy01/proxy01.stg instead of all proxies.
Ideally we would check all proxies, but in practice that
produces around 70 alerts whenever the cert is about to expire.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-02-02 15:47:23 -08:00
Silvie Chlupova
5011e6a2dc
copr: remove -f follow from nagios check
2022-01-31 11:51:31 +01:00
Silvie Chlupova
db6dc98940
copr: fix nagios service for checking Copr CDN
Fixes: https://pagure.io/fedora-infrastructure/issue/10508
2022-01-31 10:34:43 +01:00
Silvie Chlupova
ba86e27e79
copr: add nagios checks for copr servers
2022-01-21 14:18:05 +01:00
Silvie Chlupova
cb2f805c26
copr: don't check copr servers using nagios for now
2022-01-20 16:35:33 +01:00
Silvie Chlupova
debd3c5b7e
copr: define new command for nagios
We need to use --ssl and also -f follow
2022-01-20 15:26:53 +01:00
Silvie Chlupova
6fa2999dbf
copr: use already existing copr.cfg
2022-01-20 13:23:31 +01:00
Silvie Chlupova
b9fa39f0c8
copr: nagios check for Copr's CDN
Relates: https://pagure.io/fedora-infrastructure/issue/10456
2022-01-04 15:28:24 +01:00
Mikolaj Izdebski
26c38caafa
nagios: Remove check for supybot fedmsg plugin
Zodbot no longer has fedmsg plugin installed - supybot-fedmsg package
is not installed on value02 (RHEL 8) and supybot-fedmsg upstream
project on GitHub has been archived.
2021-11-03 22:49:21 +00:00
Mikolaj Izdebski
a65fa4e1c0
nagios_server: Update hostname where zodbot is running
Zodbot is running on value02 now.
2021-11-03 16:38:34 +01:00
Kevin Fenzi
ec0d18a8b8
nagios: adjust where zodbot announces alerts, zodbot is on value02 now
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-08-22 10:10:10 -07:00
Pavel Raiskup
d2f9b772e9
nagios: move copr-ping to internal
2021-08-10 08:51:55 +02:00
Pavel Raiskup
ff215ea2b9
nagios: external: define copr_* hostgroups
2021-08-09 15:25:19 +02:00
Jakub Kadlcik
9a8acc79ae
nagios: enable disk monitoring for copr instances
I think that / monitoring should work by default just by
setting `nrpe: true` because of

    define service {
        hostgroup_name all, !mincheckgrp
        service_description Disk_Space_/
        check_command check_by_nrpe!check_disk_/
        use disktemplate
    }
2021-08-09 11:45:53 +00:00
Pavel Raiskup
73ba7d25b1
copr-be: fixup copr-ping nagios mapping
2021-08-09 13:34:25 +02:00
Pavel Raiskup
0771b0e4ad
copr-be: install ping nrpe task
2021-08-09 11:59:03 +02:00
Pavel Raiskup
44c172c56e
copr-be: copr-ping
2021-08-05 14:48:20 +02:00
Pavel Raiskup
97e5861ac0
nagios: sync copr-be and copr-be-dev
2021-07-28 23:06:26 +02:00
Pavel Raiskup
eb66378f24
nagios: typo in copr_back => copr_back_aws
2021-07-28 16:20:45 +02:00
Pavel Raiskup
e433a17ffe
nagios: add schlupov, and notify her in copr contactgroup
2021-07-28 14:49:50 +02:00
Pavel Raiskup
9eebd7387c
nagios: add contact for 'praiskup'
2021-07-28 14:14:18 +02:00
Pavel Raiskup
9dd486fac8
Revert "nagios: add me and schlupov to copr contact group"
We need to define the contacts first.
This reverts commit 00b5afa1a9.
2021-07-28 14:08:45 +02:00
Pavel Raiskup
29fb33bbb7
copr-be: test remaining results storage space
2021-07-28 13:51:16 +02:00
Pavel Raiskup
00b5afa1a9
nagios: add me and schlupov to copr contact group
2021-07-28 13:41:30 +02:00
Michael Scherer
3b8504f293
Fix mention of Freenode
2021-07-02 11:17:20 +02:00
9006cf784e
nagios: remove unused check_datanommer_faf
2021-05-21 18:57:09 +00:00
Kevin Fenzi
d890a9fbf4
bugzilla2fedmsg: drop checks against vm as it has moved to openshift
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-05-19 12:00:49 -07:00
Stephen Smoogen
7b43b8049a
Update kevins config so nagios will load since 16x7 no longer exists
Signed-off-by: Stephen Smoogen <smooge@smoogespace.com>
2021-04-28 16:07:43 +00:00
Stephen Smoogen
e5a3fb3a43
Add in a 12x7 versus 16x7 and make some timezones friendlier
Signed-off-by: Stephen Smoogen <smooge@smoogespace.com>
2021-04-28 16:07:43 +00:00
seddikalaouiismaili
890dd31cb0
script to monitor systemd units on pagure
2021-02-12 11:34:57 +00:00
Kevin Fenzi
25ace56df7
pagure.io / nagios: check only that cert is valid for 25 days
We renew letsencrypt certs at 30 days, so checking at 60 is pointless.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-02-02 14:24:07 -08:00
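The reasoning in the commit above can be sketched as a small threshold check. This is only an illustration, not the actual Nagios check: `cert_status` and its parameters are hypothetical names, and it assumes "renew at 30 days" means renewal happens when about 30 days of validity remain.

```python
from datetime import datetime, timedelta, timezone

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def cert_status(not_after, now, warn_days, crit_days=0):
    """Return a Nagios-style status for a certificate expiry time.

    Hypothetical helper: warn when fewer than warn_days of validity
    remain, go critical at or below crit_days.
    """
    remaining = (not_after - now) / timedelta(days=1)
    if remaining <= crit_days:
        return CRITICAL
    if remaining < warn_days:
        return WARNING
    return OK


now = datetime(2021, 2, 2, tzinfo=timezone.utc)

# A healthy cert that is not yet due for renewal (45 days left).
healthy = now + timedelta(days=45)
# A cert whose renewal was missed (20 days left).
overdue = now + timedelta(days=20)

# A 60-day threshold warns on the healthy cert, which is just noise.
status_60_healthy = cert_status(healthy, now, warn_days=60)
# A 25-day threshold stays quiet until renewal has actually been missed.
status_25_healthy = cert_status(healthy, now, warn_days=25)
status_25_overdue = cert_status(overdue, now, warn_days=25)
```

With renewal at 30 days remaining, any threshold above 30 fires during normal operation, while a threshold slightly below 30 only fires when renewal did not happen.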
Kevin Fenzi
a74b4015e7
nagios: contacts
Clean up a bunch of old contacts that no longer are around
or care about getting alerts from our nagios.
Add readme file that notes that this information is public and
people should use a filtered email address for this purpose and avoid
adding sensitive information like phone numbers.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-10-28 11:52:24 -07:00
Kevin Fenzi
71c650baff
nagios / server: drop checking for fas fedmsgs, they likely won't be back
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-10-05 17:21:08 -07:00
Kevin Fenzi
f650eab7ee
nagios_server / fedmsg: pkgs01 does not run any fedmsg-hub anymore.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-10-05 17:00:15 -07:00
Kevin Fenzi
bb61f0da99
nagios / server: don't try and check mincheck group rsyslog
We want to make sure rsyslog is running on hosts, but the mincheck
hosts are ones we don't do any nrpe checks on, so we should exclude
them from this. These are hosts such as builders or aws hosts.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-10-02 12:56:49 -07:00
seddikalaouiismaili
e785293064
add check for rsyslogd
2020-10-02 18:50:29 +00:00
Mark O'Brien
5fe015a90a
nagios server plugins: port to py3
2020-10-02 18:46:32 +00:00
Pierre-Yves Chibon
9506631012
pagure: replace pagure01 by pagure02
Signed-off-by: Pierre-Yves Chibon <pingou@pingoured.fr>
2020-10-01 16:09:14 +02:00
Mark O'Brien
b2073703e5
[nagios] add back in strp accidentally removed
2020-09-25 14:11:10 +00:00
Mark O'Brien
95eb7c75d3
[nagios] port haproxy connections script to py3
2020-09-25 14:11:10 +00:00
Kevin Fenzi
eed8859c64
pdc-backend: clean up last bits of pdc-backend hosts.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-08-24 09:32:04 -07:00
Rick Elrod
dcc53bd63b
add crl check to nagios + nrpe + facl perms for nrpe
Signed-off-by: Rick Elrod <relrod@redhat.com>
2020-08-06 15:32:09 -05:00
Francois Andrieu
e1b6248a4c
nagios: add check_postfix_redhat to bastion01
2020-07-22 19:41:11 +00:00
Kevin Fenzi
349dec197c
nagios_server / irc colorize: 2to3 run to move to python3
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-07-01 16:45:28 -07:00
Kevin Fenzi
9770bae604
nagios_server: use iad2-mgmt-http.cfg
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-06-30 16:13:26 -07:00
Kevin Fenzi
9d9d7f6c5c
nagios_server: more adjustments, drop fas for now, fix gateway hosts harder
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-06-30 15:59:32 -07:00
Kevin Fenzi
88ab378bba
nagios_server: drop phx2_internal stuff, fix mailman01 to use iad2
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-06-30 14:40:14 -07:00
Rick Elrod
d9a23d9930
nagios: nix basset checks for now
Signed-off-by: Rick Elrod <relrod@redhat.com>
2020-06-20 21:01:41 -05:00
Rick Elrod
f6c5bac836
nagios: comment out qahardware ref for now
Signed-off-by: Rick Elrod <relrod@redhat.com>
2020-06-20 21:01:41 -05:00