Kevin Fenzi
771d72e12d
resultsdb01: clean up last entries
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-06-27 15:14:12 -07:00
Mikolaj Izdebski
89f28097ce
nagios_server: Update koschei internal website check for ocp4
2022-06-24 17:55:10 +02:00
Kevin Fenzi
0757ae95df
greenwave: change nagios check for ocp4
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-06-15 16:01:01 -07:00
Kevin Fenzi
fcc9d984da
waiverdb / nagios: fix url to ocp4
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-06-15 15:37:38 -07:00
Mark O Brien
91f3d3b0bc
change nagios checks for http-bodhi to only run on ocp4 proxies
Signed-off-by: Mark O Brien <markobri@redhat.com>
2022-06-09 13:17:12 +01:00
Kevin Fenzi
d7a8c7aa57
nagios: only check mote on value01
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-05-25 13:25:00 -07:00
Mark O Brien
75aadffd63
rename proxies_ocp4 hostgroup
Signed-off-by: Mark O Brien <markobri@redhat.com>
2022-05-16 15:08:17 +01:00
Mark O Brien
28db0aa10f
update nagios checks for http-accounts for ocp4 proxies only
Signed-off-by: Mark O Brien <markobri@redhat.com>
2022-05-16 13:59:32 +01:00
Andrew Heath
81aad830e6
Fix typo
2022-04-29 18:58:50 +00:00
Andrew Heath
8795bffd2c
Adding Check for pagure.io per issue 10541
2022-04-29 18:58:50 +00:00
Kevin Fenzi
c88e89d96b
retrace: fix ssl check
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-02-20 15:06:29 -08:00
Kevin Fenzi
467498bb8b
retrace fixes: fix dns to work, add nagios check for ssl cert
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-02-20 13:52:35 -08:00
Kevin Fenzi
2e548a91e6
nagios_server: update what variable nagios templates use for ipv4
We changed eth0_ip and eth0_ipv4 to eth0_ipv4_ip. Update the host
templates.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-02-09 16:03:01 -08:00
Kevin Fenzi
6cd9a57b0b
nagios: adjust hostname for copr-be, it cannot use the alias
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-02-07 13:52:13 -08:00
Silvie Chlupova
dce5318cfc
copr: add nagios check for copr backend
2022-02-07 20:22:45 +00:00
Kevin Fenzi
b388a003b4
nagios: add checks for ssl certs on fcos and ocp4 endpoints, change to just checking proxy01
Add checks for ssl certs on fcos openshift endpoints.
Add checks for ocp4 wildcard certs.
Change check to only use proxy01/proxy01.stg instead of all proxies.
Ideally we really do want to check all proxies, but in practice this
results in like 70 alerts anytime the cert is going to expire.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-02-02 15:47:23 -08:00
Kevin Fenzi
4dda088136
nagios: remove duplicate variable check
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-01-31 10:29:21 -08:00
Silvie Chlupova
194a5503f3
copr: comment define service for copr backend, it doesn't work
2022-01-31 14:13:12 +01:00
Silvie Chlupova
5011e6a2dc
copr: remove -f follow from nagios check
2022-01-31 11:51:31 +01:00
Silvie Chlupova
db6dc98940
copr: fix nagios service for checking Copr CDN
Fixes: https://pagure.io/fedora-infrastructure/issue/10508
2022-01-31 10:34:43 +01:00
Stephen Smoogen
9845cd08be
fix nagios check on download.copr to use check_website_follow_ssl to remove alert
2022-01-21 11:16:55 -05:00
Pavel Raiskup
c9951efa8d
nagios: disable download.copr.fedoraproject.org check again
We don't know what's wrong on that:
HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - string 'Index of /' not found on 'https://download.copr.fedorainfracloud.org:443/ ' - 3692 bytes in 0.631 second response time
2022-01-21 15:29:14 +01:00
Silvie Chlupova
ba86e27e79
copr: add nagios checks for copr servers
2022-01-21 14:18:05 +01:00
Silvie Chlupova
cb2f805c26
copr: don't check copr servers using nagios for now
2022-01-20 16:35:33 +01:00
Pavel Raiskup
f7edb31e43
noc: fixup noc.yaml playbook
Per report:
Error: Could not find any hostgroup matching 'datagrepper'
(config file '/etc/nagios/services/websites.cfg', starting on line 194)"
Follow up for: 726a788721
2022-01-20 15:34:41 +01:00
Silvie Chlupova
debd3c5b7e
copr: define new command for nagios
We need to use --ssl and also -f follow
2022-01-20 15:26:53 +01:00
Silvie Chlupova
6fa2999dbf
copr: use already existing copr.cfg
2022-01-20 13:23:31 +01:00
Silvie Chlupova
8c5dc50c7e
copr: move copr nagios services into separate file
2022-01-20 12:14:48 +01:00
Silvie Chlupova
87e510f378
copr: nagios check for copr frontend, backend and distgit
Fixes: https://pagure.io/copr/copr/issue/2002
2022-01-20 11:47:14 +01:00
Silvie Chlupova
8d9f6e0c4c
copr: nagios check for copr frontend, backend and distgit
Fixes: https://pagure.io/copr/copr/issue/2002
2022-01-20 08:33:23 +00:00
Kevin Fenzi
0f2ae88d63
nagios: add some copr team members
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-11-21 14:43:57 -08:00
49c1616ca7
Update nagios check for accounts.fedoraproject.org
2021-09-29 19:04:41 +00:00
Kevin Fenzi
844177a0ae
nagios: try and specify the additional groups another way
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-09-02 11:25:38 -07:00
Kevin Fenzi
d4ad74ae5e
nagios / vpnclients: fix typo in previous commit
group was used, but ansible needs groups here.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-09-02 10:28:20 -07:00
Stephen Smoogen
2272ab1f6f
Add in a test to make that the nagios templates try to add in groups
with no vpn.
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
2021-08-27 11:05:40 -04:00
Pavel Raiskup
ff215ea2b9
nagios: external: define copr_* hostgroups
2021-08-09 15:25:19 +02:00
Kevin Fenzi
5b1b2c403d
nagios: fix ipsilon check to look for something in the new theme
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-03-24 18:13:37 -07:00
Kevin Fenzi
3caa063699
nagios_server / services: registry is only on proxy01/10
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-12-02 09:57:53 -08:00
09d0a5bde5
Add monitoring to oci-registry api/webui ( Fixes #9231 )
2020-12-01 23:44:22 +01:00
Mark O'Brien
f5288ed317
[nagios] add mobrien as admin
2020-08-24 18:58:48 +01:00
Francois Andrieu
f8ad3f83de
nagios_server: monitoring docs.fedoraproject.org
https://pagure.io/fedora-infrastructure/issue/8956
Signed-off-by: Francois Andrieu <naolwen@gmail.com>
2020-07-30 16:21:17 +00:00
Ernestas Kulik
4473905341
nagios: Add httpd monitoring for resultsdb01
This should help prevent longer outages due to not being monitored.
https://pagure.io/fedora-infrastructure/issue/8494
Signed-off-by: Ernestas Kulik <ekulik@redhat.com>
Signed-off-by: Francois Andrieu <naolwen@gmail.com>
2020-07-10 13:40:18 -07:00
Stephen Smoogen
b0c7013f73
you need an empty line because jinja eats the carriage return and nagios configs can't handle } not being on a line by itself
2020-07-01 18:25:38 -04:00
Stephen Smoogen
6e218c7031
a box not on the vpn has a hard time testing for boxes on the vpn
2020-07-01 18:14:02 -04:00
Stephen Smoogen
28ba173acb
move the dns_external check to using a group variable in the nagios group. This takes it out of the main inventory where its names do not match and this other group was not used in any other playbook
2020-07-01 17:40:02 -04:00
Kevin Fenzi
90c28879f9
nagios_server: Adjust ns01/02 to try and work with nagios external
Also adjust gateway group as ibiblio-gw can't be its own parent.
Also setup vpn hosts also on external as it needs the hostgroup.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-06-30 17:55:12 -07:00
Kevin Fenzi
93b8e0c893
nagios_server: actually define the address for iad2_gw
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-06-30 16:25:11 -07:00
Kevin Fenzi
e7edf9ef55
nagios_server: missed a phx2-gw in mgmt hosts.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-06-30 16:09:39 -07:00
Kevin Fenzi
9d9d7f6c5c
nagios_server: more adjustments, drop fas for now, fix gateway hosts harder
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-06-30 15:59:32 -07:00
Kevin Fenzi
632d4a0273
nagios_server: adjust a bunch more things for iad2.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-06-30 15:39:32 -07:00