Commit graph

1099 commits

Author SHA1 Message Date
Greg Sutcliffe
9f431805ec nagios: Update authorized user lists 2025-03-26 21:16:13 +00:00
Kevin Fenzi
0a986e4f7e nagios / registry: check registry via the actual registry instead of the web page
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2025-02-17 12:19:30 -08:00
Michal Konecny
f63e839698 [nagios-server] Move the datanommer checks to noc01
There were a few fedora-messaging datanommer checks that were running on
busgateway01. As this machine is part of fedmsg it will be
decommissioned. Let's move the checks to noc01.

Signed-off-by: Michal Konecny <mkonecny@redhat.com>
2025-02-14 09:45:39 +00:00
Michal Konecny
6428f8f772 Sunset github2fedmsg and fedmsg
This commit removes all the fedmsg-related content from the ansible
repository.

Signed-off-by: Michal Konecny <mkonecny@redhat.com>
2025-02-13 10:08:51 +00:00
Nick Bebout
cdb7471dfe Remove codeblock (relrod) from nagios 2025-02-11 18:39:05 -06:00
Michal Konecny
2ec055db6f Use first uppercase letter for all handlers
This unifies all the handlers to start with an uppercase letter so that
ansible-lint stops complaining.

I went through all `notify:` occurrences and fixed them by running
```
set TEXT "text_to_replace"; set REPLACEMENT "replacement_text"; git grep -rlz "$TEXT" . | xargs -0 sed -i "s/$TEXT/$REPLACEMENT/g"
```

Then I went through all the changes and removed the ones that weren't
expected to be changed.

Fixes https://pagure.io/fedora-infrastructure/issue/12391

Signed-off-by: Michal Konecny <mkonecny@redhat.com>
2025-02-10 20:31:49 +00:00
Kevin Fenzi
22f3d8832f handlers: more renaming fixes
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2025-01-24 14:06:11 -08:00
47c68f478d ansiblelint fixes - fqcn[action-core] - template to ansible.builtin.template
Replaces references to template: with ansible.builtin.template

Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2025-01-15 11:30:29 +10:00
25391e95b7 ansiblelint fixes - fqcn[action-core] - package to ansible.builtin.package
Replaces many references to package: with ansible.builtin.package

Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2025-01-15 11:28:00 +10:00
462176464b ansiblelint fixes - fqcn[action-core] - command to ansible.builtin.command
Replaces many references to command: with ansible.builtin.command

Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2025-01-15 11:26:47 +10:00
6a3816dfdc ansiblelint fixes - fqcn[action-core] - copy to ansible.builtin.copy
Replaces many references to copy: with ansible.builtin.copy

Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2025-01-15 10:43:31 +10:00
62952df107 ansiblelint fixes - fqcn[action-core] - file to ansible.builtin.file
Replaces many references to file: with ansible.builtin.file

Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2025-01-15 10:41:52 +10:00
691adee6ee Fix name[casing] ansible-lint issues
Fix 1900 failures of the following casing issue:

`name[casing]: All names should start with an uppercase letter.`

Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2025-01-14 20:20:07 +10:00
89f6f1fc32 Fix majority of remaining yamllint warnings and errors
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2024-11-28 17:31:45 +10:00
Kevin Fenzi
ef8a734d69 nagios: also make sure the service is running and enabled
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-11-21 12:53:00 -08:00
Kevin Fenzi
160a909053 noc: install ipmitool as well
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-11-19 13:22:48 -08:00
Michal Konecny
cee2700942 [nagios_server] Add zlopez to list of users who can use commands
I was not able to acknowledge alerts on nagios and this looks like the correct
place to get them.

Signed-off-by: Michal Konecny <mkonecny@redhat.com>
2024-11-04 12:52:38 +01:00
Kevin Fenzi
e3e2cb1d93 odcs: retire service ( infra 12192 )
Time to retire ODCS. ELN is moved off and that was the last thing using
it. Thanks for all the service, ODCS!

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-09-24 18:21:51 +00:00
Jiri Podivin
f513e7cbcd Linting python scripts
Signed-off-by: Jiri Podivin <jpodivin@redhat.com>
2024-09-18 19:57:29 +00:00
James Antill
31de6ced58 nagios: change the monitoring of registry.fedoraproject.org to start at
fedora (skipping fNN/* etc.), so we don't hit limits and fail to see
fedora* images.

Signed-off-by: James Antill <james@and.org>
2024-09-12 19:24:03 +00:00
Kevin Fenzi
0dfa11a6eb fedimg: signing off...
Thanks for all the uploads, fedimg.
You go to a far far better place I'm sure.

There's no point in keeping it around now, as it's actually not working
and the replacement (cloud-image-uploader) should work soon.

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-08-13 16:40:01 -07:00
Kevin Fenzi
d6ecf4c07d virthost-cc-rdu02/rhel7 becomes vmhost-x86-cc02/rhel9
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-08-02 11:53:18 -07:00
Stephen Smoogen
a0397d7abb Add blocks to nagios.conf httpd
I forgot I am the expert on nagios configs so added it to the template
file.

Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
2024-07-09 09:18:56 +00:00
Kevin Fenzi
2397e3fbc4 mirrormanager: remove no longer needed nagios check for frontend
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-07-01 14:37:55 -07:00
Kevin Fenzi
4bcbc54efa people: retire people02
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-06-27 15:38:03 -07:00
James Antill
d7258e320e Add DNF countme nagios checks.
Signed-off-by: James Antill <james@and.org>
2024-06-27 17:35:23 +00:00
Kevin Fenzi
84a7a7afc8 nagios: adjust nrpe for badges vs old fedbadges
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-05-28 13:54:53 -07:00
Kevin Fenzi
71d5c496d4 nagios: fix badges monitoring check in nagios
This changed from 'fedbadges' to 'badges'.

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-05-28 13:07:21 -07:00
Kevin Fenzi
d366194a22 module-build-service (mbs): retire service
With the EOL of Fedora 38 yesterday, we are no longer building any
modules and can retire our module build service.

Note that toddlers needs to be adjusted still, that will happen after
this.

Thanks for all the modules!

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-05-22 13:38:53 -07:00
675f400fdf Add ryanlerch to nagios commands lists
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2024-05-16 10:59:50 +10:00
Kevin Fenzi
e472e0c1b6 noc / badges: remove another old vm monitoring remnant
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-05-06 13:40:44 -07:00
Kevin Fenzi
ce72533001 nagios / badges: remove old fedmsg checks
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-05-06 13:11:59 -07:00
Leo Puvilland
5e59e8c213 add current oncall and recent oncalls to nagios permissions CGI
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2024-04-25 00:17:29 +00:00
Kevin Fenzi
c84b99223c osbs: raise a glass for its service
This removes osbs and almost all its associated playbooks and files.

It served long and well, but we no longer need it.
flatpaks are building with a koji-flatpak plugin.
base/minimal/toolbox containers are building with kiwi.
We aren't building any other containers right now, and if we did, they could
be added to kiwi.

This is the end of an era... I look with nostalgia on
ansible-ansible-openshift-ansible (a role to setup ansible on a control
host and run it from our ansible).

Good bye osbs!

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-03-28 12:52:07 -07:00
Leo Puvilland
daa5e252cc nagios: fix stray ampersand that was breaking the curl command
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2024-02-23 18:51:32 -08:00
Leo Puvilland
fac5a39208 nagios: add parameter for which nagios host is sending the alert
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2024-02-22 00:44:19 +00:00
Kevin Fenzi
f95712d8a0 nagios / koji: drop ssl cert check
This check was from long ago when koji used a self-signed cert/CA.
It still amusingly has that configured, so this check is telling us that
that self-signed cert that we don't use anymore is expiring. :)
So, just drop this; koji is being proxied now and uses our main wildcard
cert.

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-02-13 10:13:48 -08:00
Leo Puvilland
172a57c0cf nagios: remove serviceackauthor from host notifications
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2024-01-24 03:34:52 +00:00
Leo Puvilland
c2b5cf45ac Switch to SERVICESTATE instead of HOSTSTATE in notify.cfg
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2024-01-08 21:59:13 +00:00
Leo Puvilland
18e4f51c61 Make only the nagios group able to execute the matrix-notify script
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2023-12-21 14:46:02 -08:00
Leo Puvilland
00d82f8610 Add matrix-bot to ircbot contactgroup
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2023-12-20 15:35:19 -08:00
Leo Puvilland
11b56e8551 Fix path to Matrix-Notify script
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2023-12-20 10:29:15 -08:00
Leo Puvilland
e04948b31a Fix template file not being copied (matrix-notify script)
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2023-12-19 09:18:16 -08:00
Leo Puvilland
05bff0da9f nagios matrix notify: use full filename for script in role
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2023-12-19 10:13:23 +10:00
Leo Puvilland
48d7982ebf Correct syntax error in nagios role
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2023-12-19 10:08:07 +10:00
Leo Puvilland
5aafc6a1d2 Move nagios notifications to Matrix
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2023-12-18 15:55:30 -08:00
Kevin Fenzi
2524e7c258 nagios: stop trying to monitor start.fedoraproject.org, as it's now under fedoraproject.org/start
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-11-30 14:52:03 -08:00
Kevin Fenzi
f42ce93d85 nagios: remove missed value01 reference
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-11-16 13:54:09 -08:00
Kevin Fenzi
d727ee47ea nagios: remove another old notifs remnant
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-11-15 13:12:36 -08:00
Kevin Fenzi
20dc948173 notifs (old fmn): retire
We are retiring this in favor of the new service.

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-11-15 12:28:28 -08:00