Commit graph

35470 commits

Author SHA1 Message Date
Pavel Raiskup
d2f9b772e9 nagios: move copr-ping to internal 2021-08-10 08:51:55 +02:00
David Kirwan
bc6c6bfc32 metrics-for-apps: added two temporary worker vms 04,05
Signed-off-by: David Kirwan <dkirwan@redhat.com>
2021-08-10 11:37:36 +09:00
Kevin Fenzi
ecbda7c851 haproxy: add staging ocp cert for api-int
haproxy needs to terminate ssl for the api part of the ocp cluster.
We can't do this in apache without listening for non standard ports and
that could be a mess, so terminate ssl here and talk into the cluster

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-08-09 10:51:13 -07:00
David Kirwan
d78d1070f8 metrics-for-apps: terminate tls for api/api-int in haproxy
Signed-off-by: David Kirwan <dkirwan@redhat.com>
2021-08-09 17:48:38 +00:00
Kevin Fenzi
c3830a9698 db01: do not bother to alert on low swap
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-08-09 08:23:23 -07:00
Pavel Raiskup
ff215ea2b9 nagios: external: define copr_* hostgroups 2021-08-09 15:25:19 +02:00
Pavel Raiskup
8aa8247784 copr-dist-git: add nagios_client role 2021-08-09 13:59:55 +02:00
Pavel Raiskup
f76859775c nagios: pick up copr_external.cfg services 2021-08-09 13:50:30 +02:00
Jakub Kadlcik
9a8acc79ae nagios: enable disk monitoring for copr instances
I think that / monitoring should work by default just by
setting `nrpe: true` because of

    define service {
      hostgroup_name        all, !mincheckgrp
      service_description   Disk_Space_/
      check_command         check_by_nrpe!check_disk_/
      use                   disktemplate
    }
2021-08-09 11:45:53 +00:00
Pavel Raiskup
73ba7d25b1 copr-be: fixup copr-ping nagios mapping 2021-08-09 13:34:25 +02:00
Pavel Raiskup
54c5c85eaa copr-be: copr-ping: fix log destination 2021-08-09 12:21:05 +02:00
Pavel Raiskup
77e072a0b4 copr-be: copr-ping: missing touch 2021-08-09 12:12:39 +02:00
Pavel Raiskup
f4ab3f6999 copr-be: copr-ping: fix variable name clash 2021-08-09 12:09:31 +02:00
Pavel Raiskup
b16a5e3411 copr-be: add forgotten vars file 2021-08-09 12:01:02 +02:00
Pavel Raiskup
0771b0e4ad copr-be: install ping nrpe task 2021-08-09 11:59:03 +02:00
Pavel Raiskup
06f2e4b236 copr-be: copr-ping: use system paths
So NRPE can execute/read those files.  Also update the log file format a
bit.
2021-08-09 11:45:22 +02:00
Kevin Fenzi
a8ffcd6575 download / nagios: Also don't check swap on download servers.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-08-08 11:08:18 -07:00
David Kirwan
d3c44cfc66 metrics-for-apps: revert change gather facts
Signed-off-by: David Kirwan <dkirwan@redhat.com>
2021-08-06 22:57:16 +09:00
David Kirwan
46d54fbb35 metrics-for-apps: Reverting recent changes 2021-08-06 22:45:22 +09:00
David Kirwan
a1a8c5405c metrics-for-apps: adding next-server config for ocp01-03 nodes
Signed-off-by: David Kirwan <dkirwan@redhat.com>
2021-08-06 12:53:06 +09:00
David Kirwan
ae3e5fedc2 metrics-for-apps: dhcp conf ocp controlplane vms
Signed-off-by: David Kirwan <dkirwan@redhat.com>
2021-08-06 12:05:48 +09:00
David Kirwan
3b0ab20ac3 metrics-for-apps: manually setting mac address for controlplane
Signed-off-by: David Kirwan <dkirwan@redhat.com>
2021-08-06 12:02:04 +09:00
Nils Philippsen
f703e7a771 Add and use optimized http log syncing script
The previous one synced all hosts serially and ran rsync for each log
file. This reimplements the shell script in Python, with these changes:

- Run rsync on whole directories of log files, with much reduced
  overhead.
- Use a pool of five workers which process hosts in parallel.

Additionally, remove download-rdu01.vpn.fedoraproject.org from the list
of synced hosts.

Signed-off-by: Nils Philippsen <nils@redhat.com>
2021-08-05 16:44:47 +00:00
Pavel Raiskup
ea1c19d522 copr-be: copr-ping: fix the checker 2021-08-05 16:36:31 +02:00
Pavel Raiskup
29d9c262e2 copr-be: copr-ping: ping checker 2021-08-05 16:33:21 +02:00
Pavel Raiskup
b91a40cf6e copr-be: copr-ping: ping the staging only daily
Not each half-an-hour.
2021-08-05 15:35:21 +02:00
Pavel Raiskup
5e9ee1e814 copr-be: copr-ping: setup both prod/stg instances 2021-08-05 15:32:30 +02:00
Pavel Raiskup
6a582f9ef9 copr-be: copr-ping: always replace the ping script 2021-08-05 15:19:32 +02:00
Pavel Raiskup
cd64609f96 copr-be: copr-ping: I wanted to add a tag there 2021-08-05 15:18:21 +02:00
Pavel Raiskup
616364d1c6 copr-be: copr-ping: new script
Use the cli to submit the build, and dump the exit status to log that
will be later parsed.
2021-08-05 15:17:10 +02:00
Pavel Raiskup
c2c169bb33 copr-be: copr-ping: we don't need quotes for comments 2021-08-05 15:17:10 +02:00
Luca BRUNO
f4266a393e Revert "coreos-cincinnati: build current git (604be79)"
This reverts commit c6dc9f45a3.

Ref: https://pagure.io/fedora-infrastructure/issue/10111#comment-746286
2021-08-05 13:14:00 +00:00
Pavel Raiskup
66f8ef055d copr-be: copr-ping: correct stat check 2021-08-05 14:51:55 +02:00
Pavel Raiskup
463d6f2b8b copr-be: typo in copr-ping trigger 2021-08-05 14:50:11 +02:00
Pavel Raiskup
44c172c56e copr-be: copr-ping 2021-08-05 14:48:20 +02:00
Stephen Smoogen
33df23d457 this will give copies of these emails to asaleh and nils so they can see how the cron jobs are working 2021-08-05 06:46:17 -04:00
Pavel Raiskup
3e0d1eb890 copr-fe: keytab host fix
fatal: [copr-fe-dev.aws.fedoraproject.org ->
ipa01.stg.iad2.fedoraproject.org]: FAILED! => {"changed": false, "msg":
"host_add: coprfe-.stg.fedoraproject.org: invalid 'hostname': invalid
domain-name: only letters, numbers, '-' are allow
2021-08-05 08:39:33 +02:00
Pavel Raiskup
f364113835 copr-fe: include role typo #3 2021-08-05 08:26:56 +02:00
Pavel Raiskup
2e1fec745c copr-fe: typo in include role #2 2021-08-05 08:25:57 +02:00
Pavel Raiskup
76dde7b708 copr-fe: typo in include_role statement 2021-08-05 08:24:32 +02:00
Pavel Raiskup
7849dda43f copr-fe: generate keytab for API 2021-08-05 08:11:13 +02:00
David Kirwan
aee759077e metrics-for-apps: template pxeboot worker nodes
Signed-off-by: David Kirwan <dkirwan@redhat.com>
2021-08-05 12:38:52 +09:00
David Kirwan
37b2427919 metrics-for-apps: hotfix kernel, initrd config
Signed-off-by: David Kirwan <dkirwan@redhat.com>
2021-08-05 12:32:59 +09:00
David Kirwan
90dfd799d1 metrics-for-apps: hotfix, live rootfs url
Signed-off-by: David Kirwan <dkirwan@redhat.com>
2021-08-05 12:04:42 +09:00
Kevin Fenzi
ccbc500d4f fix tag
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-08-04 19:49:57 -07:00
Kevin Fenzi
06745a52a4 add cert to proxies-certificates
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-08-04 19:46:15 -07:00
Kevin Fenzi
b9f0e06735 perhaps a comma is needed here
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-08-04 19:42:57 -07:00
Kevin Fenzi
bd361bc5d3 fix spacing vs quotes issue
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-08-04 19:41:06 -07:00
Kevin Fenzi
1076e00aed add ocp stg wildcard cert and also point api to use it
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-08-04 19:39:28 -07:00
David Kirwan
bdc1da99d8 metrics-for-apps: hotfix installdevice for ocp4 workers
Signed-off-by: David Kirwan <dkirwan@redhat.com>
2021-08-05 11:23:20 +09:00