25 commits

Author SHA1 Message Date
Kevin Fenzi
e3e2cb1d93 odcs: retire service ( infra 12192 )
Time to retire ODCS. ELN is moved off and that was the last thing using
it. Thanks for all the service, ODCS!

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-09-24 18:21:51 +00:00
Adam Williamson
e87cfca492 openQA: whoops, need to change routing keys here too
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2024-02-09 12:33:54 -08:00
Adam Williamson
35260ce9d8 Revert "openQA: trim asset size limits a bit"
This reverts commit cef30714a2.
Turns out the problem was just that asset cleanup wasn't running
because the testresults/images minimization job was blocking it.
2023-07-25 14:20:03 -07:00
Adam Williamson
cef30714a2 openQA: trim asset size limits a bit
prod is running out of disk space. Not sure why this has only
just started to happen, but...the numbers do seem to more or
less add up...

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2023-07-25 11:34:08 -07:00
Adam Williamson
a5c322b4ee More cleanup on the openQA AMQP stuff
nirik and I went around and around a bit today and ended up back
where we started, but with a clearer understanding of where that
is. This explains it a bit better, and makes what's actually
going on in various places clearer with the use of appropriate
shared variables. This should not actually *change* anything at
all when deployed.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2023-06-22 23:21:28 +02:00
Adam Williamson
be953e0be4 Dangit, went too far. Only the scheduler should be set that way
Sigh. Sorry, this stuff is hard to keep straight in my head.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2023-06-21 13:26:16 +02:00
Adam Williamson
b50fa6a477 openqa amqp: fix stg-on-prod queue names
so, this was working before somehow, but it was pretty clearly
wrong. We were using queues owned by openqa.stg on the prod
rabbitmq instance for the cases where the openQA "stg" consumers
need to listen to prod queues. This can only have been working
with an openqa.stg user on prod, which seems wrong. Instead,
these three consumers should do it the way the relval and
relvalami consumers do - use a queue owned by the "openqa" user,
but with a suffix so they have a different queue from the actual
prod queue. The upshot of this is that in the configs, we should
go from:

amqp_url = "amqps://openqa:@rabbitmq.fedoraproject.org/%2Fpubsub"
...
queue = "openqa.stg_scheduler"

- which is weird and I have no idea how it ever worked - to:

amqp_url = "amqps://openqa:@rabbitmq.fedoraproject.org/%2Fpubsub"
...
queue = "openqa_scheduler_stg"

- which seems much more sensible.
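
The naming scheme described above can be sketched in Python. This is only an illustration: the helper `consumer_queue` and its parameters are hypothetical and are not part of the actual configuration. Only the resulting name `openqa_scheduler_stg` comes from the example above.

```python
def consumer_queue(user: str, consumer: str, suffix: str = "") -> str:
    """Build a queue name owned by `user` for the given consumer.

    Hypothetical helper: the queue name is prefixed with the owning
    user's name, and an optional suffix keeps a stg consumer's queue
    distinct from the queue used by the real prod consumer.
    """
    name = f"{user}_{consumer}"
    return f"{name}_{suffix}" if suffix else name


# The stg scheduler listening on prod uses the prod "openqa" user,
# with a "stg" suffix so it gets its own queue.
print(consumer_queue("openqa", "scheduler", "stg"))
```

Deriving the queue name from the username is one way to keep the two in step, so a queue is never named after a user that does not exist on that broker.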

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2023-06-21 13:15:38 +02:00
Adam Williamson
bedeaaa8f7 openqa messaging config - add back a missing leading slash
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2023-06-21 12:36:27 +02:00
Adam Williamson
9953afa06e openqa etc: fix up and improve AMQP messaging configuration
This is triggered by
https://pagure.io/fedora-infrastructure/issue/11375 , but the
changes are rather extensive. Unfortunately, some of the
relevant files got messed up by the alphabetical sort thing that
got run on several group variable files a while ago, so that
confuses the diff a bit - I had to unwind those changes to make
the files readable again in order to make these changes.

Ultimately the goal here is to make the config more consistent
and more functional - the variables used and their names should
be more consistently related to what they're actually *for*,
which I didn't entirely understand when setting this up. So
we have variables for the username being used in each case and
we use that variable where we're referring to the username, for
instance. This should also make the whole thing about the cases
where listeners on the openQA stg/lab instance need to listen
to prod messages clearer, too. It also makes the user creation
clearer by doing it explicitly, just once per user, instead of
haphazardly doing it implicitly through the queue definitions.

And finally it should also actually fix 11375, by giving the
appropriate write permissions to each user.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2023-06-21 12:04:55 +02:00
Adam Williamson
c40ecfef1d openqa: also listen for ODCS state change messages
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2023-06-19 18:12:56 +02:00
Adam Williamson
29395d0c96 openqa: report queued 'results' on prod too
It seems to work OK on staging, so let's try it out on prod and
see how it looks in Bodhi.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2023-03-29 08:47:14 -07:00
Adam Williamson
807c43163c openqa: bump asset size for aarch64
300G doesn't seem like enough when we have candidate composes
alongside Rawhide and Branched, we're getting incomplete tests
because assets are being removed.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-10-25 09:05:52 -07:00
Adam Williamson
06fc914348 openQA: same update asset size for prod and stg
We're turning on Rawhide update testing on prod now (whee) so we
need this.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-07-28 16:13:44 -07:00
Adam Williamson
224e28131d openQA: prepare for prod deployment of latest releases
This unifies prod and stg onto the ways of doing things for the
latest packages, and rejigs the swtpm stuff a bit to tear down
more (we shouldn't need the custom SELinux policy any more).

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-12-06 10:40:33 -08:00
Adam Williamson
fd0d2b5a7e Add openqa_amqp_scheduler_routing_keys back to group_vars
sigh, needs to be here too as it's used from outside of the role
where the default is set. Not sure if there's a better fix for
this.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-11-24 11:05:09 -08:00
Adam Williamson
3e4c3534e5 openqa: switch FCOS scheduling to messages, reduce duplication
This sets us up for scheduling FCOS tests from messages, not
using a cron job. Also reduces some duplication of variables
between openqa-servers-common and the dispatcher role defaults.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-11-24 10:59:01 -08:00
Adam Williamson
113789b93e Add job re-trigger request key to openQA scheduler AMQP keys
We just updated the scheduler code to handle job re-trigger
requests, we need to configure the listener to listen for the
appropriate messages.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-11-19 10:01:42 -08:00
Kevin Fenzi
580cd252c5 Inventory group/host variables: Sort yaml
This was done using yq (
https://mikefarah.gitbook.io/yq/operators/sort-keys )

Doing things this way makes it much easier to see if a variable is set
in a file or if two hosts differ in what variables they set. Hopefully
we can keep things sorted moving forward.

Basically this means just sort a-z anything you add to any host or group
variable and it will be in the right place.

Additionally, this enforces 'normal' indent rules for all the variable
files which we should also try and obey. 2 spaces for first level, 3 for
next, etc. When in doubt you can run yq on it.

This should cause NO actual variable changes, it's all just readability
fixing for humans, ansible parses it exactly the same.
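
As a rough illustration of the idea (not of how yq itself is implemented), sorting keys a-z at every level of already-parsed data might look like the Python sketch below. The function name `sort_keys` and the sample variables are assumptions made up for this example, not taken from the real inventory.

```python
def sort_keys(node):
    """Return a copy of `node` with mapping keys sorted a-z at every level."""
    if isinstance(node, dict):
        return {key: sort_keys(node[key]) for key in sorted(node)}
    if isinstance(node, list):
        # List order is meaningful, so keep it; only sort mappings inside.
        return [sort_keys(item) for item in node]
    return node


# Hypothetical group variables, invented for illustration.
group_vars = {"zeta": 1, "alpha": {"b": 2, "a": [{"y": 0, "x": 1}]}}
print(sort_keys(group_vars))
```

Only the key order changes; the values are the same, which is why a sort like this should not alter what ansible sees.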

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-11-16 13:27:57 -08:00
Adam Williamson
b2f88d916b openQA: trim Fedora group asset size to 500G
600G kinda pushes the limits on prod, and I think 500 should be
enough, let's see how it goes.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-08-12 09:22:19 -07:00
Adam Williamson
a889649c46 openqa: bump asset size allocations a bit
We have more space on the IAD servers, so let's use it.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-04-08 09:24:23 -07:00
Adam Williamson
cd09666b16 Try and fix cacert definitions for openQA lab/stg
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-04-01 17:18:57 -07:00
Nick Bebout
0eae657232 Fix sudo rules for sysadmin-noc and sysadmin-veteran
2021-03-28 20:46:01 -05:00
Nils Philippsen
6fcbc946ee ipa/client: enable for openqa in prod
Signed-off-by: Nils Philippsen <nils@redhat.com>
2021-03-24 13:44:33 +01:00
Nils Philippsen
dbbf94a411 ipa/client: configure global shell access and sudo
Almost global anyway, i.e. inside the VPN.

The ipa/client-based shell access and sudo rules are only effective for
staging right now, the respective playbook bits are masked out for prod.

- Assign Ansible host groups to IPA host groups, the latter don't care
  about 'stg' in the name and use dashes rather than underscores.
- Distill shell access groups from fas_client_groups in group and host
  vars.
- Let all `sysadmin-*` groups in the previous list run anything via sudo
  in the host group (except bastion & batcave).
- Remove `fas_client_groups` from staging host and group vars.
- Remove sudoers from staging host and group vars if only `sysadmin-*`
  groups have shell access.
- Set up `ipa_client_shell_groups` on bastion to be a superset of the
  same on batcave.
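
The first point in the list above (IPA host group names ignore 'stg' and use dashes rather than underscores) can be sketched in Python as follows. The function `ipa_host_group` is hypothetical, the Ansible group names passed to it are assumed for illustration, and the sketch assumes 'stg' always appears as its own underscore-separated part of the name.

```python
def ipa_host_group(ansible_group: str) -> str:
    """Map an Ansible host group name to an IPA host group name.

    Assumption: 'stg' appears as its own underscore-separated part,
    so it can be dropped before joining the rest with dashes.
    """
    parts = [part for part in ansible_group.split("_") if part != "stg"]
    return "-".join(parts)


# Assumed Ansible group names, used only as examples.
print(ipa_host_group("openqa_workers_stg"))
print(ipa_host_group("kernel_qa"))
```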

Newly created IPA host groups:
- autosign
- badges
- basset
- bastion
- batcave
- blockerbugs
- bodhi
- bugzilla2fedmsg
- busgateway
- datagrepper
- dbserver
- dns
- fedimg
- github2fedmsg
- ipa
- kernel-qa
- kerneltest
- kojibuilder
- kojihub
- kojipkgs
- logging
- mailman
- memcached
- mirrormanager
- nagios
- notifs
- oci-registry
- odcs
- openqa
- openqa-workers
- osbs
- packages
- pdc-web
- pkgs
- proxies
- rabbitmq
- releng-compose
- resultsdb
- secondary
- sign-bridge
- sundries
- value
- wiki

Signed-off-by: Nils Philippsen <nils@redhat.com>
2021-02-01 22:23:41 +00:00
Adam Williamson
95f062c07a openQA: allow all workers NFS write access, other tweaks
The main goal of these changes is to allow all workers in each
deployment NFS write access to the factory share. This is because
I want to try using os-autoinst's at-job-run-time decompression
of disk images instead of openQA's at-asset-download-time
decompression; it avoids some awkwardness with the asset file
name, and should also actually allow us to drop the decompression
code from openQA I think.

I also rejigged various other things at the same time as they
kinda logically go together. It's mostly cleanups and tweaks to
group variables. I tried to handle more things explicitly with
variables, as it's better for use of these plays outside of
Fedora infra.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-11-05 16:10:32 -08:00
Renamed from inventory/group_vars/openqa_common