334 commits

Author SHA1 Message Date
Adam Williamson
aa2a002a96 Change how we get the HTML file accessible in fedora_nightlies
Just can't get Apache config Alias to work for some reason, so
let's go with the flow and stick the file in openQA's public
directory. This works!

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-11-21 18:37:03 -08:00
Adam Williamson
efb353bc02 Let's make that IncludeOptional so lab doesn't die
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-11-21 17:47:23 -08:00
Adam Williamson
4851dc8d65 Try and do fedora_nightlies Apache config without breaking openQA
Er, oops. This involves a hack, but at least it doesn't take the
openQA web UI offline.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-11-21 17:43:55 -08:00
Adam Williamson
813bbc4d2a openqa/server: allow group to write to factory dirs
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-11-05 17:16:28 -08:00
Adam Williamson
61251d0b11 More syntax...sigh
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-11-05 16:24:27 -08:00
Adam Williamson
95f062c07a openQA: allow all workers NFS write access, other tweaks
The main goal of these changes is to allow all workers in each
deployment NFS write access to the factory share. This is because
I want to try using os-autoinst's at-job-run-time decompression
of disk images instead of openQA's at-asset-download-time
decompression; it avoids some awkwardness with the asset file
name, and should also actually allow us to drop the decompression
code from openQA I think.

I also rejigged various other things at the same time as they
kinda logically go together. It's mostly cleanups and tweaks to
group variables. I tried to handle more things explicitly with
variables, as it's better for use of these plays outside of
Fedora infra.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-11-05 16:10:32 -08:00
Adam Williamson
be8dc36f7f openqa/worker: sigh restarted not started
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-30 14:36:12 -07:00
Adam Williamson
c2023d5560 openQA: try to make NFS mount changes more robust
On client end, restart mount unit (with daemon-reload) if mount
file changes. On server end, run exportfs -r if export config
file changes.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-30 14:06:07 -07:00
Adam Williamson
2b7a62f232 openqa/dispatcher: use arch filtering instead of custom WANTED
I just enhanced the scheduler code so we can share the stock
WANTED definition (which now includes all arches) between prod
and lab, but filter the arches with a config file setting. This
means we don't have to carry and install a whole custom WANTED
file with the extra arches for lab any more, we just set the
appropriate value in the config file. Also drop some stuff from
the config file that's been useless since we switched to
fedora-messaging.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-29 16:55:11 -07:00
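
The arch filtering described in 2b7a62f232 could look roughly like the Python sketch below. This is only an illustration: the shape of `WANTED`, the `filter_wanted` helper, and the treatment of an empty arch list are assumptions, not the actual scheduler code or config format.

```python
# Hypothetical illustration only: the real WANTED format and the config
# setting name used by the scheduler are not shown in this log.
WANTED = [
    {"match": {"subvariant": "Server", "arch": "x86_64"}},
    {"match": {"subvariant": "Server", "arch": "aarch64"}},
    {"match": {"subvariant": "Workstation", "arch": "ppc64le"}},
]


def filter_wanted(wanted, arches):
    """Return only the entries whose arch is in the configured list.

    Assumption: an empty arch list means "do not filter at all".
    """
    if not arches:
        return list(wanted)
    return [entry for entry in wanted if entry["match"]["arch"] in arches]


# One shared definition, two deployments, each with its own arch setting.
prod = filter_wanted(WANTED, ["x86_64"])
lab = filter_wanted(WANTED, ["aarch64", "ppc64le"])
```

The point of the design is that both deployments carry the same definition and only a single config value differs between them.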
Adam Williamson
6937712e23 openqa/dispatcher: enable Server and Workstation aarch64 images
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-29 15:43:58 -07:00
Adam Williamson
8ba222327d openqa/dispatcher: Drop 32-bit ARM disk image testing again
It's not working (no display output). Not sure why not yet.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-29 15:33:33 -07:00
Adam Williamson
728ae59b6d openqa/dispatcher: try scheduling 32-bit ARM minimal image again
...this time it'll run on an aarch64 host. Let's see if that
works better.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-29 14:10:21 -07:00
Adam Williamson
ef5d044161 openqa/dispatcher: schedule minimal disk image on aarch64
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-29 13:49:36 -07:00
Adam Williamson
d4f34b56f9 openqa/dispatcher: write file after creating directory
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-05 12:40:45 -07:00
Adam Williamson
13f59ad0eb openqa/worker: have swtpm service restart on success
This is because swtpm is designed not to be persistent, it's
sort of tied to a single "system" (VM in this case). We can't
expect an instance will stick around after it's been "used", it
doesn't do that, it exits successfully. So we need to restart it
when that happens.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-09-08 10:56:12 -07:00
Adam Williamson
11177cf2dc Good grief once more for the cheap seats in back
Would someone please fire me

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-09-02 15:24:19 -07:00
Adam Williamson
1afdb241ed openqa/dispatcher: fix that cron script cos I'm a doofus
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-09-02 15:22:42 -07:00
Adam Williamson
f77767ce85 openqa/dispatcher: add cron job to schedule jobs for CoreOS
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-09-02 15:19:26 -07:00
Adam Williamson
a2bef634cf openqa/worker: use include_tasks not import_tasks
Using `when` with `import_tasks` doesn't actually skip the import
entirely, it just imports the tasks and skips them one by one.
Which reads oddly. `include_tasks` is properly dynamic so seems
better here.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-07-24 14:11:21 -07:00
Adam Williamson
d9f5530046 openqa/worker: configure to use 172. IP range not 10.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-07-23 17:27:19 -07:00
Adam Williamson
6b196e70ab openqa/worker: set up swtpm service on tap worker hosts
swtpm is a TPM emulator we want to use for testing Clevis on
IoT (and potentially other things in future). We're implementing
this by having os-autoinst just add the qemu args but expect
swtpm itself to be running already - that's counted as the
sysadmin's responsibility. My approach to this is to have openQA
tap worker hosts also be tpm worker hosts, meaning they run one
instance of swtpm per worker instance (as a systemd service) and
are added to a 'tpm' worker class which tests can use to ensure
they run on a suitably-equipped worker. This sets up all of that.
We need a custom SELinux policy module to allow systemd to run
swtpm - this is blocked by default.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-06-24 16:59:11 -07:00
Adam Williamson
be6a4937ea openqa/worker: revert br0 netmask
os-autoinst *really really* wants it to be this. The helper
service fails if it isn't.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-06-10 17:05:24 -07:00
Adam Williamson
e7e8fbd6d0 openqa/worker: ignore network.service start fail for now
It's failing on the new IAD worker and I can't figure out why.
Let's skip it for now just to get the plays run.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-06-10 12:56:09 -07:00
Adam Williamson
44434ee9fa openqa/worker: tighten netmask for br0 tap bridge
It shouldn't need anything but 10.0.2.*, and hopefully this will
stop it interfering with the rest of the infra network...

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-06-09 16:31:33 -07:00
Adam Williamson
5510c165e8 openqa/dispatcher: drop bool usage
I think I was using it wrong.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-06-05 16:20:38 -07:00
Adam Williamson
c3b87d88d1 openqa/server: allow template dump to fail
It will on first deployment. That's fine.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-06-05 16:05:18 -07:00
Adam Williamson
137c8cc123 openqa: test IoT dvd-ostree on aarch64, reorder file a bit
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-06-05 12:38:35 -07:00
Adam Williamson
0dc0dd6659 openqa: slightly broaden "(N|n)ot a git repository" check
I wrote it as "Not" before, so I presumably saw the message that
way, but today it seems to be "not". Let's just skip the letter.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-06-01 17:09:38 -07:00
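
The "skip the letter" approach from 0dc0dd6659 can be sketched in Python as follows. The function name `is_not_a_git_repo` is hypothetical; the actual check lives in the Ansible role and is not reproduced in this log.

```python
def is_not_a_git_repo(stderr: str) -> bool:
    """Report whether git complained that the directory is not a repository.

    The first letter is left out of the search string so that both
    "Not a git repository" and "not a git repository" match.
    """
    return "ot a git repository" in stderr
```

Matching on the substring without its first letter avoids a case-insensitive comparison and keeps the check to a single plain string test.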
Adam Williamson
c77f42a409 openqa/dispatcher: tweak some conditionals
Use the |bool modifier, and add that check in some places where
we didn't currently have it.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-05-19 12:36:11 -07:00
Adam Williamson
2b6c8be5aa openqa etc: reinstall local Python libs when Python ver changes
In openqa/dispatcher, relvalconsumer and check-compose roles, we
install Python libraries from git checkouts (these are things we
don't really want to package as they change too much). This
enhances those roles so that we check whether pip considers the
libraries to be installed, and install them if it doesn't. The
purpose is to catch when the Python version rolls over on system
upgrade, and reinstall the libraries in that case - I got bitten
by this when upgrading to F32, I forgot to reinstall these libs
for Python 3.8, and it broke things for a couple of days before
I noticed and fixed it manually...

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-05-13 09:58:50 -07:00
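
A rough Python sketch of the "is it installed for this interpreter?" check from 2b6c8be5aa is shown below. Note the substitution: the roles ask pip, whereas this sketch uses the standard library's `importlib.metadata` (available from Python 3.8), which reads the same installed-distribution metadata. The helper name `needs_install` is hypothetical.

```python
from importlib import metadata


def needs_install(dist_name: str) -> bool:
    """Return True if the running interpreter has no metadata for dist_name.

    After a system upgrade moves to a new Python version, libraries that
    were installed for the old version are no longer visible to the new
    interpreter, so this returns True and a reinstall can be triggered.
    """
    try:
        metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return True
    return False
```

A caller would run the install step only when `needs_install(...)` is true, which keeps the task idempotent on hosts where nothing has changed.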
Adam Williamson
a720ccac18 openqa/worker: correct scratchrepo cleanup filename
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-05-06 15:08:54 -07:00
Adam Williamson
1b87504450 openqa/worker: make createhdds git branch to use configurable
So we can test non-master branches on stg easier. May extend this
design to other repos (like the tests...) later.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-05-06 14:43:33 -07:00
Adam Williamson
32f9933aad openqa/server: drop createhdds stuff
This was disabled due to a bug for some time now. Originally I
meant to turn it back on, but now I don't think I do: it makes
more sense to just keep letting the worker hosts handle disk
image building, it doesn't make any sense to have the server do
it for x86_64 but worker hosts do it for other arches. If the
server can't do it *all*, we may as well be consistent across
arches and always have the worker hosts do it.

This does mean that on initial deployment using these plays there
is a time where the server is up and running but any jobs run
that need the base disk images will fail because the worker play
won't have built them yet. But I think that's not a big problem,
and it was already the case for non-x86_64 arches anyhow.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-05-06 14:27:37 -07:00
Adam Williamson
c1adee3cb7 openqa: drop scratch builds, drop hack
Scratch builds are installed now and seem to be working, so on
their way to updates-testing, so we don't need to specify them
here any more. Also drop the hack I put in to get the service
restart handler run.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-04-30 17:03:32 -07:00
Adam Williamson
e9c96f5b4d openqa: really fix the worker service loop this time (I hope)
Thanks mackerman on Freenode...

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-04-30 15:43:26 -07:00
Adam Williamson
6566f6ba3f openqa/worker: try the |int fix for the loop here too
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-04-30 15:33:00 -07:00
Adam Williamson
d9d0048729 openqa/worker: abuse the scratch build stuff to trigger handler
I want this handler to run so I need to abuse something that's
gonna come up 'changed'.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-04-30 15:00:48 -07:00
Adam Williamson
c1b38b5ced openqa/worker: try and fix the service restart handler
It's failing and I don't see why, since I based this right on the
ansible docs. Maybe a |int will help?

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-04-30 15:00:48 -07:00
Adam Williamson
26005bf805 openqa: correct scratch repo config filename
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-04-30 12:38:48 -07:00
Adam Williamson
ba8c7b49ff openqa: create repodata for scratchrepo
Whoops.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-04-30 12:35:00 -07:00
Adam Williamson
255ce6ebad openqa/server: use jq for JSON comparison as json_diff died
Swiped from https://stackoverflow.com/questions/31930041/

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-04-30 12:33:50 -07:00
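
For comparison with the jq approach in 255ce6ebad, a structural JSON comparison can be sketched in Python as below. This is not the jq expression the role uses (that is not shown in this log), and the helper name `json_equal` is hypothetical. In this sketch object key order is ignored but array order is significant.

```python
import json


def json_equal(text_a: str, text_b: str) -> bool:
    """Compare two JSON documents by parsed structure, not by raw text.

    Parsed objects become dicts, so key order and whitespace do not
    matter; parsed arrays become lists, so element order does matter.
    """
    return json.loads(text_a) == json.loads(text_b)
```

Comparing parsed structures rather than raw text means a dump that differs only in formatting or key order is not reported as a change.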
Adam Williamson
bb1525bdef openqa/{server,worker}: enhance package handling
This provides a mechanism for deploying scratch builds, and also
for controlling whether or not to install openQA and os-autoinst
from updates-testing.

I have been doing the scratch build thing for years already, just
manually by ssh'ing into the boxes. This is getting tiring now
we have like 15 worker hosts.

The scratch build mechanism isn't properly idempotent, but fixing
that would be hard and I really only intend to use it transiently
when I'm updating the packages, so I don't think it's worth the
effort.

This also adds a notification for restarting openQA worker
services when the packages or config are updated, and fixes the
worker playbook to enable the last worker service.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-04-30 12:23:57 -07:00
Adam Williamson
22f9b422f6 openqa/worker: fix issues flagged by ansible-lint
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-04-29 17:46:29 -07:00
Adam Williamson
d86a76b4d1 openqa/server: fix issues reported by ansible-lint
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-04-29 17:32:16 -07:00
Adam Williamson
7278d3f451 openqa/dispatcher: fix ansible-lint detected errors
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-04-29 16:49:32 -07:00
Adam Williamson
fbecb70bc1 openqa/dispatcher: Drop 32-bit ARM-on-x86_64 from stg scheduling
As in fedora_openqa, drop this as it's been broken forever and
we are clearly not going to fix it. At some point I'll set up
32-bit-ARM-on-aarch64 testing. Hopefully.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-04-24 21:34:29 +02:00
Adam Williamson
a6b9c5392d openqa/worker: disable aarch64-02 with a special worker class
openqa-aarch64-02.qa is broken in some very mysterious way:
https://pagure.io/fedora-infrastructure/issue/8750
until we can figure that out, this should prevent it picking up
normal jobs, but let us manually target a job at it whenever we
need to for debugging.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-04-24 21:34:26 +02:00
Adam Williamson
793b69e105 openqa/dispatcher: test everything and workstation on ppc64le
As requested by Michel in
https://pagure.io/fedora-qa/fedora_openqa/pull-request/77 (we
shadow changes to that here for $REASONS)

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-04-24 21:34:26 +02:00
Adam Williamson
063548a931 openqa/dispatcher: drop --nodeps from setup.py install
This was an old fedmsg Python 2 vs. Python 3 thing that's no
longer needed.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-04-24 21:34:26 +02:00
Adam Williamson
52d7450a9c openqa, check-compose, relvalconsumer: drop remaining fedmsg bits
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-04-24 21:34:26 +02:00