= Koji Infrastructure SOP

== Contents

* <<_contact_information>>
* <<_description>>
* <<_add_packages_into_buildroot>>
* <<_troubleshooting_and_resolution>>
** <<_kojid_wont_start_or_some_builders_wont_connect>>
** <<_disk_space_issues>>
** <<_checking_builders_are_all_checking_in_correctly>>
** <<_mntkoji_is_not_accessible_on_s390x_builder>>

== Contact Information

Owner::
  Fedora Infrastructure Team
Contact::
  #fedora-admin, sysadmin-build group
Persons::
  mbonnet, dgilmore, f13, notting, mmcgrath, SmootherFrOgZ
Servers::
  * koji.fedoraproject.org
  * buildsys.fedoraproject.org
  * xenbuilder[1-4]
  * hammer1, ppc[1-4]
Purpose::
  Build packages for Fedora.

== Description

Users submit builds to _koji.fedoraproject.org_ or
_buildsys.fedoraproject.org_. From there, each build is passed on to the
builders.

== Add packages into Buildroot

Some contributors need to build packages against freshly built packages
that are not in the buildroot yet. Koji provides override tags, which the
build tag inherits from, so that such packages can be included in the
buildroot. To tag a build into the override tag, run:

....
koji tag-pkg dist-$release-override <package_nvr>
....

== Troubleshooting and Resolution

=== kojid won't start or some builders won't connect

If some builders can connect to koji while others cannot, check that the
database has not run out of connections. This is common when koji crashes
and its database connections are not properly cleared: after the restart,
most of the connection slots are still occupied, so koji cannot
reconnect. To clear the old connections, estimate how long the new koji
instance has been up, pick a number of minutes larger than that, and
kill the queries that are older. The example below uses 40 minutes. From
_db-koji01_ as _postgres_ run:

....
echo "select procpid from pg_stat_activity where usename='koji' and now() - query_start \
>= '00:40:00' order by query_start;" | psql koji | grep "^ " | xargs kill
....
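
On PostgreSQL 9.2 and later, the `procpid` column of `pg_stat_activity` is named `pid`, and stale backends can be ended from SQL with `pg_terminate_backend()` instead of `kill`. The following is a minimal sketch under that assumption. The helper name `stale_koji_sessions_sql` and its minutes argument are illustrative and not part of this SOP.

```shell
# Hypothetical helper: print SQL that terminates koji sessions whose
# current query started at least N minutes ago (default 40).
# Assumes PostgreSQL 9.2+ (pid column in pg_stat_activity).
stale_koji_sessions_sql() {
    minutes="${1:-40}"
    # Accept whole numbers only, so nothing unexpected reaches the SQL text.
    case "$minutes" in
        ''|*[!0-9]*)
            echo "minutes must be a whole number" >&2
            return 1
            ;;
    esac
    cat <<EOF
SELECT pg_terminate_backend(pid)
  FROM pg_stat_activity
 WHERE usename = 'koji'
   AND pid <> pg_backend_pid()
   AND now() - query_start >= interval '${minutes} minutes';
EOF
}

# Review the statement first, then pipe it to psql as postgres:
#   stale_koji_sessions_sql 40
#   stale_koji_sessions_sql 40 | psql koji
```

Printing the statement before running it gives a chance to confirm the time window before any session is terminated.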

=== Disk Space Issues

The builders use a lot of temporary storage. Failed builds and old mock buildroots should be cleaned up automatically. If they are not, remove `/var/lib/mock/*` and restart kojid on the affected builder:

....
systemctl restart kojid
....
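
Before removing anything, it can help to confirm that the filesystem holding the mock directory is actually the one that is full. The following is a minimal sketch using standard `df`. The function names and the 90 percent threshold are illustrative assumptions, not values from this SOP.

```shell
# Hypothetical helper: print the usage percentage (as an integer) of the
# filesystem that holds the given path, based on POSIX-format df output.
fs_usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: succeed when usage is at or above the threshold.
fs_is_full() {
    usage="$(fs_usage_percent "$1")"
    [ -n "$usage" ] || return 2
    [ "$usage" -ge "$2" ]
}

# Example, on a builder:
#   fs_is_full /var/lib/mock 90 && du -sh /var/lib/mock/*
```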

[IMPORTANT]
====
On aarch64 buildhw machines, the files under `/var/log/libvirt/qemu/nvram/` take up a lot of space. All of them can be deleted to free up space.
====

=== Checking builders are all checking in correctly

To check the builders, list them all and filter on the time of the last update. If the builders are checking in correctly, their last update time is close to the current date and time. A command like the following, with the pattern set to the current date and time, therefore leaves only the builders that have not checked in recently:

....
koji list-hosts --enabled | grep -v '04 Dec 2022 12:1'
....
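
The pattern in the example covers a ten-minute window (`12:10` to `12:19`). It could be generated from the current time instead of typed by hand, as in the following minimal sketch. It assumes GNU `date`, and that `koji list-hosts` prints timestamps in the format shown in the example and in the same timezone as the shell. The function name `checkin_pattern` is illustrative.

```shell
# Hypothetical helper: build a grep pattern for the current ten-minute
# window, for example "04 Dec 2022 12:1".
# Optional argument: epoch seconds (defaults to now). Requires GNU date.
checkin_pattern() {
    stamp="$(LC_ALL=C date -d "@${1:-$(date +%s)}" '+%d %b %Y %H:%M')" || return 1
    # Drop the last minute digit so the pattern matches the whole window.
    printf '%s\n' "${stamp%?}"
}

# Example: list enabled builders that have not checked in during this window.
#   koji list-hosts --enabled | grep -v "$(checkin_pattern)"
```

Just after a window boundary (for example at 12:20), builders that checked in a minute earlier will also be listed, so a second run a little later may be needed.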

[IMPORTANT]
====
The kojira process should only run on koji02, never on koji01.
====

=== /mnt/koji is not accessible on s390x builder

After any `s390x` machine in the `[runroot]` group of `inventory/builders` is restarted, its sshfs mounts are not remounted automatically. They need to be mounted manually:

. `mount /mnt/koji`
. `mount /srv/odcs`

[NOTE]
====
The mount commands prompt for a password. You need access to the Bitwarden Vault to obtain it.
====
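
To see which of the two paths still needs mounting after a reboot, a check along the following lines may help. This is a minimal sketch that assumes the `mountpoint` utility from util-linux is installed. The function name `missing_mounts` is illustrative.

```shell
# Hypothetical helper: print each given path that is not currently a mountpoint.
missing_mounts() {
    for path in "$@"; do
        mountpoint -q "$path" 2>/dev/null || printf '%s\n' "$path"
    done
}

# Example, on the s390x builder after a restart:
#   missing_mounts /mnt/koji /srv/odcs
```

An empty result means both paths are mounted. Any path that is printed still needs its `mount` command run.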

=== OSError: [Errno 30] Read-only file system: "/var/tmp/koji/tasks/xxx"

For more information about this issue, see the link:https://bugzilla.redhat.com/show_bug.cgi?id=2312886[relevant Bugzilla ticket].
The error is also reported on the parent task, so make sure you locate the exact task that failed in order to identify the failing host.
Currently, the only way to deal with it is to disable the affected builder in koji and reinstall it.

==== Disabling builder in koji

. Generate a Kerberos ticket
. `koji disable-host <builder_with_issue>`

[NOTE]
====
If you don't have permission to disable the builder with the `koji disable-host` command, SSH to the machine and stop
the kojid service by running `systemctl stop kojid`. This prevents the machine from serving any more builds.
====
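
The two steps above could be wrapped so that the reason for disabling the builder is recorded with it. This is a minimal sketch, assuming a valid Kerberos ticket already exists (for example from `kinit`) and that the koji CLI provides the `disable-host` subcommand with its `--comment` option. The function name and the default reason text are illustrative.

```shell
# Hypothetical wrapper: disable a builder in koji and record why.
# Assumes a valid Kerberos ticket and koji's disable-host --comment option.
disable_builder() {
    host="$1"
    reason="${2:-read-only file system, pending reinstall}"
    if [ -z "$host" ]; then
        echo "usage: disable_builder <builder_hostname> [reason]" >&2
        return 2
    fi
    koji disable-host --comment "$reason" "$host"
}

# Example:
#   disable_builder <builder_with_issue> "Errno 30 on /var/tmp/koji/tasks"
```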

==== Reinstalling the builder

. Run the `groups/buildvm.yml` Ansible playbook with `--limit <builder_hostname>`