= Koji Infrastructure SOP

[NOTE]
.Note
====
We are transitioning from two buildsystems, koji for Fedora and plague
for EPEL, to just using koji. This page documents both.
====

Koji and plague are our buildsystems. They share some of the same
machines to do their work.

== Contents

[arabic]
. Contact Information
. Description
. Add packages into Buildroot
. Troubleshooting and Resolution
[arabic]
.. Restarting Koji
.. kojid won't start or some builders won't connect
.. OOM (Out of Memory) Issues
[arabic]
... Increase Memory
... Decrease weight
.. Disk Space Issues
.. Unmounting: should there be mention of being sure filesystems in chroots are unmounted before you delete the chroots?

== Contact Information

Owner::
  Fedora Infrastructure Team
Contact::
  #fedora-admin, sysadmin-build group
Persons::
  mbonnet, dgilmore, f13, notting, mmcgrath, SmootherFrOgZ
Location::
  Phoenix
Servers::
  * koji.fedoraproject.org
  * buildsys.fedoraproject.org
  * xenbuilder[1-4]
  * hammer1, ppc[1-4]
Purpose::
  Build packages for Fedora.

== Description

Users submit builds to koji.fedoraproject.org or
buildsys.fedoraproject.org. From there, each build is passed on to the
builders.

[IMPORTANT]
.Important
====
At present plague and koji are unaware of each other. As a result, a
builder may become overloaded. An easy fix for this is not clear at this
time.
====

== Add packages into Buildroot

Some contributors may need to build packages against freshly built
packages that are not in the buildroot yet. Koji has override tags that
the build tag inherits from, so tagging a build into the override tag
includes it in the buildroot:

....
koji tag-pkg dist-$release-override <package_nvr>
....

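
The command above can be wrapped in a small helper so the tag name is derived from the release in one place. This is only a sketch: `override_tag`, `tag_override` and the `DRY_RUN` variable are hypothetical names introduced here, and the `dist-<release>-override` pattern is taken from the command above.

```shell
# Build the override tag name for a release, e.g. f8 -> dist-f8-override.
# (Hypothetical helper; the naming pattern comes from the SOP command.)
override_tag() {
    printf 'dist-%s-override\n' "$1"
}

# Tag one build (NVR) into the override tag of a release.
# With DRY_RUN=1 the koji command is printed instead of executed.
tag_override() {
    release=$1
    nvr=$2
    if [ -z "$release" ] || [ -z "$nvr" ]; then
        echo "usage: tag_override <release> <package_nvr>" >&2
        return 2
    fi
    tag=$(override_tag "$release")
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "koji tag-pkg $tag $nvr"
    else
        koji tag-pkg "$tag" "$nvr"
    fi
}
```

Running `DRY_RUN=1 tag_override f8 <package_nvr>` first shows the exact command before anything is tagged.
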

== Troubleshooting and Resolution

=== Restarting Koji

If for some reason koji needs to be restarted, make sure to restart the
koji master first, then the builders. If the koji master has been down
for a short enough time, the builders do not need to be restarted:

....
service httpd restart
service kojira restart
service kojid restart
....

[IMPORTANT]
.Important
====
If postgres is interrupted in some way, koji will need to be
restarted. As long as the koji master daemon gets restarted, the builders
should reconnect automatically. If the db server has been restarted and
the builders don't seem to be building, restart their daemons as well.
====

=== kojid won't start or some builders won't connect

If some builders are able to connect to koji while others are not, make
sure the database has not run out of connections. This is common if koji
crashes and the db connections aren't properly cleared: after the
restart many of the connection slots are still occupied, so koji cannot
reconnect. Clearing old connections is easy: estimate how long the new
koji has been up, pick a number of minutes larger than that, and kill
the queries that are older. From db3 as postgres run:

....
echo "select procpid from pg_stat_activity where usename='koji' and now() - query_start \
>= '00:40:00' order by query_start;" | psql koji | grep "^ " | xargs kill
....

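
Because the right cutoff depends on how long the new koji has been up, it can help to make the number of minutes a parameter and to look at the list of PIDs before killing anything. The sketch below assumes the same table and columns as the command above; `stale_query` is a hypothetical name. Note that PostgreSQL 9.2 and later call the column `pid` instead of `procpid`, so the query needs adjusting there.

```shell
# Print the query that lists koji backends whose current query started at
# least N minutes ago. Columns (procpid, usename, query_start) are the ones
# used in the SOP command; stale_query is a name made up for this sketch.
stale_query() {
    minutes=$1
    case $minutes in
        ''|*[!0-9]*)
            echo "stale_query: minutes must be a whole number" >&2
            return 2
            ;;
    esac
    printf "select procpid from pg_stat_activity where usename='koji' and now() - query_start >= interval '%s minutes' order by query_start;\n" "$minutes"
}

# Preview the PIDs first (psql -A: unaligned, -t: rows only):
#   stale_query 40 | psql -At koji
# Kill them once the list looks right (xargs -r is a GNU extension that
# skips the kill when the list is empty):
#   stale_query 40 | psql -At koji | xargs -r kill
```
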

=== OOM (Out of Memory) Issues

Out of memory issues occur from time to time on the build machines.
There are a couple of options for correction. The first fix is to just
restart the machine and hope it was a one-time thing. If the problem
continues, choose one of the following options.

==== Increase Memory

The xen machines can have memory increased on their corresponding xen
hosts. At present this is the table:

[width="34%",cols="44%,56%",options="header"]
|===
|Xen host |Builder
|xen3 |xenbuilder1
|xen4 |xenbuilder2
|disabled |xenbuilder3
|xen8 |xenbuilder4
|===

Edit `/etc/xen/xenbuilder[1-4]` and add more memory.

==== Decrease weight

Each builder has a weight (the `capacity` column in the database) that
limits how much work can be given to it. Presently the only way to alter
it is to change the database on db3 directly:

....
$ sudo su - postgres
-bash-2.05b$ psql koji
koji=# select * from host limit 1;
 id | user_id |          name          |  arches   | task_load | capacity | ready | enabled
----+---------+------------------------+-----------+-----------+----------+-------+---------
  6 |     130 | ppc3.fedora.redhat.com | ppc ppc64 |       1.5 |        4 | t     | t
(1 row)
koji=# update host set capacity=2 where name='ppc3.fedora.redhat.com';
....

Simply update capacity to a lower number.

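
Since this edits the production database by hand, one cautious approach is to generate the statements from validated input and show the row before and after the change inside a single transaction. The sketch below assumes the `host` table layout shown above; `capacity_sql` is a hypothetical helper, not part of koji.

```shell
# Print SQL that lowers (or raises) a builder's capacity and shows the row
# before and after. Input is validated so that only a plain hostname and a
# plain number can end up in the statements.
capacity_sql() {
    host=$1
    capacity=$2
    case $host in
        ''|*[!A-Za-z0-9.-]*)
            echo "capacity_sql: unexpected characters in host name" >&2
            return 2
            ;;
    esac
    case $capacity in
        ''|.|*[!0-9.]*|*.*.*)
            echo "capacity_sql: capacity must be a number" >&2
            return 2
            ;;
    esac
    printf "begin;\n"
    printf "select name, task_load, capacity from host where name='%s';\n" "$host"
    printf "update host set capacity=%s where name='%s';\n" "$capacity" "$host"
    printf "select name, task_load, capacity from host where name='%s';\n" "$host"
    printf "commit;\n"
}

# As postgres on db3:
#   capacity_sql ppc3.fedora.redhat.com 2 | psql koji
```
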

=== Disk Space Issues

The builders use a lot of temporary storage. Failed builds also get left
on the builders; most should get cleaned up, but those from plague do
not. The easiest thing to do is remove some older cache dirs.

Step one is to turn off both koji and plague:

....
/etc/init.d/plague-builder stop
/etc/init.d/kojid stop
....

Next check to see what file system is full:

....
df -h
....

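
On a builder with many mounts, the full filesystem can be picked out of the `df` output mechanically. A minimal sketch, assuming POSIX-format output from `df -P` and mount points without spaces; `full_filesystems` and the 90% default are assumptions made for this example.

```shell
# Print "<mount point> <use%>" for every filesystem at or above a usage
# threshold. Reads `df -P` output on stdin, so it also works on saved output.
full_filesystems() {
    threshold=${1:-90}
    awk -v limit="$threshold" 'NR > 1 {
        use = $5                 # Capacity column, e.g. "95%"
        sub(/%$/, "", use)       # strip the percent sign
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

# usage: df -P | full_filesystems 90
```
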

[IMPORTANT]
.Important
====
If any one of the following directories is full, send an outage
notification as outlined in Infrastructure/OutageTemplate to the
fedora-infrastructure-list and fedora-devel-list, then contact Mike
McGrath:

* /mnt/koji
* /mnt/ntap-fedora1/scratch
* /pub/epel
* /pub/fedora
====

Typically just / will be full. The next thing to do is determine whether
any extremely large builds are left on the builder. Typical
locations include /var/lib/mock and /mnt/build (/mnt/build is actually
on the local filesystem):

....
du -sh /var/lib/mock/* /mnt/build/*
....

`/var/lib/mock/dist-f8-build-10443-1503`::
  classic koji build
`/var/lib/mock/fedora-6-ppc-core-57cd31505683ef1afa533197e91608c5a2c52864`::
  classic plague build

If nothing jumps out immediately, just start deleting files older than
one week. Once enough space has been freed, start koji and plague back
up:

....
/etc/init.d/plague-builder start
/etc/init.d/kojid start
....

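
The "older than one week" cleanup described above can be previewed before anything is removed. The sketch below only lists candidates; `old_build_dirs` is a hypothetical helper, and `-mindepth`/`-maxdepth` assume GNU find as shipped on the builders.

```shell
# List top-level entries under the given directories that were last
# modified more than N whole days ago. Nothing is deleted here.
old_build_dirs() {
    if [ "$#" -lt 2 ]; then
        echo "usage: old_build_dirs <days> <dir>..." >&2
        return 2
    fi
    days=$1
    shift
    # -mtime +N counts whole 24-hour periods, so +7 means 8 days or older.
    find "$@" -mindepth 1 -maxdepth 1 -mtime +"$days" -print
}

# usage, while kojid and plague-builder are still stopped:
#   old_build_dirs 7 /var/lib/mock /mnt/build
```

Review the list, make sure nothing is still mounted inside those directories, and only then remove them.
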

=== Unmounting

[WARNING]
.Warning
====
Should there be mention of being sure filesystems in chroots are
unmounted before you delete the chroots?

Res ipsa loquitur.
====

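
One way to act on the warning above is to list anything still mounted below a chroot before deleting it. A minimal sketch, assuming a Linux builder where `/proc/mounts` is available; `mounts_under` is a hypothetical helper, and the second argument exists only so the function can be tried against a saved mounts table.

```shell
# Print every mount point at or below the given directory.
# Reads a table in /proc/mounts format, where field 2 is the mount point.
mounts_under() {
    chroot_dir=${1%/}                 # drop one trailing slash, if any
    mounts_file=${2:-/proc/mounts}
    awk -v root="$chroot_dir" '
        $2 == root || index($2, root "/") == 1 { print $2 }
    ' "$mounts_file"
}

# usage:
#   mounts_under /var/lib/mock/dist-f8-build-10443-1503
# If this prints anything, unmount those paths (deepest first) before
# removing the chroot.
```
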