= Koschei SOP

Koschei is a continuous integration system for RPM packages. It runs
scratch builds of packages after their dependencies change or after a
period of time has elapsed, and reports package buildability status to
interested parties.

Production instance::
https://koschei.fedoraproject.org/
Staging instance::
https://koschei.stg.fedoraproject.org/

== Contact Information

Owner::
mizdebsk
Contact::
#fedora-admin
Location::
Fedora infrastructure OpenShift
Purpose::
continuous integration system

== Description

Koschei consists of a frontend and a backend.

The frontend is a web application written in Python using the Flask
framework. It runs under Apache httpd with mod_wsgi as a WSGI
application. The frontend displays information to users and allows
editing package groups and changing priorities.

The backend consists of several loosely coupled microservices,
including:

* `watcher` - listens on the Fedora messaging bus for events about
completed builds and updates build states in the database.
* `repo-resolver` - resolves package dependencies in a given repo using
hawkey and compares them with the previous iteration to get a
dependency diff. It resolves all packages in the newest repo available
in Koji. The output is the basis for scheduling new builds.
* `build-resolver` - resolves completed builds in the repo in which
they were done in Koji. It produces the dependency differences visible
in the frontend.
* `scheduler` - schedules new builds based on multiple criteria:
** dependency priority - dependency changes since the last build,
weighted by their distance in the dependency graph
** manual and static priorities - set manually in the frontend. Manual
priority is reset after each build, static priority persists
** time priority - time elapsed since the last build.
* `polling` - polls for the same types of events as `watcher` without
relying on the messaging bus. It additionally takes care of package
list synchronization and other regularly executed tasks.

== Deployment

Koschei deployment is managed by an Ansible playbook:

....
sudo rbac-playbook openshift-apps/koschei.yml
....

The above playbook is idempotent, which means that running it has no
effect when everything is already configured as expected.

Koschei is fully containerized and is deployed on OpenShift.

Koschei is stateless. It doesn't use any persistent storage. All
non-volatile information is stored in a PostgreSQL database, which is
not part of Koschei, but an external service that Koschei depends on.

There is one common container image for all Koschei workloads:
frontend and backend containers are all run from the same image.

Koschei images are built by upstream on Quay.io. Upstream implements
continuous delivery of container images to the Quay.io registry. Code
pushed to the `fedora-prod` or `fedora-stage` git branches of the
upstream GitHub repository is automatically built into container
images, which are pushed to the Quay.io registry with appropriate tags.

Pristine upstream Koschei images are then imported into the internal
OpenShift registry. Fedora OpenShift does not build any Koschei
container images by itself. Image import into OpenShift is always
done manually by a Koschei sysadmin, usually by running a manual
Ansible playbook. This ensures that developers who can push code to
the GitHub repository don't have any control over the Fedora
infrastructure deployment process.

Upstream images don't contain any Fedora-specific configuration. Such
configuration is mounted into containers as read-only volumes backed
by Kubernetes Secrets.

The frontend is run as a Kubernetes Deployment with multiple replicas
for high availability. The frontend supports rolling updates, which
allows it to be updated with no user-visible downtime.

Each backend service has its own Kubernetes Deployment with a single
replica. Because backend downtime is not user-visible, the backend
does not use rolling updates.

In addition to the frontend and backend, there is also an `admin`
Deployment, which runs a container that does nothing but wait for a
sysadmin to `rsh` into it to run manual admin commands.

Besides the aforementioned Kubernetes Deployments, some ad-hoc tasks
are run as Kubernetes Jobs, either created on a time schedule from
CronJobs or created by Koschei sysadmins running manual Ansible
playbooks.

== Upgrade

Upgrading Koschei to a new upstream version is done by running one of
the manual Ansible playbooks:

....
sudo rbac-playbook manual/upgrade/koschei-rolling.yml
sudo rbac-playbook manual/upgrade/koschei-full.yml
....

The first playbook (rolling update) should be used when the update is
known not to change the database schema. In this case the new upstream
image is simply imported into the internal OpenShift registry and all
Deployments are restarted. OpenShift takes care of doing a rolling
update of the frontend, so that users experience no downtime. Backend
Pods are also recreated with the new image.

The second playbook (full update) is used when the update changes the
database schema. This playbook pauses all Deployments and terminates
all Pods, so users experience frontend downtime. When everything is
stopped, the playbook creates Kubernetes Jobs to run database
migrations and perform other maintenance tasks. Once the Jobs are
done, new Deployments are rolled out.
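
The choice between the two playbooks can be sketched as a small shell
helper. This is only an illustration: the function name
`koschei_upgrade_playbook` and its `yes`/`no` argument are hypothetical,
and whether an update changes the database schema still has to be
determined by the sysadmin from the upstream changes.

```shell
# Hypothetical helper (not part of Koschei or the Fedora playbooks):
# print the upgrade playbook that matches the kind of update.
# Argument: "yes" if the update changes the database schema, "no" otherwise.
koschei_upgrade_playbook() {
    case "$1" in
        no)  echo "manual/upgrade/koschei-rolling.yml" ;;
        yes) echo "manual/upgrade/koschei-full.yml" ;;
        *)   echo "usage: koschei_upgrade_playbook yes|no" >&2; return 2 ;;
    esac
}

# Example, on a host where rbac-playbook is available:
#   sudo rbac-playbook "$(koschei_upgrade_playbook no)"
```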

== Admin shell

Certain Koschei operation tasks are done with the `koschei-admin` CLI
tool. The container where the tool is available can be accessed with:

....
oc project koschei
oc rsh deploy/admin
....
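
For a single command, an interactive shell is not strictly needed,
because `oc rsh` accepts a command to run after the target. The
following is a minimal sketch, assuming the `koschei` project and the
`admin` Deployment named above. The `koschei_admin` wrapper and the
`DRY_RUN` variable are hypothetical; by default the wrapper only prints
the command it would run.

```shell
# Hypothetical wrapper: run one koschei-admin command in the admin container.
# With DRY_RUN=1 (the default here) the oc command is printed, not executed.
koschei_admin() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "oc -n koschei rsh deploy/admin koschei-admin $*"
    else
        oc -n koschei rsh deploy/admin koschei-admin "$@"
    fi
}

# Example:
#   DRY_RUN=0 koschei_admin clear-notice
```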

== Suspending Koschei operation

To stop builds from being scheduled, it is enough to scale the
`scheduler` Deployment down to zero replicas. For planned Koji outages,
stopping the scheduler service is recommended. It is not strictly
necessary, as Koschei can recover from Koji errors and network errors
automatically, but stopping Koji builders may cause unexpected build
failures that would be reported to users. Other backend services can
be left running, as they automatically restart themselves on Koji and
network errors.
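
Scaling the scheduler down and back up can be sketched as follows. The
sketch assumes that the Deployment is named `scheduler` and lives in the
`koschei` project, as the text and the admin shell commands suggest. The
function names and the `DRY_RUN` switch are hypothetical. Restoring to
one replica follows from the single-replica backend Deployments
described in the Deployment section.

```shell
# Hypothetical helpers; with DRY_RUN=1 (the default) commands are only printed.
koschei_scale_scheduler() {
    replicas="$1"
    # Build the oc command once so the printed and executed forms match.
    set -- oc -n koschei scale deploy/scheduler "--replicas=${replicas}"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Stop scheduling new builds (for example before a planned Koji outage).
koschei_suspend_scheduling() { koschei_scale_scheduler 0; }

# Resume scheduling with the single scheduler replica.
koschei_resume_scheduling() { koschei_scale_scheduler 1; }
```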

== Limiting Koji usage

By default, Koschei is limited to 30 concurrently running builds. This
limit can be changed in the configuration under the
`koji_config.max_builds` key. There is also Koji load monitoring, which
prevents builds from being scheduled when Koji load is higher than a
certain threshold. That should prevent scheduling builds during mass
rebuilds, so it's not necessary to stop scheduling during those.

== Setting admin announcement

Koschei can display an announcement in the web UI. This is mostly
useful to inform users about outages or other problems.

To set an announcement, run:

....
koschei-admin set-notice "Koschei operation is currently suspended due to scheduled Koji outage"
....

or:

....
koschei-admin set-notice "Submitting scratch builds by Koschei is currently disabled due to Fedora 23 mass rebuild"
....

To clear the announcement, run:

....
koschei-admin clear-notice
....

== Adding package groups

Packages can be added to one or more groups.

To add a new group named `mynewgroup`, run:

....
koschei-admin add-group mynewgroup
....

To add a new group named `mynewgroup` and populate it with some
packages, run:

....
koschei-admin add-group mynewgroup pkg1 pkg2 pkg3
....
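
When the package list is long, it may be convenient to keep it in a
file. The following is a minimal sketch, meant to be run inside the
admin container. The helper name, the file format (one package name per
line, with `#` comments and blank lines ignored) and the `DRY_RUN`
switch are assumptions made for illustration, not Koschei features. The
sketch relies only on the `add-group` form shown above.

```shell
# Hypothetical helper: create a group from a file of package names.
# With DRY_RUN=1 (the default) the command is printed, not executed.
koschei_add_group_from_file() {
    group="$1"
    file="$2"
    # Drop comments and blank lines. Package names contain no whitespace,
    # so word splitting of the result below is intentional.
    packages=$(sed -e 's/#.*//' -e '/^[[:space:]]*$/d' "$file")
    # shellcheck disable=SC2086
    set -- koschei-admin add-group "$group" $packages
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Example:
#   DRY_RUN=0 koschei_add_group_from_file mynewgroup packages.txt
```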

== Set package static priority

Packages differ in importance and can be given a higher or lower
priority. Any user can change the manual priority, which is reset after
the package is rebuilt. Admins can additionally set a static priority,
which is not affected by package rebuilds.

To set the static priority of package `foo` to value `100`, run:

....
koschei-admin --collection f27 set-priority --static foo 100
....

== Branching a new Fedora release

After branching occurs and Koji build targets have been created, Koschei
should be updated to reflect the new state. There is a special admin
command for this purpose, which takes care of copying the configuration
and also the last builds from the history.

To branch the collection from Fedora 27 to Fedora 28, use the following:

....
koschei-admin branch-collection f27 f28 -d 'Fedora 27' -t f28 --bugzilla-version 27
....

Then you can optionally verify that the collection configuration is
correct by visiting https://koschei.fedoraproject.org/collections
and examining the configuration of the newly branched collection.

== Edit Koschei group to make it global

To turn the `mygroup` group created by user `someuser` into a global
group named `thegroup`, run:

....
koschei-admin edit-group someuser/mygroup --make-global --new-name thegroup
....