Review copr SOP

Signed-off-by: Michal Konečný <mkonecny@redhat.com>
Author: Michal Konečný 2021-08-18 12:49:35 +02:00
parent bfa0488890
commit 8881c92d7d
2 changed files with 30 additions and 20 deletions

@@ -67,7 +67,7 @@ $ systemctl start copr-backend
 ....
 Sometimes OpenStack cannot handle spawning too many VMs at the same
-time. So it is safer to edit on copr-be.cloud.fedoraproject.org:
+time. So it is safer to edit on _copr-be.cloud.fedoraproject.org_:
 ....
 vi /etc/copr/copr-be.conf
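
The whole workflow is just an ssh, a config edit, and a service restart (a
minimal sketch; the exact option to lower in _copr-be.conf_ is
deployment-specific and not shown here):
....
$ ssh copr-be.cloud.fedoraproject.org
$ sudo vi /etc/copr/copr-be.conf        # lower the VM-spawning limit
$ sudo systemctl restart copr-backend
....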
@@ -165,7 +165,7 @@ $ rm -rf ./appdata
 === Backend action queue issues
-First check the link:[number of not-yet-processed actions]. If that
+First check the _number of not-yet-processed actions_. If that
 number isn't equal to zero, and is not decrementing relatively fast (say
 a single action takes longer than 30s) -- there might be some problem.
 Logs for the action dispatcher can be found in:
@@ -188,12 +188,13 @@ $ sudo rbac-playbook groups/copr-keygen.yml
 $ sudo rbac-playbook groups/copr-dist-git.yml
 ....
-https://pagure.io/copr/copr/blob/master/f/copr-setup.txt The
-[.title-ref]#copr-setup.txt# manual is severely outdated, but there is
+The
+https://pagure.io/copr/copr/blob/main/f/copr-setup.txt[copr-setup.txt]
+manual is severely outdated, but there is
 no up-to-date alternative. We should extract useful information from it
 and put it here in the SOP or into
 https://docs.pagure.org/copr.copr/maintenance_documentation.html and
-then throw the [.title-ref]#copr-setup.txt# away.
+then throw the _copr-setup.txt_ away.
 The copr-backend service (which spawns several processes) should be running
 on the backend. The backend spawns VMs from Fedora Cloud. You cannot log in to
@@ -265,9 +266,9 @@ shouldn't be worried with.
 * redis
 * lighttpd
-All the [.title-ref]#copr-backend-*.service# are configured to be a part
-of the [.title-ref]#copr-backend.service# so e.g. in case of restarting
-all of them, just restart the [.title-ref]#copr-backend.service#.
+All the _copr-backend-*.service_ units are configured to be part of
+_copr-backend.service_, so to restart all of them at once, just restart
+_copr-backend.service_.
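
A minimal sketch of that restart, assuming standard systemd tooling:
....
$ sudo systemctl restart copr-backend.service
$ systemctl status 'copr-backend*'   # verify the copr-backend-*.service units came back up
....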
=== Frontend
@@ -289,18 +290,27 @@ Builders for PPC64 are located at rh-power2.fit.vutbr.cz and anyone with
 access to the buildsys ssh key can get there using keys as::
 msuchy@rh-power2.fit.vutbr.cz
-There are commands: $ ls bin/ destroy-all.sh reinit-vm26.sh
+There are these commands:
+....
+$ ls bin/
+destroy-all.sh reinit-vm26.sh
 reinit-vm28.sh virsh-destroy-vm26.sh virsh-destroy-vm28.sh
 virsh-start-vm26.sh virsh-start-vm28.sh get-one-vm.sh reinit-vm27.sh
 reinit-vm29.sh virsh-destroy-vm27.sh virsh-destroy-vm29.sh
 virsh-start-vm27.sh virsh-start-vm29.sh
-bin/destroy-all.sh destroy all VM and reinit them reinit-vmXX.sh copy VM
-image from template virsh-destroy-vmXX.sh destroys VM
-virsh-start-vmXX.sh starts VM get-one-vm.sh start one VM and return its
-IP - this is used in Copr playbooks.
-In case of big queue of PPC64 tasks simply call bin/destroy-all.sh and
+....
+`destroy-all.sh` destroys all VMs and reinits them
+`reinit-vmXX.sh` copies the VM image from the template
+`virsh-destroy-vmXX.sh` destroys a VM
+`virsh-start-vmXX.sh` starts a VM
+`get-one-vm.sh` starts one VM and returns its IP - this is used in Copr playbooks.
+In case of a big queue of PPC64 tasks, simply call `bin/destroy-all.sh` and
 it will destroy the stuck VMs and the copr backend will spawn new ones.
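
A sketch of that recovery, assuming the buildsys ssh key is available locally:
....
$ ssh msuchy@rh-power2.fit.vutbr.cz
$ bin/destroy-all.sh      # wipes the stuck VMs; copr backend then spawns fresh ones
....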
== Ports opened for public
@@ -360,7 +370,7 @@ I don't think we can settle down with any instance that provides less
 than 2G RAM (obviously), but ideally we need 3G+. A 2-core CPU is good
 enough.
-* Disk space: 17G for system and 8G for [.title-ref]#pgsqldb# directory
+* Disk space: 17G for the system and 8G for the _pgsqldb_ directory
 If needed, we can clean old dumps and backups out of the database
 directory and get down to around 4G of disk space.
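
A quick way to check whether such a clean-up is due (the database directory
path, assumed here to be /var/lib/pgsql, may differ on the actual host):
....
$ df -h /
$ sudo du -sh /var/lib/pgsql
....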
@@ -371,7 +381,7 @@ and backups and get down to around 4G disk space.
 * CPU: 8 cores (3400MHz) with load 4.09, 4.55, 4.24
 The backend takes care of spinning up builders and running ansible playbooks
-on them, running [.title-ref]#createrepo_c# (on big repositories) and so
+on them, running _createrepo_c_ (on big repositories) and so
 on. Copr utilizes two queues: one for builds, which are delegated to
 the OpenStack builders, and an action queue. Actions, however, are processed
 directly by the backend, so they can spike our load. We would ideally
@@ -406,9 +416,9 @@ distgit data, so we can't go any lower than what we have.
 * RAM: ~150M (out of 2G)
 * CPU: 1 core (3400MHz) with load 0.10, 0.31, 0.25
-We are basically running just [.title-ref]#signd# and
-[.title-ref]#httpd# here, both with minimal resource requirements. The
-memory usage is topped by [.title-ref]#systemd-journald#.
+We are basically running just _signd_ and
+_httpd_ here, both with minimal resource requirements. The
+memory usage is topped by _systemd-journald_.
 * Disk space: 7G for system and ~500M (out of ~700M) for GPG keys
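
A rough way to reproduce these keygen figures (the GPG key store path, assumed
here to be /var/lib/copr-keygen, and the exact unit names may differ):
....
$ systemctl status signd httpd
$ free -h && uptime
$ sudo du -sh /var/lib/copr-keygen
....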