Review copr SOP

Signed-off-by: Michal Konečný <mkonecny@redhat.com>
Michal Konečný 2021-08-18 12:49:35 +02:00
parent bfa0488890
commit 8881c92d7d
2 changed files with 30 additions and 20 deletions


@@ -16,7 +16,7 @@
** xref:collectd.adoc[Collectd - SOP]
** xref:compose-tracker.adoc[Compose Tracker - SOP]
** xref:contenthosting.adoc[Content Hosting Infrastructure - SOP]
-** xref:copr.adoc[copr - SOP in review ]
+** xref:copr.adoc[Copr - SOP]
** xref:database.adoc[database - SOP in review ]
** xref:datanommer.adoc[datanommer - SOP in review ]
** xref:debuginfod.adoc[debuginfod - SOP in review ]


@@ -67,7 +67,7 @@ $ systemctl start copr-backend
....
Sometimes OpenStack cannot handle spawning too many VMs at the same
-time. So it is safer to edit on copr-be.cloud.fedoraproject.org:
+time. So it is safer to edit on _copr-be.cloud.fedoraproject.org_:
....
vi /etc/copr/copr-be.conf
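As a sketch of what such an edit might look like — the section and option names below are assumptions, not taken from the commit, and should be verified against the deployed copr-backend configuration:
....
# /etc/copr/copr-be.conf -- hypothetical excerpt; check the real option
# names for the installed copr-backend version before editing
[backend]
# assumed knob: cap how many builder VMs may be spawned at once
max_vm_total = 8
....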
@@ -165,7 +165,7 @@ $ rm -rf ./appdata
=== Backend action queue issues
-First check the link:[number of not-yet-processed actions]. If that
+First check the _number of not-yet-processed actions_. If that
number isn't equal to zero, and is not decrementing relatively fast (say
a single action takes longer than 30s) -- there might be some problem.
Logs for the action dispatcher can be found in:
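If the dispatcher also logs to the systemd journal of the _copr-backend.service_ unit mentioned later in this SOP (an assumption — the on-disk log path is cut off in this excerpt and stays authoritative), the logs can be followed with:
....
# follow backend logging via the journal
$ journalctl -u copr-backend -f
....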
@@ -188,12 +188,13 @@ $ sudo rbac-playbook groups/copr-keygen.yml
$ sudo rbac-playbook groups/copr-dist-git.yml
....
-https://pagure.io/copr/copr/blob/master/f/copr-setup.txt The
-[.title-ref]#copr-setup.txt# manual is severely outdated, but there is
+The
+https://pagure.io/copr/copr/blob/main/f/copr-setup.txt[copr-setup.txt]
+manual is severely outdated, but there is
no up-to-date alternative. We should extract useful information from it
and put it here in the SOP or into
https://docs.pagure.org/copr.copr/maintenance_documentation.html and
-then throw the [.title-ref]#copr-setup.txt# away.
+then throw the _copr-setup.txt_ away.
The copr-backend service should be running on the backend (it spawns several
processes). The backend spawns VMs from Fedora Cloud. You cannot log in to
@@ -265,9 +266,9 @@ shouldn't be worried with.
* redis
* lighttpd
-All the [.title-ref]#copr-backend-*.service# are configured to be a part
-of the [.title-ref]#copr-backend.service# so e.g. in case of restarting
-all of them, just restart the [.title-ref]#copr-backend.service#.
+All the _copr-backend-*.service_ units are configured to be part of
+_copr-backend.service_, so to restart all of them at once, just
+restart _copr-backend.service_.
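A brief illustration, using standard systemd commands (not part of the original commit), of how the umbrella unit is used:
....
# restart every copr-backend-*.service at once via the umbrella unit
$ sudo systemctl restart copr-backend.service
# confirm the worker units came back up
$ systemctl list-dependencies copr-backend.service
....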
=== Frontend
@@ -289,18 +290,27 @@ Builders for PPC64 are located at rh-power2.fit.vutbr.cz and anyone with
access to the buildsys ssh key can get there using keys as::
msuchy@rh-power2.fit.vutbr.cz
-There are commands: $ ls bin/ destroy-all.sh reinit-vm26.sh
-reinit-vm28.sh virsh-destroy-vm26.sh virsh-destroy-vm28.sh
-virsh-start-vm26.sh virsh-start-vm28.sh get-one-vm.sh reinit-vm27.sh
-reinit-vm29.sh virsh-destroy-vm27.sh virsh-destroy-vm29.sh
-virsh-start-vm27.sh virsh-start-vm29.sh
-bin/destroy-all.sh destroy all VM and reinit them reinit-vmXX.sh copy VM
-image from template virsh-destroy-vmXX.sh destroys VM
-virsh-start-vmXX.sh starts VM get-one-vm.sh start one VM and return its
-IP - this is used in Copr playbooks.
-In case of big queue of PPC64 tasks simply call bin/destroy-all.sh and
+There are commands:
+....
+$ ls bin/
+destroy-all.sh reinit-vm26.sh
+reinit-vm28.sh virsh-destroy-vm26.sh virsh-destroy-vm28.sh
+virsh-start-vm26.sh virsh-start-vm28.sh get-one-vm.sh reinit-vm27.sh
+reinit-vm29.sh virsh-destroy-vm27.sh virsh-destroy-vm29.sh
+virsh-start-vm27.sh virsh-start-vm29.sh
+....
+`destroy-all.sh` destroys all VMs and reinits them
+`reinit-vmXX.sh` copies the VM image from a template
+`virsh-destroy-vmXX.sh` destroys a VM
+`virsh-start-vmXX.sh` starts a VM
+`get-one-vm.sh` starts one VM and returns its IP - this is used in Copr playbooks.
+In case of a big queue of PPC64 tasks simply call `bin/destroy-all.sh` and
it will destroy stuck VMs and the copr backend will spawn new ones.
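For example, using the login and script names from the SOP text above, clearing a stuck PPC64 queue would look like:
....
# log in with the buildsys key, then wipe and re-init all builder VMs
$ ssh msuchy@rh-power2.fit.vutbr.cz
$ bin/destroy-all.sh
....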
== Ports opened for public
@@ -360,7 +370,7 @@ I don't think we can settle down with any instance that provides less
than (2G RAM, obviously), but ideally, we need 3G+. 2-core CPU is good
enough.
-* Disk space: 17G for system and 8G for [.title-ref]#pgsqldb# directory
+* Disk space: 17G for system and 8G for _pgsqldb_ directory
If needed, we are able to clean up the database directory of old dumps
and backups and get down to around 4G disk space.
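A hedged sketch of such a cleanup — the dump directory and the 30-day retention window below are assumptions, not from the SOP:
....
# hypothetical path and age threshold; review the matches first
$ find /var/lib/pgsql/backups -name '*.dump*' -mtime +30 -print
# once reviewed, swap -print for -delete
....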
@@ -371,7 +381,7 @@ and backups and get down to around 4G disk space.
* CPU: 8 cores (3400MHz) with load 4.09, 4.55, 4.24
Backend takes care of spinning up builders and running Ansible playbooks
-on them, running [.title-ref]#createrepo_c# (on big repositories) and so
+on them, running _createrepo_c_ (on big repositories) and so
on. Copr utilizes two queues, one for builds, which are delegated to
OpenStack builders, and an action queue. Actions, however, are processed
directly by the backend, so it can spike our load up. We would ideally
@@ -406,9 +416,9 @@ distgit data, so we can't go any lower than what we have.
* RAM: ~150M (out of 2G)
* CPU: 1 core (3400MHz) with load 0.10, 0.31, 0.25
-We are basically running just [.title-ref]#signd# and
-[.title-ref]#httpd# here, both with minimal resource requirements. The
-memory usage is topped by [.title-ref]#systemd-journald#.
+We are basically running just _signd_ and
+_httpd_ here, both with minimal resource requirements. The
+memory usage is topped by _systemd-journald_.
* Disk space: 7G for system and ~500M (out of ~700M) for GPG keys
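As a quick way to verify that picture (generic commands, with unit names taken from the SOP text above; not part of the original commit):
....
# resident memory of the heaviest processes; journald is expected on top
$ ps -eo rss,comm --sort=-rss | head -n 5
$ systemctl status signd httpd
....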