infra-docs-fpo/modules/sysadmin_guide/pages/mirrormanager.adoc
Aurélien Bompard e89773e48b MirrorManager: update the rest of the docs
Signed-off-by: Aurélien Bompard <aurelien@bompard.org>
2024-08-06 15:43:43 +00:00

= MirrorManager Infrastructure SOP
MirrorManager manages mirrors for the Fedora distribution.
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main, sysadmin-web
Servers::
Hosted in OpenShift
Mirrorlist Servers::
Docker container on the proxy servers
Purpose::
Manage mirrors for the Fedora distribution
== Description
MirrorManager handles our mirroring system. It keeps track of the list of
valid mirrors and hands out metalink URLs that end users download
packages from.
Everything runs in OpenShift. There is a cron job to scan the master mirror
(NFS mounted at `/srv`) using the _mm2_update-master-directory-list_ script
(_umdl_) for changes. Changed directories are detected by comparing the ctime
to the value in the database.
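
The ctime comparison described above can be illustrated with a minimal shell
sketch. This is not the _umdl_ implementation: the `dir_changed` helper and
the idea of passing the stored value as an argument are hypothetical, and the
sketch assumes GNU `stat`, where `%Z` prints the ctime in seconds since the
epoch.

```shell
# Hypothetical helper, not the real umdl code.
# Succeeds (exit 0) when the directory's ctime differs from the stored value.
dir_changed() {
    dc_dir="$1"      # directory to check
    dc_stored="$2"   # ctime previously recorded, e.g. in a database
    # GNU stat: %Z is the time of last status change, seconds since the epoch.
    dc_current="$(stat -c %Z "$dc_dir")" || return 2
    [ "$dc_current" -ne "$dc_stored" ]
}
```

For example, `dir_changed /srv/some/dir 1700000000 && echo rescan` would print
`rescan` only if the directory's ctime no longer matches the recorded value
(both the path and the timestamp are placeholders).
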
There are also jobs to compare the content on the mirrors with the results
from _umdl_ using rsync, HTTP, or HTTPS.
The crawler schedule can be viewed in the `vars/apps/mirrormanager.yml` file
in Ansible.
If the content on the mirrors is the same as on the master, those mirrors
are included in the dynamic metalink/mirrorlist.
An hourly job generates a binary file which contains the information about the
state of each mirror. This file is used by the mirrorlist containers on the
proxy servers to dynamically generate the metalink/mirrorlist for each client
individually.
The `frontend` deployment runs the web interface to manipulate the mirrors.
Each mirror-admin can only change the details of the associated mirror.
Members of the FAS group _sysadmin-web_ can see and change all existing
mirrors.
The mirrorlist provided by the frontend has no actively consumed
content and is therefore heavily cached (12h). It is only used to give
an overview of existing mirrors.
Additionally the frontend provides:::
* an overview of the mirror list usage
https://mirrormanager.fedoraproject.org/statistics
* a propagation overview
https://mirrormanager.fedoraproject.org/propagation
* a mirror map https://mirrormanager.fedoraproject.org/maps
The frontend is also used for _report_mirror_ check-ins. This is used by
mirrors to report their status independent of the crawlers.
== Release Preparation
MirrorManager should automatically detect the new release version, and
will create a new `Version()` object in the database. This is visible on
the Version page in the web UI, and on
https://mirrormanager.fedoraproject.org/.
If the versioning scheme changes, it's possible this will fail. If so,
contact the Mirror Wrangler.
== Move to Archive
Once the files of an EOL release have been copied to the archive
directory tree and enough mirrors have picked up the files at the
archive location, run the following playbook to adapt those paths in
MirrorManager's database:
....
$ rbac-playbook -v /srv/web/infra/ansible/playbooks/manual/mirrormanager/move-to-archive.yml --extra-vars="product='EPEL' version='7'"
....
== Mirrorlist Containers and Mirrorlist Servers
At :55 past each hour, a job generates a binary file containing all the
current MirrorManager information and syncs it to the proxies and
mirrorlist servers. Each proxy accepts requests to
_mirrors.fedoraproject.org_ on Apache, then uses HAProxy to determine which
backend will reply. There are two containers defined on each proxy:
mirrorlist1 and mirrorlist2. HAProxy tries those first, then
falls back to any of the mirrorlist servers defined over the VPN.
At :15 past each hour, the `/usr/local/bin/restart-mirrorlist-containers`
script runs on all proxies. It starts the mirrorlist2 container and checks
that it can process requests. If it can, the script restarts the mirrorlist1
container with the new pkl data; if not, mirrorlist1 keeps running with the
old data. Throughout this process at least one container is processing
requests (with the mirrorlist servers as backup), so users see no
issues.
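
The hand-over logic described above can be sketched as follows. This is an
illustration only, not the contents of
`/usr/local/bin/restart-mirrorlist-containers`: the `rolling_restart`
function is hypothetical, the health check is passed in as a placeholder
command because the real check is not documented here, and driving the
containers through `systemctl restart` is an assumption based on the systemd
units for mirrorlist1 and mirrorlist2.

```shell
# Hypothetical sketch of the hand-over; the real script may differ.
# $1 is a placeholder health-check command that is called with the
# container name and must exit 0 when that container answers requests.
rolling_restart() {
    rr_check="$1"
    # Bring up mirrorlist2 with the new data first (assumed systemd unit).
    systemctl restart mirrorlist2 || return 1
    if "$rr_check" mirrorlist2; then
        # mirrorlist2 is serving, so mirrorlist1 can pick up the new data.
        systemctl restart mirrorlist1
    else
        echo "mirrorlist2 failed its check; mirrorlist1 keeps the old data" >&2
        return 1
    fi
}
```

The point of the ordering is that mirrorlist1 is only touched once
mirrorlist2 has proven it can answer, so one of the two is always available.
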
The mirrorlist containers log to `/var/log/mirrormanager/mirrorlist\{1|2}/` on
the host proxy server.
== Troubleshooting and Resolution
=== Regenerating the Publiclist
On _os-control01_:
....
oc -n mirrormanager create job --from=cj/update-mirrorlist-cache update-mirrorlist-cache-manual
....
This command generates a new mirrorlist file and transfers it to
the proxies. The mirrorlist containers on the proxies are restarted 15
minutes after each full hour.
The mirrorlist generation can take up to 20 minutes. The logs can be viewed
with:
....
oc -n mirrormanager logs -f job/update-mirrorlist-cache-manual
....
Once done, the job should be deleted from OpenShift with:
....
oc -n mirrormanager delete job/update-mirrorlist-cache-manual
....
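
The create, follow and delete steps above could be wrapped in one function, as
in the sketch below. The `regenerate_mirrorlist_cache` name is hypothetical,
and the 30 minute timeout passed to `oc wait` is an assumption chosen to
exceed the roughly 20 minute run time mentioned above.

```shell
# Sketch combining the manual steps; run on os-control01.
regenerate_mirrorlist_cache() {
    rmc_job=update-mirrorlist-cache-manual
    oc -n mirrormanager create job --from=cj/update-mirrorlist-cache "$rmc_job" || return 1
    # Block until the job completes; the 30m timeout is an assumption.
    rmc_status=0
    oc -n mirrormanager wait --for=condition=complete --timeout=30m "job/$rmc_job" || rmc_status=$?
    # Print the job output, then clean up regardless of the outcome.
    oc -n mirrormanager logs "job/$rmc_job"
    oc -n mirrormanager delete "job/$rmc_job"
    return "$rmc_status"
}
```

Deleting the job even after a failure keeps the fixed job name free for the
next manual run.
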
If a faster solution is required, the mirrorlist file from the previous run is
available at:
....
/var/lib/mirrormanager/old/mirrorlist_cache.pkl
....
=== Updating the mirrorlist containers
The container used for mirrorlists is the mirrormanager2-mirrorlist
container in Fedora dist git:
https://src.fedoraproject.org/cgit/container/mirrormanager2-mirrorlist.git/
The image in use is defined in an Ansible variable in:
roles/mirrormanager/mirrorlist_proxy/defaults/main.yml
(TODO: This file no longer exists, find the new place where this is defined)
and is in turn used in the systemd unit files for mirrorlist1 and mirrorlist2.
To update the container used, update this variable, run the playbook, and
then restart the mirrorlist1 and mirrorlist2 containers on each proxy. Note
that this may take a while the first time, as the image has to be downloaded
from our registry.
=== Debugging problems with mirrorlist container startup
Sometimes on boot some hosts won't serve mirrorlists properly. This
is due to a container startup issue. Run `docker ps -a` as root to see
all containers, including stopped ones. The affected container will usually
show a status like 'exited(1)'. Record the container ID and run
`docker rm --force <containerid>`, then run `docker ps -a` again and confirm
that it is no longer listed. Finally, run `systemctl start mirrorlist1`,
which should start mirrorlist1 correctly.
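
The recovery steps above can be sketched as two small helpers, to be run as
root on the affected proxy. The function names are hypothetical; the
`--filter` and `--format` options of `docker ps` are used only to narrow the
listing to exited containers.

```shell
# List the ID, name and status of every exited container.
list_exited_containers() {
    docker ps -a --filter status=exited --format '{{.ID}} {{.Names}} {{.Status}}'
}

# Remove the stale container given as $1, then start mirrorlist1 again.
recover_mirrorlist1() {
    rm1_id="$1"
    docker rm --force "$rm1_id" || return 1
    systemctl start mirrorlist1
}
```

A typical session would call `list_exited_containers`, copy the ID of the
failed container, and pass it to `recover_mirrorlist1`.
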
=== General debugging for mirrorlist containers
Docker commands like `docker ps -a` show a fair bit of information.
Also, `systemctl status mirrorlist1` (or `mirrorlist2`) or the journal should
have information when a container is failing.
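
As a convenience, the unit status and recent journal entries for both
containers can be collected in one go. This is a sketch: the
`mirrorlist_status` name is hypothetical, the unit names are assumed to be
`mirrorlist1` and `mirrorlist2`, and the limit of 50 journal lines is
arbitrary.

```shell
# Show the unit state and the latest journal entries for both containers.
mirrorlist_status() {
    for ms_unit in mirrorlist1 mirrorlist2; do
        # systemctl status exits non-zero for inactive units; keep going.
        systemctl status --no-pager "$ms_unit" || true
        journalctl -u "$ms_unit" -n 50 --no-pager || true
    done
}
```
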