= oraculum Infrastructure SOP

https://pagure.io/fedora-qa/oraculum[oraculum] is an app developed by Fedora QA to aid packagers with maintenance and quality in Fedora and EPEL releases. As such, it serves as the backend for Packager Dashboard, testcloud, Fedora Easy Karma, and Pagure dist-git (versions table).

== Contents

* <<_contact_information>>
* <<_file_locations>>
* <<_configuration>>
* <<_building_for_infra>>
* <<_upgrading>>
* <<_deployment_watchdog>>
* <<_cache_clearing>>
* <<_components_of_deployment>>

== Contact Information

Owner::
  Fedora QA Devel
Contact::
  #fedora-qa
Persons::
  jskladan, lbrabec
Servers::
* In OpenShift.
Purpose::
  Hosting the https://pagure.io/fedora-qa/oraculum[oraculum] for packagers

== File Locations

* `oraculum/cli.py` - CLI for the app
* `oraculum/cli.py debug` - interactive debug interface for the app

== Configuration

Configuration is loaded from the environment in the pod. The default configuration is set in the playbook: `roles/openshift-apps/oraculum/templates/deploymentconfig.yml`. Remember that the configuration needs to be changed for each of the various pods (described later).

The possible values to set can be found in `oraculum/config.py` inside the `openshift_config` function.

Apart from that, secrets, tokens, and API keys are set in the secrets Ansible repository.

== Building for Infra

The application leverages s2i containers. Both the production and staging instances track the `master` branch of the oraculum repository.

Builds don't happen automatically; they need to be triggered manually from the OpenShift web console.

== Upgrading

Oraculum is currently configured through Ansible, and all configuration changes need to be made through Ansible.

The pod initialization is set up so that all database upgrades happen automatically on startup. That means extra care is needed, and all deployments that make database changes need to happen on stg first.

== Deployment WatchDog

The deployment is configured to perform automatic liveness testing. The first phase runs `cli.py upgrade_db`, and the second phase consists of the cluster trying to get an HTTP response from the container on port `8080` of the `oraculum-api-endpoint` pod. If either of these fails, the cluster automatically reverts to the previous build, and such a failure can be seen on the `Events` tab in the DeploymentConfig details.

Apart from that, the cluster regularly polls `oraculum-api-endpoint` for liveness testing. If that fails or times out, a pod restart occurs. Such an event can be seen in the `Events` tab of the DeploymentConfig.
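If you need to verify by hand that the api-endpoint pod is answering, the following is a minimal sketch of a check equivalent to the cluster's HTTP probe. The URL is a placeholder assumption (e.g. when port-forwarding to the `oraculum-api-endpoint` pod); the actual probe is defined in the DeploymentConfig, not in this script.

[source,python]
----
# Minimal sketch of a manual liveness check: GET the api endpoint on
# port 8080 and expect a 200 response. The URL below is a placeholder
# assumption; adjust it to however you reach the pod.
import sys
import urllib.request

URL = "http://localhost:8080/"

try:
    with urllib.request.urlopen(URL, timeout=10) as resp:
        print("liveness check HTTP status:", resp.status)
        sys.exit(0 if resp.status == 200 else 1)
except OSError as exc:
    print("liveness check failed:", exc)
    sys.exit(1)
----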
== Cache clearing

oraculum doesn't handle any garbage collection in the cache. In some situations, such as stale data in the cache (for example when Bugzilla data doesn't refresh due to bugs or optimization choices) or a database cache that has grown too large, it can be beneficial or even necessary to clear the cache completely.

That can be done by deleting all rows in the `cached_data` table:

`DELETE FROM cached_data;`

After that, to minimize downtime, it's recommended to manually re-sync the generic providers via `CACHE._refresh`, in the following order (in the pod terminal via debug):

[source,python]
----
python oraculum/cli.py debug
CACHE._refresh("fedora_releases")
CACHE._refresh("bodhi_updates")
CACHE._refresh("bodhi_overrides")
CACHE._refresh("package_versions_generic")
CACHE._refresh("pagure_groups")
CACHE._refresh("koschei_data")
CACHE._refresh("packages_owners_json")
----

and finally building up the static cache block manually via:

`oraculum.utils.celery_utils.celery_sync_static_package_caches()`

To do a more lightweight cleanup, removing just the PRs, bugs, and ABRT cache can do the trick:

`DELETE FROM cached_data WHERE provider LIKE 'packager-dashboard__all_package_bugs%';`

`DELETE FROM cached_data WHERE provider LIKE 'packager_dashboard_package_prs%';`

`DELETE FROM cached_data WHERE provider LIKE 'packager-dashboard_abrt_issues%';`

== Components of Deployment

The Oraculum deployment consists of several pods that run together.

=== oraculum-api-endpoint

Provides the API response rendering endpoint. Runs via gunicorn in multiple threads.

=== oraculum-worker

Workers managed via Celery; they process periodic and ad-hoc sync requests. The pods are replicated, and each pod spawns 4 workers.

=== oraculum-beat

Sends periodic sync requests to the workers.

=== oraculum-flower

Provides an overview of the Celery/worker queues via HTTP. The current state of the worker load can be seen in https://packager-dashboard.fedoraproject.org/_flower/[Flower].

=== oraculum-redis

Provides a deployment-local Redis instance.
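As a quick way to check the worker load without opening the Flower web UI, the following is a minimal sketch that queries Flower's `/api/workers` endpoint. Whether that API is exposed under the `/_flower/` prefix of the public instance, and whether it requires authentication, are assumptions; adjust the URL or port-forward to the `oraculum-flower` pod if needed.

[source,python]
----
# Minimal sketch: list Celery workers and their active task counts via
# Flower's HTTP API. The URL and unauthenticated access are assumptions.
import json
import urllib.request

URL = "https://packager-dashboard.fedoraproject.org/_flower/api/workers"

with urllib.request.urlopen(URL, timeout=30) as resp:
    workers = json.load(resp)

for name, info in workers.items():
    print(name, "- active tasks:", len(info.get("active") or []))
----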