[releng-misc-guide] Initialise Releng Miscellaneous guide

Signed-off-by: Samyak Jain <samyak.jn11@gmail.com>
Samyak Jain 2023-04-04 13:01:23 +05:30
parent d5a4761208
commit 96d9c091f1
23 changed files with 318 additions and 2 deletions


@ -0,0 +1,8 @@
:year: 2023
:rawhide_next: 40
:rawhide: 39
:rawhide_name: Thirty Nine
:branched: 38
:branched_name: Thirty Eight
:current: 37
:old_release: 36


@ -0,0 +1,150 @@
== Fedora Release Engineering
Contents:
overview philosophy contributing troubleshooting architecture sop
This page contains information about the Fedora Release Engineering
team.
[[releng-contact-info]]
=== Contact Information
* IRC: `#fedora-releng` on irc.libera.chat
* Mailing List: https://admin.fedoraproject.org/mailman/listinfo/rel-eng[rel-eng@lists.fedoraproject.org]
* Issue tracker: https://pagure.io/releng/new_issue[Fedora Releng Pagure Tickets]
If you want the Release Engineering team to get something done (e.g.
moving packages to buildroots or into frozen compositions), please
create a ticket in the issue tracker mentioned above. Please enter your
FAS username or e-mail address in the respective text box so that the
team can contact you.
[[index-team-composition]]
=== Team Composition
* https://fedoraproject.org/wiki/User:kevin[Kevin Fenzi (nirik)]
* https://fedoraproject.org/wiki/User:sharkcz[Dan Horák (sharkcz)] (secondary arches)
* https://fedoraproject.org/wiki/User:pbrobinson[Peter Robinson (pbrobinson)]
* https://fedoraproject.org/wiki/User:maxamillion[Adam Miller (maxamillion)]
* https://fedoraproject.org/wiki/User:humaton[Tomas Hrcka (humaton)]
Release Team members are approved by FESCo. However, FESCo has delegated
this power to the Release Team itself. If you want to join the team,
please read <<join-releng>>.
=== What is Fedora Release Engineering?
For a broad overview, see `overview <overview>`.
=== Why we do things the way we do them
For information on the Fedora Release Engineering Philosophy, see
`philosophy <philosophy>`.
=== Fedora Release Engineering Leadership
Mohan Boddu (mboddu on IRC, FAS username mohanboddu)
Leadership is currently appointed by FESCo with input from the current
release team.
=== Things we Do
* Create official Fedora releases.
** Fedora Products
*** Cloud
*** Server
*** Workstation
** Fedora Spins
* Report progress towards release from
https://fedoraproject.org/wiki/Releases/Branched[branched] creation on.
* Give reports to FESCo on changes to processes.
* If something is known to be controversial, we let FESCo know before
implementing it; otherwise, implementation generally happens
concurrently with reporting.
* Set policy on freeze management
* Administer the build system(s)
* Remove unmaintained packages from Fedora
* Push updated packages
* Write and maintain tools to compose and push Fedora
[[join-releng]]
=== Joining Release Engineering
Much of rel-eng's communication is via IRC. One of the best ways to
initially get involved is to attend one of the meetings and say that
you're interested in doing some work during the open floor at the end of
the meeting. If you can't make the meeting times, you can also ping one
of us on IRC or sign up for the
https://admin.fedoraproject.org/mailman/listinfo/rel-eng[mailing list].
Since release engineering needs special access to systems essential to
Fedora, people new to rel-eng will usually get access a little bit at a
time. Typically, people won't immediately be granted the ability to sign
packages and push updates, for example. A couple of tasks you could start
out with are troubleshooting why builds are failing (and whether rel-eng
could take action to fix them) as the requests are submitted to Pagure,
or helping with scripts for various rel-eng tasks.
There are also a number of tools that Fedora Release Engineering uses
and relies upon. Working on improving these upstream, to facilitate the
new things that the Fedora Project is aiming to deliver, is also a great
way to get involved with Fedora rel-eng.
=== How we do it
See our `Standard Operating Procedures <sop>` for details on how we do
the things we do.
Most discussions regarding release engineering happen either in
`#fedora-releng` or on the releng mailing list. For
requests, please consult <<releng-contact-info>>.
=== Meetings
rel-eng holds regular meetings every Monday at 14:30 UTC in
`#fedora-meeting-3` on the Libera IRC network.
* https://pagure.io/releng/issues?status=Open&tags=meeting[Meeting
agendas] are created from open tickets in pagure that contain the
meeting keyword.
==== Meeting Minutes
Minutes are posted to the rel-eng mailing list. They are also available
at the
https://meetbot.fedoraproject.org/sresults/?group_id=releng&type=team[Meetbot
team page for releng].
There are also
https://fedoraproject.org/wiki/ReleaseEngineering/Meetings[historical
Meeting Minutes for 2007-04-16 to 2009-05-04].
=== Current activities
See our https://pagure.io/releng/issues[ticket queue] for the things we
are currently working on.
See https://fedoraproject.org/wiki/Releases[Releases] for information
about Fedora releases, including schedules.
=== Freeze Policies
* https://fedoraproject.org/wiki/Milestone_freezes[Milestone (Alpha,
Beta, Final) freezes]
* https://fedoraproject.org/wiki/Software_String_Freeze_Policy[String
Freeze Policy] (Same time as Alpha Freeze)
* https://fedoraproject.org/wiki/Changes/Policy[Change freeze policy]
(that's 'Change' as in 'feature')
* https://fedoraproject.org/wiki/Updates_Policy[Updates Policy] (not
technically a freeze, but of interest)


@ -0,0 +1,141 @@
[[overview]]
== Fedora Release Engineering Overview
[[overview-intro]]
=== Introduction
The development of Fedora is a very open process, involving over a
thousand package maintainers (along with testers, translators,
documentation writers and so forth). These maintainers are responsible
for the bulk of Fedora distribution development. An elected
https://fedoraproject.org/wiki/Fedora_Engineering_Steering_Committee[committee
of people] provides some level of direction over the engineering aspects
of the project.
The rapid pace of Fedora development leaves little time for polishing
the development system into a quality release. To solve this dilemma,
the Fedora project makes use of regular freezes and milestone (Alpha,
Beta, Final) releases of the distribution, as well as "branching" of our
trees to maintain different strains of development.
Stable branches of the Fedora tree and associated
https://fedoraproject.org/wiki/Repositories[Repositories] are maintained
for each Fedora release. The
https://fedoraproject.org/wiki/Releases/Rawhide[Rawhide] rolling
development tree is the initial entry point for all Fedora development,
and the trunk from which all branches diverge. Releases are
https://fedoraproject.org/wiki/Releases/Branched[Branched] from Rawhide
some time before they are sent out as stable releases, and the milestone
releases (Alpha, Beta and Final) are all built from this Branched tree.
Nightly snapshot images of various kinds are built from Rawhide and
Branched (when it exists) and made available for download from within
the trees on the https://mirrors.fedoraproject.org/[mirrors] or from the
https://fedoraproject.org/wiki/Koji[Koji] build system.
The https://fedoraproject.org/wiki/Fedora_Release_Life_Cycle[Fedora
Release Life Cycle] page is a good entry point for more details on these
processes. Some other useful references regarding the Fedora release
process include:
* The https://fedoraproject.org/wiki/Changes/Policy[Release planning
process]
* The
https://fedoraproject.org/wiki/QA:Release_validation_test_plan[release
validation test plan]
* The https://fedoraproject.org/wiki/QA:Updates_Testing[updates-testing
process], via https://fedoraproject.org/wiki/Bodhi[Bodhi] and the
https://fedoraproject.org/wiki/Updates_Policy[Updates Policy]
* The https://fedoraproject.org/wiki/QA:SOP_compose_request[test compose
and release candidate system]
* The https://fedoraproject.org/wiki/QA:SOP_blocker_bug_process[blocker
bug process] and
https://fedoraproject.org/wiki/QA:SOP_freeze_exception_bug_process[freeze
exception bug process]
* The https://fedoraproject.org/wiki/Repositories[Repositories]
* The https://fedoraproject.org/wiki/Bugs_and_feature_requests[Bugzilla
system]
=== Final Release Checklist
Various tasks need to be accomplished prior to a final Fedora release.
Release Engineering is responsible for many of them, as outlined here.
==== Release Announcement
The https://fedoraproject.org/wiki/Docs_Project[Fedora Documentation
Project] prepares release announcements for the final releases. A
https://bugzilla.redhat.com/bugzilla/enter_bug.cgi?product=Fedora%20Documentation&op_sys=Linux&target_milestone=---&bug_status=NEW&version=devel&component=release-notes&rep_platform=All&priority=normal&bug_severity=normal&assigned_to=relnotes%40fedoraproject.org&cc=&estimated_time_presets=0.0&estimated_time=0.0&bug_file_loc=http%3A%2F%2F&short_desc=RELNOTES%20-%20Summarize%20the%20release%20note%20suggestion%2Fcontent&comment=Provide%20details%20here.%20%20Do%20not%20change%20the%20blocking%20bug.&status_whiteboard=&keywords=&issuetrackers=&dependson=&blocked=151189&ext_bz_id=0&ext_bz_bug_id=&data=&description=&contenttypemethod=list&contenttypeselection=text%2Fplain&contenttypeentry=&maketemplate=Remember%20values%20as%20bookmarkable%20template&form_name=enter_bug[bug
needs to be filed] for this two weeks before the final release date.
==== Mirror List Files
A new set of mirror list files needs to be created for the new release.
Email mailto:mirror-admin@fedoraproject.org[Fedora Mirror Admins] to
have these files created. These should be created at the Final Freeze
point but may redirect to Rawhide until final bits have been staged.
=== Release Composing
Fedora “releases” can be built by anyone with a fast machine of proper
arch and access to a package repository. All of the tools necessary to
build a release are available from the package repository. These tools
aim to provide a consistent way to build Fedora releases. A complete
release can actually be built with only a couple of commands, including
the creation of network install images, offline install images ('DVDs'),
live images, disk images, install repositories, FedUp upgrade
images, and other bits. These commands are named pungi and
livecd-creator.
[NOTE]
.Note
====
There is work currently being done to replace livecd-creator with
https://github.com/rhinstaller/lorax/blob/master/src/sbin/livemedia-creator[livemedia-creator].
====
==== Pungi
https://pagure.io/pungi[Pungi] is the tool used to compose Fedora
releases. It must be run in a chroot of the package set that it
is composing. This ensures that the correct userland tools are used to
match up with the kernel that will be used to perform the installation.
It uses a comps file + yum repos to gather packages for the compose. The
https://fedoraproject.org/wiki/Koji[Koji] build system provides a way to
run tasks within chroots on the various arches, as well as the ability
to produce yum repos of packages from specific collections.
==== Livecd-creator
Livecd-creator is part of the
https://fedoraproject.org/wiki/FedoraLiveCD[livecd-tools] package in
Fedora and it is used to compose the live images as part of the Fedora
release. It is also used to compose many of the custom
https://spins.fedoraproject.org[Spins] or variants of Fedora.
=== Distribution
Once a compose has been completed, the composed tree of release media,
installation trees, and frozen
https://fedoraproject.org/wiki/Repositories[Repositories] needs to be
synchronized with the Fedora mirror system. MirrorManager has some
more details on the mirror system. Many of the images are also offered
via BitTorrent as an alternative method of downloading.
==== Download Mirrors
Download mirrors depend on the Fedora mirror system and infrastructure
to populate them privately.
==== BitTorrent
BitTorrent is currently served by http://torrent.fedoraproject.org.
Images are added to the system via this
https://infrastructure.fedoraproject.org/infra/docs/docs/sysadmin-guide/sops/torrentrelease.rst[Standard
Operating Procedure].
=== Acknowledgements
This document was influenced by
http://www.freebsd.org/doc/en_US.ISO8859-1/articles/releng/article.html[release
engineering documents] from http://freebsd.org[FreeBSD].


@ -0,0 +1,59 @@
== Clean AMIs Process
=== Description
The Fedora AMIs are uploaded on a daily basis to Amazon Web Services.
Over time the AMIs pile up and have to be removed manually.
Manual removal comes with its own set of issues, the most likely being
that AMIs which should be deleted are missed.
The goal of the script is to automate the process and keep removing
the AMIs regularly. The report of the script is pushed to a
https://pagure.io/ami-purge-report[Pagure repo].
=== Action
There is a script in the https://pagure.io/releng[Fedora RelEng repo]
named `clean-amis.py` under the `scripts` directory.
The script runs as a cron job within the Fedora Infrastructure to delete
the old AMIs. The permissions of the selected AMIs are first changed to
private. This makes sure that, if someone from the community raises
an issue, we have the option to make the AMI public again. After 10
days, if no complaints are raised, the AMIs are deleted permanently.
The complete process can be divided into a couple of parts:
* Fetching the data from datagrepper: Based on the `--days` parameter,
the script fetches the fedmsg messages from datagrepper for the
specified time frame, i.e. for the last `n` days, where `n` is the value
of the `--days` parameter. The queried fedmsg topic is
`fedimg.image.upload`.
* Selection of the AMIs: After the AMIs are parsed from the datagrepper
messages, they are filtered to remove Beta, Two-week Atomic Host and GA
released AMIs. Composes with `compose_type` set to `nightly` are picked
up for deletion. Composes which contain a date in the `compose_label`
are also picked up for deletion. GA composes also have `compose_type`
set to `production`, so to distinguish them we check whether the
`compose_label` contains a date. GA composes don't have a date in the
label; instead they have the version in the format X.Y.
* Updating the permissions of the AMIs: The permissions of the selected
AMIs are changed to private.
* Deletion of the AMIs: After 10 days, the private AMIs are deleted.
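
The selection rule described in the list above can be sketched in Python
as follows. This is a minimal illustration, not the code in
`clean-amis.py`: the function name, the sample labels, and the
assumption that a date appears in a label as eight consecutive digits
(`YYYYMMDD`) are all hypothetical.

```python
import re

# Assumption: a date shows up in a compose label as eight digits (YYYYMMDD).
DATE_IN_LABEL = re.compile(r"\d{8}")


def is_purge_candidate(compose_type, compose_label):
    """Return True if an AMI from this compose would be picked up for deletion."""
    # Nightly composes are always candidates.
    if compose_type == "nightly":
        return True
    # Other composes (including "production") are candidates only when the
    # label carries a date; GA labels carry a version such as X.Y instead.
    return bool(compose_label) and DATE_IN_LABEL.search(compose_label) is not None


# Hypothetical sample values, for illustration only.
samples = [
    ("nightly", None),
    ("production", "RC-20180101.0"),
    ("production", "RC-1.5"),
]
for compose_type, label in samples:
    print(compose_type, label, is_purge_candidate(compose_type, label))
```

Keeping the rule in one small predicate makes it easy to check against
real compose metadata before any permissions are changed.
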
In order to change the permissions of the AMIs, use the command given
below. Add the `--dry-run` argument to test that the command works;
with `--dry-run`, the script prints the AMIs to the console.
....
AWS_ACCESS_KEY={{ ec2_image_delete_access_key_id }} AWS_SECRET_ACCESS_KEY={{ ec2_image_delete_access_key }} PAGURE_ACCESS_TOKEN={{ ami_purge_report_api_key }} ./clean-amis.py --change-perms --days 7 --permswaitperiod 5
....
In order to delete the AMIs whose launch permissions have been removed,
use the command given below. Add the `--dry-run` argument to test that
the command works; with `--dry-run`, the script prints the AMIs to the
console.
....
AWS_ACCESS_KEY={{ ec2_image_delete_access_key_id }} AWS_SECRET_ACCESS_KEY={{ ec2_image_delete_access_key }} PAGURE_ACCESS_TOKEN={{ ami_purge_report_api_key }} ./clean-amis.py --delete --days 17 --deletewaitperiod 10
....


@ -0,0 +1,28 @@
== Finding Module Information
=== Description
When users submit builds to the Module Build Service (MBS), it in turn
submits builds to Koji. Sometimes, you are looking at a koji build, and
you want to know what module-build it is a part of.
=== Caveat
It requires that the build has been completed and has been tagged, until
https://pagure.io/fm-orchestrator/issue/375 is complete.
=== Setup
Run the following:
....
$ sudo dnf install python-arrow python-requests koji
....
=== Action
Run the following:
....
$ scripts/mbs/koji-module-info.py $BUILD_ID
....


@ -0,0 +1,185 @@
== Generating Openh264 Composes
=== Description
Openh264 repos are a special case and we need to generate the composes
for it in a different way. We use ODCS to generate the private compose
and send the rpms to Cisco to publish them on their CDN. We publish the
repodata on our side.
[WARNING]
.Warning
====
We do not have all the appropriate legal rights to distribute these
packages, so we need to be extra careful to make sure they are never
distributed via our build system or websites.
====
=== Action
==== Permissions needed
You will need some ODCS permissions in order to request private composes
and composes from tags. You can set this in infra/ansible in
inventory/group_vars/odcs in the odcs_allowed_clients_users variable.
See other releng users entries for format.
==== Get the odcs token
In order to generate an ODCS compose, you need an openidc token.
Run `odcs-token.py` under `scripts/odcs/` in the Pagure releng
repository to generate the token.
....
$ ./odcs-token.py
....
==== Make sure rpms are written out with the right signature
....
$ koji write-signed-rpm eb10b464 openh264-2.2.0-1.fc38
....
The first argument is the signing key for that branch, followed by the
openh264 package name and version.
==== Generate a private odcs compose
With the token generated above, generate the private ODCS compose:
....
$ python odcs-private-compose.py <token> <koji_tag> <signingkeyid>
....
`koji_tag`: fxx-openh264 (Openh264 builds are tagged to fxx-openh264
tags where `xx` represents the Fedora release)
`signingkeyid`: The short hash of the key for this Fedora branch.
The composes are stored under the `/srv/odcs/private/` directory on
`odcs-backend-releng01.iad2.fedoraproject.org`.
==== Pull the compose to your local machine
We need to extract the rpms and tar them to send them to Cisco. In order
to do that, we first need to pull the compose to our local machine.
===== Move the compose to your home dir on odcs-backend-releng01.iad2.fedoraproject.org
Since the compose is owned by `odcs-server`, pull it into
your home dir:
....
$ mkdir ~/32-openh264
$ sudo rsync -avhHP /srv/odcs/private/odcs-3835/ ~/32-openh264/
$ sudo chown -R mohanboddu:mohanboddu ~/32-openh264/
....
===== Sync the compose to your local machine
Pull in the compose from your home dir on odcs releng backend to your
local machine into a temp working dir
....
$ mkdir openh264-20200813
$ scp -rv odcs-backend-releng01.iad2.fedoraproject.org:/home/fedora/mohanboddu/32-openh264/ openh264-20200813/
....
===== Make the changes needed
Run the following commands to create the tar files to send to Cisco:
....
$ cd openh264-20200813
$ mkdir 32-rpms
# Copy rpms including devel rpms
$ cp -rv 32-openh264/compose/Temporary/*/*/*/*/*rpm 32-rpms/
# Copy debuginfo rpms
$ cp -rv 32-openh264/compose/Temporary/*/*/*/*/*/*rpm 32-rpms/
# copy the src.rpm
$ cp -rv 32-openh264/compose/Temporary/*/*/*/*/*src.rpm 32-rpms/
$ cd 32-rpms
# Create the tar file with the rpms
$ tar -cJvf ../fedora-32-openh264-rpms.tar.xz *rpm
....
We need to send this tar file to Cisco along with the list of rpms in
each tarball.
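
One way to produce the list of rpms for a tarball is to read it back
from the archive itself, so the list cannot drift from what was actually
packed. The sketch below is only an illustration; the helper name
`list_rpms_in_tarball` is hypothetical and not part of the releng
scripts.

```python
import tarfile


def list_rpms_in_tarball(tarball_path):
    """Return the sorted names of the .rpm members in a tar archive."""
    # "r:*" lets tarfile detect the compression (xz here) on its own.
    with tarfile.open(tarball_path, "r:*") as archive:
        return sorted(
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.endswith(".rpm")
        )


# Example use, with the tarball name from the commands above:
#   for name in list_rpms_in_tarball("fedora-32-openh264-rpms.tar.xz"):
#       print(name)
```
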
===== Syncing the compose to sundries01
Once we get a confirmation from Cisco that the rpms are updated on their
CDN, verify them by using curl. For example:
....
$ curl -I http://ciscobinary.openh264.org/openh264-2.1.1-1.fc32.x86_64.rpm
....
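
When many rpms need checking, the same HEAD request can be scripted. The
following is a rough sketch using only the Python standard library; the
function names are hypothetical, and it assumes every rpm is published
directly under the host shown in the curl example.

```python
import urllib.request

# Host taken from the curl example above.
CDN_BASE = "http://ciscobinary.openh264.org/"


def cdn_url(rpm_filename):
    """Build the CDN URL for one rpm file name."""
    return CDN_BASE + rpm_filename


def is_published(rpm_filename, timeout=30):
    """Send a HEAD request (like `curl -I`) and report whether it returned 200."""
    request = urllib.request.Request(cdn_url(rpm_filename), method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError and HTTPError are OSError subclasses: treat as "not there yet".
        return False


# Example use (performs a network request):
#   is_published("openh264-2.1.1-1.fc32.x86_64.rpm")
```
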
Now push these composes to *sundries01.iad2.fedoraproject.org* and
*mm-backend01.iad2.fedoraproject.org*
On sundries01 we need to sync to a directory that is owned by _apache_,
so first we sync to the home directory on sundries01. Same with
mm-backend01 as the directory is owned by _root_.
Create a temp working directory on sundries01
....
$ ssh sundries01.iad2.fedoraproject.org
$ mkdir openh264-20200825
....
Create a temp working directory on mm-backend01
....
$ ssh mm-backend01.iad2.fedoraproject.org
$ mkdir openh264-20200825
....
Then from your local machine, sync the compose
....
$ cd openh264-20200825
$ rsync -avhHP 32-openh264 sundries01.iad2.fedoraproject.org:/home/fedora/mohanboddu/openh264-20200825
$ rsync -avhHP 32-openh264 mm-backend01.iad2.fedoraproject.org:/home/fedora/mohanboddu/openh264-20200825
....
On sundries01
....
$ cd openh264-20200825
$ sudo rsync -avhHP 32-openh264/compose/Temporary/ /srv/web/codecs.fedoraproject.org/openh264/32/
....
On mm-backend01
....
$ cd openh264-20200825
$ sudo rsync -avhHP 32-openh264/compose/Temporary/ /srv/codecs.fedoraproject.org/openh264/32/
....
===== Extra info
Normally that should be it, but in some cases you may want to push
things out faster than normal. Here are a few things you can do to
achieve that:
On mm-backend01.iad2.fedoraproject.org you can run:
....
# sudo -u mirrormanager /usr/local/bin/umdl-required codecs /var/log/mirrormanager/umdl-required.log
....
This will have mirrormanager scan the codecs dir and update it if it's
changed.
On batcave01.iad2.fedoraproject.org you can use ansible to force all the
proxies to sync the codec content from sundries01:
....
# ansible -a '/usr/bin/rsync --delete -a --no-owner --no-group sundries01::codecs.fedoraproject.org/ /srv/web/codecs.fedoraproject.org/' proxies
....
Mirrorlist servers should update every 15 minutes.


@ -0,0 +1,17 @@
== Package Blocking
=== Description
If a
https://fedoraproject.org/wiki/How_to_remove_a_package_at_end_of_life[package
is removed (retired) from Fedora], for example because it was renamed,
it needs to be blocked in Koji. This prevents creating new package
builds and distribution of built RPMs. Packages are blocked in the
listing of `tags`. Because of inheritance, it is enough to block a
package at the oldest tag; this also makes it unavailable in the tags
that inherit from it.
=== Action
The blocking of retired packages is done by the
https://pagure.io/releng/blob/master/f/scripts/block_retired.py[block_retired.py]
script as part of the daily Rawhide and Branched composes.


@ -0,0 +1,121 @@
== Package Unblocking
=== Description
Packages are sometimes unblocked from Fedora, usually when a package had
been orphaned and now has a new owner. When this happens, release
engineering needs to "unblock" the package from koji tags.
=== Action
==== Find Unblock requests
Unblock requests are usually reported in the
https://pagure.io/releng/issues[rel-eng issue tracker].
==== Perform the unblocking
First assign the ticket to yourself to show that you are handling the
request.
===== Discover proper place to unblock
The ticket should tell you which Fedora releases to unblock the package
in. Typically it'll say "Fedora 13" or "F14". This means we need to
unblock it at that Fedora level and all future tags. However depending
on where the package was blocked we may have to do our unblock action at
a different Fedora level.
To discover where a package is blocked, use the `list-pkgs` method of
koji.
....
$ koji list-pkgs --help
Usage: koji list-pkgs [options]
(Specify the --help global option for a list of other help options)
Options:
-h, --help show this help message and exit
--owner=OWNER Specify owner
--tag=TAG Specify tag
--package=PACKAGE Specify package
--quiet Do not print header information
--noinherit Don't follow inheritance
--show-blocked Show blocked packages
--show-dups Show superseded owners
--event=EVENT# query at event
--ts=TIMESTAMP query at timestamp
--repo=REPO# query at event for a repo
....
For example, if we wanted to see where python-psyco was blocked we would
do:
....
$ koji list-pkgs --package python-psyco --show-blocked
Package Tag Extra Arches Owner
----------------------- ----------------------- ---------------- ---------------
python-psyco dist-f14 konradm [BLOCKED]
python-psyco olpc2-ship2 shahms
python-psyco olpc2-trial3 shahms
...
....
Here we can see that it was blocked at dist-f14. If we got a request
to unblock it before f14, we can simply use the dist-f14 tag
to unblock. However, if they want it unblocked after f14, we would use
the earliest dist-f?? tag the user wants, such as dist-f15 if the user
asked for it to be unblocked in Fedora 15+.
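
The lookup and the tag choice described above can be sketched in Python.
This is an illustration only: both function names are hypothetical, the
parser assumes the whitespace-separated column layout shown in the
sample output, and the `dist-fNN` tag naming simply follows the examples
in this document.

```python
def blocked_tags(list_pkgs_output, package):
    """Return the tags in which `package` is marked [BLOCKED]."""
    tags = []
    for line in list_pkgs_output.splitlines():
        fields = line.split()
        # Data rows start with the package name; the tag is the second column.
        if (len(fields) >= 2 and fields[0] == package
                and line.rstrip().endswith("[BLOCKED]")):
            tags.append(fields[1])
    return tags


def unblock_tag(blocked_release, requested_release):
    """Pick the dist-fNN tag to unblock at, following the rule described above."""
    # A request at or before the blocking release is served by unblocking where
    # the block was placed; a later request uses the requested release's tag.
    return "dist-f{}".format(max(blocked_release, requested_release))


SAMPLE = """\
Package                 Tag                     Extra Arches     Owner
----------------------- ----------------------- ---------------- ---------------
python-psyco            dist-f14                                 konradm [BLOCKED]
python-psyco            olpc2-ship2                              shahms
"""

print(blocked_tags(SAMPLE, "python-psyco"))  # ['dist-f14']
print(unblock_tag(14, 13))                   # dist-f14
print(unblock_tag(14, 15))                   # dist-f15
```
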
===== Performing the unblock
To unblock a package for a tag, use the `unblock-pkg` method of Koji.
....
$ koji unblock-pkg --help
Usage: koji unblock-pkg [options] tag package [package2 ...]
(Specify the --help global option for a list of other help options)
Options:
-h, --help show this help message and exit
....
For example, if we were asked to unblock python-psyco in F14 we would
issue:
....
$ koji unblock-pkg dist-f14 python-psyco
....
Now the ticket can be closed.
=== Verification
To verify that the package was successfully unblocked use the
`list-pkgs` koji command:
....
$ koji list-pkgs --package python-psyco --show-blocked
....
We should see the package listed as not blocked at dist-f14 or above:
....
Package Tag Extra Arches Owner
----------------------- ----------------------- ---------------- ---------------
python-psyco olpc2-trial3 jkeating
python-psyco olpc2-ship2 jkeating
python-psyco olpc2-update1 jkeating
python-psyco trashcan jkeating
python-psyco f8-final jkeating
...
....
We should not see it listed as blocked in dist-f14 or any later Fedora
tags.
=== Consider Before Running
* Watch the next day's rawhide/branched/whatever report for a slew of
broken deps related to the package. We may have to re-block the package
in order to fix the deps.


@ -0,0 +1,33 @@
== Process fedora-scm-requests tickets
=== Description
When a packager wants a new package added to Fedora or a new dist-git
branch blessed, they need to go through the new package process and,
once their package review is approved, they use the
`fedrepo-req` CLI tool to file a ticket in the
https://pagure.io/releng/fedora-scm-requests[fedora-scm-requests queue].
Periodically (daily?), release engineering will need to review and
process this queue using the `fedrepo-req-admin` tool.
=== Setup
A release engineer will need to have several values set locally as
well as sufficient permissions in a number of server-side systems.
. A pagure.io token. See the fedrepo-req README for instructions on
where to get this.
. A src.fedoraproject.org token generated by `pagure-admin`.
Ask @pingou how to get one. If doing this yourself, go to pkgs01 and run
`PAGURE_CONFIG=/etc/pagure/pagure.cfg pagure-admin admin-token create -h`
for more info.
. pdc token. See the PDC SOP for getting one of these.
=== Action
. Run `fedrepo-req-admin list` to list all open requests.
. Run `fedrepo-req-admin process N` to process a particular
ticket.
. Run `fedrepo-req-admin processall` to iterate over all the
tickets.


@ -0,0 +1,225 @@
== Pushing Updates
=== Description
Fedora updates are typically pushed once a day. This SOP covers the
steps involved.
==== Coordinate
Releng has a rotation of who pushes updates when. Please coordinate and
only push updates when you are expected to or have notified other releng
folks you are doing so. See:
https://apps.fedoraproject.org/calendar/release-engineering/ for the
list, or on IRC you can run `.pushduty` in any channel with zodbot to see
who is on duty this week.
==== Log in to the machine to sign updates
Log in to a machine that is configured for sigul client support and has
the bodhi client installed. Currently, this machine is:
`bodhi-backend01.phx2.fedoraproject.org`
==== Decide what releases you're going to push.
* If there is a Freeze ongoing, you SHOULD NOT push all stable requests
for a branched release, only specific blocker or freeze exception
requests that QA will request in a releng ticket.
* If there is no Freeze ongoing you can push all Fedora and EPEL
releases at the same time if you wish.
* From time to time there may be urgent requests in some branches, you
can only push those if requested. Note however that bodhi2 will
automatically push branches with security updates before others.
==== Get a list of packages to push
....
$ sudo -u apache bodhi-push --releases 'f26,f25,f24,epel-7,EL-6' --username <yourusername>
<enter your password+2factorauth, then your fas password>
....
Sometimes you see a warning "Warning: foobar-1.fcxx has unsigned builds
and has been skipped", which means those updates are currently being
signed. This can be verified by listing the tagged builds in the
fxx-signing-pending tag:
....
$ koji list-tagged fxx-signing-pending
....
You can say 'n' to the push at this point if you wish to sign packages
(see below). Or you can keep this request open in a window while you
sign the packages, then come back and say y.
List the releases above you wish to push from: 25 24 5 6 7, etc
You can also specify `--request=testing` to limit pushes. Valid types
are `testing` or `stable`.
The list of updates should be in the cache directory named
`Stable-$Branch` or `Testing-$Branch` for each of the Branches you
wished to push.
During freezes you will need to do two steps. For example, if Fedora 26
Branched was frozen:
....
$ sudo -u apache bodhi-push --releases f26 --request=testing \
--username <username>
....
Then
....
$ sudo -u apache bodhi-push --releases 'f25,f24,epel-7,EL-6' --username <username>
....
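
The two-step freeze push above can be expressed as a small helper that
builds both command lines from a list of releases. This is a sketch for
illustration, not part of Bodhi or the releng scripts; the function name
and the `someuser` value are hypothetical, and only flags already shown
in this SOP are used.

```python
def freeze_push_commands(frozen_release, other_releases, username):
    """Return the two bodhi-push command lines used while one release is frozen."""
    base = ["sudo", "-u", "apache", "bodhi-push"]
    # Step 1: only testing requests for the frozen branched release.
    frozen = base + ["--releases", frozen_release, "--request=testing",
                     "--username", username]
    # Step 2: all requests for the releases that are not frozen.
    others = base + ["--releases", ",".join(other_releases),
                     "--username", username]
    return [frozen, others]


for command in freeze_push_commands("f26", ["f25", "f24", "epel-7", "EL-6"], "someuser"):
    print(" ".join(command))
```

Printing the commands rather than running them keeps the interactive
password and confirmation prompts in the hands of the person on push
duty.
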
During the Release Candidate compose phase we tag builds to be included
into a -compose tag (e.g. f26-compose). When we have a candidate that
has been signed off as gold we need to ensure that all builds tagged
into the -compose tag have been pushed stable. Once we have pushed all
-compose builds stable we then have to clone the base tag (e.g. f26) to
a tag for the milestone for Alpha and Beta (e.g. f26-Alpha). After final
release we need to lock the base tag and adjust the release status in
bodhi so that updates now hit the -updates tag (e.g. f26-updates). Once
we have cloned the tag or locked the tag and adjusted bodhi we are free
to push stable updates again.
==== Pushing Stable updates during freeze
During freezes we need to push to stable builds included in the compose.
QA will file a ticket with the NVRs to push.
[NOTE]
.Note
====
If you are pushing a bodhi update that contains multiple builds, you
need only pass bodhi-push a single build nvr and all the others in that
update will be detected and pushed along with it. However, if you are
pushing multiple disjoint bodhi updates then each build will need to be
listed individually.
====
....
$ sudo -u apache bodhi-push --builds '<nvr1>,<nvr2>,...' --username <username>
....
===== There are no updates to push.
If you are getting the message `There are no updates to push.` or the
list of packages you are seeing to push out for the Stable updates
request is not correct compared to what you specified in the `--builds`
section of the command above then one of two things likely happened.
. The update hasn't yet reached appropriate karma
+
This should be handled case by case. If the QA Team has requested that
this be pushed as stable to fix a blocker, but there is not yet enough
karma for an autostable prompt to occur, then you should verify with QA
that these are ready to go out even without karma. If they are, then log
into the Bodhi WebUI and modify the karma threshold of the update to 1
and add karma (if necessary). This is not something we should do as
normal practice and is considered an edge case. When update requests
come to RelEng, they should have appropriate karma. Sometimes they don't
and, as long as QA approves, we need not block on it.
. The update was never requested for stable
+
It's possible the update wasn't requested for stable. You can resolve
this by running the following on one of the bodhi-backend systems:
+
....
bodhi updates request <BODHI-REQUEST-ID> stable
....
==== Perform the bodhi push
Say 'y' to push for the above command.
=== Verification
. Monitor Bodhi's composes with `bodhi-monitor-composes`
+
....
$ sudo -u apache watch -d -n 60 bodhi-monitor-composes
....
. Monitor the systemd journal
+
....
$ sudo journalctl -o short -u fedmsg-hub -l -f
....
. Check the processes
+
....
$ ps axf|grep bodhi
....
. Watch for fedmsgs through the process. It will indicate what releases
it's working on, etc. You may want to watch in `#fedora-fedmsg`.
+
....
bodhi.masher.start -- kevin requested a mash of 48 updates
bodhi.mashtask.start -- bodhi masher started a push
bodhi.mashtask.mashing -- bodhi masher started mashing f23-updates
bodhi.mashtask.mashing -- bodhi masher started mashing f22-updates-testing
...
bodhi.update.complete.stable -- moceap's wondershaper-1.2.1-5.fc23 bodhi update completed push to stable https://admin.fedoraproject.org/updates/FEDORA-2015-13052
...
bodhi.errata.publish -- Fedora 23 Update: wondershaper-1.2.1-5.fc23 https://admin.fedoraproject.org/updates/FEDORA-2015-13052
bodhi.mashtask.complete -- bodhi masher successfully mashed f23-updates
bodhi.mashtask.sync.wait -- bodhi masher is waiting for f22-updates-testing to hit the master mirror
....
. Search for problems with a particular push:
+
....
$ sudo journalctl --since=yesterday -o short -u fedmsg-hub | grep dist-6E-epel   # or f22-updates, etc.
....
. Note: Bodhi looks at the updates you have told it to push and checks
whether any are security updates; those branches are started first. It
then fires off threads (up to 3 at a time) to do the rest.
=== Consider Before Running
Pushes often fall over due to tagging issues or unsigned packages. Be
prepared to work through the failures and restart pushes from time to
time:
....
$ sudo -u apache bodhi-push --resume
....
Bodhi will ask you which push(es) you want to resume.
=== Common issues / problems with pushes
* When the push fails due to new unsigned packages that were added after
you started the process, re-run step 4a or 4b with just the package
names that need to be signed, then resume.
* When the push fails due to an old package that has no signature, run:
`koji write-signed-rpm <gpgkeyid> <n-v-r>` and resume.
* When the push fails due to a package not being tagged with
updates-testing when being moved stable:
`koji tag-pkg dist-<tag>-updates-testing <n-v-r>`
* When signing fails, you may need to ask that the sigul bridge or
server be restarted.
* If the updates push fails with:
`OSError: [Errno 16] Device or resource busy: '/var/lib/mock/*-x86_64/root/var/tmp/rpm-ostree.*'`
You need to umount any tmpfs mounts still open on the backend and resume
the push.
* If the updates push fails with:
`OSError: [Errno 39] Directory not empty: '/mnt/koji/mash/updates/*/../*.repocache/repodata/'`
you need to restart fedmsg-hub on the backend and resume.
* If the updates push fails with:
`IOError: Cannot open /mnt/koji/mash/updates/epel7-160228.1356/../epel7.repocache/repodata/repomd.xml: File /mnt/koji/mash/updates/epel7-160228.1356/../epel7.repocache/repodata/repomd.xml doesn't exists or not a regular file`
This issue will be resolved with NFSv4, but in the meantime it can be
worked around by removing the [.title-ref]#.repocache# directory and
resuming the push.
`$ sudo rm -fr /mnt/koji/mash/updates/epel7.repocache`
* If the Atomic OSTree compose fails with some sort of
[.title-ref]#Device or Resource busy# error, then run
[.title-ref]#mount# to see if there are any stray [.title-ref]#tmpfs#
mounts still active:
`tmpfs on /var/lib/mock/fedora-22-updates-testing-x86_64/root/var/tmp/rpm-ostree.bylgUq type tmpfs (rw,relatime,seclabel,mode=755)`
You can then
`$ sudo umount /var/lib/mock/fedora-22-updates-testing-x86_64/root/var/tmp/rpm-ostree.bylgUq`
and resume the push.
Other issues should be addressed by releng or bodhi developers in
`#fedora-releng`.
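For the stray `tmpfs` case described above, the candidate mount points can be picked out of `mount` output mechanically. This is a minimal sketch that assumes the mounts look like the example line shown above; `stray_ostree_mounts` is a hypothetical name.

```shell
# Hypothetical helper: read `mount` output on stdin and print the mount
# point of every tmpfs mount left under a mock root by rpm-ostree.
stray_ostree_mounts() {
    awk '$1 == "tmpfs" && $2 == "on" && $3 ~ /^\/var\/lib\/mock\/.*\/rpm-ostree\./ { print $3 }'
}
```

Running `mount | stray_ostree_mounts` lists the paths. Review the list before passing any of them to `sudo umount`.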

View file

@ -0,0 +1,43 @@
== Remove dist-git branches
=== Description
Release Engineering is often asked by maintainers to remove branches in
dist-git.
=== Action
. Log into batcave01
+
....
ssh <fas-username>@batcave01.iad2.fedoraproject.org
....
. Get root shell
. Log into pkgs01.iad2.fedoraproject.org :
+
....
ssh pkgs01.iad2.fedoraproject.org
....
. Change to the package's directory
+
....
cd /srv/git/rpms/<package>.git/
....
. Remove the branch
+
....
git branch -D <branchname>
....
=== Verification
To verify, just list the branches.
....
git branch
....
=== Consider Before Running
Make sure that the branch in question isn't one of our pre-created
branches `f??/rawhide`, `olpc?/rawhide`, `el?/rawhide`
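A small guard can make this check harder to forget. The sketch below takes the three patterns above literally as shell glob patterns; `is_precreated_branch` is a hypothetical name, not an existing tool.

```shell
# Hypothetical guard: succeed (exit 0) when the branch name matches one
# of the pre-created patterns listed above and must not be deleted.
is_precreated_branch() {
    case "$1" in
        f??/rawhide|olpc?/rawhide|el?/rawhide) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `is_precreated_branch "<branchname>" || git branch -D "<branchname>"` only deletes the branch when it is not one of the protected names.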

View file

@ -0,0 +1,27 @@
== Requesting Automation Users
[[sop_requesting_task_automation_user]]
=== Description
When performing automated Release Engineering tasks using `RelEng
Automation <_releng-automation>` you will sometimes find that you need
to perform an action in the Infrastructure with `sudo` that does not yet
have an automation user associated with it.
=== Actions
==== Requesting a new loopabull user
File a ticket with https://pagure.io/fedora-infrastructure/[Fedora
Infrastructure] making sure to satisfy the following requirements:
* Provide a justification for these permissions being needed (What are
you trying to do and why?)
* Commands needing to be run with sudo
* Destination server on which the commands need to be run
* The `loopabull_` username requested to be created for this OR which
`loopabull_` username needs its pre-existing permissions enhanced
For reference:
https://pagure.io/fedora-infrastructure/issue/5943[Example
Infrastructure Ticket]

View file

@ -0,0 +1,86 @@
include::_partials/attributes.adoc[]
== Retire Orphaned Packages
=== Description
Prior to the
https://fedoraproject.org/wiki/Schedule[Feature Freeze/Branching] of
every release, Release Engineering retires
https://fedoraproject.org/wiki/Orphaned_package_that_need_new_maintainers[orphaned
packages]. This keeps unowned software out of the distribution and
prevents problems down the road.
=== Action
The orphan process takes place in stages:
. Detecting a list of orphans and the dependencies that will be broken
if the orphans are removed.
. Sending the list of potential orphans to devel@lists.fedoraproject.org
for community review and removal from the orphan list.
. Retiring packages nobody wants to adopt.
==== Detecting Orphans
A script called `find_unblocked_orphans.py` assists in the detection
process. It should be run on a machine that has `koji` and
`python-fedora` installed. It runs without options and takes a while to
complete.
`find_unblocked_orphans.py` is available in the
https://pagure.io/releng[Release Engineering git repository]
==== Announcing Packages to be retired
`find_unblocked_orphans.py` outputs text to stdout on the command line
in a form suitable for the body of an email message.
....
$ ./find_unblocked_orphans.py > email-message
....
Email the output to the development list
(`devel@lists.fedoraproject.org`) at least a month before the feature
freeze, and send mails with updated lists as necessary. This gives
maintainers an opportunity to pick up orphans that are important to them
or are required by other packages.
==== Retiring Orphans
Once maintainers have been given an opportunity to pick up orphaned
packages, the remaining
https://fedoraproject.org/wiki/How_to_remove_a_package_at_end_of_life[packages
are retired].
===== Bugs
This procedure probably leaves open bugs for the retired packages
behind. It is not within the scope of release engineering to take care
of these. If bugs are closed, only bugs targeted at Rawhide should be
affected, since other branches might still be maintained.
=== Verification
To verify that the packages were blocked correctly we can use the
`latest-pkg` `koji` action.
[source,subs="attributes+"]
....
$ koji latest-pkg dist-f{branched} wdm
....
This should return nothing, as the `wdm` package is blocked.
=== Consider Before Running
Generally we retire anything that doesn't leave broken dependencies. If
there are orphans whose removal would result in broken dependencies a
second warning should be sent to `devel@lists.fedoraproject.org` and to
`<package>-owner@fedoraproject.org` for each dependent package.
Allow another couple of days for maintainers to take notice and fix
these packages so the package repository can be maintained without broken
dependencies or needing to retire the package. It is not good to have
broken package dependencies in our package repositories, so every effort
should be made to find owners or to fix the broken dependencies.

View file

@ -0,0 +1,32 @@
include::_partials/attributes.adoc[]
== Sign the packages
* This doc explains how to sign builds in the release(s).
* Manual signing should rarely be needed anymore. Just make sure
that robosignatory is set up for all tags that are created.
* If a build seems to be stuck in the autosigning queue (one of the
-pending or -signing-pending tags), just koji untag and koji tag the
package. This will retrigger autosigning.
* If bodhi is reporting a build as unsigned but the build is not in the
-signing-pending tag, that means bodhi missed the tagging. Run the
following command to get the build retagged, giving Bodhi
another chance at seeing the signing:
+
....
$ koji move $dist-updates-testing-pending $dist-signing-pending $build
....
* If need be, sign builds using scripts/sigulsign_unsigned.py from
releng git repo
+
_NOTE! This will NOT help if Bodhi marks a build as unsigned!_
+
[source,subs="attributes+"]
....
$ ./sigulsign_unsigned.py -vv --write-all \
--sigul-batch-size=25 fedora-{branched} \
$(cat /var/cache/sigul/Stable-F{branched} /var/cache/sigul/Testing-F{branched})
....
(Make sure you sign each release with the right key... ie, 'fedora-19'
key with F19 packages, or 'epel-6' with EL-6 packages)
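Picking the right key can be derived from the build itself. The following is a minimal sketch, assuming the release field of the NVR ends in a `.fcNN` or `.elN` dist tag and that key names follow the `fedora-19` / `epel-6` examples above; `key_for_build` is a hypothetical helper, not part of sigul or the releng scripts.

```shell
# Hypothetical helper: map the dist tag at the end of a build NVR to a
# signing key name, following the fedora-19 / epel-6 examples above.
# Assumption: the release field ends in .fcNN or .elN.
key_for_build() {
    case "$1" in
        *.fc[0-9]*) n=${1##*.fc}; prefix=fedora ;;
        *.el[0-9]*) n=${1##*.el}; prefix=epel ;;
        *) echo "cannot determine release for $1" >&2; return 1 ;;
    esac
    # Keep only the leading digits (drops suffixes such as "_9").
    n=${n%%[!0-9]*}
    printf '%s-%s\n' "$prefix" "$n"
}
```

A build whose dist tag cannot be recognised makes the function fail instead of guessing a key.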

View file

@ -0,0 +1,57 @@
== Sigul Client Setup
This document describes how to configure a sigul client. For more
information on sigul, please see link:User-Mitr[User:Mitr]
=== Prerequisites
. Install `sigul` and its dependencies. It is available in both Fedora
and EPEL:
+
On Fedora:
+
....
dnf install sigul
....
+
On RHEL/CentOS (Using EPEL):
+
....
yum install sigul
....
. Ensure that your koji certificate and the link:Fedora-Cert[Fedora CA
certificates] are present on the system you're running the sigul client
from at the following locations:
* `~/.fedora.cert`
* `~/.fedora-server-ca.cert`
* `~/.fedora-upload-ca.cert`
. Admin privileges on koji are required to write signatures.
=== Configuration
. Run `sigul_setup_client`
. Choose a password for your NSS database. By default this will be
stored on-disk in `~/.sigul/client.conf`.
. Choose an export password. You will only need to remember it until
finishing `sigul_setup_client`.
. Enter the DB password you chose earlier, then the export password. You
should see the message `pk12util: PKCS12 IMPORT SUCCESSFUL`
. Enter the DB password again. You should see the message `Done`.
. Assuming that you are running the sigul client within phx2, edit
`~/.sigul/client.conf` to include the following lines:
....
[client]
bridge-hostname: sign-bridge.phx2.fedoraproject.org
server-hostname: sign-vault.phx2.fedoraproject.org
....
=== Updating your Fedora certificate
When your Fedora certificate expires, update it and then run the
following commands:
....
$ certutil -d ~/.sigul -D -n sigul-client-cert
$ sigul_setup_client
....

View file

@ -0,0 +1,22 @@
== Standard Operating Procedure Template
[NOTE]
.Note
====
This is a template file. It can be copied to a new filename in order
to start a new document.
New SOP Documents should be added to the `.. toctree::` list in the
`sop.rst` file.
These documents are formatted using sphinx-doc reStructuredText. For
markup reference please see: http://sphinx-doc.org/rest.html
====
=== Description
=== Action
=== Verification
=== Consider Before Running

View file

@ -0,0 +1,165 @@
== Unretiring a package branch
=== Description
Sometimes, packagers request that we _unretire_ a package branch that
has previously been retired.
This typically happens on the [.title-ref]#rawhide# branch, but could
conceivably happen on any stable or arbitrary branch.
=== Action
==== Validate Package Ready for Unretirement
. Verify the package was not retired for any reason, such as legal or
license issues, that would prevent it from being re-instated.
. Ensure a bugzilla was filed to review the package for unretirement.
. Verify with the requestor exactly which tags they would like
unblocked as part of the unretirement request.
==== Revert the Retirement Commit
. Connect to one of the compose systems.
+
____
....
$ ssh compose-x86-02.phx2.fedoraproject.org
....
____
. Clone the dist-git package using the proper release engineering
credentials.
+
____
....
$ GIT_SSH=/usr/local/bin/relengpush fedpkg --user releng clone PACKAGENAME
....
____
. Enter the directory of the cloned package and configure the git user
information.
+
____
....
$ cd PACKAGENAME
$ git config --local user.name "Fedora Release Engineering"
$ git config --local user.email "releng@fedoraproject.org"
....
____
. Git revert the [.title-ref]#dead.package# file commit in dist-git on
the particular branch using its commit hash_id. Ensure the commit
message contains a URL to the request in pagure.
+
____
....
$ git revert -s COMMIT_HASH_ID
$ GIT_SSH=/usr/local/bin/relengpush fedpkg --user releng push
....
____
==== Unblock the Package in Koji
. Check the current state of the branches in koji for the package.
+
____
....
$ koji list-pkgs --show-blocked --package=PACKAGENAME
....
____
. Unblock each requested tag using koji.
+
....
$ koji unblock-pkg TAGNAME PACKAGENAME
....
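When the requestor asks for several tags, it can help to print the full set of commands for review before running any of them. This is a minimal dry-run sketch; `print_unblock_commands` is a hypothetical name and nothing is executed against Koji.

```shell
# Hypothetical dry-run helper: the first argument is the package, the
# remaining arguments are tags. Prints one `koji unblock-pkg` command
# per tag without executing anything.
print_unblock_commands() {
    package=$1
    shift
    for tag in "$@"; do
        printf 'koji unblock-pkg %s %s\n' "$tag" "$package"
    done
}
```

Once the printed commands match the request, they can be run one by one or piped to `sh`.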
==== Verify Package is Not Orphaned
. Check package ownership
+
Navigate to [.title-ref]#https://src.fedoraproject.org/# and check
package owner.
. If the package is orphaned, then give the package to the requestor
using the [.title-ref]#give-package# script from the
https://pagure.io/releng[Release Engineering Repo].
+
....
$ ./scripts/distgit/give-package --package=PACKAGENAME --custodian=REQUESTOR
....
+
[NOTE]
.Note
====
This script requires the user to be a member of the group
[.title-ref]#cvsadmin# in FAS.
====
==== Update Product Definition Center (PDC)
[NOTE]
.Note
====
If there is more than one tag to be unblocked then the PDC update step
should be completed for each tag and package.
====
. Log into the https://pdc.fedoraproject.org/[Fedora PDC instance] using
a FAS account.
. Check PDC for the entry for the [.title-ref]#PACKAGENAME# in each
[.title-ref]#TAG# that was unblocked in a previous step.
+
____
....
https://pdc.fedoraproject.org/rest_api/v1/component-branch-slas/?branch=TAG&global_component=PACKAGENAME
....
[NOTE]
.Note
====
If no information is returned by this query then it is not in PDC and is
likely not yet a branch. The requestor should use the
[.title-ref]#fedpkg request-branch# utility to ask for a branch.
====
____
. If the package existed within PDC then obtain a token from the PDC
site while logged in by navigating to the
[.title-ref]#https://pdc.fedoraproject.org/rest_api/v1/auth/token/obtain/#
URL with the Firefox web browser.
. Press F12 once the page has loaded and select the tab labeled
[.title-ref]#Network#. Refresh the web page and find the line whose
string in the file column matches
[.title-ref]#/rest_api/v1/auth/token/obtain/#.
. Right-click on the specified line and select Copy>Copy as cURL. Paste
this into a terminal session and add [.title-ref]#-H "Accept:
application/json"#. It should look similar to the below:
+
____
....
$ curl 'https://pdc.fedoraproject.org/rest_api/v1/auth/token/obtain/' \
-H 'Host: pdc.fedoraproject.org' \
-H 'User-Agent: Mozilla/5.0 (PLATFORM; rv:57.0) Gecko/20100101 Firefox/57.0' \
-H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' \
-H 'Accept-Language: en-US,en;q=0.5' \
--compressed \
-H 'Cookie: csrftoken=CSRF_TOKEN_HASH; SERVERID=pdc-web01; mellon-saml-sesion-cookie=SAML_SESSION_HASH; sessionid=SESSION_ID_HASH' \
-H 'Connection: keep-alive' \
-H 'Upgrade-Insecure-Requests: 1' \
-H 'Cache-Control: max-age=0' \
-H "Accept: application/json"
....
____
. Using the token obtained from the previous step run the
[.title-ref]#adjust-eol.py# script from the
https://pagure.io/releng[Release Engineering Repo].
+
____
....
$ PYTHONPATH=scripts/pdc/ scripts/pdc/adjust-eol.py fedora MYTOKEN PACKAGENAME rpm TAG default -y
....
[NOTE]
.Note
====
The local machine will have configuration information in the
[.title-ref]#/etc/pdc.d/# directory. This is why _fedora_ can be passed
as an argument instead of the full API endpoint URL.
====
____
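Since the PDC check has to be repeated for each tag and package, the query URL from the steps above can be generated rather than edited by hand. This is a minimal sketch; `pdc_branch_sla_url` is a hypothetical helper, and it assumes the tag and package names need no URL encoding.

```shell
# Hypothetical helper: build the component-branch-slas query URL for one
# tag/package pair, following the URL shown in the steps above.
# Assumption: neither value contains characters that need URL encoding.
pdc_branch_sla_url() {
    printf 'https://pdc.fedoraproject.org/rest_api/v1/component-branch-slas/?branch=%s&global_component=%s\n' "$1" "$2"
}
```

Calling `pdc_branch_sla_url TAG PACKAGENAME` prints a URL that can be opened in a browser or passed to `curl`.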

View file

@ -0,0 +1,55 @@
== Update Critpath
[NOTE]
.Note
====
Critpath = "Critical Path"
This is a collection of packages deemed "critical" to Fedora
====
=== Description
PDC has information about which packages are critpath and which are not.
A script that reads the yum repodata (critpath group in comps, and the
package dependencies) is used to generate this. Since package
dependencies change, this list should be updated periodically.
=== Action
. Release engineering scripts for updating critpath live in the
https://pagure.io/releng[releng git repository].
. Check the critpath.py script to see if the list of releases needs to
be updated:
+
....
for r in ['12', '13', '14', '15', '16', '17']: # 13, 14, ...
releasepath[r] = 'releases/%s/Everything/$basearch/os/' % r
updatepath[r] = 'updates/%s/$basearch/' % r
# Branched Fedora goes here
branched = '18'
....
+
The for loop has the version numbers for releases that have gone past
final. branched has the release that's been branched from rawhide but
not yet hit final. (These have different paths in the repository and may
not have an updates directory, thus they're in separate sections).
. Run the script with the release to generate info for. For a release
that's hit final, this is the release number (for example, "17"). For
branched, it's "branched".
+
....
./critpath.py --srpm -o critpath.txt branched
....
. Run the update script to add that to PDC:
+
....
./update-critpath --user toshio f18 critpath.txt
....
+
The username is your fas username. You must be in cvsadmin to be able to
change this. The branch is the name of the dist-git branch. critpath.txt
is the file that the output of critpath.py went into. The script needs a
PDC token to talk to the server, configured in /etc/pdc.d/. See the PDC
SOP for more info.

View file

@ -0,0 +1,54 @@
== Update RelEng Rendered Docs
=== Description
When the Release Engineering documentation is improved (following the
`contributing <contributing>` guide) in the
http://sphinx-doc.org/[Sphinx]
https://en.wikipedia.org/wiki/ReStructuredText[reStructured Text] source
found in `docs/source` within the https://pagure.io/releng[RelEng git
repository], someone has to manually run a process to update
the documentation that is hosted in the https://pagure.io/pagure[pagure]
documentation space for https://docs.pagure.org/releng/[Fedora RelEng
docs].
=== Action
In order to render the documentation using
http://sphinx-doc.org/[Sphinx], you need to first be sure to have the
package installed:
....
$ dnf install python-sphinx
....
Then we'll need to clone the RelEng repository and the RelEng docs
repository (the docs git repository is provided by pagure
automatically). There is a script in the [.title-ref]#releng# repository
that takes care of cleanly updating the documentation site for us.
....
$ ./scripts/update-docs.sh
....
The documentation is now live.
[NOTE]
.Note
====
This requires someone with permission to push to the rawhide branch
of the releng repository. If you are curious who has this ability,
please refer to the `Main Page <index>` and contact someone from the
"Team Composition" section.
====
=== Verification
Visit the https://docs.pagure.org/releng/[Fedora RelEng docs] website
and verify that the changes are reflected live on the docs site.
=== Consider Before Running
No considerations at this time. The docs git repository is simply a
static html hosting space and we can just re-render the docs and push to
it again if necessary.

View file

@ -0,0 +1,260 @@
== Fedora Release Engineering Troubleshooting Guide
Fedora Release Engineering consists of many different systems, many
different code bases and multiple tools. Needless to say, things can get
pretty complex in a hurry. This aspect of Fedora Release Engineering is
not very welcoming to newcomers who would like to get involved. This
guide stands as a place to educate those new to the processes, systems,
code bases, and tools. It also is to serve as a reference to those who
aren't new but maybe are fortunate enough to not have needed to diagnose
things in recent memory.
We certainly won't be able to document every single possible component
in the systems that could go wrong but hopefully over time this document
will stand as a proper knowledge base for reference and educational
purposes on the topics listed below.
=== Compose
If something with a compose has gone wrong, there are a number of places
to find information. Each of these is discussed below.
==== releng-cron list
The compose output logs are emailed to the releng-cron mailing list. It
is good practice to check the
https://lists.fedoraproject.org/archives/list/releng-cron@lists.fedoraproject.org/[releng-cron
mailing list archives] and find the latest output and give it a look.
==== compose machines
If the
https://lists.fedoraproject.org/archives/list/releng-cron@lists.fedoraproject.org/[releng-cron
mailing list archives] don't prove to be useful, you can move on to
checking the contents of the composes themselves on the primary compose
machines in the Fedora Infrastructure. At the time of this writing,
there are multiple machines based on the specific compose you are
looking for:
* Two-Week Atomic Nightly Compose
** `compose-x86-01.phx2.fedoraproject.org`
* Branched Compose
** `branched-composer.phx2.fedoraproject.org`
* Rawhide Compose
** `rawhide-composer.phx2.fedoraproject.org`
The full path you will end up inspecting depends on which specific
compose you are looking for:
* For Two Week Atomic you will find the compose output in
`/mnt/fedora_koji/compose/`
* For Release Candidate / Test Candidate composes you will find compose
output in `/mnt/koji/compose/`
[NOTE]
.Note
====
It's possible the mock logs are no longer available. The mock chroots
are rewritten on subsequent compose runs.
====
You can also check for mock logs if they are still persisting from the
compose you are interested in. Find the specific mock chroot directory
name (that will reside in `/var/lib/mock/`) you are looking for by
checking the appropriate compose mock configuration (the following is
only a sample provided at the time of this writing):
....
$ ls /etc/mock/*compose*
/etc/mock/fedora-22-compose-aarch64.cfg /etc/mock/fedora-branched-compose-aarch64.cfg
/etc/mock/fedora-22-compose-armhfp.cfg /etc/mock/fedora-branched-compose-armhfp.cfg
/etc/mock/fedora-22-compose-i386.cfg /etc/mock/fedora-branched-compose-i386.cfg
/etc/mock/fedora-22-compose-x86_64.cfg /etc/mock/fedora-branched-compose-x86_64.cfg
/etc/mock/fedora-23-compose-aarch64.cfg /etc/mock/fedora-rawhide-compose-aarch64.cfg
/etc/mock/fedora-23-compose-armhfp.cfg /etc/mock/fedora-rawhide-compose-armhfp.cfg
/etc/mock/fedora-23-compose-i386.cfg /etc/mock/fedora-rawhide-compose-i386.cfg
/etc/mock/fedora-23-compose-x86_64.cfg /etc/mock/fedora-rawhide-compose-x86_64.cfg
....
==== running the compose yourself
If you happen to strike out there and are still in need of debugging, it
might be time to just go ahead and run the compose yourself. The exact
command needed can be found in the cron jobs located on their respective
compose machines; this information can be found in the
`compose-machines` section. Also note that each compose
command should be run from its respective compose machine, as outlined
in the `compose-machines` section.
An example is below. It sets the compose directory to
`username-debug.1`; the `.1` at the end is common notation for an
incremental run of a compose. If you need to do another, increment it to
`.2` and continue. This is helpful to be able to compare composes.
[NOTE]
.Note
====
It is recommended that the following command be run in
https://www.gnu.org/software/screen/[screen] or
https://tmux.github.io/[tmux]
====
....
$ TMPDIR=`mktemp -d /tmp/twoweek.XXXXXX` && cd $TMPDIR \
&& git clone -n https://pagure.io/releng.git && cd releng && \
git checkout -b stable twoweek-stable && \
LANG=en_US.UTF-8 ./scripts/make-updates 23 updates $USER-debug.1
....
The above command was pulled from the `twoweek-atomic` cron job with
only the final parameter being altered. This is used as the name of the
output directory.
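The `.1`, `.2` naming convention described above can be automated when several debug runs are needed. This is a minimal sketch, assuming the name ends in a dot followed by a plain decimal number without leading zeros; `next_compose_name` is a hypothetical helper.

```shell
# Hypothetical helper: given a compose directory name ending in ".N",
# print the name to use for the next run (".N+1").
# Assumption: N is a plain decimal integer without leading zeros.
next_compose_name() {
    base=${1%.*}    # everything before the last dot
    n=${1##*.}      # the numeric suffix after the last dot
    printf '%s.%s\n' "$base" "$((n + 1))"
}
```

For example, `next_compose_name "$USER-debug.1"` yields the name to pass as the final parameter on the second run.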
The compose can take some time to run, so don't be alarmed if you don't
see output in a while. This should provide you all the information
needed to debug and/or diagnose further. When in doubt, ask in
`#fedora-releng` on `irc.libera.chat`.
=== Docker Layered Image Build Service
The
https://fedoraproject.org/wiki/Changes/Layered_Docker_Image_Build_Service[Docker
Layered Image Build Service] is built using a combination of
technologies such as https://www.openshift.org/[OpenShift],
https://github.com/projectatomic/osbs-client[osbs-client], and the
https://github.com/release-engineering/koji-containerbuild[koji-containerbuild]
plugin that when combined are often referred to as an OpenShift Build
Service instance (OSBS). Something to note is that
https://www.openshift.org/[OpenShift] is a
http://kubernetes.io/[kubernetes] distribution with many features built
on top of http://kubernetes.io/[kubernetes] such as the
https://docs.openshift.org/latest/dev_guide/builds.html[build primitive]
that is used as the basis for the build service. This information will
hopefully shed light on some of the terminology and commands used below.
There are a few "common" scenarios in which a build may fail or hang that
will need some sort of inspection of the build system.
==== Build Appears to stall after being scheduled
In the event that a build scheduled through koji appears to be stalled
and is not in a `free` state (i.e. it has been scheduled), an
administrator will need to ssh into `osbs-master01` or
`osbs-master01.stg` (depending on stage vs production) and inspect the
build itself.
....
$ oc status
In project default on server https://10.5.126.216:8443
svc/kubernetes - 172.30.0.1 ports 443, 53, 53
bc/cockpit-f24 custom build of git://....
build #8 succeeded 7 weeks ago
build #9 failed 33 minutes ago
$ oc describe build/cockpit-f24-9
# lots of output about status of the specific build
$ oc logs build/cockpit-f24-9
# extremely verbose logs, these should in normal circumstances be found in
# the koji build log post build
....
The information found in the commands above will generally identify the
issue.
==== Build fails but there's no log output in the Koji Task
Sometimes there is a communications issue between Koji and OSBS which
causes a failure to be listed in Koji but without all the logs. These
can be diagnosed by checking the `kojid` logs on the Koji builder listed
in the task output.
For example:
....
$ fedpkg container-build
Created task: 90123598
Task info: http://koji.stg.fedoraproject.org/koji/taskinfo?taskID=90123598
Watching tasks (this may be safely interrupted)...
90123598 buildContainer (noarch): free
90123598 buildContainer (noarch): free -> open (buildvm-04.stg.phx2.fedoraproject.org)
90123599 createContainer (x86_64): free
90123599 createContainer (x86_64): free -> open (buildvm-02.stg.phx2.fedoraproject.org)
90123599 createContainer (x86_64): open (buildvm-02.stg.phx2.fedoraproject.org) -> closed
0 free 1 open 1 done 0 failed
90123598 buildContainer (noarch): open (buildvm-04.stg.phx2.fedoraproject.org) -> FAILED: Fault: <Fault 2001: 'Image build failed. OSBS build id: cockpit-f24-9'>
0 free 0 open 1 done 1 failed
90123598 buildContainer (noarch) failed
....
In this example the buildContainer task was scheduled and ran on
`buildvm-04.stg` with the actual createContainer task being on
`buildvm-02.stg`, and `buildvm-02.stg` is where we're going to want to
begin looking for failures to communicate with OSBS as this is the point
of contact with the external system.
Logs can be found in `/var/log/kojid.log` or if necessary, check the
koji hub in question. Generally, you will want to start with the first
point of contact with OSBS and "work your way back" so in the above
example you would first check `buildvm-02.stg`, then move on to
`buildvm-04.stg` if nothing useful was found in the logs of the previous
machine, and again move on to the koji hub if neither of the builder
machines involved provided useful log information.
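The builders to inspect can be read straight out of the task watch output. This is a minimal sketch that assumes the output has the same shape as the example above; `task_builders` is a hypothetical helper.

```shell
# Hypothetical helper: read `fedpkg container-build` watch output on
# stdin and print "<task method> <builder host>" for every task that
# moved from free to open on a builder.
task_builders() {
    sed -n 's/^[[:space:]]*[0-9][0-9]* \([A-Za-z]*\) (.*): free -> open (\(.*\))$/\1 \2/p'
}
```

The host printed for `createContainer` is the first place to check `/var/log/kojid.log`, followed by the `buildContainer` host.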
==== Build fails because it can't get to a network resource
Sometimes there is a situation where the firewall rules get messed up on
one of the OpenShift Nodes in the environment. This can cause output
similar to the following:
....
$ fedpkg container-build --scratch
Created task: 90066343
Task info: http://koji.stg.fedoraproject.org/koji/taskinfo?taskID=90066343
Watching tasks (this may be safely interrupted)...
90066343 buildContainer (noarch): free
90066343 buildContainer (noarch): free -> open (buildvm-03.stg.phx2.fedoraproject.org)
90066344 createContainer (x86_64): open (buildvm-04.stg.phx2.fedoraproject.org)
90066344 createContainer (x86_64): open (buildvm-04.stg.phx2.fedoraproject.org) -> FAILED: Fault: <Fault 2001: "Image build failed. Error in plugin distgit_fetch_artefacts: OSError(2, 'No such file or directory'). OSBS build id: scratch-20161102132628">
0 free 1 open 0 done 1 failed
90066343 buildContainer (noarch): open (buildvm-03.stg.phx2.fedoraproject.org) -> closed
0 free 0 open 1 done 1 failed
....
If we go to the OSBS Master and run the following commands, we will see
the root symptom:
....
# oc logs build/scratch-20161102132628
Error from server: Get https://osbs-node02.stg.phx2.fedoraproject.org:10250/containerLogs/default/scratch-20161102132628-build/custom-build: dial tcp 10.5.126.213:10250: getsockopt: no route to host
# ping 10.5.126.213
PING 10.5.126.213 (10.5.126.213) 56(84) bytes of data.
64 bytes from 10.5.126.213: icmp_seq=1 ttl=64 time=0.299 ms
64 bytes from 10.5.126.213: icmp_seq=2 ttl=64 time=0.299 ms
64 bytes from 10.5.126.213: icmp_seq=3 ttl=64 time=0.253 ms
64 bytes from 10.5.126.213: icmp_seq=4 ttl=64 time=0.233 ms
^C
--- 10.5.126.213 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3073ms
rtt min/avg/max/mdev = 0.233/0.271/0.299/0.028 ms
# http get 10.5.126.213:10250
http: error: ConnectionError: HTTPConnectionPool(host='10.5.126.213', port=10250): Max retries exceeded with url: / (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7fdab064b320>: Failed to establish a new connection: [Errno 113] No route to host',)) while doing GET request to URL: http://10.5.126.213:10250/
....
In the above output, we can see that we do actually have network
connectivity to the Node but we cannot connect to the OpenShift service
that should be listening on port `10250`.
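The affected node can be identified from the `oc logs` error itself. This is a minimal sketch, assuming the error has the same wording as the output above; `unreachable_endpoint` is a hypothetical helper.

```shell
# Hypothetical helper: read an `oc logs` error on stdin and print the
# "ip:port" of the node endpoint that reported "no route to host".
unreachable_endpoint() {
    sed -n 's/.*dial tcp \([0-9.]*:[0-9]*\): .*no route to host.*/\1/p'
}
```

For example, `oc logs build/<build-id> 2>&1 | unreachable_endpoint` prints the address to test with `ping` and an HTTP request as above. It prints nothing when the error is of a different kind.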
To fix this, you need to ssh into the OpenShift Node that you can't
connect to via port `10250` and run the following commands. This should
resolve the issue.
....
iptables -F && iptables -t nat -F && systemctl restart docker && systemctl restart origin-node
....