Added the infra SOPs ported to asciidoc.

This commit is contained in:
Adam Saleh 2021-07-26 10:39:47 +02:00
parent 8a7f111a12
commit a0301e30f1
148 changed files with 18575 additions and 17 deletions

View file

@ -7,11 +7,9 @@ they may be maintained by other people or team).
Services handling identity and providing personal space to our contributors.
Accounts https://accounts.fedoraproject.org/[accounts.fp.o]::
Our directory and identity management tool provides community members with a single account to log in to Fedora
services. Registering an account there is one of the first things to do if you plan to work on Fedora.
Fedora People https://fedorapeople.org/[fedorapeople.org]::
Personal web space provided to community members to share files, git

View file

@ -1 +0,0 @@
* xref:index.adoc[Communishift documentation]

View file

@ -1,10 +0,0 @@
:experimental:
= Communishift documentation
link:https://console-openshift-console.apps.os.fedorainfracloud.org/[Communishift] is the name for the OpenShift community cluster run by the Fedora project.
It's intended to be a place where community members can test/deploy/run things that are of benefit to the community at a lower SLE (Service Level Expectation) than services directly run and supported by infrastructure, additionally doing so in a self service manner.
It's also an incubator for applications that may someday be more fully supported once they prove their worth.
Finally, it's a place for Infrastructure folks to learn and test and discover OpenShift in a less constrained setting than our production clusters.
This documentation focuses on implementation details of Fedora's OpenShift instance, not on OpenShift usage in general.
These instructions are already covered by link:https://docs.openshift.com/container-platform/4.1/welcome/index.html[upstream documentation].

View file

@ -1 +1,144 @@
* xref:index.adoc[Sysadmin Guide]
** xref:2-factor.adoc[Two factor auth]
** xref:accountdeletion.adoc[Account Deletion SOP]
** xref:anitya.adoc[Anitya Infrastructure SOP]
** xref:ansible.adoc[ansible - SOP in review ]
** xref:apps-fp-o.adoc[apps-fp-o - SOP in review ]
** xref:archive-old-fedora.adoc[archive-old-fedora - SOP in review ]
** xref:arm.adoc[arm - SOP in review ]
** xref:askbot.adoc[askbot - SOP in review ]
** xref:aws-access.adoc[aws-access - SOP in review ]
** xref:basset.adoc[basset - SOP in review ]
** xref:bastion-hosts-info.adoc[bastion-hosts-info - SOP in review ]
** xref:bladecenter.adoc[bladecenter - SOP in review ]
** xref:blockerbugs.adoc[blockerbugs - SOP in review ]
** xref:bodhi.adoc[bodhi - SOP in review ]
** xref:bugzilla2fedmsg.adoc[bugzilla2fedmsg - SOP in review ]
** xref:bugzilla.adoc[bugzilla - SOP in review ]
** xref:cloud.adoc[cloud - SOP in review ]
** xref:collectd.adoc[collectd - SOP in review ]
** xref:communishift.adoc[communishift - SOP in review ]
** xref:compose-tracker.adoc[compose-tracker - SOP in review ]
** xref:contenthosting.adoc[contenthosting - SOP in review ]
** xref:copr.adoc[copr - SOP in review ]
** xref:cyclades.adoc[cyclades - SOP in review ]
** xref:darkserver.adoc[darkserver - SOP in review ]
** xref:database.adoc[database - SOP in review ]
** xref:datanommer.adoc[datanommer - SOP in review ]
** xref:debuginfod.adoc[debuginfod - SOP in review ]
** xref:denyhosts.adoc[denyhosts - SOP in review ]
** xref:departing-admin.adoc[departing-admin - SOP in review ]
** xref:dns.adoc[dns - SOP in review ]
** xref:docs.fedoraproject.org.adoc[docs.fedoraproject.org - SOP in review ]
** xref:fas-notes.adoc[fas-notes - SOP in review ]
** xref:fas-openid.adoc[fas-openid - SOP in review ]
** xref:fedmsg-certs.adoc[fedmsg-certs - SOP in review ]
** xref:fedmsg-gateway.adoc[fedmsg-gateway - SOP in review ]
** xref:fedmsg-introduction.adoc[fedmsg-introduction - SOP in review ]
** xref:fedmsg-irc.adoc[fedmsg-irc - SOP in review ]
** xref:fedmsg-new-message-type.adoc[fedmsg-new-message-type - SOP in review ]
** xref:fedmsg-relay.adoc[fedmsg-relay - SOP in review ]
** xref:fedmsg-websocket.adoc[fedmsg-websocket - SOP in review ]
** xref:fedocal.adoc[fedocal - SOP in review ]
** xref:fedorapackages.adoc[fedorapackages - SOP in review ]
** xref:fedorapastebin.adoc[fedorapastebin - SOP in review ]
** xref:fedora-releases.adoc[fedora-releases - SOP in review ]
** xref:fedorawebsites.adoc[fedorawebsites - SOP in review ]
** xref:fmn.adoc[fmn - SOP in review ]
** xref:fpdc.adoc[fpdc - SOP in review ]
** xref:freemedia.adoc[freemedia - SOP in review ]
** xref:freenode-irc-channel.adoc[freenode-irc-channel - SOP in review ]
** xref:freshmaker.adoc[freshmaker - SOP in review ]
** xref:gather-easyfix.adoc[gather-easyfix - SOP in review ]
** xref:gdpr_delete.adoc[gdpr_delete - SOP in review ]
** xref:gdpr_sar.adoc[gdpr_sar - SOP in review ]
** xref:geoip-city-wsgi.adoc[geoip-city-wsgi - SOP in review ]
** xref:github2fedmsg.adoc[github2fedmsg - SOP in review ]
** xref:github.adoc[github - SOP in review ]
** xref:gitweb.adoc[gitweb - SOP in review ]
** xref:greenwave.adoc[greenwave - SOP in review ]
** xref:guestdisk.adoc[guestdisk - SOP in review ]
** xref:guestedit.adoc[guestedit - SOP in review ]
** xref:haproxy.adoc[haproxy - SOP in review ]
** xref:hosted_git_to_svn.adoc[hosted_git_to_svn - SOP in review ]
** xref:hotfix.adoc[hotfix - SOP in review ]
** xref:hotness.adoc[hotness - SOP in review ]
** xref:hubs.adoc[hubs - SOP in review ]
** xref:ibm_rsa_ii.adoc[ibm_rsa_ii - SOP in review ]
** xref:index.adoc[index - SOP in review ]
** xref:infra-git-repo.adoc[infra-git-repo - SOP in review ]
** xref:infra-hostrename.adoc[infra-hostrename - SOP in review ]
** xref:infra-raidmismatch.adoc[infra-raidmismatch - SOP in review ]
** xref:infra-repo.adoc[infra-repo - SOP in review ]
** xref:infra-retiremachine.adoc[infra-retiremachine - SOP in review ]
** xref:infra-yubikey.adoc[infra-yubikey - SOP in review ]
** xref:ipsilon.adoc[ipsilon - SOP in review ]
** xref:iscsi.adoc[iscsi - SOP in review ]
** xref:jenkins-fedmsg.adoc[jenkins-fedmsg - SOP in review ]
** xref:kerneltest-harness.adoc[kerneltest-harness - SOP in review ]
** xref:kickstarts.adoc[kickstarts - SOP in review ]
** xref:koji.adoc[koji - SOP in review ]
** xref:koji-archive.adoc[koji-archive - SOP in review ]
** xref:koji-builder-setup.adoc[koji-builder-setup - SOP in review ]
** xref:koschei.adoc[koschei - SOP in review ]
** xref:layered-image-buildsys.adoc[layered-image-buildsys - SOP in review ]
** xref:librariesio2fedmsg.adoc[librariesio2fedmsg - SOP in review ]
** xref:linktracking.adoc[linktracking - SOP in review ]
** xref:loopabull.adoc[loopabull - SOP in review ]
** xref:mailman.adoc[mailman - SOP in review ]
** xref:making-ssl-certificates.adoc[making-ssl-certificates - SOP in review ]
** xref:massupgrade.adoc[massupgrade - SOP in review ]
** xref:mastermirror.adoc[mastermirror - SOP in review ]
** xref:mbs.adoc[mbs - SOP in review ]
** xref:memcached.adoc[memcached - SOP in review ]
** xref:message-tagging-service.adoc[message-tagging-service - SOP in review ]
** xref:mirrorhiding.adoc[mirrorhiding - SOP in review ]
** xref:mirrormanager.adoc[mirrormanager - SOP in review ]
** xref:mirrormanager-S3-EC2-netblocks.adoc[mirrormanager-S3-EC2-netblocks - SOP in review ]
** xref:mote.adoc[mote - SOP in review ]
** xref:nagios.adoc[nagios - SOP in review ]
** xref:netapp.adoc[netapp - SOP in review ]
** xref:new-hosts.adoc[new-hosts - SOP in review ]
** xref:nonhumanaccounts.adoc[nonhumanaccounts - SOP in review ]
** xref:nuancier.adoc[nuancier - SOP in review ]
** xref:odcs.adoc[odcs - SOP in review ]
** xref:openqa.adoc[openqa - SOP in review ]
** xref:openshift.adoc[openshift - SOP in review ]
** xref:openvpn.adoc[openvpn - SOP in review ]
** xref:orientation.adoc[orientation - SOP in review ]
** xref:outage.adoc[outage - SOP in review ]
** xref:packagedatabase.adoc[packagedatabase - SOP in review ]
** xref:packagereview.adoc[packagereview - SOP in review ]
** xref:pagure.adoc[pagure - SOP in review ]
** xref:pdc.adoc[pdc - SOP in review ]
** xref:pesign-upgrade.adoc[pesign-upgrade - SOP in review ]
** xref:planetsubgroup.adoc[planetsubgroup - SOP in review ]
** xref:privatefedorahosted.adoc[privatefedorahosted - SOP in review ]
** xref:publictest-dev-stg-production.adoc[publictest-dev-stg-production - SOP in review ]
** xref:rabbitmq.adoc[rabbitmq - SOP in review ]
** xref:rdiff-backup.adoc[rdiff-backup - SOP in review ]
** xref:registry.adoc[registry - SOP in review ]
** xref:requestforresources.adoc[requestforresources - SOP in review ]
** xref:resultsdb.adoc[resultsdb - SOP in review ]
** xref:retrace.adoc[retrace - SOP in review ]
** xref:reviewboard.adoc[reviewboard - SOP in review ]
** xref:scmadmin.adoc[scmadmin - SOP in review ]
** xref:selinux.adoc[selinux - SOP in review ]
** xref:sigul-upgrade.adoc[sigul-upgrade - SOP in review ]
** xref:simple_koji_ci.adoc[simple_koji_ci - SOP in review ]
** xref:sshaccess.adoc[sshaccess - SOP in review ]
** xref:sshknownhosts.adoc[sshknownhosts - SOP in review ]
** xref:staging.adoc[staging - SOP in review ]
** xref:status-fedora.adoc[status-fedora - SOP in review ]
** xref:syslog.adoc[syslog - SOP in review ]
** xref:tag2distrepo.adoc[tag2distrepo - SOP in review ]
** xref:torrentrelease.adoc[torrentrelease - SOP in review ]
** xref:unbound.adoc[unbound - SOP in review ]
** xref:virt-image.adoc[virt-image - SOP in review ]
** xref:virtio.adoc[virtio - SOP in review ]
** xref:virt-notes.adoc[virt-notes - SOP in review ]
** xref:voting.adoc[voting - SOP in review ]
** xref:waiverdb.adoc[waiverdb - SOP in review ]
** xref:wcidff.adoc[wcidff - SOP in review ]
** xref:wiki.adoc[wiki - SOP in review ]
** xref:zodbot.adoc[zodbot - SOP in review ]

View file

@ -0,0 +1,98 @@
= Two factor auth
Fedora Infrastructure has implemented a form of two factor auth for
people who have sudo access on Fedora machines. In the future we may
expand this to include more than sudo but this was deemed to be a high
value, low hanging fruit.
== Using two factor
http://fedoraproject.org/wiki/Infrastructure_Two_Factor_Auth
To enroll a Yubikey, use the fedora-burn-yubikey script like normal. To
enroll using FreeOTP or Google Authenticator, go to
https://admin.fedoraproject.org/totpcgiprovision/
=== What's enough authentication?
FAS Password+FreeOTP or FAS Password+Yubikey. Note: don't actually enter
a +, simply enter your FAS password and then press your Yubikey or enter your
FreeOTP code.
== Administrating and troubleshooting two factor
Two factor auth is implemented by a modified copy of the
https://github.com/mricon/totp-cgi project doing the authentication and
pam_url submitting the authentication tokens.
totp-cgi runs on the fas servers (currently fas01.stg and
fas01/fas02/fas03 in production), listening on port 8443 for pam_url
requests.
FreeOTP, Google authenticator and yubikeys are supported as tokens to
use with your password.
=== FreeOTP, Google authenticator:
The FreeOTP application is preferred; however, Google Authenticator works as
well. (Note that Google Authenticator is not open source.)
This is handled via totpcgi. There's a command line tool to manage
users, totpprov. See 'man totpprov' for more info. Admins can use this
tool to revoke lost tokens (google authenticator only) with 'totpprov
delete-user username'
To enroll using FreeOTP or Google Authenticator for production machines,
go to https://admin.fedoraproject.org/totpcgiprovision/
To enroll using FreeOTP or Google Authenticator for staging machines, go
to https://admin.stg.fedoraproject.org/totpcgiprovision/
You'll be prompted to login with your fas username and password.
Note that staging and production differ.
=== YubiKeys:
Yubikeys are enrolled and managed in FAS. Users can self-enroll using
the fedora-burn-yubikey utility included in the fedora-packager package.
=== What do I do if I lose my token?
Send an email to admin@fedoraproject.org that is encrypted/signed with
your GPG key from FAS, or that otherwise identifies that you are you.
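A minimal sketch of preparing such a request with GPG; the filename is only illustrative, and you should sign with the key you have registered in FAS:
....
# write up the request, then clearsign it with your GPG key
gpg --clearsign token-reset-request.txt
# attach or paste the resulting token-reset-request.txt.asc in your mail to admin@fedoraproject.org
....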
=== How to remove a token (so the user can re-enroll)?
First we MUST verify that the user is who they say they are, using any
of the following:
* Personal contact where the person can be verified by member of
sysadmin-main.
* Correct answers to security questions.
* Email request to admin@fedoraproject.org that is gpg encrypted by the
key listed for the user in fas.
Then:
. For google authenticator:
.. ssh into batcave01 as root
.. ssh into os-master01.iad2.fedoraproject.org
.. $ oc project fas
.. $ oc get pods
.. $ oc rsh <pod> (pick one of the totpcgi pods from the above list)
.. $ totpprov delete-user <username>
. For yubikey: login to one of the fas machines and run:
+
....
/usr/local/bin/yubikey-remove.py username
....
The user can then go to
https://admin.fedoraproject.org/totpcgiprovision/ and reprovision a new
device.
If the user emails admin@fedoraproject.org with the signed request, make
sure to reply to all indicating that a reset was performed. This is so
that other admins don't step in and reset it again after it's been reset
once.

View file

@ -0,0 +1,294 @@
= Account Deletion SOP
For the most part we do not delete accounts. In the case that a deletion
is paramount, it will need to be coordinated with appropriate entities.
Disabling accounts is another story but is limited to those with the
appropriate privileges. Reasons for accounts to be disabled can be one
of the following:
* Person has placed SPAM on the wiki or other sites.
* It is seen that the account has been compromised by a third party.
* A person wishes to leave the Fedora Project and wants the account
disabled.
== Contents
* Disabling
** Disable Accounts
** Disable Groups
* User Requested disables
* Renames
** Rename Accounts
** Rename Groups
* Deletion
** Delete Accounts
** Delete Groups
== Disable
Disabling accounts is the easiest to accomplish as it just blocks people
from using their account. It does not remove the account name and
associated UID so we don't have to worry about future, unintentional
collisions.
=== Disable Accounts
To begin with, accounts should not be disabled until there is a ticket
in the Infrastructure ticketing system. After that the contents inside
the ticket need to be verified (to make sure people aren't playing
pranks or someone is in a crappy mood). This needs to be logged in the
ticket (who looked, what they saw, etc.). Then the account can be
disabled:
....
ssh db02
sudo -u postgres psql fas2
fas2=# begin;
fas2=# select * from people where username = 'FOOO';
....
Here you need to verify that the account looks right, that there is only
one match, or other issues. If there are multiple matches you need to
contact one of the main sysadmin-db's on how to proceed.:
....
fas2=# update people set status = 'admin_disabled' where username = 'FOOO';
fas2=# commit;
fas2=# \q
....
=== Disable Groups
There is no explicit way to disable groups in FAS2. Instead, we close
the group for adding new members and optionally remove existing members
from it. This can be done from the web UI if you are an administrator of
the group or you are in the accounts group. First, go to the group info
page. Then click the (edit) link next to Group Details. Make sure that
the Invite Only box is checked. This will prevent other users from
requesting the group on their own.
If you want to remove the existing users, View the Group info, then
click on the View Member List link. Click on All under the Results
heading. Then go through and click on Remove for each member.
Doing this in the database instead can be quicker if you have a lot of
people to remove. Once again, this requires someone in sysadmin-db to do
the work:
....
ssh db02
sudo -u postgres psql fas2
fas2=# begin;
fas2=# update groups set invite_only = true where name = 'FOOO';
fas2=# commit;
fas2=# begin;
fas2=# select p.name, g.name, r.role_status from people as p, person_roles as r, groups as g
where p.id = r.person_id and g.id = r.group_id
and g.name = 'FOOO';
fas2=# -- Make sure that the list of users in the groups looks correct
fas2=# delete from person_roles where person_roles.group_id = (select id from groups where name = 'FOOO');
fas2=# -- number of rows in both of the above should match
fas2=# commit;
fas2=# \q
....
== User Requested Disables
According to our Privacy Policy, a user may request that their personal
information be removed from FAS if they want to disable their account. We can
do this but need to do some extra work beyond simply setting the account
status to disabled.
=== Record User's CLA information
If the user has signed the CLA/FPCA, then they may have contributed
something to Fedora that we'll need to contact them about at a later
date. For that, we need to keep at least the following information:
* Fedora username
* human name
* email address
All of this information should be on the CLA email that is sent out when
a user signs up. We need to verify with spot (Tom Callaway) that he has
that record. If not, we need to get it to him. Something like:
....
select id, username, human_name, email, telephone, facsimile, postal_address from people where username = 'USERNAME';
....
and send it to spot to keep.
=== Remove the personal information
The following sequence of db commands should do it:
....
fas2=# begin;
fas2=# select * from people where username = 'USERNAME';
....
Here you need to verify that the account looks right, that there is only
one match, or other issues. If there are multiple matches you need to
contact one of the main sysadmin-db's on how to proceed.:
....
fas2=# update people set human_name = '', gpg_keyid = null, ssh_key = null, unverified_email = null, comments = null, postal_address = null, telephone = null, facsimile = null, affiliation = null, ircnick = null, status = 'inactive', locale = 'C', timezone = null, latitude = null, longitude = null, country_code = null, email = 'disabled1@fedoraproject.org' where username = 'USERNAME';
....
Make sure only one record was updated:
....
fas2=# select * from people where username = 'USERNAME';
....
Make sure the correct record was updated:
....
fas2=# commit;
....
[NOTE]
.Note
====
The email address is both not null and unique in the database. Due to
this, you need to set it to a new string for every user who requests
deletion like this.
====
== Renames
In general, renames do not require as much work as deletions but they
still require coordination. This is because renames do not change the
UID/GID but some of our applications save information based on
username/groupname rather than UID/GID.
=== Rename Accounts
[WARNING]
.Warning
====
Needs more eyes: this list may not be complete.
====
* Check the databases for koji, pkgdb, and bodhi for occurrences of
the old username and update them to the new username.
* Check fedorapeople.org for home directories and yum repositories under
the old username that would need to be renamed
* Check (or ask the user to check and update) mailing list subscriptions
on fedorahosted.org and lists.fedoraproject.org under the old
username@fedoraproject.org email alias
* Check whether the user has a username@fedoraproject.org bugzilla
account in python-fedora and update that. Also ask the user to update
that in bugzilla.
* If the user is in a sysadmin-* group, check for home directories on
bastion and other infrastructure boxes that are owned by them and need
to be renamed (could also just tell the user to back up any files there
themselves because they're getting a new home directory).
* grep through ansible for occurrences of the username
* Check for entries in trac on fedorahosted.org for the username as an
"Assigned to" or "CC" entry.
* Add other places to check here
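For the "grep through ansible" step above, a minimal sketch run from the working copy on batcave01 (the username is a placeholder):
....
cd /srv/web/infra/ansible
grep -rn 'oldusername' .   # placeholder username; review each hit and update it to the new name
....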
=== Rename Groups
[WARNING]
.Warning
====
Needs more eyes: this list may not be complete.
====
* grep through ansible for occurrences of the group name.
* Check for group-members,group-admins,group-sponsors@fedoraproject.org
email alias presence in any fedorahosted.org or lists.fedoraproject.org
mailing list
* Check for entries in trac on fedorahosted.org for the username as an
"Assigned to" or "CC" entry.
* Add other places to check here
== Deletion
Deletion is the toughest one to audit because it requires that we look
through our systems looking for the UID and GID in addition to looking
for the username and password. The UID and GID are used on things like
filesystem permissions so we have to look there as well. Not catching
these places may lead to security issues should the UID/GID ever be
reused.
[NOTE]
.Note
====
Recommended to rename instead: when it is not strictly necessary to purge all
traces of an account, it's highly recommended to rename the user or group
to something like DELETED_oldusername instead of deleting. This avoids
the problems and additional checking that we have to do below.
====
=== Delete Accounts
[WARNING]
.Warning
====
Needs more eyes: this list may be incomplete. More people need to look
at this and find places that may need to be updated.
====
* Check everything for the #Rename Accounts case.
* Figure out what boxes a user may have had access to in the past. This
means you need to look at all the groups a user may ever have been
approved for (even if they are not approved for those groups now). For
instance, any git*, svn*, bzr*, hg* groups would have granted access to
hosted03 and hosted04. packager would have granted access to
pkgs.fedoraproject.org. Pretty much any group grants access to
fedorapeople.org.
* For those boxes, run a find over the files there to see if the UID
owns any files on the system:
+
....
# find / -uid 100068 -print
....
+
Any files owned by that uid must be reassigned to another user or
removed.
[WARNING]
.Warning
====
What to do about backups? Backups pose a special problem as they may
contain the uid that's being removed. Need to decide how to handle this
====
* Add other places to check here
=== Delete Groups
[WARNING]
.Warning
====
Needs more eyes: this list may be incomplete. More people need to look
at this and find places that may need to be updated.
====
* Check everything for the #Rename Groups case.
* Figure out what boxes may have had files owned by that group. This
means that you'd need to look at the users in that group, what boxes
they have shell accounts on, and then look at those boxes. Groups used
for hosted would also need hosted03 and hosted04 added to that list, plus
the box that serves the hosted mailing lists.
* For those boxes, run a find over the files there to see if the GID
owns any files on the system:
+
....
# find / -gid 100068 -print
....
+
Any files owned by that GID must be reassigned to another group or
removed.
[WARNING]
.Warning
====
What to do about backups? Backups pose a special problem as they may
contain the gid that's being removed. Need to decide how to handle this
====
* Add other places to check here

View file

@ -0,0 +1,210 @@
= Anitya Infrastructure SOP
Anitya is used by Fedora to track upstream project releases and maps
them to downstream distribution packages, including (but not limited to)
Fedora.
Anitya staging instance: https://stg.release-monitoring.org
Anitya production instance: https://release-monitoring.org
Anitya project page: https://github.com/fedora-infra/anitya
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, #fedora-apps
Persons::
zlopez
Location::
iad2.fedoraproject.org
Servers::
Production
+
* os-master01.iad2.fedoraproject.org
+
Staging
+
* os-master01.stg.iad2.fedoraproject.org
Purpose::
Map upstream releases to Fedora packages.
== Hosts
The current deployment is made up of the release-monitoring OpenShift
namespace.
=== release-monitoring
This OpenShift namespace runs the following pods:
* The apache/mod_wsgi application for release-monitoring.org
* A libraries.io SSE client
* A service checking for new releases
This OpenShift project relies on:
* A postgres db server running in OpenShift
* Lots of external third-party services. The anitya webapp can scrape
pypi, rubygems.org, sourceforge and many others on command.
* Lots of external third-party services. The check service makes all
kinds of requests out to the Internet that can fail in various ways.
* Fedora messaging RabbitMQ hub for publishing messages
Things that rely on this host:
* `hotness-sop` is a fedora messaging consumer running in Fedora Infra
in OpenShift. It listens for Anitya messages from here and performs
actions on koji and bugzilla.
== Releasing
The release process is described in
https://anitya.readthedocs.io/en/latest/contributing.html#release-guide[Anitya
documentation].
=== Deploying
Staging deployment of Anitya is deployed in OpenShift on
os-master01.stg.iad2.fedoraproject.org.
To deploy the staging instance of Anitya you need to push changes to the
staging branch on https://github.com/fedora-infra/anitya[Anitya GitHub]. A
GitHub webhook will then automatically deploy a new version of Anitya on
staging.
Production deployment of Anitya is deployed in OpenShift on
os-master01.iad2.fedoraproject.org.
To deploy the production instance of Anitya you need to push changes to the
production branch on https://github.com/fedora-infra/anitya[Anitya
GitHub]. A GitHub webhook will then automatically deploy a new version of
Anitya on production.
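A minimal sketch of such a deployment push; only the staging/production branch names come from this SOP, while the assumption that the release changes come from the master branch is illustrative:
....
git clone https://github.com/fedora-infra/anitya.git && cd anitya
git checkout production                 # or: staging
git merge --ff-only origin/master       # assumption: the release changes live on master
git push origin production              # the GitHub webhook then triggers the OpenShift deployment
....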
==== Configuration
To deploy the new configuration, you need
https://fedora-infra-docs.readthedocs.io/en/latest/sysadmin-guide/sops/sshaccess.html[ssh
access] to batcave01.iad2.fedoraproject.org and
https://fedora-infra-docs.readthedocs.io/en/latest/sysadmin-guide/sops/ansible.html[permissions
to run the Ansible playbook].
All the following commands should be run from batcave01.
First, ensure there are no configuration changes required for the new
update. If there are, update the Ansible anitya role(s) and optionally
run the playbook:
....
$ sudo rbac-playbook openshift-apps/release-monitoring.yml
....
The configuration changes could be limited to staging only using:
....
$ sudo rbac-playbook openshift-apps/release-monitoring.yml -l staging
....
This is recommended for testing new configuration changes.
==== Upgrading
===== Staging
To deploy a new version of Anitya to staging you need to push changes to the
staging branch on https://github.com/fedora-infra/anitya[Anitya GitHub]. A
GitHub webhook will then automatically deploy the new version of Anitya on
staging.
===== Production
To deploy a new version of Anitya to production you need to push changes to
the production branch on https://github.com/fedora-infra/anitya[Anitya
GitHub]. A GitHub webhook will then automatically deploy the new version of
Anitya on production.
Congratulations! The new version should now be deployed.
== Administrating release-monitoring.org
Anitya web application offers some functionality to administer itself.
User admin status is tracked in Anitya database. Admin users can grant
or revoke admin privileges for users in the
https://release-monitoring.org/users[users tab].
Admin users have additional functionality available in the web interface. In
particular, admins can view flagged projects, remove projects, remove
package mappings, etc.
For more information see
https://anitya.readthedocs.io/en/stable/admin-user-guide.html[Admin user
guide] in Anitya documentation.
=== Flags
Anitya lets users flag projects for administrator attention. This is
accessible to administrators in the
https://release-monitoring.org/flags[flags tab].
== Monitoring
To monitor the activity of Anitya you can connect to Fedora infra
OpenShift and look at the state of pods.
For staging look at the [.title-ref]#release-monitoring# namespace in
https://os.stg.fedoraproject.org/console/project/release-monitoring/overview[staging
OpenShift instance].
For production look at the [.title-ref]#release-monitoring# namespace in
https://os.fedoraproject.org/console/project/release-monitoring/overview[production
OpenShift instance].
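If you prefer the command line, a minimal sketch with the oc client (assuming you are already logged in to the relevant cluster):
....
oc project release-monitoring   # switch to the namespace
oc get pods                     # check the state of the pods
oc logs <pod-name>              # inspect the output of a specific pod
....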
== Troubleshooting
This section contains various issues encountered during deployment or
configuration changes and possible solutions.
=== Fedmsg messages aren't sent
*Issue:* Fedmsg messages aren't sent.
*Solution:* Set the USER environment variable in the pod.
*Explanation:* Fedmsg uses the USER env variable as the username inside
messages. Without USER set, it just crashes and doesn't send anything.
=== Cronjob is crashing
*Issue:* Cronjob pod is crashing on start, even after configuration
change that should fix the behavior.
*Solution:* Restart the cronjob. This could be done by OPS.
*Explanation:* Every time the cronjob is executed after a crash, it tries
to reuse the pod with the bad configuration instead of
creating a new one with the new configuration.
=== Database migration is taking too long
*Issue:* Database migration is taking a few hours to complete.
*Solution:* Stop every pod and cronjob before migration.
*Explanation:* When creating new index or doing some other complex
operation on database, the migration script needs exclusive access to
the database.
=== Old version is deployed instead of the new one
*Issue:* The pod is deployed with an old version of Anitya, but it says
that it was triggered by the correct commit.
*Solution:* Set [.title-ref]#dockerStrategy# in buildconfig.yml to
noCache.
*Explanation:* OpenShift caches the layers of docker containers by default,
so if there is no change in the Dockerfile it will just use the
cached version and not run the commands again.

View file

@ -0,0 +1,252 @@
= Ansible infrastructure SOP/Information.
== Background
Fedora infrastructure used to use func and puppet for system change
management. We are now using ansible for all system change management and
ad-hoc tasks.
== Overview
Ansible runs from batcave01 or backup01. These hosts run a ssh-agent
that has unlocked the ansible root ssh private key. (This is unlocked
manually by a human with the passphrase each reboot, the passphrase
itself is not stored anywhere on the machines). Using 'sudo -i',
sysadmin-main members can use this agent to access any machines with the
ansible root ssh public key setup, either with 'ansible' for one-off
commands or 'ansible-playbook' to run playbooks.
Playbooks are idempotent (or should be), meaning you should be able to
re-run the same playbook over and over and it should reach a state
where 0 items are changing.
Additionally (see below) there is a rbac wrapper that allows members of
some other groups to run playbooks against specific hosts.
=== GIT repositories
There are 2 git repositories associated with Ansible:
* The Fedora Infrastructure Ansible repository and replicas.
+
[CAUTION]
.Caution
====
This is a public repository. Never commit private data to this repo.
====
+
image:ansible-repositories.png[image]
+
This repository exists as several copies or replicas:
** The "upstream" repository on Pagure.
+
https://pagure.io/fedora-infra/ansible
+
This repository is the public facing place where people can contribute
(e.g. pull requests) as well as the authoritative source. Members of the
`sysadmin` FAS group or the `fedora-infra` Pagure group have commit
access to this repository.
+
To contribute changes, fork the repository on Pagure and submit a Pull
Request. Someone from the aforementioned groups can then review and
merge them.
+
It is recommended that you configure git to use `pull --rebase` by
default by running `git config --bool pull.rebase true` in your ansible
clone directory. This configuration prevents unneeded merges which can
occur if someone else pushes changes to the remote repository while you
are working on your own local changes.
** Two bare mirrors on [.title-ref]#batcave01#, `/srv/git/ansible.git`
and `/srv/git/mirrors/ansible.git`
+
[CAUTION]
.Caution
====
These are public repositories. Never commit private data to these
repositories. Don't commit or push to these repos directly, unless
Pagure is unavailable.
====
+
The `mirror_pagure_ansible` service on [.title-ref]#batcave01# receives
bus messages about changes in the repository on Pagure, fetches these
into `/srv/git/mirrors/ansible.git` and pushes from there to
`/srv/git/ansible.git`. When this happens, various actions are triggered
via git hooks:
*** The working copy at `/srv/web/infra/ansible` is updated.
*** A mail about the changes is sent to [.title-ref]#sysadmin-members#.
*** The changes are announced on the message bus, which in turn triggers
announcements on IRC.
+
You can check out the repo locally on [.title-ref]#batcave01# with:
+
....
git clone /srv/git/ansible.git
....
+
If the Ansible repository on Pagure is unavailable, members of the
[.title-ref]#sysadmin# group may commit directly, provided this
procedure is followed:
[arabic]
. The synchronization service is stopped and disabled:
+
....
sudo systemctl disable --now mirror_pagure_ansible.service
....
. Changes are applied to the repository on [.title-ref]#batcave01#.
. After Pagure is available again, the changes are pushed to the
repository there.
. The synchronization service is enabled and started:
+
....
sudo systemctl enable --now mirror_pagure_ansible.service
....
** `/srv/web/infra/ansible` on [.title-ref]#batcave01#, the working copy
from which playbooks are run.
+
[CAUTION]
.Caution
====
This is a public repository. Never commit private data to this repo.
Don't commit or push to this repo directly, unless Pagure is
unavailable.
====
+
You can also access it via the web interface at:
https://pagure.io/fedora-infra/ansible/
* `/srv/git/ansible-private` on [.title-ref]#batcave01#.
+
[CAUTION]
.Caution
====
This is a private repository for passwords and other sensitive data. It
is not available in cgit, nor should it be cloned or copied remotely.
====
+
This repository is only accessible to members of 'sysadmin-main'.
=== Cron job/scheduled runs
With the use of run_ansible-playbook_cron.py, which is run daily via cron, we
walk through the playbooks and run them with [.title-ref]#--check --diff#
params to perform a dry run.
This way we make sure all the playbooks are idempotent and there are no
unexpected changes on servers (or in playbooks).
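The same kind of dry run can be done by hand from the working copy; a minimal sketch, assuming a hypothetical group playbook:
....
cd /srv/web/infra/ansible
ansible-playbook playbooks/groups/proxies.yml --check --diff   # hypothetical playbook path
....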
=== Logging
We have in place a callback plugin that stores history for any
ansible-playbook runs and then sends a report each day to
sysadmin-logs-members with any CHANGED or FAILED actions. Additionally,
there's a fedmsg plugin that reports start and end of ansible playbook
runs to the fedmsg bus. Ansible also logs to syslog verbose reporting of
when and what commands and playbooks were run.
=== role based access control for playbooks
There's a wrapper script on batcave01 called 'rbac-playbook' that allows
non sysadmin-main members to run specific playbooks against specific
groups of hosts. This is part of the ansible_utils package. The upstream
for ansible_utils is: https://bitbucket.org/tflink/ansible_utils
To add a new group:
[arabic]
. add the playbook name and sysadmin group to the rbac-playbook
(ansible-private repo)
. add that sysadmin group to sudoers on batcave01 (also in
ansible-private repo)
To use the wrapper:
....
sudo rbac-playbook playbook.yml
....
== Directory setup
=== Inventory
The inventory directory tells ansible all the hosts that are managed by
it and the groups they are in. All files in this dir are concatenated
together, so you can split out groups/hosts into separate files for
readability. They are in ini file format.
Additionally under the inventory directory are host_vars and group_vars
subdirectories. These are files named for the host or group and
containing variables to set for that host or group. You should strive to
set variables at the highest level possible; precedence is in
global, group, host order.
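A minimal sketch of how the pieces fit together; every hostname, group name and value below is hypothetical, not taken from the real inventory:
....
# inventory/builders  (ini-format hosts and groups)
[buildvm]
buildvm01.example.fedoraproject.org
buildvm02.example.fedoraproject.org
# inventory/group_vars/buildvm  (variables for the whole group, YAML)
mem_size: 4096
# inventory/host_vars/buildvm01.example.fedoraproject.org  (host value overrides the group value)
mem_size: 8192
....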
=== Vars
This directory contains global variables as well as OS specific
variables. Note that in order to use the OS specific ones you must have
'gather_facts' as 'True' or ansible will not have the facts it needs to
determine the OS.
=== Roles
Roles are a collection of tasks/files/templates that can be used on any
host or group of hosts that all share that role. In other words, roles
should be used except in cases where configuration only applies to a
single host. Roles can be reused between hosts and groups and are more
portable/flexible than tasks or specific plays.
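For reference, a typical role layout looks roughly like this (a generic Ansible sketch, not a listing of any particular role in this repository):
....
roles/myrole/                 # hypothetical role name
    defaults/main.yml         # default variable values
    files/                    # static files copied to hosts
    handlers/main.yml         # handlers triggered via 'notify' (service restarts etc.)
    tasks/main.yml            # the tasks the role performs
    templates/                # jinja2 templates
    vars/main.yml             # role variables
....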
=== Scripts
In the ansible git repo under scripts are a number of utility scripts for
sysadmins.
=== Playbooks
In the ansible git repo there's a directory for playbooks. The top level
contains utility playbooks for sysadmins. These playbooks perform
one-off functions or gather information. Under this directory are hosts
and groups playbooks. These playbooks are for specific hosts and groups
of hosts, from provision to fully configured. You should only use a host
playbook in cases where there will never be more than one of that thing.
=== Tasks
This directory contains one-off tasks that are used in playbooks. Some
of these should be migrated to roles (we had this setup before roles
existed in ansible). Those that are truly only used on one host/group
could stay as isolated tasks.
=== Syntax
Ansible now warns about deprecated syntax. Please fix any cases you see
related to deprecation warnings.
Templates use the jinja2 syntax.
== Libvirt virtuals
* TODO: add steps to make new libvirt virtuals in staging and production
* TODO: merge in new-hosts.txt
== Cloud Instances
* TODO: add how to make new cloud instances
* TODO: merge in from ansible README file.
== rdiff-backups
see:
https://fedora-infra-docs.readthedocs.io/en/latest/sysadmin-guide/sops/rdiff-backup.html
== Additional Reading/Resources
Upstream docs:::
https://docs.ansible.com/
Example repo with all kinds of examples:::
* https://github.com/ansible/ansible-examples
* https://gist.github.com/marktheunissen/2979474
Jinja2 docs:::
http://jinja.pocoo.org/docs/

View file

@ -0,0 +1,31 @@
= apps-fp-o SOP
Updating and maintaining the landing page at
https://apps.fedoraproject.org/
== Contact Information
Owner:::
Fedora Infrastructure Team
Contact:::
#fedora-apps, #fedora-admin
Servers:::
proxy0*
Purpose:::
Have a nice landing page for all our webapps.
== Description
We have a number of webapps, many of which our users don't know about.
This page was created so there was a central place where users could
stumble through them and learn.
The page is generated by an ansible role in ansible/roles/apps-fp-o/. It
makes use of an RPM package, the source code for which is at
https://github.com/fedora-infra/apps.fp.o
You can update the page by updating the apps.yaml file in that ansible
role.
When ansible is run next, the two ansible handlers should see your
changes and regenerate the static html and json data for the page.
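For orientation only, an entry in apps.yaml might look something like the following; the field names and values here are hypothetical, and the real schema is defined by the apps-fp-o role and the apps.fp.o package:
....
- name: Example App                              # hypothetical entry
  url: https://example.fedoraproject.org
  description: One line summary shown on the landing page.
  icon: example.png
....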

View file

@ -0,0 +1,60 @@
= How to Archive Old Fedora Releases
The Fedora download servers contain terabytes of data, and to allow for
mirrors to not have to take all of that data, infrastructure regularly
moves data of end of lifed releases (from /pub/fedora/linux) to the
archives section (/pub/archive/fedora/linux)
== Steps Involved
[arabic]
. log into batcave01.phx2.fedoraproject.org and ssh to bodhi-backend01
+
....
$ sudo -i ssh root@bodhi-backend01.iad2.fedoraproject.org
# su - ftpsync
....
. Then change into the releases directory.
+
$ cd /pub/fedora/linux/releases
. Check to see that the target directory doesn't already exist.
+
$ ls /pub/archive/fedora/linux/releases/
. If the target directory does not already exist, do a recursive link
copy of the tree you want to the target
+
$ cp -lvpnr 21 /pub/archive/fedora/linux/releases/21
. If the target directory already exists, then we need to do a recursive
rsync to update any changes in the trees since the previous copy.
+
$ rsync -avAXSHP --delete ./21/ /pub/archive/fedora/linux/releases/21/
. We now do the updates and updates/testing in similar ways.
+
....
$ cd ../updates/
$ cp -lpnr 21 /pub/archive/fedora/linux/updates/21
$ cd testing
$ cp -lpnr 21 /pub/archive/fedora/linux/updates/testing/21
....
+
Alternatively, if this is a later refresh of an older copy:
+
....
$ cd ../updates/
$ rsync -avAXSHP 21/ /pub/archive/fedora/linux/updates/21/
$ cd testing
$ rsync -avAXSHP 21/ /pub/archive/fedora/linux/updates/testing/21/
....
[arabic, start=7]
. Do the same with fedora-secondary.
. Announce to the mirror list this has been done and that in 2 weeks you
will move the old trees to archives.
. In two weeks, log into mm-backend01 and run the archive script
+
....
sudo -u mirrormanager mm2_move-to-archive --originalCategory="Fedora Linux" --archiveCategory="Fedora Archive" --directoryRe='/21/Everything'
....
. If there are problems, the postgres DB may have issues and so you need
to get a DBA to update the backend to fix items.
. Wait an hour or so then you can remove the files from the main tree.
+
....
ssh bodhi-backend01
cd /pub/fedora/linux
cd releases/21
ls                 # make sure you have stuff here
rm -rf *
ln ../20/README .
cd ../../updates/21
ls                 # make sure you have stuff here
rm -rf *
ln ../20/README .
cd ../testing/21
ls                 # make sure you have stuff here
rm -rf *
ln ../20/README .
....
This should complete the archiving.

View file

@ -0,0 +1,205 @@
= Fedora ARM Infrastructure
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main, sysadmin-releng
Location::
Phoenix
Servers::
arm01, arm02, arm03, arm04
Purpose::
Information on working with the arm SOCs
== Description
We have 4 arm chassis in phx2, each containing 24 SOCs (System On Chip).
Each chassis has 2 physical network connections going out from it. The
first one is used for the management interface on each SOC. The second
one is used for eth0 for each SOC.
Current allocations (2016-03-11):
arm01::
primary builders attached to koji.fedoraproject.org
arm02::
primary arch builders attached to koji.fedoraproject.org
arm03::
In cloud network, public qa/packager and copr instances
arm04::
primary arch builders attached to koji.fedoraproject.org
== Hardware Configuration
Each SOC has:
* eth0 and eth1 (unused) and a management interface.
* 4 cores
* 4GB ram
* a 300GB disk
SOCs are addressed by:
....
arm{chassisnumber}-builder{number}.arm.fedoraproject.org
....
Where chassisnumber is 01 to 04 and number is 00-23
== PXE installs
Kickstarts for the machines are in the kickstarts repo.
PXE config is on noc01. (or cloud-noc01.cloud.fedoraproject.org for
arm03)
The kickstart installs the latest Fedora and sets them up with a base
package set.
== IPMI tool Management
The SOCs are managed via their mgmt interfaces using a custom ipmitool
as well as a custom python script called 'cxmanage'. The ipmitool
changes have been submitted upstream and cxmanage is under review in
Fedora.
The ipmitool is currently installed on noc01 and has the ability to talk
to the SOCs on their management interfaces. noc01 also serves dhcp and is a
pxeboot server for the SOCs.
However you will need to add it to your path:
....
export PATH=$PATH:/opt/calxeda/bin/
....
Some common commands:
To set the SOC to boot the next time only with pxe:
....
ipmitool -U admin -P thepassword -H arm03-builder11-mgmt.arm.fedoraproject.org chassis bootdev pxe
....
To set the SOC power off:
....
ipmitool -U admin -P thepassword -H arm03-builder11-mgmt.arm.fedoraproject.org power off
....
To set the SOC power on:
....
ipmitool -U admin -P thepassword -H arm03-builder11-mgmt.arm.fedoraproject.org power on
....
To get a serial over lan console from the SOC:
....
ipmitool -U admin -P thepassword -H arm03-builder11-mgmt.arm.fedoraproject.org -I lanplus sol activate
....
== DISK mapping
Each SOC has a disk. They are however mapped to the internal 00-23 in a
non direct manner:
....
HDD Bay EnergyCard SOC (Port 1) SOC Num
0 0 3 03
1 0 0 00
2 0 1 01
3 0 2 02
4 1 3 07
5 1 0 04
6 1 1 05
7 1 2 06
8 2 3 11
9 2 0 08
10 2 1 09
11 2 2 10
12 3 3 15
13 3 0 12
14 3 1 13
15 3 2 14
16 4 3 19
17 4 0 16
18 4 1 17
19 4 2 18
20 5 3 23
21 5 0 20
22 5 1 21
23 5 2 22
....
Looking at the system from the front, the bay numbering starts from left
to right.
== cxmanage
The cxmanage tool can be used to update firmware or gather diag info.
Until cxmanage is packaged, you can use it from a python virtualenv:
....
virtualenv --system-site-packages cxmanage
cd cxmanage
source bin/activate
pip install --extra-index-url=http://sources.calxeda.com/python/packages/ cxmanage
<use cxmanage>
deactivate
....
Some cxmanage commands
....
cxmanage sensor arm03-builder00-mgmt.arm.fedoraproject.org
Getting sensor readings...
1 successes | 0 errors | 0 nodes left | .
MP Temp 0
arm03-builder00-mgmt.arm.fedoraproject.org: 34.00 degrees C
Minimum : 34.00 degrees C
Maximum : 34.00 degrees C
Average : 34.00 degrees C
... (and about 20 more sensors)...
....
....
cxmanage info arm03-builder00-mgmt.arm.fedoraproject.org
Getting info...
1 successes | 0 errors | 0 nodes left | .
[ Info from arm03-builder00-mgmt.arm.fedoraproject.org ]
Hardware version : EnergyCard X04
Firmware version : ECX-1000-v2.1.5
ECME version : v0.10.2
CDB version : v0.10.2
Stage2boot version : v1.1.3
Bootlog version : v0.10.2
A9boot version : v2012.10.16-3-g66a3bf3
Uboot version : v2013.01-rc1_cx_2013.01.17
Ubootenv version : v2013.01-rc1_cx_2013.01.17
DTB version : v3.7-4114-g34da2e2
....
firmware update:
....
cxmanage --internal-tftp 10.5.126.41:6969 --all-nodes fwupdate package ECX-1000_update-v2.1.5.tar.gz arm03-builder00-mgmt.arm.fedoraproject.org
....
(Note that this runs against the 00 management interface for the chassis
and updates all the nodes, and that we must run a tftp server on port
6969 for firewall handling.)
== Links
http://sources.calxeda.com/python/packages/cxmanage/
== Contacts
help.desk@boston.co.uk is the contact to send repair requests to.

View file

@ -0,0 +1,359 @@
= Ask Fedora SOP
Ask Fedora, at https://ask.fedoraproject.org, is an Askbot-based question
and answer support forum for the Fedora community. The production instance
is at https://ask.fedoraproject.org and the staging instance
is at http://ask.stg.fedoraproject.org/
This page describes how to set up and customize it from scratch.
== Contents
[arabic]
. Contact Information
. Creating database
. Setting up the forum
. Adding administrators
. Change settings within the forum
. Database tweaks
. Debugging
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Persons::
anyone from the sysadmin team
Sponsor::
nirik
Location::
phx2
Servers::
ask01 , ask01.stg
Purpose::
To host Ask Fedora
== Creating database
We use the postgresql database backend. To add the database to a
postgresql server:
....
# psql -U postgres
postgres# create user askfedora with password 'xxx';
postgres# create database askfedora;
postgres# ALTER DATABASE askfedora owner to askfedora;
postgres# \q
....
Now set up the db tables if this is a new install:
....
python manage.py syncdb
python manage.py migrate askbot
python manage.py migrate django_authopenid #embedded login application
....
== Setting up the forum
Askbot is packaged and available in Rawhide, Fedora 16 and EPEL 6. On a
RHEL 6 system, you need to install the EPEL 6 repo first:
....
# yum install askbot
....
The /etc/askbot/sites/ask/conf/settings.py file should look something
like:
....
DATABASE_ENGINE = 'postgresql_psycopg2'
DATABASE_NAME = 'testaskbot'
DATABASE_USER = 'askbot'
DATABASE_PASSWORD = 'xxxxx'
DATABASE_HOST = '127.0.0.1'
DATABASE_PORT = '5432'
# Outgoing mail server settings
#
DEFAULT_FROM_EMAIL = 'askfedora@fedoraproject.org'
EMAIL_SUBJECT_PREFIX = '[Askfedora]'
EMAIL_HOST='127.0.0.1'
EMAIL_PORT='25'
# This variable points to the Askbot plugin which will be used for user
# authentication. Not enabled yet because we don't need FAS auth but use
# Fedora id as a openid provider.
#
# ASKBOT_CUSTOM_AUTH_MODULE = 'authfas'
....
Now the Ask Fedora website should be accessible from the browser.
== Adding administrators
As of Askbot version 0.7.21, the first user who logs in automatically
becomes the administrator. In previous versions, you have to do the
following:
....
# cd /etc/askbot/sites/ask/conf/
# python manage.py add_admin 1
Do you really wish to make user (id=1, name=pjp) a site administrator?
yes/no: yes
....
Once a user is marked as an administrator, they can go into anyone's
profile, go to the "moderation" tab at the end, and mark them as
administrator or moderator, as well as block or suspend a user.
== Change settings within the forum
Data entry and display::
* Disable "Allow asking questions anonymously"
* Enable "Force lowercase the tags"
* Change "Format of tag list" to "cloud"
* Change "Minimum length of search term for Ajax search" to "3"
* Change "Number of questions to list by default" to "50"
* Change "What should "unanswered question" mean?" to "Question has no answers"
Email and email alert settings::
* Change "Default news notification frequency" to "Instantly"
Flatpages - about, privacy policy, etc.::
Change "Text of the Q&A forum About page (html format)" to the following:
+
....
Ask Fedora provides a community edited knowledge base and support forum
for the Fedora community. Make sure you read the FAQ and search for
existing questions before asking yours. If you want to provide feedback,
just ask a question in this site! Tag your questions "meta" to highlight your
questions to the administrators of Ask Fedora.
....
Login provider settings::
* Disable "Activate local login"
Q&A forum website parameters and urls::
* Change "Site title for the Q&A forum" to "Ask Fedora: Community Knowledge Base and Support Forum"
* Change "Comma separated list of Q&A site keywords" to "Ask Fedora, forum, community, support, help"
* Change "Copyright message to show in the footer" to "All content is under Creative Commons Attribution Share Alike License. Ask Fedora is community maintained and Red Hat or Fedora Project is not responsible for content"
* Change "Site description for the search engines" to "Ask Fedora: Community Knowledge Base and Support Forum"
* Change "Short name for your Q&A forum" to "Ask Fedora"
* Change "Base URL for your Q&A forum, must start with http or https" to "http://ask.fedoraproject.org"
Sidebar widget settings - main page::
* Disable "Show avatar block in sidebar"
* Disable "Show tag selector in sidebar"
Skin and User Interface settings::
* Upload "Q&A site logo"
* Upload "Site favicon". Must be an ICO format file because that is the only one IE supports as a favicon.
* Enable "Apply custom style sheet (CSS)"
* Upload the following custom CSS:
+
....
#ab-main-nav a {
color: #333333;
background-color: #d8dfeb;
border: 1px solid #888888;
border-bottom: none;
padding: 0px 12px 3px 12px;
height: 25px;
line-height: 30px;
margin-right: 10px;
font-size: 18px;
font-weight: 100;
text-decoration: none;
display: block;
float: left;
}
#ab-main-nav a.on {
height: 24px;
line-height: 28px;
border-bottom: 1px solid #0a57a4;
border-right: 1px solid #0a57a4;
border-top: 1px solid #0a57a4;
border-left: 1px solid #0a57a4; /*background:#A31E39; */
background: #0a57a4;
color: #FFF;
font-weight: 800;
text-decoration: none
}
#ab-main-nav a.special {
font-size: 18px;
color: #072b61;
font-weight: bold;
text-decoration: none;
}
/* tabs stuff */
.tabsA { float: right; }
.tabsC { float: left; }
.tabsA a.on, .tabsC a.on, .tabsA a:hover, .tabsC a:hover {
background: #fff;
color: #072b61;
border-top: 1px solid #babdb6;
border-left: 1px solid #babdb6;
border-right: 1px solid #888a85;
border-bottom: 1px solid #888a85;
height: 24px;
line-height: 26px;
margin-top: 3px;
}
.tabsA a.rev.on, tabsA a.rev.on:hover {
padding: 0px 2px 0px 7px;
}
.tabsA a, .tabsC a{
background: #f9f7eb;
border-top: 1px solid #eeeeec;
border-left: 1px solid #eeeeec;
border-right: 1px solid #a9aca5;
border-bottom: 1px solid #888a85;
color: #888a85;
display: block;
float: left;
height: 20px;
line-height: 22px;
margin: 5px 0 0 4px;
padding: 0 7px;
text-decoration: none;
}
.tabsA .label, .tabsC .label {
float: left;
font-weight: bold;
color: #777;
margin: 8px 0 0 0px;
}
.tabsB a {
background: #eee;
border: 1px solid #eee;
color: #777;
display: block;
float: left;
height: 22px;
line-height: 28px;
margin: 5px 0px 0 4px;
padding: 0 11px 0 11px;
text-decoration: none;
}
a {
color: #072b61;
text-decoration: none;
cursor: pointer;
}
div.side-box
{
width:200px;
padding:10px;
border:3px solid #CCCCCC;
margin:0px;
background: -moz-linear-gradient(top, #DDDDDD, #FFFFFF);
}
....
== Database tweaks
To automatically delete expired sessions, we run a trigger that makes
PostgreSQL delete them upon inserting a new one.
The code used to create this trigger was:
....
askfedora=# CREATE FUNCTION delete_old_sessions() RETURNS trigger
askfedora-# LANGUAGE plpgsql
askfedora-# AS $$
askfedora$# BEGIN
askfedora$# DELETE FROM django_session WHERE expire_date<current_timestamp;
askfedora$# RETURN NEW;
askfedora$# END
askfedora$# $$;
CREATE FUNCTION
askfedora=# CREATE TRIGGER old_sessions_gc
askfedora-# AFTER INSERT ON django_session
askfedora-# EXECUTE PROCEDURE delete_old_sessions();
....
In case this trigger causes any problems, please remove it by running:
`DROP TRIGGER old_sessions_gc ON django_session;`
To make this perform well, we have a custom index that's not in upstream
askbot; please remember to add that when recreating the trigger:
....
CREATE INDEX CONCURRENTLY django_session_expire_date ON django_session (expire_date);
....
If you deleted the trigger, or reinstalled without the trigger, please make
sure to run `manage.py clean_sessions` regularly, so you don't end up
with a database that grows too large.
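A minimal sketch of scheduling that cleanup via cron, assuming the settings directory used earlier in this SOP (the file name and schedule are only examples):
....
# /etc/cron.d/askbot-clean-sessions  (hypothetical file)
0 4 * * * root cd /etc/askbot/sites/ask/conf/ && python manage.py clean_sessions
....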
== Debugging
Set DEBUG to True in the settings.py file and restart Apache.
== Auth issues
Users can log in to Ask with a variety of social media accounts. Once
they log in with one, they can attach other ones as well.
If a user forgets what social media they used, you can look in the
database:
Log in to the database host (db01.phx2.fedoraproject.org):
....
# sudo -u postgres psql askfedora
psql> select * from django_authopenid_userassociation where user_id like '%username%';
....
If they can log in again with the same auth, ask them to do so. If not,
you can add the Fedora Account System OpenID auth to allow them to log in
with that:
....
psql> insert into django_authopenid_userassociation (user_id, openid_url, provider_name)
      VALUES (2595, 'http://name.id.fedoraproject.org', 'fedoraproject');
....
Use the ID from the previous query and replace name with the user's FAS
name.

View file

@ -0,0 +1,152 @@
= Amazon Web Services Access
AWS includes a highly granular set of access policies, which can be
combined into roles and groups. Ipsilon is used to translate between IAM
policy groupings and groups in the Fedora Account System (FAS). Tags and
namespaces are used to keep role resources separate.
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Persons::
nirik, pfrields
Location::
?
Servers::
N/A
Purpose::
Provide AWS resource access to contributors via FAS group membership.
== Accessing the AWS Console
To access the AWS Console via Ipsilon authentication, use
https://id.fedoraproject.org/saml2/SSO/Redirect?SPIdentifier=urn:amazon:webservices&RelayState=https://console.aws.amazon.com[this
SAML link].
You must be in the
https://admin.fedoraproject.org/accounts/group/view/aws-iam[aws-iam FAS
group] (or another group with access) to perform this action.
=== Adding a role to AWS IAM
Sign into AWS via the URL above, and visit
https://console.aws.amazon.com/iam/home[Identity and Access Management
(IAM)] in the Security, Identity and Compliance tools.
Choose Roles to view current roles. Confirm there is not already a role
matching the one you need. If not, create a new role as follows:
[arabic]
. Select _Create role_.
. Select _SAML 2.0 federation_.
. Choose the SAML provider _id.fedoraproject.org_, which should already
be populated as a choice from previous use.
. Select the attribute _SAML:aud_. For value, enter
_https://signin.aws.amazon.com/saml_. Do not add a condition. Proceed to
the next step.
. Assign the appropriate policies from the pre-existing IAM policies.
It's unlikely you'll have to create your own, which is outside the scope
of this SOP. Then proceed to the next step.
. Set the role name and description. It is recommended you use the
_same_ role name as the FAS group for clarity. Fill in a longer
description to clarify the purpose of the role. Then choose _Create
role_.
Note or copy the Role ARN (Amazon Resource Name) for the new role.
You'll need this in the mapping below.
=== Adding a group to FAS
When finished, login to FAS and create a group to correspond to the new
role. Use the prefix _aws-_ to denote new AWS roles in FAS. This makes
them easier to locate in a search.
It may be appropriate to set group ownership for _aws-_ groups to an
Infrastructure team principal, and then add others as users or sponsors.
This is especially worth considering for groups that have modify (full)
access to an AWS resource.
=== Adding an IAM role mapping in Ipsilon
Add the new role mapping for FAS group to Role ARN in the ansible git
repo, under _roles/ipsilon/files/infofas.py_. Current mappings look like
this:
....
aws_groups = {
'aws-master': 'arn:aws:iam::125523088429:role/aws-master',
'aws-iam': 'arn:aws:iam::125523088429:role/aws-iam',
'aws-billing': 'arn:aws:iam::125523088429:role/aws-billing',
'aws-atomic': 'arn:aws:iam::125523088429:role/aws-atomic',
'aws-s3-readonly': 'arn:aws:iam::125523088429:role/aws-s3-readonly'
}
....
Add your mapping to the dictionary as shown. Start a new build/rollout
of the ipsilon project in openshift to make the changes live.
=== User accounts
If you only need to use the web interface to aws, a role (and associated
policy) should be all you need. However, if you need cli access, you
will need a user and a token. Users should be named the same as the role
they are associated with.
=== Role and User policies
Each Role (and user if there is a user needed for the role) should have
the same policy attached to it. Policies are named
'fedora-$rolename-$service' ie, 'fedora-infra-ec2'. A copy of polices is
available in the ansible repo under files/aws/iam/policies. These are in
json form.
Policies are set up such that roles/users can do most things with a
resource if it is untagged. If it is tagged, it MUST be tagged with
their group: FedoraGroup / $groupname. If it is tagged with another
group's name, they cannot do anything with or to that resource (aside
from seeing that it exists).
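For illustration only (this fragment is not copied from the ansible
repo; the action list and group value are made-up examples), a
tag-scoped statement in such a policy might look roughly like this:

....
{
    "Effect": "Allow",
    "Action": ["ec2:StartInstances", "ec2:StopInstances"],
    "Resource": "*",
    "Condition": {
        "StringEquals": { "aws:ResourceTag/FedoraGroup": "infra" }
    }
}
....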
If there's a permission you need, please file a ticket and it will be
evaluated.
Users MUST keep tokens private and secure. YOU are responsible for all
use of tokens issued to you from Fedora Infrastructure. Report any
compromised or possibly public tokens as soon as you are aware.
Users MUST tag resources with their FedoraGroup tag within one day, or
the resource may be removed.
=== ec2
Users/roles with ec2 permissions should always tag their instances with
their FedoraGroup as soon as possible. Untagged resources can be
terminated at any time.
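For example, tagging an instance right after launch with the aws CLI
might look like this (the instance ID and group value are placeholders):

....
aws ec2 create-tags --resources i-0123456789abcdef0 --tags Key=FedoraGroup,Value=infra
....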
=== s3
Users/roles with s3 permissions will be given specific bucket(s) that
they can manage/use. Care should be taken to make sure nothing in them
is public that should not be.
=== cloudfront
Please file a ticket if you need cloudfront and infrastructure will do
any needed setup if approved.
== Regions
Users/groups are encouraged to use regions 'near' them or wherever makes
the most sense. If you are trying to create ec2 instances you will need
infrastructure to create a vpc in the region with network, etc. File a
ticket for such requests.
== Other Notes
AWS resource access that is not read-only should be treated with care.
In some cases, Amazon or other entities may absorb AWS costs, so changes
in usage can cause issues if not controlled or monitored. If you have
doubts about access, consult the Fedora Project Leader or Fedora
Engineering Manager.
View file

@ -0,0 +1,118 @@
= Basset anti-spam service
Since the Fedora Project has come under targeted spam attacks, we have
decided to create a service that all our applications can hook into to
have a central repository for anti-spam procedures. Basset is this
service, and it's hosted on https://pagure.io/basset.
== Contents
[arabic]
. Contact Information
. Overview
. FAS
. Trac
. Wiki
. Setup
. Outage
== Contact Information
Owner::
Patrick Uiterwijk (puiterwijk)
Contact::
#fedora-admin, #fedora-apps, #fedora-noc, sysadmin-main
Location::
basset01
Purpose::
Centralized anti-spam
== Overview
Basset is a central anti-spam service: it receives messages from
services when certain actions happen, and will then decide to accept
or deny the request, or pass it on to an administrator.
At the moment, we have the following modules live: FAS, trac, wiki.
== FAS
This module receives notifications from FAS about new users
registrations and new users signing the FPCA. With Basset enabled, FAS
will not automatically accept a new user registration or a FPCA signing,
but instead let Basset know a user tried to perform these actions and
then depend on Basset to enact this.
In the case of registration this is done by setting the user to a
spamcheck_awaiting status. As soon as Basset has made a decision, it will
set the user to spamcheck_manual, spamcheck_denied or active. If it sets
the user to active, it will also send the welcome email to the user. If
it made a wrong decision, and the user is set as spamcheck_manual or
spamcheck_denied, a member of the accounts team can go to that user's
page and click the "Enable" button to override the decision. If this
needed to be done, please notify puiterwijk so that the rules Basset
uses can be updated.
For the case of the FPCA, FAS will request the cla_fpca group
membership, but not sponsor the user. At the moment that Basset decides
it accepts the request, it will sponsor the user into the group. If it
declined the FPCA request, it will remove the user from the group. To
override this decision, a member of the accounts group can go to FAS and
manually add the user to the cla_fpca group and sponsor them into it.
== Trac
For Trac, if a post gets denied, the content item gets deleted, the Trac
account gets blocked cross-instance and the FAS account gets blocked.
To unblock the user, log in to hosted03, and remove
/srv/web/trac/blocks/$username. For info on how to unblock the FAS user,
see the notes under FAS.
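For example (the fully qualified hostname is an assumption here):

....
$ ssh hosted03.fedoraproject.org
$ sudo rm /srv/web/trac/blocks/$username
....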
== Wiki
For Wiki, if an edit gets denied, the page gets deleted, the wiki
account gets blocked and the FAS account gets blocked.

For the wiki parts of undoing this, follow the regular mediawiki unblock
procedures using:
* https://fedoraproject.org/wiki/Special:BlockList to check if an user
is blocked or not
* https://fedoraproject.org/wiki/Special:Unblock to unblock that user
Don't forget to unblock the account as in FAS.
== Setup
At this moment, Basset runs on a single server (basset01(.stg)), which
runs the frontend, message broker and worker all together. For all of it
to work, the following services are used:

* httpd (frontend)
* rabbitmq-server (broker)
* mongod (mongo database server for storage of internal info)
* basset-worker (worker)
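To quickly check whether all of these are running (assuming they are
installed as systemd units with these names), you could run:

....
$ systemctl status httpd rabbitmq-server mongod basset-worker
....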
== Outage
The consequences of certain services not being up vary:

If the httpd or frontend aren't up, no new messages will come in. FAS
will set the user to spamcheck_awaiting, but not submit it to Basset.
Work is in progress on a script to submit such entries to the queue
after the Basset frontend is back. However, since this part of the code
is so small, this is not likely to be the part that's down. (You can
tell this is the case because the FAS logs will show an error instead of
"result: checking".)
If the worker or the mongo server are down, no messages will be
processed, but all messages queued up will be processed the moment both
of the services start again: as long as a message makes it into the
queue, it will be processed until completion.
If the worker encounters an error during processing of a message, it
will dump a traceback into the journal log, and stop processing any
messages. Resolve the condition reported in the error and restart the
basset-worker service, and all work will be continued, starting with the
message it was processing when it errored out.
This means that as long as the message is queued, the worker will pick
it up and handle it.
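A minimal recovery sequence on basset01 might look like this (assuming
the worker runs as a systemd unit named basset-worker, as listed in the
Setup section):

....
# inspect the traceback from the failed message
$ journalctl -u basset-worker -e
# after fixing the reported condition, restart the worker;
# it resumes with the message that errored out
$ systemctl restart basset-worker
....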
View file

@ -0,0 +1,31 @@
= Fedora Bastion Hosts
== Description
There are 2 primary bastion hosts in the phx2 datacenter. One will be
active at any given time and the second will be a hot spare, ready to
take over. Switching between bastion hosts is currently a manual process
that requires changes in ansible.
There is also a bastion-comm01 bastion host for the qa.fedoraproject.org
network. This is used in cases where users only need to access
resources in the qa.fedoraproject.org network.
All of the bastion hosts have an external IP that is mapped into them.
The reverse dns for these IPs is controlled by RHIT, so any changes must
be carefully coordinated.
The active bastion host performs the following functions:
* Outgoing smtp from fedora servers. This includes email aliases,
mailing list posts, build and commit notices, etc.
* Incoming smtp from servers in phx2 or on the fedora vpn. Incoming mail
directly from the outside is NOT accepted or forwarded.
* ssh access to all phx2/vpn connected servers.
* openvpn hub. This is the hub that all vpn clients connect to and talk
to each other via. Taking down or stopping this service will be a major
outage of services as all proxy and app servers use the vpn to talk to
each other.
When rebuilding these machines, care must be taken to match up the dns
names externally, and to preserve the ssh host keys.
View file

@ -0,0 +1,52 @@
= BladeCenter Access Infrastructure SOP
Many of the builders in PHX are blades in a blade center. A few other
machines are also on blades.
== Contents
[arabic]
. Contact Information
. Common Tasks
____
[arabic]
. Logging into the web interface
. Using the Serial Console of Blades
____
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main
Location::
PHX
Purpose::
Contains blades used for buildsystems, etc
== Common Tasks
=== Logging into the web interface
The web interface to the bladecenters lets you reset power, etc. They
are bc01-mgmt and bc02-mgmt.
=== Using the Serial Console of Blades
All of the blades are set up with a serial console over lan (SOL). To
use this, ssh into the bladecenter. You can then pick your system and
bring up a console with:
....
env -T system:blade[x]
console -o
....
where x is the blade number (can be determined from web interface, etc)
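Putting it together, a session for blade 3 on bc01 might look like this
(the blade number is just an example):

....
$ ssh bc01-mgmt
env -T system:blade[3]
console -o
....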
To leave the console session, press `Esc` followed by `(`.
For more details on BladeCenter SOL, see
http://www-304.ibm.com/systems/support/supportsite.wss/docdisplay?brandind=5000008&lndocid=MIGR-54666
View file

@ -0,0 +1,157 @@
= Blockerbugs Infrastructure SOP
https://pagure.io/fedora-qa/blockerbugs[Blockerbugs] is an app developed
by Fedora QA to aid in tracking items related to release blocking and
freeze exception bugs in branched Fedora releases.
== Contents
[arabic]
. Contact Information
. File Locations
. Upgrade Process
* Upgrade Preparation (for all upgrades)
* Minor Upgrade (no db change)
* Major Upgrade (with db changes)
== Contact Information
Owner::
Fedora QA Devel
Contact::
#fedora-qa
Location::
Phoenix
Servers::
blockerbugs01.phx2, blockerbugs02.phx2, blockerbugs01.stg.phx2
Purpose::
Hosting the https://pagure.io/fedora-qa/blockerbugs[blocker bug
tracking application] for QA
== File Locations
`/etc/blockerbugs/settings.py` - configuration for the app
=== Node Roles
blockerbugs01.stg.phx2::
the staging instance, it is not load balanced
blockerbugs01.phx2::
one of the load balanced production nodes, it is responsible for
running bugzilla/bodhi/koji sync
blockerbugs02.phx2::
the other load balanced production node. It does not do any sync
operations
== Building for Infra
=== Do not use mock
For whatever reason, the `epel7-infra` koji tag rejects SRPMs with the
`el7.centos` dist tag. Make sure that you build SRPMs with:
....
rpmbuild -bs --define='dist .el7' blockerbugs.spec
....
Also note that this expects the release tarball to be in
`~/rpmbuild/SOURCES/`.
=== Building with Koji
You'll need to ask someone who has rights to build into `epel7-infra`
tag to make the build for you:
....
koji build epel7-infra blockerbugs-0.4.4.11-1.el7.src.rpm
....
[NOTE]
.Note
====
The fun bit of this is that `python-flask` is only available on `x86_64`
builders. If your build is routed to one of the non-x86_64, it will
fail. The only solution available to us is to keep submitting the build
until it's routed to one of the x86_64 builders and doesn't fail.
====
Once the build is complete, it should be automatically tagged into
`epel7-infra-stg` (after a ~15 min delay), so that you can test it on
blockerbugs staging instance. Once you've verified it's working well,
ask someone with infra rights to move it to `epel7-infra` tag so that
you can update it in production.
== Upgrading
Blockerbugs is currently configured through ansible and all
configuration changes need to be done through ansible.
=== Upgrade Preparation (all upgrades)
Blockerbugs is not packaged in epel, so the new build needs to exist in
the infrastructure stg repo for deployment to stg or the infrastructure
repo for deployments to production.
See the blockerbugs documentation for instructions on building a
blockerbugs RPM.
=== Minor Upgrades (no database changes)
Run the following on *both* `blockerbugs01.phx2` and
`blockerbugs02.phx2` if updating in production.
[arabic]
. Update ansible with config changes, push changes to the ansible repo:
+
....
roles/blockerbugs/templates/blockerbugs-settings.py.j2
....
. Clear yum cache and update the blockerbugs RPM:
+
....
yum clean expire-cache && yum update blockerbugs
....
. Restart httpd to reload the application:
+
....
service httpd restart
....
=== Major Upgrades (with database changes)
Run the following on *both* `blockerbugs01.phx2` and
`blockerbugs02.phx2` if updating in production.
[arabic]
. Update ansible with config changes, push changes to the ansible repo:
+
....
roles/blockerbugs/templates/blockerbugs-settings.py.j2
....
. Stop httpd on *all* relevant instances (if load balanced):
+
....
service httpd stop
....
. Clear yum cache and update the blockerbugs RPM on all relevant
instances:
+
....
yum clean expire-cache && yum update blockerbugs
....
. Upgrade the database schema:
+
....
blockerbugs upgrade_db
....
. Check the upgrade by running a manual sync to make sure that nothing
unexpected went wrong:
+
....
blockerbugs sync
....
. Start httpd back up:
+
....
service httpd start
....
View file

@ -0,0 +1,429 @@
= Bodhi Infrastructure SOP
Bodhi is used by Fedora developers to submit potential package updates
for releases and to manage buildroot overrides. From here, bodhi handles
all of the dirty work, from sending around emails, dealing with Koji, to
composing the repositories.
Bodhi production instance: https://bodhi.fedoraproject.org

Bodhi project page: https://github.com/fedora-infra/bodhi
== Contents
[arabic]
. Contact Information
. Adding a new pending release
. 0-day Release Actions
. Configuring all bodhi nodes
. Pushing updates
. Monitoring the bodhi composer output
. Resuming a failed push
. Performing a production bodhi upgrade
. Syncing the production database to staging
. Release EOL
. Adding notices to the front page or new update form
. Using the Bodhi Shell to modify updates by hand
. Using the Bodhi shell to fix uniqueness problems with e-mail addresses
. Troubleshooting and Resolution
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Persons::
bowlofeggs
Location::
Phoenix
Servers::
* bodhi-backend01.phx2.fedoraproject.org (composer)
* os.fedoraproject.org (web front end and backend task workers for
non-compose tasks)
* bodhi-backend01.stg.phx2.fedoraproject.org (staging composer)
* os.stg.fedoraproject.org (staging web front end and backend task
workers for non-compose tasks)
Purpose::
Push package updates, and handle new submissions.
== Adding a new pending release
Adding and modifying releases is done using the
[.title-ref]#bodhi-manage-releases# tool.
You can add a new pending release by running this command:
....
bodhi-manage-releases create --name F23 --long-name "Fedora 23" --id-prefix FEDORA --version 23 --branch f23 --dist-tag f23 --stable-tag f23-updates --testing-tag f23-updates-testing --candidate-tag f23-updates-candidate --pending-stable-tag f23-updates-pending --pending-testing-tag f23-updates-testing-pending --override-tag f23-override --state pending
....
== Pre-Beta Bodhi config
Enable the pre_beta policy in the bodhi config in ansible, in
`ansible/roles/bodhi2/base/templates/production.ini.j2`.
Uncomment or add the following lines:
....
#f29.status = pre_beta
#f29.pre_beta.mandatory_days_in_testing = 3
#f29.pre_beta.critpath.min_karma = 1
#f29.pre_beta.critpath.stable_after_days_without_negative_karma = 14
....
== Post-Beta Bodhi config
Enable the post_beta policy in the bodhi config in ansible, in
`ansible/roles/bodhi2/base/templates/production.ini.j2`.
Comment or remove the following lines corresponding to pre_beta policy:
....
#f29.status = pre_beta
#f29.pre_beta.mandatory_days_in_testing = 3
#f29.pre_beta.critpath.min_karma = 1
#f29.pre_beta.critpath.stable_after_days_without_negative_karma = 14
....
Uncomment or add the following lines for post_beta policy
....
#f29.status = post_beta
#f29.post_beta.mandatory_days_in_testing = 7
#f29.post_beta.critpath.min_karma = 2
#f29.post_beta.critpath.stable_after_days_without_negative_karma = 14
....
== 0-day Release Actions
* update atomic config
* run the ansible playbook
Going from pending to a proper release in bodhi requires a few steps:
Change state from pending to current:
....
bodhi-manage-releases edit --name F23 --state current
....
You may also need to disable any pre-beta or post-beta policy defined
in the bodhi config in ansible:
....
ansible/roles/bodhi2/base/templates/production.ini.j2
....
Comment out or remove the lines related to the pre- and post-beta policy:
....
#f29.status = post_beta
#f29.post_beta.mandatory_days_in_testing = 7
#f29.post_beta.critpath.min_karma = 2
#f29.post_beta.critpath.stable_after_days_without_negative_karma = 14
#f29.status = pre_beta
#f29.pre_beta.mandatory_days_in_testing = 3
#f29.pre_beta.critpath.min_karma = 1
#f29.pre_beta.critpath.stable_after_days_without_negative_karma = 14
....
== Configuring all bodhi nodes
Run this command from the [.title-ref]#ansible# checkout to configure
all of bodhi in production:
....
# This will configure the backends
$ sudo rbac-playbook playbooks/groups/bodhi2.yml
# This will configure the frontend
$ sudo rbac-playbook openshift-apps/bodhi.yml
....
== Pushing updates
SSH into the [.title-ref]#bodhi-backend01# machine and run:
....
$ sudo -u apache bodhi-push
....
You can restrict the updates by release and/or request:
....
$ sudo -u apache bodhi-push --releases f23,f22 --request stable
....
You can also push specific builds:
....
$ sudo -u apache bodhi-push --builds openssl-1.0.1k-14.fc22,openssl-1.0.1k-14.fc23
....
This will display a list of updates that are ready to be pushed.
== Monitoring the bodhi composer output
You can monitor the bodhi composer via the `bodhi` CLI tool, or via the
systemd journal on `bodhi-backend01`:
....
# From the comfort of your own laptop.
$ bodhi composes list
# From bodhi-backend01
$ journalctl -f -u fedmsg-hub
....
== Resuming a failed push
If a push fails for some reason, you can easily resume it on
`bodhi-backend01` by running:
....
$ sudo -u apache bodhi-push --resume
....
== Performing a bodhi upgrade
=== Build Bodhi
Bodhi is deployed from the infrastructure Koji repositories. At the time
of this writing, it is deployed from the `f29-infra` and `f29-infra-stg`
(for staging) repositories. Bodhi is built for these repositories from
the `master` branch of the
https://src.fedoraproject.org/rpms/bodhi[bodhi dist-git repository].
As an example, to build a Bodhi beta for the `f29-infra-stg` repository,
you can use this command:
....
$ rpmbuild --define "dist .fc29.infra" -bs bodhi.spec
Wrote: /home/bowlofeggs/rpmbuild/SRPMS/bodhi-3.13.0-0.0.beta.e0ca5bc.fc29.infra.src.rpm
$ koji build f29-infra /home/bowlofeggs/rpmbuild/SRPMS/bodhi-3.13.0-0.0.beta.e0ca5bc.fc29.infra.src.rpm
....
When building a Bodhi release that is intended for production, we should
build from the production dist-git repo instead of uploading an SRPM:
....
$ koji build f29-infra git+https://src.fedoraproject.org/rpms/bodhi.git#d64f40408876ec85663ec52888c4e44d92614b37
....
All builds against the `f29-infra` build target will go into the
`f29-infra-stg` repository. If you wish to promote a build from staging
to production, you can do something like this command:
....
$ koji move-build f29-infra-stg f29-infra bodhi-3.13.0-1.fc29.infra
....
=== Staging
The upgrade playbook will apply configuration changes after running the
alembic upgrade. Sometimes you may need changes applied to the Bodhi
systems in order to get the upgrade playbook to succeed. If you are in
this situation, you can apply those changes by running the bodhi-backend
playbook:
....
sudo rbac-playbook -l staging groups/bodhi-backend.yml
....
In the
https://pagure.io/fedora-infra/ansible/blob/main/f/inventory/group_vars/os_masters_stg[os_masters inventory],
edit the `bodhi_version` variable, setting it to the version you wish to
deploy to staging. For example, to deploy `bodhi-3.13.0-1.fc29.infra` to
staging, I would set that variable like this:
....
bodhi_version: "bodhi-3.13.0-1.fc29.infra"
....
Run these commands:
....
# Synchronize the database from production to staging
$ sudo rbac-playbook manual/staging-sync/bodhi.yml -l staging
# Upgrade the Bodhi backend on staging
$ sudo rbac-playbook manual/upgrade/bodhi.yml -l staging
# Upgrade the Bodhi frontend on staging
$ sudo rbac-playbook openshift-apps/bodhi.yml -l staging
....
=== Production
The upgrade playbook will apply configuration changes after running the
alembic upgrade. Sometimes you may need changes applied to the Bodhi
systems in order to get the upgrade playbook to succeed. If you are in
this situation, you can apply those changes by running the bodhi-backend
playbook:
....
sudo rbac-playbook groups/bodhi-backend.yml -l bodhi-backend
....
In the
https://pagure.io/fedora-infra/ansible/blob/main/f/inventory/group_vars/os_masters[os_masters inventory],
edit the `bodhi_version` variable, setting it to the version you wish to
deploy to production. For example, to deploy `bodhi-3.13.0-1.fc29.infra`
to production, I would set that variable like this:
....
bodhi_version: "bodhi-3.13.0-1.fc29.infra"
....
To update the bodhi RPMs in production:
....
# Update the backend VMs (this will also run the migrations, if any)
$ sudo rbac-playbook manual/upgrade/bodhi.yml -l bodhi-backend
# Update the frontend
$ sudo rbac-playbook openshift-apps/bodhi.yml
....
== Syncing the production database to staging
This can be useful for testing issues with production data in staging:
....
$ sudo rbac-playbook manual/staging-sync/bodhi.yml -l staging
....
== Release EOL
....
bodhi-manage-releases edit --name F21 --state archived
....
== Adding notices to the front page or new update form
You can easily add notification messages to the front page of bodhi
using the [.title-ref]#frontpage_notice# option in
[.title-ref]#ansible/roles/bodhi2/base/templates/production.ini.j2#. If
you want to flash a message on the New Update Form, you can use the
[.title-ref]#newupdate_notice# variable instead. This can be useful for
announcing things like service outages, etc.
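For illustration, such notices in production.ini.j2 might look like this
(the option names come from above; the message text is made up):

....
frontpage_notice = Bodhi will be unavailable on 2019-06-01 from 10:00 to 12:00 UTC for database maintenance.
newupdate_notice = Composes are currently delayed; new updates may take longer than usual to push.
....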
== Using the Bodhi Shell to modify updates by hand
The "bodhi shell" is a Python shell with the SQLAlchemy session and
transaction manager initialized. It can be run from any
production/staging backend instance and allows you to modify any models
by hand.
....
sudo pshell /etc/bodhi/production.ini
# Execute a script that sets up the `db` and provides a `delete_update` function.
# This will eventually be shipped in the bodhi package, but can also be found here.
# https://raw.githubusercontent.com/fedora-infra/bodhi/develop/tools/shelldb.py
>>> execfile('shelldb.py')
....
At this point you have access to a [.title-ref]#db# SQLAlchemy Session
instance, a [.title-ref]#t# [.title-ref]#transaction# module, and
[.title-ref]#m# for the [.title-ref]#bodhi.models#.
....
# Fetch an update, and tweak it as necessary.
>>> up = m.Update.get(u'FEDORA-2016-4d226a5f7e', db)
# Commit the transaction
>>> t.commit()
....
Here is an example of merging two updates together and deleting the
original.
....
>>> up = m.Update.get(u'FEDORA-2016-4d226a5f7e', db)
>>> up.builds
[<Build {'epoch': 0, 'nvr': u'resteasy-3.0.17-2.fc24'}>, <Build {'epoch': 0, 'nvr': u'pki-core-10.3.5-1.fc24'}>]
>>> b = up.builds[0]
>>> up2 = m.Update.get(u'FEDORA-2016-5f63a874ca', db)
>>> up2.builds
[<Build {'epoch': 0, 'nvr': u'resteasy-3.0.17-3.fc24'}>]
>>> up.builds.remove(b)
>>> up.builds.append(up2.builds[0])
>>> delete_update(up2)
>>> t.commit()
....
== Using the Bodhi shell to fix uniqueness problems with e-mail addresses
Bodhi currently enforces uniqueness on user e-mail addresses. There is
https://github.com/fedora-infra/bodhi/issues/2387[an issue] filed to
drop this upstream, but for the time being the constraint is enforced.
This can be a problem for users who have more than one FAS account if
they make one account use an e-mail address that was previously used by
another account, if that other account has not logged into Bodhi since
it was changed to use a different address. One way the user can fix this
themselves is to log in to Bodhi with the old account so that Bodhi
learns about its new address. However, an admin can also fix this by
hand by using the Bodhi shell.
For example, suppose a user has created `user_1` and `user_2`. Suppose
that `user_1` used to use `email_a@example.com` but has been changed to
use `email_b@example.com` in FAS, and `user_2` is now configured to use
`email_a@example.com` in FAS. If `user_2` attempts to log in to Bodhi,
it will cause a uniqueness violation since Bodhi does not know that
`user_1` has changed to `email_b@example.com`. The user can simply log
in as `user_1` to fix this, which will cause Bodhi to update its e-mail
address to `email_b@example.com`. Or an admin can fix it with a shell on
one of the Bodhi backend servers like this:
....
[bowlofeggs@bodhi-backend02 ~][PROD]$ sudo -u apache pshell /etc/bodhi/production.ini
2018-05-29 20:21:36,366 INFO [bodhi][MainThread] Using python-bugzilla
2018-05-29 20:21:36,367 DEBUG [bodhi][MainThread] Using Koji Buildsystem
2018-05-29 20:21:42,559 INFO [bodhi.server][MainThread] Bodhi ready and at your service!
Python 2.7.14 (default, Mar 14 2018, 13:36:31)
[GCC 7.3.1 20180303 (Red Hat 7.3.1-5)] on linux2
Type "help" for more information.
Environment:
app The WSGI application.
registry Active Pyramid registry.
request Active request object.
root Root of the default resource tree.
root_factory Default root factory used to create `root`.
Custom Variables:
m bodhi.server.models
>>> u = m.User.query.filter_by(name=u'user_1').one()
>>> u.email = u'email_b@example.com'
>>> m.Session().commit()
....
== Troubleshooting and Resolution
=== Atomic OSTree compose failure
If the Atomic OSTree compose fails with some sort of [.title-ref]#Device
or Resource busy# error, then run [.title-ref]#mount# to see if there
are any stray [.title-ref]#tmpfs# mounts still active:
....
tmpfs on /var/lib/mock/fedora-22-updates-testing-x86_64/root/var/tmp/rpm-ostree.bylgUq type tmpfs (rw,relatime,seclabel,mode=755)
....
You can then [.title-ref]#umount
/var/lib/mock/fedora-22-updates-testing-x86_64/root/var/tmp/rpm-ostree.bylgUq#
and resume the push again.
=== nfs repodata cache IOError
Sometimes you may hit an IOError during the updateinfo.xml generation
process from createrepo_c:
....
IOError: Cannot open /mnt/koji/mash/updates/epel7-160228.1356/../epel7.repocache/repodata/repomd.xml: File /mnt/koji/mash/updates/epel7-160228.1356/../epel7.repocache/repodata/repomd.xml doesn't exists or not a regular file
....
This issue will be resolved with NFSv4, but in the mean time it can be
worked around by removing the [.title-ref]#.repocache# directory and
resuming the push:
....
rm -fr /mnt/koji/mash/updates/epel7.repocache
....
View file

@ -0,0 +1,122 @@
= Bugzilla Sync Infrastructure SOP
We do not run bugzilla.redhat.com. If bugzilla itself is down we need to
get in touch with Red Hat IT or one of the bugzilla hackers (for
instance, Dave Lawrence (dkl)) in order to fix it.
Infrastructure has some scripts that perform administrative functions on
bugzilla.redhat.com. These scripts sync information from FAS and the
Package Database into bugzilla.
== Contents
[arabic]
. Contact Information
. Description
. Troubleshooting and Resolution
____
[arabic]
. Errors while syncing bugzilla with the PackageDB
____
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Persons::
abadger1999
Location::
Phoenix, Denver (Tummy), Red Hat Infrastructure
Servers::
(fas1, app5) => Need to migrate these to bapp1, bugzilla.redhat.com
Purpose::
Sync Fedora information to bugzilla.redhat.com
== Description
At present there are two scripts that sync information from Fedora into
bugzilla.
=== export-bugzilla.py
`export-bugzilla.py` is the first script. It is responsible for syncing
Fedora Accounts into bugzilla. It adds Fedora packagers and bug triagers
into a bugzilla group that gives the users extra permissions within
bugzilla. This script is run off of a cron job on FAS1. The source code
resides in the FAS git repo in `fas/scripts/export-bugzilla.*` however
the code we run on the servers presently lives in ansible:
....
roles/fas_server/files/export-bugzilla
....
=== pkgdb-sync-bugzilla
The other script is pkgdb-sync-bugzilla. It is responsible for syncing
the package owners and cclists to bugzilla from the pkgdb. The script
runs off a cron job on app5. The source code is in the packagedb bzr
repo at
`packagedb/fedora-packagedb-stable/server-scripts/pkgdb-sync-bugzilla.*`.
Just like FAS, a separate copy is presently installed from ansible to
`/usr/local/bin/pkgdb-sync-bugzilla` but that should change ASAP as the
present fedora-packagedb package installs
`/usr/bin/pkgdb-sync-bugzilla`.
== Troubleshooting and Resolution
=== Errors while syncing bugzilla with the PackageDB
One frequent problem is that people will sign up to watch a package in
the packagedb but their email address in FAS isn't a bugzilla email
address. When this happens the scripts that try to sync the packagedb
information to bugzilla encounter an error and send an email like this:
....
Subject: Errors while syncing bugzilla with the PackageDB
The following errors were encountered while updating bugzilla with information
from the Package Database. Please have the problems taken care of:
({'product': u'Fedora', 'component': u'aircrack-ng', 'initialowner': u'baz@zardoz.org',
'initialcclist': [u'foo@bar.org', u'baz@zardoz.org']}, 504, 'The name foo@bar.org is not a
valid username. \n Either you misspelled it, or the person has not\n registered for a
Red Hat Bugzilla account.')
....
When this happens we attempt to contact the person with the problematic
mail address and get them to change it. Here's a boilerplate message:
....
To: foo@bar.org
Subject: Fedora Account System Email vs Bugzilla Email
Hello,
You are signed up to receive bug reports against the aircrack-ng package
in Fedora. Unfortunately, the email address we have for you in the
Fedora Account System is not a valid bugzilla email address. That means
that bugzilla won't send you mail and we're getting errors in the script
that syncs the cclist into bugzilla.
There's a few ways to resolve this:
1) Create a new bugzilla account with the email foo@bar.org as
an account at https://bugzilla.redhat.com.
2) Change an existing account on https://bugzilla.redhat.com to use the
foo@bar.org email address.
3) Change your email address in https://admin.fedoraproject.org/accounts
to use an email address that matches with an existing bugzilla email
address.
Please let me know what you want to do!
Thank you,
....
If the user does not reply someone in the cvsadmin group needs to go
into the pkgdb and remove the user from the cclist for the package.
View file

@ -0,0 +1,71 @@
= bugzilla2fedmsg SOP
Receive events from bugzilla over the RH "unified messagebus" and
rebroadcast them over our own fedmsg bus.
== Contact Information
Owner::
Messaging SIG, Fedora Infrastructure Team
Contact::
#fedora-apps, #fedora-fedmsg, #fedora-admin, #fedora-noc
Servers::
bugzilla2fedmsg01
Purpose::
Rebroadcast bugzilla events on our bus.
== Description
bugzilla2fedmsg is a small service running as the 'moksha-hub' process
which receives events from bugzilla via the RH "unified messagebus" and
rebroadcasts them to our fedmsg bus.
[NOTE]
.Note
====
Unlike _all_ of our other fedmsg services, this one runs as the
'moksha-hub' process and not as the 'fedmsg-hub'.
====
The bugzilla2fedmsg package provides a plugin to the moksha-hub that
connects out over the STOMP protocol to a 'fabric' of JBOSS activemq
FUSE brokers living in the Red Hat DMZ. We authenticate with a cert/key
pair that is kept in /etc/pki/fedmsg/. Those brokers should push
bugzilla events over STOMP to our moksha-hub daemon. When a message
arrives, we query bugzilla about the change to get some 'more
interesting' data to stuff in our payload, then we sign the message
using a fedmsg cert and fire it off to the rest of our bus.
This service has no database, no memcached usage. It depends on those
STOMP brokers and being able to query bugzilla.rh.com.
== Relevant Files
All managed by ansible, of course:

* STOMP config: /etc/moksha/production.ini
* fedmsg config: /etc/fedmsg.d/
* certs: /etc/pki/fedmsg
* code: /usr/lib/python2.7/site-packages/bugzilla2fedmsg.py
== Useful Commands
To look at logs, run:
....
$ journalctl -u moksha-hub -f
....
To restart the service, run:
....
$ systemctl restart moksha-hub
....
== Internal Contacts
If we need to contact someone from the RH internal "unified messagebus"
team, search for "unified messagebus" in mojo. It is operated as a joint
project between RHIT and PnT Devops. See also the `#devops-message` IRC
channel, internally.
View file

@ -0,0 +1,169 @@
= Fedora OpenStack
== Quick Start
Controller:
....
sudo rbac-playbook hosts/fed-cloud09.cloud.fedoraproject.org.yml
....
Compute nodes:
....
sudo rbac-playbook groups/openstack-compute-nodes.yml
....
== Description
If you need to install OpenStack, either make sure the machine is
clean, or use the `ansible.git/files/fedora-cloud/uninstall.sh` script
to brute-force wipe it.
[NOTE]
.Note
====
by default, the script does not wipe the LVM group with the VMs; you
have to clean them manually. There is a commented-out line in that
script for this.
====
On fed-cloud09, remove the file
`/etc/packstack_sucessfully_finished` to force packstack and a few
other commands to run again.
After that wipe, you have to:
....
ifdown eth1
configure eth1 to become normal Ethernet with ip
yum install openstack-neutron-openvswitch
/usr/bin/systemctl restart neutron-ovs-cleanup
ifup eth1
....
Additionally, when reprovisioning OpenStack, all volumes on the Dell
EqualLogic are preserved and you have to manually remove them (or remove
them from OpenStack before it is reprovisioned). SSH to the Dell
EqualLogic (credentials are at the bottom of `/etc/cinder/cinder.conf`)
and run:
....
show (to get list of volumes)
volume select <volume_name> offline
volume delete <volume_name>
....
Before installing make sure:

* the rdo repo is enabled
* `yum install openstack-packstack openstack-packstack-puppet openstack-puppet-modules`
* edit `/usr/lib/python2.7/site-packages/packstack/plugins/dashboard_500.py`
and add the missing parentheses:
+
....
host_resources.append((ssl_key, 'ssl_ps_server.key'))
....
Now you can run playbook:
....
sudo rbac-playbook hosts/fed-cloud09.cloud.fedoraproject.org.yml
....
If you run it after a wipe (i.e. the db has been reset), you have to:

* import ssh keys of users (only possible via webUI - RHBZ 1128233)
* reset user passwords
== Compute nodes
The compute node setup is much simpler and is written as a role. Use:
....
vars_files:
- ... SNIP
- /srv/web/infra/ansible/vars/fedora-cloud.yml
- "{{ private }}/files/openstack/passwords.yml"
roles:
... SNIP
- cloud_compute
....
Define a host variable in `inventory/host_vars/FQDN.yml`:
....
compute_private_ip: 172.23.0.10
....
You should also add the IP to `vars/fedora-cloud.yml`.

When adding a new compute node, please also update
`files/fedora-cloud/hosts`.
[IMPORTANT]
.Important
====
When reinstalling, make sure you have removed all members on the Dell
EqualLogic (credentials are in /etc/cinder/cinder.conf on the compute
node), otherwise the space will stay blocked!
====
== Updates
Our openstack cloud should have updates applied and reboots when the
rest of our servers are updated and rebooted. This will cause an outage,
please make sure to schedule it.
[arabic]
. Stop copr-backend process on copr-be.cloud.fedoraproject.org
. Kill all copr-builder instances.
. Kill all transient/scratch instances.
. Update all instances we control. copr, persistent, infrastructure, qa
etc.
. Shutdown all instances
. Update and reboot fed-cloud09
. Update and reboot all compute nodes
. Start up all instances that are shutdown in step 5.
TODO: add commands for above as we know them.
== Troubleshooting
* Could not connect to a VM? Check your security group; the default SG
does not allow any connections.
* packstack ended up with an error: it is likely a race condition in
puppet - BZ 1135529. Just run it again.
* ERROR : append() takes exactly one argument (2 given):
`vi /usr/lib/python2.7/site-packages/packstack/plugins/dashboard_500.py`
and add one more surrounding ().
* Local ip for ovs agent must be set when tunneling is enabled:
restart fed-cloud09 or: ssh to fed-cloud09; ifdown eth1; ifup eth1;
ifup br-ex
* mongodb problem? follow
https://ask.openstack.org/en/question/54015/mongodbpp-error-when-installing-rdo-on-centos-7/?answer=54076#post-id-54076
* `WARNING:keystoneclient.httpclient:Failed to retrieve management_url from token`:
+
....
keystone --os-token $ADMIN_TOKEN --os-endpoint \
https://fedorainfracloud.org:35357/v2.0/ endpoint-create --region 'RegionOne' \
--service 91358b81b1aa40d998b3a28d0cfc86e7 --region 'RegionOne' --publicurl \
'https://fedorainfracloud.org:5000/v2.0' --adminurl 'http://172.24.0.9:35357/v2.0' \
--internalurl 'http://172.24.0.9:5000/v2.0'
....
== Fedora Classroom about our instance
http://meetbot.fedoraproject.org/fedora-classroom/2015-05-11/fedora-classroom.2015-05-11-15.02.log.html
View file

@ -0,0 +1,62 @@
= Collectd SOP
Collectd ( https://collectd.org/ ) is a client/server setup that gathers
system information from clients and allows the server to display that
information over various time periods.
Our server instance runs on log01.phx2.fedoraproject.org and most other
servers run clients that connect to the server and provide it with data.
'''''
[arabic]
. Contact Information
. Collectd info
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Location::
https://admin.fedoraproject.org/collectd/
Servers::
log01 and all/most other servers as clients
Purpose::
provide load and system information on servers.
== Configuration
The collectd roles configure collectd on the various machines:
* collectd/base - This is the base client role for most servers.
* collectd/server - This is the server role, for use on log01.
* collectd/other - There are various other subroles for different types
of clients.
== Web interface
The server web interface is available at:
https://admin.fedoraproject.org/collectd/
== Restarting
collectd runs as a normal systemd or sysvinit service, so you can
restart it with `systemctl restart collectd` or `service collectd
restart`.
== Removing old hosts
Collectd keeps information around until it's deleted, so you may need to
sometime go remove data from a host or hosts thats no longer used. To do
this:
[arabic]
. Login to log01
. cd /var/lib/collectd/rrd
. sudo rm -rf oldhostname
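Put together as a shell session (replace oldhostname with the actual
host):

....
$ ssh log01.phx2.fedoraproject.org
$ cd /var/lib/collectd/rrd
$ sudo rm -rf oldhostname
....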
== Bug reporting
Collectd is in Fedora/EPEL and we use their packages, so report bugs to
bugzilla.redhat.com.
View file

@ -0,0 +1,76 @@
= Communishift SOP
Communishift is an OpenShift deployment hosted and maintained by Fedora
Infrastructure that is available to the community to host applications.
Fedora Infrastructure does not maintain the applications in Communishift
and is only responsible for the OpenShift deployment itself.
Production instance:
https://console-openshift-console.apps.os.fedorainfracloud.org/
== Contact information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Persons::
nirik
Location::
Phoenix
Servers::
* os-node01.fedorainfracloud.org
* os-node02.fedorainfracloud.org
* os-node03.fedorainfracloud.org
* os-node04.fedorainfracloud.org
* os-node05.fedorainfracloud.org
* os-node06.fedorainfracloud.org
* os-node07.fedorainfracloud.org
* os-node08.fedorainfracloud.org
* os-node09.fedorainfracloud.org
* os-node10.fedorainfracloud.org
* os-node11.fedorainfracloud.org
* virthost-os01.fedorainfracloud.org
* virthost-os02.fedorainfracloud.org
* virthost-os03.fedorainfracloud.org
* virthost-aarch64-os01.fedorainfracloud.org
* virthost-aarch64-os02.fedorainfracloud.org
Purpose::
Allow community members to host services for the Fedora Project.
== Onboarding new users
To allow new users to create projects in Communishift, begin by adding
them to the `communishift` FAS group.
At the time of this writing, there is no automation to sync users from
the `communishift` FAS group to OpenShift, so you will need to log in to
the Communishift instance and grant that user permissions to create
projects. For example, to grant `bowlofeggs` permissions, you would do
this:
....
$ oc adm policy add-cluster-role-to-user self-provisioner bowlofeggs
$ oc create clusterquota for-bowlofeggs --project-annotation-selector openshift.io/requester=bowlofeggs --hard pods=10 --hard persistentvolumeclaims=5
....
This will grant bowlofeggs the ability to provision up to 10 pods and 5
volumes.
== KVM access
We allow applications access to the kvm device so they can run emulation
faster. Anytime the cluster is re-installed, run:
....
#!/bin/bash
set -eux
if ! oc get --namespace=default ds/device-plugin-kvm &>/dev/null; then
    oc create --namespace=default -f https://raw.githubusercontent.com/kubevirt/kubernetes-device-plugins/master/manifests/kvm-ds.yml
fi
....
See the
https://github.com/kubevirt/kubernetes-device-plugins/blob/master/docs/README.kvm.md[upstream
docs] as well as the
https://pagure.io/fedora-infrastructure/issue/8208[original request] for
this.
View file

@ -0,0 +1,30 @@
= Compose Tracker SOP
Compose Tracker tracks the pungi composes and creates a ticket in a
pagure repo for composes that do not reach FINISHED status, with a tail
of the debug log and the koji tasks associated with it.

Compose Tracker: https://pagure.io/releng/compose-tracker

Failed Composes Repo: https://pagure.io/releng/failed-composes
== Contents
[arabic]
. Contact Information
== Contact Information
Owner::
Fedora Release Engineering Team
Contact::
#fedora-releng
Persons::
dustymabe mohanboddu
Purpose::
Track failed composes
== More Information
For information about the tool and deployment on Fedora Infra Openshift
please look at the documentation in
https://pagure.io/releng/compose-tracker/blob/master/f/README.md
View file

@ -0,0 +1,127 @@
= Content Hosting Infrastructure SOP
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main, fedora-infrastructure-list
Location::
Phoenix
Servers::
secondary1, netapp[1-3], torrent1
Purpose::
Policy regarding hosting, removal and pruning of content.
Scope::
download.fedora.redhat.com, alt.fedoraproject.org,
archives.fedoraproject.org, secondary.fedoraproject.org,
torrent.fedoraproject.org
== Description
Fedora hosts both Fedora content and some non-Fedora content. Our
resources are finite and as such we have to have some policy around when
to remove old content. This SOP describes the criteria used to remove content.
The spirit of this SOP is to allow more people to host content and give
it a try, prove that it's useful. If it's not popular or useful, it will
get removed. Also out of date or expired content will be removed.
=== What hosting options are available
Aside from the hosting at https://pagure.io/ we have a series of mirrors
we're allowing people to use. They are located at:
* http://archive.fedoraproject.org/pub/archive/ - For archives of
historical Fedora releases
* http://secondary.fedoraproject.org/pub/fedora-secondary/ - For
secondary architectures
* http://alt.fedoraproject.org/pub/alt/ - For misc content / catchall
* http://torrent.fedoraproject.org/ - For torrent hosting
* http://spins.fedoraproject.org/ - For official Fedora Spins hosting,
mirrored somewhat
* http://download.fedoraproject.com/pub/ - For official Fedora Releases,
mirrored widely
=== Who can host? What can be hosted?
Any official Fedora content can be hosted and made available for mirroring.
Official content is determined by the Council by virtue of allowing
people to use the Fedora trademark. People representing these teams will
be allowed to host.
=== Non Official Hosting
People wanting to host unofficial bits may request approval for hosting.
Create a ticket at https://pagure.io/fedora-infrastructure/ explaining
what and why Fedora should host it. Such requests will be reviewed by
the Fedora Infrastructure team.
Requests for non-official hosting that may conflict with existing Fedora
policies will be escalated to the Council for approval.
=== Licensing
Anything hosted with Fedora must come with a Free software license that
is approved by Fedora. See http://fedoraproject.org/wiki/Licensing for
more.
== Requesting Space
* Make sure you have a Fedora account -
https://admin.fedoraproject.org/accounts/
* Ensure you have signed the Fedora Project Contributor Agreement (FPCA)
* Submit a hosting request - https://pagure.io/fedora-infrastructure/
** Include who you are, and any group you are working with (e.g. a SIG)
** Include space requirements
** Include an estimate of the number of downloads expected (if you can).
** Include the nature of the bits you want to host.
* Apply for the hosted-content group -
https://admin.fedoraproject.org/accounts/group/view/hosted-content
== Using Space
A dedicated namespace in the mirror will be assigned to you. It will be
your responsibility to upload content, remove old content, stay within
your quota, etc. If you have any questions or concerns about this please
let us know. Generally you will use rsync. For example:
....
rsync -av --progress ./my.iso secondary01.fedoraproject.org:/srv/pub/alt/mySpace/
....
[IMPORTANT]
.Important
====
None of our mirrored content is backed up. Ensure that you keep backups
of your content.
====
== Content Pruning / Purging / Removal
The following guidelines / tests will be used to determine whether or
not to remove content from the mirror.
=== Expired / Old Content
If content meets any of the following criteria it may be removed:
* Content that has reached the end of life (is no longer receiving
updates).
* Pre-release content that has been superseded.
* EOL releases that have been moved to archives.
* N-2 or greater releases. If more than 3 versions of a piece of content
are on the mirror, the oldest may be removed.
=== Limited Use Content
If content meets any of the following criteria it may be removed:
* Content with exceedingly limited seeders or downloaders, with little
prospect of increasing those numbers and which is older than 1 year.
* Content such as videos or audio which are several years old.
=== Catch All Removal
Fedora reserves the right to remove any content for any reason at any
time. We'll do our best to host things but sometimes we'll need space or
just need to remove stuff for legal or policy reasons.
View file

@ -0,0 +1,417 @@
= Copr
Copr is a build system for 3rd party packages.
Frontend:::
* http://copr.fedorainfracloud.org/
Backend:::
* http://copr-be.cloud.fedoraproject.org/
Package signer:::
* copr-keygen.cloud.fedoraproject.org
Dist-git::
* copr-dist-git.fedorainfracloud.org
Devel instances (NO NEED TO CARE ABOUT THEM, JUST THOSE ABOVE):::
* http://copr-fe-dev.cloud.fedoraproject.org/
* http://copr-be-dev.cloud.fedoraproject.org/
* copr-keygen-dev.cloud.fedoraproject.org
* copr-dist-git-dev.fedorainfracloud.org
== Contact Information
Owner::
msuchy (mirek)
Contact::
#fedora-admin, #fedora-buildsys
Location::
Fedora Cloud
Purpose::
Build system
== This document
This document provides condensed information to help you keep Copr
alive and working. For more sophisticated business processes, please see
https://docs.pagure.org/copr.copr/maintenance_documentation.html
== TROUBLESHOOTING
Almost every problem with Copr is due to a problem with spawning builder
VMs, or with processing the action queue on the backend.
=== VM spawning/termination problems
Try to restart copr-backend service:
....
$ ssh root@copr-be.cloud.fedoraproject.org
$ systemctl restart copr-backend
....
If this doesn't solve the problem, try to follow logs for some clues:
....
$ tail -f /var/log/copr-backend/{vmm,spawner,terminator}.log
....
As a last resort, you can terminate all builders and let copr-backend
throw away all information about them. This will obviously interrupt all
running builds and reschedule them:
....
$ ssh root@copr-be.cloud.fedoraproject.org
$ systemctl stop copr-backend
$ cleanup_vm_nova.py
$ redis-cli
> FLUSHALL
$ systemctl start copr-backend
....
Sometimes OpenStack cannot handle spawning too many VMs at the same
time, so it is safer to edit, on copr-be.cloud.fedoraproject.org:
....
vi /etc/copr/copr-be.conf
....
and change:
....
group0_max_workers=12
....
to "6". Start the copr-backend service and some time later increase it
back to the original value. Copr automatically detects changes in the
config and increases the number of workers.
The set of aarch64 VMs isn't maintained by OpenStack, but by Copr's
backend itself. Steps to diagnose:
....
$ ssh root@copr-be.cloud.fedoraproject.org
[root@copr-be ~][PROD]# systemctl status resalloc
● resalloc.service - Resource allocator server
...
[root@copr-be ~][PROD]# less /var/log/resallocserver/main.log
[root@copr-be ~][PROD]# su - resalloc
[resalloc@copr-be ~][PROD]$ resalloc-maint resource-list
13569 - aarch64_01_prod_00013569_20190613_151319 pool=aarch64_01_prod tags=aarch64 status=UP
13597 - aarch64_01_prod_00013597_20190614_083418 pool=aarch64_01_prod tags=aarch64 status=UP
13594 - aarch64_02_prod_00013594_20190614_082303 pool=aarch64_02_prod tags=aarch64 status=STARTING
...
[resalloc@copr-be ~][PROD]$ resalloc-maint ticket-list
879 - state=OPEN tags=aarch64 resource=aarch64_01_prod_00013569_20190613_151319
918 - state=OPEN tags=aarch64 resource=aarch64_01_prod_00013608_20190614_135536
904 - state=OPEN tags=aarch64 resource=aarch64_02_prod_00013594_20190614_082303
919 - state=OPEN tags=aarch64
...
....
Be careful when there's some resource in `STARTING` state. If that's so,
check
`/usr/bin/tail -F -n +0 /var/log/resallocserver/hooks/013594_alloc`.
Copr takes tickets from the resalloc server; if the resources fail to
spawn, the tickets are not assigned an appropriately tagged resource
for a long time.
If that happens (it shouldn't) and there's some inconsistency between
resalloc's database and the actual status on aarch64 hypervisors
(`ssh copr@virthost-aarch64-os0{1,2}.fedorainfracloud.org`) - use
`virsh` there to introspect theirs statuses - use
`resalloc-maint resource-delete`, `resalloc ticket-close` or `psql`
commands to fix-up the resalloc's DB.
=== Backend Troubleshooting
Information about status of Copr backend services:
....
systemctl status copr-backend*.service
....
Utilization of workers:
....
ps axf
....
Worker processes change their $0 (process title) to show which task they
are working on and on which builder.
To list which VM builders are tracked by copr-vmm service:
....
/usr/bin/copr_get_vm_info.py
....
=== Appstream builder troubleshooting
Appstream builder is painfully slow when running on a repository with a
huge amount of packages. See
https://github.com/hughsie/appstream-glib/issues/301 . You might need to
disable it for some projects:
....
$ ssh root@copr-be.cloud.fedoraproject.org
$ cd /var/lib/copr/public_html/results/<owner>/<project>/
$ touch .disable-appstream
# You should probably also delete existing appstream data because
# they might be obsolete
$ rm -rf ./appdata
....
=== Backend action queue issues
First check the number of not-yet-processed actions. If that number
isn't equal to zero, and is not decreasing relatively fast (say a
single action takes longer than 30s), there might be some problem.
Logs for the action dispatcher can be found in:
....
/var/log/copr-backend/action_dispatcher.log
....
Check that there's no stuck process under the `Action dispatch` parent
process in `pstree -a copr` output.
== Deploy information
Using playbooks and rbac:
....
$ sudo rbac-playbook groups/copr-backend.yml
$ sudo rbac-playbook groups/copr-frontend-cloud.yml
$ sudo rbac-playbook groups/copr-keygen.yml
$ sudo rbac-playbook groups/copr-dist-git.yml
....
https://pagure.io/copr/copr/blob/master/f/copr-setup.txt The
[.title-ref]#copr-setup.txt# manual is severely outdated, but there is
no up-to-date alternative. We should extract useful information from it
and put it here in the SOP or into
https://docs.pagure.org/copr.copr/maintenance_documentation.html and
then throw the [.title-ref]#copr-setup.txt# away.
The copr-backend service (which spawns several processes) should run on
the backend. The backend spawns VMs in the Fedora Cloud. You cannot log
in to those machines directly. You have to:
....
$ ssh root@copr-be.cloud.fedoraproject.org
$ su - copr
$ copr_get_vm_info.py
# find IP address of the VM that you want
$ ssh root@172.16.3.3
....
Instances can be easily terminated in
https://fedorainfracloud.org/dashboard
=== Order of start up
When reprovisioning, you should start the copr-keygen and copr-dist-git
machines first (in any order). Then you can start copr-be. You can start
it sooner, but make sure that the copr-* services are stopped.

The copr-fe machine is completely independent and can be started at any
time. If the backend is stopped it will just queue jobs.
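As a sketch only (service and host names are taken from elsewhere in
this SOP; the exact ssh invocations are illustrative), the start-up
order could look like:

....
# keygen and dist-git first, in any order
$ ssh root@copr-keygen.cloud.fedoraproject.org 'systemctl start signd'
$ ssh root@copr-dist-git.fedorainfracloud.org 'systemctl start copr-dist-git httpd'
# then the backend
$ ssh root@copr-be.cloud.fedoraproject.org 'systemctl start copr-backend'
....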
== Logs
=== Backend
* /var/log/copr-backend/action_dispatcher.log
* /var/log/copr-backend/actions.log
* /var/log/copr-backend/backend.log
* /var/log/copr-backend/build_dispatcher.log
* /var/log/copr-backend/logger.log
* /var/log/copr-backend/spawner.log
* /var/log/copr-backend/terminator.log
* /var/log/copr-backend/vmm.log
* /var/log/copr-backend/worker.log
There are also several logs for non-essential features, such as
copr_prune_results.log, hitcounter.log and cleanup_vms.log, that you
shouldn't be worried about.
=== Frontend
* /var/log/copr-frontend/frontend.log
* /var/log/httpd/access_log
* /var/log/httpd/error_log
=== Keygen
* /var/log/copr-keygen/main.log
=== Dist-git
* /var/log/copr-dist-git/main.log
* /var/log/httpd/access_log
* /var/log/httpd/error_log
== Services
=== Backend
* copr-backend
** copr-backend-action
** copr-backend-build
** copr-backend-log
** copr-backend-vmm
* redis
* lighttpd
All the [.title-ref]#copr-backend-*.service# are configured to be a part
of the [.title-ref]#copr-backend.service# so e.g. in case of restarting
all of them, just restart the [.title-ref]#copr-backend.service#.
=== Frontend
* httpd
* postgresql
=== Keygen
* signd
=== Dist-git
* httpd
* copr-dist-git
== PPC64LE Builders
Builders for PPC64 are located at rh-power2.fit.vutbr.cz and anyone with
access to the buildsys ssh key can get there using keys, as
msuchy@rh-power2.fit.vutbr.cz.

The `bin/` directory there contains these scripts (for VMs vm26 through
vm29): destroy-all.sh, get-one-vm.sh, reinit-vmXX.sh,
virsh-destroy-vmXX.sh and virsh-start-vmXX.sh.

* bin/destroy-all.sh - destroys all VMs and reinits them
* reinit-vmXX.sh - copies the VM image from the template
* virsh-destroy-vmXX.sh - destroys a VM
* virsh-start-vmXX.sh - starts a VM
* get-one-vm.sh - starts one VM and returns its IP - this is used in
Copr playbooks

In case of a big queue of PPC64 tasks, simply call bin/destroy-all.sh;
it will destroy stuck VMs and the copr backend will spawn new VMs.
== Ports opened for public
Frontend:
[width="86%",cols="13%,17%,16%,54%",options="header",]
|===
|Port |Protocol |Service |Reason
|22 |TCP |ssh |Remote control
|80 |TCP |http |Serving Copr frontend website
|443 |TCP |https |^^
|===
Backend:
[width="86%",cols="13%,17%,16%,54%",options="header",]
|===
|Port |Protocol |Service |Reason
|22 |TCP |ssh |Remote control
|80 |TCP |http |Serving build results and repos
|443 |TCP |https |^^
|===
Distgit:
[width="86%",cols="13%,17%,16%,54%",options="header",]
|===
|Port |Protocol |Service |Reason
|22 |TCP |ssh |Remote control
|80 |TCP |http |Serving cgit interface
|443 |TCP |https |^^
|===
Keygen:
[width="86%",cols="13%,17%,16%,54%",options="header",]
|===
|Port |Protocol |Service |Reason
|22 |TCP |ssh |Remote control
|===
== Resources justification
Copr currently uses the following resources.
=== Frontend
* RAM: 2G (out of 4G) and some swap
* CPU: 2 cores (3400MHz) with load 0.92, 0.68, 0.65
Most of the memory is eaten by PostgreSQL, followed by Apache. The CPU
usage is also mainly driven by those two services, but in the reversed
order.
I don't think we can settle for any instance that provides less than 2G
RAM (obviously), but ideally we need 3G+. A 2-core CPU is good enough.
* Disk space: 17G for system and 8G for [.title-ref]#pgsqldb# directory
If needed, we are able to clean-up the database directory of old dumps
and backups and get down to around 4G disk space.
=== Backend
* RAM: 5G (out of 16G)
* CPU: 8 cores (3400MHz) with load 4.09, 4.55, 4.24
The backend takes care of spinning up builders and running ansible
playbooks on them, running [.title-ref]#createrepo_c# (on big
repositories) and so on. Copr utilizes two queues: one for builds, which
are delegated to OpenStack builders, and one for actions. Actions,
however, are processed directly by the backend, so they can spike our
load. We would ideally like to have the same computing power that we
have now. Maybe we can go lower than 16G RAM, possibly down to 12G RAM.
* Disk space: 30G for the system, 5.6T (out of 6.8T) for build results
Currently, we have 1.3T of backup data that is going to be deleted soon,
but nevertheless, we cannot go any lower on storage. Disk space is a
long-term issue for us and we need to make a lot of compromises just to
survive our daily increase (which is around 10G of new data). Many
features are blocked by not having enough storage. We cannot go any
lower, and we also cannot go much longer with the current storage.
=== Distgit
* RAM: ~270M (out of 4G), but climbs to ~1G when busy
* CPU: 2 cores (3400MHz) with load 1.35, 1.00, 0.53
Personally, I wouldn't downgrade the machine too much. Possibly we can
live with 3G RAM, but I wouldn't go any lower.
* Disk space: 7G for system, 1.3T dist-git data
We currently employ a lot of aggressive cleaning strategies on our
distgit data, so we can't go any lower than what we have.
=== Keygen
* RAM: ~150M (out of 2G)
* CPU: 1 core (3400MHz) with load 0.10, 0.31, 0.25
We are basically running just [.title-ref]#signd# and
[.title-ref]#httpd# here, both with minimal resource requirements. The
memory usage is topped by [.title-ref]#systemd-journald#.
* Disk space: 7G for system and ~500M (out of ~700M) for GPG keys
We are slowly pushing the GPG key storage to its limit, so in the case
of migrating copr-keygen somewhere, we would like to scale it up to at
least 1G.
= Cyclades
cyclades notes
[arabic]
. login as root - the default password is tslinux
. change the password for root and admin to our password from the
phx2-access.txt file in the private repo
. port forward to the web browser for the cyclades:
+
`ssh -L 8080:rack47-serial.phx2.fedoraproject.org:80`
. connect to localhost:8080 in your web browser
. login with root and the password you set above
. click on 'security'
. click on 'moderate'
. logout, then port forward port 443 as above:
+
`ssh -L 8080:rack47-serial.phx2.fedoraproject.org:443`
. click on the 'wizard' button at lower left
. proceed through the wizard. Info needed:
* serial ports are set to 115200 8N1 by default
* do not setup buffering
* give it the ip of our syslog server
. click 'apply changes'
. hope
. log back in
. name/setup the port aliases
= Darkserver SOP
To set up a http://darkserver.fedoraproject.org instance based on the
Darkserver project to provide GNU_BUILD_ID information for packages. A
devel instance can be seen at http://darkserver01.dev.fedoraproject.org
and the staging instance is
http://darkserver01.stg.phx2.fedoraproject.org/. This page describes how
to set up the server.
== Contents
[arabic]
. Contact Information
. Installing the server
. Setting up the database
. SELinux Configuration
. Koji plugin setup
. Debugging
== Contact Information
Owner:::
Fedora Infrastructure Team
Contact:::
#fedora-admin
Persons:::
kushal mether
Sponsor:::
nirik
Location:::
phx2
Servers:::
darkserver01 , darkserver01.stg, darkserver01.dev
Purpose:::
To host Darkserver
== Installing the Server
....
root@localhost# yum install darkserver
....
== Setting up the database
We are using MySQL as the database. We will need two users: one for the
koji plugin and one for darkserver:
....
root@localhost# mysql -u root
mysql> CREATE DATABASE darkserver;
mysql> GRANT INSERT ON darkserver.* TO kojiplugin@'koji-hub-ip' IDENTIFIED BY 'XXX';
mysql> GRANT SELECT ON darkserver.* TO dark@'darkserver-ip' IDENTIFIED BY 'XXX';
....
Setup this db configuration in the conf file under
`/etc/darkserver/darkserverweb.conf`:
....
[darkserverweb]
host=db host name
user=dark
password=XXX
database=darkserver
....
Now set up the db tables if it is a new install.
(For this you may need to `'GRANT * ON darkserver.*'` to the web user,
and then `'REVOKE * ON darkserver.*'` after running.)
....
root@localhost# python /usr/lib/python2.6/site-packages/darkserverweb/manage.py syncdb
....
== SELinux Configuration
Do the following to allow the webserver to connect to the database:
....
root@localhost# setsebool -P httpd_can_network_connect_db 1
....
== Setting up the Koji plugin
Install the package:
....
root@localhost# yum install darkserver-kojiplugin
....
Then fill in the configuration file under
`/etc/koji-hub/plugins/darkserver.conf`:
....
[darkserver]
host=db host name
user=kojiplugin
password=XXX
database=darkserver
port=3306
....
Then enable the plugin in the koji hub configuration.
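As a rough sketch only - the plugin path and option names below are
assumptions, so verify them against the hub's existing
`/etc/koji-hub/hub.conf`:
....
[hub]
PluginPath = /usr/lib/koji-hub-plugins
Plugins = darkserver
....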
== Debugging
Set DEBUG to True in `/etc/darkserver/settings.py` file and restart
Apache.
= Database Infrastructure SOP
Our database servers provide database storage for many of our apps.
== Contents
[arabic]
. Contact Information
. Description
. Creating a New Postgresql Database
. Troubleshooting and Resolution
.. Connection issues
.. Some useful queries
... What queries are running
... Seeing how "dirty" a table is
... XID Wraparound
.. Restart Procedure
... Koji
... Bodhi
. Note about TurboGears and MySQL
. Restoring from backups or specific dbs
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main, sysadmin-dba group
Location::
Phoenix
Servers::
db01, db03, db-fas01, db-datanommer02, db-koji01, db-s390-koji01,
db-arm-koji01, db-ppc-koji01, db-qa01, dbqastg01
Purpose::
Provides database connection to many of our apps.
== Description
db01, db03 and db-fas01 are our primary servers. db01 and db-fas01 run
PostgreSQL. db03 contains mariadb. db-koji01, db-s390-koji01,
db-arm-koji01, db-ppc-koji01 contain secondary kojis. db-qa01 and
db-qastg01 contain resultsdb. db-datanommer02 contains all stored
messages in a postgresql database.
== Creating a New Postgresql Database
Creating a new database on our postgresql server isn't hard, but there are
several steps that should be taken to make the database server as secure
as possible.
We want to separate the database permissions so that we don't have the
user/password combination that can do anything it likes to the database
on every host (the webapp user can usually do a lot of things even
without those extra permissions but every little bit helps).
Say we have an app called "raffle". We'd have three users:
* raffleadmin: able to make any changes they want to this particular
database. It should not be used in day to day but only for things like
updating the database schema when an update occurs. We could very likely
disable this account in the db whenever we are not using it.
* raffleapp: the database user that the web application uses. This will
likely need to be able to insert and select from all tables. It will
probably need to update most tables as well. There may be some tables
that it does _not_ need delete on. It should almost certainly not need
schema modifying permissions. (With postgres, it likely also needs
permission to insert/select on sequences as well).
* rafflereadonly: Only able to read data from tables, not able to modify
anything. Sadly, we aren't using this often but it can be useful for
scripts that need to talk directly to the database without modifying it.
....
db2 $ sudo -u postgres createuser -P -E NEWDBadmin
Password: <randomly generated password>
db2 $ sudo -u postgres createuser -P -E NEWDBapp
Password: <randomly generated password>
db2 $ sudo -u postgres createuser -P -E NEWDBreadonly
Password: <randomly generated password>
db2 $ sudo -u postgres createdb -E utf8 NEWDB -O NEWDBadmin
db2 $ sudo -u postgres psql NEWDB
NEWDB=# revoke all on database NEWDB from public;
NEWDB=# revoke all on schema public from public;
NEWDB=# grant all on schema public to NEWDBadmin;
NEWDB=# [grant permissions to NEWDBapp as appropriate for your app]
NEWDB=# [grant permissions to NEWDBreadonly as appropriate for a user that
is only trusted enough to read information]
NEWDB=# grant connect on database NEWDB to nagiosuser;
....
If your application needs to have the NEWDBapp user and password to
connect to the database, you probably want to add these to ansible as
well. Put the password in the private repo on batcave01. Then use a
template file to incorporate it into the config file. See fas.pp for an
example.
== Troubleshooting and Resolution
=== Connection issues
There are no known outstanding issues with the database itself. Remember
that every time either database is restarted, services will have to be
restarted (see below).
=== Some useful queries
==== What queries are running
This can help you find out what queries are currently running on the
server:
....
select datname, pid, query_start, backend_start, query from
pg_stat_activity where state<>'idle' order by query_start;
....
This can help you find how many connections to the db server are for
each individual database:
....
select datname, count(datname) from pg_stat_activity group by datname
order by count desc;
....
==== Seeing how "dirty" a table is
We've added a function from postgres's contrib directory to tell how
dirty a table is. By dirty we mean, how many tuples are active, how many
have been marked as having old data (and therefore "dead") and how much
free space is allocated to the table but not used.:
....
\c fas2
\x
select * from pgstattuple('visit_identity');
table_len | 425984
tuple_count | 580
tuple_len | 46977
tuple_percent | 11.03
dead_tuple_count | 68
dead_tuple_len | 5508
dead_tuple_percent | 1.29
free_space | 352420
free_percent | 82.73
\x
....
Vacuum should clear out dead_tuples. Only a vacuum full, which will lock
the table and therefore should be avoided, will clear out free space.
==== XID Wraparound
Find out how close we are to having to perform a vacuum of a database
(as opposed to individual tables of the db). We should schedule a vacuum
when about 50% of the transaction ids have been used (approximately
530,000,000 xids):
....
select datname, age(datfrozenxid), pow(2, 31) - age(datfrozenxid) as xids_remaining
from pg_database order by xids_remaining;
....
More information on XID wraparound is available in the PostgreSQL
documentation.
== Restart Procedure
If the database server needs to be restarted it should come back on its
own. Otherwise each service on it can be restarted:
....
service mysqld restart
service postgresql restart
....
=== Koji
Any time postgresql is restarted, koji needs to be restarted. Please
also see the Restarting Koji SOP.
=== Bodhi
Anytime postgresql is restarted, Bodhi will need to be restarted; no SOP
currently exists for this.
== TurboGears and MySQL
[NOTE]
.Note about TurboGears and MySQL
====
There's a known bug in TurboGears that causes MySQL clients not to
automatically reconnect when lost. Typically a restart of the TurboGears
application will correct this issue.
====
== Restoring from backups or specific dbs
Our backups store the latest copy in /backups/ on each db server. These
backups are created automatically by the db-backup script run from cron.
Look in /usr/local/bin for the backup script.
To restore partially or completely you need to:
[arabic]
. setup postgres on a system
. start postgres/run initdb
+
if this new system running postgres has already run ansible then it will
have wrong config files in /var/lib/pgsql/data - clear them out before
you start postgres so initdb can work
. grab the backups you need from /backups - also grab global.sql; edit
global.sql to only create/alter the dbs you care about
. as postgres run: `psql -U postgres -f global.sql`
. when this completes you can restore each db with (as the postgres
user): `createdb $dbname` and then `pg_restore -d $dbname
dbname_backup_file.db`
. restart postgres and check your data.
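A minimal worked example, assuming a hypothetical database named
`raffle` and a backup file produced by the db-backup script:
....
$ sudo -u postgres psql -U postgres -f global.sql
$ sudo -u postgres createdb raffle
$ sudo -u postgres pg_restore -d raffle /backups/raffle_backup_file.db
....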
= datanommer SOP
Consume fedmsg bus activity and stuff it in a postgresql db.
== Contact Information
Owner::
Messaging SIG, Fedora Infrastructure Team
Contact::
#fedora-apps, #fedora-fedmsg, #fedora-admin, #fedora-noc
Servers::
busgateway01
Purpose::
Save fedmsg bus activity
== Description
datanommer is a set of three modules:
python-datanommer-models::
Schema definition and API for storing new items and querying existing
items
python-datanommer-consumer::
A plugin for the fedmsg-hub that actively listens to the bus and
stores events.
datanommer-commands::
A set of CLI tools for querying the DB.
datanommer will one day serve as a backend for future web services like
datagrepper and dataviewer.
Source: https://github.com/fedora-infra/datanommer/
Plan: https://fedoraproject.org/wiki/User:Ianweller/statistics_plus_plus
== CLI tools
Dump the db into a file as json:
....
$ datanommer-dump > datanommer-dump.json
....
When was the last bodhi message?:
....
$ # It was 678 seconds ago
$ datanommer-latest --category bodhi --timesince
[678]
....
When was the last bodhi message in more readable terms?:
....
$ # It was 12 minutes and 43 seconds ago
$ datanommer-latest --category bodhi --timesince --human
[0:12:43.087949]
....
What was that last bodhi message?:
....
$ datanommer-latest --category bodhi
[{"bodhi": {
"topic": "org.fedoraproject.stg.bodhi.update.comment",
"msg": {
"comment": {
"group": null,
"author": "ralph",
"text": "Testing for latest datanommer.",
"karma": 0,
"anonymous": false,
"timestamp": 1360349639.0,
"update_title": "xmonad-0.10-10.fc17"
},
"agent": "ralph"
},
}}]
....
Show me stats on datanommer messages by topic:
....
$ datanommer-stats --topic
org.fedoraproject.stg.fas.group.member.remove has 10 entries
org.fedoraproject.stg.logger.log has 76 entries
org.fedoraproject.stg.bodhi.update.comment has 5 entries
org.fedoraproject.stg.busmon.colorized-messages has 10 entries
org.fedoraproject.stg.fas.user.update has 10 entries
org.fedoraproject.stg.wiki.article.edit has 106 entries
org.fedoraproject.stg.fas.user.create has 3 entries
org.fedoraproject.stg.bodhitest.testing has 4 entries
org.fedoraproject.stg.fedoratagger.tag.create has 9 entries
org.fedoraproject.stg.fedoratagger.user.rank.update has 5 entries
org.fedoraproject.stg.wiki.upload.complete has 1 entries
org.fedoraproject.stg.fas.group.member.sponsor has 6 entries
org.fedoraproject.stg.fedoratagger.tag.update has 1 entries
org.fedoraproject.stg.fas.group.member.apply has 17 entries
org.fedoraproject.stg.__main__.testing has 1 entries
....
== Upgrading the DB Schema
datanommer uses "python-alembic" to manage its schema. When developers
want to add new columns or features, these should/must be tracked in
alembic and shipped with the RPM.
In order to run upgrades on our stg/prod dbs:
[arabic]
. ssh to busgateway01\{.stg}
. `cd /usr/share/datanommer.models/`
. Run:
+
....
$ alembic upgrade +1
....
+
over and over again, until the db is fully upgraded.
= Fedora Debuginfod Service - SOP
Debuginfod is the software that lies behind the service at
https://debuginfod.fedoraproject.org/ and
https://debuginfod.stg.fedoraproject.org/ . These services run on 1 VM
each in the stg and prod infrastructure at IAD2.
== Contact Information
Owner:::
RH perftools team + Fedora Infrastructure Team
Contact:::
@fche in #fedora-noc
Servers:::
VMs
Purpose:::
Serve elf/dwarf/source-code debuginfo for supported releases to
debugger-like tools in Fedora.
Repository:::
https://sourceware.org/elfutils/Debuginfod.html
https://fedoraproject.org/wiki/Debuginfod
== How it works
One virtual machine in prod NFS-mounts the koji build system's RPM
repository, read-only. The production VM has a virtual twin in the
staging environment. They each run elfutils debuginfod to index
designated RPMs into a large local sqlite database. They answer HTTP
queries received from users on the Internet via reverse-proxies at the
https://debuginfod.fedoraproject.org/ URL. The reverse proxies apply
gzip compression on the data and provide redirection of the root `/`
location only into the fedora wiki.
Normally, it is autonomous and needs no maintenance. It should come back
nicely after many kinds of outage. The software is based on elfutils in
Fedora, but may occasionally track a custom COPR build with backported
patches from future elfutils versions.
== Configuration
The daemon uses systemd and `/etc/sysconfig/debuginfod` to set basic
parameters. These have been tuned from the distro defaults via
experimental hand-editing or ansible. Key parameters are:
[arabic]
. The -I/-X include/exclude regexes. These tell debuginfod what fedora
versions to include RPMs for. If index disk space starts to run low, one
can eliminate some older fedoras from the index to free up space (after
the next groom cycle).
. The --fdcache related parameters. These tell debuginfod how much data
to cache from RPMs. (Some debuginfo files - kernel, llvm, gtkweb, ... -
are huge and worth retaining instead of repeatedly extracting.) This is
a straight disk space vs. time tradeoff.
. The -t (scan interval) parameter. Scanning lets an index get bigger,
as new RPMs in koji are examined and their contents indexed. Each pass
takes a bunch of hours to traverse the entire koji NFS directory
structure to fstat() everything for newness or change. A smaller scan
interval lets debuginfod react quicker to koji builds coming into
existence, but increases load on the NFS server. More -n (scan threads)
may help the indexing process go faster, if the networking fabric & NFS
server are underloaded.
. The -g (groom interval) parameter. Grooming lets an index get smaller,
as files removed from koji will be forgotten about. It can be run very
intermittently - weekly or less - since it takes many hours and cannot
run concurrently with scanning.
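For illustration only, a daemon invocation combining these parameters
might look like the sketch below; the regexes, intervals, thread count
and cache size are placeholders, not our production settings (the real
values are set via `/etc/sysconfig/debuginfod` from ansible):
....
# index only f34/f35 RPMs, rescan every 4 hours, groom weekly
debuginfod -n 4 -t 14400 -g 604800 \
    -I 'f3[45]' -X 'f3[0-3]' \
    --fdcache-mbs=4096 \
    -R /mnt/fedora_koji_prod/koji/packages
....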
A quick:
....
systemctl restart debuginfod
....
activates the new settings.
In case of some drastic failure like database corruption or signs of
penetration/abuse, one can shut down the server with systemd, and/or
stop traffic at the incoming proxy configuration level. The index sqlite
database under `/var/cache/debuginfod` may be deleted, if necessary, but
keep in mind that it takes days to reindex the relevant parts of koji.
Alternately, with the services stopped, the 150GB+ sqlite database files
may be freely copied between the staging and production servers, if that
helps during disaster recovery.
== Monitoring
=== Prometheus
The debuginfod daemons answer the standard /metrics URL endpoint to
serve a variety of operational metrics in prometheus. Important metrics
include:
[arabic]
. filesys_free_ratio - free space on the filesystems. (These are also
monitored via fedora-infra nagios.) If the free space on the database or
tmp partition falls low, further indexing or even service may be
impacted. Add more disk space if possible, or start eliding older fedora
versions from the database via the -I/-X daemon options.
. thread_busy - number of busy threads. During indexing, 1-6 threads may
be busy for minutes or even days, intermittently. User requests show up
as "buildid" (real request) or "buildid-after-you" (deferred duplicate
request) labels. If there are more than a handful of "buildid" ones,
there may be an overload/abuse underway, in which case it's time to
identify the excessive traffic via the logs and get a temporary iptables
block going. Or perhaps there is an outage or slowdown of the koji NFS
storage system, in which case there's not much to do.
. error_count. These should be zero or near zero all the time.
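For a quick manual check outside of prometheus, the metrics endpoint can
be fetched directly (the grep pattern is just an example):
....
$ curl -s https://debuginfod.fedoraproject.org/metrics | grep -E 'thread_busy|error_count'
....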
=== Logs
The debuginfod daemons produce voluminous logs into the local systemd
journal, whence the traffic moves to the usual fedora-infra log01
server, `/var/log/hosts/debuginfod*/YYYY/MM/DD/messages.log`. The lines
related to HTTP GET identify the main webapi traffic, with originating
IP addresses in the XFF: field, and response size and elapsed service
time in the last columns. These can be useful in tracking down possible
abuse:
....
Jun 28 22:36:43 debuginfod01 debuginfod[381551]: [Mon 28 Jun 2021 10:36:43 PM GMT] (381551/2413727): 10.3.163.75:43776 UA:elfutils/0.185,Linux/x86_64,fedora/35 XFF:*elided* GET /buildid/90910c1963bbcf700c0c0c06ee3bf4c5cc831d3a/debuginfo 200 335440 0+0ms
....
The lines related to prometheus /metrics are usually no big deal.
The log also includes info about errors and indexing progress.
Interesting may be the lines like:
....
Jun 28 22:36:43 debuginfod01 debuginfod[381551]: [Mon 28 Jun 2021 10:36:43 PM GMT] (381551/2413727): serving fdcache archive /mnt/fedora_koji_prod/koji/packages/valgrind/3.17.0/3.fc35/x86_64/valgrind-3.17.0-3.fc35.x86_64.rpm file /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so
....
which identify the file names derived from requests (which RPMs the
buildids map to). These can provide some indirect distro telemetry: what
packages and binaries are being debugged and for which architectures?
= Denyhosts Infrastructure SOP
Denyhosts provides a protection against brute force attacks.
== Contents
[arabic]
. Contact Information
. Description
. Troubleshooting and Resolution
.. Connection issues
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main group
Location::
Anywhere
Servers::
All
Purpose::
Denyhosts provides a protection against brute force attacks.
== Description
All of our servers now implement denyhosts to protect against brute
force attacks. Very few boxes should be in the 'allowed' list.
Especially internally.
== Troubleshooting and Resolution
=== Connection issues
The most common issue will be legitimate logins failing. First, try to
figure out why a host ended up on the deny list (tcptraceroute, failed
login attempts, etc are all good candidates). Next, follow the
directions below. The example below is for a host (10.0.0.1) being
banned. Log in to the box from a different host and, as root, do the
following:
....
cd /var/lib/denyhosts
sed -si '/10.0.0.1/d' * /etc/hosts.deny
/etc/init.d/denyhosts restart
....
That should correct the problem.
= Departing admin SOP
From time to time admins depart the project, this SOP checks any access
they may no longer need.
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main
Location::
Everywhere
Servers::
all
== Description
From time to time people with admin access to various parts of the
project may leave the project or no longer wish to contribute. This SOP
attempts to list the process for removing access they no longer need.
[arabic, start=0]
. First, make sure that this SOP is needed. Verify the person has left
the project and what areas they might wish to still contribute to.
. Gather info: fas username, email address, knowledge of passwords.
. Check the following areas with the following commands:
+
____
email address in ansible::
* Check: `git grep email@address`
* Remove: `git commit`
koji admin::
* Check: `koji list-permissions --user=username`
* Remove: `koji revoke-permission permissionname username`
wiki pages::
* Check: look for https://fedoraproject.org/wiki/User:Username
* Remove: delete page, or modify with info they are no longer
contributing.
packages::
* Check: Download
https://admin.fedoraproject.org/pkgdb/lists/bugzilla?tg_format=plain
and grep
* Remove: remove from cc, orphan packages or reassign.
fas account::
* Check: check username in fas
* Remove: set user inactive
+
[NOTE]
.Note
====
If there are scripts or files needed, save homedir of user.
====
passwords::
* Check: if departing admin knew sensitive passwords.
* Remove: Change passwords.
+
[NOTE]
.Note
====
root pw, management interfaces, etc
====
____
= DNS repository for fedoraproject
We've set this up so we can easily (and quickly) edit and deploy dns
changes with a record of who changed what and why. This system also lets
us edit out proxies from rotation for our many and varied websites
quickly and with a minimum of opportunity for error. Finally, it checks
to make sure that all of the zone changes will actually work before they
are allowed.
== DNS Infrastructure SOP
We have 5 DNS servers:
ns02.fedoraproject.org::
hosted at ibiblio (ipv6 enabled)
ns05.fedoraproject.org::
hosted at internetx (ipv6 enabled)
ns13.rdu2.fedoraproject.org::
in rdu2, internal to rdu2.
ns01.iad2.fedoraproject.org::
in iad2, internal to iad2.
ns02.iad2.fedoraproject.org::
in iad2, internal to iad2.
== Contents
[arabic]
. Contact Information
. Troubleshooting, Resolution and Maintenance
.. DNS update
.. Adding a new zone
. GeoDNS
.. Non geodns fedoraproject.org IPs
.. Adding and removing countries
.. IP Country Mapping
. resolv.conf
.. Phoenix
.. Non-Phoenix
== Contact Information
Owner:::
Fedora Infrastructure Team
Contact:::
#fedora-admin, sysadmin-main, sysadmin-dns
Location:::
ServerBeach and ibiblio and internetx and phx2.
Servers:::
ns02, ns05, ns13.rdu2, ns01.iad2, ns02.iad2
Purpose:::
Provides DNS to our users
== Troubleshooting, Resolution and Maintenance
== Check out the DNS repository
You can get the dns repository from `/srv/git/dns` on `batcave01`:
....
$ git clone /srv/git/dns
....
== Adding a new Host
Adding a new host requires adding it to DNS and to ansible; see
new-hosts.rst for the details.
== Editing the domain(s)
We have three domains which need to be able to change on demand for
proxy rotation/removal:
* fedoraproject.org.
* getfedora.org.
* cloud.fedoraproject.org.
The other domains are edited only when we add/subtract a host or move it
to a new ip. Not much else.
If you need to edit a domain that is NOT in the above list:
* change to the 'master' subdir, edit the domain as usual (remember to
update the serial), save it.
If you need to edit one of the domains in the above list (replace
fedoraproject.org with the domain from above):
* if you need to add/change a host in fedoraproject.org that is not '@'
or 'wildcard':
** edit fedoraproject.org.template
** make your changes
** do not edit the serial or anything surrounded by \{\{ }} unless you
REALLY know what you are doing
* if you need to only add/remove a proxy during an outage or due to a
networking issue, run:
** `./zone-template fedoraproject.org.cfg disable ip [ip] [ip]` to
disable the ip of the proxy you want removed
** `./zone-template fedoraproject.org.cfg enable ip [ip] [ip]` to
reverse the disable
** `./zone-template fedoraproject.org.cfg reset` to reset to all ips
enabled
* if you want to add an all new proxy as '@' or 'wildcard' for
fedoraproject.org:
** edit fedoraproject.org.cfg
** add the ip to the correct section of the ipv4 or ipv6 in the config
** save the file
** check the file for validity by running `python fedoraproject.org.cfg`
and looking for errors or tracebacks
When complete run:
____
git add .
git commit -a -m 'description of your change here'
____
It is important to commit this before running the do-domains script as
it makes it easier to track the changes.
In all cases then run:
* `./do-domains`
* if that completes successfully then run:
+
....
git add .
git commit -a -m 'description of your change here'
git push
....
* nameservers update from dns via cron every 10minutes.
The above git process can be achieved with the below bash function where
the commit message is passed as an arg when running.:
....
dnscommit()
{
local args=$1
cd ~/dns;
git commit -a -m "${args}"
git pull --rebase && ./do-domains && git add built && git commit -a -m "Signed DNS" && git push
}
....
If you need an update to be live more quickly, run this on all of the
nameservers (as root):
....
/usr/local/bin/update-dns
....
To run this via ansible from batcave do:
....
$ sudo rbac-playbook update_dns.yml
....
this will pull from the git tree, update all of the zones and reload the
name server.
== DNS update
DNS config files are ansible managed on batcave01.
From your local machine run:
....
git clone ssh://git@pagure.io/fedora-infra/ansible.git
cd ansible/roles/dns/files/
...make changes needed...
git commit -m "What you did"
git push
....
It should update within a half hour. You can test the new configs with
dig:
....
dig @ns01.fedoraproject.org fedoraproject.org
....
== Adding a new zone
First, name the zone and generate a new set of keys for it. Run this on
ns01. Note it could take SEVERAL minutes to run:
....
/usr/sbin/dnssec-keygen -a RSASHA1 -b 1024 -n ZONE c.fedoraproject.org
/usr/sbin/dnssec-keygen -a RSASHA1 -b 2048 -n ZONE -f KSK c.fedoraproject.org
....
Then copy the created .key and .private files to the private git repo
(You need to be sysadmin-main to do this). The directory is
`private/private/dnssec`.
* add the zone in zones.conf in `ansible/roles/dns/files/zones.conf`
* save and commit - but do not push
* Add zone file to the master subdir in this repo
* git add and commit the file
* check the zone by running check-domains
* if you intend to have this be a dnssec signed zone then you must
** create a new key:
+
....
/usr/sbin/dnssec-keygen -a RSASHA1 -b 1024 -n ZONE $domain.org
/usr/sbin/dnssec-keygen -a RSASHA1 -b 2048 -n ZONE -f KSK $domain.org
....
** put the files this generates into /srv/privatekeys/dnssec on
batcave01
*** edit the do-domains file in this dir and add your domain to the
signed_domains entry at the top
*** edit the zone you just created and add the contents of the .key
files to the bottom of the zone
If this is a subdomain of fedoraproject.org:
* run dnssec-dsfromkey on each of the .key files generated (see the
example after this list)
* paste that output into the bottom of fedoraproject.org.template
* commit everything to the dns tree
* push your changes
* push your changes to the ansible repo
* test
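For the dnssec-dsfromkey step above, a minimal example with a
hypothetical key file name and elided digests; the resulting DS lines
are what get pasted into fedoraproject.org.template:
....
$ dnssec-dsfromkey Kc.fedoraproject.org.+005+12345.key
c.fedoraproject.org. IN DS 12345 5 1 <sha1-digest>
c.fedoraproject.org. IN DS 12345 5 2 <sha256-digest>
....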
If you add a new child zone, such as c.fedoraproject.org or
vpn.fedoraproject.org you will also need to add the contents of
dsset-childzone.fedoraproject.org (for example), to the main
fedoraproject.org zonefile, so that DNSSEC has a valid trust path to
that zone.
You also must set the NS delegation entries near the top of the
fedoraproject.org zone file; these are necessary to keep dnssec-signzone
from whining with this error msg:
....
dnssec-signzone: fatal: 'xxxxx.example.com': found DS RRset without NS RRset
....
Look for the: "vpn IN NS" records at the top of fedoraproject.org and
copy them for the new child zone.
== GeoDNS
As part of our Content Distribution Network we use geodns for certain
zones. At the moment just `fedoraproject.org` and `*.fedoraproject.org`
zones. We've got proxy servers all over the US and in Europe. We are now
sending users to proxy servers that are near them. The current list of
available 'zone areas' are:
* DEFAULT
* EU
* NA
DEFAULT contains all the zones. So someone who does not seem to be in
or near the EU or NA would get directed to any random set. (South
Africa, for example, doesn't get directed to any particular server.)
[IMPORTANT]
.Important
====
Don't forget to increase the serial number in the fedoraproject.org zone
file. Even if you're making a change to one of the geodns IPs. There is
only one serial number for all setups and that serial number is in the
fedoraproject.org zone.
====
[NOTE]
.Non geodns fedoraproject.org IPs
====
If you're adding a server that is just in one location and isn't going
to get geodns balanced, just add that host to the fedoraproject.org
zone.
====
=== Adding and removing countries
Our setup actually requires us to specify which countries go to which
servers. To do this, simply edit the named.conf file in ansible. Below
is an example of what counts as "NA" (North America):
....
view "NA" {
match-clients { US; CA; MX; };
recursion no;
zone "fedoraproject.org" {
type master;
file "master/NA/fedoraproject.org.signed";
};
include "etc/zones.conf";
};
....
=== IP Country Mapping
The IP -> Location mapping is done via a config file that exists on the
dns servers themselves (it's not ansible controlled). The file, located
at `/var/named/chroot/etc/GeoIP.acl` is generated by the `GeoIP.sh`
script (that script is in ansible).
[WARNING]
.Warning
====
This is known to be a less efficient means of doing geodns than the
patched version from kernel.org. We're using this version at the moment
because it's in Fedora and works. The level of DNS traffic we see is
generally low enough that the inefficiencies aren't that noticed. For
example, average load on the servers before this geodns was .2, now it's
around .4
====
== resolv.conf
In order to make the network more transparent to the admins, we do a lot
of search based relative names. Below is a list of what a resolv.conf
should look like.
[IMPORTANT]
.Important
====
Any machine that is not on our vpn or has not yet joined the vpn should
_NOT_ have the vpn.fedoraproject.org search until after it has been
added to the vpn (if it ever does).
====
Phoenix::
....
search phx2.fedoraproject.org vpn.fedoraproject.org fedoraproject.org
....
Phoenix in the QA network:::
....
search qa.fedoraproject.org vpn.fedoraproject.org phx2.fedoraproject.org fedoraproject.org
....
Non-Phoenix::
....
search vpn.fedoraproject.org fedoraproject.org
....
The idea here is that we can, when need be, set up local domains to
contact instead of having to go over the VPN directly, but still have
sane configs. For example, if we tell the proxy server to hit "app1" and
that box is in PHX, it will go directly to app1; if it's not, it will go
over the vpn to app1.
= docs SOP
____
Fedora Documentation - Documentation for installing and using Fedora
____
== Contact Information
Owner:::
docs, Fedora Infrastructure Team
Contact:::
#fedora-docs
Servers:::
proxy*
Purpose:::
Provide documentation for users and contributors.
== Description:
The Fedora Documentation Project was created to provide documentation
for fedora users and contributors. It's like "The Bible" for using
Fedora and other software used by the Fedora Project. It uses Publican,
a free and open-source publishing tool. Publican generates html pages
from content in DocBook XML format. The source files are in a git repo
and publican builds html files from these source files whenever changes
are made. As these are static pages, they are available on all the
proxy servers, which serve our requests for docs.fedoraproject.org.
== Updates process:
The fedora docs writers update and build their docs and then push the
completed output into a git repo. This git repo is then pulled by each
of the Fedora proxies and served as static content.
Note that docs is talking about setting up a new process, this SOP needs
updating when that happens.
== Reporting bugs:
Bugs can be reported at the Fedora Documentation's Bugzilla. Here's the
link:
https://bugzilla.redhat.com/enter_bug.cgi?product=Fedora%20Documentation
Errors or problems in the wiki can be modified by anyone with a FAS
account.
== Contributing to the Fedora Documentation Project:
If you find the existing documentation insufficient or outdated or any
particular page is not available in your language feel free to improve
the documentation by contributing to Fedora Documentation Project. You
can find more details here:
https://fedoraproject.org/wiki/Join_the_Docs_Project
Translation of documentation is taken care by the Fedora Localization
Project aka L10N. More details can be found at:
https://fedoraproject.org/wiki/L10N
== Publican wiki:
More details about Publican can be found at the publican wiki here:
https://sourceware.org/publican/en-US/index.html
= Fedora Account System
Notes about FAS and how to do things in it:
* Where are the certs for fas accounts for koji, etc? On fas01 in
/var/lib/fedora-ca - makefile targets allow you to do things with them.
Look in index.txt for certs. Ones marked with an 'R' in the left-most
column are 'REVOKED'.
To revoke a cert:
....
cd /var/lib/fedora-ca
....
Find the cert number in index.txt - the number is the 3rd column in the
file - you can match it to the user by searching for their username. You
want the highest number cert for their account.
Once you have the number, run (as root or fas):
....
make revoke cert=newcerts/$that_number.pem
....
== How to gather information about a user
You'll want to have direct access to query the database for this. The
common way is to have someone in sysadmin-db ssh to the postgres db
hosting FAS (currently db01). Then access it via ident auth on the box:
....
sudo -u postgres psql fas2
....
There are several tables that will have information about a user. Some
of it is redundant, but it's good to check all the sources; there
shouldn't be inconsistencies:
....
select * from people where username = 'USERNAME';
....
Of interest here are:
id::
for later queries
password_changed::
tells when the password was last changed
last_seen::
last login to fas (including through jsonfas from other TG1/2 apps.
Maybe wiki and insight as well. Not fedorahosted trac, shell login,
etc)
status_change::
last time that the user's status was updated via the website. Usually
triggered when the user was marked inactive for a mass password change
and then they reset their password.
Next table is the log table:
....
select * from log where author_id = ID_FROM_PREV_QUERY or description ~ '.*USERNAME.*';
....
The FAS writes certain events to the log table. This will get those
events. We use both the author_id field (who made the change) and the
username in a description regex search because a few changes are made to
users by admins. Fields of interest are pretty self explanatory here:
changetime::
when the log was made
description::
description of the event that's being logged
[NOTE]
.Note
====
FAS does not log every event that happens to a user. Only "important"
ones. FAS also cannot record direct changes to the database here (for
instance, when we mark accounts inactive administratively via the db).
====
Lastly, there are the groups and person_roles tables. When a user joins
a group, the person_roles table is updated to reflect the user's status
in the group, when they applied, and when they were approved:
....
select groups.name, person_roles.* from person_roles, groups where person_id = ID_FROM_INITIAL_QUERY and groups.id = person_roles.group_id;
....
This will give you the following fields to pay attention to:
name::
Name of the group
role_status::
If this is unapproved, it just means the user applied for it. If it is
approved, it means they are actually in the group.
creation::
When the user applied to the group
approval::
When the user was approved to be in the group
role_type::
What role the person has or wants to have in the group
sponsor_id::
If you suspect something is suspicious with one of the roles, you may
want to ask the sponsor if they remember sponsoring this person
== Account Deletion and renaming
[NOTE]
.Note
====
See also accountdeletion.rst for information on how to disable, rename,
and remove accounts.
====
== Pseudo Users
[NOTE]
.Note
====
See also nonhumanaccounts.rst for information on creating pseudo user
accounts for use in pkgdb/bugzilla.
====
== fas staging
We have a staging fas db set up on db-fas01.stg.phx2.fedoraproject.org -
it is accessed by fas01.stg.phx2.fedoraproject.org.
This system is not autopopulated from production fas - it must be done
manually. To do this you must:
* dump the fas2 db on db-fas01.phx2.fedoraproject.org:
+
....
sudo -u postgres pg_dump -C fas2 > fas2.dump
scp fas2.dump db-fas01.stg.phx2.fedoraproject.org:/tmp
....
* then on fas01.stg.phx2.fedoraproject.org:
+
....
/etc/init.d/httpd stop
....
* then on db02.stg.phx2.fedoraproject.org:
+
....
echo "drop database fas2\;" | sudo -u postgres psql ; cat fas2.dump | sudo -u postgres psql
....
* then on fas01.stg.phx2.fedoraproject.org:
+
....
/etc/init.d/httpd start
....
that should do it.
= FAS-OpenID
FAS-OpenID is the OpenID server of Fedora infrastructure.
Live instance is at https://id.fedoraproject.org/ Staging instance is at
https://id.dev.fedoraproject.org/
== Contact Information
Owner::
Patrick Uiterwijk (puiterwijk)
Contact::
#fedora-admin, #fedora-apps, #fedora-noc
Location::
openid0\{1,2}.phx2.fedoraproject.org openid01.stg.fedoraproject.org
Purpose::
Authentication & Authorization
== Trusted roots
FAS-OpenID has a set of "trusted roots", which contains websites which
are always trusted, and thus FAS-OpenID will not show the Approve/Reject
form to the user when they login to any such site.
As a policy, we will only add websites to this list which Fedora
Infrastructure controls. If anyone ever asks to add a website to this
list, just answer with this default message:
....
We only add websites we (Fedora Infrastructure) maintain to this list.
This feature was put in because it wouldn't make sense to ask for permission
to send data to the same set of servers that it already came from.
Also, if we were to add external websites, we would need to judge their
privacy policy etc.
Also, people might start complaining that we added site X but not their site,
maybe causing us "political" issues later down the road.
As a result, we do NOT add external websites.
....
= fedmsg (Fedora Messaging) Certs, Keys, and CA - SOP
X509 certs, private RSA keys, Certificate Authority, and Certificate
Revocation List.
== Contact Information
Owner::
Messaging SIG, Fedora Infrastructure Team
Contact::
#fedora-admin, #fedora-apps, #fedora-noc
Servers::
* app0[1-7]
* packages0[1-2]
* fas0[1-3]
* pkgs01
* busgateway01,
* value0\{1,3}
* releng0\{1,4}
* relepel03
Purpose::
Certify fedmsg messages come from authentic sources.
== Description
fedmsg sends JSON-encoded messages from many services to a zeromq
messaging bus. We're not concerned with encrypting the messages, only
with signing them so an attacker cannot spoof.
Every instance of each service on each host has its own cert and private
key, signed by the CA. By convention, we name the certs
<service>-<fqdn>.\{crt,key}. For instance, bodhi has the following certs:
* bodhi-app01.phx2.fedoraproject.org
* bodhi-app02.phx2.fedoraproject.org
* bodhi-app03.phx2.fedoraproject.org
* bodhi-app01.stg.phx2.fedoraproject.org
* bodhi-app02.stg.phx2.fedoraproject.org
* more
Scripts to generate new keys, sign them, and revoke them live in the
ansible repo in `ansible/roles/fedmsg/files/cert-tools/`. The keys and
certs themselves (including ca.crt and the CRL) live in the private repo
in `private/fedmsg-certs/keys/`
fedmsg is locally configured to find the key it needs by looking in
`/etc/fedmsg.d/ssl.py` which is kept in ansible in
`ansible/roles/fedmsg/templates/fedmsg.d/ssl.py.erb`.
Each service-host has its own key. This means:
* A key is not shared across multiple instances of a service on
different machines. i.e., bodhi on app01 and bodhi on app02 should have
different key/cert pairs.
* A key is not shared across multiple services on a host. i.e.,
mediawiki on app01 and bodhi on app01 should have different key/cert
pairs.
The attempt here is to minimize the number of potential attack vectors.
Each private key should be readable only by the service that needs it.
bodhi runs under mod_wsgi in apache and should run as its own unique
bodhi user (not as apache). The permissions for its private key, when
deployed by ansible, should be read-only for that local bodhi user.
For more information on how fedmsg uses these certs see
http://fedmsg.readthedocs.org/en/latest/crypto.html
== Configuring the Scripts
Usage of the main scripts is described in more detail below. They are
located in `ansible/roles/fedmsg/files/cert-tools`.
Before you use them, you'll need to point them at the right directory to
modify. By default, this is `~/private/fedmsg-certs/keys/`. You can
change that by editing `ansible/roles/fedmsg/files/cert-tools/vars` in
the event that you have the private repo checked out to an alternate
location.
There are other configuration values defined in that script. Most will
not need to be changed.
== Wiping and Rebuilding Everything
There is a script in `ansible/roles/fedmsg/files/cert-tools/` named
`rebuild-all-fedmsg-certs`. You can run it with no arguments to wipe out
the old and generate a new CA root certificate, a signing cert and key,
and all key/cert pairs for all service-hosts.
[NOTE]
.Note
====
Warning -- Obviously, this will wipe everything. Do you want that?
====
== Adding a new key for a new service-host
First, checkout the ansible private repo as that's where the keys are
going to be stored. The scripts will assume this is checked out to
~/private.
In `ansible/roles/fedmsg/files/cert-tools` run:
....
$ source ./vars
$ ./build-and-sign-key <service>-<fqdn>
....
For instance, if we bring up a new app host,
app10.phx2.fedoraproject.org, we'll need to generate a new cert/key pair
for each fedmsg-enabled service that will be running on it, so you'd
run:
....
$ source ./vars
$ ./build-and-sign-key shell-app10.phx2.fedoraproject.org
$ ./build-and-sign-key bodhi-app10.phx2.fedoraproject.org
$ ./build-and-sign-key mediawiki-app10.phx2.fedoraproject.org
....
Just creating the keys isn't quite enough, there are four more things
you'll need to do.
The private keys are created in your checkout of the private repo under
~/private/private/fedmsg-certs/keys. There will be four files for each
cert you created: <hexdigits>.pem (ex: 5B.pem) and
<service>-<fqdn>.\{crt,csr,key}. git add, commit, and push all of those.
Second, you need to edit
`ansible/roles/fedmsg/files/cert-tools/rebuild-all-fedmsg-certs` and add
the argument of the commands you just ran, so that next time certs need
to be blown away and recreated, the new service-hosts will be included.
For the examples above, you would need to add to the list:
....
shell-app10.phx2.fedoraproject.org
bodhi-app10.phx2.fedoraproject.org
mediawiki-app10.phx2.fedoraproject.org
....
You need to ensure that the keys are distributed to the host with the
proper permissions. Only the bodhi user should be able to access bodhi's
private key. This can be accomplished by using the `fedmsg::certificate`
in ansible. It should distribute your new keys to the correct hosts and
correctly permission them.
Lastly, if you haven't already updated the global fedmsg config, you'll
need to. You need to add your new service-node to `fedmsg.d/endpoint.py`
and to `fedmsg.d/ssl.py`. Those can be found in
`ansible/roles/fedmsg/templates/fedmsg.d`. See
http://fedmsg.readthedocs.org/en/latest/config.html for more information
on the layout and meaning of those files.
== Revoking a key
In `ansible/roles/fedmsg/files/cert-tools` run:
....
$ source ./vars
$ ./revoke-full <service>-<fqdn>
....
This will alter `private/fedmsg-certs/keys/crl.pem` which should be
picked up and served publicly, and then consumed by all fedmsg consumers
globally.
`crl.pem` is publicly available at
http://fedoraproject.org/fedmsg/crl.pem
[NOTE]
.Note
====
Even though crl.pem lives in the private repo, we're just keeping it
there for convenience. It really _should_ be served publicly, so don't
panic. :)
====
[NOTE]
.Note
====
At the time of this writing, the CRL is not actually used. I need one
publicly available first so we can test it out.
====
= fedmsg-gateway SOP
Outgoing raw ZeroMQ message stream.
[NOTE]
.Note
====
see also: fedmsg-websocket
====
== Contact Information
Owner:::
Messaging SIG, Fedora Infrastructure Team
Contact:::
#fedora-apps, #fedora-admin, #fedora-noc
Servers:::
busgateway01, proxy0*
Purpose:::
Expose raw ZeroMQ messages outside the FI environment.
== Description
Users outside of Fedora Infrastructure can listen to the production
message bus by connecting to specific addresses. This is required for
local users to run their own hubs and message processors ("Consumers").
It is also required for user-facing tools like fedmsg-notify to work.
The specific public endpoints are:
production::
tcp://hub.fedoraproject.org:9940
staging::
tcp://stg.fedoraproject.org:9940
fedmsg-gateway, the daemon running on busgateway01, is listening to the
FI production fedmsg bus and will relay every message that it receives
out to a special ZMQ pub endpoint bound to port 9940. haproxy mediates
connections to the fedmsg-gateway daemon.
== Connection Flow
Clients connecting through haproxy on proxy0*:9940 are redirected to
busgateway0*:9940. This can be found in the haproxy.cfg entry for
`listen fedmsg-raw-zmq 0.0.0.0:9940`.
This is different than the apache reverse proxy pass setup we have for
the app0* and packages0* machines. _That_ flow looks something like
this:
....
Client -> apache(proxy01) -> haproxy(proxy01) -> apache(app01)
....
The flow for the raw zmq stream provided by fedmsg-gateway looks
something like this:
....
Client -> haproxy(proxy01) -> fedmsg-gateway(busgateway01)
....
haproxy is listening on a public port.
At the time of this writing, haproxy does not actually load balance
zeromq session requests across multiple busgateway0* machines, but there
is nothing stopping us from adding them. New hosts can be added in
ansible and pressed from busgateway01's template. Add them to the
fedmsg-raw-zmq listen in haproxy's config and it should Just Work.
== Increasing the Maximum Number of Concurrent Connections
HTTP requests are typically very short (a few seconds at most). This
means that the number of concurrent tcp connections we require for most
of our services is quite low (1024 is overkill). ZeroMQ tcp connections,
on the other hand, are expected to live for quite a long time.
Consequently we needed to scale up the number of possible concurrent tcp
connections.
All of this is in ansible and should be handled for us automatically if
we bring up new nodes.
* The pam_limits user limit for the fedmsg user was increased from 1024
to 160000 on busgateway01.
* The pam_limits user limit for the haproxy user was increased from 1024
to 160000 on the proxy0* machines.
* The zeromq High Water Mark (HWM) was increased to 160000 on
busgateway01.
* The maximum number of connections allowed was increased in
haproxy.cfg.
== Nagios
New nagios checks were added for this that check to see if the number of
concurrent connections through haproxy is approaching the maximum number
allowed.
You can check these numbers by hand by inspecting the haproxy web
interface: https://admin.fedoraproject.org/haproxy/proxy1#fedmsg-raw-zmq
Look at the "Sessions" section. "Cur" is the current number of sessions
versus "Max", the maximum number seen at the same time and "Limit", the
maximum number of concurrent connections allowed.
== RHIT
We had RHIT open up port 9940 special to proxy01.phx2 for this.
= fedmsg introduction and basics, SOP
General information about fedmsg
== Contact Information
Owner::
Messaging SIG, Fedora Infrastructure Team
Contact::
#fedora-apps, #fedora-admin, #fedora-noc
Servers::
Almost all of them.
Purpose::
Introduce sysadmins to fedmsg tools and config
== Description
fedmsg is a system that links together most of our webapps and services
into a message mesh or net (often called a "bus"). It is built on top of
the zeromq messaging library.
fedmsg has its own developer documentation that is a good place to check
if this or other SOPs don't provide enough information -
http://fedmsg.rtfd.org
== Tools
Generally, fedmsg-tail and fedmsg-logger are the two most commonly used
tools for debugging and testing. To see if bus-connectivity exists
between two machines, log onto each of them and run the following on the
first:
....
$ echo testing from $(hostname) | fedmsg-logger
....
And run the following on the second:
....
$ fedmsg-tail --really-pretty
....
== Configuration
fedmsg configuration lives in /etc/fedmsg.d/
`/etc/fedmsg.d/endpoints.py` keeps the list of every possible fedmsg
endpoint. It acts as a global index that defines the bus.
See fedmsg.readthedocs.org/en/latest/config/ for a full glossary of
configuration values.
== Logs
fedmsg daemons keep their logs in /var/log/fedmsg. fedmsg message hooks
in existing apps (like bodhi) will log any errors to the logs of the app
they've been added to (like /var/log/httpd/error_log).
= fedmsg-irc SOP
____
Echo fedmsg bus activity to IRC.
____
== Contact Information
Owner::
Messaging SIG, Fedora Infrastructure Team
Contact::
#fedora-apps, #fedora-fedmsg, #fedora-admin, #fedora-noc
Servers::
value03
Purpose::
Echo fedmsg bus activity to IRC
== Description
fedmsg-irc is a daemon running on value03 and value01.stg. It is
listening to the fedmsg bus and echoing that activity to the
#fedora-fedmsg channel in IRC.
It can be configured to ignore certain messages, join certain rooms, and
take on a different nick by editing the values in `/etc/fedmsg.d/irc.py`
and restarting it with `sudo service fedmsg-irc restart`
See http://fedmsg.readthedocs.org/en/latest/config/#term-irc for more
information on configuration.
= Adding a new fedmsg message type
== Instrumenting the program
First, figure out how you're going to publish the message. Is it from a
shell script or from a long-running process?
If it's from a shell script, you just need to add a
[.title-ref]#fedmsg-logger# statement to the script. Remember to set the
[.title-ref]#--modname# and [.title-ref]#--topic# for your new message's
fully-qualified topic.
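A hedged example - the modname, topic and message body here are made up
for illustration:
....
$ echo '{"repo": "rawhide", "success": true}' | \
    fedmsg-logger --modname compose --topic done --json-input
....
On production this would publish a message with a fully-qualified topic
like org.fedoraproject.prod.compose.done.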
If it's from a python process, you just need to add a
`fedmsg.publish(..)` call. The same concerns about modname and topic
apply here.
If this is a short-lived python process, you'll want to add
[.title-ref]#active=True# to the call to `fedmsg.publish(..)`. This will
make the fedmsg lib "actively" reach out to our fedmsg-relay running on
busgateway01.
If it is a long-running python process (like a WSGI thread), then you
don't need to pass any extra arguments. You don't want it to reach out
to the fedmsg-relay if possible. Your process will require that some
"endpoints" are created for it in `/etc/fedmsg.d/`. More on that below.
== Supporting infrastructure
You need to make sure that the machine this is running on has a cert and
key that can be read by the program to sign its message. If you don't
have a cert already, then you need to create it in the private repo. Ask
a sysadmin-main member.
Then you need to declare those certs in the [.title-ref]#fedmsg_certs#
data structure stored typically in our ansible `group_vars/` for this
service. Declare both the name of the cert, what group and user it
should be owned by, and in the `can_send:` section, declare the list of
topics that this cert should be allowed to publish.
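The exact layout is best copied from an existing service's group_vars;
as a rough, assumed sketch (field names taken from the description
above, values hypothetical):
....
fedmsg_certs:
- service: raffle
  owner: root
  group: raffle
  can_send:
  - raffle.entry.new
....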
If this is a long-running python process that is _not_ passing
[.title-ref]#active=True# to the call to
[.title-ref]#fedmsg.publish(..)#, then you have to also declare
endpoints for it. You do that by specifying the `fedmsg_wsgi_procs` and
`fedmsg_wsgi_vars` in the `group_vars` for your service. The iptables
rules and fedmsg endpoints should be automatically created for you on
the next playbook run.
== Supporting code
At this point, you can push the change out to production and be
publishing messages "okay". Everything should be fine.
However, your message will show up blank in datagrepper, in IRC, and in
FMN, and everywhere else we try to render it. You _must_ then follow up
and write a new [.title-ref]#Processor# for it in the fedmsg_meta
library we maintain:
https://github.com/fedora-infra/fedmsg_meta_fedora_infrastructure
You also _must_ write a test case for it there. The docs listing all
topics we publish at http://fedora-fedmsg.rtfd.org/ is automatically
generated from the test suite. Please don't forget this.
Lastly, you should cut a release of fedmsg_meta and deploy it using the
[.title-ref]#playbooks/manual/upgrade/fedmsg.yml# playbook, which should
update all the relevant hosts.
== Corner cases
If the process publishing the new message lives _outside_ our main
network, you have to jump through more hoops. Look at abrt, koschei, and
copr for examples of how to configure this (you need a special firewall
rule, and they need to be configured to talk to our "inbound gateway"
running on the proxies).
= fedmsg-relay SOP
Bridge ephemeral scripts into the fedmsg bus.
== Contact Information
Owner::
Messaging SIG, Fedora Infrastructure Team
Contact::
#fedora-apps, #fedora-admin, #fedora-noc
Servers::
app01
Purpose::
Bridge ephemeral bash and python scripts into the fedmsg bus.
== Description
fedmsg-relay is running on app01, which is a bad choice. We should look
to move it to a more isolated place in the future. busgateway01 would be
a better choice.
"Ephemeral" scripts like `pkgdb2branch.py`, the post-receive git hook on
pkgs01, and anywhere fedmsg-logger is used all depend on fedmsg-relay.
Instead of emitting messages "directly" to the rest of the bus, they use
fedmsg-relay as an intermediary.
Check that fedmsg-relay is running by looking for it in the process
list. You can restart it in the standard way with
`sudo service fedmsg-relay restart`. Check for its logs in
`/var/log/fedmsg/fedmsg-relay.log`.
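A quick health check on the relay host therefore looks something like this:
....
pgrep -af fedmsg-relay                  # is the daemon running?
sudo service fedmsg-relay restart       # restart it if needed
sudo tail -f /var/log/fedmsg/fedmsg-relay.log
....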
Ephemeral scripts know where the fedmsg-relay is by looking for the
relay_inbound and relay_outbound values in the global fedmsg config.
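To see which endpoints the scripts will use, you can dump the merged config
and look for those keys, for example:
....
sudo fedmsg-config | grep -A2 'relay_inbound\|relay_outbound'
....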
== But What is it Doing? And Why?
The fedmsg bus is designed to be "passive" in its normal operation. A
mod_wsgi process under httpd sets up its fedmsg publisher socket to
passively emit messages on a certain port. When some other service wants
to receive these messages, it is up to that service to know where
mod_wsgi is emitting and to actively connect there. In this way,
emitting is passive and listening is active.
We get a problem when we have a one-off or "ephemeral" script that is
not a long-running process -- a script like pkgdb2branch that runs when a
user invokes it and ends shortly after. Listeners who want these scripts'
messages will find that they are usually not available when they try to
connect.
To solve this problem, we introduced the "fedmsg-relay" daemon which is
a kind of "passive"-to-"passive" adaptor. It binds to an outbound port
on one end where it will publish messages (like normal) but it also
binds to another port where it listens passively for inbound
messages. Ephemeral scripts then actively connect to the passive inbound
port of the fedmsg-relay to have their payloads echoed on the
bus-proper.
See http://fedmsg.readthedocs.org/en/latest/topology/ for a diagram.

View file

@ -0,0 +1,70 @@
= websocket SOP
websocket communication with Fedora apps.
see-also: `fedmsg-gateway.txt`
== Contact Information
Owner::
Messaging SIG, Fedora Infrastructure Team
Contact::
#fedora-apps, #fedora-admin, #fedora-noc
Servers::
busgateway01, proxy0*, app0*
Purpose::
Expose a websocket server for FI apps to use
== Description
WebSocket is a protocol (an extension of HTTP/1.1) by which client web
browsers can establish full-duplex socket communications with a server
-- the "real-time web".
In our case, webapps served from app0* and packages0* will include
javascript code instructing client browsers to establish a second
connection to our WebSocket server. They point browsers to the following
addresses:
production::
wss://hub.fedoraproject.org:9939
staging::
wss://stg.fedoraproject.org:9939
The websocket server itself is a fedmsg-hub daemon running on
busgateway01. It is configured to enable its websocket server component
in the presence of certain configuration values.
haproxy mediates connections to the fedmsg-hub websocket server daemon.
An stunnel daemon provides SSL support.
== Connection Flow
The connection flow is much the same as in the fedmsg-gateway.txt SOP,
but is somewhat more complicated.
"Normal" HTTP requests to our app servers traverse the following chain:
....
Client -> apache(proxy01) -> haproxy(proxy01) -> apache(app01)
....
The flow for a websocket requests looks something like this:
....
Client -> stunnel(proxy01) -> haproxy(proxy01) -> fedmsg-hub(busgateway01)
....
stunnel is listening on a public port, negotiates the SSL connection,
and redirects the connection to haproxy who in turn hands it off to the
fedmsg-hub websocket server listening on busgateway01.
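A quick way to confirm the public endpoint is answering (and presenting a
certificate) is to open a TLS connection to it by hand, for example:
....
openssl s_client -connect hub.fedoraproject.org:9939 < /dev/null
....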
At the time of this writing, haproxy does not actually load balance
zeromq session requests across multiple busgateway0* machines, but there
is nothing stopping us from adding them. New hosts can be added in
ansible and pressed from busgateway01's template. Add them to the
fedmsg-websockets listen section in haproxy's config and it should Just Work.
== RHIT
We had RHIT open up port 9939 specifically to proxy01.phx2 for this.

View file

@ -0,0 +1,34 @@
= Fedocal SOP
Fedocal is a web-based group calendar application that is made available
to the various groups within the Fedora project.
== Contents
[arabic]
. Contact Information
. Documentation Links
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Location::
https://apps.fedoraproject.org/calendar
Servers::
Purpose::
To provide links to the documentation for fedocal as it exists
elsewhere on the internet; it was decided that a link document
would be a better use of resources than rewriting the book.
== Documentation Links
For information on the latest and greatest in fedocal please review:
http://fedocal.readthedocs.org/en/latest/
For documentation on the usage of fedocal please consult:
http://fedocal.readthedocs.org/en/latest/usage.html

View file

@ -0,0 +1,360 @@
= Fedora Release Infrastructure SOP
This SOP contains all of the steps required by the Fedora Infrastructure
team in order to get a release out. Much of this work overlaps with the
Release Engineering team (and at present share many of the same
members). Some work may get done by releng, some may get done by
Infrastructure, as long as it gets done, it doesn't matter.
== Contact Information
Owner:::
Fedora Infrastructure Team, Fedora Release Engineering Team
Contact:::
#fedora-admin, #fedora-releng, sysadmin-main, sysadmin-releng
Location:::
N/A
Servers:::
All
Purpose:::
Releasing a new version of Fedora
== Preparations
Before a release ships, the following items need to be completed.
[arabic]
. New website from the websites team (typically hosted at
http://getfedora.org/_/)
. Verify mirror space (for all test releases as well)
. Verify with rel-eng that permissions on content are right on the mirrors.
Don't leak.
. Communication with Red Hat IS (give at least 2 months notice, then
reminders as the time comes near) (final release only)
. Infrastructure change freeze
. Modify Template:FedoraVersion to reference new version. (Final release
only)
. Move old releases to archive (post final release only)
. Switch release from development/N to normal releases/N/ tree in
mirrormanager (post final release only)
== Change Freeze
The rules are simple:
* Hosts with the ansible variable "freezes" set to "True" are frozen.
* You may make changes as normal on hosts that are not frozen. (For
example, staging is never frozen)
* Changes to frozen hosts require a freeze break request sent to the
fedora infrastructure list, containing a description of the problem or
issue, actions to be taken and (if possible) patches to ansible that
will be applied. These freeze breaks must then get two approvals from
sysadmin-main or sysadmin-releng group members before being applied.
* Changes to recover from outages are acceptable to frozen hosts if
needed.
Change freezes will be sent to the fedora-infrastructure-list and begin
3 weeks before each release and the final release. The freeze will end
one day after the release. Note, if the release slips during a change
freeze, the freeze just extends until the day after a release ships.
You can get a list of frozen/non-frozen hosts by:
....
git clone https://pagure.io/fedora-infra/ansible.git
scripts/freezelist -i inventory
....
== Notes about release day
Release day is always an interesting and unique event. After the final
sprint from test to the final release a lot of the developers will be
looking forward to a bit of time away, as well as some sleep. Once
Release Engineering has built the final tree, and synced it to the
mirrors it is our job to make sure everything else (except the bit flip)
gets done as painlessly and easily as possible.
[NOTE]
.Note
====
All communication is typically done in #fedora-admin. Typically these
channels are laid back and staying on topic isn't strictly enforced. On
release day this is not true. We encourage people to come, stay in the
room and be quiet unless they have a specific task or question related
to release day. It's nothing personal, but release day can get out of
hand quickly.
====
During normal load, our websites function as normal. This is
especially true since we've moved the wiki to mod_fcgi. On release day
our load spikes a great deal. During the Fedora 6 launch many services
were offline for hours. Some (like the docs) were off for days. A large
part of this outage was due to the wiki not being able to handle the
load, part was a lack of planning by the Infrastructure team, and part
is still a mystery. There are questions as to whether or not all of the
traffic was legit or a ddos.
The Fedora 7 release went much better. Some services were offline for
minutes at a time but very little of it was out longer than that. The
wiki crashed, as it always does. We had made sure to make the
fedoraproject.org landing page static though. This helped a great deal
though we did see load on the proxy boxes as spiky.
Recent releases have been quite smooth due to a number of changes: we
have a good deal more bandwidth on the master mirrors, more CPUs and memory,
and prerelease versions are much easier to come by for those
interested before release day.
== Day Prior to Release Day
=== Step 1 (Torrent)
Set up the torrent. All files can be synced with the torrent box but just
not published to the world. Verify with sha1sum. Follow the instructions
on the torrentrelease.txt sop up to and including step 4.
=== Step 2 (Website)
Verify the website design / content has been finalized with the websites
team. Update the Fedora version number wiki template if this is a final
release. It will need to be changed in
https://fedoraproject.org/wiki/Template:CurrentFedoraVersion
Additionally, there are redirects in the ansible
playbooks/include/proxies-redirects.yml file for Cloud Images. These
should be pushed as soon as the content is available. See:
https://pagure.io/fedora-infrastructure/issue/3866 for example
=== Step 3 (Mirrors)
Verify enough mirrors are setup and have Fedora ready for release. If
for some reason something is broken it needs to be fixed. Many of the
mirrors are running a check-in script. This lets us know who has Fedora
without having to scan everyone. Hide the Alpha, Beta, and Preview
releases from the publiclist page.
You can check this by looking at:
....
wget "http://mirrors.fedoraproject.org/mirrorlist?path=pub/fedora/linux/releases/test/28-Beta&country=global"
(replace 28 and Beta with the version and release.)
....
== Release day
=== Step 1 (Prep and wait)
Verify the mirrors are ready and that the torrent has valid copies of
its files (use sha1sum)
Do not move on to step two until the Release Engineering team has given
the ok for the release. It is the releng team's decision as to whether
or not we release and they may pull the plug at any moment.
=== Step 2 (Torrent)
Once given the ok to release, the Infrastructure team should publish the
torrent and encourage people to seed. Complete the steps on the
https://fedora-infra-docs.readthedocs.io/en/latest/sysadmin-guide/sops/torrentrelease.html
after step 4.
=== Step 3 (Bit flip)
The mirrors sit and wait for a single permissions bit to be altered so
that they show up to their services. The bit flip (done by the releng
team) will replicate out to the mirrors. Verify that the mirrors have
received the change by seeing if it is actually available, just use a
spot check. Once that is complete move on.
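A spot check can be as simple as asking mirrormanager for mirrors carrying
the release path and requesting the top of the tree from the master mirror
(the release number below is only an example):
....
wget -q -O- "https://mirrors.fedoraproject.org/mirrorlist?path=pub/fedora/linux/releases/34&country=global" | head
curl -I "http://dl.fedoraproject.org/pub/fedora/linux/releases/34/"
....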
=== Step 4 (Website)
Once all of the distribution pieces are verified (mirrors and torrent),
all that is left is to publish the website. At present this is done by
making sure the master branch of fedora-web is pulled by the
syncStatic.sh script in ansible. It will sync in an hour normally but on
release day people don't like to wait that long, so do the following on
sundries01
____
sudo -u apache /usr/local/bin/lock-wrapper syncStatic 'sh -x
/usr/local/bin/syncStatic'
____
Once that completes, on batcave01:
....
sudo -i ansible proxy\* "/usr/bin/rsync --delete -a --no-owner --no-group bapp02::getfedora.org/ /srv/web/getfedora.org/"
....
Verify http://getfedora.org/ is working.
=== Step 5 (Docs)
Just as with the website, the docs site needs to be published. Follow
these steps:
....
/root/bin/docs-sync
....
=== Step 6 (Monitor)
Once the website is live, keep an eye on various news sites for the
release announcement. Closely watch the load on all of the boxes, proxy,
application and otherwise. If something is getting overloaded, see
suggestions on this page in the "Juggling Resources" section.
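One way to keep an eye on load (run from batcave01, like the other ad-hoc
ansible commands in this SOP) is a periodic uptime sweep; the host pattern
below is only an example:
....
sudo -i ansible 'proxy*:app*' -m shell -a 'uptime'
....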
=== Step 7 (Badges) (final release only)
We have some badge rules that are dependent on which release of Fedora
we're on. As you have time, please perform the following on your local
box:
....
$ git clone ssh://git@pagure.io/fedora-badges.git
$ cd badges
....
Edit `rules/tester-it-still-works.yml` and update the release tag to
match the now old but stable release. For instance, if we just released
fc21, then the tag in that badge rule should be fc20.
Edit `rules/tester-you-can-pry-it-from-my-cold-dead-hands.yml` and
update the release tag to match the release that is about to reach EOL.
For instance, if we just released f28, then the tag in that badge rule
should be f26. Commit the changes:
....
$ git commit -a -m 'Updated tester badge rule for f28 release.'
$ git push origin master
....
Then, on batcave, perform the following:
....
$ sudo -i ansible-playbook $(pwd)/playbooks/manual/push-badges.yml
....
=== Step 8 (Done)
Just chill, keep an eye on everything and make changes as needed. If you
can't keep a service up, try to redirect randomly to some of the
mirrors.
== Priorities
Priorities during release day (in order):
[arabic]
. {blank}
+
Website::
Anything related to a user landing at fedoraproject.org, and clicking
through to a mirror or torrent to download something must be kept up.
This is distribution, and without it we can potentially lose many
users.
. {blank}
+
Linked addresses::
We do not have direct control over what Hacker News, Phoronix or
anyone else links to. If they link to something on the wiki and it is
going down, or to any other site we control, a rewrite should be
put in place to direct them to http://fedoraproject.org/get-fedora.
. {blank}
+
Torrent::
The torrent server has never had problems during a release. Make sure
it is up.
. {blank}
+
Release Notes::
Typically grouped with the docs site, the release notes are often
linked to (this is fine, no need to redirect) but keep an eye on the
logs and ensure that where we've said the release notes are, that they
can be found there. In previous releases we sometimes had to make this
available in more than one spot.
. {blank}
+
docs.fedoraproject.org::
People will want to see what's new in Fedora and get further
documentation about it. Much of this is in the release notes.
. {blank}
+
wiki::
Because it is so resource heavy, and because it is so developer
oriented we have no choice but to give the wiki a lower priority.
. Everything else.
== Juggling Resources
In our environment we're running different things on many different
servers. Using Xen we can easily give machines more or less RAM and
processors. We can take down builders and bring up application servers.
The trick is to be smart and make sure you understand what is causing
the problem. These are some tips to keep in mind:
* IPTables based bandwidth and connection limiting (successful in the
past)
* Altering the weight on the proxy balancers
* Create static pages out of otherwise dynamic content
* Redirect pages to a mirror
* Add a server / remove un-needed servers
== CHECKLISTS:
=== Beta:
* Announce infrastructure freeze 3 weeks before Beta
* Change /topic in #fedora-admin
* mail infrastructure list a reminder.
* File all tickets
* new website
* check mirror permissions, mirrormanager, check mirror sizes, release
day ticket.
After release is a "go":
* Make sure torrents are setup and ready to go.
* fedora-web needs a branch for fN-beta. In it:
* Beta used on get-prerelease
* get-prerelease doesn't direct to release
* verify is updated with Beta info
* releases.txt gets a branched entry for preupgrade
* bfo gets updated to have a Beta entry.
After release:
* Update /topic in #fedora-admin
* post to infrastructure list that freeze is over.
=== Final:
* Announce infrastructure freeze 2 weeks before Final
* Change /topic in #fedora-admin
* mail infrastructure list a reminder.
* File all tickets
* new website, check mirror permissions, mirrormanager, check
mirror sizes, release day ticket.
After release is a "go":
* Make sure torrents are setup and ready to go.
* fedora-web needs a branch for fN. In it:
* get-prerelease does direct to release
* verify is updated with Final info
* bfo gets updated to have a Final entry.
* update wiki version numbers and names.
After release:
* Update /topic in #fedora-admin
* post to infrastructure list that freeze is over.
* Move MirrorManager repository tags from the development/$version/
Directory objects, to the releases/$version/ Directory objects. This is
done using the `move-devel-to-release --version=$version` command on
bapp02. This is usually done a week or two after release.

View file

@ -0,0 +1,108 @@
= Fedora Packages SOP
This SOP is for the Fedora Packages web application.
https://apps.fedoraproject.org/packages
== Contents
[arabic]
. Contact Information
. Deploying to the servers
. Maintenance
. Checking for AGPL violations
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, #fedora-apps
Persons::
cverna
Location::
PHX2
Servers::
packages03.phx2.fedoraproject.org packages04.phx2.fedoraproject.org
packages03.stg.phx2.fedoraproject.org
Purpose::
Web interface for searching packages information
== Deploying to the servers
=== Deploying
Once the new version is built, it needs to be deployed. To deploy the
new version, you need
https://fedora-infra-docs.readthedocs.io/en/latest/sysadmin-guide/sops/sshaccess.html[ssh
access] to batcave01.phx2.fedoraproject.org and
https://fedora-infra-docs.readthedocs.io/en/latest/sysadmin-guide/sops/ansible.html[permissions
to run the Ansible playbook].
All the following commands should be run from batcave01.
You can check the upstream documentation on how to build a new release.
This process results in a fedora-packages rpm available in the infra-tag
rpm repo.
You should make use of the staging instance in order to test the new
version of the application.
=== Upgrading
To upgrade, run the upgrade playbook:
....
$ sudo rbac-playbook manual/upgrade/packages.yml
....
This will upgrade the fedora-packages package and restart the Apache web
server and fedmsg-hub service.
=== Rebuild the xapian Database
If you need to rebuild the xapian database then you can run the
following playbook:
....
$ sudo rbac-playbook manual/rebuild/fedora-packages.yml
....
== Maintenance
The web application is served by httpd and managed by the httpd
service:
....
$ sudo systemctl restart httpd
....
can be used to restart the service if needed. The application log files
are available under the `/var/log/httpd/` directory.
The xapian database is updated by a fedmsg consumer. You can restart the
fedmsg-hub service if needed by using:
....
$ sudo systemctl restart fedmsg-hub
....
To check the consumer logs you can use:
....
$ sudo journalctl -u fedmsg-hub
....
== Checking for AGPL violations
To remain AGPL compliant, we must ensure that all modifications to the
code are made available in the SRPM that we link to in the footer of the
application. You can easily query our app servers to determine if any
AGPL-violating code modifications have been made to the package:
....
func-command --host="*app*" --host="community*" "rpm -V fedoracommunity"
....
You can safely ignore any changes to non-code files in the output. If
any violations are found, the Infrastructure Team should be notified
immediately.

View file

@ -0,0 +1,82 @@
= Fedora Pastebin SOP
[arabic]
. Contact Information
. Introduction
. Installation
. Dashboard
. Add a word to censored list
== 1. Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Persons::
athmane herlo
Sponsor::
nirik
Location::
phx2
Servers::
paste01.stg, paste01.dev
Purpose::
To host Fedora Pastebin
== 2. Introduction
Fedora pastebin is powered by sticky-notes which is included in EPEL.
Fedora theming (skin) is included in the ansible role.
== 3. Installation
Sticky-notes needs a MySQL db and a user with 'select, update, delete,
insert' privileges.
It's recommended to dump and import db from a working installation to
save time (skipping the installation and tweaking).
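A minimal sketch of that dump/import, assuming the database is called
sticky_notes on both hosts:
....
# on the working installation
mysqldump sticky_notes > sticky_notes.sql
# on the new installation (database and user created beforehand)
mysql sticky_notes < sticky_notes.sql
....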
By default the installation is locked, i.e. you can't relaunch it.
However, you can unlock the installation by commenting out the line
containing `$gsod->trigger` in `/etc/sticky-notes/install.php`, then
pointing the web browser to '/install'.
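For example, commenting that line out could be done with something like the
following (the exact match may need adjusting to the installed version):
....
sudo sed -i '/gsod->trigger/ s|^|// |' /etc/sticky-notes/install.php
....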
The configuration file containing general settings and DB credentials is
located in `/etc/sticky-notes/config.php`
== 4. Dashboard
Sticky-notes has a dashboard (URL: /admin/) that can be used to:
* {blank}
+
Manage pastes:::
** deleting paste
** getting information about the paste author (IP/Date/time etc...)
* Manage users (aka admins) which can log into the dashboard
* Manage IP Bans (add / delete banned IPs).
* Authentication (not needed)
* {blank}
+
Site configuration:::
** General configuration (included in config.php).
** Project Honey Pot configuration (not a FOSS service)
** Word censor configuration: a list of words to be censored in
pastes.
== 5. Add a word to censored list
If a word is in the censored list, any paste containing that word will be
rejected. To add one, edit the variable '$sg_censor' in the sticky-notes
configuration file:
....
$sg_censor = "WORD1
WORD2
...
...
WORDn";
....

View file

@ -0,0 +1,303 @@
= Websites Release SOP
____
* {blank}
[arabic]
. Preparing the website for a release
** 1.1 Obsolete GPG key of the EOL Fedora release
** 1.2 Update GPG key
*** 1.2.1 Steps
* {blank}
[arabic, start=2]
. Update website
** 2.1 For Alpha
** 2.2 For Beta
** 2.3 For GA
* {blank}
[arabic, start=3]
. Fire in the hole
* {blank}
[arabic, start=4]
. Tips
** 4.1 Merging branches
[arabic]
. Preparing the website for a new release cycle
____
1.1 Obsolete GPG key
One month after a Fedora release the release number 'FXX-2' (i.e. 1
month after F21 release, F19 will be EOL) will be EOL (End of Life). At
this point we should drop the GPG key from the list in verify/ and move
the keys to the obsolete keys page in keys/obsolete.html.
1.2 Update GPG key
After another couple of weeks and as the next release approaches, watch
the fedora-release package for a new key to be added. Use the
update-gpg-keys script in the fedora-web git repository to add it to
static/. Manually add it to /keys and /verify in all websites where we
use these keys:
____
* arm.fpo
* getfedora.org
* labs.fpo
* spins.fpo
____
1.2.1 Steps
[loweralpha]
. Get a copy of the new key(s) from the fedora-release repo; you will
find FXX-primary and FXX-secondary keys. Save them in ./tools to make
the update easier.
+
https://pagure.io/fedora-repos
. Start by editing ./tools/update-gpg-keys and adding the key-ids of any
obsolete keys to the obsolete_keys list.
. Then run that script to add the new key(s) to the fedora.gpg block:
+
fedora-web git:(master) cd tools/ tools git:(master) ./update-gpg-keys
RPM-GPG-KEY-fedora-23-primary tools git:(master) ./update-gpg-keys
RPM-GPG-KEY-fedora-23-secondary
+
This will add the key(s) to the keyblock in static/fedora.gpg and create
a text file for the key in static/$KEYID.txt as well. Verify that these
files have been created properly and contain all the keys that they
should.
* Handy checks: gpg static/fedora.gpg or gpg static/$KEYID.txt
* Adding "--with-fingerprint" option will add the fingerprint to the
output
+
The output of fedora.gpg should contain only the actual keys, not the
obsolete keys. The single text files should contain the correct
information for the uploaded key.
. Next, add new key(s) to the list in data/verify.html and move the new
key information to the keys page in data/content/keys/index.html. A
script to aid in generating the HTML code for new keys is in
./tools/make-gpg-key-html. It will print HTML to stdout for each
RPM-GPG-KEY-* file given as arguments. This is suitable for copy/paste
(or directly importing if your editor supports this). Check the copied
HTML code and select if the key info is for a primary or secondary key
(output says 'Primary or Secondary').
+
tools git:(master) ./make-gpg-key-html RPM-GPG-KEY-fedora-23-primary
+
Build the website with 'make en test' and carefully verify that the data
is correct. Please double check all keys in
http://localhost:5000/en/keys and http://localhost:5000/en/verify.
+
NOTE: the tool will give you an outdated output, adapt it to the new
websites and bootstrap layout!
____
____
[arabic, start=2]
. Update website
____
2.1 For Alpha
____
[loweralpha]
. Create the fXX-alpha branch from master fedora-web git:(master) git
push origin master:refs/heads/f22-alpha
+
and checkout to the new branch: fedora-web git:(master) git checkout -t
-b f22-alpha origin/f22-alpha
. Update the global variables: change curr_state to Alpha for all arches.
. Add the Alpha banner: upload the FXX-Alpha banner to
static/images/banners/f22alpha.png, which should appear in every
$\{PRODUCT}/download/index.html page. Make sure the banner is shown in
all sidebars, also in labs, spins, and arm.
. Check all Download links and paths in
$\{PRODUCT}/prerelease/index.html. You can find all paths on bapp01 (sudo
su - mirrormanager first) or you can look at the download page
http://dl.fedoraproject.org/pub/alt/stage
. Add CHECKSUM files to static/checksums and verify that the paths are
correct. The files should be on sundries01 and you can query them with:
`+find /pub/fedora/linux/releases/test/17-Alpha/ -type f -name '*CHECKSUM*' -exec cp '{}' . \;+`
Remember to add the right checksums to the right websites (same path).
. Add EC2 AMI IDs for Alpha. All IDs are now in the globalvar.py file.
We get all data from there, even the redirect path to track the AMI IDs.
We now also have a script which is useful to get all the AMI IDs
uploaded with fedimg. Execute it to get the latest uploads, but don't
run the script too early, as new builds are added constantly. fedora-web
git:(fXX-alpha) python ~/fedora-web/tools/get_ami.py
. Add CHECKSUM files also to http://spins.fedoraproject.org in
static/checksums. Verify the paths are correct in
data/content/verify.html. (see point e) to query them on sundries01).
Same for labs.fpo and arm.fpo.
. Verify all paths and links on http://spins.fpo, labs.fpo and arm.fpo.
. Update Alpha Image sizes and pre_cloud_composedate in
./build.d/globalvar.py. Verify they are right in Cloud images and Docker
image.
. Update the new POT files and push them to Zanata (ask a maintainer to
do so) every time you change text strings.
. Add this build to stg.fedoraproject.org (ansible syncStatic.sh.stg) to
test the pages online.
____
. Release Date:
____
* Merge the fXX-alpha branch to master and correct conflicts manually
* Remove the redirect of prerelease pages in ansible, edit:
* ansible/playbooks/include/proxies-redirects.yml
* ask a sysadmin-main to run playbook
* When ready and about 90 minutes before Release Time push to master
* Tag the commit as new release and push it too: $ git tag -a FXX-Alpha
-m 'Releasing Fedora XX Alpha' $ git push --tags
* If needed follow "Fire in the hole" below.
____
2.2 For Beta
____
[loweralpha]
. Create the fXX-beta branch from master fedora-web git:(master) git
push origin master:refs/heads/f22-beta
+
and checkout to the new branch: fedora-web git:(master) git checkout -t
-b f22-beta origin/f22-beta
. Update the global variables: change curr_state to Beta for all arches.
. Add the Beta banner: upload the FXX-Beta banner to
static/images/banners/f22beta.png, which should appear in every
$\{PRODUCT}/download/index.html page. Make sure the banner is shown in
all sidebars, also in labs, spins, and arm.
. Check all Download links and paths in
$\{PRODUCT}/prerelease/index.html. You can find all paths on bapp01 (sudo
su - mirrormanager first) or you can look at the download page
http://dl.fedoraproject.org/pub/alt/stage
. Add CHECKSUM files to static/checksums and verify that the paths are
correct. The files should be on sundries01 and you can query them with:
`+find /pub/fedora/linux/releases/test/17-Beta/ -type f -name '*CHECKSUM*' -exec cp '{}' . \;+`
Remember to add the right checksums to the right websites (same path).
. Add EC2 AMI IDs for Beta. All IDs are now in the globalvar.py file. We
get all data from there, even the redirect path to track the AMI IDs. We
now also have a script which is useful to get all the AMI IDs uploaded
with fedimg. Execute it to get the latest uploads, but don't run the
script too early, as new builds are added constantly. fedora-web
git:(fXX-beta) python ~/fedora-web/tools/get_ami.py
. Add CHECKSUM files also to http://spins.fedoraproject.org in
static/checksums. Verify the paths are correct in
data/content/verify.html. (see point e) to query them on sundries01).
Same for labs.fpo and arm.fpo.
. Remove static/checksums/Fedora-XX-Alpha* in all websites.
. Verify all paths and links on http://spins.fpo, labs.fpo and arm.fpo.
. Update Beta Image sizes and pre_cloud_composedate in
./build.d/globalvar.py. Verify they are right in Cloud images and Docker
image.
. Update the new POT files and push them to Zanata (ask a maintainer to
do so) every time you change text strings.
. Add this build to stg.fedoraproject.org (ansible syncStatic.sh.stg) to
test the pages online.
____
. Release Date:
____
* Merge the fXX-beta branch to master and correct conflicts manually
* When ready and about 90 minutes before Release Time push to master
* Tag the commit as new release and push it too: $ git tag -a FXX-Beta
-m 'Releasing Fedora XX Beta' $ git push --tags
* If needed follow "Fire in the hole" below.
____
2.3 For GA
____
[loweralpha]
. Create the fXX branch from master fedora-web git:(master) git push
origin master:refs/heads/f22
+
and checkout to the new branch: fedora-web git:(master) git checkout -t
-b f22 origin/f22
. Update the global variables: change curr_state for all arches.
. Check all Download links and paths in $\{PRODUCT}/download/index.html.
You can find all paths on bapp01 (sudo su - mirrormanager first) or you
can look at the download page http://dl.fedoraproject.org/pub/alt/stage
. Add CHECKSUM files to static/checksums and verify that the paths are
correct. The files should be on sundries01 and you can query them with:
`+find /pub/fedora/linux/releases/17/ -type f -name '*CHECKSUM*' -exec cp '{}' . \;+`
Remember to add the right checksums to the right websites (same path).
. At some point freeze translations. Add an empty PO_FREEZE file to
every website's directory you want to freeze.
. Add EC2 AMI IDs for GA. All IDs are now in the globalvar.py file. We
get all data from there, even the redirect path to track the AMI IDs. We
now also have a script which is useful to get all the AMI IDs uploaded
with fedimg. Execute it to get the latest uploads, but don't run the
script too early, as new builds are added constantly. fedora-web
git:(fXX) python ~/fedora-web/tools/get_ami.py
. Add CHECKSUM files also to http://spins.fedoraproject.org in
static/checksums. Verify the paths are correct in
data/content/verify.html. (see point e) to query them on sundries01).
Same for labs.fpo and arm.fpo.
. Remove static/checksums/Fedora-XX-Beta* in all websites.
. Verify all paths and links on http://spins.fpo, labs.fpo and arm.fpo.
. Update GA Image sizes and cloud_composedate in ./build.d/globalvar.py.
Verify they are right in Cloud images and Docker image.
. Update static/js/checksum.js and check if the paths and checksum still
match.
. Update the new POT files and push them to Zanata (ask a maintainer to
do so) every time you change text strings.
. Add this build to stg.fedoraproject.org (ansible syncStatic.sh.stg) to
test the pages online.
____
. Release Date:
____
* Merge the fXX branch to master and correct conflicts manually
* Add the redirect of prerelease pages in ansible, edit:
* ansible/playbooks/include/proxies-redirects.yml
* ask a sysadmin-main to run playbook
* Unfreeze translations by deleting the PO_FREEZE files
* When ready and about 90 minutes before Release Time push to master
* Update the short links for the Cloud Images for 'Fedora XX', 'Fedora
XX-1' and 'Latest'
* Tag the commit as new release and push it too: $ git tag -a FXX -m
'Releasing Fedora XX' $ git push --tags
* If needed follow "Fire in the hole" below.
____
____
[arabic, start=3]
. Fire in the hole
____
We now use ansible for everything, and normally use a regular build to
make the websites live. If something is not happening as expected, you
should get in contact with a sysadmin-main to run the ansible playbook
again.
All our stuff, such as SyncStatic.sh and SyncTranslation.sh scripts are
now also in ansible!
Staging server app02 and production server bapp01 do not exist anymore;
now our staging websites are on sundries01.stg and the production ones on
sundries01. Change your scripts accordingly and as sysadmin-web you
should have access to those servers as before.
____
____
[arabic, start=4]
. Tips
____
4.1 Merging branches
Suggested by Ricky. This can be useful if you're _sure_ all new changes
on the devel branch should go into the master branch. Conflicts will be
solved directly accepting only the changes in the devel branch. If
you're not 100% sure do a normal merge and fix conflicts manually!
$ git merge f22-beta $ git checkout --theirs f22-beta [list of
conflicting po files] $ git commit

View file

@ -0,0 +1,204 @@
= FedMsg Notifications (FMN) SOP
Route individualized notifications to Fedora contributors over email
and IRC.
== Contact Information
=== Owner
* Messaging SIG
* Fedora Infrastructure Team
=== Contact
____
* #fedora-apps for FMN development
* #fedora-fedmsg for an IRC feed of all fedmsgs
* #fedora-admin for problems with the deployment of FMN
* #fedora-noc for outage/crisis alerts
____
=== Servers
Production servers:
____
* notifs-backend01.phx2.fedoraproject.org (RHEL 7)
* notifs-web01.phx2.fedoraproject.org (RHEL 7)
* notifs-web02.phx2.fedoraproject.org (RHEL 7)
____
Staging servers:
____
* notifs-backend01.stg.phx2.fedoraproject.org (RHEL 7)
* notifs-web01.stg.phx2.fedoraproject.org (RHEL 7)
* notifs-web02.stg.phx2.fedoraproject.org (RHEL 7)
____
=== Purpose
Route notifications to users
== Description
fmn is a pair of systems intended to route fedmsg notifications to
Fedora contributors and users.
There is a web interface running on notifs-web01 and notifs-web02 that
allows users to login and configure their preferences to select this or
that type of message.
There is a backend running on notifs-backend01 where most of the work is
done.
The backend process is a 'fedmsg-hub' daemon, controlled by systemd.
== Hosts
=== notifs-backend
This host runs:
* `fedmsg-hub.service`
* One or more `fmn-worker@.service`. Currently notifs-backend01 runs
`fmn-worker@\{1-4}.service`
* `fmn-backend@1.service`
* `fmn-digests@1.service`
* `rabbitmq-server.service`, an AMQP broker used to communicate between
the services.
* `redis.service`, used for caching.
This host relies on a PostgreSQL database running on
db01.phx2.fedoraproject.org.
=== notifs-web
This host runs:
* A Python WSGI application via Apache httpd that serves the
https://apps.fedoraproject.org/notifications[FMN web user interface].
This host relies on a PostgreSQL database running on
db01.phx2.fedoraproject.org.
== Deployment
Once upstream releases a new version of
https://github.com/fedora-infra/fmn[fmn],
https://github.com/fedora-infra/fmn.web[fmn-web], or
https://github.com/fedora-infra/fmn.sse[fmn-sse] by creating a Git tag, a
new version can be built and deployed into Fedora infrastructure.
=== Building
FMN is packaged in Fedora and EPEL as
https://admin.fedoraproject.org/pkgdb/package/rpms/python-fmn/[python-fmn]
(the backend),
https://admin.fedoraproject.org/pkgdb/package/rpms/python-fmn-web/[python-fmn-web]
(the frontend), and the optional
https://admin.fedoraproject.org/pkgdb/package/rpms/python-fmn-sse/[python-fmn-sse].
Since all the hosts run RHEL 7, you need to build all these packages for
EPEL 7.
=== Configuration
If there are any configuration updates required by the new version of
FMN, update the `notifs` Ansible roles on
batcave01.phx2.fedoraproject.org. Remember to use:
....
{% if env == 'staging' %}
<new config here>
{% else %}
<retain old config>
{% endif %}
....
Use this pattern when deploying the update to staging. You can apply configuration
updates to staging by running:
....
$ sudo rbac-playbook -l staging groups/notifs-backend.yml
$ sudo rbac-playbook -l staging groups/notifs-web.yml
....
Simply drop the `-l staging` to update the production configuration.
=== Upgrading
To upgrade the
https://admin.fedoraproject.org/pkgdb/package/rpms/python-fmn/[python-fmn],
https://admin.fedoraproject.org/pkgdb/package/rpms/python-fmn-web/[python-fmn-web],
and
https://admin.fedoraproject.org/pkgdb/package/rpms/python-fmn-sse/[python-fmn-sse]
packages, apply configuration changes, and restart the services, you
should use the manual upgrade playbook:
....
$ sudo rbac-playbook -l staging manual/upgrade/fmn.yml
....
Again, drop the `-l staging` flag to upgrade production.
Be aware that the FMN services take a significant amount of time to
start up as they pre-heat their caches before starting work.
== Service Administration
Disable an account (on notifs-backend01):
....
$ sudo -u fedmsg /usr/local/bin/fmn-disable-account USERNAME
....
Restart:
....
$ sudo systemctl restart fedmsg-hub
....
Watch logs:
....
$ sudo journalctl -u fedmsg-hub -f
....
Configuration:
....
$ ls /etc/fedmsg.d/
$ sudo fedmsg-config | less
....
Monitor performance:
....
http://threebean.org/fedmsg-health-day.html#FMN
....
Upgrade (from batcave):
....
$ sudo -i ansible-playbook /srv/web/infra/ansible/playbooks/manual/upgrade/fmn.yml
....
== Mailing Lists
We use FMN as a way to forward certain kinds of messages to mailing
lists so people can read them the good old fashioned way that they like
to. To accomplish this, we create 'bot' FAS accounts with their own FMN
profiles and we set their email addresses to the lists in question.
If you need to change the way some set of messages are forwarded, you
can do it from the FMN web interface (if you are an FMN admin as defined
in the config file in roles/notifs/frontend/). You can navigate to
https://apps.fedoraproject.org/notifications/USERNAME.id.fedoraproject.org
to do this.
If the account exists as a FAS user already (for instance, the
`virtmaint` user) but it does not yet exist in FMN, you can add it to
the FMN database by logging in to notifs-backend01 and running
`fmn-create-user --email DESTINATION@EMAIL.COM --create-defaults FAS_USERNAME`.

View file

@ -0,0 +1,100 @@
= FPDC SOP
Fedora Product Definition Center is a service that aims to replace
https://pdc.fedoraproject.org/[PDC] in Fedora. It is meant to be a
database with REST API access used to store data needed by other
services.
== Contact Information
Owner::
Infrastructure Team
Contact::
#fedora-apps, #fedora-admin
Persons::
cverna, abompard
Location::
Phoenix (Openshift)
Public addresses::
* fpdc.fedoraproject.org
* fpdc.stg.fedoraproject.org
Servers::
* os.fedoraproject.org
* os.stg.fedoraproject.org
Purpose::
Centralize metadata and facilitate access.
== Systems
FPDC is built using the Django REST Framework and uses a PostgreSQL
database to store the metadata. The application runs on OpenShift and
uses the Source-to-Image technology to build the container directly from
the https://github.com/fedora-infra/fpdc[git repository].
In the staging and production environments, the application is
automatically rebuilt for every new commit on the `staging`
or `production` branch. This is achieved by configuring a
GitHub webhook to trigger an OpenShift deployment.
For example, a new deployment to staging would look like this:
____
git clone git@github.com:fedora-infra/fpdc.git +
cd fpdc +
git checkout staging +
git rebase master +
git push origin staging
____
The initial OpenShift project deployment is manual and is done using the
following ansible playbook:
....
sudo rbac-playbook openshift-apps/fpdc.yml
....
This will create a new fpdc project in Openshift with all the needed
configuration.
== Logs
Logs can be retrieved using the OpenShift command line:
....
$ oc login os-master01.phx2.fedoraproject.org
You must obtain an API token by visiting https://os.fedoraproject.org/oauth/token/request
$ oc login os-master01.phx2.fedoraproject.org --token=<Your token here>
$ oc -n fpdc get pods
fpdc-28-bfj52 1/1 Running 522 28d
$ oc logs fpdc-28-bfj52
....
== Database migrations
FPDC uses the `recreate` deployment configuration of
OpenShift, which means that OpenShift will bring down the pods currently
running and recreate new ones with the new version of the application.
In the phase between the pods being down and the new pods being up, the
database migrations are run in an independent pod.
== Things that could go wrong
Hopefully not much. If something goes wrong, it is currently advised to
kill the pods to trigger a fresh deployment:
....
$ oc login os-master01.phx2.fedoraproject.org
You must obtain an API token by visiting https://os.fedoraproject.org/oauth/token/request
$ oc login os-master01.phx2.fedoraproject.org --token=<Your token here>
$ oc -n fpdc get pods
fpdc-28-bfj52 1/1 Running 522 28d
$ oc delete pod fpdc-28-bfj52
....
It is also possible to roll back to a previous version:
....
$ oc -n fpdc get dc
NAME REVISION DESIRED CURRENT TRIGGERED BY
fpdc 39 1 1 config,image(fpdc:latest)
$ oc -n fpdc rollback fpdc
....

View file

@ -0,0 +1,261 @@
= FreeMedia Infrastructure SOP
This page defines the SOP for the Fedora FreeMedia Program. It covers
the infrastructural things as well as procedural things.
== Contents
[arabic]
. Location of Resources
. Location on Ansible
. Opening of the form
. Closing of the Form
. Tentative timeline
. How to
____
[arabic]
. Open
. Close
____
____
[arabic, start=7]
. Handling of tickets
____
____
[arabic]
. Login
. Rejecting Invalid Tickets
. Accepting Valid Tickets
____
____
[arabic, start=8]
. Handling of non fulfilled requests
. How to handle membership applications
____
== Location of Resources
* The web form is at
https://fedoraproject.org/freemedia/FreeMedia-form.html
* The Trac instance is at https://fedorahosted.org/freemedia/report
== Location on ansible
$PWD = `roles/freemedia/files`
Freemedia form::
FreeMedia-form.html
Backup form::
FreeMedia-form.html.orig
Closed form::
FreeMedia-close.html
Backend processing script::
process.php
Error Document::
FreeMedia-error.html
== Opening of the form
The form will be opened on the First day of each month.
== Closing of the Form
=== Tentative timeline
The form will be closed after a couple of days. This may vary according
to the capacity.
== How to
* The form is available at `roles/freemedia/files/FreeMedia-form.html`
and `roles/freemedia/files/FreeMedia-form.html.orig`
* The closed form is at `roles/freemedia/files/FreeMedia-close.html`
=== Open
* Go to roles/freemedia/tasks
* Open `main.yml`
* Go to line 32.
* To open, change the line to read: src="FreeMedia-form.html"
* After opening the form, go to trac and grant "Ticket Create and Ticket
View" privilege to "Anonymous".
=== Close
* Go to roles/freemedia/tasks
* Open main.yml
* Go to line 32.
* To close, change the line to read: src="FreeMedia-close.html"
* After closing the form, go to trac and remove "Ticket Create and
Ticket View" privilege from "Anonymous".
[NOTE]
.Note
====
* Have to check about monthly cron.
* Have to write about changing init.pp for closing and opening.
====
== Handling of tickets
=== Login
* {blank}
+
Contributors are requested to visit::
https://fedorahosted.org/freemedia/report
* Please login with your FAS account.
=== Rejecting Invalid Tickets
* If a ticket is invalid, don't accept the request. Go to "resolve as:"
and select "invalid" and then press "Submit Changes".
* A ticket is Invalid if
+
____
** No Valid email-id is provided.
** The region does not match the country.
** No Proper Address is given.
____
* If a ticket is duplicate, accept one copy, close the others as
duplicate Go to "resolve as:" and select "duplicate" and then press
"Submit Changes".
=== Accepting Valid Tickets
* If you wish to fulfill a request, please ensure, per the above
section, that it is not liable to be discarded.
* Now "Accept" the ticket from the "Action" field at the bottom, and
press the "Submit Changes" button.
* These accepted tickets will be available from
https://fedorahosted.org/freemedia/report under both "My Tickets" and
"Accepted Tickets for XX" (XX = your region, e.g. APAC).
* When you ship the request, please go to the ticket again, go to
"resolve as:" from the "Action" field and select "Fixed" and then press
"Submit Changes".
* If an accepted ticket is not finalised by the end of the month, it
should be closed with "shipping status unknown" in a comment.
=== Handling of non fulfilled requests
We shall close all the pending requests by the end of the month.
* Please check your region.
=== How to handle membership applications
Steps to become a member of the Free-media Group.
[arabic]
. Create an account in Fedora Account System (FAS)
. Create a user page in the Fedora Wiki with contact data, like
User:<nick-name>. There are templates.
. Apply to Free-Media Group in FAS
. Apply to Free-Media mailing list subscription
==== Rules for deciding on membership applications
[cols=",,,,",options="header"]
|===
|Case |Applied to Free-Media Group |User Page Created |Applied to Free-Media List |Action
|1 |Yes |Yes |Yes |Approve Group and mailing list applications
|2 |Yes |Yes |No |Put on hold + write to subscribe to list within a week
|3 |Yes |No |whatever |Put on hold + write to make User Page within a week
|4 |No |No |Yes |Reject
|===
[NOTE]
.Note
====
[arabic]
. As you need to have an FAS account for steps 2 and 3, this is not
included in the decision rules above.
. The time to be on hold is one week. If no action is taken after one
week, the application has to be rejected.
. When writing to ask an applicant to fulfil the remaining steps, send a CC
to the other Free-media sponsors to let them know the application has been
reviewed.
====

View file

@ -0,0 +1,72 @@
= Freenode IRC Channel Infrastructure SOP
Fedora uses the freenode IRC network for its IRC communications. If you
want to make a new Fedora-related IRC channel, please follow these
guidelines.
== Contents
[arabic]
. Contact Information
. Is a new channel needed?
. Adding new channel
. Recovering/fixing an existing channel
== Contact Information
Owner:::
Fedora Infrastructure Team
Contact:::
#fedora-admin
Location:::
freenode
Servers:::
none
Purpose:::
Provides a channel for Fedora contributors to use.
== Is a new channel needed?
First you should see if one of the existing Fedora channels will meet
your needs. Adding a new channel can give you a less noisy place to
focus on something, but at the cost of fewer people being involved. If
your topic/area is development related, perhaps the main #fedora-devel
channel will meet your needs.
== Adding new channel
* Make sure the channel is in the #fedora-* namespace. This allows the
Fedora Group Coordinator to make changes to it if needed.
* Found the channel. You do this by /join #channelname, then /msg
chanserv register #channelname
* Setup GUARD mode. This allows ChanServ to be in the channel for easier
management: `/msg chanserv set #channel GUARD on`
* Add some other operators/managers to the access list. This would allow
them to manage the channel if you are asleep or absent:
+
....
/msg chanserv access #channel add NICK +ARfiorstv
....
You can see what the various flags mean at
http://toxin.jottit.com/freenode_chanserv_commands#cs03
You may want to consider adding some or all of the folks in #fedora-ops
who manage other channels to help you with yours. You can see this list
with `/msg chanserv access #fedora-ops list`.
* Set default modes. `/msg chanserv set mlock #channel +Ccnt` (The t for
topic lock is optional, if your channel would like to have people change
the topic often).
* If your channel is of general interest, add it to the main communicate
page of IRC Channels, and possibly announce it to your target audience.
* You may want to request that zodbot join your channel if you need its
functions. You can request that in #fedora-admin.
== Recovering/fixing an existing channel
If there is an existing channel in the #fedora-* namespace that has a
missing founder/operator, please contact the Fedora Group Coordinator
(User:Spot) and request it be reassigned. Follow the above procedure
on the channel once done so it is set up and has enough operators/managers
to not need reassigning again.

View file

@ -0,0 +1,149 @@
= Freshmaker SOP
[NOTE]
.Note
====
Freshmaker is very new and changing rapidly. We'll try to keep this up
to date as best we can.
====
Freshmaker is a service that watches message bus activity and tries
to rebuild _compound_ artifacts when their constituent pieces change.
== Contact Information
Owner::
Factory2 Team, Release Engineering Team, Infrastructure Team
Contact::
#fedora-modularity, #fedora-admin, #fedora-releng
Persons::
jkaluza, cqi, qwan, sochotni, threebean
Location::
Phoenix
Public addresses::
* freshmaker.fedoraproject.org
Servers::
* freshmaker-frontend0[1-2].phx2.fedoraproject.org
* freshmaker-backend01.phx2.fedoraproject.org
Purpose::
Rebuild compound artifacts. See description for more detail.
== Description
See also
http://fedoraproject.org/wiki/Infrastructure/Factory2/Focus/Freshmaker
for some of the original (old) thinking on Freshmaker.
As per the summary above, Freshmaker is a bus-oriented system that
watches for changes to smaller pieces of content, and triggers rebuilds
of larger pieces of content.
It doesn't do the actual _builds_ itself, but instead requests rebuilds
in our existing build systems.
It handles a number of different content types. In Fedora, we would like
to roll out rebuilds in the following order:
=== Module Builds
When a spec file changes on a particular dist-git branch, trigger
rebuilds of all modules that declare dependencies on that rpm branch.
Consider the _traditional workflow_ today. You make a patch to the
[.title-ref]#f27# of your package, and you know you need to build that
patch for f27, and then later submit an update for this single build.
Packagers know what to do.
Consider the _modular workflow_. You make a patch to the
`2.2` branch of your package, but now, which modules do you
rebuild? Maybe you had one in mind that you wanted to fix, but are there
others that you forgot about -- that you don't even know about? Kevin
could maintain a module that pulls in my rpm branch and he never told
me. Even if he did, I have to now maintain a list of modules that depend
on my rpm, and request rebuilds of them every time I patch my .spec file.
This is unmanageable.
Freshmaker deals with this by watching the bus for dist-git fedmsg
messages. When it sees a change on a branch, it looks up the list of
modules that depend on that branch, and requests rebuilds of them in the
MBS.
=== Container Slow Flow
When a traditional rpm or modular rpm is _shipped stable_, this triggers
rebuilds of all containers that ever included previous versions of this
rpm.
This applies to both modular and non-modular contexts. Today, you build
an rpm that fixes a CVE, but _some other person_ maintains a container
that includes your RPM. Maybe they never told you about this. Maybe they
didn't notice your CVE fix. Their container will remain outdated and
vulnerable.. forever?
Freshmaker deals with this by watching the bus for dist-git messages
about rpms being shipped to the stable updates repo. When they're
shipped, it looks up all containers that ever included previous versions
of the rpm in question, and it triggers rebuilds of them.
_Waiting_ until the rpm ships to stable is _necessary_ because the
container build process doesn't know about unshipped content. This is
how containers are built manually today, and it is annoying. Which
brings us to the more complicated...
=== Container Fast Flow
When a traditional rpm or modular rpm is _signed_, generate a repo
containing it and rebuild all containers that ever included that rpm
before. This is the better version of the slow flow, but is more
complicated so we're deferring it until after we've proved the first two
cases out.
Freshmaker will do this by requesting an interim build repo from ODCS
(the On Demand Compose Service). ODCS can be given the appropriate koji
tag and will produce a repo of (pre-signed) rpms. Freshmaker will
request a rebuild of the container and will pass the ODCS repo url in.
This gives us an auditable trail of disposable repos.
== Systems
There is a frontend and a backend.
Everything in the previous section describes the backend behavior.
The frontend exists to provide an HTTP API that can be queried to find
out the status of the backend: What is it doing? What is it planning to
do? What has it done already?
== Observing Freshmaker Behavior
There is currently no command line tool to query Freshmaker, but
Freshmaker provides a REST API which can be used to observe Freshmaker
behavior. This is available at the following URLs:
* https://freshmaker.fedoraproject.org/api/1/events
* https://freshmaker.fedoraproject.org/api/1/builds
The first [.title-ref]#/events# URL should return a list of events that
Freshmaker has noticed, recorded, and is handling. Handled events should
produce associated builds.
The second [.title-ref]#/builds# URL should return a list of builds that
Freshmaker has submitted and is monitoring. Each build should be
traceable back to the event that triggered it.
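For example, to eyeball recent activity from a shell:
....
curl -s https://freshmaker.fedoraproject.org/api/1/events | python -m json.tool | less
curl -s https://freshmaker.fedoraproject.org/api/1/builds | python -m json.tool | less
....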
== Logs
The frontend logs are on freshmaker-frontend0[1-2] in
`/var/log/httpd/error_log`.
The backend logs are on freshmaker-backend01. Look in the journal for
the [.title-ref]#fedmsg-hub# service.
== Upgrading
The package in question is [.title-ref]#freshmaker#. Please use the
[.title-ref]#playbooks/manual/upgrade/freshmaker.yml# playbook.
== Things that could go wrong
TODO. We don't know yet. Probably lots of things.

View file

@ -0,0 +1,39 @@
= Fedora gather easyfix SOP
Fedora-gather-easyfix, as the name says, gathers tickets marked as easyfix
from multiple sources (Pagure, GitHub and fedorahosted currently),
providing a single place for newcomers to find small tasks to work on.
== Contents
[arabic]
. Contact Information
. Documentation Links
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Location::
http://fedoraproject.org/easyfix/
Servers::
sundries01, sundries02, sundries01.stg
Purpose::
Gather easyfix tickets from multiple sources.
Upstream sources are hosted on github at:
https://github.com/fedora-infra/fedora-gather-easyfix/
The files are then mirrored to our ansible repo, under the
`easyfix/gather` role.
The project is a simple script, `gather_easyfix.py`, gathering information
from the projects set on the
https://fedoraproject.org/wiki/Easyfix[Fedora wiki] and outputting a
single HTML file. This HTML file is then improved via the CSS and
JavaScript files present in the sources.
The generated HTML file, together with the CSS and JS files, is then
synced to the proxies for public consumption. :)

View file

@ -0,0 +1,121 @@
= GDPR Delete SOP
This SOP covers how Fedora Infrastructure handles General Data
Protection Regulation (GDPR) Delete Requests. It contains information
about how system administrators will use tooling to respond to Delete
requests, as well as how application developers can integrate their
applications with that tooling.
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Persons::
nirik
Location::
Phoenix
Servers::
batcave01.phx2.fedoraproject.org, plus various application servers, which
will run scripts to delete data.
Purpose::
Respond to Delete requests.
== Responding to a Deletion Request
This section covers how a system administrator will use our
`gdpr-delete.yml` playbook to respond to a Delete request.
When processing a Delete request, perform the following steps:
[arabic, start=0]
. Verify that the requester is who they say they are. If the request
came in email ask them to file an issue at
https://pagure.io/fedora-pdr/new_issue Use the following in email reply
to them:
+
`In order to verify your identity, please file a new issue at https://pagure.io/fedora-pdr/new_issue using the appropriate issue type. Please note this form requires you to sign in to your account to verify your identity.`
+
If the request has come via Red Hat internal channels as an explicit
request to delete, mark the ticket with the tag `rh`. This tag will help
delineate requests for any future reporting needs.
+
If they do not have a FAS account, indicate to them that there is no
data to be deleted. Use this response:
+
`Your request for deletion has been reviewed. Since there is no related account in the Fedora Account System, the Fedora infrastructure does not store data relevant for this deletion request. Note that some public content related to Fedora you may have previously submitted without an account, such as to public mailing lists, is not deleted since accurate maintenance of this data serves Fedora's legitimate business interests, the public interest, and the interest of the open source community.`
. Identify the user's FAS account name. The Delete playbook will use this
FAS account to delete the required data. Update the fedora-pdr issue
saying the request has been received. There is a 'quick response' in the
pagure issue tracker to note this.
. Login to FAS and clear the `Telephone number` entry, set Country to
`Other`, clear `Latitude`, `Longitude`, `IRC Nick` and `GPG Key ID`,
set `Time Zone` to UTC and `Locale` to `en`, and set the user status to
`disabled`. If the user is not in cla_done plus one group, you are done:
update the ticket and close it. This step will be folded into the
following one once we implement it.
. If the user is in cla_done + one group, they may have additional data:
Run the gdpr delete playbook on `batcave01`. You will need to define one
Ansible variable for the playbook: `gdpr_delete_fas_user`, the FAS
username of the user.
+
....
$ sudo ansible-playbook playbooks/manual/gdpr/delete.yml -e gdpr_delete_fas_user=bowlofeggs
....
+
After the script completes, update the ticket that the request is
completed and close it. There is a 'quick response' in the pagure issue
tracker to note this.
== Integrating an application with our delete playbook
This section covers how an infrastructure application can be configured
to integrate with our `delete.yml` playbook. To integrate, you must
create a script and Ansible variables so that your application is
compatible with this playbook.
=== Script
You need to create a script and have your project's Ansible role install
that script somewhere (most likely on a host from your project - for
example fedocal's is going on `fedocal01`.) It's not a bad idea to put
your script into your upstream project. This script should accept one
environment variable as input: `GDPR_DELETE_USERNAME`. This will be a
FAS username.
Some scripts may need secrets embedded in them - if you must do this be
careful to install the script with `0700` permissions, ensuring that
only `gdpr_delete_script_user` (defined below) can run them. Bodhi
worked around this concern by having the script run as `apache` so it
could read Bodhi's server config file to get the secrets, so it does not
have secrets in its script.
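For reference, a minimal sketch of such a script; the application,
database and table below are hypothetical, not a real Fedora service:

....
#!/bin/bash
# Sketch of a GDPR delete script. "myapp" and its table are hypothetical.
set -euo pipefail

# The delete playbook exports the FAS username to remove.
user="${GDPR_DELETE_USERNAME:?GDPR_DELETE_USERNAME must be set}"

# Remove the user's personal data from the application's database.
psql -d myapp -c "DELETE FROM user_profiles WHERE username = '${user}';"
....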
=== Variables
In addition to writing a script, you need to define some Ansible
variables for the host that will run your script:
[cols=",,",options="header",]
|===
|Variable |Description |Example
|`gdpr_delete_script` |The full path to the script. |`/usr/bin/fedocal-delete`
|`gdpr_delete_script_user` |The user the script should be run as |`apache`
|===
You also need to add the host that the script should run on to the
`[gdpr_delete]` group in `inventory/inventory`:
....
[gdpr_delete]
fedocal01.phx2.fedoraproject.org
....

View file

@ -0,0 +1,153 @@
= GDPR SAR SOP
This SOP covers how Fedora Infrastructure handles General Data
Protection Regulation (GDPR) Subject Access Requests (SAR). It contains
information about how system administrators will use tooling to respond
to SARs, as well as how application developers can integrate their
applications with that tooling.
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Persons::
bowlofeggs
Location::
Phoenix
Servers::
batcave01.phx2.fedoraproject.org Various application servers, which
will run scripts to collect SAR data.
Purpose::
Respond to SARs.
== Responding to a SAR
This section covers how a system administrator will use our `sar.yml`
playbook to respond to a SAR.
When processing a SAR, perform the following steps:
[arabic, start=0]
. Verify that the requester is who they say they are. If the request
came in email and the user has a FAS account, ask them to file an issue
at https://pagure.io/fedora-pdr/new_issue Use the following in email
reply to them:
+
`In order to verify your identity, please file a new issue at https://pagure.io/fedora-pdr/new_issue using the appropriate issue type. Please note this form requires you to sign in to your account to verify your identity.`
+
If the request has come via Red Hat internal channels as an explicit
request to delete, mark the ticket with the tag `rh`. This tag will help
delineate requests for any future reporting needs.
. Identify an e-mail address for the requester, and if applicable, their
FAS account name. The SAR playbook will use both of these since some
applications have data associated with FAS accounts and others have data
associated with e-mail addresses. Update the fedora-pdr issue saying the
request has been received. There is a 'quick response' in the pagure
issue tracker to note this.
. Run the SAR playbook on `batcave01`. You will need to define three
Ansible variables for the playbook. `sar_fas_user` will be the FAS
username, if applicable; this may be omitted if the requester does not
have a FAS account. `sar_email` will be the e-mail address associated
with the user. `sar_tar_output_path` will be the path you want the
playbook to write the resulting tarball to, and should have a `.tar.gz`
extension. For example, if `bowlofeggs` submitted a SAR and his e-mail
address is `bowlof@eggs.biz`, you might run the playbook like this:
+
....
$ sudo ansible-playbook playbooks/manual/gdpr/sar.yml -e sar_fas_user=bowlofeggs \
-e sar_email=bowlof@eggs.biz -e sar_tar_output_path=/home/bowlofeggs/bowlofeggs.tar.gz
....
. Generate a random sha512 with something like:
`openssl rand 512 | sha512sum` and then move the output file to
/srv/web/infra/pdr/the-sha512.tar.gz
. Update the ticket to fixed / processed on pdr requests to have a link
to https://infrastructure.fedoraproject.org/infra/pdr/the-sha512.tar.gz
and tell them it will be available for one week.
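A sketch of steps 4 and 5 from a shell, reusing the paths from the
example above:

....
$ hash=$(openssl rand 512 | sha512sum | awk '{print $1}')
$ sudo mv /home/bowlofeggs/bowlofeggs.tar.gz /srv/web/infra/pdr/${hash}.tar.gz
$ echo "https://infrastructure.fedoraproject.org/infra/pdr/${hash}.tar.gz"
....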
== Integrating an application with our SAR playbook
This section covers how an infrastructure application can be configured
to integrate with our `sar.yml` playbook. To integrate, you must create
a script and Ansible variables so that your application is compatible
with this playbook.
=== Script
You need to create a script and have your project's Ansible role install
that script somewhere (most likely on a host from your project - for
example Bodhi's is going on `bodhi-backend02`.) It's not a bad idea to
put your script into your upstream project - there are plans for
upstream Bodhi to ship `bodhi-sar`, for example. This script should
accept two environment variables as input: `SAR_USERNAME` and
`SAR_EMAIL`. Not all applications will use both, so do what makes sense
for your application. The first will be a FAS username and the second
will be an e-mail address. Your script should gather the required
information related to those identifiers and print it in a machine
readable format to stdout. Bodhi, for example, prints information to
stdout in `JSON`.
Some scripts may need secrets embedded in them - if you must do this be
careful to install the script with `0700` permissions, ensuring that
only `sar_script_user` (defined below) can run them. Bodhi worked around
this concern by having the script run as `apache` so it could read
Bodhi's server config file to get the secrets, so it does not have
secrets in its script.
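As an illustration, a minimal SAR script could look like the following; a
real script would query the application's own storage instead of emitting
an empty record:

....
#!/bin/bash
# Sketch of a SAR script: print machine-readable data for the user to stdout.
set -euo pipefail

# The SAR playbook exports one or both of these variables.
user="${SAR_USERNAME:-}"
email="${SAR_EMAIL:-}"

# Replace this with real queries against the application's data store.
printf '{"username": "%s", "email": "%s", "records": []}\n' "$user" "$email"
....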
=== Variables
In addition to writing a script, you need to define some Ansible
variables for the host that will run your script:
[cols=",,",options="header",]
|===
|Variable |Description |Example
|`sar_script` |The full path to the script. |`/usr/bin/bodhi-sar`
|`sar_script_user` |The user the script should be run as |`apache`
|`sar_output_file` |The name of the file to write into the output
tarball |`bodhi.json`
|===
You also need to add the host that the script should run on to the
`[sar]` group in `inventory/inventory`:
....
[sar]
bodhi-backend02.phx2.fedoraproject.org
....
=== Variables for OpenShift apps
When you need to add OpenShift app to SAR playbook, you need to add
following variables to existing `sar_openshift` dictionary:
[cols=",,",options="header",]
|===
|Variable |Description |Example
|`sar_script` |The full path to the script. |`/usr/local/bin/sar.py`
|`sar_output_file` |The name of the file to write into the output
tarball |`anitya.json`
|`openshift_namespace` |The namespace in which the application is
running |`release-monitoring`
|`openshift_pod` |The pod name in which the script will be run
|`release-monitoring-web`
|===
The `sar_openshift` dictionary is located in
`inventory/group_vars/os_masters`:
....
sar_openshift:
# Name of the app
release-monitoring:
sar_script: /usr/local/bin/sar.py
sar_output_file: anitya.json
openshift_namespace: release-monitoring
openshift_pod: release-monitoring-web
....

View file

@ -0,0 +1,62 @@
= geoip-city-wsgi SOP
A simple web service that returns geoip information as a JSON-formatted
dictionary in utf-8. In particular, it's used by anaconda[1] to get the
most probable territory code, based on the public IP of the caller.
== Contents
[arabic]
. Contact Information
. Basic Function
. Ansible Roles
. Apps depending of geoip-city-wsgi
. Documentation Links
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-apps, #fedora-admin, #fedora-noc
Location::
https://geoip.fedoraproject.org
Servers::
sundries*, sundries*-stg
Purpose::
A simple web service that returns geoip information as a JSON-formatted
dictionary in utf-8. In particular, it's used by anaconda[1] to get the
most probable territory code, based on the public IP of the caller.
== Basic Function
* Users go to https://geoip.fedoraproject.org/city
* The website is exposed via
`/etc/httpd/conf.d/geoip-city-wsgi-proxy.conf`.
* Returns geoip information as a JSON-formatted dict in utf-8
* It also currently accepts one override: ?ip=xxx.xxx.xxx.xxx, e.g.
https://geoip.fedoraproject.org/city?ip=18.0.0.1 which then uses the
passed IP address instead of the determined IP address of the client.
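For example, to query the service from a shell:

....
$ curl https://geoip.fedoraproject.org/city
$ curl "https://geoip.fedoraproject.org/city?ip=18.0.0.1"
....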
== Ansible Roles
The geoip-city-wsgi role
https://pagure.io/fedora-infra/ansible/blob/main/f/roles/geoip-city-wsgi
is present in the sundries playbook
https://pagure.io/fedora-infra/ansible/blob/main/f/playbooks/groups/sundries.yml
and the proxy tasks are present in
https://pagure.io/fedora-infra/ansible/blob/main/f/playbooks/include/proxies-reverseproxy.yml
== Apps depending on geoip-city-wsgi
unknown
== Documentation Links
* app: https://geoip.fedoraproject.org
* source: https://github.com/fedora-infra/geoip-city-wsgi
* bugs: https://github.com/fedora-infra/geoip-city-wsgi/issues
* role: https://pagure.io/fedora-infra/ansible/blob/main/f/tree/roles/geoip-city-wsgi
[1] https://fedoraproject.org/wiki/Anaconda

View file

@ -0,0 +1,67 @@
= Using github for Infra Projects
We're presently using github to host git repositories and issue tracking
for some infrastructure projects. Anything we need to know should be
recorded here.
== Setting up a new repo
Create projects inside of the fedora-infra group:
https://github.com/fedora-infra
That will allow us to more easily track what projects we have.
[TODO] How do we create a new project and import it?
* After creating a new repo, click on the Settings tab to set up some
fancy things.
+
If using git-flow for your project:
** Set the default branch from 'master' to 'develop'. Having the default
branch be develop is nice: new contributors will automatically start
committing there if they're not paying attention to what branch they're
on. You almost never want to commit directly to the master branch.
+
If there does not exist a develop branch, you should create one by
branching off of master.:
+
....
$ git clone GIT_URL
$ git checkout -b develop
$ git push --all
....
** Set up an IRC hook for notifications. From the "settings" tab click
on "Webhooks & Services." Under the "Add Service" dropdown, find "IRC"
and click it. You might need to enter your password. In the form, you
probably want the following values:
*** Server, irc.freenode.net
*** Port, 6697
*** Room, #fedora-apps
*** Nick, <nothing>
*** Branch Regexes, <nothing>
*** Password, <nothing>
*** Ssl, <on>
*** Message Without Join, <on>
*** No Colors, <off>
*** Long Url, <off>
*** Notice, <on>
*** Active, <on>
== Add an EasyFix label
The EasyFix label is used to mark bugs that are potentially fixable by
new contributors getting used to our source code or relatively new to
python programming. GitHub doesn't provide this label automatically so
we have to add it. You can add the label from the issues page of the
repository or use this curl command to add it:
....
curl -k -u '$GITHUB_USERNAME:$GITHUB_PASSWORD' https://api.github.com/repos/fedora-infra/python-fedora/labels -H "Content-Type: application/json" -d '{"name":"EasyFix","color":"3b6eb4"}'
....
Please try to use the same color for consistency between Fedora
Infrastructure Projects. You can then add the github repo to the list
that easyfix.fedoraproject.org scans for easyfix tickets here:
https://fedoraproject.org/wiki/Easyfix

View file

@ -0,0 +1,50 @@
= github2fedmsg SOP
Bridge github events onto our fedmsg bus.
* App: https://apps.fedoraproject.org/github2fedmsg/
* Source: https://github.com/fedora-infra/github2fedmsg/
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-apps, #fedora-admin, #fedora-noc
Servers::
github2fedmsg01
Purpose::
Bridge github events onto our fedmsg bus.
== Description
github2fedmsg is a small Python Pyramid app that bridges github events
onto our fedmsg bus by way of github's "webhooks" feature. It is what
allows us to have IRC notifications of github activity via fedmsg. It
has two phases of operation:
* Infrequently, a user will log in to github2fedmsg via Fedora OpenID.
They then push a button to also log in to github.com. They are then
logged in to github2fedmsg with _both_ their FAS account and their
github account.
+
They are then presented with a list of their github repositories. They
can toggle each one: "on" or "off". When they turn a repo on, our webapp
makes a request to github.com to install a "webhook" for that repo with
a callback URL to our app.
* When events happen to that repo on github.com, github looks up our
callback URL and makes an http POST request to us, informing us of the
event. Our github2fedmsg app receives that, validates it, and then
republishes the content to our fedmsg bus.
== What could go wrong?
* Restarting the app or rebooting the host shouldn't cause a problem. It
should come right back up.
* Our database could die. We have a db with a list of all the repos we
have turned on and off. We would want to restore that from backup.
* If github gets compromised, they might have to revoke all of their
application credentials. In that case, our app would fail to work. There
are _lots_ of private secrets set in our private repo that allow our app
to talk to github.com. There are inline comments there with instructions
about how to generate new keys and secrets.

View file

@ -0,0 +1,26 @@
= Gitweb Infrastructure SOP
Gitweb-caching is the web interface we use to expose git to the web at
http://git.fedorahosted.org/git/
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-hosted
Location::
Serverbeach
Servers::
hosted[1-2]
Purpose::
Http access to git sources.
== Basic Function
* Users go to http://git.fedorahosted.org/git/
* Pages are generated from cache stored in `/var/cache/gitweb-caching/`.
* The website is exposed via
`/etc/httpd/conf.d/git.fedoraproject.org.conf`.
* Main config file is `/var/www/gitweb-caching/gitweb_config.pl`. This
pulls git repos from /git/.

View file

@ -0,0 +1,112 @@
= Greenwave SOP
== Contact Information
Owner::
Factory2 Team, Fedora QA Team, Infrastructure Team
Contact::
#fedora-qa, #fedora-admin
Persons::
gnaponie (giulia), mprahl, lucarval, ralph (threebean)
Location::
Phoenix
Public addresses::
* https://greenwave-web-greenwave.app.os.fedoraproject.org/api/v1.0/version
* https://greenwave-web-greenwave.app.os.fedoraproject.org/api/v1.0/policies
* https://greenwave-web-greenwave.app.os.fedoraproject.org/api/v1.0/decision
Servers::
* In OpenShift.
Purpose::
Provide gating decisions.
== Description
* See
http://fedoraproject.org/wiki/Infrastructure/Factory2/Focus/Greenwave[the
focus document] for background.
* See https://pagure.io/docs/greenwave/[the upstream docs] for more
detailed info.
Greenwave's job is:
* answering yes/no questions (or making decisions)
* about artifacts (RPM packages, source tarballs, …)
* at certain gating points in our pipeline
* based on test results
* according to some policy
In particular, we'll be using Greenwave to provide yes/no gating
decisions _to Bodhi_ about rpms in each update. Greenwave will do this
by consulting resultsdb and waiverdb for individual test results and
then combining those results into an aggregate decision.
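As an illustration, a decision can be requested with a POST to the
decision endpoint. The decision context, product version and subject
below are only examples; the real values depend on the configured
policies:

....
$ curl -s -X POST \
    https://greenwave-web-greenwave.app.os.fedoraproject.org/api/v1.0/decision \
    -H 'Content-Type: application/json' \
    -d '{"decision_context": "bodhi_update_push_stable",
         "product_version": "fedora-34",
         "subject_type": "koji_build",
         "subject_identifier": "example-1.0-1.fc34"}'
....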
The _policies_ for how those results should be combined or ignored, are
defined in ansible in
`roles/openshift-apps/greenwave/templates/configmap.yml`. We expect to
grow these over time to new use cases (rawhide compose gating, etc..)
== Observing Greenwave Behavior
Login to `os-master01.phx2.fedoraproject.org` as `root` (or,
authenticate remotely with openshift using
`oc login https://os.fedoraproject.org`), and run:
....
$ oc project greenwave
$ oc status -v
$ oc logs -f dc/greenwave-web
....
== Database
Greenwave currently has no database (and we'd like to keep it that way).
It relies on `resultsdb` and `waiverdb` for information.
== Upgrading
You can roll out configuration changes by changing the files in
`roles/openshift-apps/greenwave/` and running the
`playbooks/openshift-apps/greenwave.yml` playbook.
To understand how the software is deployed, take a look at these two
files:
* `roles/openshift-apps/greenwave/templates/imagestream.yml`
* `roles/openshift-apps/greenwave/templates/buildconfig.yml`
See that we build a fedora-infra specific image on top of an app image
published by upstream. The `latest` tag is automatically deployed to
staging. This should represent the latest commit to the `master` branch
of the upstream git repo that passed its unit and functional tests.
The `prod-fedora` tag is manually controlled. To upgrade prod to match
what is in stage, move the `prod-fedora` tag to point to the same image
as the `latest` tag. Our buildconfig is configured to poll that tag, so
a new os.fp.o build and deployment should be automatically created.
You can watch the build and deployment with `oc` commands.
You can poll this URL to see what version is live at the moment:
https://greenwave-web-greenwave.app.os.fedoraproject.org/api/v1.0/version
== Troubleshooting
In case of problems with greenwave messaging, check the logs of the
container dc/greenwave-fedmsg-consumers to see if there is something
wrong:
....
$ oc logs -f dc/greenwave-fedmsg-consumers
....
It is also possible to check if greenwave is actually publishing
messages by looking at
https://apps.fedoraproject.org/datagrepper/raw?category=greenwave&delta=127800&rows_per_page=1[this
link] and checking the time of the last message.
In case of problems with greenwave webapp, check the logs of the
container dc/greenwave-web:
....
$ oc logs -f dc/greenwave-web
....

View file

@ -0,0 +1,134 @@
= Guest Disk Resize SOP
Resize disks in our kvm guests
== Contents
[arabic]
. Contact Information
. How to do it
.. KVM/libvirt Guests
== Contact Information
Owner:::
Fedora Infrastructure Team
Contact:::
#fedora-admin, sysadmin-main
Location:::
PHX, Tummy, ibiblio, Telia, OSUOSL
Servers:::
All xen servers, kvm/libvirt servers.
Purpose:::
Resize guest disks
== How to do it
=== KVM/libvirt Guests
[arabic]
. SSH to the kvm server and resize the guest's logical volume. If you
want to be extra careful, make a snapshot of the LV first:
+
....
lvcreate -n [guest name]-snap -L 10G -s /dev/VolGroup00/[guest name]
....
+
Optional, but always good to be careful
. Shutdown the guest:
+
....
sudo virsh shutdown [guest name]
....
. Disable the guest's lv:
+
....
lvchange -an /dev/VolGroup00/[guest name]
....
. Resize the lv:
+
....
lvresize -L [NEW TOTAL SIZE]G /dev/VolGroup00/[guest name]
or
lvresize -L +XG /dev/VolGroup00/[guest name]
(to add X GB to the disk)
....
. Enable the lv:
+
....
lvchange -ay /dev/VolGroup00/[guest name]
....
. Bring the guest back up:
+
....
sudo virsh start [guest name]
....
. Login into the guest:
+
....
sudo virsh console [guest name]
You may wish to boot single user mode to avoid services coming up and going down again
....
. On the guest, run:
+
....
fdisk /dev/vda
....
. Delete the LVM partition on the guest you want to add space to and
recreate it with the maximum size. Make sure to set its type to LV (8e):
+
....
p to list partitions
d to delete selected partition
n to create new partition (default values should be ok)
t to change partition type (set to 8e)
w to write changes
....
. Run partprobe:
+
....
partprobe
....
. Check the size of the partition:
+
....
fdisk -l /dev/vdaN
....
+
If this still reflects the old size, then reboot the guest and verify
that its size changed correctly when it comes up again.
. Login to the guest again, and run:
+
....
pvresize /dev/vdaN
....
. A vgs should now show the new size. Use lvresize to resize the root
lv:
+
....
lvresize -L [new root partition size]G /dev/GuestVolGroup00/root
(pvs will tell you how much space is available)
....
. Finally, resize the root partition:
+
....
resize2fs /dev/GuestVolGroup00/root
(If the root fs is ext4)
or
xfs_growfs /dev/GuestVolGroup00/root
(if the root fs is xfs)
....
+
Verify that everything worked out, and delete the snapshot you made if
you made one.

View file

@ -0,0 +1,80 @@
= Guest Editing SOP
Various virsh commands
== Contents
[arabic]
. Contact Information
. How to do it
.. add/remove cpus
.. resize memory
== Contact Information
Owner:::
Fedora Infrastructure Team
Contact:::
#fedora-admin, sysadmin-main
Location:::
PHX, Tummy, ibiblio, Telia, OSUOSL
Servers:::
All xen servers, kvm/libvirt servers.
Purpose:::
Add/remove guest CPUs and resize guest memory
== How to do it
=== Add cpu
[arabic]
. SSH to the virthost server
. Calculate the number of CPUs the system needs
. `sudo virsh setvcpus <guest> <num_of_cpus> --config` - ie:
+
....
sudo virsh setvcpus bapp01 16 --config
....
. Shutdown the virtual system
. Start the virtual system
[NOTE]
.Note
====
Note that using [.title-ref]#virsh reboot# is insufficient. You have to
actually stop the domain and start it with `virsh destroy <guest>` and
`virsh start <guest>` for the change to take effect.
====
[arabic, start=6]
. Login and check that cpu count matches
. *Remember to update the group_vars in ansible* to match the new value
you set, if appropriate.
=== Resize memory
[arabic]
. SSH to the virthost server
. Calculate the amount of memory the system needs in kb
. `sudo virsh setmem <guest> <num_in_kilobytes> --config` - ie:
+
....
sudo virsh setmem bapp01 16777216 --config
....
. Shutdown the virtual system
. Start the virtual system
[NOTE]
.Note
====
Note that using [.title-ref]#virsh reboot# is insufficient. You have to
actually stop the domain and start it with `virsh destroy <guest>` and
`virsh start <guest>` for the change to take effect.
====
[arabic, start=6]
. Login and check that memory matches
. *Remember to update the group_vars in ansible* to match the new value
you set, if appropriate.

View file

@ -0,0 +1,143 @@
= Haproxy Infrastructure SOP
haproxy is an application that does load balancing at the tcp layer or
at the http layer. It can do generic tcp balancing but it does
specialize in http balancing. Our proxy servers are still running apache
and that is what our users connect to. But instead of using
mod_proxy_balancer and ProxyPass balancer://, we do a ProxyPass to
http://localhost:10001/ or http://localhost:10002/. haproxy must
be told to listen to an individual port for each farm. All haproxy farms
are listed in /etc/haproxy/haproxy.cfg.
== Contents
[arabic]
. Contact Information
. How it works
. Configuration example
. Stats
. Advanced Usage
== Contact Information
Owner:::
Fedora Infrastructure Team
Contact:::
#fedora-admin, sysadmin-main, sysadmin-web group
Location:::
Phoenix, Tummy, Telia
Servers:::
proxy1, proxy2, proxy3, proxy4, proxy5
Purpose:::
Provides load balancing from the proxy layer to our application layer.
== How it works
haproxy is a load balancer. If you're familiar, this section won't be
that interesting. haproxy in its normal usage acts just like a web
server. It listens on a port for requests. Unlike most webservers though
it then sends that request to one of our back end application servers
and sends the response back. This is referred to as reverse proxying. We
typically configure haproxy to send a check to a specific url and look for
the response code. If this url isn't set, it just does basic checks to
/. In most of our configurations we're using round robin balancing. That is,
request 1 goes to app1, request 2 goes to app2, request 3 goes to app3,
request 4 goes to app1, and the whole process repeats.
[WARNING]
.Warning
====
These checks do add load to the app servers. As well as additional
connections. Be smart about which url you're checking as it gets checked
often. Also be sure to verify the application servers can handle your
new settings, monitor them closely for the hour or two after you make
changes.
====
== Configuration example
The below example is how our fedoraproject wiki could be configured.
Each application should have its own farm. Even though it may have an
identical configuration to another farm, this allows easy addition and
subtraction of specific nodes when we need them.:
....
listen fpo-wiki 0.0.0.0:10001
balance roundrobin
server app1 app1.fedora.phx.redhat.com:80 check inter 2s rise 2 fall 5
server app2 app2.fedora.phx.redhat.com:80 check inter 2s rise 2 fall 5
server app4 app4.fedora.phx.redhat.com:80 backup check inter 2s rise 2 fall 5
option httpchk GET /wiki/Infrastructure
....
* The first line "listen ...." Says to create a farm called 'fpo-wiki'.
Listening on all IP's on port 10001. fpo-wiki can be arbitrary but make
it something obvious. Aside from that the important bit is :10001.
Always make sure that when creating a new farm, its listening on a
unique port. In Fedora's case we're starting at 10001, and moving up by
one. Just check the config file for the lowest open port above 10001.
* The next line "balance roundrobin" says to use round robin balancing.
* The server lines each add a new node to the balancer farm. In this
case the wiki is being served from app1, app2 and app4. If the wiki is
available at http://app1.fedora.phx.redhat.com/wiki/, then this
config would be used in conjunction with "RewriteRule ^/wiki/(.*)
http://localhost:10001/wiki/$1 [P,L]".
* 'server' means we're adding a new node to the farm
* 'app1' is the worker name; it is analogous to fpo-wiki but should
match the short hostname of the node to make it easy to follow.
* 'app1.fedora.phx.redhat.com:80' is the hostname and port to be
contacted.
* 'check' means to check via bottom line "option httpchk GET
/wiki/Infrastructure" which will use /wiki/Infrastructure to verify the
wiki is working. If that URL fails, that entire node will be taken out
of the farm mix.
* 'inter 2s' means to check every 2 seconds. 2s is the same as 2000 in
this case.
* 'rise 2' means to not put this node back in the mix until it has had
two successful connections in a row. haproxy will continue to check
every 2 seconds whether a node is up or down
* 'fall 5' means to take a node out of the farm after 5 failures.
* 'backup' You'll notice that app4 has a 'backup' option. We don't
actually use this for the wiki but do for other farms. It basically
means to continue checking and treat this node like any other node but
don't send it any production traffic unless the other two nodes are
down.
All of these options can be tweaked so keep that in mind when changing
or building a new farm. There are other configuration options in this
file that are global. Please see the haproxy documentation for more
info:
....
/usr/share/doc/haproxy-1.3.14.6/haproxy-en.txt
....
== Stats
In order to view the stats for a farm please see the stats page. Each
proxy server has its own stats page since each one is running its own
haproxy server. To view the stats point your browser to
https://admin.fedoraproject.org/haproxy/shorthostname/ so proxy1 is at
https://admin.fedoraproject.org/haproxy/proxy1/ The trailing / is
important.
* https://admin.fedoraproject.org/haproxy/proxy1/
* https://admin.fedoraproject.org/haproxy/proxy2/
* https://admin.fedoraproject.org/haproxy/proxy3/
* https://admin.fedoraproject.org/haproxy/proxy4/
* https://admin.fedoraproject.org/haproxy/proxy5/
== Advanced Usage
haproxy has some more advanced usage that we've not needed to worry
about yet but is worth mentioning. For example, one could send users to
just one app server based on session id. If user A happened to hit app1
first and user B happened to hit app4 first. All subsequent requests for
user A would go to app1 and user B would go to app4. This is handy for
applications that cannot normally be balanced because of shared storage
needs or other locking issues. This won't solve all problems though and
can have negative effects: for example, when app1 goes down, user A would
either lose their session or be unable to work until app1 comes back
up. Please do some thorough testing before looking into this option.
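If we ever did want it, cookie-based persistence is a small change to a
farm definition. A sketch (not used in any current farm):

....
listen fpo-app 0.0.0.0:10002
    balance roundrobin
    cookie SERVERID insert indirect nocache
    server app1 app1.fedora.phx.redhat.com:80 cookie app1 check inter 2s rise 2 fall 5
    server app2 app2.fedora.phx.redhat.com:80 cookie app2 check inter 2s rise 2 fall 5
....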

View file

@ -0,0 +1,191 @@
= Fedorahosted migrations
Migrating hosted repositories to that of another type.
== Contents
[arabic]
. Contact Information
. Description
. SVN to GIT migration
.. Questions left to be answered with this SOP
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-hosted
Location::
Serverbeach
Servers::
hosted1, hosted2
Purpose::
Migrate hosted SCM repositories to that of another SCM.
== Description
fedorahosted.org can be used to host open source projects. Occasionally
those projects want to change the SCM they utilize. This document
provides documentation for doing so.
[arabic]
. An SCM for maintaining the code. The currently supported SCMs include
Mercurial, Git, Bazaar, or SVN. Note: there is no CVS.
. A trac instance, which provides a mini-wiki for hosting information
and also provides a ticketing system.
. A mailing list
[IMPORTANT]
.Important
====
This page is for administrators only. People wishing to request a hosted
project should use the Ticketing System; see the new project
request template. (Requires Fedora Account)
====
== SVN to GIT migration
=== FAS User Prep
Currently you must manually generate $PROJECTNAME-users.txt by grabbing
a list of people in the FAS group and recording them in the following
format:
....
$fasusername = FirstName LastName <$emailaddress>
....
This is error prone, and will stop the git-svn fetch below if an author
appears that doesn't exist in the list of users:
....
svn log --quiet | awk '/^r/ {print $3}' | sort -u
....
The above will generate a list of users in the svn repo.
If all users are FAS users you can use the following script to create a
users file (written by tmz (Todd Zullinger)):
....
#!/bin/bash
if [ -z "$1" ]; then
echo "usage: $0 <svn repo>" >&2
exit 1
fi
svnurl=file:///svn/$1
if ! svn info $svnurl &>/dev/null; then
echo "$1 is not a valid svn repo." >&2
fi
svn log -q $svnurl | awk '/^r[0-9]+/ {print $3}' | sort -u | while read user; do
name=$( (getent passwd $user 2>/dev/null | awk -F: '{print $5}') || '' )
[ -z "$name" ] && name=$user
email="$user@fedoraproject.org"
echo "$user=$name <$email>"
done
....
=== Doing the conversion
[arabic]
. Log into hosted1
. Make a temporary directory to convert the repos in:
+
....
$ sudo mkdir /tmp/tmp-$PROJECTNAME.git
$ cd /tmp/tmp-$PROJECTNAME.git
....
. Create an git repo ready to receive migrated SVN data:
+
....
$ sudo git-svn init http://svn.fedorahosted.org/svn/$PROJECTNAME --no-metadata
....
. Tell git to fetch and convert the repository:
+
....
$ git svn fetch
....
+
[NOTE]
====
This creation of a temporary repository is necessary because SVN leaves a
number of items floating around that git can ignore, and we want those
essentially ignored.
====
. From here, you'll want to follow "Creating a new git repo" as if
cloning an existing git repository to Fedorahosted.
. After that process is done - kindly remove the temporary repo that was
created:
+
....
$ sudo rm -rf /tmp/tmp-$PROJECTNAME.git
....
=== Doing the conversion (alternate)
Alternately, here's another way to do this (tmz):
Setup a working dir:
....
[tmz@hosted1 tmp (master)]$ mkdir im-chooser-conversion && cd im-chooser-conversion
....
Create authors file mapping svn usernames to Name <email> form git
uses.:
....
[tmz@hosted1 im-chooser-conversion (master)]$ ~tmz/svn-to-git-authors im-chooser > authors
....
Convert svn to git:
....
[tmz@hosted1 im-chooser-conversion (master)]$ git svn clone -s -A authors --no-metadata file:///svn/im-chooser
....
Move svn branches and tags into proper locations for the new git repo.
(git-svn leaves them as 'remote' branches/tags.):
....
[tmz@hosted1 im-chooser-conversion (master)]$ cd im-chooser
[tmz@hosted1 im-chooser (master)]$ mv .git/refs/remotes/tags/* .git/refs/tags/ && rmdir .git/refs/remotes/tags
[tmz@hosted1 im-chooser (master)]$ mv .git/refs/remotes/* .git/refs/heads/
....
Now 'git branch' and 'git tag' should display the branches/tags.
Create a bare repo from the converted git repo. Using `file://$(pwd)`
here ensures that git copies all objects to the new bare repo.:
....
[tmz@hosted1 im-chooser-conversion (master)]$ git clone --bare --shared file://$(pwd)/im-chooser im-chooser.git
....
Follow the steps in
https://fedoraproject.org/wiki/Hosted_repository_setup to finish setting
proper modes and permissions for the repo. Don't forget to update the
description file.
[NOTE]
.Note
====
This still leaves moving the converted bare repo (im-chooser.git) to
/git and fixing up the user/group.
====
== Questions left to be answered with this SOP
* Obviously we need to have requestor review the migration and confirm
it's ok.
* Do we then delete the old SCM contents?
* Do we need to change the FAS-group type to grant them access to
pull/push from it?

View file

@ -0,0 +1,51 @@
= HOTFIXES SOP
From time to time we have to quickly patch a problem or issue in
applications in our infrastructure. This process allows us to do that
and track what changed and be ready to remove it when the issue is fixed
upstream.
== Ansible based items:
For ansible, they should be placed after the task that installs the
package to be changed or modified. Either in roles or tasks.
hotfix tasks should be called "HOTFIX description" They should also link
in comments to any upstream bug or ticket. They should also have tags of
'hotfix'
The process is:
* Create a diff of any files changed in the fix.
* Check in the _original_ files to the role/task.
* Now check in your diffs of those same files.
* ansible will replace the files on the affected machines completely
with the fixed versions.
* If you need to back it out, you can revert the diff step, wait and
then remove the first checkin.
Example:
....
<task that installs the httpd package>
#
# install hash randomization hotfix
# See bug https://bugzilla.redhat.com/show_bug.cgi?id=812398
#
- name: hotfix - copy over new httpd init script
copy: src="{{ files }}/hotfix/httpd/httpd.init" dest=/etc/init.d/httpd
owner=root group=root mode=0755
notify:
- restart apache
tags:
- config
- hotfix
- apache
....
== Upstream changes
Also, if at all possible a bug should be filed with the upstream
application to get the fix in the next version. Hotfixes are something
we should strive to only carry a short time.

View file

@ -0,0 +1,147 @@
= The New Hotness
https://github.com/fedora-infra/the-new-hotness/[the-new-hotness] is a
https://fedora-messaging.readthedocs.io/en/stable/[fedora messaging
consumer] that subscribes to
https://release-monitoring.org/[release-monitoring.org] fedora messaging
notifications to determine when a package in Fedora should be updated.
For more details on the-new-hotness, consult the
http://the-new-hotness.readthedocs.io/[project documentation].
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin #fedora-apps
Persons::
zlopez
Location::
iad2.fedoraproject.org
Servers::
Production
+
* hotness01.iad2.fedoraproject.org
+
Staging
+
* hotness01.stg.iad2.fedoraproject.org
Purpose::
File issues when upstream projects release new versions of a package
== Hosts
The current deployment is made up of the-new-hotness OpenShift
namespace.
[[the-new-hotness-1]]
=== the-new-hotness
This OpenShift namespace runs following pods:
* A fedora messaging consumer
This OpenShift project relies on:
* Anitya (see the Anitya Infrastructure SOP) as the message publisher
* Fedora messaging RabbitMQ hub for consuming messages
* Koji for scratch builds
* Bugzilla for issue reporting
== Releasing
The release process is described in
https://the-new-hotness.readthedocs.io/en/stable/dev-guide.html#release-guide[the-new-hotness
documentation].
=== Deploying
Staging deployment of the-new-hotness is deployed in OpenShift on
os-master01.stg.iad2.fedoraproject.org.
To deploy staging instance of the-new-hotness you need to push changes
to staging branch on
https://github.com/fedora-infra/the-new-hotness[the-new-hotness GitHub].
GitHub webhook will then automatically deploy a new version of
the-new-hotness on staging.
Production deployment of the-new-hotness is deployed in OpenShift on
os-master01.iad2.fedoraproject.org.
To deploy production instance of the-new-hotness you need to push
changes to production branch on
https://github.com/fedora-infra/the-new-hotness[the-new-hotness GitHub].
GitHub webhook will then automatically deploy a new version of
the-new-hotness on production.
==== Configuration
To deploy the new configuration, you need
https://fedora-infra-docs.readthedocs.io/en/latest/sysadmin-guide/sops/sshaccess.html[ssh
access] to batcave01.iad2.fedoraproject.org and
https://fedora-infra-docs.readthedocs.io/en/latest/sysadmin-guide/sops/ansible.html[permissions
to run the Ansible playbook].
All the following commands should be run from batcave01.
First, ensure there are no configuration changes required for the new
update. If there are, update the Ansible the-new-hotness role(s) and optionally
run the playbook:
....
$ sudo rbac-playbook openshift-apps/the-new-hotness.yml
....
The configuration changes could be limited to staging only using:
....
$ sudo rbac-playbook openshift-apps/the-new-hotness.yml -l staging
....
This is recommended for testing new configuration changes.
==== Upgrading
===== Staging
To deploy new version of the-new-hotness you need to push changes to
staging branch on
https://github.com/fedora-infra/the-new-hotness[the-new-hotness GitHub].
GitHub webhook will then automatically deploy a new version of
the-new-hotness on staging.
===== Production
To deploy new version of the-new-hotness you need to push changes to
production branch on
https://github.com/fedora-infra/the-new-hotness[the-new-hotness GitHub].
GitHub webhook will then automatically deploy a new version of
the-new-hotness on production.
Congratulations! The new version should now be deployed.
== Monitoring Activity
It can be nice to check up on the-new-hotness to make sure its behaving
correctly. You can see all the Bugzilla activity using the
https://bugzilla.redhat.com/page.cgi?id=user_activity.html[user activity
query] (staging uses
https://partner-bugzilla.redhat.com/page.cgi?id=user_activity.html[partner-bugzilla.redhat.com])
and querying for the `upstream-release-monitoring@fedoraproject.org`
user.
You can also view all the Koji tasks dispatched by the-new-hotness. For
example, you can see the
https://koji.fedoraproject.org/koji/tasks?state=failed&owner=hotness[failed
tasks] it has created.
To monitor the pods of the-new-hotness you can connect to Fedora infra
OpenShift and look at the state of pods.
For staging look at the [.title-ref]#the-new-hotness# namespace in
https://os.stg.fedoraproject.org/console/project/release-monitoring/overview[staging
OpenShift instance].
For production look at the [.title-ref]#the-new-hotness# namespace in
https://os.fedoraproject.org/console/project/release-monitoring/overview[production
OpenShift instance].

View file

@ -0,0 +1,144 @@
= Fedora Hubs SOP
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main, sysadmin-tools, sysadmin-hosted
Location::
?
Servers::
<prod-srv-hostname>, <stg-srv-hostname>, hubs-dev.fedorainfracloud.org
Purpose::
Contributor and team portal.
== Description
Fedora Hubs aggregates user and team activity throughout the Fedora
infrastructure (and elsewhere) to show what a user or a team is doing.
It helps new people find a place to contribute.
=== Components
Fedora Hubs has the following components:
* a SQL database like PostgreSQL (in the Fedora infra we're using the
shared database).
* a Redis server that is used as a message bus (it is not critical if
the content is lost). System service: `redis`.
* a MongoDB server used to store the contents of the activity feeds.
It's JSON data, limited to 100 entries per user or group. Service:
`mongod`.
* a Flask-based WSGI app served by Apache + mod_wsgi, that will also
serve the JS front end as static files. System service: `httpd`.
* a Fedmsg listener that receives messages from the fedmsg bus and puts
them in Redis. System service: `fedmsg-hub`.
* a set of "triage" workers that pull the raw messages from Redis,
process them using SQL queries and puts work items in another Redis
queue. System service: `fedora-hubs-triage@`.
* a set of "worker" daemons that pull from this other Redis queue, work
on the items by making SQL queries and external HTTP requests (to Github
for example), and put reload notifications in the SSE Redis queue. They
also access the caching system, which can be local files or memcached.
System service: `fedora-hubs-worker@`.
* The SSE server (Twisted-based) that pulls from that Redis queue and
sends reload notifications to the connected browsers. It handles
long-lived HTTP connection but there is little activity: only the
notifications and a "keepalive ping" message every 30 seconds to every
connected browser. System service: `fedora-hubs-sse`. Apache is
configured to proxy the `/sse` path to this server.
== Managing the services
Restarting all the services:
....
systemctl restart fedmsg-hub fedora-hubs-\*
....
By default, 4 `triage` daemons and 4 `worker` daemons are enabled. To
add another `triage` daemon and another `worker` daemon, you can run:
....
systemctl enable --now fedora-hubs-triage@5.service
systemctl enable --now fedora-hubs-worker@5.service
....
It is not necessary to have the same number of `triage` and `worker`
daemons, in fact it is expected that more `worker` than `triage` daemons
will be necessary, as they do more time-consuming work.
== Hubs-specific operations
Other Hubs-specific operations are done using the
[.title-ref]#fedora-hubs# command:
....
$ fedora-hubs
Usage: fedora-hubs [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
cache Cache-related operations.
db Database-related operations.
fas FAS-related operations.
run Run daemon processes.
....
=== Manipulating the cache
The `cache` subcommand is used to do cache-related operations:
....
$ fedora-hubs cache
Usage: fedora-hubs cache [OPTIONS] COMMAND [ARGS]...
Cache-related operations.
Options:
--help Show this message and exit.
Commands:
clean Clean the specified WIDGETs (id or name).
coverage Check the cache coverage.
list List widgets for which there is cached data.
....
For example, to check the cache coverage:
....
$ fedora-hubs cache coverage
107 cached values found, 95 are missing.
52.97 percent cache coverage.
....
The cache coverage value is an interesting metric that could be used in
a Nagios check. A value below 50% could be considered as significant of
application slowdowns and could thus generate a warning.
=== Interacting with FAS
The `fas` subcommand is used to get information from FAS:
....
$ fedora-hubs fas
Usage: fedora-hubs fas [OPTIONS] COMMAND [ARGS]...
FAS-related operations.
Options:
--help Show this message and exit.
Commands:
create-team Create the team hub NAME from FAS.
sync-teams Sync all the team hubs NAMEs from FAS.
....
To add a new team hub for a FAS group, run:
....
$ fedora-hubs fas create-team <fas-group-name>
....

View file

@ -0,0 +1,60 @@
= IBM RSA II Infrastructure SOP
Many of our physical machines use RSA II cards for remote management.
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main
Location::
PHX, ibiblio
Servers::
All physical IBM machines
Purpose::
Provide remote management for our physical IBM machines
== Restarting the RSA II card
Normally, the RSA II can be restarted from the web/ssh interface. If you
are locked out of any outside access to the RSA II, follow these
instructions on the physical machine.
If the machine can be rebooted without issue, cut off all power to the
machine, wait a few seconds, and restart everything.
Otherwise, to restart the card without rebooting the machine:
[arabic]
. Download and install the IBM Remote Supervisor Adapter II Daemon:
.. `yum install usbutils libusb-devel` # (needed by the RSA II daemon)
.. Download the correct tarball from
http://www-947.ibm.com/systems/support/supportsite.wss/docdisplay?lndocid=MIGR-5071676&brandind=5000008
(TODO: check if this can be packaged in Fedora)
.. Extract the tarball and run `sudo ./install.sh --update`
. Download and extract the IBM Advanced Settings Utility from
http://www-947.ibm.com/systems/support/supportsite.wss/docdisplay?lndocid=TOOL-ASU&brandind=5000016
+
[WARNING]
====
This tarball dumps files in the current working directory.
====
. Issue a `sudo ./asu64 rebootrsa` to reboot the RSA II.
. Clean up: `yum remove ibmusbasm64`
== Other Resources
http://www.redbooks.ibm.com/abstracts/sg246495.html may be a useful
resource to refer to when working with this.

View file

@ -0,0 +1,73 @@
= System Administrator Guide
Welcome to The Fedora Infrastructure system administration guide.
[[sysadmin-getting-started]]
== Getting Started
If you haven't already, you should complete the general
`getting-started` guide. Once you've completed that, you're ready to get
involved in the
https://admin.fedoraproject.org/accounts/group/view/fi-apprentice[Fedora
Infrastructure Apprentice] group.
=== Fedora Infrastructure Apprentice
The
https://admin.fedoraproject.org/accounts/group/view/fi-apprentice[Fedora
Infrastructure Apprentice] group in the Fedora Account System grants
read-only access to many Fedora infrastructure machines. This group is
used for new folks to look around at the infrastructure setup, check
machines and processes and see where they might like to contribute
moving forward. This also allows apprentices to examine and gather info
on problems, then propose solutions.
[NOTE]
.Note
====
This group will be pruned often of inactive folks who miss the monthly
email check-in on the
https://lists.fedoraproject.org/admin/lists/infrastructure.lists.fedoraproject.org/[infrastructure
mailing list]. There's nothing personal in this and you're welcome to
re-join later when you have more time, we just want to make sure the
group only has active members.
====
Members of the https://admin.fedoraproject.org/accounts/group/view/fi-apprentice[Fedora
Infrastructure Apprentice] group have ssh/shell access to many machines,
but no sudo rights or ability to commit to the
https://pagure.io/fedora-infra/ansible/[Ansible repository] (but they do
have read-only access). Apprentices can, however, contribute to the
infrastructure documentation by making a pull request to the
https://pagure.io/infra-docs/[infra-docs] repository. Access is via the
bastion.fedoraproject.org machine and from there to each machine. See
the `ssh-sop` for instructions on how to set up SSH. You can see a list
of hosts that allow apprentice access by using:
....
$ ./scripts/hosts_with_var_set -i inventory/ -o ipa_client_shell_groups=fi-apprentice
....
from a checkout of the https://pagure.io/fedora-infra/ansible/[Ansible
repository]. The Ansible repository is hosted on pagure.io at
`https://pagure.io/fedora-infra/ansible.git`.
=== Selecting a Ticket
Start by checking out the
https://pagure.io/fedora-infrastructure/issues?status=Open&tags=easyfix[easyfix
tickets]. Tickets marked with this tag are a good place for apprentices
to learn how things are setup, and also contribute a fix.
Since apprentices do not have commit access to the
https://pagure.io/fedora-infra/ansible/[Ansible repository], you should
make your change, produce a patch with `git diff`, and attach it to the
infrastructure ticket you are working on. It will then be reviewed.
[[sops]]
== Standard Operating Procedures
Below is a table of contents containing all the standard operating
procedures for Fedora Infrastructure applications. For information on
how to write a new standard operating procedure, consult the guide on
`develop-sops`.

View file

@ -0,0 +1,55 @@
= Infrastructure Git Repos
Setting up an infrastructure git repo - and the push mechanisms for the
magicks
We have a number of git repos (in /git on batcave) that manage files for
ansible, our docs, our common host info database and our kickstarts. This
is a doc on how to set up a new one of these, if it is needed.
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main
Location::
Phoenix
Servers::
batcave01.phx2.fedoraproject.org, batcave-comm01.qa.fedoraproject.org
== Steps
Create the bare repo:
....
mkdir $git_dir
setfacl -m d:g:$yourgroup:rwx -m d:g:$othergroup:rwx \
-m g:$yourgroup:rwx -m g:$othergroup:rwx $git_dir
cd $git_dir
git init --bare
....
edit up config - add these lines to the bottom:
....
[hooks]
# (normally sysadmin-members@fedoraproject.org)
mailinglist = emailaddress@yourdomain.org
emailprefix =
maildomain = fedoraproject.org
reposource = /path/to/this/dir
repodest = /path/to/where/you/want/the/files/dumped
....
edit up description - make it something useful, then set up the hooks:
....
cd hooks
rm -f *.sample
cp hooks from /git/infra-docs/hooks/ on batcave01 to this path
....
modify sudoers so that users in the groups that can commit to this repo
can run /usr/local/bin/syncgittree.sh without entering a password
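For example, a sudoers entry along these lines (the group name here is
illustrative):

....
%sysadmin-web ALL=(ALL) NOPASSWD: /usr/local/bin/syncgittree.sh
....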

View file

@ -0,0 +1,115 @@
= Infrastructure Host Rename SOP
This page is intended to guide you through the process of renaming a
virtual node.
== Contents
[arabic]
. Introduction
. Finding out where the host is
. Preparation
. Renaming the Logical Volume
. Doing the actual rename
. Telling ansible about the new host
. VPN Stuff
== Introduction
Throughout this SOP, we will refer to the old hostname as $oldhostname
and the new hostname as $newhostname. We will refer to the Dom0 host
that the vm resides on as $vmhost.
If this process is being followed so that a temporary-named host can
replace a production host, please be sure to follow the
Infrastructure retire machine SOP to properly decommission the old
host before continuing.
== Finding out where the host is
In order to rename the host, you must have access to the Dom0 (host) on
which the virtual server resides. To find out which host that is, log in
to batcave01, and run:
....
grep $oldhostname /var/log/virthost-lists.out
....
The first column of the output will be the Dom0 of the virtual node.
== Preparation
SSH to $oldhostname. If the new name is replacing a production box,
change the IP Address that it binds to, in
`/etc/sysconfig/network-scripts/ifcfg-eth0`.
Also change the hostname in `/etc/sysconfig/network`.
At this point, you can `sudo poweroff` $oldhostname.
Open an ssh session to $vmhost, and make sure that the node is listed as
`shut off`. If it is not, you can force it off with:
....
virsh destroy $oldhostname
....
== Renaming the Logical Volume
Find out the name of the logical volume (on $vmhost):
....
virsh dumpxml $oldhostname | grep 'source dev'
....
This will give you a line that looks like
`<source dev='/dev/VolGroup00/$oldhostname'/>` which tells you that
`/dev/VolGroup00/$oldhostname` is the path to the logical volume.
Run `/usr/sbin/lvrename <old path> <new path>`, where the old path is the
path you found above and the new path is the same path with $newhostname
at the end instead of $oldhostname. For example:

....
/usr/sbin/lvrename /dev/VolGroup00/noc03-tmp /dev/VolGroup00/noc01
....
== Doing the actual rename
Now that the logical volume has been renamed, we can rename the host in
libvirt.
Dump the configuration of $oldhostname into an xml file, by running:
....
virsh dumpxml $oldhostname > $newhostname.xml
....
Open up $newhostname.xml, and change all instances of $oldhostname to
$newhostname.
Save the file and run:
....
virsh define $newhostname.xml
....
If there are no errors above, you can undefine $oldhostname:
....
virsh undefine $oldhostname
....
Power on $newhostname, with:
....
virsh start $newhostname
....
And remember to set it to autostart:
....
virsh autostart $newhostname
....
== VPN Stuff
TODO

View file

@ -0,0 +1,75 @@
= Infrastructure/SOP/Raid Mismatch Count
What to do when a raid device has a mismatch count
== Contents
[arabic]
. Contact Information
. Description
. Correction
.. Step 1
.. Step 2
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main
Location::
All
Servers::
Physical hosts
Purpose::
Correct mismatch counts on software RAID devices.
== Description
In some situations a raid device may indicate there is a count mismatch
as listed in:
....
/sys/block/mdX/md/mismatch_cnt
....
Anything other than 0 is considered not good. Though if the number is
low it's probably nothing to worry about. To correct this situation try
the directions below.
== Correction
More than anything these steps are to A) Verify there is no problem and
B) make the error go away. If step 1 and step 2 don't correct the
problems, PROCEED WITH CAUTION. The steps below, however, should be
relatively safe.
Issue a repair (replace mdX with the questionable raid device):
....
echo repair > /sys/block/mdX/md/sync_action
....
Depending on the size of the array and disk speed this can take a while.
Watch the progress with:
....
cat /proc/mdstat
....
Issue a check. It's this check that will reset the mismatch count if
there are no problems. Again replace mdX with your actual raid device.:
....
echo check > /sys/block/mdX/md/sync_action
....
Just as before, you can watch the progress with:
....
cat /proc/mdstat
....

View file

@ -0,0 +1,113 @@
= Infrastructure Yum Repo SOP
In some cases RPM's in Fedora need to be rebuilt for the Infrastructure
team to suit our needs. This repo is provided to the public (except for
the RHEL RPMs). Rebuilds go into this repo which are stored on the
netapp and shared via the proxy servers after being built on koji.
For basic instructions, read the standard documentation on Fedora wiki:
- https://fedoraproject.org/wiki/Using_the_Koji_build_system
This document will only outline the differences between the "normal"
repos and the infra repos.
== Contents
[arabic]
. Contact Information
. Building an RPM
. Tagging an existing build
. Promoting a staging build
. Koji package list
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Location::
PHX https://kojipkgs.fedoraproject.org/repos-dist/
Servers::
koji batcave01 / Proxy Servers
Purpose::
Provides infrastructure repo for custom Fedora Infrastructure rebuilds
== Building an RPM
Building an RPM for Infrastructure is significantly easier than building
an RPM for Fedora. Basically get your SRPM ready, then submit it to koji
for building to the $repo-infra target. (e.g. epel7-infra).
Example:
....
rpmbuild --define "dist .el7.infra" -bs test.spec
koji build epel7-infra test-1.0-1.el7.infra.src.rpm
....
[NOTE]
.Note
====
Remember to build it for every dist / arch you need to deploy it on.
====
After it has been built, you will see it's tagged as
$repo-infra-candidate; this means that it is a candidate for being
signed. The automatic signing system will pick it up and sign the
package for you without any further intervention. You can track when
this is done by checking the build info: when it is moved from
$repo-infra-candidate to $repo-infra-stg, it has been signed. You can
check this on the web interface (look under "Tags"), or via:
....
koji buildinfo test-1.0-1.el7.infra
....
After the build has been tagged into the $repo-infra-stg tag,
tag2distrepo will automatically create a distrepo task, which will
update the repository so that the package is available on staging hosts.
After this time, you can run `yum clean all` and then install the packages via
`yum install` or `yum update`.
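For example, on a staging host (package name is hypothetical):
....
yum clean all
yum install test
....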
== Tagging existing builds
If you already have a real build and want to use it in the
infrastructure before it has landed in stable, you can tag it into the
respective infra-candidate tag. For example, if you have an epel7 build
of test2-1.0-1.el7.infra, run:
....
koji tag epel7-infra-candidate test2-1.0-1.el7.infra
....
And then the same autosigning and repogen from the previous section
applies.
== Promoting a staging build
After getting autosigned, builds will land in the respective infra-stg
tag, for example epel7-infra-stg. These tags go into repos that are
enabled on staging machines, but not on production. If you decide, after
testing, that the build is good enough for production, you can promote
it by running:
....
koji move epel7-infra-stg epel7-infra test2-1.0-1.el7.infra
....
== Koji package list
If you try to build a package into the infra tags and koji says something
like `BuildError: package test not in list for tag epel7-infra-candidate`,
that means the package has not been added to the list for building in that
particular tag. Either add the package to the respective Fedora/EPEL
branches (this is the preferred method, since we should always aim to get
everything packaged for Fedora/EPEL), or add the package to the listing for
the respective tag.
To add package to infra tag, run:
....
koji add-pkg $tag $package --owner=$user
....
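For example, to allow a hypothetical package `test` owned by `jdoe` into the
epel7 infra tag:
....
koji add-pkg epel7-infra-candidate test --owner=jdoe
....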
= Infrastructure retire machine SOP
== Introduction
When a machine (be it a virtual instance or real physical hardware) is
decommissioned, a set of steps must be followed to ensure that the
machine is properly removed from the set of machines we manage and
doesn't cause problems down the road.
== Retire process
[arabic]
. Ensure that the machine is no longer used for anything. Use git-grep,
stop services, etc.
. Remove the machine from ansible. Make sure you not only remove the main
machine name, but also any aliases it might have (or move them to an
active server if they are active services). Make sure to search for the
IP address(es) of the machine as well. Ensure dns is updated to remove
the machine.
. Remove the machine from any labels in hardware devices like consoles or
the like.
. Revoke the ansible cert for the machine.
. Move the machine xml definition to ensure it does NOT start on boot. You
can move it to 'name-retired-YYYY-MM-DD'.
. Ensure any backend storage the machine was using is freed or renamed to
name-retired-YYYY-MM-DD.
== TODO
fill in commands
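Until this TODO is filled in, the following sketch illustrates the kind of
commands involved (hostnames, volume group, and paths are examples only;
adapt to the actual host and vmhost):
....
# 1. look for remaining references in the ansible repo (on batcave01)
git grep -i $hostname

# 5. on the vmhost: keep the xml under a retired name, stop autostart, undefine
virsh autostart --disable $hostname
virsh shutdown $hostname
virsh dumpxml $hostname > /root/$hostname-retired-YYYY-MM-DD.xml
virsh undefine $hostname

# 6. rename the backing logical volume
lvrename /dev/VolGroup00/$hostname /dev/VolGroup00/$hostname-retired-YYYY-MM-DD
....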
= Infrastructure/SOP/Yubikey
This document describes how yubikey authentication works
== Contents
[arabic]
. Contact Information
. User Information
. Host Admins
.. pam_yubico
. Server Admins
.. Basic architecture
.. ykval
.. ykksm
.. Physical Yubikey info
. fas integration
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main
Location::
Phoenix
Servers::
fas*, db02
Purpose::
Provides yubikey authentication in Fedora
== Config Files
* `/etc/httpd/conf.d/yk-ksm.conf`
* `/etc/httpd/conf.d/yk-val.conf`
* `/etc/ykval/ykval-config.php`
* `/etc/ykksm/ykksm-config.php`
* `/etc/fas.cfg`
== User Information
See Infrastructure/Yubikey
== Host Admins
=== pam_yubico
The /etc/yubikeyid file, generated from FAS, works like an authorized_keys
file and maps valid keys to users. It is downloaded from FAS:
https://admin.fedoraproject.org/accounts/yubikey/dump
== Server Admins
=== Basic architecture
Yubikey authentication takes place in 3 basic phases.
[arabic]
. User presses yubikey which generates a one time password
. {blank}
+
The one time password makes its way to the yk-val application which::
verifies it is not a replay
. {blank}
+
yk-val passes that otp on to the yk-ksm application which verifies the::
key itself is a valid key
If all of those steps succeed, the ykval application sends back an OK
and authentication is considered successful. The two applications are
defined below, if either of them is unavailable, yubikey authentication
will fail.
==== ykval
Database: db02:ykval
The database contains 3 tables:

clients:: just a list of valid clients. These are not users; these are
systems able to authenticate against ykval. In our case Fedora is the only
client, so there's just one entry here.
queue:: used for distributed setups (we don't do this).
yubikeys:: maps which yubikey belongs to which user.
ykval is installed on fas* and is located at:
http://localhost/yk-val/verify
Purpose: Is to map keys to users and protect against replay attacks
==== ykksm
Database: db02:ykksm
The database contains one table, yubikeys, which maps who created keys, what
key was created, when, the public name and serial number, whether it's
active, etc.
ykksm is installed on fas* at http://localhost/yk-ksm
Purpose: verify if a key is a valid known key or not. Nothing contacts
this service directly except for ykval. This should be considered the
“high security” portion of the system as access to this table would
allow users to make their own yubikeys.
==== Physical Yubikey info
The actual yubikey contains information to generate a one time password.
The important bits to know are that the beginning of the otp contains the
identifier of the key (used similarly to how ssh uses authorized_keys) and
that the rest of it contains lots of bits of information, including an
incrementing serial.
Sample key: `ccccfcdaivjrvdhvzfljbbievftnvncljhibkulrftt`
Breaking this up, the first 12 characters are the identifier, which can be
considered 'public':
`ccccfcdaivj rvdhvzfljbbievftnvncljhibkulrftt`
The second half is the otp part.
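For illustration, the split can be reproduced on the command line with bash
substring expansion (a sketch using the sample key above):
....
otp=ccccfcdaivjrvdhvzfljbbievftnvncljhibkulrftt
echo "public id: ${otp:0:12}"
echo "otp part:  ${otp:12}"
....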
== fas integration
Fas integration has two main parts. The first is key generation, the next is
activation. The fas-plugin-yubikey contains the bits for both, as well as
verification. Users call on this page to generate the key info:
https://admin.fedoraproject.org/accounts/yubikey/genkey
The fas password field automatically detects whether someone is using a
otp or a regular password. It then sends otp requests to yk-val for
verification.
= Ipsilon Infrastructure SOP
== Contents
[arabic]
. Contact Information
. Description
. Known Issues
. Restarting
. Configuration
. Common actions
.. Registering OpenID Connect Scopes
.. Generate an OpenID Connect token
.. Create OpenID Connect secrets for apps
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Primary upstream contact::
Patrick Uiterwijk - FAS: puiterwijk
Backup upstream contact::
Simo Sorce - FAS: simo (irc: simo) Howard Johnson - FAS: merlinthp
(irc: MerlinTHP) Rob Crittenden - FAS: rcritten (irc: rcrit)
Location::
Phoenix
Servers::
ipsilon01.phx2.fedoraproject.org ipsilon02.phx2.fedoraproject.org
ipsilon01.stg.phx2.fedoraproject.org
Purpose::
Ipsilon is our central authentication service that is used to
authenticate users against FAS. It is separate from FAS.
== Description
Ipsilon is our central authentication agent that is used to authenticate
users against FAS. It is separate from FAS. The only service that is not
using this currently is the wiki. It is a web service that is presented
via httpd and is load balanced by our standard haproxy setup.
== Known issues
No known issues at this time. There is not currently a logout option for
ipsilon, but it is not considered an issue. If group memberships are
updated in ipsilon the user will need to wait a few minutes for them to
replicate to the all the systems.
== Restarting
To restart the application you simply need to ssh to the servers for the
problematic region and issue a `service httpd restart`. This should
rarely be required.
== Configuration
Configuration is handled by the ipsilon.yaml playbook in Ansible. This
can also be used to reconfigure the application, if that becomes necessary.
== Common actions
This section describes some common configuration actions.
=== OpenID Connect Scope Registration
As documented on
https://fedoraproject.org/wiki/Infrastructure/Authentication,
application developers can request their own scopes. When a request for
this comes in, look in ansible/roles/ipsilon/files/oidc_scopes/ and copy
an example module. Copy this to a new file, so we have a file per scope
set. Fill in the information:
* name is an Ipsilon-internal name. This should not include any spaces.
* display_name is the name of the category of scopes that is displayed to
the user.
* scopes is a dictionary with the full scope identifier (with namespace)
as keys. The values are dicts with the following keys:
** display_name: the complete display name for this scope. This is what
the user gets shown to accept/reject.
** claims: a list of additional "claims" (pieces of user information) an
application will get when the user consents to this scope. For most
scopes, this will be the empty list.
In ansible/roles/ipsilon/tasks/main.yml, add the name of the new file
(without .py) to the with_items of "Copy OpenID Connect scope
registrations"). To enable, open
ansible/roles/ipsilon/templates/configuration.conf, and look for the
lines starting with "openidc enabled extensions". Add the name of the
plugin (in the "name" field of the file) to the environment this
scopeset has been requested for. Run the ansible ipsilon.yml playbook.
=== Generate an OpenID Connect token
There is a handy script in the Ansible project under
`scripts/generate-oidc-token` that can help you generate an OIDC token.
It has a self-explanatory `--help` argument, and it will print out some
SQL that you can run against Ipsilon's database, as well as the token
that you seek.
The `SERVICE_NAME` (the required positional argument) is the name of the
application that wants to use the token to perform actions against
another service.
To generate the scopes, you can visit our link:[authentication] docs and
find the service you want the token to be used for. Each service has a
base namespace (a URL) and one or more scopes for that namespace. To
form a scope for this script, you concatenate the namespace of the
service with the scope you want to grant the service. You can provide
the script the -s flag multiple times if you want to grant more than one
scope to the same token.
As an example, to give Bodhi access to create waivers in WaiverDB, you
can see that the base namespace is
`https://waiverdb.fedoraproject.org/oidc/` and that there is a
`create-waiver` scope. You can run this to generate Ipsilon SQL and a
token with that scope:
....
[bowlofeggs@batcave01 ansible][PROD]$ ./scripts/generate-oidc-token bodhi -e 365 -s https://waiverdb.fedoraproject.org/oidc/create-waiver
Run this SQL against Ipsilon's database:
--------START CUTTING HERE--------
BEGIN;
insert into token values ('2a5f2dff-4e93-4a8d-8482-e62f40dce046','username','bodhi@service');
insert into token values ('2a5f2dff-4e93-4a8d-8482-e62f40dce046','security_check','-ptBqVLId-kUJquqkVyhvR0DbDULIiKp1eqbXqG_dfVK9qACU6WwRBN3-7TRfoOn');
insert into token values ('2a5f2dff-4e93-4a8d-8482-e62f40dce046','client_id','bodhi');
insert into token values ('2a5f2dff-4e93-4a8d-8482-e62f40dce046','expires_at','1557259744');
insert into token values ('2a5f2dff-4e93-4a8d-8482-e62f40dce046','type','Bearer');
insert into token values ('2a5f2dff-4e93-4a8d-8482-e62f40dce046','issued_at','1525723744');
insert into token values ('2a5f2dff-4e93-4a8d-8482-e62f40dce046','scope','["openid", "https://someapp.fedoraproject.org/"]');
COMMIT;
-------- END CUTTING HERE --------
Token: 2a5f2dff-4e93-4a8d-8482-e62f40dce046_-ptBqVLId-kUJquqkVyhvR0DbDULIiKp1eqbXqG_dfVK9qACU6WwRBN3-7TRfoOn
....
Once you have the SQL, you can run it against Ipsilon's database, and
you can provide the token to the application through some secure means
(such as putting it into Ansible's secrets and telling the requestor the
Ansible variable they can use to access it).
=== Create OpenID Connect secrets for apps
Application wanting to use OpenID Connect need to register against our
OpenID Connect server (Ipsilon). Since we do not allow self-registration
(except on iddev.fedorainfracloud.org) for obvious reasons, the secrets
need to be created and configured per application and environment
(production vs staging).
To do so:

* Go to the private ansible repository.
* Edit the file: `files/ipsilon/openidc.{{env}}.static`
* At the bottom of this file, add the information concerning the
application you are adding. This will look something like:
____
....
fedocal client_name="fedocal"
fedocal client_secret="<long random string>"
fedocal redirect_uris=["https://calendar.stg.fedoraproject.org/oidc_callback"]
fedocal client_uri="https://calendar.stg.fedoraproject.org/"
fedocal ipsilon_internal={"type":"static","client_id":"fedocal","trusted":true}
fedocal contacts=["admin@fedoraproject.org"]
fedocal client_id=null
fedocal policy_uri="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
fedocal grant_types="authorization_code"
fedocal response_types="code"
fedocal application_type="web"
fedocal subject_type="pairwise"
fedocal logo_uri=null
fedocal tos_uri=null
fedocal jwks_uri=null
fedocal jwks=null
fedocal sector_identifier_uri=null
fedocal request_uris=[]
fedocal require_auth_time=null
fedocal token_endpoint_auth_method="client_secret_post"
fedocal id_token_signed_response_alg="RS256"
fedocal request_object_signing_alg="none"
fedocal initiate_login_uri=null
fedocal default_max_age=null
fedocal default_acr_values=null
fedocal client_secret_expires_at=0
....
____
In most situations, only the first 5 lines (up to `ipsilon_internal`)
will change. If the application is not using flask-oidc or is not
maintained by the Fedora Infrastructure, the first 11 lines (up to
`application_type`) may change. The remaining lines require a deeper
understanding of OpenID Connect and Ipsilon.
[NOTE]
.Note
====
`client_id` in `ipsilon_internal` must match the beginning of the line,
and the `client_id` field must either match the beginning of the line or
be `null` as in the example here.
====
[NOTE]
.Note
====
In our OpenID connect server, OIDC.user_getfield('nickname') will return
the FAS username, which we know from FAS is unique. However, not all
OpenID Connect servers enforce this constraint, so the application code
may rely on the `sub` which is the only key that is sure to be unique.
If the application relies on `sub` and wants `sub` to return the FAS
username, then the configuration should be adjusted with:
`subject_type="public"`.
====
After adjusting this file, you will need to make the `client_secret`
available to the application via ansible, for this simply add it to
`vars.yml` as we do for the other private variables and provide the
variable name to the person who requested it.
Finally, commit and push the changes to both files and run the
`ipsilon.yml` playbook.
= iSCSI
iscsi allows one to share and mount block devices using the scsi
protocol over a network. Fedora currently connects to a netapp that has
an iscsi export.
== Contents
[arabic]
. Contact Information
. Typical uses
. iscsi basics
.. Terms
.. iscsi's basic login / logout procedure
. Logging in
. Logging out
. Important note about creating new logical volumes
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main
Location::
Phoenix
Servers::
xen[1-15]
Purpose::
Provides iscsi connectivity to our netapp.
== Typical uses
The best uses for Fedora are for servers that are not part of a farm or
live replicated. For example, we wouldn't put app1 on the iscsi share
because we don't gain anything from it. Shutting down app1 to move it
isn't an issue because app1 is part of our application server farm.
noc1, however, is not replicated. It's a stand alone box that, at best,
would have a non-live failover. By placing this host on an iscsi share,
we can make it more highly available as it allows us to move that box
around our virtualization infrastructure without rebooting it or even
taking it down.
== iscsi basics
=== Terms
* initiator means client
* target means server
* swab means mop
* deck means floor
=== iscsi's basic login / logout procedure is
[arabic]
. Notify your client that a new target is available (similar to editing
/etc/fstab for a new nfs mount)
. Login to the iscsi target (similar to running "mount /my/nfs")
. Logout from the iscsi target (similar to running "umount /my/nfs")
. Delete the target from the client (similar to removing the nfs mount
from /etc/fstab)
==== Logging in
Most mounts are covered by ansible so this should be automatic. In the
event that something goes wrong though, the best way to fix this is:
* Notify the client of the target:
+
....
iscsiadm --mode node --targetname iqn.1992-08.com.netapp:sn.118047036 --portal 10.5.88.21:3260 -o new
....
* Log in to the new target:
+
....
iscsiadm --mode node --targetname iqn.1992-08.com.netapp:sn.118047036 --portal 10.5.88.21:3260 --login
....
* Scan and activate lvm:
+
....
pvscan
vgscan
vgchange -ay xenGuests
....
Once this is done, one should be able to run "lvs" to see the logical
volumes
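To confirm the session is actually established, you can also list active
iSCSI sessions before checking the logical volumes (a quick check):
....
iscsiadm --mode session
lvs
....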
==== Logging out
Logging out isn't normally needed, for example rebooting a machine
automatically logs the initiator out. Should a problem arise though here
are the steps:
* Disable the logical volume:
+
....
vgchange -an xenGuests
....
* log out:
+
....
iscsiadm --mode node --targetname iqn.1992-08.com.netapp:sn.118047036 --portal 10.5.88.21:3260 --logout
....
[NOTE]
.Note
====
`Cannot deactivate volume group`
If the vgchange command fails with an error about not being able to
deactivate the volume group, this means that one of the logical volumes
is still in use. By running "lvs" you can get a list of volume groups.
Look in the Attr column. There are 6 attrs listed. The 5th column
usually has a '-' or an 'a'. 'a' means its active, - means it is not. To
the right of that (the last column) you will see an '-' or an 'o'. If
you see an 'o' that means that logical volume is still mounted and in
use.
====
[IMPORTANT]
.Important
====
Note about creating new logical volumes
At present we do not have logical volume locking on the xen servers.
This is dangerous and being worked on. Basically when you create a new
volume on a host, you need to run:
....
pvscan
vgscan
lvscan
....
on the other virtualization servers.
====
= Jenkins Fedmsg SOP
Send information about Jenkins builds to fedmsg.
== Contact Information
Owner::
Ricky Elrod, Fedora Infrastructure Team
Contact::
#fedora-apps
== Reinstalling when it disappears
For an as-of-yet unknown reason, the plugin sometimes seems to
disappear, though it still shows as "installed" on Jenkins.
To re-install it, grab `fedmsg.hpi` from
`/srv/web/infra/bigfiles/jenkins`. Go to the Jenkins web
interface and log in. Click `Manage Jenkins` ->
`Manage Plugins` -> `Advanced`. Upload the
plugin and on the page that comes up, check the box to have Jenkins
restart when running jobs are finished.
== Configuration Values
These are written here in case the Jenkins configuration ever gets lost.
This is how to configure the jenkins-fedmsg-emit plugin.
Assume the plugin is already installed.
Go to "Configure Jenkins" -> "System Configuration"
Towards the bottom, look for "Fedmsg Emitter"
Values:

* Signing: Checked
* Fedmsg Endpoint: tcp://209.132.181.16:9941
* Environment Shortname: prod
* Certificate File: /etc/pki/fedmsg/jenkins-jenkins.fedorainfracloud.org.crt
* Keystore File: /etc/pki/fedmsg/jenkins-jenkins.fedorainfracloud.org.key
= Kerneltest-harness SOP
The kerneltest-harness is the web application used to gather and present
statistics about kernel test results.
== Contents
[arabic]
. Contact Information
. Documentation Links
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Location::
https://apps.fedoraproject.org/kerneltest/
Servers::
kerneltest01, kerneltest01.stg
Purpose::
Provide a system to gather and present kernel tests results
== Add a new Fedora release
* Login
* On the front page, in the menu on the left side, if there is a
`Fedora Rawhide` release, click on `(edit)`.
* Bump the `Release number` on `Fedora Rawhide`
to avoid conflicts with the new release you're creating
* Back on the index page, click on `New release`
* Complete the form:
+
Release number::
This would be the integer version of the Fedora release, for example
24 for Fedora 24.
Support::
The current status of the Fedora release
+
** Rawhide for Fedora Rawhide
** Test for branched release
** Release for released Fedora
** Retired for retired release of Fedora
== Upload new test results
The kernel tests are available on the
https://git.fedorahosted.org/cgit/kernel-tests.git/[kernel-test] git
repository.
Once run with `runtests.sh`, you can upload the resulting
file either using `fedora_submit.py` or the UI.
If you choose the UI the steps are simply:
* Login
* Click on `Upload` in the main menu on the top
* Select the result file generated by running the tests
* Submit
= Kickstart Infrastructure SOP
Kickstart scripts provide our install infrastructure. We have a plethora
of different kickstarts to best match the system you are trying to
install.
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main
Location::
Everywhere we have machines.
Servers::
batcave01 (stores kickstarts and install media)
Purpose::
Provides our install infrastructure
== Introduction
Our kickstart infrastructure lives on batcave01. All install media and
kickstart scripts are located on batcave01. Because the RHEL binaries
are not public we have these bits blocked. You can add needed IPs to
(from batcave01):
....
ansible/roles/batcave/files/allows
....
== Physical Machine (kvm virthost)
[NOTE]
.Note
====
PXE Booting: If PXE booting, just follow the prompts after doing the pxe
boot (most hosts will pxeboot via the console by hitting F12).
====
=== Prep
This only works on an already booted box; many boxes at our colocations
may have to be rebuilt by the people in those locations first. Also make
sure the IP you are about to boot to install from is allowed to our IP
restricted infrastructure.fedoraproject.org as noted above (in
Introduction).
Download the vmlinuz and initrd images.
for a rhel6 install:
....
wget https://infrastructure.fedoraproject.org/repo/rhel/RHEL6-x86_64/images/pxeboot/vmlinuz \
-O /boot/vmlinuz-install
wget https://infrastructure.fedoraproject.org/repo/rhel/RHEL6-x86_64/images/pxeboot/initrd.img \
-O /boot/initrd-install.img
grubby --add-kernel=/boot/vmlinuz-install \
--args="ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-6-nohd \
repo=https://infrastructure.fedoraproject.org/repo/rhel/RHEL6-x86_64/ \
ksdevice=link ip=$IP gateway=$GATEWAY netmask=$NETMASK dns=$DNS" \
--title="install el6" --initrd=/boot/initrd-install.img
....
for a rhel7 install:
....
wget https://infrastructure.fedoraproject.org/repo/rhel/RHEL7-x86_64/images/pxeboot/vmlinuz -O /boot/vmlinuz-install
wget https://infrastructure.fedoraproject.org/repo/rhel/RHEL7-x86_64/images/pxeboot/initrd.img -O /boot/initrd-install.img
....
For phx2 hosts:
....
grubby --add-kernel=/boot/vmlinuz-install \
--args="ks=http://10.5.126.23/repo/rhel/ks/hardware-rhel-7-nohd \
repo=http://10.5.126.23/repo/rhel/RHEL7-x86_64/ \
net.ifnames=0 biosdevname=0 bridge=br0:eth0 ksdevice=br0 \
ip={{ br0_ip }}::{{ gw }}:{{ nm }}:{{ hostname }}:br0:none" \
--title="install el7" --initrd=/boot/initrd-install.img
....
(You will need to set up the br1 device, if any, after install.)
For non phx2 hosts:
....
grubby --add-kernel=/boot/vmlinuz-install \
--args="ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-ext \
repo=https://infrastructure.fedoraproject.org/repo/rhel/RHEL7-x86_64/ \
net.ifnames=0 biosdevname=0 bridge=br0:eth0 ksdevice=br0 \
ip={{ br0_ip }}::{{ gw }}:{{ nm }}:{{ hostname }}:br0:none" \
--title="install el7" --initrd=/boot/initrd-install.img
....
Fill in the br0 ip, gateway, etc
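For illustration only (hypothetical addresses and hostname), a filled-in
ip= argument might look like:
....
ip=10.5.126.100::10.5.126.254:255.255.255.0:example01.phx2.fedoraproject.org:br0:none
....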
The default here is to use the hardware-rhel-7-nohd config which
requires you to connect via VNC to the box and configure its drives. If
this is a new machine or you are fine with blowing everything away, you
can instead use
https://infrastructure.fedoraproject.org/rhel/ks/hardware-rhel-6-minimal
as your kickstart
If you know the number of hard drives the system has there are other
kickstarts which can be used.
2 disk system::::
ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-02disk
or external::::
ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-02disk-ext
4 disk system::::
ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-04disk
or external::::
ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-04disk-ext
6 disk system::::
ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-06disk
or external::::
ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-06disk-ext
8 disk system::::
ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-08disk
or external::::
ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-08disk-ext
10 disk system::::
ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-10disk
or external::::
ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-10disk-ext
Double and triple check your configuration settings (On RHEL-6
`cat /boot/grub/menu.lst` and on RHEL-7 `cat /boot/grub2/grub.cfg`),
especially your IP information. In places like ServerBeach not all hosts
have the same netmask or gateway. Once everything looks right, run the
commands to set it up to boot the installer on the next boot.
RHEL-6:
....
echo "savedefault --default=0 --once" | grub --batch
shutdown -r now
....
RHEL-7:
....
grub2-reboot 0
shutdown -r now
....
=== Installation
Once the box logs you out, start pinging the IP address. It will
disappear and come back. Once you can ping it again, try to open up a
VNC session. It can take a couple of minutes after the box is back up
for it to actually allow vnc sessions. The VNC password is in the
kickstart script on batcave01:
....
grep vnc /mnt/fedora/app/fi-repo/rhel/ks/hardware-rhel-7-nohd
vncviewer $IP:1
....
If using the standard kickstart script, one can watch as the install
completes itself; there should be no need to do anything. If using the
hardware-rhel-6-nohd script, one will need to configure the drives. The
password is in the kickstart file in the kickstart repo.
=== Post Install
Run ansible on the box asap to set root passwords and other security
features. Don't leave a newly installed box sitting around.
= Koji Archive SOP

This SOP documents how to archive Fedora EOL'd builds from the DEFAULT
volume to the archived volume.

Before archiving the builds, identify if any of the EOL'd release builds
are still being used in the current releases. For example, to test if
f28 builds are still being used in f32, use:

 $ koji list-tagged f32 | grep fc28

Tag all these builds to koji's do-not-archive-yet tag, so that they won't
be archived. To do that, first add the packages to the do-not-archive-yet
tag:

 $ koji add-pkg do-not-archive-yet --owner <username> pkg1 pkg2 ...

Then tag the builds to the do-not-archive-yet tag:

 $ koji tag-build do-not-archive-yet build1 build2 ...

Then update the archive policy which is available in the releng repo
(https://pagure.io/releng/blob/master/f/koji-archive-policy).

Run the following from compose-x86-01.phx2.fedoraproject.org:

 $ cd
 $ wget https://pagure.io/releng/raw/master/f/koji-archive-policy
 $ git clone https://pagure.io/koji-tools/
 $ cd koji-tools
 $ ./koji-change-volumes -p compose_koji -v ~/archive-policy

In any case, if you need to move a build back to the DEFAULT volume:

 $ koji add-pkg do-not-archive-yet --owner <username> pkg1
 $ koji tag-build do-not-archive-yet build1
 $ koji set-build-volume DEFAULT <n-v-r>
= Setup Koji Builder SOP
== Contents
* Setting up a new koji builder
* Resetting/installing an old koji builder
== Builder Setup
Setting up a new koji builder involves a goodly number of steps:
=== Network Overview
[arabic]
. First get an instance spun up following the kickstart sop.
. Define a hostname for it on the .125 network and a $hostname-nfs name
for it on the .127 network.
. Make sure the instance has 2 network connections:
* eth0 should be on the .125 network
* eth1 should be on the .127 network
+
For a VM, eth0 should be on br0 and eth1 on br1 on the vmhost.
=== Setup Overview
* install the system as normal:
+
....
virt-install -n $builder_fqdn -r $memsize \
-f $path_to_lvm --vcpus=$numprocs \
-l http://10.5.126.23/repo/rhel/RHEL6-x86_64/ \
-x "ksdevice=eth0 ks=http://10.5.126.23/repo/rhel/ks/kvm-rhel-6 \
ip=$ip netmask=$netmask gateway=$gw dns=$dns \
console=tty0 console=ttyS0" \
--network=bridge=br0 --network=bridge=br1 \
--vnc --noautoconsole
....
* run python `/root/tmp/setup-nfs-network.py` this should print out the
-nfs hostname that you made above
* change root pw
* disable selinux on the machine in /etc/sysconfig/selinux
* reboot
* setup ssl cert into private/builders - use fqdn of host as DN
** login to fas01 as root
** `cd /var/lib/fedora-ca`
** `./kojicerthelper.py normal --outdir=/tmp/ \ --name=$fqdn_of_the_new_builder --cadir=. --caname=Fedora`
** info for the cert should be like this:
+
....
Country Name (2 letter code) [US]:
State or Province Name (full name) [North Carolina]:
Locality Name (eg, city) [Raleigh]:
Organization Name (eg, company) [Fedora Project]:
Organizational Unit Name (eg, section) []:Fedora Builders
Common Name (eg, your name or your servers hostname) []:$fqdn_of_new_builder
Email Address []:buildsys@fedoraproject.org
....
** scp the file in `/tmp/${fqdn}_key_and_cert.pem` over to batcave01
** put the file in the private repo under `private/builders/${fqdn}.pem`
** `git add` + `git commit`
** `git push`
* run `./sync-hosts` in infra-hosts repo; `git commit; git push`
* as a koji admin run:
+
....
koji add-host $fqdn i386 x86_64
(note: those are yum basearchs on the end - season to taste)
....
=== Resetting/installing an old koji builder
* disable the builder in koji (ask a koji admin)
* halt the old system (halt -p)
* undefine the vm instance on the buildvmhost:
+
....
virsh undefine $builder_fqdn
....
* reinstall it - from the buildvmhost run:
+
....
virt-install -n $builder_fqdn -r $memsize \
-f $path_to_lvm --vcpus=$numprocs \
-l http://10.5.126.23/repo/rhel/RHEL6-x86_64/ \
-x "ksdevice=eth0 ks=http://10.5.126.23/repo/rhel/ks/kvm-rhel-6 \
ip=$ip netmask=$netmask gateway=$gw dns=$dns \
console=tty0 console=ttyS0" \
--network=bridge=br0 --network=bridge=br1 \
--vnc --noautoconsole
....
* watch install via vnc:
+
....
vncviewer -via bastion.fedoraproject.org $builder_fqdn:1
....
* when the install finishes:
** start the instance on the buildvmhost:
+
....
virsh start $builder_fqdn
....
** set it to autostart on the buildvmhost:
+
....
virsh autostart $builder_fqdn
....
* when the guest comes up
** login via ssh using the temp root password
** python /root/tmp/setup-nfs-network.py
** change root password
** disable selinux in /etc/sysconfig/selinux
** reboot
** ask a koji admin to re-enable the host
= Koji Infrastructure SOP
[NOTE]
.Note
====
We are transitioning from two buildsystems, koji for Fedora and plague
for EPEL, to just using koji. This page documents both.
====
Koji and plague are our buildsystems. They share some of the same
machines to do their work.
== Contents
[arabic]
. Contact Information
. Description
. Add packages into Buildroot
. Troubleshooting and Resolution
.. Restarting Koji
.. kojid won't start or some builders won't connect
.. OOM (Out of Memory) Issues
... Increase Memory
... Decrease weight
.. Disk Space Issues
.. Unmounting (being sure filesystems in chroots are unmounted before you
delete the chroots)
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-build group
Persons::
mbonnet, dgilmore, f13, notting, mmcgrath, SmootherFrOgZ
Location::
Phoenix
Servers::
* koji.fedoraproject.org
* buildsys.fedoraproject.org
* xenbuilder[1-4]
* hammer1, ppc[1-4]
Purpose::
Build packages for Fedora.
== Description
Users submit builds to koji.fedoraproject.org or
buildsys.fedoraproject.org. From there it gets passed on to the
builders.
[IMPORTANT]
.Important
====
At present plague and koji are unaware of each other. A result of this
may be an overloaded builder. An easy fix for this is not clear at this
time.
====
== Add packages into Buildroot
Some contributors may need to build packages against freshly built
packages which are not in the buildroot yet. Koji has override tags as an
inheritance to the build tag in order to include them into the buildroot,
which can be set by:
....
koji tag-pkg dist-$release-override <package_nvr>
....
== Troubleshooting and Resolution
=== Restarting Koji
If for some reason koji needs to be restarted, make sure to restart the
koji master first, then the builders. If the koji master has been down
for a short enough time the builders do not need to be restarted:
....
service httpd restart
service kojira restart
service kojid restart
....
[IMPORTANT]
.Important
====
If postgres becomes interrupted in some way, koji will need to be
restarted. As long as the koji master daemon gets restarted the builders
should reconnect automatically. If the db server has been restarted and
the builders don't seem to be building, restart their daemons as well.
====
=== kojid won't start or some builders won't connect
In the event that some items are able to connect to koji while some are
not, please make sure that the database is not filled up on connections.
This is common if koji crashes and the db connections aren't properly
cleared. Upon restart many of the connections are full so koji cannot
reconnect. Clearing old connections is easy, guess about how long it the
new koji has been up and pick a number of minutes larger then that and
kill those queries. From db3 as postgres run:
....
echo "select procpid from pg_stat_activity where usename='koji' and now() - query_start \
>= '00:40:00' order by query_start;" | psql koji | grep "^ " | xargs kill
....
=== OOM (Out of Memory) Issues
Out of memory issues occur from time to time on the build machines.
There are a couple of options for correction. The first fix is to just
restart the machine and hope it was a one time thing. If the problem
continues please choose from one of the following options.
==== Increase Memory
The xen machines can have memory increased on their corresponding xen
hosts. At present this is the table:
[width="34%",cols="44%,56%",]
|===
|xen3 |xenbuilder1
|xen4 |xenbuilder2
|disabled |xenbuilder3
|xen8 |xenbuilder4
|===
Edit `/etc/xen/xenbuilder[1-4]` and add more memory.
==== Decrease weight
Each builder has a weight as to how much work can be given to it.
Presently the only way to alter weight is actually changing the database
on db3:
....
$ sudo su - postgres
-bash-2.05b$ psql koji
koji=# select * from host limit 1;
id | user_id | name | arches | task_load | capacity | ready | enabled
---+---------+------------------------+-----------+-----------+----------+-------+---------
6 | 130 | ppc3.fedora.redhat.com | ppc ppc64 | 1.5 | 4 | t | t
(1 row)
koji=# update host set capacity=2 where name='ppc3.fedora.redhat.com';
....
Simply update capacity to a lower number.
=== Disk Space Issues
The builders use a lot of temporary storage. Failed builds also get left
on the builders; most should get cleaned up, but plague does not. The
easiest thing to do is remove some older cache dirs.
Step one is to turn off both koji and plague:
....
/etc/init.d/plague-builder stop
/etc/init.d/kojid stop
....
Next check to see what file system is full:
....
df -h
....
[IMPORTANT]
.Important
====
If any one of the following directories is full, send an outage
notification as outlined in: [62]Infrastructure/OutageTemplate to the
fedora-infrastructure-list and fedora-devel-list, then contact Mike
McGrath
* /mnt/koji
* /mnt/ntap-fedora1/scratch
* /pub/epel
* /pub/fedora
====
Typically just / will be full. The next thing to do is determine if
we have any extremely large builds left on the builder. Typical
locations include /var/lib/mock and /mnt/build (/mnt/build actually is
on the local filesystem):
....
du -sh /var/lib/mock/* /mnt/build/*
....
`/var/lib/mock/dist-f8-build-10443-1503`::
classic koji build
`/var/lib/mock/fedora-6-ppc-core-57cd31505683ef1afa533197e91608c5a2c52864`::
classic plague build
If nothing jumps out immediately, just start deleting files older than
one week. Once enough space has been freed start koji and plague back
up:
....
/etc/init.d/plague-builder start
/etc/init.d/kojid start
....
=== Unmounting
[WARNING]
.Warning
====
Should there be mention of being sure filesystems in chroots are
unmounted before you delete the chroots?
Res ipsa loquitur.
====
= Koschei SOP
Koschei is a continuous integration system for RPM packages. Koschei
runs package scratch builds after dependency change or after time elapse
and reports package buildability status to interested parties.
Production instance: https://apps.fedoraproject.org/koschei

Staging instance: https://apps.stg.fedoraproject.org/koschei
== Contact Information
Owner::
mizdebsk, msimacek
Contact::
#fedora-admin
Location::
Fedora Cloud
Purpose::
continuous integration system
== Deployment
Koschei deployment is managed by two Ansible playbooks:
....
sudo rbac-playbook groups/koschei-backend.yml
sudo rbac-playbook groups/koschei-web.yml
....
== Description
Koschei is deployed on two separate machines - `koschei-backend` and
`koschei-web`
Frontend (`koschei-web`) is a Flask WSGi application running with httpd.
It displays information to users and allows editing package groups and
changing priorities.
Backend (`koschei-backend`) consists of multiple services:
* `koschei-watcher` - listens to fedmsg events for complete builds and
changes build states in the database
* `koschei-repo-resolver` - resolves package dependencies in given repo
using hawkey and compares them with previous iteration to get a
dependency diff. It resolves all packages in the newest repo available
in Koji. The output is a base for scheduling new builds
* `koschei-build-resolver` - resolves complete builds in the repo in
which they were done in Koji. Produces the dependency differences
visible in the frontend
* `koschei-scheduler` - schedules new builds based on multiple criteria:
** dependency priority - dependency changes since last build valued by
their distance in the dependency graph
** manual and static priorities - set manually in the frontend. Manual
priority is reset after each build, static priority persists
** time priority - time elapsed since the last build
* `koschei-polling` - polls the same types of events as koschei-watcher
without reliance on fedmsg. Additionally takes care of package list
synchronization and other regularly executed tasks
== Configuration
Koschei configuration is in `/etc/koschei/config-backend.cfg` and
`/etc/koschei/config-frontend.cfg`, and is merged with the default
configuration in `/usr/share/koschei/config.cfg` (the ones in `/etc`
overrides the defaults in `/usr`). Note the merge is recursive. The
configuration contains all configurable items for all Koschei services
and the frontend. The alterations to configuration that aren't temporary
should be done through ansible playbook. Configuration changes have no
effect on already running services -- they need to be restarted, which
happens automatically when using the playbook.
== Disk usage
Koschei doesn't keep on disk anything that couldn't be recreated easily
- all important data is stored in PostgreSQL database, configuration is
managed by Ansible, code installed by RPM and so on.
To speed up operation and reduce load on external servers, Koschei
caches some data obtained from services it integrates with. Most
notably, YUM repositories downloaded from Koji are kept in
`/var/cache/koschei/repodata`. Each repository takes about 100 MB of
disk space. Maximal number of repositories kept at time is controlled by
`cache_l2_capacity` parameter in `config-backend.cfg`
(`config-backend.cfg.j2` in Ansible). If repodata cache starts to
consume too much disk space, that value can be decreased - after
restart, `koschei-*-resolver` will remove least recently used cache
entries to respect configured cache capacity.
== Database
Koschei needs to connect to a PostgreSQL database, other database
systems are not supported. Database connection is specified in the
configuration under the `database_config` key that can contain the
following keys: `username, password, host, port, database`.
After an update of koschei, the database needs to be migrated to the new
schema. This happens automatically when using the upgrade playbook.
Alternatively, it can be executed manually using:
....
koschei-admin alembic upgrade head
....
The backend services need to be stopped during the migration.
== Managing koschei services
Koschei services are systemd units managed through `systemctl`. They can
be started and stopped independently in any order. The frontend is run
using httpd.
== Suspending koschei operation
For stopping builds from being scheduled, stopping the
`koschei-scheduler` service is enough. For planned Koji outages, it's
recommended to stop `koschei-scheduler`. It is not necessary, as koschei
can recover from Koji errors and network errors automatically, but when
Koji builders are stopped, it may cause unexpected build failures that
would be reported to users. Other services can be left running as they
automatically restart themselves on Koji and network errors.
== Limiting Koji usage
Koschei is by default limited to 30 concurrently running builds. This
limit can be changed in the configuration under `koji_config.max_builds`
key. There's also Koji load monitoring, which prevents builds from being
scheduled when Koji load is higher than a certain threshold. That should
prevent scheduling builds during mass rebuilds, so it's not necessary to
stop scheduling during those.
== Fedmsg notifications
Koschei optionally supports sending fedmsg notifications for package
state changes. The fedmsg dispatch can be turned on and off in the
configuration (key `fedmsg-publisher.enabled`). Koschei doesn't supply
configuration for fedmsg, it lets the library to load it's own (in
`/etc/fedmsg.d/`).
== Setting admin announcement
Koschei can display announcement in web UI. This is mostly useful to
inform users about outages or other problems.
To set announcement, run as koschei user:
....
koschei-admin set-notice "Koschei operation is currently suspended due to scheduled Koji outage"
....
or:
....
koschei-admin set-notice "Sumbitting scratch builds by Koschei is currently disabled due to Fedora 23 mass rebuild"
....
To clear announcement, run as koschei user:
....
koschei-admin clear-notice
....
== Adding package groups
Packages can be added to one or more groups.
To add new group named "mynewgroup", run as koschei user:
....
koschei-admin add-group mynewgroup
....
To add new group named "mynewgroup" and populate it with some packages,
run as koschei user:
....
koschei-admin add-group mynewgroup pkg1 pkg2 pkg3
....
== Set package static priority
Some packages are more or less important and can have higher or lower
priority. Any user can change manual priority, which is reset after
package is rebuilt. Admins can additionally set static priority, which
is not affected by package rebuilds.
To set static priority of package "foo" to value "100", run as koschei
user:
....
koschei-admin --collection f27 set-priority --static foo 100
....
== Branching a new Fedora release
After branching occurs and Koji build targets have been created, Koschei
should be updated to reflect the new state. There is a special admin
command for this purpose, which takes care of copying the configuration
and also last builds from the history.
To branch the collection from Fedora 27 to Fedora 28, use the following:
....
koschei-admin branch-collection f27 f28 -d 'Fedora 27' -t f28 --bugzilla-version 27
....
Then you can optionally verify that the collection configuration is
correct by visiting https://apps.fedoraproject.org/koschei/collections
and examining the configuration of the newly branched collection.
= Layered Image Build System
The
https://docs.pagure.org/releng/layered_image_build_service.html[Fedora
Layered Image Build System], often referred to as
https://github.com/projectatomic/osbs-client[OSBS] (OpenShift Build
Service) as that is the upstream project that this is based on, is used
to build Layered Container Images in the Fedora Infrastructure via Koji.
== Contents
[arabic]
. Contact Information
. Overview
. Setup
. Outage
== Contact Information
Owner::
Clement Verna (cverna)
Contact::
#fedora-admin, #fedora-releng, #fedora-noc, sysadmin-main,
sysadmin-releng
Location::
osbs-control01, osbs-master01, osbs-node01, osbs-node02
registry.fedoraproject.org, candidate-registry.fedoraproject.org
+
osbs-control01.stg, osbs-master01.stg, osbs-node01.stg,
osbs-node02.stg registry.stg.fedoraproject.org,
candidate-registry.stg.fedoraproject.org
+
x86_64 koji buildvms
Purpose::
Layered Container Image Builds
== Overview
The build system is set up such that Fedora Layered Image maintainers
will submit a build to Koji via the `fedpkg container-build` command from
a `container` namespace within
https://src.fedoraproject.org/projects/container/*[DistGit]. This will
trigger the build to be scheduled in
https://www.openshift.org/[OpenShift] via
https://github.com/projectatomic/osbs-client[osbs-client] tooling, this
will create a custom
https://docs.openshift.org/latest/dev_guide/builds.html[OpenShift Build]
which will use the pre-made buildroot container image that we have
created. The https://github.com/projectatomic/atomic-reactor[Atomic
Reactor] (`atomic-reactor`) utility will run within the buildroot and
prep the build container where the actual build action will execute; it
will also handle uploading the
https://fedoraproject.org/wiki/Koji/ContentGenerators[Content Generator]
metadata back to https://fedoraproject.org/wiki/Koji[Koji] and upload
the built image to the candidate docker registry. This will run on a
host with iptables rules restricting access to the docker bridge, this
is how we will further limit the access of the buildroot to the outside
world verifying that all sources of information come from Fedora.
Completed layered image builds are hosted in a candidate docker registry
which is then used to pull the image and perform tests.
== Setup
The Layered Image Build System setup is currently as follows (more
detailed view available in the
https://docs.pagure.org/releng/layered_image_build_service.html[RelEng
Architecture Document]):
....
=== Layered Image Build System Overview ===
+--------------+ +-----------+
| | | |
| koji hub +----+ | batcave |
| | | | |
+--------------+ | +----+------+
| |
V |
+----------------+ V
| | +----------------+
| koji builder | | +-----------+
| | | osbs-control01 +--------+ |
+-+--------------+ | +-----+ | |
| +----------------+ | | |
| | | |
| | | |
| | | |
V | | |
+----------------+ | | |
| | | | |
| osbs-master01 +------------------------------+ [ansible]
| +-------+ | | | |
+----------------+ | | | | |
^ | | | | |
| | | | | |
| V V | | |
| +-----------------+ +----------------+ | | |
| | | | | | | |
| | osbs-node01 | | osbs-node02 | | | |
| | | | | | | |
| +-----------------+ +----------------+ | | |
| ^ ^ | | |
| | | | | |
| | +-----------+ | |
| | | |
| +------------------------------------------+ |
| |
+-------------------------------------------------------------+
....
=== Deployment
From batcave you can run the following
....
$ sudo rbac-playbook groups/osbs/deploy-cluster.yml
....
This is going to deploy the OpenShift cluster used by OSBS. Currently
the playbook deploys 2 clusters (x86_64 and aarch64). Ansible tags can
be used to deploy only one of these if needed, for example
`osbs-x86-deploy-openshift`.
If the openshift-ansible playbook fails it can be easier to run it
directly from osbs-control01 and use the verbose mode.
....
$ ssh osbs-control01.iad2.fedoraproject.org
$ sudo -i
# cd /root/openshift-ansible
# ansible-playbook -i cluster-inventory playbooks/prerequisites.yml
# ansible-playbook -i cluster-inventory playbooks/deploy_cluster.yml
....
Once these playbooks have been successful, you can configure OSBS on the
cluster. For that, use the following playbook:
....
$ sudo rbac-playbook groups/osbs/configure-osbs.yml
....
When this is done we need to get the new koji service token and update
its value in the private repository
....
$ ssh osbs-master01.iad2.fedoraproject.org
$ sudo -i
# oc -n osbs-fedora sa get-token koji
dsjflksfkgjgkjfdl ....
....
The token needs to be saved in the private ansible repo in
`files/osbs/production/x86-64-osbs-koji`. Once this is done
you can run the builder playbook to update that token.
....
$ sudo rbac-playbook groups/buildvm.yml -t osbs
....
=== Operation
Koji Hub will schedule the containerBuild on a koji builder via the
koji-containerbuild-hub plugin, the builder will then submit the build
in OpenShift via the koji-containerbuild-builder plugin which uses the
osbs-client python API that wraps the OpenShift API along with a custom
OpenShift Build JSON payload.
The Build is then scheduled in OpenShift and its logs are captured by
the koji plugins. Inside the buildroot, atomic-reactor will upload the
built container image as well as provide the metadata to koji's content
generator.
== Outage
If Koji is down, then builds can't be scheduled, but repairing Koji is
outside the scope of this document.
Builds will also fail if either the candidate-registry.fedoraproject.org
or registry.fedoraproject.org Container Registries are unavailable, but
repairing those is also outside the scope of this document.
=== OSBS Failures
OpenShift Build System itself can have various types of failures that
are known about and the recovery procedures are listed below.
==== Ran out of disk space
Docker uses a lot of disk space, and while the osbs-nodes have been
allotted what is considered to be ample disk space for builds (since they
are automatically cleaned up periodically) it is possible this will run
out.
To resolve this, run the following commands:
....
# These command will clean up old/dead docker containers from old OpenShift
# Pods
$ for i in $(sudo docker ps -a | awk '/Exited/ { print $1 }'); do sudo docker rm $i; done
$ for i in $(sudo docker images -q -f 'dangling=true'); do sudo docker rmi $i; done
# This command should only be run on osbs-master01 (it won't work on the
# nodes)
#
# This command will clean up old builds and related artifacts in OpenShift
# that are older than 30 days (We can get more aggressive about this if
# necessary, the main reason these still exist is in the event we need to
# debug something. All build info we care about is stored in Koji.)
$ oadm prune builds --orphans --keep-younger-than=720h0m0s --confirm
....
==== A node is broken, how to remove it from the cluster?
If a node is having an issue, the following command will effectively
remove it from the cluster temporarily.
In this example, we are removing osbs-node01
....
$ oadm manage-node osbs-node01.phx2.fedoraproject.org --schedulable=false
....
==== Container Builds are unable to access resources on the network
Sometimes the Container Builds will fail and the logs will show that the
buildroot is unable to access networked resources (docker registry, dnf
repos, etc).
This is because of a bug in OpenShift v1.3.1 (current upstream release
at the time of this writing) where an OpenVSwitch flow is left behind
when a Pod is destroyed instead of the flow being deleted along with the
Pod.
Method to confirm the issue is unfortunately multi-step since it's not a
cluster-wide issue but isolated to the node experiencing the problem.
First in the koji createContainer task there is a log file called
openshift-incremental.log and in there you will find a key:value in some
JSON output similar to the following:
....
'openshift_build_selflink': u'/oapi/v1/namespaces/default/builds/cockpit-f24-6'
....
The last field of the value, in this example `cockpit-f24-6` is the
OpenShift build identifier. We need to ssh into `osbs-master01` and get
information about which node that ran on.
....
# On osbs-master01
# Note: the output won't be pretty, but it gives you the info you need
$ sudo oc get build cockpit-f24-6 -o yaml | grep osbs-node
....
Once you know what machine you need, ssh into it and run the following:
....
$ sudo docker run --rm -ti buildroot /bin/bash
# now attempt to run a curl command
$ curl https://google.com
# This should get refused, but if this node is experiencing the networking
# issue then this command will hang and eventually time out
....
How to fix:
Reboot the affected node that's experiencing the issue, when the node
comes back up OpenShift will rebuild the flow tables on OpenVSwitch and
things will be back to normal.
....
systemctl reboot
....
= librariesio2fedmsg SOP
librariesio2fedmsg is a small service that converts Server-Sent Events
from https://libraries.io/[libraries.io] to fedmsgs.
librariesio2fedmsg is an instance of
https://github.com/fedora-infra/sse2fedmsg[sse2fedmsg] using the
http://firehose.libraries.io/events[libraries.io firehose] running on
https://os.fedoraproject.org/[OpenShift] and publishes its fedmsgs
through the busgateway01.phx2.fedoraproject.org relay using the
`org.fedoraproject.prod.sse2fedmsg.librariesio` topic.
== Updating
sse2fedmsg is installed directly from its git repository, so once a new
release is tagged in sse2fedmsg, just update the tag in the git URL
provided to pip in the
https://infrastructure.fedoraproject.org/infra/ansible/roles/openshift-apps/librariesio2fedmsg/files/[build
config].
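The relevant line in the build config is a pip-style git URL, so bumping the
tag looks roughly like this (tag value is hypothetical):
....
git+https://github.com/fedora-infra/sse2fedmsg.git@1.2.1#egg=sse2fedmsg
....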
== Deploying
Run the playbook to apply the new OpenShift configuration:
....
$ sudo rbac-playbook openshift-apps/librariesio2fedmsg.yml
....
= Link tracking
Using link tracking is an easy way for us to find out how people are
getting to our download page. People might click over to our download
page from any of a number of areas, and knowing the relative usage of
those links can help us understand what materials we're producing are
more effective than others.
== Adding links
Each link should be constructed by adding ? to the URL, followed by a
short code that includes:
* an indicator for the link source (such as the wiki release notes)
* an indicator for the Fedora release in specific (such as F15 for the
final, or F15a for the Alpha test release)
So a link to get.fp.o from the one-page release notes would become
http://get.fedoraproject.org/?opF15.
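Following the same pattern, a few more hypothetical examples built from the
codes listed below:
....
http://get.fedoraproject.org/?fpF15      (front page)
http://get.fedoraproject.org/?wkrnF15a   (wiki test-phase release notes, F15 Alpha)
http://get.fedoraproject.org/?stF15      (status update or blog post)
....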
== FAQ
I want to copy a link to my status update for social networking, or my
blog.::
If you're posting a status update to identi.ca, for example, use the
link tracking code for status updates. Don't copy a link straight from
an announcement that already includes the announcement's link tracking
code. You can copy the link itself, but remember to change the portion
after the ? to the st code for status updates and blogs, followed by
the Fedora release version (such as F16a, F16b, or F16), like this:
+
....
http://fedoraproject.org/get-prerelease?stF16a
....
I want to point people to the announcement from my blog. Should I use
the announcement link tracking code?::
The URL itself is the announcement URL. Add the link tracking code for
blogs, which starts with ?st and ends with the Fedora release version,
like this:
+
....
http://fedoraproject.org/wiki/F16_release_announcement?stF16a
....
== The codes
[NOTE]
.Note
====
Additions to this table are welcome.
====
[cols=",",options="header",]
|===
|Link source |Code
|Email announcements |an
|Wiki announcements |wkan
|Front page |fp
|Front page of wiki |wkfp
|The press release Red Hat makes |rhpr
|http://redhat.com/fedora |rhf
|Test phase release notes on the wiki |wkrn
|Official release notes |rn
|Official installation guide |ig
|One-page release notes |op
|Status links (blogs, social media) |st
|===

View file

@ -0,0 +1,148 @@
= Loopabull
https://github.com/maxamillion/loopabull[Loopabull] is an event-driven
https://www.ansible.com/[Ansible]-based automation engine. This is used
for various tasks, originally slated for
https://pagure.io/releng-automation[Release Engineering Automation].
== Contents
[arabic]
. Contact Information
. Overview
. Setup
. Outage
== Contact Information
Owner::
Adam Miller (maxamillion), Pierre-Yves Chibon (pingou)
Contact::
#fedora-admin, #fedora-releng, #fedora-noc, sysadmin-main,
sysadmin-releng
Location::
loopabull01.phx2.fedoraproject.org
loopabull01.stg.phx2.fedoraproject.org
Purpose::
Event Driven Automation of tasks within the Fedora Infrastructure and
Fedora Release Engineering
== Overview
The https://github.com/maxamillion/loopabull[loopabull] system is set up
so that when an event takes place within the infrastructure and a
http://www.fedmsg.com/en/latest/[fedmsg] is sent, loopabull consumes
that message, triggers an https://www.ansible.com/[Ansible]
http://docs.ansible.com/ansible/playbooks.html[playbook] that shares a
name with the fedmsg topic, and provides the payload of the fedmsg to the
playbook as
https://github.com/ansible/ansible/blob/devel/docs/man/man1/ansible-playbook.1.asciidoc.in[extra
variables].
== Setup
The setup is relatively simple; the Overview above describes it, and a
more detailed version can be found in the [.title-ref]#releng docs#.
....
+-----------------+ +-------------------------------+
| | | |
| fedmsg +------------>| Looper |
| | | (fedmsg handler plugin) |
| | | |
+-----------------+ +-------------------------------+
|
|
+-------------------+ |
| | |
| | |
| Loopabull +<-------------+
| (Event Loop) |
| |
+---------+---------+
|
|
|
|
V
+----------+-----------+
| |
| ansible-playbook |
| |
+----------------------+
....
=== Deployment
Loopabull is deployed on two hosts, one for the production instance:
`loopabull01.prod.phx2.fedoraproject.org` and one for the staging
instance: `loopabull01.stg.phx2.fedoraproject.org`.
Each host is running loopabull with 5 workers reacting to fedmsg
notifications.
== Expanding loopabull
The documentation to expand loopabull's usage is documented at:
https://pagure.io/Fedora-Infra/loopabull-tasks
== Outage
In the event that loopabull isn't responding or isn't running playbooks
as it should be, the following scenarios should be approached.
=== What is going on?
There are a few commands that may help figure out what is going on:
* Check the status of the different services:
....
systemctl | grep loopabull
....
* Follow the logs of the different services:
....
journalctl -lfu loopabull -u loopabull@1 -u loopabull@2 -u loopabull@3 \
-u loopabull@4 -u loopabull@5
....
If a playbook returns a non-zero error code, the worker running it will
be stopped. If that happens, you may want to carefully review the logs
to assess what led to this situation so it can be prevented in the
future.
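If only a single worker stopped (say worker 3), restarting just that
templated unit is usually enough; a sketch, adjust the instance number
as needed:
....
# restart only the stopped worker instance
systemctl restart loopabull@3
....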
* Monitoring the queue size
The loopabull service listens to the fedmsg bus and puts the messages,
as they come in, into a rabbitmq/amqp queue for the workers to process.
If you want to see the number of messages pending to be processed by the
workers, you can check the queue size using:
....
rabbitmqctl list_queues
....
The output will be something like:
....
Listing queues ...
workers 489989
...done.
....
Where `workers` is the name of the queue used by loopabull and `489989`
is the number of messages in that queue (yes, that day we were recovering
from a several-day-long outage).
=== Network Interruption
Sometimes if the network is interrupted, the loopabull service will hang
because the fedmsg listener will hold a dead socket open. The service
and its workers simply need to be restarted at that point.
....
systemctl restart loopabull loopabull@1 loopabull@2 loopabull@3 \
loopabull@4 loopabull@5
....

View file

@ -0,0 +1,115 @@
= Mailman Infrastructure SOP
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main, sysadmin-tools, sysadmin-hosted
Location::
phx2
Servers::
mailman01, mailman02, mailman01.stg
Purpose::
Provides mailing list services.
== Description
Mailing list services for Fedora projects are located on the
mailman01.phx2.fedoraproject.org server.
== Common Tasks
=== Creating a new mailing list
* Log into mailman01
* `sudo -u mailman mailman3 create <listname>@lists.fedora(project|hosted).org --owner <username>@fedoraproject.org --notify`
+
[IMPORTANT]
.Important
====
Please make sure to add a valid description to the newly created list
(to avoid [no description available] on the listinfo index).
====
== Removing content from archives
We don't.
It's not easy to remove content from the archives and it's generally
useless as well because the archives are often mirrored by third parties
as well as being in the inboxes of all of the people on the mailing list
at that time. Here's an example message to send to someone who requests
removal of archived content:
....
Greetings,
We're sorry to say that we don't remove content from the mailing list archives.
Doing so is a non-trivial amount of work and usually doesn't achieve anything
because the content has already been disseminated to a wide audience that we do
not control. The emails have gone out to all of the subscribers of the mailing
list at that time and also (for a great many of our lists) been copied by third
parties (for instance: http://markmail.org and http://gmane.org).
Sorry we cannot help further,
Mailing lists and their owners
....
== Checking Membership
Do you need to check who owns a certain mailing list without having to
search around on the lists' front pages?
Mailman has a nice tool that will help us list members by type.
Get a full list of all the mailing lists hosted on the server:
....
sudo -u mailman mailman3 lists
....
Get the list of regular members for example@example.com:
....
sudo -u mailman mailman3 members example@example.com
....
Get the list of owners for example@example.com:
....
sudo -u mailman mailman3 members -R owner example@example.com
....
Get the list of moderators for example@example.com:
....
sudo -u mailman mailman3 members -R moderator example@example.com
....
== Troubleshooting and Resolution
=== List Administration
Specific users are marked as 'site admins' in the database.
Please file an issue if you feel you need to have this access.
=== Restart Procedure
If the server needs to be restarted, mailman should come back on its
own. Otherwise, each service on it can be restarted:
....
sudo service mailman3 restart
sudo service postfix restart
....
== How to delete a mailing list
Delete a list, but keep the archives:
....
sudo -u mailman mailman3 remove <listname>
....

View file

@ -0,0 +1,53 @@
= SSL Certificate Creation SOP
Every now and then you will need to create an SSL certificate for a
Fedora Service.
== Creating a CSR for a new server.
Know your hostname, e.g. `lists.fedoraproject.org`:
....
export ssl_name=<fqdn of host>
....
Create the key and CSR. 8192-bit keys do not work with various boxes, so
we currently use 4096:
....
openssl genrsa -out ${ssl_name}.pem 4096
openssl req -new -key ${ssl_name}.pem -out ${ssl_name}.csr
Country Name (2 letter code) [XX]:US
State or Province Name (full name) []:NM
Locality Name (eg, city) [Default City]:Raleigh
Organization Name (eg, company) [Default Company Ltd]:Red Hat
Organizational Unit Name (eg, section) []:Fedora Project
Common Name (eg, your name or your server's hostname)
[]:lists.fedorahosted.org
Email Address []:admin@fedoraproject.org
Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:
....
Send the CSR to the signing authority and wait for a cert. Place all
three files into the private directory so that you can make certs in the future.
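If you want to sanity-check the CSR before sending it off, plain openssl
can print its contents (nothing Fedora-specific here):
....
openssl req -in ${ssl_name}.csr -noout -text
....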
== Creating a temporary self-signed certificate.
Repeat the steps above but add in the following:
....
openssl x509 -req -days 30 -in ${ssl_name}.csr -signkey ${ssl_name}.pem -out ${ssl_name}.cert
Signature ok
subject=/C=US/ST=NM/L=Raleigh/O=Red Hat/OU=Fedora
Project/CN=lists.fedorahosted.org/emailAddress=admin@fedoraproject.org
Getting Private key
....
We only want a self-signed certificate to be good for a short time, so
30 days sounds good.

View file

@ -0,0 +1,418 @@
= Mass Upgrade Infrastructure SOP
Every once in a while, we need to apply mass upgrades to our servers for
various security and other upgrades.
== Contents
[arabic]
. Contact Information
. Preparation
. Staging
. Special Considerations
+
____
* Disable builders
* Post reboot action
* Schedule autoqa01 reboot
* Bastion01 and Bastion02 and openvpn server
* Special yum directives
____
. Update Leader
. Group A reboots
. Group B reboots
. Group C reboots
. Doing the upgrade
. Doing the reboot
. Aftermath
== Contact Information
Owner:::
Fedora Infrastructure Team
Contact:::
#fedora-admin, sysadmin-main, infrastructure@lists.fedoraproject.org,
#fedora-noc
Location:::
All over the world.
Servers:::
all
Purpose:::
Apply kernel/other upgrades to all of our servers
== Preparation
[arabic]
. Determine which host group you are going to be doing updates/reboots
on.
+
Group "A"::
servers that end users will see or note being down and anything that
depends on them.
Group "B"::
servers that contributors will see or note being down and anything
that depends on them.
Group "C"::
servers that infrastructure will notice are down, or are redundant
enough to reboot some with others taking the load.
. Appoint an 'Update Leader' for the updates.
. Follow the Outage Infrastructure SOP and send advance notification
to the appropriate lists. Try to schedule the update at a time when many
admins are around to help/watch for problems and when the impact for the
affected group is lower. If possible, do NOT do multiple groups on the
same day.
. Plan an order for rebooting the machines considering two factors:
+
____
* Location of systems on the kvm or xen hosts. [You will normally reboot
all systems on a host together]
* Impact of systems going down on other services, operations and users.
Thus, since the database and NFS servers are the backbone of many
other systems, they and the systems on the same xen boxes would be
rebooted before other boxes.
____
. To aid in organizing a mass upgrade/reboot with many people helping,
it may help to create a checklist of machines in a gobby document.
. Schedule downtime in nagios.
. Make doubly sure that the various app owners are aware of the reboots.
== Staging
____
Any updates that can be tested in staging or a pre-production
environment should be tested there first, including new kernels, updates
to core database applications/libraries, web applications, libraries,
etc.
____
== Special Considerations
While this may not be a complete list, here are some special things that
must be taken into account before rebooting certain systems:
=== Disable builders
Before the following machines are rebooted, all koji builders should be
disabled and all running jobs allowed to complete:
____
* db04
* nfs01
* kojipkgs02
____
Builders can be removed from koji, updated and re-added. Use:
....
koji disable-host NAME
and
koji enable-host NAME
....
[NOTE]
.Note
====
you must be a koji admin
====
Additionally, rel-eng and builder boxes may need a special version
of rpm. Make sure to check with rel-eng on any rpm upgrades for them.
=== Post reboot action
The following machines require post-boot actions (mostly entering
passphrases). Make sure admins that have the passphrases are on hand for
the reboot:
____
* backup-2 (LUKS passphrase on boot)
* sign-vault01 (NSS passphrase for sigul service)
* sign-bridge01 (NSS passphrase for sigul bridge service)
* serverbeach* (requires fixing firewall rules):
____
Each serverbeach host needs 3 or 4 iptables rules added anytime it's
rebooted or libvirt is upgraded:
....
iptables -I FORWARD -o virbr0 -j ACCEPT
iptables -I FORWARD -i virbr0 -j ACCEPT
iptables -t nat -I POSTROUTING -s 192.168.122.3/32 -j SNAT --to-source 66.135.62.187
....
[NOTE]
.Note
====
The source is the internal guest IP; the to-source is the external IP
that maps to that guest IP. If there are multiple guests, each one needs
the above SNAT rule inserted.
====
=== Schedule autoqa01 reboot
There is currently an autoqa01.c host on cnode01. Check with QA folks
before rebooting this guest/host.
=== Bastion01 and Bastion02 and openvpn server
We need one of the bastion machines to be up to provide openvpn for all
machines. Before rebooting bastion02, modify the
`manifests/nodes/bastion0*.phx2.fedoraproject.org.pp` files to start the
openvpn server on bastion01, wait for all clients to re-connect, reboot
bastion02, and then revert back to it as the openvpn hub.
=== Special yum directives
Sometimes we will wish to exclude or otherwise modify the yum.conf on a
machine. For this purpose, all machines have an include, making them
read
http://infrastructure.fedoraproject.org/infra/hosts/FQHN/yum.conf.include
from the infrastructure repo. If you need to make such changes, add them
to the infrastructure repo before doing updates.
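As a purely hypothetical illustration, such an include file might hold
back a package on one host during the update run:
....
# infra/hosts/<FQHN>/yum.conf.include (hypothetical content)
exclude=somepackage*
....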
== Update Leader
Each update should have a Leader appointed. This person will be in
charge of doing any read-write operations, and delegating to others to
do tasks. If you aren't specifically asked by the Leader to reboot or
change something, please don't. The Leader will assign out machine
groups to reboot, or ask specific people to look at machines that didn't
come back up from reboot or aren't working right after reboot. It's
important to avoid multiple people operating on a single machine in a
read-write manner and interfering with changes.
== Group A reboots
Group A machines are end-user-critical ones. Outages here should be
planned at least a week in advance and announced to the announce list.
List of machines currently in the A group (note: this list is going to
be automated).
These hosts are grouped based on the virt host they reside on:
* torrent02.fedoraproject.org
* ibiblio02.fedoraproject.org
* people03.fedoraproject.org
* ibiblio03.fedoraproject.org
* collab01.fedoraproject.org
* serverbeach09.fedoraproject.org
* db05.phx2.fedoraproject.org
* virthost03.phx2.fedoraproject.org
* db01.phx2.fedoraproject.org
* virthost04.phx2.fedoraproject.org
* db-fas01.phx2.fedoraproject.org
* proxy01.phx2.fedoraproject.org
* virthost05.phx2.fedoraproject.org
* ask01.phx2.fedoraproject.org
* virthost06.phx2.fedoraproject.org
These are the rest:
* bapp02.phx2.fedoraproject.org
* bastion02.phx2.fedoraproject.org
* app05.fedoraproject.org
* backup02.fedoraproject.org
* bastion01.phx2.fedoraproject.org
* fas01.phx2.fedoraproject.org
* fas02.phx2.fedoraproject.org
* log02.phx2.fedoraproject.org
* memcached03.phx2.fedoraproject.org
* noc01.phx2.fedoraproject.org
* ns02.fedoraproject.org
* ns04.phx2.fedoraproject.org
* proxy04.fedoraproject.org
* smtp-mm03.fedoraproject.org
* batcave02.phx2.fedoraproject.org
* mm3test.fedoraproject.org
* packages02.phx2.fedoraproject.org
=== Group B reboots
This Group contains machines that contributors use. Announcements of
outages here should be at least a week in advance and sent to the
devel-announce list.
These hosts are grouped based on the virt host they reside on:
* db04.phx2.fedoraproject.org
* bvirthost01.phx2.fedoraproject.org
* nfs01.phx2.fedoraproject.org
* bvirthost02.phx2.fedoraproject.org
* pkgs01.phx2.fedoraproject.org
* bvirthost03.phx2.fedoraproject.org
* kojipkgs02.phx2.fedoraproject.org
* bvirthost04.phx2.fedoraproject.org
These are the rest:
* koji04.phx2.fedoraproject.org
* releng03.phx2.fedoraproject.org
* releng04.phx2.fedoraproject.org
=== Group C reboots
Group C are machines that infrastructure uses, or can be rebooted in
such a way as to continue to provide services to others via multiple
machines. Outages here should be announced on the infrastructure list.
Group C hosts that have proxy servers on them:
* proxy02.fedoraproject.org
* ns05.fedoraproject.org
* hosted-lists01.fedoraproject.org
* internetx01.fedoraproject.org
* app01.dev.fedoraproject.org
* darkserver01.dev.fedoraproject.org
* fakefas01.fedoraproject.org
* proxy06.fedoraproject.org
* osuosl01.fedoraproject.org
* proxy07.fedoraproject.org
* bodhost01.fedoraproject.org
* proxy03.fedoraproject.org
* smtp-mm02.fedoraproject.org
* tummy01.fedoraproject.org
* app06.fedoraproject.org
* noc02.fedoraproject.org
* proxy05.fedoraproject.org
* smtp-mm01.fedoraproject.org
* telia01.fedoraproject.org
* app08.fedoraproject.org
* proxy08.fedoraproject.org
* coloamer01.fedoraproject.org
Other Group C hosts:
* ask01.stg.phx2.fedoraproject.org
* app02.stg.phx2.fedoraproject.org
* proxy01.stg.phx2.fedoraproject.org
* releng01.stg.phx2.fedoraproject.org
* value01.stg.phx2.fedoraproject.org
* virthost13.phx2.fedoraproject.org
* db-fas01.stg.phx2.fedoraproject.org
* pkgs01.stg.phx2.fedoraproject.org
* packages01.stg.phx2.fedoraproject.org
* virthost11.phx2.fedoraproject.org
* app01.stg.phx2.fedoraproject.org
* koji01.stg.phx2.fedoraproject.org
* db02.stg.phx2.fedoraproject.org
* fas01.stg.phx2.fedoraproject.org
* virthost10.phx2.fedoraproject.org
* autoqa01.qa.fedoraproject.org
* autoqa-stg01.qa.fedoraproject.org
* bastion-comm01.qa.fedoraproject.org
* batcave-comm01.qa.fedoraproject.org
* virthost-comm01.qa.fedoraproject.org
* compose-x86-01.phx2.fedoraproject.org
* compose-x86-02.phx2.fedoraproject.org
* download01.phx2.fedoraproject.org
* download02.phx2.fedoraproject.org
* download03.phx2.fedoraproject.org
* download04.phx2.fedoraproject.org
* download05.phx2.fedoraproject.org
* download-rdu01.vpn.fedoraproject.org
* download-rdu02.vpn.fedoraproject.org
* download-rdu03.vpn.fedoraproject.org
* fas03.phx2.fedoraproject.org
* secondary01.phx2.fedoraproject.org
* memcached04.phx2.fedoraproject.org
* virthost01.phx2.fedoraproject.org
* app02.phx2.fedoraproject.org
* value03.phx2.fedoraproject.org
* virthost07.phx2.fedoraproject.org
* app03.phx2.fedoraproject.org
* value04.phx2.fedoraproject.org
* ns03.phx2.fedoraproject.org
* darkserver01.phx2.fedoraproject.org
* virthost08.phx2.fedoraproject.org
* app04.phx2.fedoraproject.org
* packages02.phx2.fedoraproject.org
* virthost09.phx2.fedoraproject.org
* hosted03.fedoraproject.org
* serverbeach06.fedoraproject.org
* hosted04.fedoraproject.org
* serverbeach07.fedoraproject.org
* collab02.fedoraproject.org
* serverbeach08.fedoraproject.org
* dhcp01.phx2.fedoraproject.org
* relepel01.phx2.fedoraproject.org
* sign-bridge02.phx2.fedoraproject.org
* koji03.phx2.fedoraproject.org
* bvirthost05.phx2.fedoraproject.org
* (disable each builder in turn, update and reenable).
* ppc11.phx2.fedoraproject.org
* ppc12.phx2.fedoraproject.org
* backup03
== Doing the upgrade
If possible, system upgrades should be done in advance of the reboot
(with relevant testing of new packages on staging). To do the upgrades,
make sure that the Infrastructure RHEL repo is updated as necessary to
pull in the new packages (see the Infrastructure Yum Repo SOP).
On batcave01, as root run:
....
func-yum [--host=hostname] update
....
[NOTE]
.Note
====
`--host` can be specified multiple times and takes wildcards.
====
Ping people as necessary if you are unsure about any packages.
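For example, to update only a couple of hosts (the hostnames here are
just illustrative):
....
func-yum --host=proxy01.phx2.fedoraproject.org --host='app0*.phx2.fedoraproject.org' update
....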
Additionally, you can see which machines still need to be rebooted with:
....
sudo func-command --timeout=10 --oneline /usr/local/bin/needs-reboot.py | grep yes
....
You can also see which machines would need a reboot if updates were all
applied with:
....
sudo func-command --timeout=10 --oneline /usr/local/bin/needs-reboot.py after-updates | grep yes
....
== Doing the reboot
In the order determined above, reboots will usually be grouped by the
virtualization hosts that the servers are on. You can see the guests per
virt host on batcave01 in `/var/log/virthost-lists.out`.
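For example, to see which guests are listed for a particular virthost
(the exact file format may vary):
....
grep virthost13 /var/log/virthost-lists.out
....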
To reboot sets of boxes based on which virthost they are on, we've
written a special script that facilitates it:
....
func-vhost-reboot virthost-fqdn
....
ex:
....
sudo func-vhost-reboot virthost13.phx2.fedoraproject.org
....
== Aftermath
[arabic]
. Make sure that everything's running fine
. Reenable nagios notification as needed
. Make sure to perform any manual post-boot setup (such as entering
passphrases for encrypted volumes).
. Close outage ticket.
=== Non-virthost reboots
If you need to reboot specific hosts and make sure they recover -
consider using:
....
sudo func-host-reboot hostname hostname1 hostname2 ...
....
If you want to reboot the hosts one at a time, waiting for each to come
back before rebooting the next, pass `-o` to func-host-reboot.

View file

@ -0,0 +1,79 @@
= Master Mirror Infrastructure SOP
== Contents
[arabic]
. Contact Information
. PHX Master Mirror Setup
. RDU I2 Master Mirror Setup
. Raising Issues
== Contact Information
Owner:::
Red Hat IS
Contact:::
#fedora-admin, Red Hat ticket
Location:::
PHX
Servers:::
server[1-5].download.phx.redhat.com
Purpose:::
Provides the master mirrors for Fedora distribution
== PHX Master Mirror Setup
The master mirrors are accessible as:
....
download1.fedora.redhat.com -> CNAME to download3.fedora.redhat.com
download2.fedora.redhat.com -> currently no DNS entry
download3.fedora.redhat.com -> 209.132.176.20
download4.fedora.redhat.com -> 209.132.176.220
download5.fedora.redhat.com -> 209.132.176.221
....
from the outside. download.fedora.redhat.com is a round robin to the
above IPs.
The external IPs correspond to internal load balancer IPs that balance
between server[1-5]:
....
209.132.176.20 -> 10.9.24.20
209.132.176.220 -> 10.9.24.220
209.132.176.221 -> 10.9.24.221
....
The load balancers then balance between the below Fedora IPs on the
rsync servers:
....
10.8.24.21 (fedora1.download.phx.redhat.com) - server1.download.phx.redhat.com
10.8.24.22 (fedora2.download.phx.redhat.com) - server2.download.phx.redhat.com
10.8.24.23 (fedora3.download.phx.redhat.com) - server3.download.phx.redhat.com
10.8.24.24 (fedora4.download.phx.redhat.com) - server4.download.phx.redhat.com
10.8.24.25 (fedora5.download.phx.redhat.com) - server5.download.phx.redhat.com
....
== RDU I2 Master Mirror Setup
[NOTE]
.Note
====
This section is awaiting confirmation from RH - information here may not
be 100% accurate yet.
====
download-i2.fedora.redhat.com (rhm-i2.redhat.com) is a round robin
between:
....
204.85.14.3 - 10.11.45.3
204.85.14.5 - 10.11.45.5
....
== Raising Issues
Issues with any of this setup should be raised in a helpdesk ticket.

View file

@ -0,0 +1,217 @@
= Module Build Service Infra SOP
The MBS is a build orchestrator on top of Koji for "modules".
https://fedoraproject.org/wiki/Changes/ModuleBuildService
== Contact Information
Owner::
Release Engineering Team, Infrastructure Team
Contact::
#fedora-modularity, #fedora-admin, #fedora-releng
Persons::
jkaluza, fivaldi, breilly, mikem
Location::
Phoenix
Public addresses::
* mbs.fedoraproject.org
Servers::
* mbs-frontend0[1-2].phx2.fedoraproject.org
* mbs-backend01.phx2.fedoraproject.org
Purpose::
Build modules for Fedora.
== Description
Users submit builds to mbs.fedoraproject.org referencing their modulemd
file in dist-git. (In the future, users will not submit their own module
builds. The [.title-ref]#freshmaker# daemon (running in infrastructure)
will watch for .spec file changes and modulemd.yaml file changes -- it
will submit the relevant module builds to the MBS on behalf of users.)
The request to build a module is received by the MBS flask app running
on the mbs-frontend nodes.
Cursory validation of the submitted modulemd is performed on the
frontend: are the named packages valid? Are their branches valid? The
MBS keeps a copy of the modulemd and appends additional data describing
which branches pointed to which hashes at the time of submission.
A fedmsg from the frontend triggers the backend to start building the
module. First, tags and build/srpm-build groups are created. Then, a
module-build-macros package is synthesized and submitted as an srpm
build. When it is complete and available in the buildroot, the rest of
the rpm builds are submitted.
These are grouped and limited in two ways:
* First, there is a global NUM_CONCURRENT_BUILDS config option that
controls how many koji builds the MBS is allowed to have open at any
time. It serves as a throttle.
* Second, a given module may specify that its components should have a
certain "build order". If there are 50 components, it may say that the
first 25 of them are in one buildorder batch, and the second 25 are in
another buildorder batch. The first batch will be submitted and, when
complete, tagged back into the buildroot. Only after they are available
will the second batch of 25 begin.
When the last component is complete, the MBS backend marks the build as
"done", and then marks it again as "ready". (There is currently no
meaning to the "ready" state beyond "done". We reserved that state for
future CI interactions.)
== Observing MBS Behavior
=== The mbs-build command
The https://pagure.io/fm-orchestrator[fm-orchestrator repo] and the
[.title-ref]#module-build-service# package provide an
[.title-ref]#mbs-build# command with a few subcommands. For general
help:
....
$ mbs-build --help
....
To generate a report of all currently active module builds:
....
$ mbs-build overview
ID State Submitted Components Owner Module
---- ------- -------------------- ------------ ------- -----------------------------------
570 build 2017-06-01T17:18:11Z 35/134 psabata shared-userspace-f26-20170601141014
569 build 2017-06-01T14:18:04Z 14/15 mkocka mariadb-f26-20170601141728
....
To generate a report of an individual module build, given its ID:
....
$ mbs-build info 569
NVR State Koji Task
---------------------------------------------- -------- ------------------------------------------------------------
libaio-0.3.110-7.module_414736cc COMPLETE https://koji.fedoraproject.org/koji/taskinfo?taskID=19803741
BUILDING https://koji.fedoraproject.org/koji/taskinfo?taskID=19804081
libedit-3.1-17.20160618cvs.module_414736cc COMPLETE https://koji.fedoraproject.org/koji/taskinfo?taskID=19803745
compat-openssl10-1.0.2j-6.module_414736cc COMPLETE https://koji.fedoraproject.org/koji/taskinfo?taskID=19803746
policycoreutils-2.6-5.module_414736cc COMPLETE https://koji.fedoraproject.org/koji/taskinfo?taskID=19803513
selinux-policy-3.13.1-255.module_414736cc COMPLETE https://koji.fedoraproject.org/koji/taskinfo?taskID=19803748
systemtap-3.1-5.module_414736cc COMPLETE https://koji.fedoraproject.org/koji/taskinfo?taskID=19803742
libcgroup-0.41-11.module_ea91dfb0 COMPLETE https://koji.fedoraproject.org/koji/taskinfo?taskID=19685834
net-tools-2.0-0.42.20160912git.module_414736cc COMPLETE https://koji.fedoraproject.org/koji/taskinfo?taskID=19804010
time-1.7-52.module_414736cc COMPLETE https://koji.fedoraproject.org/koji/taskinfo?taskID=19803747
desktop-file-utils-0.23-3.module_ea91dfb0 COMPLETE https://koji.fedoraproject.org/koji/taskinfo?taskID=19685835
libselinux-2.6-6.module_ea91dfb0 COMPLETE https://koji.fedoraproject.org/koji/taskinfo?taskID=19685833
module-build-macros-0.1-1.module_414736cc COMPLETE https://koji.fedoraproject.org/koji/taskinfo?taskID=19803333
checkpolicy-2.6-1.module_414736cc COMPLETE https://koji.fedoraproject.org/koji/taskinfo?taskID=19803514
dbus-glib-0.108-2.module_ea91dfb0 COMPLETE https://koji.fedoraproject.org/koji/taskinfo?taskID=19685836
....
To actively watch a module build in flight, given its ID:
....
$ mbs-build watch 570
Still building:
libXrender https://koji.fedoraproject.org/koji/taskinfo?taskID=19804885
libXdamage https://koji.fedoraproject.org/koji/taskinfo?taskID=19805153
Failed:
libXxf86vm https://koji.fedoraproject.org/koji/taskinfo?taskID=19804903
Summary:
2 components in the BUILDING state
34 components in the COMPLETE state
1 components in the FAILED state
97 components in the undefined state
psabata's build #570 of shared-userspace-f26 is in the "build" state
....
=== The releng repo
There are more tools located in the [.title-ref]#scripts/mbs/# directory
of the releng repo: https://pagure.io/releng/blob/master/f/scripts/mbs
== Cancelling a module build
Users can cancel their own module builds with:
....
$ mbs-build cancel $BUILD_ID
....
MBS admins can also cancel builds of any user.
[NOTE]
.Note
====
MBS admins are defined as members of the groups listed in the
[.title-ref]#ADMIN_GROUPS# configuration options in
[.title-ref]#roles/mbs/common/templates/config.py#.
====
== Logs
The frontend logs are on mbs-frontend0[1-2] in
`/var/log/httpd/error_log`.
The backend logs are on mbs-backend01. Look in the journal for the
[.title-ref]#fedmsg-hub# service.
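For example, to follow the backend logs (plain journalctl usage):
....
sudo journalctl -u fedmsg-hub -f
....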
== Upgrading
The package in question is [.title-ref]#module-build-service#. Please
use the [.title-ref]#playbooks/manual/upgrade/mbs.yml# playbook.
== Managing Bootstrap Modules
In general, modules use other modules to define their buildroots, but
what defines the buildroot of the very first module? For this, we use
"bootstrap" modules which are manually selected. For some history on
this, see these tickets:
* https://pagure.io/releng/issue/6791
* https://pagure.io/fedora-infrastructure/issue/6097
The tag for a bootstrap module needs to be manually created and
populated by Release Engineering. Builds for that tag are curated and
selected from other Fedora tags, with care to ensure that only as many
builds are added as needed.
The existence of the tag is not enough for the bootstrap module to be
usable by MBS. MBS discovers the bootstrap module as a possible
dependency for other yet-to-be-built modules by querying PDC. During
normal operation, these entries in PDC are automatically created by
pdc-updater on pdc-backend02, but for the bootstrap tag they need to be
manually created and linked to the new bootstrap tag.
The fm-orchestrator repo has a
https://pagure.io/fm-orchestrator/blob/master/f/bootstrap[bootstrap/]
directory with tools that we used to create the first bootstrap entries.
If you need to create a new bootstrap entry or modify an existing one,
use these tools for inspiration. They are not general purpose and will
likely have to be modified to do what is needed. In particular, see
[.title-ref]#import-to-pdc.py# as an example of creating a new entry and
[.title-ref]#activate-in-pdc.py# for an example of editing an existing
entry.
To be usable, you'll need a token with rights to speak to staging/prod
PDC. See the PDC SOP for information on client configuration in
[.title-ref]#/etc/pdc.d/# and on where to find those tokens.
== Things that could go wrong
=== Overloading koji
If koji is overloaded, it should be acceptable to _stop_ the fedmsg-hub
daemon on mbs-backend01 at any time.
[NOTE]
.Note
====
As builds finish in koji, they will be _missed_ by the backend, but
when it restarts it should find them in datagrepper. If that fails as
well, the MBS backend has a poller which should start up ~5 minutes
after startup and check koji for anything it may have missed, at which
point it will resume functioning.
====
If koji continues to be overloaded after startup, try decreasing the
[.title-ref]#NUM_CONCURRENT_BUILDS# option in the config file in
[.title-ref]#roles/mbs/common/templates/#.

View file

@ -0,0 +1,71 @@
= Memcached Infrastructure SOP
Our memcached setup is currently only used for wiki sessions. With
mediawiki, sessions stored in files over NFS or in the DB are very slow.
Memcached is a non-blocking solution for our session storage.
== Contents
[arabic]
. Contact Information
. Checking Status
. Flushing Memcached
. Restarting Memcached
. Configuring Memcached
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main, sysadmin-web groups
Location::
PHX
Servers::
memcached03, memcached04
Purpose::
Provide caching for Fedora web applications.
== Checking Status
Our memcached instances are currently firewalled to only allow access
from wiki application servers. To check the status of an instance, use:
....
echo stats | nc memcached0{3,4} 11211
....
from an allowed host.
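If you only care about a few of the counters, filtering the stats output
works fine (the stat names below are standard memcached counters):
....
echo stats | nc memcached03 11211 | grep -E 'curr_connections|get_hits|get_misses'
....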
== Flushing Memcached
Sometimes incorrect content gets cached, and the cache should be flushed.
To do this, use:
....
echo flush_all | nc memcached0{3,4} 11211
....
from an allowed host.
== Restarting Memcached
Note that restarting a memcached instance will drop all sessions stored
on that instance. As mediawiki uses hashing to distribute sessions
across multiple instances, restarting one out of two instances will
result in about half of the total sessions being dropped.
To restart memcached:
....
sudo /etc/init.d/memcached restart
....
== Configuring Memcached
Memcached is currently set up as a role in the ansible git repo. The
two main tunables are MAXCONN (the maximum number of concurrent
connections) and CACHESIZE (the amount of memory to use for storage).
These variables can be set through $memcached_maxconn and
$memcached_cachesize in ansible. Additionally, other options (as
described in the memcached manpage) can be set via $memcached_options.

View file

@ -0,0 +1,85 @@
= Message Tagging Service SOP
== Contact Information
Owner::
Factory2 Team, Fedora QA Team, Infrastructure Team
Contact::
#fedora-qa, #fedora-admin
Persons::
cqi, lucarval, vmaljulin
Location::
Phoenix
Servers::
* In OpenShift.
Purpose::
Tag module build
== Description
Message Tagging Service, aka MTS, is an event-driven microservice that
tags a module build when triggered by a specific MBS event.
MTS listens on the message bus for the MBS event
`mbs.build.state.change`. Once a message is received, the module build
represented by that message is checked against a set of predefined
rules. Each rule definition has a destination tag; if a rule matches the
build, that destination tag is applied to the build. Only module builds
in the ready state are handled by MTS for now.
== Observing Behavior
Log in to `os-master01.phx2.fedoraproject.org` as `root` (or
authenticate remotely with OpenShift using
`oc login https://os.fedoraproject.org`), and run:
....
oc project mts
oc status -v
oc logs -f dc/mts
....
== Database
MTS does not use a database.
== Configuration
Please do remember to increase `MTS_CONFIG_VERSION` so that OpenShift
creates a new pod after running the playbook.
== Deployment
You can roll out configuration changes by changing the files in
`roles/openshift-apps/message-tagging-service/` and running the
`playbooks/openshift-apps/message-tagging-service.yml` playbook.
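A sketch of the usual invocation, following the same pattern as the
other openshift-apps playbooks in this guide:
....
sudo rbac-playbook openshift-apps/message-tagging-service.yml
....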
=== Stage
The MTS docker image is built automatically and pushed to upstream
quay.io. By default, the `latest` tag is applied to a fresh image. Apply
the `stg` tag to the image, then run the playbook
`playbooks/openshift-apps/message-tagging-service.yml` with the
`staging` environment.
=== Prod
If everything works well, apply the `prod` tag to the docker image in
quay.io, then run the playbook with the `prod` environment.
== Update Rules
https://pagure.io/fedora-infra/ansible/blob/main/f/roles/openshift-apps/message-tagging-service/files/mts-rules.yml[Rules
file] is managed alongside the playbook role in the same repository.
For detailed information of rules format, please refer to
https://pagure.io/modularity/blob/master/f/drafts/module-tagging-service/format.md[documentation]
under Modularity.
== Troubleshooting
In case of problems with MTS, check the logs:
....
oc logs -f dc/mts
....

View file

@ -0,0 +1,36 @@
= Mirror hiding Infrastructure SOP
At times, such as release day, there may be a conflict between Red Hat
trying to release content for RHEL, and Fedora trying to release Fedora.
One way to limit the pain to Red Hat on release day is to hide
download.fedora.redhat.com from the publiclist and mirrorlist
redirector, which will keep most people from downloading the content
from Red Hat directly.
== Contact Information
Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main, sysadmin-web group
Location::
Phoenix
Servers::
app3, app4
Purpose::
Hide Public Mirrors from the publiclist / mirrorlist redirector
== Description
To hide a public mirror, so it doesn't appear on the publiclist or the
mirrorlist, simply go into the MirrorManager administrative web user
interface, at https://admin.fedoraproject.org/mirrormanager. Fedora
sysadmins can see all Sites and Hosts. For each Site and Host, there is
a checkbox marked "private", which if set, will hide that Site (and all
its Hosts), or just that single Host, such that it won't appear on the
public lists.
To make a private-marked mirror public, simply clear the "private"
checkbox again.
This change takes effect at the top of each hour.

View file

@ -0,0 +1,20 @@
= AWS Mirrors
Fedora Infrastructure mirrors EPEL content (/pub/epel) into Amazon
Simple Storage Service (S3) in multiple regions, to make it fast for EC2
CentOS/RHEL users to get EPEL content from an effectively local mirror.
For this to work, we have private mirror entries in MirrorManager, one
for each region, which include the EC2 netblocks for that region.
Amazon updates their list of network blocks roughly monthly, as they
consume additional address space. Therefore, we need to make the
corresponding changes to MirrorManager's entries.
Amazon publishes their list of network blocks on their forum site, with
the subject "Announcement: Amazon EC2 Public IP Ranges". As of November
2014, this was https://forums.aws.amazon.com/ann.jspa?annID=1701.
As of November 19, 2014, Amazon publishes it as a JSON file we can
download: http://docs.aws.amazon.com/general/latest/gr/aws-ip-ranges.html
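For example, a quick way to pull the EC2 prefixes for one region out of
that JSON file (the region name is illustrative; the layout is the one
AWS publishes):
....
curl -s https://ip-ranges.amazonaws.com/ip-ranges.json \
  | jq -r '.prefixes[] | select(.service=="EC2" and .region=="us-east-1") | .ip_prefix'
....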

Some files were not shown because too many files have changed in this diff.