Added the infra SOPs ported to asciidoc.
This commit is contained in: parent 8a7f111a12 · commit a0301e30f1
148 changed files with 18575 additions and 17 deletions
@@ -7,11 +7,9 @@ they may be maintained by other people or team).

 Services handling identity and providing personal space to our contributors.

-FAS https://fas.fedoraproject.org[fas.fp.o]::
-The __F__edora __A__ccount __S__ystem, our directory and identity management
-tool, provides community members with a single account to login on Fedora
-services. https://admin.fedoraproject.org/accounts/user/new[Creating an
-account] is one of the first things to do if you plan to work on Fedora.
+Accounts https://accounts.fedoraproject.org/[accounts.fp.o]::
+Our directory and identity management tool provides community members with a single account to login on Fedora
+services. Registering an account there is one of the first things to do if you plan to work on Fedora.

 Fedora People https://fedorapeople.org/[fedorapeople.org]::
 Personal web space provided to community members to share files, git
@@ -1 +0,0 @@
-* xref:index.adoc[Communishift documentation]
@@ -1,10 +0,0 @@
-:experimental:
-= Communishift documentation
-
-link:https://console-openshift-console.apps.os.fedorainfracloud.org/[Communishift] is the name for the OpenShift community cluster run by the Fedora project.
-It's intended to be a place where community members can test/deploy/run things that are of benefit to the community at a lower SLE (Service Level Expectation) than services directly run and supported by infrastructure, additionally doing so in a self service manner.
-It's also an incubator for applications that may someday be more fully supported once they prove their worth.
-Finally, it's a place for Infrastructure folks to learn and test and discover OpenShift in a less constrained setting than our production clusters.
-
-This documentation focuses on implementation details of Fedora's OpenShift instance, not on OpenShift usage in general.
-These instructions are already covered by link:https://docs.openshift.com/container-platform/4.1/welcome/index.html[upstream documentation].
@@ -1 +1,144 @@
-* link:https://fedora-infra-docs.readthedocs.io/en/latest/sysadmin-guide/index.html[System Administrator Guide]
+* xref:index.adoc[Sysadmin Guide]
+** xref:2-factor.adoc[Two factor auth]
+** xref:accountdeletion.adoc[Account Deletion SOP]
+** xref:anitya.adoc[Anitya Infrastructure SOP]
+** xref:ansible.adoc[ansible - SOP in review]
+** xref:apps-fp-o.adoc[apps-fp-o - SOP in review]
+** xref:archive-old-fedora.adoc[archive-old-fedora - SOP in review]
+** xref:arm.adoc[arm - SOP in review]
+** xref:askbot.adoc[askbot - SOP in review]
+** xref:aws-access.adoc[aws-access - SOP in review]
+** xref:basset.adoc[basset - SOP in review]
+** xref:bastion-hosts-info.adoc[bastion-hosts-info - SOP in review]
+** xref:bladecenter.adoc[bladecenter - SOP in review]
+** xref:blockerbugs.adoc[blockerbugs - SOP in review]
+** xref:bodhi.adoc[bodhi - SOP in review]
+** xref:bugzilla2fedmsg.adoc[bugzilla2fedmsg - SOP in review]
+** xref:bugzilla.adoc[bugzilla - SOP in review]
+** xref:cloud.adoc[cloud - SOP in review]
+** xref:collectd.adoc[collectd - SOP in review]
+** xref:communishift.adoc[communishift - SOP in review]
+** xref:compose-tracker.adoc[compose-tracker - SOP in review]
+** xref:contenthosting.adoc[contenthosting - SOP in review]
+** xref:copr.adoc[copr - SOP in review]
+** xref:cyclades.adoc[cyclades - SOP in review]
+** xref:darkserver.adoc[darkserver - SOP in review]
+** xref:database.adoc[database - SOP in review]
+** xref:datanommer.adoc[datanommer - SOP in review]
+** xref:debuginfod.adoc[debuginfod - SOP in review]
+** xref:denyhosts.adoc[denyhosts - SOP in review]
+** xref:departing-admin.adoc[departing-admin - SOP in review]
+** xref:dns.adoc[dns - SOP in review]
+** xref:docs.fedoraproject.org.adoc[docs.fedoraproject.org - SOP in review]
+** xref:fas-notes.adoc[fas-notes - SOP in review]
+** xref:fas-openid.adoc[fas-openid - SOP in review]
+** xref:fedmsg-certs.adoc[fedmsg-certs - SOP in review]
+** xref:fedmsg-gateway.adoc[fedmsg-gateway - SOP in review]
+** xref:fedmsg-introduction.adoc[fedmsg-introduction - SOP in review]
+** xref:fedmsg-irc.adoc[fedmsg-irc - SOP in review]
+** xref:fedmsg-new-message-type.adoc[fedmsg-new-message-type - SOP in review]
+** xref:fedmsg-relay.adoc[fedmsg-relay - SOP in review]
+** xref:fedmsg-websocket.adoc[fedmsg-websocket - SOP in review]
+** xref:fedocal.adoc[fedocal - SOP in review]
+** xref:fedorapackages.adoc[fedorapackages - SOP in review]
+** xref:fedorapastebin.adoc[fedorapastebin - SOP in review]
+** xref:fedora-releases.adoc[fedora-releases - SOP in review]
+** xref:fedorawebsites.adoc[fedorawebsites - SOP in review]
+** xref:fmn.adoc[fmn - SOP in review]
+** xref:fpdc.adoc[fpdc - SOP in review]
+** xref:freemedia.adoc[freemedia - SOP in review]
+** xref:freenode-irc-channel.adoc[freenode-irc-channel - SOP in review]
+** xref:freshmaker.adoc[freshmaker - SOP in review]
+** xref:gather-easyfix.adoc[gather-easyfix - SOP in review]
+** xref:gdpr_delete.adoc[gdpr_delete - SOP in review]
+** xref:gdpr_sar.adoc[gdpr_sar - SOP in review]
+** xref:geoip-city-wsgi.adoc[geoip-city-wsgi - SOP in review]
+** xref:github2fedmsg.adoc[github2fedmsg - SOP in review]
+** xref:github.adoc[github - SOP in review]
+** xref:gitweb.adoc[gitweb - SOP in review]
+** xref:greenwave.adoc[greenwave - SOP in review]
+** xref:guestdisk.adoc[guestdisk - SOP in review]
+** xref:guestedit.adoc[guestedit - SOP in review]
+** xref:haproxy.adoc[haproxy - SOP in review]
+** xref:hosted_git_to_svn.adoc[hosted_git_to_svn - SOP in review]
+** xref:hotfix.adoc[hotfix - SOP in review]
+** xref:hotness.adoc[hotness - SOP in review]
+** xref:hubs.adoc[hubs - SOP in review]
+** xref:ibm_rsa_ii.adoc[ibm_rsa_ii - SOP in review]
+** xref:index.adoc[index - SOP in review]
+** xref:infra-git-repo.adoc[infra-git-repo - SOP in review]
+** xref:infra-hostrename.adoc[infra-hostrename - SOP in review]
+** xref:infra-raidmismatch.adoc[infra-raidmismatch - SOP in review]
+** xref:infra-repo.adoc[infra-repo - SOP in review]
+** xref:infra-retiremachine.adoc[infra-retiremachine - SOP in review]
+** xref:infra-yubikey.adoc[infra-yubikey - SOP in review]
+** xref:ipsilon.adoc[ipsilon - SOP in review]
+** xref:iscsi.adoc[iscsi - SOP in review]
+** xref:jenkins-fedmsg.adoc[jenkins-fedmsg - SOP in review]
+** xref:kerneltest-harness.adoc[kerneltest-harness - SOP in review]
+** xref:kickstarts.adoc[kickstarts - SOP in review]
+** xref:koji.adoc[koji - SOP in review]
+** xref:koji-archive.adoc[koji-archive - SOP in review]
+** xref:koji-builder-setup.adoc[koji-builder-setup - SOP in review]
+** xref:koschei.adoc[koschei - SOP in review]
+** xref:layered-image-buildsys.adoc[layered-image-buildsys - SOP in review]
+** xref:librariesio2fedmsg.adoc[librariesio2fedmsg - SOP in review]
+** xref:linktracking.adoc[linktracking - SOP in review]
+** xref:loopabull.adoc[loopabull - SOP in review]
+** xref:mailman.adoc[mailman - SOP in review]
+** xref:making-ssl-certificates.adoc[making-ssl-certificates - SOP in review]
+** xref:massupgrade.adoc[massupgrade - SOP in review]
+** xref:mastermirror.adoc[mastermirror - SOP in review]
+** xref:mbs.adoc[mbs - SOP in review]
+** xref:memcached.adoc[memcached - SOP in review]
+** xref:message-tagging-service.adoc[message-tagging-service - SOP in review]
+** xref:mirrorhiding.adoc[mirrorhiding - SOP in review]
+** xref:mirrormanager.adoc[mirrormanager - SOP in review]
+** xref:mirrormanager-S3-EC2-netblocks.adoc[mirrormanager-S3-EC2-netblocks - SOP in review]
+** xref:mote.adoc[mote - SOP in review]
+** xref:nagios.adoc[nagios - SOP in review]
+** xref:netapp.adoc[netapp - SOP in review]
+** xref:new-hosts.adoc[new-hosts - SOP in review]
+** xref:nonhumanaccounts.adoc[nonhumanaccounts - SOP in review]
+** xref:nuancier.adoc[nuancier - SOP in review]
+** xref:odcs.adoc[odcs - SOP in review]
+** xref:openqa.adoc[openqa - SOP in review]
+** xref:openshift.adoc[openshift - SOP in review]
+** xref:openvpn.adoc[openvpn - SOP in review]
+** xref:orientation.adoc[orientation - SOP in review]
+** xref:outage.adoc[outage - SOP in review]
+** xref:packagedatabase.adoc[packagedatabase - SOP in review]
+** xref:packagereview.adoc[packagereview - SOP in review]
+** xref:pagure.adoc[pagure - SOP in review]
+** xref:pdc.adoc[pdc - SOP in review]
+** xref:pesign-upgrade.adoc[pesign-upgrade - SOP in review]
+** xref:planetsubgroup.adoc[planetsubgroup - SOP in review]
+** xref:privatefedorahosted.adoc[privatefedorahosted - SOP in review]
+** xref:publictest-dev-stg-production.adoc[publictest-dev-stg-production - SOP in review]
+** xref:rabbitmq.adoc[rabbitmq - SOP in review]
+** xref:rdiff-backup.adoc[rdiff-backup - SOP in review]
+** xref:registry.adoc[registry - SOP in review]
+** xref:requestforresources.adoc[requestforresources - SOP in review]
+** xref:resultsdb.adoc[resultsdb - SOP in review]
+** xref:retrace.adoc[retrace - SOP in review]
+** xref:reviewboard.adoc[reviewboard - SOP in review]
+** xref:scmadmin.adoc[scmadmin - SOP in review]
+** xref:selinux.adoc[selinux - SOP in review]
+** xref:sigul-upgrade.adoc[sigul-upgrade - SOP in review]
+** xref:simple_koji_ci.adoc[simple_koji_ci - SOP in review]
+** xref:sshaccess.adoc[sshaccess - SOP in review]
+** xref:sshknownhosts.adoc[sshknownhosts - SOP in review]
+** xref:staging.adoc[staging - SOP in review]
+** xref:status-fedora.adoc[status-fedora - SOP in review]
+** xref:syslog.adoc[syslog - SOP in review]
+** xref:tag2distrepo.adoc[tag2distrepo - SOP in review]
+** xref:torrentrelease.adoc[torrentrelease - SOP in review]
+** xref:unbound.adoc[unbound - SOP in review]
+** xref:virt-image.adoc[virt-image - SOP in review]
+** xref:virtio.adoc[virtio - SOP in review]
+** xref:virt-notes.adoc[virt-notes - SOP in review]
+** xref:voting.adoc[voting - SOP in review]
+** xref:waiverdb.adoc[waiverdb - SOP in review]
+** xref:wcidff.adoc[wcidff - SOP in review]
+** xref:wiki.adoc[wiki - SOP in review]
+** xref:zodbot.adoc[zodbot - SOP in review]

modules/sysadmin_guide/pages/2-factor.adoc (new file, 98 lines)
@@ -0,0 +1,98 @@
= Two factor auth

Fedora Infrastructure has implemented a form of two factor auth for
people who have sudo access on Fedora machines. In the future we may
expand this to include more than sudo, but this was deemed a high-value,
low-hanging fruit.

== Using two factor

http://fedoraproject.org/wiki/Infrastructure_Two_Factor_Auth

To enroll a Yubikey, use the fedora-burn-yubikey script like normal. To
enroll using FreeOTP or Google Authenticator, go to
https://admin.fedoraproject.org/totpcgiprovision/
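
A minimal sketch of the Yubikey enrollment path, assuming a Fedora workstation (the utility ships in fedora-packager, as noted below):

....
$ sudo dnf install fedora-packager
$ fedora-burn-yubikey
....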

=== What's enough authentication?

FAS Password+FreeOTP or FAS Password+Yubikey. Note: don't actually enter
a +, simply enter your FAS password and press your Yubikey or enter your
FreeOTP code.

== Administering and troubleshooting two factor

Two factor auth is implemented by a modified copy of the
https://github.com/mricon/totp-cgi project doing the authentication and
pam_url submitting the authentication tokens.

totp-cgi runs on the fas servers (currently fas01.stg and
fas01/fas02/fas03 in production), listening on port 8443 for pam_url
requests.

FreeOTP, Google Authenticator and Yubikeys are supported as tokens to
use with your password.

=== FreeOTP, Google Authenticator

The FreeOTP application is preferred; however, Google Authenticator works as
well. (Note that Google Authenticator is not open source.)

This is handled via totpcgi. There's a command line tool to manage
users, totpprov. See 'man totpprov' for more info. Admins can use this
tool to revoke lost tokens (Google Authenticator only) with 'totpprov
delete-user username'.

To enroll using FreeOTP or Google Authenticator for production machines,
go to https://admin.fedoraproject.org/totpcgiprovision/

To enroll using FreeOTP or Google Authenticator for staging machines, go
to https://admin.stg.fedoraproject.org/totpcgiprovision/

You'll be prompted to login with your FAS username and password.

Note that staging and production differ.

=== Yubikeys

Yubikeys are enrolled and managed in FAS. Users can self-enroll using
the fedora-burn-yubikey utility included in the fedora-packager package.

=== What do I do if I lose my token?

Send an email to admin@fedoraproject.org that is encrypted/signed with
your GPG key from FAS, or that otherwise identifies that you are who you
say you are.

=== How to remove a token (so the user can re-enroll)?

First we MUST verify that the user is who they say they are, using any
of the following:

* Personal contact where the person can be verified by a member of
sysadmin-main.
* Correct answers to security questions.
* Email request to admin@fedoraproject.org that is GPG-encrypted with the
key listed for the user in FAS.

Then:

. For Google Authenticator:
+
____
.. ssh into batcave01 as root
.. ssh into os-master01.iad2.fedoraproject.org
.. $ oc project fas
.. $ oc get pods
.. $ oc rsh <pod> (pick one of the totpcgi pods from the above list)
.. $ totpprov delete-user <username>
____
. For Yubikey: login to one of the fas machines and run:
/usr/local/bin/yubikey-remove.py username

The user can then go to
https://admin.fedoraproject.org/totpcgiprovision/ and reprovision a new
device.

If the user emails admin@fedoraproject.org with the signed request, make
sure to reply-all indicating that a reset was performed. This is so
that other admins don't step in and reset it again after it's been reset
once.

modules/sysadmin_guide/pages/accountdeletion.adoc (new file, 294 lines)
@@ -0,0 +1,294 @@
= Account Deletion SOP

For the most part we do not delete accounts. In the case that a deletion
is paramount, it will need to be coordinated with the appropriate entities.

Disabling accounts is another story, but is limited to those with the
appropriate privileges. Reasons for an account to be disabled can be one
of the following:

____
* The person has placed SPAM on the wiki or other sites.
* The account has been compromised by a third party.
* The person wishes to leave the Fedora Project and wants the account
disabled.
____

== Contents

* Disabling
** Disable Accounts
** Disable Groups
* User Requested disables
* Renames
** Rename Accounts
** Rename Groups
* Deletion
** Delete Accounts
** Delete Groups

=== Disable

Disabling accounts is the easiest to accomplish, as it just blocks people
from using their account. It does not remove the account name and
associated UID, so we don't have to worry about future, unintentional
collisions.

== Disable Accounts

To begin with, accounts should not be disabled until there is a ticket
in the Infrastructure ticketing system. After that, the contents of
the ticket need to be verified (to make sure people aren't playing
pranks or someone is in a crappy mood). This needs to be logged in the
ticket (who looked, what they saw, etc). Then the account can be
disabled:

....
ssh db02
sudo -u postgres psql fas2

fas2=# begin;
fas2=# select * from people where username = 'FOOO';
....

Here you need to verify that the account looks right and that there is only
one match. If there are multiple matches you need to
contact one of the main sysadmin-db's on how to proceed:

....
fas2=# update people set status = 'admin_disabled' where username = 'FOOO';
fas2=# commit;
fas2=# \q
....

== Disable Groups

There is no explicit way to disable groups in FAS2. Instead, we close
the group to new members and optionally remove existing members
from it. This can be done from the web UI if you are an administrator of
the group or you are in the accounts group. First, go to the group info
page. Then click the (edit) link next to Group Details. Make sure that
the Invite Only box is checked. This will prevent other users from
requesting the group on their own.

If you want to remove the existing users, view the Group info, then
click on the View Member List link. Click on All under the Results
heading. Then go through and click on Remove for each member.

Doing this in the database instead can be quicker if you have a lot of
people to remove. Once again, this requires someone in sysadmin-db to do
the work:

....
ssh db02
sudo -u postgres psql fas2

fas2=# begin;
fas2=# update groups set invite_only = true where name = 'FOOO';
fas2=# commit;
fas2=# begin;
fas2=# select p.name, g.name, r.role_status from people as p, person_roles as r, groups as g
       where p.id = r.person_id and g.id = r.group_id
       and g.name = 'FOOO';
fas2=# -- Make sure that the list of users in the group looks correct
fas2=# delete from person_roles where person_roles.group_id = (select id from groups where groups.name = 'FOOO');
fas2=# -- The number of rows in both of the above should match
fas2=# commit;
fas2=# \q
....

=== User Requested Disables

According to our Privacy Policy, a user may request that their personal
information be removed from FAS when they want to disable their account. We can do
this, but need to do some extra work beyond simply setting the account
status to disabled.

== Record User's CLA information

If the user has signed the CLA/FPCA, then they may have contributed
something to Fedora that we'll need to contact them about at a later
date. For that, we need to keep at least the following information:

* Fedora username
* human name
* email address

All of this information should be on the CLA email that is sent out when
a user signs up. We need to verify with spot (Tom Callaway) that he has
that record. If not, we need to get it to him. Something like:

....
select id, username, human_name, email, telephone, facsimile, postal_address from people where username = 'USERNAME';
....

and send it to spot to keep.

== Remove the personal information

The following sequence of db commands should do it:

....
fas2=# begin;
fas2=# select * from people where username = 'USERNAME';
....

Here you need to verify that the account looks right and that there is only
one match. If there are multiple matches you need to
contact one of the main sysadmin-db's on how to proceed:

....
fas2=# update people set human_name = '', gpg_keyid = null, ssh_key = null, unverified_email = null, comments = null, postal_address = null, telephone = null, facsimile = null, affiliation = null, ircnick = null, status = 'inactive', locale = 'C', timezone = null, latitude = null, longitude = null, country_code = null, email = 'disabled1@fedoraproject.org' where username = 'USERNAME';
....

Make sure only one record was updated, and that it was the correct record:

....
fas2=# select * from people where username = 'USERNAME';
fas2=# commit;
....

[NOTE]
====
The email address is both not-null and unique in the database. Due to
this, you need to set it to a new unique string for every user who requests
deletion like this.
====

=== Renames

In general, renames do not require as much work as deletions, but they
still require coordination. This is because renames do not change the
UID/GID, but some of our applications save information based on
username/groupname rather than UID/GID.

== Rename Accounts

[WARNING]
====
Needs more eyes: this list may not be complete.
====

* Check the databases for koji, pkgdb, and bodhi for occurrences of
the old username and update them to the new username.
* Check fedorapeople.org for home directories and yum repositories under
the old username that would need to be renamed.
* Check (or ask the user to check and update) mailing list subscriptions
on fedorahosted.org and lists.fedoraproject.org under the old
username@fedoraproject.org email alias.
* Check whether the user has a username@fedoraproject.org bugzilla
account in python-fedora and update that. Also ask the user to update
that in bugzilla.
* If the user is in a sysadmin-* group, check for home directories on
bastion and other infrastructure boxes that are owned by them and need
to be renamed (you could also just tell the user to back up any files there
themselves, because they're getting a new home directory).
* grep through ansible for occurrences of the username (see the sketch
after this list).
* Check for entries in trac on fedorahosted.org for the username as an
"Assigned to" or "CC" entry.
* Add other places to check here.
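
A minimal sketch of the grep step, assuming the ansible working copy at /srv/web/infra/ansible on batcave01 (see the Ansible SOP), with a hypothetical old username:

....
$ grep -rn 'oldusername' /srv/web/infra/ansible/
....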

== Rename Groups

[WARNING]
====
Needs more eyes: this list may not be complete.
====

* grep through ansible for occurrences of the group name.
* Check for group-members,group-admins,group-sponsors@fedoraproject.org
email alias presence in any fedorahosted.org or lists.fedoraproject.org
mailing list.
* Check for entries in trac on fedorahosted.org for the group name as an
"Assigned to" or "CC" entry.
* Add other places to check here.

=== Deletion

Deletion is the toughest one to audit because it requires that we look
through our systems for the UID and GID in addition to looking
for the username and groupname. The UID and GID are used in things like
filesystem permissions, so we have to look there as well. Not catching
these places may lead to security issues should the UID/GID ever be
reused.

[NOTE]
====
Recommended to rename instead: when not strictly necessary to purge all
traces of an account, it's highly recommended to rename the user or group
to something like DELETED_oldusername instead of deleting. This avoids
the problems and additional checking that we have to do below.
====

== Delete Accounts

[WARNING]
====
Needs more eyes: this list may be incomplete. Needs more people to look
at this and find places that may need to be updated.
====

* Check everything for the #Rename Accounts case.
* Figure out what boxes a user may have had access to in the past. This
means you need to look at all the groups a user may ever have been
approved for (even if they are not approved for those groups now). For
instance, any git*, svn*, bzr*, hg* groups would have granted access to
hosted03 and hosted04. packager would have granted access to
pkgs.fedoraproject.org. Pretty much any group grants access to
fedorapeople.org.
* For those boxes, run a find over the files there to see if the UID
owns any files on the system:
+
....
# find / -uid 100068 -print
....
+
Any files owned by that UID must be reassigned to another user or
removed.

[WARNING]
====
What to do about backups? Backups pose a special problem, as they may
contain the UID that's being removed. Need to decide how to handle this.
====

* Add other places to check here.

== Delete Groups

[WARNING]
====
Needs more eyes: this list may be incomplete. Needs more people to look
at this and find places that may need to be updated.
====

* Check everything for the #Rename Groups case.
* Figure out what boxes may have had files owned by that group. This
means that you'd need to look at the users in that group, what boxes
they have shell accounts on, and then look at those boxes. Groups used
for hosted would also need to add hosted03 and hosted04 to that list, and
the box that serves the hosted mailing lists.
* For those boxes, run a find over the files there to see if the GID
owns any files on the system:
+
....
# find / -gid 100068 -print
....
+
Any files owned by that GID must be reassigned to another group or
removed.

[WARNING]
====
What to do about backups? Backups pose a special problem, as they may
contain the GID that's being removed. Need to decide how to handle this.
====

* Add other places to check here.

modules/sysadmin_guide/pages/anitya.adoc (new file, 210 lines)
@@ -0,0 +1,210 @@
= Anitya Infrastructure SOP

Anitya is used by Fedora to track upstream project releases and map
them to downstream distribution packages, including (but not limited to)
Fedora.

Anitya staging instance: https://stg.release-monitoring.org

Anitya production instance: https://release-monitoring.org

Anitya project page: https://github.com/fedora-infra/anitya

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, #fedora-apps
Persons::
zlopez
Location::
iad2.fedoraproject.org
Servers::
Production
+
* os-master01.iad2.fedoraproject.org
+
Staging
+
* os-master01.stg.iad2.fedoraproject.org
Purpose::
Map upstream releases to Fedora packages.

== Hosts

The current deployment is made up of the release-monitoring OpenShift
namespace.

=== release-monitoring

This OpenShift namespace runs the following pods:

* The apache/mod_wsgi application for release-monitoring.org
* A libraries.io SSE client
* A service checking for new releases

This OpenShift project relies on:

* A postgres db server running in OpenShift
* Lots of external third-party services. The anitya webapp can scrape
pypi, rubygems.org, sourceforge and many others on command.
* Lots of external third-party services. The check service makes all
kinds of requests out to the Internet that can fail in various ways.
* The Fedora Messaging RabbitMQ hub for publishing messages

Things that rely on this host:

* The hotness service (see the xref:hotness.adoc[hotness SOP]) is a
Fedora Messaging consumer running in Fedora Infra in OpenShift. It
listens for Anitya messages from here and performs actions on koji and
bugzilla.

== Releasing

The release process is described in the
https://anitya.readthedocs.io/en/latest/contributing.html#release-guide[Anitya
documentation].

=== Deploying

The staging deployment of Anitya is deployed in OpenShift on
os-master01.stg.iad2.fedoraproject.org.

To deploy the staging instance of Anitya, push changes to the staging
branch on https://github.com/fedora-infra/anitya[Anitya GitHub]. A GitHub
webhook will then automatically deploy the new version of Anitya on
staging.
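
A minimal sketch of that push, assuming commit access to the fedora-infra/anitya repository and that the changes to deploy are already on master:

....
$ git clone https://github.com/fedora-infra/anitya.git && cd anitya
$ git checkout staging
$ git merge master          # or cherry-pick the commits to deploy
$ git push origin staging   # the GitHub webhook kicks off the deployment
....

The production deployment described below works the same way with the production branch.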

The production deployment of Anitya is deployed in OpenShift on
os-master01.iad2.fedoraproject.org.

To deploy the production instance of Anitya, push changes to the
production branch on https://github.com/fedora-infra/anitya[Anitya
GitHub]. A GitHub webhook will then automatically deploy the new version of
Anitya on production.

==== Configuration

To deploy a new configuration, you need
https://fedora-infra-docs.readthedocs.io/en/latest/sysadmin-guide/sops/sshaccess.html[ssh
access] to batcave01.iad2.fedoraproject.org and
https://fedora-infra-docs.readthedocs.io/en/latest/sysadmin-guide/sops/ansible.html[permissions
to run the Ansible playbook].

All the following commands should be run from batcave01.

First, ensure there are no configuration changes required for the new
update. If there are, update the Ansible anitya role(s) and optionally
run the playbook:

....
$ sudo rbac-playbook openshift-apps/release-monitoring.yml
....

The configuration changes can be limited to staging only using:

....
$ sudo rbac-playbook openshift-apps/release-monitoring.yml -l staging
....

This is recommended for testing new configuration changes.

==== Upgrading

===== Staging

To deploy a new version of Anitya, push changes to the staging
branch on https://github.com/fedora-infra/anitya[Anitya GitHub]. A GitHub
webhook will then automatically deploy the new version of Anitya on
staging.

===== Production

To deploy a new version of Anitya, push changes to the production
branch on https://github.com/fedora-infra/anitya[Anitya GitHub]. A GitHub
webhook will then automatically deploy the new version of Anitya on
production.

Congratulations! The new version should now be deployed.

== Administrating release-monitoring.org

The Anitya web application offers some functionality to administer itself.

User admin status is tracked in the Anitya database. Admin users can grant
or revoke admin privileges to other users in the
https://release-monitoring.org/users[users tab].

Admin users have additional functionality available in the web interface. In
particular, admins can view flagged projects, remove projects, remove
package mappings, etc.

For more information see the
https://anitya.readthedocs.io/en/stable/admin-user-guide.html[Admin user
guide] in the Anitya documentation.

=== Flags

Anitya lets users flag projects for administrator attention. This is
accessible to administrators in the
https://release-monitoring.org/flags[flags tab].

== Monitoring

To monitor the activity of Anitya you can connect to the Fedora infra
OpenShift and look at the state of the pods.

For staging, look at the release-monitoring namespace in the
https://os.stg.fedoraproject.org/console/project/release-monitoring/overview[staging
OpenShift instance].

For production, look at the release-monitoring namespace in the
https://os.fedoraproject.org/console/project/release-monitoring/overview[production
OpenShift instance].
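
The same can be checked from the command line; a minimal sketch with the oc client, assuming you are already logged in to the cluster:

....
$ oc project release-monitoring
$ oc get pods
$ oc logs <pod-name>
....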

== Troubleshooting

This section contains various issues encountered during deployment or
configuration changes, and possible solutions.

=== Fedmsg messages aren't sent

*Issue:* Fedmsg messages aren't sent.

*Solution:* Set the USER environment variable in the pod.

*Explanation:* Fedmsg uses the USER env variable as the username inside
messages. Without USER set, it just crashes and doesn't send anything.

=== Cronjob is crashing

*Issue:* The cronjob pod is crashing on start, even after a configuration
change that should fix the behavior.

*Solution:* Restart the cronjob. This can be done by OPS.

*Explanation:* Every time the cronjob is executed after a crash, it
tries to reuse the pod with the bad configuration instead of
creating a new one with the new configuration.

=== Database migration is taking too long

*Issue:* A database migration takes a few hours to complete.

*Solution:* Stop every pod and cronjob before the migration.

*Explanation:* When creating a new index or doing some other complex
operation on the database, the migration script needs exclusive access to
the database.

=== Old version is deployed instead of the new one

*Issue:* The pod is deployed with an old version of Anitya, but it says
that it was triggered by the correct commit.

*Solution:* Set `dockerStrategy` in buildconfig.yml to noCache.

*Explanation:* OpenShift by default caches the layers of docker
containers, so if there is no change in the Dockerfile it will just use the
cached version and not run the commands again.

modules/sysadmin_guide/pages/ansible.adoc (new file, 252 lines)
@@ -0,0 +1,252 @@
= Ansible infrastructure SOP/Information

== Background

Fedora infrastructure used to use func and puppet for system change
management. We are now using ansible for all system change management and
ad-hoc tasks.

== Overview

Ansible runs from batcave01 or backup01. These hosts run an ssh-agent
that has unlocked the ansible root ssh private key. (This is unlocked
manually by a human with the passphrase on each reboot; the passphrase
itself is not stored anywhere on the machines.) Using 'sudo -i',
sysadmin-main members can use this agent to access any machine with the
ansible root ssh public key set up, either with 'ansible' for one-off
commands or 'ansible-playbook' to run playbooks.
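
A minimal sketch of such a session on batcave01, assuming membership in sysadmin-main (the host pattern and playbook are illustrative):

....
$ sudo -i
# ansible 'proxies' -m shell -a 'uptime'
# ansible-playbook /srv/web/infra/ansible/playbooks/groups/proxies.yml
....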

Playbooks are idempotent (or should be), meaning you should be able to
re-run the same playbook over and over and it should reach a state
where 0 items are changing.

Additionally (see below) there is an rbac wrapper that allows members of
some other groups to run playbooks against specific hosts.

=== GIT repositories

There are 2 git repositories associated with Ansible:

* The Fedora Infrastructure Ansible repository and replicas.
+
[CAUTION]
====
This is a public repository. Never commit private data to this repo.
====
+
image:ansible-repositories.png[image]
+
This repository exists as several copies or replicas:
** The "upstream" repository on Pagure.
+
https://pagure.io/fedora-infra/ansible
+
This repository is the public-facing place where people can contribute
(e.g. pull requests) as well as the authoritative source. Members of the
`sysadmin` FAS group or the `fedora-infra` Pagure group have commit
access to this repository.
+
To contribute changes, fork the repository on Pagure and submit a Pull
Request. Someone from the aforementioned groups can then review and
merge them.
+
It is recommended that you configure git to use `pull --rebase` by
default by running `git config --bool pull.rebase true` in your ansible
clone directory. This configuration prevents unneeded merges, which can
occur if someone else pushes changes to the remote repository while you
are working on your own local changes.
** Two bare mirrors on batcave01, `/srv/git/ansible.git`
and `/srv/git/mirrors/ansible.git`
+
[CAUTION]
====
These are public repositories. Never commit private data to these
repositories. Don't commit or push to these repos directly, unless
Pagure is unavailable.
====
+
The `mirror_pagure_ansible` service on batcave01 receives
bus messages about changes in the repository on Pagure, fetches these
into `/srv/git/mirrors/ansible.git` and pushes from there to
`/srv/git/ansible.git`. When this happens, various actions are triggered
via git hooks:
*** The working copy at `/srv/web/infra/ansible` is updated.
*** A mail about the changes is sent to sysadmin-members.
*** The changes are announced on the message bus, which in turn triggers
announcements on IRC.
+
You can check out the repo locally on batcave01 with:
+
....
git clone /srv/git/ansible.git
....
+
If the Ansible repository on Pagure is unavailable, members of the
sysadmin group may commit directly, provided this procedure is followed:
+
[arabic]
. The synchronization service is stopped and disabled:
+
....
sudo systemctl disable --now mirror_pagure_ansible.service
....
. Changes are applied to the repository on batcave01.
. After Pagure is available again, the changes are pushed to the
repository there.
. The synchronization service is enabled and started:
+
....
sudo systemctl enable --now mirror_pagure_ansible.service
....
** `/srv/web/infra/ansible` on batcave01, the working copy
from which playbooks are run.
+
[CAUTION]
====
This is a public repository. Never commit private data to this repo.
Don't commit or push to this repo directly, unless Pagure is
unavailable.
====
+
You can also browse it via the web interface at:
https://pagure.io/fedora-infra/ansible/
* `/srv/git/ansible-private` on batcave01.
+
[CAUTION]
====
This is a private repository for passwords and other sensitive data. It
is not available in cgit, nor should it be cloned or copied remotely.
====
+
This repository is only accessible to members of 'sysadmin-main'.

=== Cron job/scheduled runs

Using run_ansible-playbook_cron.py, which is run daily via cron, we
walk through the playbooks and run them with --check --diff
params to perform a dry-run.

This way we make sure all the playbooks are idempotent and there are no
unexpected changes on servers (or playbooks).
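
For reference, a manual equivalent of one such scheduled dry-run (the playbook path is illustrative):

....
$ ansible-playbook /srv/web/infra/ansible/playbooks/groups/proxies.yml --check --diff
....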

=== Logging

We have in place a callback plugin that stores history for any
ansible-playbook runs and then sends a report each day to
sysadmin-logs-members with any CHANGED or FAILED actions. Additionally,
there's a fedmsg plugin that reports the start and end of ansible playbook
runs to the fedmsg bus. Ansible also logs to syslog a verbose report of
when and what commands and playbooks were run.

=== Role based access control for playbooks

There's a wrapper script on batcave01 called 'rbac-playbook' that allows
non-sysadmin-main members to run specific playbooks against specific
groups of hosts. This is part of the ansible_utils package. The upstream
for ansible_utils is: https://bitbucket.org/tflink/ansible_utils

To add a new group:

[arabic]
. add the playbook name and sysadmin group to the rbac-playbook
(ansible-private repo)
. add that sysadmin group to sudoers on batcave01 (also in the
ansible-private repo)

To use the wrapper:

....
sudo rbac-playbook playbook.yml
....

== Directory setup

=== Inventory

The inventory directory tells ansible all the hosts that are managed by
it and the groups they are in. All files in this dir are concatenated
together, so you can split out groups/hosts into separate files for
readability. They are in ini file format.

Additionally, under the inventory directory are host_vars and group_vars
subdirectories. These are files named for the host or group, containing
variables to set for that host or group. You should strive to
set variables at the highest level possible; precedence is in
global, group, host order.
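
As a quick check of what a given host inherits, you can query the inventory from the working copy (the host name is illustrative, and this assumes the inventory lives in the repo's inventory/ directory):

....
$ ansible-inventory -i /srv/web/infra/ansible/inventory --host proxy01.iad2.fedoraproject.org
....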

=== Vars

This directory contains global variables as well as OS-specific
variables. Note that in order to use the OS-specific ones you must have
'gather_facts' set to 'True', or ansible will not have the facts it needs to
determine the OS.

=== Roles

Roles are a collection of tasks/files/templates that can be used on any
host or group of hosts that all share that role. In other words, roles
should be used except in cases where configuration only applies to a
single host. Roles can be reused between hosts and groups and are more
portable/flexible than tasks or specific plays.

=== Scripts

In the ansible git repo under scripts are a number of utility scripts for
sysadmins.

=== Playbooks

In the ansible git repo there's a directory for playbooks. The top level
contains utility playbooks for sysadmins. These playbooks perform
one-off functions or gather information. Under this directory are hosts
and groups playbooks. These playbooks are for specific hosts and groups
of hosts, from provisioning to fully configured. You should only use a host
playbook in cases where there will never be more than one of that thing.

=== Tasks

This directory contains one-off tasks that are used in playbooks. Some
of these should be migrated to roles (we had this setup before roles
existed in ansible). Those that are truly only used on one host/group
can stay as isolated tasks.

=== Syntax

Ansible now warns about deprecated syntax. Please fix any cases you see
related to deprecation warnings.

Templates use the jinja2 syntax.

== Libvirt virtuals

* TODO: add steps to make new libvirt virtuals in staging and production
* TODO: merge in new-hosts.txt

== Cloud Instances

* TODO: add how to make new cloud instances
* TODO: merge in from the ansible README file.

== rdiff-backups

See:
https://fedora-infra-docs.readthedocs.io/en/latest/sysadmin-guide/sops/rdiff-backup.html

== Additional Reading/Resources

Upstream docs:::
https://docs.ansible.com/
Example repo with all kinds of examples:::
* https://github.com/ansible/ansible-examples
* https://gist.github.com/marktheunissen/2979474
Jinja2 docs:::
http://jinja.pocoo.org/docs/

modules/sysadmin_guide/pages/apps-fp-o.adoc (new file, 31 lines)
@@ -0,0 +1,31 @@
= apps-fp-o SOP

Updating and maintaining the landing page at
https://apps.fedoraproject.org/

== Contact Information

Owner:::
Fedora Infrastructure Team
Contact:::
#fedora-apps, #fedora-admin
Servers:::
proxy0*
Purpose:::
Have a nice landing page for all our webapps.

== Description

We have a number of webapps, many of which our users don't know about.
This page was created so there is a central place where users can
stumble through them and learn.

The page is generated by an ansible role in ansible/roles/apps-fp-o/. It
makes use of an RPM package, the source code for which is at
https://github.com/fedora-infra/apps.fp.o

You can update the page by updating the apps.yaml file in that ansible
role.
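
A minimal sketch of that update, assuming a clone of the infra ansible repo (the exact location of apps.yaml inside the role may differ):

....
$ $EDITOR roles/apps-fp-o/files/apps.yaml
$ git commit -am "apps.fp.o: add new app" && git push
....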

When ansible is run next, the two ansible handlers should see your
changes and regenerate the static html and json data for the page.

modules/sysadmin_guide/pages/archive-old-fedora.adoc (new file, 60 lines)
@@ -0,0 +1,60 @@
= How to Archive Old Fedora Releases

The Fedora download servers contain terabytes of data, and to allow
mirrors to not have to take all of that data, infrastructure regularly
moves the data of end-of-lifed releases (from /pub/fedora/linux) to the
archives section (/pub/archive/fedora/linux).

== Steps Involved

[arabic]
. Log into batcave01.phx2.fedoraproject.org and ssh to bodhi-backend01,
then become the ftpsync user:
+
....
$ sudo -i ssh root@bodhi-backend01.iad2.fedoraproject.org
# su - ftpsync
....
. Then change into the releases directory:
+
....
$ cd /pub/fedora/linux/releases
....
. Check to see that the target directory doesn't already exist:
+
....
$ ls /pub/archive/fedora/linux/releases/
....
. If the target directory does not already exist, do a recursive link
copy of the tree you want to the target:
+
....
$ cp -lvpnr 21 /pub/archive/fedora/linux/releases/21
....
. If the target directory already exists, then we need to do a recursive
rsync to update any changes in the trees since the previous copy:
+
....
$ rsync -avAXSHP --delete ./21/ /pub/archive/fedora/linux/releases/21/
....
. We now do the updates and updates/testing in similar ways:
+
....
$ cd ../updates/
$ cp -lpnr 21 /pub/archive/fedora/linux/updates/21
$ cd testing
$ cp -lpnr 21 /pub/archive/fedora/linux/updates/testing/21
....
+
Alternatively, if this is a later refresh of an older copy:
+
....
$ cd ../updates/
$ rsync -avAXSHP 21/ /pub/archive/fedora/linux/updates/21/
$ cd testing
$ rsync -avAXSHP 21/ /pub/archive/fedora/linux/updates/testing/21/
....
. Do the same with fedora-secondary.
. Announce to the mirror list that this has been done and that in 2 weeks you
will move the old trees to archives.
. In two weeks, log into mm-backend01 and run the archive script:
+
....
$ sudo -u mirrormanager mm2_move-to-archive --originalCategory="Fedora Linux" --archiveCategory="Fedora Archive" --directoryRe='/21/Everything'
....
. If there are problems, the postgres DB may have issues, and you will need
to get a DBA to update the backend to fix items.
. Wait an hour or so, then you can remove the files from the main tree:
+
....
$ ssh bodhi-backend01
$ cd /pub/fedora/linux
$ cd releases/21
$ ls # make sure you have stuff here
$ rm -rf *
$ ln ../20/README .
$ cd ../../updates/21
$ ls # make sure you have stuff here
$ rm -rf *
$ ln ../20/README .
$ cd ../testing/21
$ ls # make sure you have stuff here
$ rm -rf *
$ ln ../20/README .
....

This should complete the archiving.

modules/sysadmin_guide/pages/arm.adoc (new file, 205 lines)
@@ -0,0 +1,205 @@
|
||||||
|
= Fedora ARM Infrastructure
|
||||||
|
|
||||||
|
== Contact Information
|
||||||
|
|
||||||
|
Owner::
|
||||||
|
Fedora Infrastructure Team
|
||||||
|
Contact::
|
||||||
|
#fedora-admin, sysadmin-main, sysadmin-releng
|
||||||
|
Location::
|
||||||
|
Phoenix
|
||||||
|
Servers::
|
||||||
|
arm01, arm02, arm03, arm04
|
||||||
|
Purpose::
|
||||||
|
Information on working with the arm SOCs
|
||||||
|
|
||||||
|
== Description
|
||||||
|
|
||||||
|
We have 4 arm chassis in phx2, each containing 24 SOCs (System On Chip).
|
||||||
|
|
||||||
|
Each chassis has 2 physical network connections going out from it. The
|
||||||
|
first one is used for the management interface on each SOC. The second
|
||||||
|
one is used for eth0 for each SOC.
|
||||||
|
|
||||||
|
Current allocations (2016-03-11):
|
||||||
|
|
||||||
|
arm01::
|
||||||
|
primary builders attached to koji.fedoraproject.org
|
||||||
|
arm02::
|
||||||
|
primary arch builders attached to koji.fedoraproject.org
|
||||||
|
arm03::
|
||||||
|
In cloud network, public qa/packager and copr instances
|
||||||
|
arm04::
|
||||||
|
primary arch builders attached to koji.fedoraproject.org
|
||||||
|
|
||||||
|
== Hardware Configuration
|
||||||
|
|
||||||
|
Each SOC has:
|
||||||
|
|
||||||
|
* eth0 and eth1 (unused) and a management interface.
|
||||||
|
* 4 cores
|
||||||
|
* 4GB ram
|
||||||
|
* a 300GB disk
|
||||||
|
|
||||||
|
SOCs are addressed by:
|
||||||
|
|
||||||
|
....
|
||||||
|
arm{chassisnumber}-builder{number}.arm.fedoraproject.org
|
||||||
|
....
|
||||||
|
|
||||||
|
Where chassisnumber is 01 to 04 and number is 00-23
|
||||||
|
|
||||||
|
== PXE installs
|
||||||
|
|
||||||
|
Kickstarts for the machines are in the kickstarts repo.
|
||||||
|
|
||||||
|
PXE config is on noc01. (or cloud-noc01.cloud.fedoraproject.org for
|
||||||
|
arm03)
|
||||||
|
|
||||||
|
The kickstart installs the latests Fedora and sets them up with a base
|
||||||
|
package set.

== IPMI tool Management

The SOCs are managed via their mgmt interfaces using a custom ipmitool
as well as a custom python script called 'cxmanage'. The ipmitool
changes have been submitted upstream and cxmanage is under review in
Fedora.

The ipmitool is currently installed on noc01 and has the ability to talk
to the SOCs on their management interface. noc01 also serves dhcp and is a
pxeboot server for the SOCs.

However, you will need to add it to your path:

....
export PATH=$PATH:/opt/calxeda/bin/
....

Some common commands:

To set the SOC to boot the next time only with pxe:

....
ipmitool -U admin -P thepassword -H arm03-builder11-mgmt.arm.fedoraproject.org chassis bootdev pxe
....

To power the SOC off:

....
ipmitool -U admin -P thepassword -H arm03-builder11-mgmt.arm.fedoraproject.org power off
....

To power the SOC on:

....
ipmitool -U admin -P thepassword -H arm03-builder11-mgmt.arm.fedoraproject.org power on
....

To get a serial-over-LAN console from the SOC:

....
ipmitool -U admin -P thepassword -H arm03-builder11-mgmt.arm.fedoraproject.org -I lanplus sol activate
....
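
To exit an active console, the stock ipmitool escape sequence (`~.`)
usually works; if a session is left hanging, it can also be torn down
remotely with the standard `sol deactivate` subcommand, for example:

....
ipmitool -U admin -P thepassword -H arm03-builder11-mgmt.arm.fedoraproject.org -I lanplus sol deactivate
....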

== DISK mapping

Each SOC has a disk. They are, however, mapped to the internal 00-23 in a
non-direct manner:

....
HDD Bay    EnergyCard    SOC (Port 1)    SOC Num
0          0             3               03
1          0             0               00
2          0             1               01
3          0             2               02
4          1             3               07
5          1             0               04
6          1             1               05
7          1             2               06
8          2             3               11
9          2             0               08
10         2             1               09
11         2             2               10
12         3             3               15
13         3             0               12
14         3             1               13
15         3             2               14
16         4             3               19
17         4             0               16
18         4             1               17
19         4             2               18
20         5             3               23
21         5             0               20
22         5             1               21
23         5             2               22
....

Looking at the system from the front, the bay numbering starts from left
to right.

== cxmanage

The cxmanage tool can be used to update firmware or gather diag info.

Until cxmanage is packaged, you can use it from a python virtualenv:

....
virtualenv --system-site-packages cxmanage
cd cxmanage
source bin/activate
pip install --extra-index-url=http://sources.calxeda.com/python/packages/ cxmanage
<use cxmanage>
deactivate
....

Some cxmanage commands:

....
cxmanage sensor arm03-builder00-mgmt.arm.fedoraproject.org
Getting sensor readings...
1 successes | 0 errors | 0 nodes left | .

MP Temp 0
arm03-builder00-mgmt.arm.fedoraproject.org: 34.00 degrees C
Minimum : 34.00 degrees C
Maximum : 34.00 degrees C
Average : 34.00 degrees C
... (and about 20 more sensors) ...
....

....
cxmanage info arm03-builder00-mgmt.arm.fedoraproject.org
Getting info...
1 successes | 0 errors | 0 nodes left | .

[ Info from arm03-builder00-mgmt.arm.fedoraproject.org ]
Hardware version   : EnergyCard X04
Firmware version   : ECX-1000-v2.1.5
ECME version       : v0.10.2
CDB version        : v0.10.2
Stage2boot version : v1.1.3
Bootlog version    : v0.10.2
A9boot version     : v2012.10.16-3-g66a3bf3
Uboot version      : v2013.01-rc1_cx_2013.01.17
Ubootenv version   : v2013.01-rc1_cx_2013.01.17
DTB version        : v3.7-4114-g34da2e2
....

Firmware update:

....
cxmanage --internal-tftp 10.5.126.41:6969 --all-nodes fwupdate package ECX-1000_update-v2.1.5.tar.gz arm03-builder00-mgmt.arm.fedoraproject.org
....

Note that this runs against the 00 management interface for the chassis
and updates all the nodes, and that we must run a tftp server on port
6969 for firewall handling.

== Links

http://sources.calxeda.com/python/packages/cxmanage/

== Contacts

help.desk@boston.co.uk is the contact to send repair requests to.

359 modules/sysadmin_guide/pages/askbot.adoc Normal file
@@ -0,0 +1,359 @@

= Ask Fedora SOP

The purpose of this SOP is to set up https://ask.fedoraproject.org, based
on Askbot, as a question and answer support forum for the Fedora
community. The production instance runs at https://ask.fedoraproject.org
and the staging instance is at http://ask.stg.fedoraproject.org/

This page describes how to set up and customize it from scratch.

== Contents

[arabic]
. Contact Information
. Creating database
. Setting up the forum
. Adding administrators
. Change settings within the forum
. Database tweaks
. Debugging

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Persons::
anyone from the sysadmin team
Sponsor::
nirik
Location::
phx2
Servers::
ask01, ask01.stg
Purpose::
To host Ask Fedora

== Creating database

We use the postgresql database backend. To add the database to a
postgresql server:

....
# psql -U postgres
postgres# create user askfedora with password 'xxx';
postgres# create database askfedora;
postgres# ALTER DATABASE askfedora owner to askfedora;
postgres# \q;
....

Now set up the db tables if this is a new install:

....
python manage.py syncdb
python manage.py migrate askbot
python manage.py migrate django_authopenid #embedded login application
....

== Setting up the forum

Askbot is packaged and available in Rawhide, Fedora 16 and EPEL 6. On a
RHEL 6 system, you need to install the EPEL 6 repo first:

....
# yum install askbot
....

The /etc/askbot/sites/ask/conf/settings.py file should look something
like:

....
DATABASE_ENGINE = 'postgresql_psycopg2'
DATABASE_NAME = 'testaskbot'
DATABASE_USER = 'askbot'
DATABASE_PASSWORD = 'xxxxx'
DATABASE_HOST = '127.0.0.1'
DATABASE_PORT = '5432'

# Outgoing mail server settings
#
DEFAULT_FROM_EMAIL = 'askfedora@fedoraproject.org'
EMAIL_SUBJECT_PREFIX = '[Askfedora]'
EMAIL_HOST='127.0.0.1'
EMAIL_PORT='25'

# This variable points to the Askbot plugin which will be used for user
# authentication. Not enabled yet because we don't need FAS auth but use
# Fedora id as a openid provider.
#
# ASKBOT_CUSTOM_AUTH_MODULE = 'authfas'
....

Now the Ask Fedora website should be accessible from the browser.

== Adding administrators

As of Askbot version 0.7.21, the first user who logs in automatically
becomes the administrator. In previous versions, you have to do the
following:

....
# cd /etc/askbot/sites/ask/conf/
# python manage.py add_admin 1
Do you really wish to make user (id=1, name=pjp) a site administrator?
yes/no: yes
....

Once a user is marked as an administrator, he or she can go into anyone's
profile, go to the "moderation" tab at the end and mark them as an
administrator or moderator, as well as block or suspend a user.

== Change settings within the forum

Data entry and display::
** Disable "Allow asking questions anonymously"
** Enable "Force lowercase the tags"
** Change "Format of tag list" to "cloud"
** Change "Minimum length of search term for Ajax search" to "3"
** Change "Number of questions to list by default" to "50"
** Change "What should "unanswered question" mean?" to "Question has no
answers"

Email and email alert settings::
** Change "Default news notification frequency" to "Instantly"

Flatpages - about, privacy policy, etc.::
Change "Text of the Q&A forum About page (html format)" to the
following:
+
....
Ask Fedora provides a community edited knowledge base and support forum
for the Fedora community. Make sure you read the FAQ and search for
existing questions before asking yours. If you want to provide feedback,
just ask a question on this site! Tag your questions "meta" to highlight your
questions to the administrators of Ask Fedora.
....

Login provider settings::
** Disable "Activate local login"

Q&A forum website parameters and urls::
** Change "Site title for the Q&A forum" to "Ask Fedora: Community
Knowledge Base and Support Forum"
** Change "Comma separated list of Q&A site keywords" to "Ask Fedora,
forum, community, support, help"
** Change "Copyright message to show in the footer" to "All content is
under Creative Commons Attribution Share Alike License. Ask Fedora is
community maintained and Red Hat or Fedora Project is not
responsible for content"
** Change "Site description for the search engines" to "Ask Fedora:
Community Knowledge Base and Support Forum"
** Change "Short name for your Q&A forum" to "Ask Fedora"
** Change "Base URL for your Q&A forum, must start with http or https"
to "http://ask.fedoraproject.org"

Sidebar widget settings - main page::
** Disable "Show avatar block in sidebar"
** Disable "Show tag selector in sidebar"

Skin and User Interface settings::
** Upload "Q&A site logo"
** Upload "Site favicon". Must be an ICO format file because that is the
only one IE supports as a favicon.
** Enable "Apply custom style sheet (CSS)"
** Upload the following custom CSS:
+
....
#ab-main-nav a {
    color: #333333;
    background-color: #d8dfeb;
    border: 1px solid #888888;
    border-bottom: none;
    padding: 0px 12px 3px 12px;
    height: 25px;
    line-height: 30px;
    margin-right: 10px;
    font-size: 18px;
    font-weight: 100;
    text-decoration: none;
    display: block;
    float: left;
}

#ab-main-nav a.on {
    height: 24px;
    line-height: 28px;
    border-bottom: 1px solid #0a57a4;
    border-right: 1px solid #0a57a4;
    border-top: 1px solid #0a57a4;
    border-left: 1px solid #0a57a4; /*background:#A31E39; */
    background: #0a57a4;
    color: #FFF;
    font-weight: 800;
    text-decoration: none
}

#ab-main-nav a.special {
    font-size: 18px;
    color: #072b61;
    font-weight: bold;
    text-decoration: none;
}

/* tabs stuff */
.tabsA { float: right; }
.tabsC { float: left; }

.tabsA a.on, .tabsC a.on, .tabsA a:hover, .tabsC a:hover {
    background: #fff;
    color: #072b61;
    border-top: 1px solid #babdb6;
    border-left: 1px solid #babdb6;
    border-right: 1px solid #888a85;
    border-bottom: 1px solid #888a85;
    height: 24px;
    line-height: 26px;
    margin-top: 3px;
}

.tabsA a.rev.on, tabsA a.rev.on:hover {
    padding: 0px 2px 0px 7px;
}

.tabsA a, .tabsC a {
    background: #f9f7eb;
    border-top: 1px solid #eeeeec;
    border-left: 1px solid #eeeeec;
    border-right: 1px solid #a9aca5;
    border-bottom: 1px solid #888a85;
    color: #888a85;
    display: block;
    float: left;
    height: 20px;
    line-height: 22px;
    margin: 5px 0 0 4px;
    padding: 0 7px;
    text-decoration: none;
}

.tabsA .label, .tabsC .label {
    float: left;
    font-weight: bold;
    color: #777;
    margin: 8px 0 0 0px;
}

.tabsB a {
    background: #eee;
    border: 1px solid #eee;
    color: #777;
    display: block;
    float: left;
    height: 22px;
    line-height: 28px;
    margin: 5px 0px 0 4px;
    padding: 0 11px 0 11px;
    text-decoration: none;
}

a {
    color: #072b61;
    text-decoration: none;
    cursor: pointer;
}

div.side-box
{
    width:200px;
    padding:10px;
    border:3px solid #CCCCCC;
    margin:0px;
    background: -moz-linear-gradient(top, #DDDDDD, #FFFFFF);
}
....

== Database tweaks

To automatically delete expired sessions, we run a trigger that makes
PostgreSQL delete them upon inserting a new one.

The code used to create this trigger was:

....
askfedora=# CREATE FUNCTION delete_old_sessions() RETURNS trigger
askfedora-# LANGUAGE plpgsql
askfedora-# AS $$
askfedora$# BEGIN
askfedora$# DELETE FROM django_session WHERE expire_date<current_timestamp;
askfedora$# RETURN NEW;
askfedora$# END
askfedora$# $$;
CREATE FUNCTION
askfedora=# CREATE TRIGGER old_sessions_gc
askfedora-# AFTER INSERT ON django_session
askfedora-# EXECUTE PROCEDURE delete_old_sessions();
....

In case this trigger causes any problems, please remove it by running:
`DROP TRIGGER old_sessions_gc ON django_session;`

To make this perform, we have a custom index that's not in upstream
askbot; please remember to add that when recreating the trigger:

....
CREATE INDEX CONCURRENTLY django_session_expire_date ON django_session (expire_date);
....

If you deleted the trigger, or reinstalled without the trigger, please make
sure to run `manage.py clean_sessions` regularly, so you don't end up
with a database that grows too large.
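
As an illustration, a cron entry along these lines (the schedule and the
cron file path are only an example) would keep the session table pruned:

....
# /etc/cron.d/askbot-clean-sessions (hypothetical file)
0 3 * * * root cd /etc/askbot/sites/ask/conf && python manage.py clean_sessions
....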

== Debugging

Set DEBUG to True in the settings.py file and restart Apache.
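
A minimal sketch of that change, assuming the settings file path used
earlier in this SOP:

....
# /etc/askbot/sites/ask/conf/settings.py
DEBUG = True
....

followed by:

....
service httpd restart
....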

== Auth issues

Users can login to ask with a variety of social media accounts. Once
they login with one they can attach other ones as well.

If a user forgets what social media account they used, you can look in
the database. Log in to the database host
(db01.phx2.fedoraproject.org) and run:

....
# sudo -u postgres psql askfedora
psql> select * from django_authopenid_userassociation
      where user_id like '%username%';
....

If they can login again with the same auth, ask them to do so. If not,
you can add the fedora account system openid auth to allow them to login
with that:

....
psql> insert into django_authopenid_userassociation (user_id,
      openid_url, provider_name) VALUES (2595,
      'http://name.id.fedoraproject.org', 'fedoraproject');
....

Use the ID from the previous query and replace name with the user's fas
name.

152 modules/sysadmin_guide/pages/aws-access.adoc Normal file
@@ -0,0 +1,152 @@

= Amazon Web Services Access

AWS includes a highly granular set of access policies, which can be
combined into roles and groups. Ipsilon is used to translate between IAM
policy groupings and groups in the Fedora Account System (FAS). Tags and
namespaces are used to keep each role's resources separate.

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Persons::
nirik, pfrields
Location::
?
Servers::
N/A
Purpose::
Provide AWS resource access to contributors via FAS group membership.

== Accessing the AWS Console

To access the AWS Console via Ipsilon authentication, use
https://id.fedoraproject.org/saml2/SSO/Redirect?SPIdentifier=urn:amazon:webservices&RelayState=https://console.aws.amazon.com[this
SAML link].

You must be in the
https://admin.fedoraproject.org/accounts/group/view/aws-iam[aws-iam FAS
group] (or another group with access) to perform this action.

=== Adding a role to AWS IAM

Sign into AWS via the URL above, and visit
https://console.aws.amazon.com/iam/home[Identity and Access Management
(IAM)] in the Security, Identity and Compliance tools.

Choose Roles to view current roles. Confirm there is not already a role
matching the one you need. If not, create a new role as follows:

[arabic]
. Select _Create role_.
. Select _SAML 2.0 federation_.
. Choose the SAML provider _id.fedoraproject.org_, which should already
be populated as a choice from previous use.
. Select the attribute _SAML:aud_. For value, enter
_https://signin.aws.amazon.com/saml_. Do not add a condition. Proceed to
the next step.
. Assign the appropriate policies from the pre-existing IAM policies.
It's unlikely you'll have to create your own, which is outside the scope
of this SOP. Then proceed to the next step.
. Set the role name and description. It is recommended you use the
_same_ role name as the FAS group for clarity. Fill in a longer
description to clarify the purpose of the role. Then choose _Create
role_.

Note or copy the Role ARN (Amazon Resource Name) for the new role.
You'll need this in the mapping below.

=== Adding a group to FAS

When finished, login to FAS and create a group to correspond to the new
role. Use the prefix _aws-_ to denote new AWS roles in FAS. This makes
them easier to locate in a search.

It may be appropriate to set group ownership for _aws-_ groups to an
Infrastructure team principal, and then add others as users or sponsors.
This is especially worth considering for groups that have modify (full)
access to an AWS resource.

=== Adding an IAM role mapping in Ipsilon

Add the new role mapping for the FAS group to the Role ARN in the ansible
git repo, under _roles/ipsilon/files/infofas.py_. Current mappings look
like this:

....
aws_groups = {
    'aws-master': 'arn:aws:iam::125523088429:role/aws-master',
    'aws-iam': 'arn:aws:iam::125523088429:role/aws-iam',
    'aws-billing': 'arn:aws:iam::125523088429:role/aws-billing',
    'aws-atomic': 'arn:aws:iam::125523088429:role/aws-atomic',
    'aws-s3-readonly': 'arn:aws:iam::125523088429:role/aws-s3-readonly'
}
....

Add your mapping to the dictionary as shown below. Start a new
build/rollout of the ipsilon project in openshift to make the changes
live.
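
As an illustration, wiring up a hypothetical new _aws-myteam_ group (both
the group name and the role name here are placeholders) would mean adding
one line to that dictionary:

....
'aws-myteam': 'arn:aws:iam::125523088429:role/aws-myteam',
....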

=== User accounts

If you only need to use the web interface to aws, a role (and associated
policy) should be all you need. However, if you need cli access, you
will need a user and a token. Users should be named the same as the role
they are associated with.
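
As a sketch of what the cli side looks like for the token holder (the
profile name and region are only examples), the issued token is stored
with the stock `aws configure` command and then used per profile:

....
aws configure --profile fedora-infra   # prompts for the access key ID and secret key
aws ec2 describe-instances --profile fedora-infra --region us-east-1
....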

=== Role and User policies

Each Role (and user, if there is a user needed for the role) should have
the same policy attached to it. Policies are named
'fedora-$rolename-$service', i.e. 'fedora-infra-ec2'. A copy of the
policies is available in the ansible repo under
files/aws/iam/policies. These are in json form.

Policies are set up such that roles/users can do most things with a
resource if it's untagged. If it's tagged, it MUST be tagged with their
group: FedoraGroup / $groupname. If it's tagged with another group name,
they cannot do anything with or to that resource (aside from seeing it
exists).

If there's a permission you need, please file a ticket and it will be
evaluated.

Users MUST keep tokens private and secure. YOU are responsible for all
use of tokens issued to you from Fedora Infrastructure. Report any
compromised or possibly public tokens as soon as you are aware.

Users MUST tag resources with their FedoraGroup tag within one day, or
the resource may be removed.

=== ec2

Users/roles with ec2 permissions should always tag their instances with
their FedoraGroup as soon as possible. Untagged resources can be
terminated at any time.
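
A minimal example of applying that tag with the standard AWS CLI (the
instance ID and group value are placeholders):

....
aws ec2 create-tags --resources i-0123456789abcdef0 --tags Key=FedoraGroup,Value=infra
....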

=== s3

Users/roles with s3 permissions will be given specific bucket(s) that
they can manage/use. Care should be taken to make sure nothing in them
is public that should not be.
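
One quick way to check (the bucket name is a placeholder) is to inspect
the bucket ACL with the standard s3api subcommand:

....
aws s3api get-bucket-acl --bucket fedora-example-bucket
....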

=== cloudfront

Please file a ticket if you need cloudfront, and infrastructure will do
any needed setup if approved.

== Regions

Users/groups are encouraged to use regions 'near' them or wherever makes
the most sense. If you are trying to create ec2 instances, you will need
infrastructure to create a vpc in the region with network, etc. File a
ticket for such requests.

== Other Notes

AWS resource access that is not read-only should be treated with care.
In some cases, Amazon or other entities may absorb AWS costs, so changes
in usage can cause issues if not controlled or monitored. If you have
doubts about access, consult the Fedora Project Leader or Fedora
Engineering Manager.

118 modules/sysadmin_guide/pages/basset.adoc Normal file
@@ -0,0 +1,118 @@

= Basset anti-spam service

Since the Fedora Project has come under targeted spam attacks, we have
decided to create a service that all our applications can hook into to
have a central repository for anti-spam procedures. Basset is this
service, and it's hosted at https://pagure.io/basset.

== Contents

[arabic]
. Contact Information
. Overview
. FAS
. Trac
. Wiki
. Setup
. Outage

== Contact Information

Owner::
Patrick Uiterwijk (puiterwijk)
Contact::
#fedora-admin, #fedora-apps, #fedora-noc, sysadmin-main
Location::
basset01
Purpose::
Centralized anti-spam

== Overview

Basset is a central anti-spam service: it receives messages from
services when certain actions happen, and will then decide to accept
or deny the request, or pass it on to an administrator.

At the moment, we have the following modules live: FAS, trac, wiki.

== FAS

This module receives notifications from FAS about new user
registrations and new users signing the FPCA. With Basset enabled, FAS
will not automatically accept a new user registration or a FPCA signing,
but instead let Basset know a user tried to perform these actions and
then depend on Basset to enact this.

In the case of registration this is done by setting the user to a
spamcheck_awaiting status. As soon as Basset has made a decision, it will
set the user to spamcheck_manual, spamcheck_denied or active. If it sets
the user to active, it will also send the welcome email to the user. If
it made a wrong decision, and the user is set as spamcheck_manual or
spamcheck_denied, a member of the accounts team can go to that user's
page and click the "Enable" button to override the decision. If this
needed to be done, please notify puiterwijk so that the rules Basset
uses can be updated.

For the case of the FPCA, FAS will request the cla_fpca group
membership, but not sponsor the user. At the moment that Basset decides
it accepts the request, it will sponsor the user into the group. If it
declined the FPCA request, it will remove the user from the group. To
override this decision, a member of the accounts group can go to FAS and
manually add the user to the cla_fpca group and sponsor them into it.

== Trac

For Trac, if a post gets denied, the content item gets deleted, the Trac
account gets blocked cross-instance and the FAS account gets blocked.

To unblock the user, log in to hosted03 and remove
/srv/web/trac/blocks/$username. For info on how to unblock the FAS user,
see the notes under FAS.

== Wiki

For Wiki, if an edit gets denied, the page gets deleted, the wiki
account gets blocked and the FAS account gets blocked.

For the wiki parts of undoing this, follow the regular mediawiki unblock
procedures using:

* https://fedoraproject.org/wiki/Special:BlockList to check if a user
is blocked or not
* https://fedoraproject.org/wiki/Special:Unblock to unblock that user

Don't forget to unblock the account in FAS as well.

== Setup

At this moment, Basset runs on a single server (basset01(.stg)), which
runs the frontend, message broker and worker all together. For all of it
to work, the following services are used (a quick status check over
these is sketched below):

* httpd (frontend)
* rabbitmq-server (broker)
* mongod (mongo database server for storage of internal info)
* basset-worker (worker)
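
A quick status check over those services (assuming standard systemd unit
names on basset01) might look like:

....
for svc in httpd rabbitmq-server mongod basset-worker; do
    systemctl is-active $svc
done
....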

== Outage

The consequences of certain services not being up result in various
conditions:

If the httpd or frontend aren't up, no new messages will come in. FAS
will set the user to spamcheck_awaiting, but not submit it to Basset.
Work is in progress on a script to submit such entries to the queue
after the Basset frontend is back. However, since this part of the code
is so small, this is not likely to be the part that's down. (You can know
that it is because the FAS logs will log an error instead of "result:
checking".)

If the worker or the mongo server are down, no messages will be
processed, but all messages queued up will be processed the moment both
of the services start again: as long as a message makes it into the
queue, it will be processed until completion.

If the worker encounters an error during processing of a message, it
will dump a traceback into the journal log file and stop processing any
messages. Resolve the condition reported in the error and restart the
basset-worker service, and all work will be continued, starting with the
message it was processing when it errored out.

This means that as long as the message is queued, the worker will pick
it up and handle it.
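
A minimal sketch of that recovery flow (assuming a standard systemd unit
for the worker, per the service list above):

....
journalctl -u basset-worker --since today   # find the traceback
# fix whatever the traceback points at, then:
systemctl restart basset-worker
....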

31 modules/sysadmin_guide/pages/bastion-hosts-info.adoc Normal file
@@ -0,0 +1,31 @@

= Fedora Bastion Hosts

== Description

There are 2 primary bastion hosts in the phx2 datacenter. One will be
active at any given time and the second will be a hot spare, ready to
take over. Switching between bastion hosts is currently a manual process
that requires changes in ansible.

There is also a bastion-comm01 bastion host for the qa.fedoraproject.org
network. This is used in cases where users only need to access resources
in that qa.fedoraproject.org network.

All of the bastion hosts have an external IP that is mapped into them.
The reverse dns for these IPs is controlled by RHIT, so any changes must
be carefully coordinated.

The active bastion host performs the following functions:

* Outgoing smtp from fedora servers. This includes email aliases,
mailing list posts, build and commit notices, etc.
* Incoming smtp from servers in phx2 or on the fedora vpn. Incoming mail
directly from the outside is NOT accepted or forwarded.
* ssh access to all phx2/vpn connected servers.
* openvpn hub. This is the hub that all vpn clients connect to and talk
to each other via. Taking down or stopping this service will cause a
major outage of services, as all proxy and app servers use the vpn to
talk to each other.

When rebuilding these machines, care must be taken to match up the dns
names externally, and to preserve the ssh host keys.

52 modules/sysadmin_guide/pages/bladecenter.adoc Normal file
@@ -0,0 +1,52 @@

= BladeCenter Access Infrastructure SOP

Many of the builders in PHX are blades in a blade center. A few other
machines are also on blades.

== Contents

[arabic]
. Contact Information
. Common Tasks
.. Logging into the web interface
.. Using the Serial Console of Blades

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main
Location::
PHX
Purpose::
Contains blades used for buildsystems, etc.

== Common Tasks

=== Logging into the web interface

The web interface to the bladecenters lets you reset power, etc. They
are bc01-mgmt and bc02-mgmt.

=== Using the Serial Console of Blades

All of the blades are set up with a serial console over lan (SOL). To
use this, ssh into the bladecenter. You can then pick your system and
bring up a console with:

....
env -T system:blade[x]
console -o
....

where x is the blade number (can be determined from the web interface,
etc).

To leave the console session, press `Esc (`

For more details on BladeCenter SOL, see
http://www-304.ibm.com/systems/support/supportsite.wss/docdisplay?brandind=5000008&lndocid=MIGR-54666

157 modules/sysadmin_guide/pages/blockerbugs.adoc Normal file
@@ -0,0 +1,157 @@

= Blockerbugs Infrastructure SOP

https://pagure.io/fedora-qa/blockerbugs[Blockerbugs] is an app developed
by Fedora QA to aid in tracking items related to release blocking and
freeze exception bugs in branched Fedora releases.

== Contents

[arabic]
. Contact Information
. File Locations
. Upgrade Process
* Upgrade Preparation (for all upgrades)
* Minor Upgrade (no db change)
* Major Upgrade (with db changes)

== Contact Information

Owner::
Fedora QA Devel
Contact::
#fedora-qa
Location::
Phoenix
Servers::
blockerbugs01.phx2, blockerbugs02.phx2, blockerbugs01.stg.phx2
Purpose::
Hosting the https://pagure.io/fedora-qa/blockerbugs[blocker bug
tracking application] for QA

== File Locations

`/etc/blockerbugs/settings.py` - configuration for the app

=== Node Roles

blockerbugs01.stg.phx2::
the staging instance; it is not load balanced
blockerbugs01.phx2::
one of the load balanced production nodes; it is responsible for
running bugzilla/bodhi/koji sync
blockerbugs02.phx2::
the other load balanced production node. It does not do any sync
operations.

== Building for Infra

=== Do not use mock

For whatever reason, the `epel7-infra` koji tag rejects SRPMs with the
`el7.centos` dist tag. Make sure that you build SRPMs with:

....
rpmbuild -bs --define='dist .el7' blockerbugs.spec
....

Also note that this expects the release tarball to be in
`~/rpmbuild/SOURCES/`.

=== Building with Koji

You'll need to ask someone who has rights to build into the `epel7-infra`
tag to make the build for you:

....
koji build epel7-infra blockerbugs-0.4.4.11-1.el7.src.rpm
....

[NOTE]
.Note
====
The fun bit of this is that `python-flask` is only available on `x86_64`
builders. If your build is routed to one of the non-x86_64 builders, it
will fail. The only solution available to us is to keep submitting the
build until it's routed to one of the x86_64 builders and doesn't fail.
====

Once the build is complete, it should be automatically tagged into
`epel7-infra-stg` (after a ~15 min delay), so that you can test it on the
blockerbugs staging instance. Once you've verified it's working well,
ask someone with infra rights to move it to the `epel7-infra` tag so that
you can update it in production.

== Upgrading

Blockerbugs is currently configured through ansible and all
configuration changes need to be done through ansible.

=== Upgrade Preparation (all upgrades)

Blockerbugs is not packaged in epel, so the new build needs to exist in
the infrastructure stg repo for deployment to stg or the infrastructure
repo for deployments to production.

See the blockerbugs documentation for instructions on building a
blockerbugs RPM.

=== Minor Upgrades (no database changes)

Run the following on *both* `blockerbugs01.phx2` and
`blockerbugs02.phx2` if updating in production.

[arabic]
. Update ansible with config changes, push changes to the ansible repo:
+
....
roles/blockerbugs/templates/blockerbugs-settings.py.j2
....
. Clear yum cache and update the blockerbugs RPM:
+
....
yum clean expire-cache && yum update blockerbugs
....
. Restart httpd to reload the application:
+
....
service httpd restart
....

=== Major Upgrades (with database changes)

Run the following on *both* `blockerbugs01.phx2` and
`blockerbugs02.phx2` if updating in production.

[arabic]
. Update ansible with config changes, push changes to the ansible repo:
+
....
roles/blockerbugs/templates/blockerbugs-settings.py.j2
....
. Stop httpd on *all* relevant instances (if load balanced):
+
....
service httpd stop
....
. Clear yum cache and update the blockerbugs RPM on all relevant
instances:
+
....
yum clean expire-cache && yum update blockerbugs
....
. Upgrade the database schema:
+
....
blockerbugs upgrade_db
....
. Check the upgrade by running a manual sync to make sure that nothing
unexpected went wrong:
+
....
blockerbugs sync
....
. Start httpd back up:
+
....
service httpd start
....

429 modules/sysadmin_guide/pages/bodhi.adoc Normal file
@@ -0,0 +1,429 @@

= Bodhi Infrastructure SOP

Bodhi is used by Fedora developers to submit potential package updates
for releases and to manage buildroot overrides. From here, bodhi handles
all of the dirty work, from sending around emails and dealing with Koji,
to composing the repositories.

Bodhi production instance: https://bodhi.fedoraproject.org

Bodhi project page: https://github.com/fedora-infra/bodhi

== Contents

[arabic]
. Contact Information
. Adding a new pending release
. 0-day Release Actions
. Configuring all bodhi nodes
. Pushing updates
. Monitoring the bodhi composer output
. Resuming a failed push
. Performing a production bodhi upgrade
. Syncing the production database to staging
. Release EOL
. Adding notices to the front page or new update form
. Using the Bodhi Shell to modify updates by hand
. Using the Bodhi shell to fix uniqueness problems with e-mail addresses
. Troubleshooting and Resolution

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Persons::
bowlofeggs
Location::
Phoenix
Servers::
* bodhi-backend01.phx2.fedoraproject.org (composer)
* os.fedoraproject.org (web front end and backend task workers for
non-compose tasks)
* bodhi-backend01.stg.phx2.fedoraproject.org (staging composer)
* os.stg.fedoraproject.org (staging web front end and backend task
workers for non-compose tasks)
Purpose::
Push package updates, and handle new submissions.

== Adding a new pending release

Adding and modifying releases is done using the
`bodhi-manage-releases` tool.

You can add a new pending release by running this command:

....
bodhi-manage-releases create --name F23 --long-name "Fedora 23" --id-prefix FEDORA --version 23 --branch f23 --dist-tag f23 --stable-tag f23-updates --testing-tag f23-updates-testing --candidate-tag f23-updates-candidate --pending-stable-tag f23-updates-pending --pending-testing-tag f23-updates-testing-pending --override-tag f23-override --state pending
....

== Pre-Beta Bodhi config

Enable the pre_beta policy in the bodhi config in ansible
(`ansible/roles/bodhi2/base/templates/production.ini.j2`).

Uncomment or add the following lines:

....
#f29.status = pre_beta
#f29.pre_beta.mandatory_days_in_testing = 3
#f29.pre_beta.critpath.min_karma = 1
#f29.pre_beta.critpath.stable_after_days_without_negative_karma = 14
....

== Post-Beta Bodhi config

Enable the post_beta policy in the bodhi config in ansible
(`ansible/roles/bodhi2/base/templates/production.ini.j2`).

Comment out or remove the following lines corresponding to the pre_beta
policy:

....
#f29.status = pre_beta
#f29.pre_beta.mandatory_days_in_testing = 3
#f29.pre_beta.critpath.min_karma = 1
#f29.pre_beta.critpath.stable_after_days_without_negative_karma = 14
....

Uncomment or add the following lines for the post_beta policy:

....
#f29.status = post_beta
#f29.post_beta.mandatory_days_in_testing = 7
#f29.post_beta.critpath.min_karma = 2
#f29.post_beta.critpath.stable_after_days_without_negative_karma = 14
....

== 0-day Release Actions

* update atomic config
* run the ansible playbook

Going from pending to a proper release in bodhi requires a few steps:

Change state from pending to current:

....
bodhi-manage-releases edit --name F23 --state current
....

You may also need to disable any pre-beta or post-beta policy defined in
the bodhi config in ansible:

....
ansible/roles/bodhi2/base/templates/production.ini.j2
....

Comment out or remove the lines related to the pre- and post-beta policy:

....
#f29.status = post_beta
#f29.post_beta.mandatory_days_in_testing = 7
#f29.post_beta.critpath.min_karma = 2
#f29.post_beta.critpath.stable_after_days_without_negative_karma = 14
#f29.status = pre_beta
#f29.pre_beta.mandatory_days_in_testing = 3
#f29.pre_beta.critpath.min_karma = 1
#f29.pre_beta.critpath.stable_after_days_without_negative_karma = 14
....

== Configuring all bodhi nodes

Run this command from the ansible checkout to configure all of bodhi in
production:

....
# This will configure the backends
$ sudo rbac-playbook playbooks/groups/bodhi2.yml

# This will configure the frontend
$ sudo rbac-playbook openshift-apps/bodhi.yml
....

== Pushing updates

SSH into the `bodhi-backend01` machine and run:

....
$ sudo -u apache bodhi-push
....

You can restrict the updates by release and/or request:

....
$ sudo -u apache bodhi-push --releases f23,f22 --request stable
....

You can also push specific builds:

....
$ sudo -u apache bodhi-push --builds openssl-1.0.1k-14.fc22,openssl-1.0.1k-14.fc23
....

This will display a list of updates that are ready to be pushed.

== Monitoring the bodhi composer output

You can monitor the bodhi composer via the `bodhi` CLI tool, or via the
systemd journal on `bodhi-backend01`:

....
# From the comfort of your own laptop.
$ bodhi composes list

# From bodhi-backend01
$ journalctl -f -u fedmsg-hub
....

== Resuming a failed push

If a push fails for some reason, you can easily resume it on
`bodhi-backend01` by running:

....
$ sudo -u apache bodhi-push --resume
....

== Performing a bodhi upgrade

=== Build Bodhi

Bodhi is deployed from the infrastructure Koji repositories. At the time
of this writing, it is deployed from the `f29-infra` and `f29-infra-stg`
(for staging) repositories. Bodhi is built for these repositories from
the `master` branch of the
https://src.fedoraproject.org/rpms/bodhi[bodhi dist-git repository].

As an example, to build a Bodhi beta for the `f29-infra-stg` repository,
you can use these commands:

....
$ rpmbuild --define "dist .fc29.infra" -bs bodhi.spec
Wrote: /home/bowlofeggs/rpmbuild/SRPMS/bodhi-3.13.0-0.0.beta.e0ca5bc.fc29.infra.src.rpm
$ koji build f29-infra /home/bowlofeggs/rpmbuild/SRPMS/bodhi-3.13.0-0.0.beta.e0ca5bc.fc29.infra.src.rpm
....

When building a Bodhi release that is intended for production, we should
build from the production dist-git repo instead of uploading an SRPM:

....
$ koji build f29-infra git+https://src.fedoraproject.org/rpms/bodhi.git#d64f40408876ec85663ec52888c4e44d92614b37
....

All builds against the `f29-infra` build target will go into the
`f29-infra-stg` repository. If you wish to promote a build from staging
to production, you can do something like this command:

....
$ koji move-build f29-infra-stg f29-infra bodhi-3.13.0-1.fc29.infra
....

=== Staging

The upgrade playbook will apply configuration changes after running the
alembic upgrade. Sometimes you may need changes applied to the Bodhi
systems in order to get the upgrade playbook to succeed. If you are in
this situation, you can apply those changes by running the bodhi-backend
playbook:

....
sudo rbac-playbook -l staging groups/bodhi-backend.yml
....

In the
https://pagure.io/fedora-infra/ansible/blob/main/f/inventory/group_vars/os_masters_stg[os_masters
inventory], edit the `bodhi_version` setting, setting it to the version
you wish to deploy to staging. For example, to deploy
`bodhi-3.13.0-1.fc29.infra` to staging, you would set that variable like
this:

....
bodhi_version: "bodhi-3.13.0-1.fc29.infra"
....

Run these commands:

....
# Synchronize the database from production to staging
$ sudo rbac-playbook manual/staging-sync/bodhi.yml -l staging
# Upgrade the Bodhi backend on staging
$ sudo rbac-playbook manual/upgrade/bodhi.yml -l staging
# Upgrade the Bodhi frontend on staging
$ sudo rbac-playbook openshift-apps/bodhi.yml -l staging
....

=== Production

The upgrade playbook will apply configuration changes after running the
alembic upgrade. Sometimes you may need changes applied to the Bodhi
systems in order to get the upgrade playbook to succeed. If you are in
this situation, you can apply those changes by running the bodhi-backend
playbook:

....
sudo rbac-playbook groups/bodhi-backend.yml -l bodhi-backend
....

In the
https://pagure.io/fedora-infra/ansible/blob/main/f/inventory/group_vars/os_masters[os_masters
inventory], edit the `bodhi_version` setting, setting it to the version
you wish to deploy to production. For example, to deploy
`bodhi-3.13.0-1.fc29.infra` to production, you would set that variable
like this:

....
bodhi_version: "bodhi-3.13.0-1.fc29.infra"
....

To update the bodhi RPMs in production:

....
# Update the backend VMs (this will also run the migrations, if any)
$ sudo rbac-playbook manual/upgrade/bodhi.yml -l bodhi-backend
# Update the frontend
$ sudo rbac-playbook openshift-apps/bodhi.yml
....

== Syncing the production database to staging

This can be useful for testing issues with production data in staging:

....
$ sudo rbac-playbook manual/staging-sync/bodhi.yml -l staging
....

== Release EOL

....
bodhi-manage-releases edit --name F21 --state archived
....

== Adding notices to the front page or new update form

You can easily add notification messages to the front page of bodhi
using the `frontpage_notice` option in
`ansible/roles/bodhi2/base/templates/production.ini.j2`. If you want to
flash a message on the New Update Form, you can use the
`newupdate_notice` variable instead. This can be useful for announcing
things like service outages, etc.
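
For illustration, the setting is a plain ini option in that template
(the message text here is made up):

....
frontpage_notice = Bodhi will be unavailable on 2019-03-01 for a scheduled outage.
....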

== Using the Bodhi Shell to modify updates by hand

The "bodhi shell" is a Python shell with the SQLAlchemy session and
transaction manager initialized. It can be run from any
production/staging backend instance and allows you to modify any models
by hand.

....
sudo pshell /etc/bodhi/production.ini

# Execute a script that sets up the `db` and provides a `delete_update` function.
# This will eventually be shipped in the bodhi package, but can also be found here.
# https://raw.githubusercontent.com/fedora-infra/bodhi/develop/tools/shelldb.py
>>> execfile('shelldb.py')
....

At this point you have access to a `db` SQLAlchemy Session instance, a
`t` `transaction` module, and `m` for the `bodhi.models`.

....
# Fetch an update, and tweak it as necessary.
>>> up = m.Update.get(u'FEDORA-2016-4d226a5f7e', db)

# Commit the transaction
>>> t.commit()
....

Here is an example of merging two updates together and deleting the
original:

....
>>> up = m.Update.get(u'FEDORA-2016-4d226a5f7e', db)
>>> up.builds
[<Build {'epoch': 0, 'nvr': u'resteasy-3.0.17-2.fc24'}>, <Build {'epoch': 0, 'nvr': u'pki-core-10.3.5-1.fc24'}>]
>>> b = up.builds[0]
>>> up2 = m.Update.get(u'FEDORA-2016-5f63a874ca', db)
>>> up2.builds
[<Build {'epoch': 0, 'nvr': u'resteasy-3.0.17-3.fc24'}>]
>>> up.builds.remove(b)
>>> up.builds.append(up2.builds[0])
>>> delete_update(up2)
>>> t.commit()
....

== Using the Bodhi shell to fix uniqueness problems with e-mail addresses

Bodhi currently enforces uniqueness on user e-mail addresses. There is
https://github.com/fedora-infra/bodhi/issues/2387[an issue] filed to
drop this upstream, but for the time being the constraint is enforced.
This can be a problem for users who have more than one FAS account if
they make one account use an e-mail address that was previously used by
another account, if that other account has not logged into Bodhi since
it was changed to use a different address. One way the user can fix this
themselves is to log in to Bodhi with the old account so that Bodhi
learns about its new address. However, an admin can also fix this by
hand by using the Bodhi shell.

For example, suppose a user has created `user_1` and `user_2`. Suppose
that `user_1` used to use `email_a@example.com` but has been changed to
use `email_b@example.com` in FAS, and `user_2` is now configured to use
`email_a@example.com` in FAS. If `user_2` attempts to log in to Bodhi,
it will cause a uniqueness violation since Bodhi does not know that
`user_1` has changed to `email_b@example.com`. The user can simply log
in as `user_1` to fix this, which will cause Bodhi to update its e-mail
address to `email_b@example.com`. Or an admin can fix it with a shell on
one of the Bodhi backend servers like this:

....
[bowlofeggs@bodhi-backend02 ~][PROD]$ sudo -u apache pshell /etc/bodhi/production.ini
2018-05-29 20:21:36,366 INFO  [bodhi][MainThread] Using python-bugzilla
2018-05-29 20:21:36,367 DEBUG [bodhi][MainThread] Using Koji Buildsystem
2018-05-29 20:21:42,559 INFO  [bodhi.server][MainThread] Bodhi ready and at your service!
Python 2.7.14 (default, Mar 14 2018, 13:36:31)
[GCC 7.3.1 20180303 (Red Hat 7.3.1-5)] on linux2
Type "help" for more information.

Environment:
  app          The WSGI application.
  registry     Active Pyramid registry.
  request      Active request object.
  root         Root of the default resource tree.
  root_factory Default root factory used to create `root`.

Custom Variables:
  m            bodhi.server.models

>>> u = m.User.query.filter_by(name=u'user_1').one()
>>> u.email = u'email_b@example.com'
>>> m.Session().commit()
....
|
||||||
|
|
||||||
|
== Troubleshooting and Resolution

=== Atomic OSTree compose failure

If the Atomic OSTree compose fails with some sort of `Device or
Resource busy` error, run `mount` to see if there are any stray `tmpfs`
mounts still active:

....
tmpfs on /var/lib/mock/fedora-22-updates-testing-x86_64/root/var/tmp/rpm-ostree.bylgUq type tmpfs (rw,relatime,seclabel,mode=755)
....

You can then
`umount /var/lib/mock/fedora-22-updates-testing-x86_64/root/var/tmp/rpm-ostree.bylgUq`
and resume the push.

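For example, a minimal cleanup session might look like this (a sketch;
the exact mount path under `/var/lib/mock` will differ for each
compose):

....
# list stray rpm-ostree tmpfs mounts left behind by the failed compose
mount | grep rpm-ostree
# unmount the stray mount reported above, then resume the push
umount /var/lib/mock/fedora-22-updates-testing-x86_64/root/var/tmp/rpm-ostree.bylgUq
....
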
=== nfs repodata cache IOError

Sometimes you may hit an IOError from createrepo_c during the
updateinfo.xml generation process:

....
IOError: Cannot open /mnt/koji/mash/updates/epel7-160228.1356/../epel7.repocache/repodata/repomd.xml: File /mnt/koji/mash/updates/epel7-160228.1356/../epel7.repocache/repodata/repomd.xml doesn't exists or not a regular file
....

This issue will be resolved with NFSv4, but in the meantime it can be
worked around by removing the `.repocache` directory and resuming the
push:

....
rm -fr /mnt/koji/mash/updates/epel7.repocache
....


modules/sysadmin_guide/pages/bugzilla.adoc (new file)

= Bugzilla Sync Infrastructure SOP

We do not run bugzilla.redhat.com. If bugzilla itself is down, we need
to get in touch with Red Hat IT or one of the bugzilla hackers (for
instance, Dave Lawrence (dkl)) in order to fix it.

Infrastructure has some scripts that perform administrative functions
on bugzilla.redhat.com. These scripts sync information from FAS and the
Package Database into bugzilla.

== Contents

[arabic]
. Contact Information
. Description
. Troubleshooting and Resolution
.. Errors while syncing bugzilla with the PackageDB

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Persons::
abadger1999
Location::
Phoenix, Denver (Tummy), Red Hat Infrastructure
Servers::
(fas1, app5) => Need to migrate these to bapp1, bugzilla.redhat.com
Purpose::
Sync Fedora information to bugzilla.redhat.com

== Description

At present there are two scripts that sync information from Fedora into
bugzilla.

=== export-bugzilla.py

`export-bugzilla.py` is the first script. It is responsible for syncing
Fedora Accounts into bugzilla. It adds Fedora packagers and bug
triagers to a bugzilla group that gives those users extra permissions
within bugzilla. This script is run from a cron job on FAS1. The source
code resides in the FAS git repo in `fas/scripts/export-bugzilla.*`;
however, the code we run on the servers presently lives in ansible:

....
roles/fas_server/files/export-bugzilla
....

=== pkgdb-sync-bugzilla

The other script is pkgdb-sync-bugzilla. It is responsible for syncing
the package owners and cclists to bugzilla from the pkgdb. The script
runs off a cron job on app5. The source code in the packagedb bzr repo
is
`packagedb/fedora-packagedb-stable/server-scripts/pkgdb-sync-bugzilla.*`.
Just like FAS, a separate copy is presently installed from ansible to
`/usr/local/bin/pkgdb-sync-bugzilla`, but that should change ASAP as
the present fedora-packagedb package installs
`/usr/bin/pkgdb-sync-bugzilla`.

== Troubleshooting and Resolution

=== Errors while syncing bugzilla with the PackageDB

One frequent problem is that people sign up to watch a package in the
packagedb, but their email address in FAS isn't a bugzilla email
address. When this happens, the scripts that sync the packagedb
information to bugzilla encounter an error and send an email like this:

....
Subject: Errors while syncing bugzilla with the PackageDB

The following errors were encountered while updating bugzilla with information
from the Package Database. Please have the problems taken care of:

({'product': u'Fedora', 'component': u'aircrack-ng', 'initialowner': u'baz@zardoz.org',
'initialcclist': [u'foo@bar.org', u'baz@zardoz.org']}, 504, 'The name foo@bar.org is not a
valid username. \n Either you misspelled it, or the person has not\n registered for a
Red Hat Bugzilla account.')
....

When this happens, we attempt to contact the person with the
problematic mail address and get them to change it. Here's a
boilerplate message:

....
To: foo@bar.org
Subject: Fedora Account System Email vs Bugzilla Email

Hello,

You are signed up to receive bug reports against the aircrack-ng package
in Fedora. Unfortunately, the email address we have for you in the
Fedora Account System is not a valid bugzilla email address. That means
that bugzilla won't send you mail and we're getting errors in the script
that syncs the cclist into bugzilla.

There's a few ways to resolve this:

1) Create a new bugzilla account with the email foo@bar.org as
an account at https://bugzilla.redhat.com.

2) Change an existing account on https://bugzilla.redhat.com to use the
foo@bar.org email address.

3) Change your email address in https://admin.fedoraproject.org/accounts
to use an email address that matches with an existing bugzilla email
address.

Please let me know what you want to do!

Thank you,
....

If the user does not reply, someone in the cvsadmin group needs to go
into the pkgdb and remove the user from the cclist for the package.


modules/sysadmin_guide/pages/bugzilla2fedmsg.adoc (new file)

= bugzilla2fedmsg SOP

Receive events from bugzilla over the RH "unified messagebus" and
rebroadcast them over our own fedmsg bus.

== Contact Information

Owner::
Messaging SIG, Fedora Infrastructure Team
Contact::
#fedora-apps, #fedora-fedmsg, #fedora-admin, #fedora-noc
Servers::
bugzilla2fedmsg01
Purpose::
Rebroadcast bugzilla events on our bus.

== Description

bugzilla2fedmsg is a small service running as the 'moksha-hub' process
which receives events from bugzilla via the RH "unified messagebus" and
rebroadcasts them to our fedmsg bus.

[NOTE]
.Note
====
Unlike _all_ of our other fedmsg services, this one runs as the
'moksha-hub' process and not as the 'fedmsg-hub'.
====

The bugzilla2fedmsg package provides a plugin to the moksha-hub that
connects out over the STOMP protocol to a 'fabric' of JBOSS activemq
FUSE brokers living in the Red Hat DMZ. We authenticate with a cert/key
pair that is kept in /etc/pki/fedmsg/. Those brokers should push
bugzilla events over STOMP to our moksha-hub daemon. When a message
arrives, we query bugzilla about the change to get some 'more
interesting' data to stuff in our payload, then we sign the message
using a fedmsg cert and fire it off to the rest of our bus.

This service has no database and no memcached usage. It depends on
those STOMP brokers and on being able to query bugzilla.rh.com.

== Relevant Files

All managed by ansible, of course:

* STOMP config: /etc/moksha/production.ini
* fedmsg config: /etc/fedmsg.d/
* certs: /etc/pki/fedmsg
* code: /usr/lib/python2.7/site-packages/bugzilla2fedmsg.py

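To sanity-check those pieces on the host, a quick sketch (read-only
commands; paths as listed above):

....
$ ls -l /etc/pki/fedmsg/
$ grep -i stomp /etc/moksha/production.ini
....
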
== Useful Commands

To look at logs, run:

....
$ journalctl -u moksha-hub -f
....

To restart the service, run:

....
$ systemctl restart moksha-hub
....

== Internal Contacts

If we need to contact someone from the RH internal "unified messagebus"
team, search for "unified messagebus" in mojo. It is operated as a
joint project between RHIT and PnT Devops. See also the
`#devops-message` IRC channel, internally.


modules/sysadmin_guide/pages/cloud.adoc (new file)

= Fedora OpenStack

== Quick Start

Controller:

....
sudo rbac-playbook hosts/fed-cloud09.cloud.fedoraproject.org.yml
....

Compute nodes:

....
sudo rbac-playbook groups/openstack-compute-nodes.yml
....

== Description

If you need to install OpenStack, either make sure the machine is
clean, or use the `ansible.git/files/fedora-cloud/uninstall.sh` script
to brute-force wipe it.

[NOTE]
.Note
====
By default, the script does not wipe the LVM group with the VMs; you
have to clean those up manually. There is a commented-out line in that
script for this.
====

On fed-cloud09, remove the file `/etc/packstack_sucessfully_finished`
to force a run of packstack and a few other commands.

After that wipe, you have to:

....
ifdown eth1
configure eth1 to become normal Ethernet with ip
yum install openstack-neutron-openvswitch
/usr/bin/systemctl restart neutron-ovs-cleanup
ifup eth1
....

Additionally, when reprovisioning OpenStack, all volumes on the Dell
EqualLogic are preserved and you have to remove them manually (or
remove them from the OS before it is reprovisioned). SSH to the Dell
EqualLogic (credentials are at the bottom of `/etc/cinder/cinder.conf`)
and run:

....
show (to get list of volumes)
volume select <volume_name> offline
volume delete <volume_name>
....

Before installing, make sure:

* the rdo repo is enabled
* `yum install openstack-packstack openstack-packstack-puppet openstack-puppet-modules` has been run
* `/usr/lib/python2.7/site-packages/packstack/plugins/dashboard_500.py`
has been edited to add the missing parentheses:
+
....
host_resources.append((ssl_key, 'ssl_ps_server.key'))
....

Now you can run the playbook:

....
sudo rbac-playbook hosts/fed-cloud09.cloud.fedoraproject.org.yml
....

If you run it after a wipe (i.e. the DB has been reset), you have to:

* import the ssh keys of users (only possible via the webUI - RHBZ
1128233)
* reset user passwords

== Compute nodes

The compute node setup is much easier and is written as a role. Use:

....
vars_files:
- ... SNIP
- /srv/web/infra/ansible/vars/fedora-cloud.yml
- "{{ private }}/files/openstack/passwords.yml"

roles:
... SNIP
- cloud_compute
....

Define a host variable in `inventory/host_vars/FQDN.yml`:

....
compute_private_ip: 172.23.0.10
....

You should also add the IP to `vars/fedora-cloud.yml`.

When adding a new compute node, please also update
`files/fedora-cloud/hosts`.

[IMPORTANT]
.Important
====
When reinstalling, make sure you have removed all members on the Dell
EqualLogic (credentials are in /etc/cinder/cinder.conf on the compute
node), otherwise the space will be blocked!
====

== Updates

Our OpenStack cloud should have updates applied, and be rebooted, when
the rest of our servers are updated and rebooted. This will cause an
outage, so please make sure to schedule it.

[arabic]
. Stop the copr-backend process on copr-be.cloud.fedoraproject.org
. Kill all copr-builder instances.
. Kill all transient/scratch instances.
. Update all instances we control: copr, persistent, infrastructure,
qa, etc.
. Shut down all instances
. Update and reboot fed-cloud09
. Update and reboot all compute nodes
. Start up all instances that were shut down in step 5.

TODO: add commands for the above as we know them.

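In the meantime, a hedged sketch of steps 1, 6 and 7, using only
commands that appear elsewhere in these SOPs or generic update
commands (the remaining steps still need their exact commands
documented):

....
# step 1: stop the copr-backend service (see the Copr SOP)
ssh root@copr-be.cloud.fedoraproject.org systemctl stop copr-backend
# steps 6 and 7: update and reboot the controller, then each compute node
ssh root@fed-cloud09.cloud.fedoraproject.org 'yum -y update && reboot'
....
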
== Troubleshooting

* Could not connect to a VM? Check your security group; the default SG
does not allow any connection.
* packstack ended up with an error? It is likely a race condition in
puppet - BZ 1135529. Just run it again.
* `ERROR : append() takes exactly one argument (2 given)`: edit
`/usr/lib/python2.7/site-packages/packstack/plugins/dashboard_500.py`
and add one more surrounding ().
* `Local ip for ovs agent must be set when tunneling is enabled`:
restart fed-cloud09, or: ssh to fed-cloud09; ifdown eth1; ifup eth1;
ifup br-ex
* mongodb problem? Follow
https://ask.openstack.org/en/question/54015/mongodbpp-error-when-installing-rdo-on-centos-7/?answer=54076#post-id-54076
* `WARNING:keystoneclient.httpclient:Failed to retrieve management_url from token`:
+
....
keystone --os-token $ADMIN_TOKEN --os-endpoint \
  https://fedorainfracloud.org:35357/v2.0/ endpoint-create --region 'RegionOne' \
  --service 91358b81b1aa40d998b3a28d0cfc86e7 --region 'RegionOne' --publicurl \
  'https://fedorainfracloud.org:5000/v2.0' --adminurl 'http://172.24.0.9:35357/v2.0' \
  --internalurl 'http://172.24.0.9:5000/v2.0'
....

== Fedora Classroom about our instance

http://meetbot.fedoraproject.org/fedora-classroom/2015-05-11/fedora-classroom.2015-05-11-15.02.log.html


modules/sysadmin_guide/pages/collectd.adoc (new file)

= Collectd SOP

Collectd (https://collectd.org/) is a client/server setup that gathers
system information from clients and allows the server to display that
information over various time periods.

Our server instance runs on log01.phx2.fedoraproject.org and most other
servers run clients that connect to the server and provide it with
data.

'''''

[arabic]
. Contact Information
. Collectd info

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Location::
https://admin.fedoraproject.org/collectd/
Servers::
log01 and all/most other servers as clients
Purpose::
provide load and system information on servers.

== Configuration

The collectd roles configure collectd on the various machines:

* collectd/base - the base client role for most servers.
* collectd/server - the server, for use on log01.
* collectd/other - various other subroles for different types of
clients.

== Web interface

The server web interface is available at:

https://admin.fedoraproject.org/collectd/

== Restarting

collectd runs as a normal systemd or sysvinit service, so you can run
`systemctl restart collectd` or `service collectd restart` to restart
it.

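For example, on a systemd host:

....
$ sudo systemctl restart collectd
$ systemctl status collectd
....
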
== Removing old hosts

Collectd keeps information around until it's deleted, so you may
sometimes need to remove data for a host or hosts that are no longer in
use. To do this:

[arabic]
. Login to log01
. cd /var/lib/collectd/rrd
. sudo rm -rf oldhostname

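As a concrete session (a sketch; `oldhostname` stands for the retired
host's data directory):

....
$ ssh log01.phx2.fedoraproject.org
$ cd /var/lib/collectd/rrd
$ ls                       # find the directory named after the retired host
$ sudo rm -rf oldhostname
....
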
== Bug reporting

Collectd is in Fedora/EPEL and we use their packages, so report bugs to
bugzilla.redhat.com.


modules/sysadmin_guide/pages/communishift.adoc (new file)

= Communishift SOP

Communishift is an OpenShift deployment hosted and maintained by Fedora
Infrastructure that is available to the community to host applications.
Fedora Infrastructure does not maintain the applications in
Communishift and is only responsible for the OpenShift deployment
itself.

Production instance:
https://console-openshift-console.apps.os.fedorainfracloud.org/

== Contact information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Persons::
nirik
Location::
Phoenix
Servers::
* os-node01.fedorainfracloud.org
* os-node02.fedorainfracloud.org
* os-node03.fedorainfracloud.org
* os-node04.fedorainfracloud.org
* os-node05.fedorainfracloud.org
* os-node06.fedorainfracloud.org
* os-node07.fedorainfracloud.org
* os-node08.fedorainfracloud.org
* os-node09.fedorainfracloud.org
* os-node10.fedorainfracloud.org
* os-node11.fedorainfracloud.org
* virthost-os01.fedorainfracloud.org
* virthost-os02.fedorainfracloud.org
* virthost-os03.fedorainfracloud.org
* virthost-aarch64-os01.fedorainfracloud.org
* virthost-aarch64-os02.fedorainfracloud.org
Purpose::
Allow community members to host services for the Fedora Project.

== Onboarding new users

To allow new users to create projects in Communishift, begin by adding
them to the `communishift` FAS group.

At the time of this writing, there is no automation to sync users from
the `communishift` FAS group to OpenShift, so you will need to log in to
the Communishift instance and grant that user permissions to create
projects. For example, to grant `bowlofeggs` permissions, you would do
this:

....
$ oc adm policy add-cluster-role-to-user self-provisioner bowlofeggs
$ oc create clusterquota for-bowlofeggs --project-annotation-selector openshift.io/requester=bowlofeggs --hard pods=10 --hard persistentvolumeclaims=5
....

This will grant bowlofeggs the ability to provision up to 10 pods and 5
volumes.

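To verify the grants afterwards, a sketch using standard `oc`
subcommands (the quota name follows the `for-<username>` convention
above):

....
$ oc get clusterrolebindings | grep self-provisioner
$ oc describe clusterquota for-bowlofeggs
....
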
== KVM access

We allow applications access to the kvm device so they can run
emulation faster. Anytime the cluster is re-installed, run:

....
#!/bin/bash
set -eux
if ! oc get --namespace=default ds/device-plugin-kvm &>/dev/null; then
    oc create --namespace=default -f https://raw.githubusercontent.com/kubevirt/kubernetes-device-plugins/master/manifests/kvm-ds.yml
fi
....

See the
https://github.com/kubevirt/kubernetes-device-plugins/blob/master/docs/README.kvm.md[upstream
docs] as well as the
https://pagure.io/fedora-infrastructure/issue/8208[original request] for
this.


modules/sysadmin_guide/pages/compose-tracker.adoc (new file)

= Compose Tracker SOP

Compose Tracker tracks the pungi composes and, for composes that do not
reach FINISHED, creates a ticket in a pagure repo with a tail of the
debug log and the koji tasks associated with it.

* Compose Tracker: https://pagure.io/releng/compose-tracker
* Failed Composes Repo: https://pagure.io/releng/failed-composes

== Contents

[arabic]
. Contact Information

== Contact Information

Owner::
Fedora Release Engineering Team
Contact::
#fedora-releng
Persons::
dustymabe mohanboddu
Purpose::
Track failed composes

== More Information

For information about the tool and its deployment on Fedora Infra
OpenShift, please see the documentation at
https://pagure.io/releng/compose-tracker/blob/master/f/README.md


modules/sysadmin_guide/pages/contenthosting.adoc (new file)

= Content Hosting Infrastructure SOP

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main, fedora-infrastructure-list
Location::
Phoenix
Servers::
secondary1, netapp[1-3], torrent1
Purpose::
Policy regarding hosting, removal and pruning of content.
Scope::
download.fedora.redhat.com, alt.fedoraproject.org,
archives.fedoraproject.org, secondary.fedoraproject.org,
torrent.fedoraproject.org

== Description

Fedora hosts both Fedora content and some non-Fedora content. Our
resources are finite, so we have to have some policy around when to
remove old content. This SOP describes the tests used to remove
content. The spirit of this SOP is to let more people host content and
give it a try, to prove that it's useful. If it's not popular or
useful, it will get removed. Out-of-date or expired content will also
be removed.

=== What hosting options are available

Aside from the hosting at https://pagure.io/ we have a series of
mirrors we're allowing people to use. They are located at:

* http://archive.fedoraproject.org/pub/archive/ - For archives of
historical Fedora releases
* http://secondary.fedoraproject.org/pub/fedora-secondary/ - For
secondary architectures
* http://alt.fedoraproject.org/pub/alt/ - For misc content / catchall
* http://torrent.fedoraproject.org/ - For torrent hosting
* http://spins.fedoraproject.org/ - For official Fedora Spins hosting,
mirrored somewhat
* http://download.fedoraproject.com/pub/ - For official Fedora
Releases, mirrored widely

=== Who can host? What can be hosted?

Any official Fedora content can be hosted and made available for
mirroring. Official content is determined by the Council by virtue of
allowing people to use the Fedora trademark. People representing these
teams will be allowed to host.

=== Non Official Hosting

People wanting to host unofficial bits may request approval for
hosting. Create a ticket at https://pagure.io/fedora-infrastructure/
explaining what and why Fedora should host it. Such requests will be
reviewed by the Fedora Infrastructure team.

Requests for non-official hosting that may conflict with existing
Fedora policies will be escalated to the Council for approval.

=== Licensing

Anything hosted with Fedora must come with a Free software license that
is approved by Fedora. See http://fedoraproject.org/wiki/Licensing for
more.

== Requesting Space

* Make sure you have a Fedora account -
https://admin.fedoraproject.org/accounts/
* Ensure you have signed the Fedora Project Contributor Agreement
(FPCA)
* Submit a hosting request - https://pagure.io/fedora-infrastructure/
** Include who you are, and any group you are working with (e.g. a SIG)
** Include space requirements
** Include an estimate of the number of downloads expected (if you
can).
** Include the nature of the bits you want to host.
* Apply for the hosted-content group -
https://admin.fedoraproject.org/accounts/group/view/hosted-content

== Using Space

A dedicated namespace in the mirror will be assigned to you. It will be
your responsibility to upload content, remove old content, stay within
your quota, etc. If you have any questions or concerns about this
please let us know. Generally you will use rsync. For example:

....
rsync -av --progress ./my.iso secondary01.fedoraproject.org:/srv/pub/alt/mySpace/
....

[IMPORTANT]
.Important
====
None of our mirrored content is backed up. Ensure that you keep backups
of your content.
====

== Content Pruning / Purging / Removal

The following guidelines / tests will be used to determine whether or
not to remove content from the mirror.

=== Expired / Old Content

If content meets any of the following criteria it may be removed:

* Content that has reached end of life (is no longer receiving
updates).
* Pre-release content that has been superseded.
* EOL releases that have been moved to archives.
* N-2 or greater releases. If more than 3 versions of a piece of
content are on the mirror, the oldest may be removed.

=== Limited Use Content

If content meets any of the following criteria it may be removed:

* Content with exceedingly few seeders or downloaders, with little
prospect of increasing those numbers, and which is older than 1 year.
* Content such as videos or audio which is several years old.

=== Catch All Removal

Fedora reserves the right to remove any content for any reason at any
time. We'll do our best to host things, but sometimes we'll need space
or just need to remove stuff for legal or policy reasons.


modules/sysadmin_guide/pages/copr.adoc (new file)

= Copr

Copr is a build system for 3rd party packages.

Frontend:::
* http://copr.fedorainfracloud.org/
Backend:::
* http://copr-be.cloud.fedoraproject.org/
Package signer:::
* copr-keygen.cloud.fedoraproject.org
Dist-git:::
* copr-dist-git.fedorainfracloud.org
Devel instances (NO NEED TO CARE ABOUT THEM, JUST THOSE ABOVE):::
* http://copr-fe-dev.cloud.fedoraproject.org/
* http://copr-be-dev.cloud.fedoraproject.org/
* copr-keygen-dev.cloud.fedoraproject.org
* copr-dist-git-dev.fedorainfracloud.org

== Contact Information

Owner::
msuchy (mirek)
Contact::
#fedora-admin, #fedora-buildsys
Location::
Fedora Cloud
Purpose::
Build system

== This document

This document provides condensed information to help you keep Copr
alive and working. For more sophisticated business processes, please
see
https://docs.pagure.org/copr.copr/maintenance_documentation.html

== TROUBLESHOOTING

Almost every problem with Copr is due to a problem with spawning
builder VMs, or with processing the action queue on the backend.

=== VM spawning/termination problems

Try to restart the copr-backend service:

....
$ ssh root@copr-be.cloud.fedoraproject.org
$ systemctl restart copr-backend
....

If this doesn't solve the problem, try following the logs for clues:

....
$ tail -f /var/log/copr-backend/{vmm,spawner,terminator}.log
....

As a last resort, you can terminate all builders and let copr-backend
throw away all information about them. This action will obviously
interrupt all running builds and reschedule them:

....
$ ssh root@copr-be.cloud.fedoraproject.org
$ systemctl stop copr-backend
$ cleanup_vm_nova.py
$ redis-cli
> FLUSHALL
$ systemctl start copr-backend
....

Sometimes OpenStack cannot handle spawning too many VMs at the same
time, so it is safer to edit, on copr-be.cloud.fedoraproject.org:

....
vi /etc/copr/copr-be.conf
....

and change:

....
group0_max_workers=12
....

to `6`. Start the copr-backend service and some time later increase the
value back to the original. Copr automatically detects the change and
increases the number of workers.

The set of aarch64 VMs isn't maintained by OpenStack, but by Copr's
backend itself. Steps to diagnose:

....
$ ssh root@copr-be.cloud.fedoraproject.org
[root@copr-be ~][PROD]# systemctl status resalloc
● resalloc.service - Resource allocator server
...

[root@copr-be ~][PROD]# less /var/log/resallocserver/main.log

[root@copr-be ~][PROD]# su - resalloc

[resalloc@copr-be ~][PROD]$ resalloc-maint resource-list
13569 - aarch64_01_prod_00013569_20190613_151319 pool=aarch64_01_prod tags=aarch64 status=UP
13597 - aarch64_01_prod_00013597_20190614_083418 pool=aarch64_01_prod tags=aarch64 status=UP
13594 - aarch64_02_prod_00013594_20190614_082303 pool=aarch64_02_prod tags=aarch64 status=STARTING
...

[resalloc@copr-be ~][PROD]$ resalloc-maint ticket-list
879 - state=OPEN tags=aarch64 resource=aarch64_01_prod_00013569_20190613_151319
918 - state=OPEN tags=aarch64 resource=aarch64_01_prod_00013608_20190614_135536
904 - state=OPEN tags=aarch64 resource=aarch64_02_prod_00013594_20190614_082303
919 - state=OPEN tags=aarch64
...
....

Be careful when there's some resource in the `STARTING` state. If
that's so, check
`/usr/bin/tail -F -n +0 /var/log/resallocserver/hooks/013594_alloc`.
Copr takes tickets from the resalloc server; if the resources fail to
spawn, the ticket numbers are not assigned an appropriately tagged
resource for a long time.

If that happens (it shouldn't) and there's some inconsistency between
resalloc's database and the actual status on the aarch64 hypervisors
(`ssh copr@virthost-aarch64-os0{1,2}.fedorainfracloud.org` - use
`virsh` there to introspect their statuses), use
`resalloc-maint resource-delete`, `resalloc ticket-close` or `psql`
commands to fix up resalloc's DB.

=== Backend Troubleshooting

Information about the status of the Copr backend services:

....
systemctl status copr-backend*.service
....

Utilization of workers:

....
ps axf
....

Worker processes change `$0` to show which task they are working on and
on which builder.

To list which VM builders are tracked by the copr-vmm service:

....
/usr/bin/copr_get_vm_info.py
....

=== Appstream builder troubleshooting

The appstream builder is painfully slow when running on a repository
with a huge number of packages. See
https://github.com/hughsie/appstream-glib/issues/301 . You might need
to disable it for some projects:

....
$ ssh root@copr-be.cloud.fedoraproject.org
$ cd /var/lib/copr/public_html/results/<owner>/<project>/
$ touch .disable-appstream
# You should probably also delete existing appstream data because
# they might be obsolete
$ rm -rf ./appdata
....

=== Backend action queue issues

First check the number of not-yet-processed actions. If that number
isn't zero and is not decreasing relatively fast (say a single action
takes longer than 30s), there might be some problem. Logs for the
action dispatcher can be found in:

....
/var/log/copr-backend/action_dispatcher.log
....

Check that there's no stuck process under the `Action dispatch` parent
process in `pstree -a copr` output.

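A quick inspection session for the action queue might look like this (a
sketch; commands and paths as above):

....
$ ssh root@copr-be.cloud.fedoraproject.org
$ tail -f /var/log/copr-backend/action_dispatcher.log
$ pstree -a copr | less
....
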
== Deploy information

Using playbooks and rbac:

....
$ sudo rbac-playbook groups/copr-backend.yml
$ sudo rbac-playbook groups/copr-frontend-cloud.yml
$ sudo rbac-playbook groups/copr-keygen.yml
$ sudo rbac-playbook groups/copr-dist-git.yml
....

The `copr-setup.txt` manual
(https://pagure.io/copr/copr/blob/master/f/copr-setup.txt) is severely
outdated, but there is no up-to-date alternative. We should extract the
useful information from it, put it here in the SOP or into
https://docs.pagure.org/copr.copr/maintenance_documentation.html, and
then throw `copr-setup.txt` away.

On the backend, the copr-backend service should be running (it spawns
several processes). The backend spawns VMs from the Fedora Cloud. You
cannot log in to those machines directly; you have to:

....
$ ssh root@copr-be.cloud.fedoraproject.org
$ su - copr
$ copr_get_vm_info.py
# find IP address of the VM that you want
$ ssh root@172.16.3.3
....

Instances can be easily terminated in
https://fedorainfracloud.org/dashboard

=== Order of start up

When reprovisioning, you should start the copr-keygen and copr-dist-git
machines first (in any order). Then you can start copr-be. Well, you
can start it sooner, but make sure that the copr-* services are
stopped.

The copr-fe machine is completely independent and can be started at any
time. If the backend is stopped, it will just queue jobs.

== Logs

=== Backend

* /var/log/copr-backend/action_dispatcher.log
* /var/log/copr-backend/actions.log
* /var/log/copr-backend/backend.log
* /var/log/copr-backend/build_dispatcher.log
* /var/log/copr-backend/logger.log
* /var/log/copr-backend/spawner.log
* /var/log/copr-backend/terminator.log
* /var/log/copr-backend/vmm.log
* /var/log/copr-backend/worker.log

There are also several logs for non-essential features, such as
copr_prune_results.log, hitcounter.log and cleanup_vms.log, that you
shouldn't worry about.

=== Frontend

* /var/log/copr-frontend/frontend.log
* /var/log/httpd/access_log
* /var/log/httpd/error_log

=== Keygen

* /var/log/copr-keygen/main.log

=== Dist-git

* /var/log/copr-dist-git/main.log
* /var/log/httpd/access_log
* /var/log/httpd/error_log

== Services

=== Backend

* copr-backend
** copr-backend-action
** copr-backend-build
** copr-backend-log
** copr-backend-vmm
* redis
* lighttpd

All the `copr-backend-*.service` units are configured to be a part of
`copr-backend.service`, so e.g. when restarting all of them, just
restart `copr-backend.service`.

=== Frontend

* httpd
* postgresql

=== Keygen

* signd

=== Dist-git

* httpd
* copr-dist-git

== PPC64LE Builders

Builders for PPC64 are located at rh-power2.fit.vutbr.cz, and anyone
with access to the buildsys ssh key can get there using keys, as
msuchy@rh-power2.fit.vutbr.cz.

There are these commands:

....
$ ls bin/
destroy-all.sh  reinit-vm26.sh  reinit-vm28.sh  virsh-destroy-vm26.sh  virsh-destroy-vm28.sh  virsh-start-vm26.sh  virsh-start-vm28.sh
get-one-vm.sh   reinit-vm27.sh  reinit-vm29.sh  virsh-destroy-vm27.sh  virsh-destroy-vm29.sh  virsh-start-vm27.sh  virsh-start-vm29.sh
....

bin/destroy-all.sh:: destroys all VMs and reinits them
reinit-vmXX.sh:: copies the VM image from the template
virsh-destroy-vmXX.sh:: destroys a VM
virsh-start-vmXX.sh:: starts a VM
get-one-vm.sh:: starts one VM and returns its IP - this is used in Copr
playbooks

In case of a big queue of PPC64 tasks, simply call bin/destroy-all.sh;
it will destroy stuck VMs and the copr backend will spawn new ones.

== Ports opened for public

Frontend:

[width="86%",cols="13%,17%,16%,54%",options="header",]
|===
|Port |Protocol |Service |Reason
|22 |TCP |ssh |Remote control
|80 |TCP |http |Serving Copr frontend website
|443 |TCP |https |^^
|===

Backend:

[width="86%",cols="13%,17%,16%,54%",options="header",]
|===
|Port |Protocol |Service |Reason
|22 |TCP |ssh |Remote control
|80 |TCP |http |Serving build results and repos
|443 |TCP |https |^^
|===

Distgit:

[width="86%",cols="13%,17%,16%,54%",options="header",]
|===
|Port |Protocol |Service |Reason
|22 |TCP |ssh |Remote control
|80 |TCP |http |Serving cgit interface
|443 |TCP |https |^^
|===

Keygen:

[width="86%",cols="13%,17%,16%,54%",options="header",]
|===
|Port |Protocol |Service |Reason
|22 |TCP |ssh |Remote control
|===

== Resources justification

Copr currently uses the following resources.

=== Frontend

* RAM: 2G (out of 4G) and some swap
* CPU: 2 cores (3400 MHz) with load 0.92, 0.68, 0.65

Most of the memory is eaten by PostgreSQL, followed by Apache. The CPU
is likewise mainly used by those two services, but in the reverse
order.

I don't think we can settle for any instance that provides less than 2G
RAM (obviously), but ideally we need 3G+. A 2-core CPU is good enough.

* Disk space: 17G for the system and 8G for the `pgsqldb` directory

If needed, we are able to clean up the database directory of old dumps
and backups and get down to around 4G of disk space.

=== Backend

* RAM: 5G (out of 16G)
* CPU: 8 cores (3400 MHz) with load 4.09, 4.55, 4.24

The backend takes care of spinning up builders and running ansible
playbooks on them, running `createrepo_c` (on big repositories), and so
on. Copr utilizes two queues: one for builds, which are delegated to
the OpenStack builders, and the action queue. Actions, however, are
processed directly by the backend, so they can spike our load. We would
ideally like to have the same computing power that we have now. Maybe
we can go lower than 16G RAM, possibly down to 12G.

* Disk space: 30G for the system, 5.6T (out of 6.8T) for build results

Currently, we have 1.3T of backup data that is going to be deleted
soon, but nevertheless we cannot go any lower on storage. Disk space is
a long-term issue for us and we need to make a lot of compromises just
to survive our daily increase (which is around 10G of new data). Many
features are blocked by not having enough storage. We cannot go any
lower, and we also cannot go much longer with the current storage.

=== Distgit

* RAM: ~270M (out of 4G), but climbs to ~1G when busy
* CPU: 2 cores (3400 MHz) with load 1.35, 1.00, 0.53

Personally, I wouldn't downgrade the machine too much. Possibly we can
live with 3G RAM, but I wouldn't go any lower.

* Disk space: 7G for the system, 1.3T for dist-git data

We currently employ a lot of aggressive cleaning strategies on our
dist-git data, so we can't go any lower than what we have.

=== Keygen

* RAM: ~150M (out of 2G)
* CPU: 1 core (3400 MHz) with load 0.10, 0.31, 0.25

We are basically running just `signd` and `httpd` here, both with
minimal resource requirements. The memory usage is topped by
`systemd-journald`.

* Disk space: 7G for the system and ~500M (out of ~700M) for GPG keys

We are slowly pushing the GPG key storage to its limit, so in the case
of migrating copr-keygen somewhere, we would like to scale it up to at
least 1G.


modules/sysadmin_guide/pages/cyclades.adoc (new file)

= Cyclades

cyclades notes

[arabic]
. login as root - the default password is tslinux
. change the password for root and admin to our password from the
phx2-access.txt file in the private repo
. port forward to the web browser for the cyclades:
`ssh -L 8080:rack47-serial.phx2.fedoraproject.org:80`
. connect to localhost:8080 in your web browser
. login with root and the password you set above
. click on 'security'
. click on 'moderate'
. logout, port forward port 443 as above:
`ssh -L 8080:rack47-serial.phx2.fedoraproject.org:443`
. click on the 'wizard' button at lower left
. proceed through the wizard. Info needed:
* serial ports are set to 115200 8N1 by default
* do not set up buffering
* give it the IP of our syslog server
. click 'apply changes'
. hope
. log back in
. name/setup the port aliases


modules/sysadmin_guide/pages/darkserver.adoc (new file)

= Darkserver SOP

To set up http://darkserver.fedoraproject.org, based on the Darkserver
project, to provide GNU_BUILD_ID information for packages. A devel
instance can be seen at http://darkserver01.dev.fedoraproject.org and
the staging instance is
http://darkserver01.stg.phx2.fedoraproject.org/.

This page describes how to set up the server.

== Contents

[arabic]
. Contact Information
. Installing the server
. Setting up the database
. SELinux Configuration
. Koji plugin setup
. Debugging

== Contact Information

Owner:::
Fedora Infrastructure Team
Contact:::
#fedora-admin
Persons:::
kushal mether
Sponsor:::
nirik
Location:::
phx2
Servers:::
darkserver01, darkserver01.stg, darkserver01.dev
Purpose:::
To host Darkserver

== Installing the Server

....
root@localhost# yum install darkserver
....

== Setting up the database

We are using MySQL as the database. We will need two users: one for the
koji plugin and one for darkserver:

....
root@localhost# mysql -u root
mysql> CREATE DATABASE darkserver;
mysql> GRANT INSERT ON darkserver.* TO kojiplugin@'koji-hub-ip' IDENTIFIED BY 'XXX';
mysql> GRANT SELECT ON darkserver.* TO dark@'darkserver-ip' IDENTIFIED BY 'XXX';
....

Set up this DB configuration in the conf file under
`/etc/darkserver/darkserverweb.conf`:

....
[darkserverweb]
host=db host name
user=dark
password=XXX
database=darkserver
....

Now set up the DB tables if it is a new install.

(For this you may need to `'GRANT * ON darkserver.*'` to the web user,
and then `'REVOKE * ON darkserver.*'` after running.)

....
root@localhost# python /usr/lib/python2.6/site-packages/darkserverweb/manage.py syncdb
....

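A sketch of that grant/syncdb/revoke sequence, assuming standard MySQL
syntax (host specifiers as in the grants above):

....
mysql> GRANT ALL ON darkserver.* TO dark@'darkserver-ip';
root@localhost# python /usr/lib/python2.6/site-packages/darkserverweb/manage.py syncdb
mysql> REVOKE ALL ON darkserver.* FROM dark@'darkserver-ip';
....
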
== SELinux Configuration

Do the following to allow the webserver to connect to the database:

....
root@localhost# setsebool -P httpd_can_network_connect_db 1
....

== Setting up the Koji plugin

Install the package:

....
root@localhost# yum install darkserver-kojiplugin
....

Then fill in the configuration file under
`/etc/koji-hub/plugins/darkserver.conf`:

....
[darkserver]
host=db host name
user=kojiplugin
password=XXX
database=darkserver
port=3306
....

Then enable the plugin in the koji hub configuration.

== Debugging

Set DEBUG to True in the `/etc/darkserver/settings.py` file and restart
Apache.


modules/sysadmin_guide/pages/database.adoc (new file)

= Database Infrastructure SOP

Our database servers provide database storage for many of our apps.

Contents

[arabic]
. Contact Information
. Description
. Creating a New Postgresql Database
. Troubleshooting and Resolution
.. Connection issues
.. Some useful queries
... What queries are running
... Seeing how "dirty" a table is
... XID Wraparound
.. Restart Procedure
... Koji
... Bodhi
. Note about TurboGears and MySQL
. Restoring from backups or specific dbs

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main, sysadmin-dba group
Location::
Phoenix
Servers::
db01, db03, db-fas01, db-datanommer02, db-koji01, db-s390-koji01,
db-arm-koji01, db-ppc-koji01, db-qa01, db-qastg01
Purpose::
Provides database connections to many of our apps.

== Description

db01, db03 and db-fas01 are our primary servers. db01 and db-fas01 run
PostgreSQL. db03 contains MariaDB. db-koji01, db-s390-koji01,
db-arm-koji01 and db-ppc-koji01 contain the secondary kojis. db-qa01
and db-qastg01 contain resultsdb. db-datanommer02 stores all the bus
messages, in a PostgreSQL database.

== Creating a New Postgresql Database

Creating a new database on our postgresql server isn't hard, but there
are several steps that should be taken to make the database server as
secure as possible.

We want to separate the database permissions so that we don't have one
user/password combination that can do anything it likes to the database
on every host (the webapp user can usually do a lot of things even
without those extra permissions, but every little bit helps).

Say we have an app called "raffle". We'd have three users:

* raffleadmin: able to make any changes they want to this particular
database. It should not be used day to day, but only for things like
updating the database schema when an update occurs. We could very
likely disable this account in the db whenever we are not using it.
* raffleapp: the database user that the web application uses. This will
likely need to be able to insert and select from all tables. It will
probably need to update most tables as well. There may be some tables
that it does _not_ need delete on. It should almost certainly not need
schema-modifying permissions. (With postgres, it likely also needs
permission to insert/select on sequences as well.)
* rafflereadonly: only able to read data from tables, not able to
modify anything. Sadly, we aren't using this often, but it can be
useful for scripts that need to talk directly to the database without
modifying it.

....
db2 $ sudo -u postgres createuser -P -E NEWDBadmin
Password: <randomly generated password>
db2 $ sudo -u postgres createuser -P -E NEWDBapp
Password: <randomly generated password>
db2 $ sudo -u postgres createuser -P -E NEWDBreadonly
Password: <randomly generated password>
db2 $ sudo -u postgres createdb -E utf8 NEWDB -O NEWDBadmin
db2 $ sudo -u postgres psql NEWDB
NEWDB=# revoke all on database NEWDB from public;
NEWDB=# revoke all on schema public from public;
NEWDB=# grant all on schema public to NEWDBadmin;
NEWDB=# [grant permissions to NEWDBapp as appropriate for your app]
NEWDB=# [grant permissions to NEWDBreadonly as appropriate for a user that
is only trusted enough to read information]
NEWDB=# grant connect on database NEWDB to nagiosuser;
....

If your application needs the NEWDBapp user and password to connect to
the database, you probably want to add these to ansible as well. Put
the password in the private repo on batcave01, then use a template file
to incorporate it into the config file. See fas.pp for an example.

== Troubleshooting and Resolution

=== Connection issues

There are no known outstanding issues with the database itself.
Remember that every time either database is restarted, services will
have to be restarted (see below).

=== Some useful queries

==== What queries are running

This can help you find out what queries are currently running on the
server:

....
select datname, pid, query_start, backend_start, query from
pg_stat_activity where state<>'idle' order by query_start;
....

This can help you find how many connections to the db server there are
for each individual database:

....
select datname, count(datname) from pg_stat_activity group by datname
order by count desc;
....

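If a query found above needs to be stopped, standard PostgreSQL offers
these functions (the pid `12345` is a placeholder taken from the
`pg_stat_activity` output):

....
-- cancel just the running query (gentler)
select pg_cancel_backend(12345);
-- or terminate the whole backend connection
select pg_terminate_backend(12345);
....
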
==== Seeing how "dirty" a table is

We've added a function from postgres's contrib directory to tell how
dirty a table is. By dirty we mean: how many tuples are active, how many
have been marked as having old data (and therefore "dead"), and how much
free space is allocated to the table but not used:

....
\c fas2
\x
select * from pgstattuple('visit_identity');
table_len          | 425984
tuple_count        | 580
tuple_len          | 46977
tuple_percent      | 11.03
dead_tuple_count   | 68
dead_tuple_len     | 5508
dead_tuple_percent | 1.29
free_space         | 352420
free_percent       | 82.73
\x
....

Vacuum should clear out dead_tuples. Only a vacuum full, which will lock
the table and therefore should be avoided, will clear out free space.

==== XID Wraparound

Find out how close we are to having to perform a vacuum of a database
(as opposed to individual tables of the db). We should schedule a vacuum
when about 50% of the transaction ids have been used (approximately
530,000,000 xids):

....
select datname, age(datfrozenxid), pow(2, 31) - age(datfrozenxid) as xids_remaining
from pg_database order by xids_remaining;
....

See the PostgreSQL documentation for more information on transaction ID
wraparound.

== Restart Procedure

If the database server needs to be restarted it should come back on its
own. Otherwise each service on it can be restarted:

....
service mysqld restart
service postgresql restart
....

=== Koji

Any time postgresql is restarted, koji needs to be restarted. Please
also see the Restarting Koji SOP.

=== Bodhi

Anytime postgresql is restarted, Bodhi will need to be restarted. No SOP
currently exists for this.

== TurboGears and MySQL

[NOTE]
.About TurboGears and MySQL
====
There's a known bug in TurboGears that causes MySQL clients not to
automatically reconnect when the connection is lost. Typically a restart
of the TurboGears application will correct this issue.
====

== Restoring from backups or specific dbs

Our backups store the latest copy in /backups/ on each db server. These
backups are created automatically by the db-backup script run from
cron. Look in /usr/local/bin for the backup script.

To restore partially or completely you need to:

[arabic]
. Set up postgres on a system.
. Start postgres/run initdb. If this new system running postgres has
already run ansible, it will have wrong config files in
/var/lib/pgsql/data - clear them out before you start postgres so
initdb can work.
. Grab the backups you need from /backups - also grab global.sql. Edit
up global.sql to only create/alter the dbs you care about.
. As postgres run: `psql -U postgres -f global.sql`
. When this completes you can restore each db with (as the postgres
user):
+
....
createdb $dbname
pg_restore -d $dbname dbname_backup_file.db
....
. Restart postgres and check your data.

120
modules/sysadmin_guide/pages/datanommer.adoc
Normal file
@@ -0,0 +1,120 @@

= datanommer SOP

Consume fedmsg bus activity and stuff it in a postgresql db.

== Contact Information

Owner::
Messaging SIG, Fedora Infrastructure Team
Contact::
#fedora-apps, #fedora-fedmsg, #fedora-admin, #fedora-noc
Servers::
busgateway01
Purpose::
Save fedmsg bus activity

== Description

datanommer is a set of three modules:

python-datanommer-models::
Schema definition and API for storing new items and querying existing
items
python-datanommer-consumer::
A plugin for the fedmsg-hub that actively listens to the bus and
stores events.
datanommer-commands::
A set of CLI tools for querying the DB.

datanommer will one day serve as a backend for future web services like
datagrepper and dataviewer.

Source: https://github.com/fedora-infra/datanommer/

Plan: https://fedoraproject.org/wiki/User:Ianweller/statistics_plus_plus

== CLI tools

Dump the db into a file as json:

....
$ datanommer-dump > datanommer-dump.json
....

When was the last bodhi message?:

....
$ # It was 678 seconds ago
$ datanommer-latest --category bodhi --timesince
[678]
....

When was the last bodhi message in more readable terms?:

....
$ # It was 12 minutes and 43 seconds ago
$ datanommer-latest --category bodhi --timesince --human
[0:12:43.087949]
....

What was that last bodhi message?:

....
$ datanommer-latest --category bodhi
[{"bodhi": {
  "topic": "org.fedoraproject.stg.bodhi.update.comment",
  "msg": {
    "comment": {
      "group": null,
      "author": "ralph",
      "text": "Testing for latest datanommer.",
      "karma": 0,
      "anonymous": false,
      "timestamp": 1360349639.0,
      "update_title": "xmonad-0.10-10.fc17"
    },
    "agent": "ralph"
  },
}}]
....

Show me stats on datanommer messages by topic:

....
$ datanommer-stats --topic
org.fedoraproject.stg.fas.group.member.remove has 10 entries
org.fedoraproject.stg.logger.log has 76 entries
org.fedoraproject.stg.bodhi.update.comment has 5 entries
org.fedoraproject.stg.busmon.colorized-messages has 10 entries
org.fedoraproject.stg.fas.user.update has 10 entries
org.fedoraproject.stg.wiki.article.edit has 106 entries
org.fedoraproject.stg.fas.user.create has 3 entries
org.fedoraproject.stg.bodhitest.testing has 4 entries
org.fedoraproject.stg.fedoratagger.tag.create has 9 entries
org.fedoraproject.stg.fedoratagger.user.rank.update has 5 entries
org.fedoraproject.stg.wiki.upload.complete has 1 entries
org.fedoraproject.stg.fas.group.member.sponsor has 6 entries
org.fedoraproject.stg.fedoratagger.tag.update has 1 entries
org.fedoraproject.stg.fas.group.member.apply has 17 entries
org.fedoraproject.stg.__main__.testing has 1 entries
....

== Upgrading the DB Schema

datanommer uses "python-alembic" to manage its schema. When developers
want to add new columns or features, these should/must be tracked in
alembic and shipped with the RPM.

In order to run upgrades on our stg/prod dbs:

[arabic]
. ssh to busgateway01\{.stg}
. `cd /usr/share/datanommer.models/`
. Run the following, over and over again, until the db is fully
upgraded:
+
....
$ alembic upgrade +1
....
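
To see where the schema currently stands before and after upgrading,
the standard alembic subcommands can help (run from the same
directory):

....
$ alembic current
$ alembic history
....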

133
modules/sysadmin_guide/pages/debuginfod.adoc
Normal file
@@ -0,0 +1,133 @@

= Fedora Debuginfod Service - SOP

Debuginfod is the software that lies behind the service at
https://debuginfod.fedoraproject.org/ and
https://debuginfod.stg.fedoraproject.org/ . These services run on 1 VM
each in the stg and prod infrastructure at IAD2.

== Contact Information

Owner:::
RH perftools team + Fedora Infrastructure Team
Contact:::
@fche in #fedora-noc
Servers:::
VMs
Purpose:::
Serve elf/dwarf/source-code debuginfo for supported releases to
debugger-like tools in Fedora.
Repository:::
https://sourceware.org/elfutils/Debuginfod.html
https://fedoraproject.org/wiki/Debuginfod

== How it works

One virtual machine in prod NFS-mounts the koji build system's RPM
repository, read-only. The production VM has a virtual twin in the
staging environment. They each run elfutils debuginfod to index
designated RPMs into a large local sqlite database. They answer HTTP
queries received from users on the Internet via reverse-proxies at the
https://debuginfod.fedoraproject.org/ URL. The reverse proxies apply
gzip compression on the data and provide redirection of the root `/`
location only into the fedora wiki.

Normally, it is autonomous and needs no maintenance. It should come back
nicely after many kinds of outage. The software is based on elfutils in
Fedora, but may occasionally track a custom COPR build with backported
patches from future elfutils versions.

== Configuration

The daemon uses systemd and `/etc/sysconfig/debuginfod` to set basic
parameters. These have been tuned from the distro defaults via
experimental hand-editing or ansible. Key parameters are:

[arabic]
. The -I/-X include/exclude regexes. These tell debuginfod what fedora
versions to include RPMs for. If index disk space starts to run low, one
can eliminate some older fedoras from the index to free up space (after
the next groom cycle).
. The --fdcache related parameters. These tell debuginfod how much data
to cache from RPMs. (Some debuginfo files - kernel, llvm, gtkweb, ... -
are huge and worth retaining instead of repeatedly extracting.) This is
a straight disk space vs. time tradeoff.
. The -t (scan interval) parameter. Scanning lets an index get bigger,
as new RPMs in koji are examined and their contents indexed. Each pass
takes a bunch of hours to traverse the entire koji NFS directory
structure to fstat() everything for newness or change. A smaller scan
interval lets debuginfod react quicker to koji builds coming into
existence, but increases load on the NFS server. More -n (scan threads)
may help the indexing process go faster, if the networking fabric & NFS
server are underloaded.
. The -g (groom interval) parameter. Grooming lets an index get smaller,
as files removed from koji will be forgotten about. It can be run very
intermittently - weekly or less - since it takes many hours and cannot
run concurrently with scanning.

A quick:

....
systemctl restart debuginfod
....

activates the new settings.
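
As a sketch of how those parameters might combine in the sysconfig
file (the environment variable name, regexes, and values shown here are
all assumptions - check the shipped systemd unit file before editing):

....
# /etc/sysconfig/debuginfod (hypothetical values)
# index only current releases; rescan every 6h, groom weekly, 4 threads
DEBUGINFOD_OPTS="-I '\.fc3[89]\.' -X '\.fc3[0-7]\.' --fdcache-mbs=20000 -t 21600 -g 604800 -n 4"
....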

In case of some drastic failure like database corruption or signs of
penetration/abuse, one can shut down the server with systemd, and/or
stop traffic at the incoming proxy configuration level. The index sqlite
database under `/var/cache/debuginfod` may be deleted, if necessary, but
keep in mind that it takes days to reindex the relevant parts of koji.
Alternately, with the services stopped, the 150GB+ sqlite database files
may be freely copied between the staging and production servers, if that
helps during disaster recovery.

== Monitoring

=== Prometheus

The debuginfod daemons answer the standard /metrics URL endpoint to
serve a variety of operational metrics in prometheus format. Important
metrics include:

[arabic]
. filesys_free_ratio - free space on the filesystems. (These are also
monitored via fedora-infra nagios.) If the free space on the database or
tmp partition falls low, further indexing or even service may be
impacted. Add more disk space if possible, or start eliding older fedora
versions from the database via the -I/-X daemon options.
. thread_busy - number of busy threads. During indexing, 1-6 threads may
be busy for minutes or even days, intermittently. User requests show up
as "buildid" (real request) or "buildid-after-you" (deferred duplicate
request) labels. If there are more than a handful of "buildid" ones,
there may be an overload/abuse underway, in which case it's time to
identify the excessive traffic via the logs and get a temporary iptables
block going. Or perhaps there is an outage or slowdown of the koji NFS
storage system, in which case there's not much to do.
. error_count - these should be zero or near zero all the time.
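
A quick way to eyeball these metrics by hand, assuming the /metrics
endpoint is reachable from where you run it:

....
$ curl -s https://debuginfod.fedoraproject.org/metrics | grep -E 'thread_busy|error_count'
....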

=== Logs

The debuginfod daemons produce voluminous logs into the local systemd
journal, whence the traffic moves to the usual fedora-infra log01
server, `/var/log/hosts/debuginfod*/YYYY/MM/DD/messages.log`. The lines
related to HTTP GET identify the main webapi traffic, with originating
IP addresses in the XFF: field, and response size and elapsed service
time in the last columns. These can be useful in tracking down possible
abuse:

....
Jun 28 22:36:43 debuginfod01 debuginfod[381551]: [Mon 28 Jun 2021 10:36:43 PM GMT] (381551/2413727): 10.3.163.75:43776 UA:elfutils/0.185,Linux/x86_64,fedora/35 XFF:*elided* GET /buildid/90910c1963bbcf700c0c0c06ee3bf4c5cc831d3a/debuginfo 200 335440 0+0ms
....

The lines related to prometheus /metrics are usually no big deal.

The log also includes info about errors and indexing progress. Of
interest may be lines like:

....
Jun 28 22:36:43 debuginfod01 debuginfod[381551]: [Mon 28 Jun 2021 10:36:43 PM GMT] (381551/2413727): serving fdcache archive /mnt/fedora_koji_prod/koji/packages/valgrind/3.17.0/3.fc35/x86_64/valgrind-3.17.0-3.fc35.x86_64.rpm file /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so
....

which identify the file names derived from requests (which RPMs the
buildids map to). These can provide some indirect distro telemetry: what
packages and binaries are being debugged, and for which architectures?

52
modules/sysadmin_guide/pages/denyhosts.adoc
Normal file
@@ -0,0 +1,52 @@

= Denyhosts Infrastructure SOP

Denyhosts provides protection against brute force attacks.

== Contents

[arabic]
. Contact Information
. Description
. Troubleshooting and Resolution
.. Connection issues

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main group
Location::
Anywhere
Servers::
All
Purpose::
Denyhosts provides protection against brute force attacks.

== Description

All of our servers now implement denyhosts to protect against brute
force attacks. Very few boxes should be in the 'allowed' list,
especially internally.

== Troubleshooting and Resolution

=== Connection issues

The most common issue will be legitimate logins failing. First, try to
figure out why a host ended up on the deny list (tcptraceroute, failed
login attempts, etc. are all good candidates). Next, follow the
directions below. The example is for a host (10.0.0.1) being banned.
Login to the box from a different host and as root do the following:

....
cd /var/lib/denyhosts
sed -si '/10\.0\.0\.1/d' * /etc/hosts.deny
/etc/init.d/denyhosts restart
....

That should correct the problem.
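
To verify the entry is really gone afterwards, a quick check using the
same example IP:

....
grep -r '10.0.0.1' /var/lib/denyhosts /etc/hosts.deny || echo clean
....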

63
modules/sysadmin_guide/pages/departing-admin.adoc
Normal file
@@ -0,0 +1,63 @@

= Departing admin SOP

From time to time admins depart the project; this SOP checks any access
they may no longer need.

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main
Location::
Everywhere
Servers::
all

== Description

From time to time people with admin access to various parts of the
project may leave the project or no longer wish to contribute. This SOP
attempts to list the process for removing access they no longer need.

[arabic, start=0]
. First, make sure that this SOP is needed. Verify the person has left
the project and what areas they might wish to still contribute to.
. Gather info: fas username, email address, knowledge of passwords.
. Check the following areas with the following commands (see the
example after this list for the packages check):

email address in ansible::
* Check: `git grep email@address`
* Remove: `git commit`
koji admin::
* Check: `koji list-permissions --user=username`
* Remove: `koji revoke-permission permissionname username`
wiki pages::
* Check: look for https://fedoraproject.org/wiki/User:Username
* Remove: delete page, or modify with info they are no longer
contributing.
packages::
* Check: Download
https://admin.fedoraproject.org/pkgdb/lists/bugzilla?tg_format=plain
and grep
* Remove: remove from cc, orphan packages or reassign.
fas account::
* Check: check username in fas
* Remove: set user inactive
+
[NOTE]
.Note
====
If there are scripts or files needed, save the homedir of the user.
====
passwords::
* Check: if the departing admin knew sensitive passwords.
* Remove: Change passwords.
+
[NOTE]
.Note
====
root pw, management interfaces, etc.
====
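
For the packages check above, a one-liner along these lines can help
(the username is a placeholder):

....
curl -s 'https://admin.fedoraproject.org/pkgdb/lists/bugzilla?tg_format=plain' | grep username
....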

358
modules/sysadmin_guide/pages/dns.adoc
Normal file
@@ -0,0 +1,358 @@

= DNS repository for fedoraproject

We've set this up so we can easily (and quickly) edit and deploy dns
changes with a record of who changed what and why. This system also lets
us edit out proxies from rotation for our many and varied websites
quickly and with a minimum of opportunity for error. Finally, it checks
to make sure that all of the zone changes will actually work before they
are allowed.

== DNS Infrastructure SOP

We have 5 DNS servers:

ns02.fedoraproject.org::
hosted at ibiblio (ipv6 enabled)
ns05.fedoraproject.org::
hosted at internetx (ipv6 enabled)
ns13.rdu2.fedoraproject.org::
in rdu2, internal to rdu2.
ns01.iad2.fedoraproject.org::
in iad2, internal to iad2.
ns02.iad2.fedoraproject.org::
in iad2, internal to iad2.

== Contents

[arabic]
. Contact Information
. Troubleshooting, Resolution and Maintenance
.. DNS update
.. Adding a new zone
. GeoDNS
.. Non geodns fedoraproject.org IPs
.. Adding and removing countries
.. IP Country Mapping
. resolv.conf
.. Phoenix
.. Non-Phoenix

== Contact Information

Owner:::
Fedora Infrastructure Team
Contact:::
#fedora-admin, sysadmin-main, sysadmin-dns
Location:::
ServerBeach and ibiblio and internetx and phx2.
Servers:::
ns02, ns05, ns13.rdu2, ns01.iad2, ns02.iad2
Purpose:::
Provides DNS to our users

== Troubleshooting, Resolution and Maintenance

== Check out the DNS repository

You can get the dns repository from `/srv/git/dns` on `batcave01`:

....
$ git clone /srv/git/dns
....

== Adding a new Host

Adding a new host requires adding it to DNS and to ansible; see
new-hosts.rst for the details.

== Editing the domain(s)

We have three domains which need to be able to change on demand for
proxy rotation/removal:

* fedoraproject.org.
* getfedora.org.
* cloud.fedoraproject.org.

The other domains are edited only when we add/subtract a host or move it
to a new ip. Not much else.

If you need to edit a domain that is NOT in the above list:

* change to the 'master' subdir, edit the domain as usual (remember to
update the serial), save it.

If you need to edit one of the domains in the above list (replace
fedoraproject.org with the domain from above):

* if you need to add/change a host in fedoraproject.org that is not '@'
or 'wildcard' then:
** edit fedoraproject.org.template
** make your changes
** do not edit the serial or anything surrounded by \{\{ }} unless you
REALLY know what you are doing.
* if you need to only add/remove a proxy during an outage or due to a
networking issue then run:
** `./zone-template fedoraproject.org.cfg disable ip [ip] [ip]` to
disable the ip of the proxy you want removed.
** `./zone-template fedoraproject.org.cfg enable ip [ip] [ip]` reverses
the disable.
** `./zone-template fedoraproject.org.cfg reset` will reset to all ips
enabled.
* if you want to add an all new proxy as '@' or 'wildcard' for
fedoraproject.org:
** edit fedoraproject.org.cfg
** add the ip to the correct section of the ipv4 or ipv6 in the config.
** save the file
** check the file for validity by running `python fedoraproject.org.cfg`,
looking for errors or tracebacks.

When complete run:

....
git add .
git commit -a -m 'description of your change here'
....

It is important to commit this before running the do-domains script as
it makes it easier to track the changes.

In all cases then run:

* `./do-domains`
* if that completes successfully then run:
+
....
git add .
git commit -a -m 'description of your change here'
git push
....
* nameservers update from dns via cron every 10 minutes.

The above git process can be achieved with the below bash function,
where the commit message is passed as an arg when running:

....
dnscommit()
{
    local args=$1
    cd ~/dns;
    git commit -a -m "${args}"
    git pull --rebase && ./do-domains && git add built && git commit -a -m "Signed DNS" && git push
}
....

If you need an update to be live more quickly, run this on all of the
nameservers (as root):

....
/usr/local/bin/update-dns
....

To run this via ansible from batcave do:

....
$ sudo rbac-playbook update_dns.yml
....

This will pull from the git tree, update all of the zones and reload
the name server.

== DNS update

DNS config files are ansible managed on batcave01.

From your local machine run:

....
git clone ssh://git@pagure.io/fedora-infra/ansible.git
cd ansible/roles/dns/files/
...make changes needed...
git commit -m "What you did"
git push
....

It should update within a half hour. You can test the new configs with
dig:

....
dig @ns01.fedoraproject.org fedoraproject.org
....

== Adding a new zone

First name the zone and generate a new set of keys for it. Run this on
ns01. Note it could take SEVERAL minutes to run:

....
/usr/sbin/dnssec-keygen -a RSASHA1 -b 1024 -n ZONE c.fedoraproject.org
/usr/sbin/dnssec-keygen -a RSASHA1 -b 2048 -n ZONE -f KSK c.fedoraproject.org
....

Then copy the created .key and .private files to the private git repo
(you need to be sysadmin-main to do this). The directory is
`private/private/dnssec`.

* add the zone in zones.conf in `ansible/roles/dns/files/zones.conf`
* save and commit - but do not push
* add the zone file to the master subdir in this repo
* git add and commit the file
* check the zone by running check-domains
* if you intend to have this be a dnssec signed zone then you must:
** create a new key:
+
....
/usr/sbin/dnssec-keygen -a RSASHA1 -b 1024 -n ZONE $domain.org
/usr/sbin/dnssec-keygen -a RSASHA1 -b 2048 -n ZONE -f KSK $domain.org
....
** put the files this generates into /srv/privatekeys/dnssec on
batcave01
** edit the do-domains file in this dir and add your domain to the
signed_domains entry at the top
** edit the zone you just created and add the contents of the .key
files to the bottom of the zone

If this is a subdomain of fedoraproject.org:

* run dnssec-dsfromkey on each of the .key files generated (see the
sketch after this list)
* paste that output into the bottom of fedoraproject.org.template
* commit everything to the dns tree
* push your changes
* push your changes to the ansible repo
* test
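
As a sketch, dnssec-dsfromkey is pointed at each generated .key file;
the key filename below is made up (the algorithm and key id parts will
vary):

....
/usr/sbin/dnssec-dsfromkey Kc.fedoraproject.org.+005+12345.key
....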

If you add a new child zone, such as c.fedoraproject.org or
vpn.fedoraproject.org, you will also need to add the contents of
dsset-childzone.fedoraproject.org (for example) to the main
fedoraproject.org zonefile, so that DNSSEC has a valid trust path to
that zone.

You also must set the NS delegation entries near the top of the
fedoraproject.org zone file; these are necessary to keep dnssec-signzone
from whining with this error msg:

....
dnssec-signzone: fatal: 'xxxxx.example.com': found DS RRset without NS RRset
....

Look for the "vpn IN NS" records at the top of fedoraproject.org and
copy them for the new child zone.

== GeoDNS

As part of our Content Distribution Network we use geodns for certain
zones. At the moment just the `fedoraproject.org` and
`*.fedoraproject.org` zones. We've got proxy servers all over the US
and in Europe. We are now sending users to proxy servers that are near
them. The current list of available 'zone areas' are:

* DEFAULT
* EU
* NA

DEFAULT contains all the zones. So someone who does not seem to be in
or near the EU or NA would get directed to any random set. (South
Africa for example doesn't get directed to any particular server.)

[IMPORTANT]
.Important
====
Don't forget to increase the serial number in the fedoraproject.org zone
file, even if you're making a change to one of the geodns IPs. There is
only one serial number for all setups and that serial number is in the
fedoraproject.org zone.
====

[NOTE]
.Non geodns fedoraproject.org IPs
====
If you're adding a server that is just in one location, and isn't going
to get geodns balanced, just add that host to the fedoraproject.org
zone.
====

=== Adding and removing countries

Our setup actually requires us to specify which countries go to which
servers. To do this, simply edit the named.conf file in ansible. Below
is an example of what counts as "NA" (North America):

....
view "NA" {
    match-clients { US; CA; MX; };
    recursion no;
    zone "fedoraproject.org" {
        type master;
        file "master/NA/fedoraproject.org.signed";
    };
    include "etc/zones.conf";
};
....

=== IP Country Mapping

The IP -> Location mapping is done via a config file that exists on the
dns servers themselves (it's not ansible controlled). The file, located
at `/var/named/chroot/etc/GeoIP.acl`, is generated by the `GeoIP.sh`
script (that script is in ansible).

[WARNING]
.Warning
====
This is known to be a less efficient means of doing geodns than the
patched version from kernel.org. We're using this version at the moment
because it's in Fedora and works. The level of DNS traffic we see is
generally low enough that the inefficiencies aren't really noticed. For
example, average load on the servers before this geodns was .2; now
it's around .4.
====
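
For orientation, the generated GeoIP.acl is a set of bind acl blocks
mapping country codes to address ranges, roughly of this shape (the
addresses here are made up):

....
acl "US" { 192.0.2.0/24; 198.51.100.0/24; };
acl "CA" { 203.0.113.0/24; };
....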

== resolv.conf

In order to make the network more transparent to the admins, we do a lot
of search based relative names. Below is a list of what a resolv.conf
should look like.

[IMPORTANT]
.Important
====
Any machine that is not on our vpn or has not yet joined the vpn should
NOT have the vpn.fedoraproject.org search until after it has been added
to the vpn (if it ever does).
====

Phoenix::
....
search phx2.fedoraproject.org vpn.fedoraproject.org fedoraproject.org
....
Phoenix in the QA network:::
....
search qa.fedoraproject.org vpn.fedoraproject.org phx2.fedoraproject.org fedoraproject.org
....
Non-Phoenix::
....
search vpn.fedoraproject.org fedoraproject.org
....

The idea here is that we can, when need be, set up local domains to
contact instead of having to go over the VPN directly but still have
sane configs. For example if we tell the proxy server to hit "app1" and
that box is in PHX, it will go directly to app1; if it's not, it will go
over the vpn to app1.

62
modules/sysadmin_guide/pages/docs.fedoraproject.org.adoc
Normal file
@@ -0,0 +1,62 @@

= docs SOP

____
Fedora Documentation - Documentation for installing and using Fedora
____

== Contact Information

Owner:::
docs, Fedora Infrastructure Team
Contact:::
#fedora-docs
Servers:::
proxy*
Purpose:::
Provide documentation for users and contributors.

== Description

The Fedora Documentation Project was created to provide documentation
for fedora users and contributors. It's like "The Bible" for using
Fedora and other software used by the Fedora Project. It uses Publican,
a free and open-source publishing tool. Publican generates html pages
from content in DocBook XML format. The source files are in a git repo
and publican builds html files from these source files whenever changes
are made. As these are static pages, they are available on all the
proxy servers which serve our requests for docs.fedoraproject.org.

== Updates process

The fedora docs writers update and build their docs and then push the
completed output into a git repo. This git repo is then pulled by each
of the Fedora proxies and served as static content.

Note that docs is talking about setting up a new process; this SOP needs
updating when that happens.

== Reporting bugs

Bugs can be reported at the Fedora Documentation's Bugzilla. Here's the
link:
https://bugzilla.redhat.com/enter_bug.cgi?product=Fedora%20Documentation

Errors or problems in the wiki can be modified by anyone with a FAS
account.

== Contributing to the Fedora Documentation Project

If you find the existing documentation insufficient or outdated, or any
particular page is not available in your language, feel free to improve
the documentation by contributing to the Fedora Documentation Project.
You can find more details here:
https://fedoraproject.org/wiki/Join_the_Docs_Project

Translation of documentation is taken care of by the Fedora
Localization Project aka L10N. More details can be found at:
https://fedoraproject.org/wiki/L10N

== Publican wiki

More details about Publican can be found at the publican wiki here:
https://sourceware.org/publican/en-US/index.html

157
modules/sysadmin_guide/pages/fas-notes.adoc
Normal file
@@ -0,0 +1,157 @@

= Fedora Account System

Notes about FAS and how to do things in it:

* where are certs for fas accounts for koji, etc? on fas01 in
/var/lib/fedora-ca - makefile targets allow you to do things with them.

Look in index.txt for certs. Ones marked with an 'R' in the left-most
column are 'REVOKED'.

To revoke a cert:

....
cd /var/lib/fedora-ca
....

Find the cert number in index.txt - the number is the 3rd column in the
file - you can match it to the user by searching for their username. You
want the highest number cert for their account.

Once you have the number you would run (as root or fas):

....
make revoke cert=newcerts/$that_number.pem
....

== How to gather information about a user

You'll want to have direct access to query the database for this. The
common way is to have someone in sysadmin-db ssh to the postgres db
hosting FAS (currently db01). Then access it via ident auth on the box:

....
sudo -u postgres psql fas2
....

There are several tables that will have information about a user. Some
of it is redundant, but it's good to check all the sources; there
shouldn't be inconsistencies:

....
select * from people where username = 'USERNAME';
....

Of interest here are:

id::
for later queries
password_changed::
tells when the password was last changed
last_seen::
last login to fas (including through jsonfas from other TG1/2 apps.
Maybe wiki and insight as well. Not fedorahosted trac, shell login,
etc)
status_change::
last time that the user's status was updated via the website. Usually
triggered when the user was marked inactive for a mass password change
and then they reset their password.

Next table is the log table:

....
select * from log where author_id = ID_FROM_PREV_QUERY or description ~ '.*USERNAME.*';
....

The FAS writes certain events to the log table. This will get those
events. We use both the author_id field (who made the change) and the
username in a description regex search because a few changes are made to
users by admins. Fields of interest are pretty self explanatory here:

changetime::
when the log was made
description::
description of the event that's being logged

[NOTE]
.Note
====
FAS does not log every event that happens to a user. Only "important"
ones. FAS also cannot record direct changes to the database here (for
instance, when we mark accounts inactive administratively via the db).
====

Lastly, there are the groups and person_roles tables. When a user joins
a group, the person_roles table is updated to reflect the user's status
in the group, when they applied, and when they were approved:

....
select groups.name, person_roles.* from person_roles, groups where person_id = ID_FROM_INITIAL_QUERY and groups.id = person_roles.group_id;
....

This will give you the following fields to pay attention to:

name::
Name of the group
role_status::
If this is unapproved, it just means the user applied for it. If it is
approved, it means they are actually in the group.
creation::
When the user applied to the group
approval::
When the user was approved to be in the group
role_type::
What role the person has or wants to have in the group
sponsor_id::
If you suspect something is suspicious with one of the roles, you may
want to ask the sponsor if they remember sponsoring this person

== Account Deletion and renaming

[NOTE]
.Note
====
See also accountdeletion.rst for information on how to disable, rename,
and remove accounts.
====

== Pseudo Users

[NOTE]
.Note
====
See also nonhumanaccounts.rst for information on creating pseudo user
accounts for use in pkgdb/bugzilla.
====

== fas staging

We have a staging fas db set up on db-fas01.stg.phx2.fedoraproject.org -
it is accessed by fas01.stg.phx2.fedoraproject.org.

This system is not autopopulated by production fas - it must be done
manually. To do this you must:

* dump the fas2 db on db-fas01.phx2.fedoraproject.org:
+
....
sudo -u postgres pg_dump -C fas2 > fas2.dump
scp fas2.dump db-fas01.stg.phx2.fedoraproject.org:/tmp
....
* then on fas01.stg.phx2.fedoraproject.org:
+
....
/etc/init.d/httpd stop
....
* then on db02.stg.phx2.fedoraproject.org:
+
....
echo "drop database fas2\;" | sudo -u postgres psql ; cat fas2.dump | sudo -u postgres psql
....
* then on fas01.stg.phx2.fedoraproject.org:
+
....
/etc/init.d/httpd start
....

That should do it.

42
modules/sysadmin_guide/pages/fas-openid.adoc
Normal file
@@ -0,0 +1,42 @@

= FAS-OpenID

FAS-OpenID is the OpenID server of Fedora infrastructure.

Live instance is at https://id.fedoraproject.org/

Staging instance is at https://id.dev.fedoraproject.org/

== Contact Information

Owner::
Patrick Uiterwijk (puiterwijk)
Contact::
#fedora-admin, #fedora-apps, #fedora-noc
Location::
openid0\{1,2}.phx2.fedoraproject.org openid01.stg.fedoraproject.org
Purpose::
Authentication & Authorization

== Trusted roots

FAS-OpenID has a set of "trusted roots", which contains websites which
are always trusted, and thus FAS-OpenID will not show the Approve/Reject
form to the user when they login to any such site.

As a policy, we will only add websites to this list which Fedora
Infrastructure controls. If anyone ever asks to add a website to this
list, just answer with this default message:

....
We only add websites we (Fedora Infrastructure) maintain to this list.

This feature was put in because it wouldn't make sense to ask for permission
to send data to the same set of servers that it already came from.

Also, if we were to add external websites, we would need to judge their
privacy policy etc.

Also, people might start complaining that we added site X but not their site,
maybe causing us "political" issues later down the road.

As a result, we do NOT add external websites.
....

181
modules/sysadmin_guide/pages/fedmsg-certs.adoc
Normal file
@@ -0,0 +1,181 @@

= fedmsg (Fedora Messaging) Certs, Keys, and CA - SOP

X509 certs, private RSA keys, Certificate Authority, and Certificate
Revocation List.

== Contact Information

Owner::
Messaging SIG, Fedora Infrastructure Team
Contact::
#fedora-admin, #fedora-apps, #fedora-noc
Servers::
* app0[1-7]
* packages0[1-2]
* fas0[1-3]
* pkgs01
* busgateway01
* value0\{1,3}
* releng0\{1,4}
* relepel03
Purpose::
Certify fedmsg messages come from authentic sources.

== Description

fedmsg sends JSON-encoded messages from many services to a zeromq
messaging bus. We're not concerned with encrypting the messages, only
with signing them so an attacker cannot spoof.

Every instance of each service on each host has its own cert and private
key, signed by the CA. By convention, we name the certs
<service>-<fqdn>.\{crt,key}. For instance, bodhi has the following
certs:

* bodhi-app01.phx2.fedoraproject.org
* bodhi-app02.phx2.fedoraproject.org
* bodhi-app03.phx2.fedoraproject.org
* bodhi-app01.stg.phx2.fedoraproject.org
* bodhi-app02.stg.phx2.fedoraproject.org
* more

Scripts to generate new keys, sign them, and revoke them live in the
ansible repo in `ansible/roles/fedmsg/files/cert-tools/`. The keys and
certs themselves (including ca.crt and the CRL) live in the private repo
in `private/fedmsg-certs/keys/`.

fedmsg is locally configured to find the key it needs by looking in
`/etc/fedmsg.d/ssl.py` which is kept in ansible in
`ansible/roles/fedmsg/templates/fedmsg.d/ssl.py.erb`.

Each service-host has its own key. This means:

* A key is not shared across multiple instances of a service on
different machines. i.e., bodhi on app01 and bodhi on app02 should have
different key/cert pairs.
* A key is not shared across multiple services on a host. i.e.,
mediawiki on app01 and bodhi on app01 should have different key/cert
pairs.

The attempt here is to minimize the number of potential attack vectors.
Each private key should be readable only by the service that needs it.
bodhi runs under mod_wsgi in apache and should run as its own unique
bodhi user (not as apache). The permissions for its private key, when
deployed by ansible, should be read-only for that local bodhi user.

For more information on how fedmsg uses these certs see
http://fedmsg.readthedocs.org/en/latest/crypto.html

== Configuring the Scripts

Usage of the main scripts is described in more detail below. They are
located in `ansible/roles/fedmsg/files/cert-tools`.

Before you use them, you'll need to point them at the right directory to
modify. By default, this is `~/private/fedmsg-certs/keys/`. You can
change that by editing `ansible/roles/fedmsg/files/cert-tools/vars` in
the event that you have the private repo checked out to an alternate
location.

There are other configuration values defined in that script. Most will
not need to be changed.

== Wiping and Rebuilding Everything

There is a script in `ansible/roles/fedmsg/files/cert-tools/` named
`rebuild-all-fedmsg-certs`. You can run it with no arguments to wipe out
the old and generate a new CA root certificate, a signing cert and key,
and all key/cert pairs for all service-hosts.

[NOTE]
.Warning
====
Obviously, this will wipe everything. Do you want that?
====

== Adding a new key for a new service-host

First, check out the ansible private repo, as that's where the keys are
going to be stored. The scripts will assume this is checked out to
~/private.

In `ansible/roles/fedmsg/files/cert-tools` run:

....
$ source ./vars
$ ./build-and-sign-key <service>-<fqdn>
....

For instance, if we bring up a new app host,
app10.phx2.fedoraproject.org, we'll need to generate a new cert/key pair
for each fedmsg-enabled service that will be running on it, so you'd
run:

....
$ source ./vars
$ ./build-and-sign-key shell-app10.phx2.fedoraproject.org
$ ./build-and-sign-key bodhi-app10.phx2.fedoraproject.org
$ ./build-and-sign-key mediawiki-app10.phx2.fedoraproject.org
....

Just creating the keys isn't quite enough; there are four more things
you'll need to do.

First, the private keys are created in your checkout of the private
repo under ~/private/private/fedmsg-certs/keys. There will be four
files for each cert you created: <hexdigits>.pem (ex: 5B.pem) and
<service>-<fqdn>.\{crt,csr,key}. git add, commit, and push all of
those.

Second, you need to edit
`ansible/roles/fedmsg/files/cert-tools/rebuild-all-fedmsg-certs` and add
the argument of the commands you just ran, so that next time certs need
to be blown away and recreated, the new service-hosts will be included.
For the examples above, you would need to add to the list:

....
shell-app10.phx2.fedoraproject.org
bodhi-app10.phx2.fedoraproject.org
mediawiki-app10.phx2.fedoraproject.org
....

Third, you need to ensure that the keys are distributed to the host
with the proper permissions. Only the bodhi user should be able to
access bodhi's private key. This can be accomplished by using the
`fedmsg::certificate` in ansible. It should distribute your new keys to
the correct hosts and correctly permission them.

Lastly, if you haven't already updated the global fedmsg config, you'll
need to. You need to add your new service-node to `fedmsg.d/endpoint.py`
and to `fedmsg.d/ssl.py`. Those can be found in
`ansible/roles/fedmsg/templates/fedmsg.d`. See
http://fedmsg.readthedocs.org/en/latest/config.html for more information
on the layout and meaning of those files.

== Revoking a key

In `ansible/roles/fedmsg/files/cert-tools` run:

....
$ source ./vars
$ ./revoke-full <service>-<fqdn>
....

This will alter `private/fedmsg-certs/keys/crl.pem`, which should be
picked up and served publicly, and then consumed by all fedmsg consumers
globally.

`crl.pem` is publicly available at
http://fedoraproject.org/fedmsg/crl.pem

[NOTE]
.Note
====
Even though crl.pem lives in the private repo, we're just keeping it
there for convenience. It really _should_ be served publicly, so don't
panic. :)
====

[NOTE]
.Note
====
At the time of this writing, the CRL is not actually used. I need one
publicly available first so we can test it out.
====

106
modules/sysadmin_guide/pages/fedmsg-gateway.adoc
Normal file
@@ -0,0 +1,106 @@

= fedmsg-gateway SOP

Outgoing raw ZeroMQ message stream.

[NOTE]
.Note
====
See also: fedmsg-websocket
====

== Contact Information

Owner:::
Messaging SIG, Fedora Infrastructure Team
Contact:::
#fedora-apps, #fedora-admin, #fedora-noc
Servers:::
busgateway01, proxy0*
Purpose:::
Expose raw ZeroMQ messages outside the FI environment.

== Description

Users outside of Fedora Infrastructure can listen to the production
message bus by connecting to specific addresses. This is required for
local users to run their own hubs and message processors ("Consumers").
It is also required for user-facing tools like fedmsg-notify to work.

The specific public endpoints are:

production::
tcp://hub.fedoraproject.org:9940
staging::
tcp://stg.fedoraproject.org:9940

fedmsg-gateway, the daemon running on busgateway01, is listening to the
FI production fedmsg bus and will relay every message that it receives
out to a special ZMQ pub endpoint bound to port 9940. haproxy mediates
connections to the fedmsg-gateway daemon.

== Connection Flow

Clients connecting through haproxy on proxy0*:9940 are redirected to
busgateway0*:9940. This can be found in the haproxy.cfg entry for
`listen fedmsg-raw-zmq 0.0.0.0:9940`.

This is different than the apache reverse proxy pass setup we have for
the app0* and packages0* machines. _That_ flow looks something like
this:

....
Client -> apache(proxy01) -> haproxy(proxy01) -> apache(app01)
....

The flow for the raw zmq stream provided by fedmsg-gateway looks
something like this:

....
Client -> haproxy(proxy01) -> fedmsg-gateway(busgateway01)
....

haproxy is listening on a public port.

At the time of this writing, haproxy does not actually load balance
zeromq session requests across multiple busgateway0* machines, but there
is nothing stopping us from adding them. New hosts can be added in
ansible and pressed from busgateway01's template. Add them to the
fedmsg-raw-zmq listen in haproxy's config and it should Just Work.

== Increasing the Maximum Number of Concurrent Connections

HTTP requests are typically very short (a few seconds at most). This
means that the number of concurrent tcp connections we require for most
of our services is quite low (1024 is overkill). ZeroMQ tcp connections,
on the other hand, are expected to live for quite a long time.
Consequently we needed to scale up the number of possible concurrent tcp
connections.

All of this is in ansible and should be handled for us automatically if
we bring up new nodes.

* The pam_limits user limit for the fedmsg user was increased from 1024
to 160000 on busgateway01.
* The pam_limits user limit for the haproxy user was increased from 1024
to 160000 on the proxy0* machines.
* The zeromq High Water Mark (HWM) was increased to 160000 on
busgateway01.
* The maximum number of connections allowed was increased in
haproxy.cfg.

== Nagios
|
||||||
|
|
||||||
|
New nagios checks were added for this that check to see if the number of
|
||||||
|
concurrent connections through haproxy is approaching the maximum number
|
||||||
|
allowed.
|
||||||
|
|
||||||
|
You can check these numbers by hand by inspecting the haproxy web
|
||||||
|
interface: https://admin.fedoraproject.org/haproxy/proxy1#fedmsg-raw-zmq
|
||||||
|
|
||||||
|
Look at the "Sessions" section. "Cur" is the current number of sessions
|
||||||
|
versus "Max", the maximum number seen at the same time and "Limit", the
|
||||||
|
maximum number of concurrent connections allowed.
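
If you would rather script that check than eyeball the web page, stock
haproxy stats pages can usually emit CSV when `;csv` is appended to the
URL. A hedged sketch (authentication details omitted; `scur`, `smax`
and `slim` are haproxy's standard CSV column names):

....
import csv
import io

import requests

# Stats page from above; ';csv' asks haproxy for machine-readable output.
url = "https://admin.fedoraproject.org/haproxy/proxy1;csv"
text = requests.get(url, timeout=30).text.lstrip("# ")  # drop '# ' header prefix

for row in csv.DictReader(io.StringIO(text)):
    if row["pxname"] == "fedmsg-raw-zmq":
        print(row["svname"], "cur:", row["scur"],
              "max:", row["smax"], "limit:", row["slim"])
....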

== RHIT

We had RHIT open up port 9940 special to proxy01.phx2 for this.

57 modules/sysadmin_guide/pages/fedmsg-introduction.adoc Normal file
@@ -0,0 +1,57 @@
= fedmsg introduction and basics, SOP

General information about fedmsg

== Contact Information

Owner::
Messaging SIG, Fedora Infrastructure Team
Contact::
#fedora-apps, #fedora-admin, #fedora-noc
Servers::
Almost all of them.
Purpose::
Introduce sysadmins to fedmsg tools and config

== Description

fedmsg is a system that links together most of our webapps and services
into a message mesh or net (often called a "bus"). It is built on top of
the zeromq messaging library.

fedmsg has its own developer documentation that is a good place to check
if this or other SOPs don't provide enough information -
http://fedmsg.rtfd.org

== Tools

Generally, fedmsg-tail and fedmsg-logger are the two most commonly used
tools for debugging and testing. To see if bus-connectivity exists
between two machines, log onto each of them and run the following on the
first:

....
$ echo testing from $(hostname) | fedmsg-logger
....

And run the following on the second:

....
$ fedmsg-tail --really-pretty
....
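
The same check can also be done programmatically. fedmsg exposes a
`tail_messages()` generator; a minimal sketch, assuming the local
`/etc/fedmsg.d/` config is valid on the machine you run it from:

....
import fedmsg
import fedmsg.config

# Load the same /etc/fedmsg.d/ config that the CLI tools use.
config = fedmsg.config.load_config()

# Yields (name, endpoint, topic, msg) tuples as messages arrive.
for name, endpoint, topic, msg in fedmsg.tail_messages(**config):
    print(topic, msg)
....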

== Configuration

fedmsg configuration lives in `/etc/fedmsg.d/`

`/etc/fedmsg.d/endpoints.py` keeps the list of every possible fedmsg
endpoint. It acts as a global index that defines the bus.
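
For orientation, here is an abridged sketch of the shape of that file;
the service names and port numbers below are hypothetical, the real
file enumerates every producer in the deployment:

....
# /etc/fedmsg.d/endpoints.py (abridged sketch; values are illustrative)
config = dict(
    endpoints={
        # One zmq pub endpoint per process that may publish for a service.
        "bodhi.app01": [
            "tcp://app01.phx2.fedoraproject.org:3000",
            "tcp://app01.phx2.fedoraproject.org:3001",
        ],
        "fedmsg-relay.busgateway01": [
            "tcp://busgateway01.phx2.fedoraproject.org:4001",
        ],
    },
)
....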

See http://fedmsg.readthedocs.org/en/latest/config/ for a full glossary
of configuration values.

== Logs

fedmsg daemons keep their logs in /var/log/fedmsg. fedmsg message hooks
in existing apps (like bodhi) will log any errors to the logs of the app
they've been added to (like /var/log/httpd/error_log).

29 modules/sysadmin_guide/pages/fedmsg-irc.adoc Normal file
@@ -0,0 +1,29 @@
= fedmsg-irc SOP

____
Echo fedmsg bus activity to IRC.
____

== Contact Information

Owner::
Messaging SIG, Fedora Infrastructure Team
Contact::
#fedora-apps, #fedora-fedmsg, #fedora-admin, #fedora-noc
Servers::
value03
Purpose::
Echo fedmsg bus activity to IRC

== Description

fedmsg-irc is a daemon running on value03 and value01.stg. It is
listening to the fedmsg bus and echoing that activity to the
#fedora-fedmsg channel in IRC.

It can be configured to ignore certain messages, join certain rooms, and
take on a different nick by editing the values in `/etc/fedmsg.d/irc.py`
and restarting it with `sudo service fedmsg-irc restart`.
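
As rough orientation, the shape of that file is something like the
sketch below; treat the exact keys as illustrative and confirm them
against the upstream configuration glossary linked below:

....
# /etc/fedmsg.d/irc.py (illustrative sketch; verify keys upstream)
config = dict(
    irc=[
        dict(
            network="irc.freenode.net",
            port=6667,
            nickname="fedmsg-bot",      # the nick to take on
            channel="fedora-fedmsg",    # the room to join
            make_pretty=True,
            # Messages matching these regexes get ignored.
            filters=dict(
                topic=[],
                body=["lub-dub"],
            ),
        ),
    ],
)
....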

See http://fedmsg.readthedocs.org/en/latest/config/#term-irc for more
information on configuration.

73 modules/sysadmin_guide/pages/fedmsg-new-message-type.adoc Normal file
@@ -0,0 +1,73 @@
= Adding a new fedmsg message type

== Instrumenting the program

First, figure out how you're going to publish the message. Is it from a
shell script or from a long-running process?

If it's from a shell script, you just need to add a `fedmsg-logger`
statement to the script. Remember to set the `--modname` and `--topic`
for your new message's fully-qualified topic.

If it's from a python process, you just need to add a
`fedmsg.publish(..)` call. The same concerns about modname and topic
apply here.

If this is a short-lived python process, you'll want to add
`active=True` to the call to `fedmsg.publish(..)`. This will
make the fedmsg lib "actively" reach out to our fedmsg-relay running on
busgateway01.

If it is a long-running python process (like a WSGI thread), then you
don't need to pass any extra arguments. You don't want it to reach out
to the fedmsg-relay if possible. Your process will require that some
"endpoints" are created for it in `/etc/fedmsg.d/`. More on that below.

== Supporting infrastructure

You need to make sure that the machine this is running on has a cert and
key that can be read by the program to sign its message. If you don't
have a cert already, then you need to create it in the private repo. Ask
a sysadmin-main member.

Then you need to declare those certs in the `fedmsg_certs`
data structure stored typically in our ansible `group_vars/` for this
service. Declare the name of the cert, what group and user it
should be owned by, and in the `can_send:` section, declare the list of
topics that this cert should be allowed to publish.

If this is a long-running python process that is _not_ passing
`active=True` to the call to `fedmsg.publish(..)`, then you have to also
declare endpoints for it. You do that by specifying the
`fedmsg_wsgi_procs` and `fedmsg_wsgi_vars` in the `group_vars` for your
service. The iptables rules and fedmsg endpoints should be automatically
created for you on the next playbook run.

== Supporting code

At this point, you can push the change out to production and be
publishing messages "okay". Everything should be fine.

However, your message will show up blank in datagrepper, in IRC, and in
FMN, and everywhere else we try to render it. You _must_ then follow up
and write a new `Processor` for it in the fedmsg_meta
library we maintain:
https://github.com/fedora-infra/fedmsg_meta_fedora_infrastructure

You also _must_ write a test case for it there. The docs listing all
topics we publish at http://fedora-fedmsg.rtfd.org/ is automatically
generated from the test suite. Please don't forget this.

Lastly, you should cut a release of fedmsg_meta and deploy it using the
`playbooks/manual/upgrade/fedmsg.yml` playbook, which should
update all the relevant hosts.

== Corner cases

If the process publishing the new message lives _outside_ our main
network, you have to jump through more hoops. Look at abrt, koschei, and
copr for examples of how to configure this (you need a special firewall
rule, and they need to be configured to talk to our "inbound gateway"
running on the proxies).

58 modules/sysadmin_guide/pages/fedmsg-relay.adoc Normal file
@@ -0,0 +1,58 @@
= fedmsg-relay SOP

Bridge ephemeral scripts into the fedmsg bus.

== Contact Information

Owner::
Messaging SIG, Fedora Infrastructure Team
Contact::
#fedora-apps, #fedora-admin, #fedora-noc
Servers::
app01
Purpose::
Bridge ephemeral bash and python scripts into the fedmsg bus.

== Description

fedmsg-relay is running on app01, which is a bad choice. We should look
to move it to a more isolated place in the future. busgateway01 would be
a better choice.

"Ephemeral" scripts like `pkgdb2branch.py`, the post-receive git hook on
pkgs01, and anywhere fedmsg-logger is used all depend on fedmsg-relay.
Instead of emitting messages "directly" to the rest of the bus, they use
fedmsg-relay as an intermediary.

Check that fedmsg-relay is running by looking for it in the process
list. You can restart it in the standard way with
`sudo service fedmsg-relay restart`. Check for its logs in
`/var/log/fedmsg/fedmsg-relay.log`.

Ephemeral scripts know where the fedmsg-relay is by looking for the
relay_inbound and relay_outbound values in the global fedmsg config.
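
A hedged sketch of what those two values look like in `/etc/fedmsg.d/`;
the port numbers here are illustrative:

....
# Illustrative sketch of the relay entries in the global fedmsg config.
config = dict(
    # Ephemeral scripts actively connect here to hand off their messages.
    relay_inbound=["tcp://busgateway01.phx2.fedoraproject.org:9941"],
    # fedmsg-relay re-publishes them from here for the rest of the bus.
    relay_outbound=["tcp://busgateway01.phx2.fedoraproject.org:4001"],
)
....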

== But What is it Doing? And Why?

The fedmsg bus is designed to be "passive" in its normal operation. A
mod_wsgi process under httpd sets up its fedmsg publisher socket to
passively emit messages on a certain port. When some other service wants
to receive these messages, it is up to that service to know where
mod_wsgi is emitting and to actively connect there. In this way,
emitting is passive and listening is active.

We get a problem when we have a one-off or "ephemeral" script that is
not a long-running process -- a script like pkgdb2branch which is run
when a user runs it and which ends shortly after. Listeners who want
these scripts' messages will find that they are usually not available
when they try to connect.

To solve this problem, we introduced the "fedmsg-relay" daemon, which is
a kind of "passive"-to-"passive" adaptor. It binds to an outbound port
on one end where it will publish messages (like normal) but it also
binds to another port where it listens passively for inbound
messages. Ephemeral scripts then actively connect to the passive inbound
port of the fedmsg-relay to have their payloads echoed on the
bus-proper.

See http://fedmsg.readthedocs.org/en/latest/topology/ for a diagram.

70 modules/sysadmin_guide/pages/fedmsg-websocket.adoc Normal file
@@ -0,0 +1,70 @@
= websocket SOP

websocket communication with Fedora apps.

see-also: `fedmsg-gateway.txt`

== Contact Information

Owner::
Messaging SIG, Fedora Infrastructure Team
Contact::
#fedora-apps, #fedora-admin, #fedora-noc
Servers::
busgateway01, proxy0*, app0*
Purpose::
Expose a websocket server for FI apps to use

== Description

WebSocket is a protocol (an extension of HTTP/1.1) by which client web
browsers can establish full-duplex socket communications with a server
-- the "real-time web".

In our case, webapps served from app0* and packages0* will include
javascript code instructing client browsers to establish a second
connection to our WebSocket server. They point browsers to the following
addresses:

production::
wss://hub.fedoraproject.org:9939
staging::
wss://stg.fedoraproject.org:9939

The websocket server itself is a fedmsg-hub daemon running on
busgateway01. It is configured to enable its websocket server component
in the presence of certain configuration values.

haproxy mediates connections to the fedmsg-hub websocket server daemon.
An stunnel daemon provides SSL support.
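
For a quick connectivity check from python you can use the third-party
websocket-client package. A rough sketch, assuming the server honors
the moksha-style `__topic_subscribe__` handshake; verify that against
the fedmsg/moksha docs before relying on it:

....
import json

import websocket  # third-party 'websocket-client' package

ws = websocket.create_connection("wss://stg.fedoraproject.org:9939")
# Assumed moksha-style subscription handshake; '#' is a topic wildcard.
ws.send(json.dumps({"topic": "__topic_subscribe__",
                    "body": "org.fedoraproject.stg.#"}))
while True:
    print(ws.recv())
....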

== Connection Flow

The connection flow is much the same as in the fedmsg-gateway.txt SOP,
but is somewhat more complicated.

"Normal" HTTP requests to our app servers traverse the following chain:

....
Client -> apache(proxy01) -> haproxy(proxy01) -> apache(app01)
....

The flow for a websocket request looks something like this:

....
Client -> stunnel(proxy01) -> haproxy(proxy01) -> fedmsg-hub(busgateway01)
....

stunnel is listening on a public port, negotiates the SSL connection,
and redirects the connection to haproxy, which in turn hands it off to
the fedmsg-hub websocket server listening on busgateway01.

At the time of this writing, haproxy does not actually load balance
zeromq session requests across multiple busgateway0* machines, but there
is nothing stopping us from adding them. New hosts can be added in
ansible and pressed from busgateway01's template. Add them to the
fedmsg-websockets listen in haproxy's config and it should Just Work.

== RHIT

We had RHIT open up port 9939 special to proxy01.phx2 for this.

34 modules/sysadmin_guide/pages/fedocal.adoc Normal file
@@ -0,0 +1,34 @@
= Fedocal SOP

Fedocal is a web-based group calendar application that is made available
to the various groups within the Fedora project.

== Contents

[arabic]
. Contact Information
. Documentation Links

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Location::
https://apps.fedoraproject.org/calendar
Servers::
Purpose::
To provide links to the documentation for fedocal as it exists
elsewhere on the internet; it was decided that a link document would
be a better use of resources than rewriting the book.

== Documentation Links

For information on the latest and greatest in fedocal please review:
http://fedocal.readthedocs.org/en/latest/

For documentation on the usage of fedocal please consult:
http://fedocal.readthedocs.org/en/latest/usage.html

360 modules/sysadmin_guide/pages/fedora-releases.adoc Normal file
@@ -0,0 +1,360 @@
= Fedora Release Infrastructure SOP

This SOP contains all of the steps required by the Fedora Infrastructure
team in order to get a release out. Much of this work overlaps with the
Release Engineering team (and at present the two share many of the same
members). Some work may get done by releng, some may get done by
Infrastructure; as long as it gets done, it doesn't matter.

== Contact Information

Owner:::
Fedora Infrastructure Team, Fedora Release Engineering Team
Contact:::
#fedora-admin, #fedora-releng, sysadmin-main, sysadmin-releng
Location:::
N/A
Servers:::
All
Purpose:::
Releasing a new version of Fedora

== Preparations

Before a release ships, the following items need to be completed.

[arabic]
. New website from the websites team (typically hosted at
http://getfedora.org/_/)
. Verify mirror space (for all test releases as well)
. Verify with rel-eng that permissions on content are right on the
mirrors. Don't leak.
. Communication with Red Hat IS (give at least 2 months notice, then
reminders as the time comes near) (final release only)
. Infrastructure change freeze
. Modify Template:FedoraVersion to reference the new version. (Final
release only)
. Move old releases to archive (post final release only)
. Switch release from development/N to the normal releases/N/ tree in
mirror manager (post final release only)

== Change Freeze

The rules are simple:

* Hosts with the ansible variable "freezes" set to "True" are frozen.
* You may make changes as normal on hosts that are not frozen. (For
example, staging is never frozen.)
* Changes to frozen hosts require a freeze break request sent to the
fedora infrastructure list, containing a description of the problem or
issue, actions to be taken and (if possible) patches to ansible that
will be applied. These freeze breaks must then get two approvals from
sysadmin-main or sysadmin-releng group members before being applied.
* Changes to recover from outages are acceptable on frozen hosts if
needed.

Change freezes will be announced on the fedora-infrastructure-list and
begin 3 weeks before each release and the final release. The freeze will
end one day after the release. Note, if the release slips during a
change freeze, the freeze just extends until the day after a release
ships.

You can get a list of frozen/non-frozen hosts by:

....
git clone https://pagure.io/fedora-infra/ansible.git
scripts/freezelist -i inventory
....

== Notes about release day

Release day is always an interesting and unique event. After the final
sprint from test to the final release a lot of the developers will be
looking forward to a bit of time away, as well as some sleep. Once
Release Engineering has built the final tree, and synced it to the
mirrors, it is our job to make sure everything else (except the bit
flip) gets done as painlessly and easily as possible.

[NOTE]
.Note
====
All communication is typically done in #fedora-admin. Typically these
channels are laid back and staying on topic isn't strictly enforced. On
release day this is not true. We encourage people to come, stay in the
room and be quiet unless they have a specific task or question related
to release day. It's nothing personal, but release day can get out of
hand quickly.
====

During normal load, our websites function as normal. This is
especially true since we've moved the wiki to mod_fcgi. On release day
our load spikes a great deal. During the Fedora 6 launch many services
were offline for hours. Some (like the docs) were off for days. A large
part of this outage was due to the wiki not being able to handle the
load, part was a lack of planning by the Infrastructure team, and part
is still a mystery. There are questions as to whether or not all of the
traffic was legit or a ddos.

The Fedora 7 release went much better. Some services were offline for
minutes at a time but very little of it was out longer than that. The
wiki crashed, as it always does. We had made sure to make the
fedoraproject.org landing page static though. This helped a great deal,
though we did see spiky load on the proxy boxes.

Recent releases have been quite smooth due to a number of changes: we
have a good deal more bandwidth on master mirrors, more cpus and memory,
and prerelease versions are much easier to come by for those interested
before release day.

== Day Prior to Release Day

=== Step 1 (Torrent)

Setup the torrent. All files can be synced with the torrent box but just
not published to the world. Verify with sha1sum. Follow the instructions
on the torrentrelease.txt sop up to and including step 4.

=== Step 2 (Website)

Verify the website design / content has been finalized with the websites
team. Update the Fedora version number wiki template if this is a final
release. It will need to be changed in
https://fedoraproject.org/wiki/Template:CurrentFedoraVersion

Additionally, there are redirects in the ansible
playbooks/include/proxies-redirects.yml file for Cloud Images. These
should be pushed as soon as the content is available. See
https://pagure.io/fedora-infrastructure/issue/3866 for an example.

=== Step 3 (Mirrors)

Verify enough mirrors are setup and have Fedora ready for release. If
for some reason something is broken it needs to be fixed. Many of the
mirrors are running a check-in script. This lets us know who has Fedora
without having to scan everyone. Hide the Alpha, Beta, and Preview
releases from the publiclist page.

You can check this by looking at:

....
wget "http://mirrors.fedoraproject.org/mirrorlist?path=pub/fedora/linux/releases/test/28-Beta&country=global"
....

(replace 28 and Beta with the version and release.)
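
If you'd rather script the check, a hedged python sketch of the same
query; as above, 28 and Beta are placeholders for the version and
release:

....
import requests

params = {
    "path": "pub/fedora/linux/releases/test/28-Beta",
    "country": "global",
}
resp = requests.get("http://mirrors.fedoraproject.org/mirrorlist",
                    params=params, timeout=30)

# One mirror URL per non-comment line.
mirrors = [line for line in resp.text.splitlines()
           if line.strip() and not line.startswith("#")]
print(len(mirrors), "mirrors report carrying the release")
....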

== Release day

=== Step 1 (Prep and wait)

Verify the mirrors are ready and that the torrent has valid copies of
its files (use sha1sum).

Do not move on to step two until the Release Engineering team has given
the ok for the release. It is the releng team's decision as to whether
or not we release and they may pull the plug at any moment.

=== Step 2 (Torrent)

Once given the ok to release, the Infrastructure team should publish the
torrent and encourage people to seed. Complete the steps on
https://fedora-infra-docs.readthedocs.io/en/latest/sysadmin-guide/sops/torrentrelease.html
after step 4.

=== Step 3 (Bit flip)

The mirrors sit and wait for a single permissions bit to be altered so
that they show up to their services. The bit flip (done by the releng
team) will replicate out to the mirrors. Verify that the mirrors have
received the change by seeing if it is actually available; just use a
spot check. Once that is complete move on.

=== Step 4 (Website)

Once all of the distribution pieces are verified (mirrors and torrent),
all that is left is to publish the website. At present this is done by
making sure the master branch of fedora-web is pulled by the
syncStatic.sh script in ansible. It will sync in an hour normally but on
release day people don't like to wait that long, so do the following on
sundries01:

....
sudo -u apache /usr/local/bin/lock-wrapper syncStatic 'sh -x /usr/local/bin/syncStatic'
....

Once that completes, on batcave01:

....
sudo -i ansible proxy\* "/usr/bin/rsync --delete -a --no-owner --no-group bapp02::getfedora.org/ /srv/web/getfedora.org/"
....

Verify http://getfedora.org/ is working.

=== Step 5 (Docs)

Just as with the website, the docs site needs to be published. Follow
these steps:

....
/root/bin/docs-sync
....

=== Step 6 (Monitor)

Once the website is live, keep an eye on various news sites for the
release announcement. Closely watch the load on all of the boxes, proxy,
application and otherwise. If something is getting overloaded, see
suggestions on this page in the "Juggling Resources" section.

=== Step 7 (Badges) (final release only)

We have some badge rules that are dependent on which release of Fedora
we're on. As you have time, please perform the following on your local
box:

....
$ git clone ssh://git@pagure.io/fedora-badges.git
$ cd badges
....

Edit `rules/tester-it-still-works.yml` and update the release tag to
match the now old but stable release. For instance, if we just released
fc21, then the tag in that badge rule should be fc20.

Edit `rules/tester-you-can-pry-it-from-my-cold-dead-hands.yml` and
update the release tag to match the release that is about to reach EOL.
For instance, if we just released f28, then the tag in that badge rule
should be f26. Commit the changes:

....
$ git commit -a -m 'Updated tester badge rule for f28 release.'
$ git push origin master
....

Then, on batcave, perform the following:

....
$ sudo -i ansible-playbook $(pwd)/playbooks/manual/push-badges.yml
....

=== Step 8 (Done)

Just chill, keep an eye on everything and make changes as needed. If you
can't keep a service up, try to redirect randomly to some of the
mirrors.

== Priorities

Priorities during release day (in order):

[arabic]
. Website: Anything related to a user landing at fedoraproject.org, and
clicking through to a mirror or torrent to download something, must be
kept up. This is distribution, and without it we can potentially lose
many users.
. Linked addresses: We do not have direct control over what Hacker News,
Phoronix or anyone else links to. If they link to something on the wiki
and it is going down, or link to any other site we control, a rewrite
should be put in place to direct them to
http://fedoraproject.org/get-fedora.
. Torrent: The torrent server has never had problems during a release.
Make sure it is up.
. Release Notes: Typically grouped with the docs site, the release notes
are often linked to (this is fine, no need to redirect) but keep an eye
on the logs and ensure that where we've said the release notes are, that
they can be found there. In previous releases we sometimes had to make
this available in more than one spot.
. docs.fedoraproject.org: People will want to see what's new in Fedora
and get further documentation about it. Much of this is in the release
notes.
. wiki: Because it is so resource heavy, and because it is so developer
oriented, we have no choice but to give the wiki a lower priority.
. Everything else.

== Juggling Resources

In our environment we're running different things on many different
servers. Using Xen we can easily give machines more or less ram and
processors. We can take down builders and bring up application servers.
The trick is to be smart and make sure you understand what is causing
the problem. These are some tips to keep in mind:

* IPTables based bandwidth and connection limiting (successful in the
past)
* Altering the weight on the proxy balancers
* Create static pages out of otherwise dynamic content
* Redirect pages to a mirror
* Add a server / remove un-needed servers

== CHECKLISTS:

=== Beta:

* Announce infrastructure freeze 3 weeks before Beta
* Change /topic in #fedora-admin
* mail infrastructure list a reminder.
* File all tickets
* new website
* check mirror permissions, mirrormanager, check mirror sizes, release
day ticket.

After release is a "go":

* Make sure torrents are setup and ready to go.
* fedora-web needs a branch for fN-beta. In it:
* Beta used on get-prerelease
* get-prerelease doesn't direct to release
* verify is updated with Beta info
* releases.txt gets a branched entry for preupgrade
* bfo gets updated to have a Beta entry.

After release:

* Update /topic in #fedora-admin
* post to infrastructure list that freeze is over.

=== Final:

* Announce infrastructure freeze 2 weeks before Final
* Change /topic in #fedora-admin
* mail infrastructure list a reminder.
* File all tickets
* new website, check mirror permissions, mirrormanager, check
mirror sizes, release day ticket.

After release is a "go":

* Make sure torrents are setup and ready to go.
* fedora-web needs a branch for fN. In it:
* get-prerelease does direct to release
* verify is updated with Final info
* bfo gets updated to have a Final entry.
* update wiki version numbers and names.

After release:

* Update /topic in #fedora-admin
* post to infrastructure list that freeze is over.
* Move MirrorManager repository tags from the development/$version/
Directory objects to the releases/$version/ Directory objects. This is
done using the `move-devel-to-release --version=$version` command on
bapp02. This is usually done a week or two after release.

108 modules/sysadmin_guide/pages/fedorapackages.adoc Normal file
@@ -0,0 +1,108 @@
= Fedora Packages SOP

This SOP is for the Fedora Packages web application:
https://apps.fedoraproject.org/packages

== Contents

[arabic]
. Contact Information
. Deploying to the servers
. Maintenance
. Checking for AGPL violations

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, #fedora-apps
Persons::
cverna
Location::
PHX2
Servers::
packages03.phx2.fedoraproject.org, packages04.phx2.fedoraproject.org,
packages03.stg.phx2.fedoraproject.org
Purpose::
Web interface for searching packages information

== Deploying to the servers

=== Deploying

Once the new version is built, it needs to be deployed. To deploy the
new version, you need
https://fedora-infra-docs.readthedocs.io/en/latest/sysadmin-guide/sops/sshaccess.html[ssh
access] to batcave01.phx2.fedoraproject.org and
https://fedora-infra-docs.readthedocs.io/en/latest/sysadmin-guide/sops/ansible.html[permissions
to run the Ansible playbook].

All the following commands should be run from batcave01.

You can check the upstream documentation on how to build a new release.
This process results in a fedora-packages rpm available in the infra-tag
rpm repo.

You should make use of the staging instance in order to test the new
version of the application.

=== Upgrading

To upgrade, run the upgrade playbook:

....
$ sudo rbac-playbook manual/upgrade/packages.yml
....

This will upgrade the fedora-packages package and restart the Apache web
server and fedmsg-hub service.

=== Rebuild the xapian Database

If you need to rebuild the xapian database then you can run the
following playbook:

....
$ sudo rbac-playbook manual/rebuild/fedora-packages.yml
....

== Maintenance

The web application is served by httpd and managed by the httpd
service:

....
$ sudo systemctl restart httpd
....

can be used to restart the service if needed. The application log files
are available under the `/var/log/httpd/` directory.

The xapian database is updated by a fedmsg consumer. You can restart the
fedmsg-hub service if needed by using:

....
$ sudo systemctl restart fedmsg-hub
....

To check the consumer logs you can use:

....
$ sudo journalctl -u fedmsg-hub
....

== Checking for AGPL violations

To remain AGPL compliant, we must ensure that all modifications to the
code are made available in the SRPM that we link to in the footer of the
application. You can easily query our app servers to determine if any
AGPL-violating code modifications have been made to the package:

....
func-command --host="*app*" --host="community*" "rpm -V fedoracommunity"
....

You can safely ignore any changes to non-code files in the output. If
any violations are found, the Infrastructure Team should be notified
immediately.

82 modules/sysadmin_guide/pages/fedorapastebin.adoc Normal file
@@ -0,0 +1,82 @@
= Fedora Pastebin SOP

[arabic]
. Contact Information
. Introduction
. Installation
. Dashboard
. Add a word to censored list

== 1. Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Persons::
athmane herlo
Sponsor::
nirik
Location::
phx2
Servers::
paste01.stg, paste01.dev
Purpose::
To host Fedora Pastebin

== 2. Introduction

Fedora pastebin is powered by sticky-notes, which is included in EPEL.

Fedora theming (skin) is included in an ansible role.

== 3. Installation

Sticky-notes needs a MySQL db and a user with 'select, update, delete,
insert' privileges.

It's recommended to dump and import the db from a working installation
to save time (skipping the installation and tweaking).

By default the installation is locked, i.e. you can't relaunch it.

However, you can unlock the installation by commenting out the line
containing `$gsod->trigger` in `/etc/sticky-notes/install.php`, then
pointing the web browser to '/install'.

The configuration file containing general settings and DB credentials is
located in `/etc/sticky-notes/config.php`.

== 4. Dashboard

Sticky-notes has a dashboard (URL: /admin/) that can be used to:

* Manage pastes:
** deleting pastes
** getting information about the paste author (IP/date/time etc...)
* Manage users (aka admins) which can log into the dashboard
* Manage IP bans (add / delete banned IPs).
* Authentication (not needed)
* Site configuration:
** General configuration (included in config.php).
** Project Honey Pot configuration (not a FOSS service)
** Word censor configuration: a list of words to be censored in
pastes.

== 5. Add a word to censored list

If a word is in the censored list, any paste containing that word will
be rejected. To add one, edit the variable '$sg_censor' in the
sticky-notes configuration file:

....
$sg_censor = "WORD1
WORD2
...
...
WORDn";
....

303 modules/sysadmin_guide/pages/fedorawebsites.adoc Normal file
@@ -0,0 +1,303 @@
= Websites Release SOP

[arabic]
. Preparing the website for a release
** 1.1 Obsolete GPG key of the EOL Fedora release
** 1.2 Update GPG key
*** 1.2.1 Steps
. Update website
** 2.1 For Alpha
** 2.2 For Beta
** 2.3 For GA
. Fire in the hole
. Tips
** 4.1 Merging branches

== 1. Preparing the website for a new release cycle

=== 1.1 Obsolete GPG key

One month after a Fedora release, the release numbered 'FXX-2' will be
EOL (End of Life) (i.e. 1 month after the F21 release, F19 will be EOL).
At this point we should drop the GPG key from the list in verify/ and
move the keys to the obsolete keys page in keys/obsolete.html.

=== 1.2 Update GPG key

After another couple of weeks and as the next release approaches, watch
the fedora-release package for a new key to be added. Use the
update-gpg-keys script in the fedora-web git repository to add it to
static/. Manually add it to /keys and /verify in all websites where we
use these keys:

* arm.fpo
* getfedora.org
* labs.fpo
* spins.fpo

==== 1.2.1 Steps

[loweralpha]
. Get a copy of the new key(s) from the fedora-release repo; you will
find FXX-primary and FXX-secondary keys. Save them in ./tools to make
the update easier.
+
https://pagure.io/fedora-repos
. Start by editing ./tools/update-gpg-keys and adding the key-ids of any
obsolete keys to the obsolete_keys list.
. Then run that script to add the new key(s) to the fedora.gpg block:
+
....
fedora-web git:(master) cd tools/
tools git:(master) ./update-gpg-keys RPM-GPG-KEY-fedora-23-primary
tools git:(master) ./update-gpg-keys RPM-GPG-KEY-fedora-23-secondary
....
+
This will add the key(s) to the keyblock in static/fedora.gpg and create
a text file for the key in static/$KEYID.txt as well. Verify that these
files have been created properly and contain all the keys that they
should.
+
* Handy checks: gpg static/fedora.gpg or gpg static/$KEYID.txt
* Adding the "--with-fingerprint" option will add the fingerprint to the
output
+
The output of fedora.gpg should contain only the actual keys, not the
obsolete keys. The single text files should contain the correct
information for the uploaded key.
. Next, add the new key(s) to the list in data/verify.html and move the
new key information into the keys page in data/content/keys/index.html.
A script to aid in generating the HTML code for new keys is in
./tools/make-gpg-key-html. It will print HTML to stdout for each
RPM-GPG-KEY-* file given as an argument. This is suitable for copy/paste
(or directly importing if your editor supports this). Check the copied
HTML code and select whether the key info is for a primary or secondary
key (the output says 'Primary or Secondary').
+
....
tools git:(master) ./make-gpg-key-html RPM-GPG-KEY-fedora-23-primary
....
+
Build the website with 'make en test' and carefully verify that the data
is correct. Please double check all keys in
http://localhost:5000/en/keys and http://localhost:5000/en/verify.
+
NOTE: the tool will give you an outdated output; adapt it to the new
websites and bootstrap layout!

== 2. Update website

=== 2.1 For Alpha

[loweralpha]
. Create the fXX-alpha branch from master:
+
....
fedora-web git:(master) git push origin master:refs/heads/f22-alpha
....
+
and checkout to the new branch:
+
....
fedora-web git:(master) git checkout -t -b f22-alpha origin/f22-alpha
....
. Update the global variables. Change curr_state to Alpha for all
arches.
. Add the Alpha banner. Upload the FXX-Alpha banner to
static/images/banners/f22alpha.png, which should appear in every
$\{PRODUCT}/download/index.html page. Make sure the banner is shown in
all sidebars, also in labs, spins, and arm.
. Check all Download links and paths in
$\{PRODUCT}/prerelease/index.html. You can find all paths in bapp01
(sudo su - mirrormanager first) or you can look at the download page
http://dl.fedoraproject.org/pub/alt/stage
. Add CHECKSUM files to static/checksums and verify that the paths are
correct. The files should be on sundries01 and you can query them with:
$ find /pub/fedora/linux/releases/test/17-Alpha/ -type f -name
'*CHECKSUM*' -exec cp '\{}' . \; Remember to add the right checksums to
the right websites (same path).
. Add EC2 AMI IDs for Alpha. All IDs now are in the globalvar.py file.
We get all data from there, even the redirect path to track the AMI IDs.
We now also have a script which is useful to get all the AMI IDs
uploaded with fedimg. Execute it to get the latest uploads, but don't
run the script too early, as new builds are added constantly. fedora-web
git:(fXX-alpha) python ~/fedora-web/tools/get_ami.py
. Add CHECKSUM files also to http://spins.fedoraproject.org in
static/checksums. Verify the paths are correct in
data/content/verify.html. (See point e) to query them on sundries01.)
Same for labs.fpo and arm.fpo.
. Verify all paths and links on http://spins.fpo, labs.fpo and arm.fpo.
. Update Alpha image sizes and pre_cloud_composedate in
./build.d/globalvar.py. Verify they are right in the Cloud images and
Docker image.
. Update the new POT files and push them to Zanata (ask a maintainer to
do so) every time you change text strings.
. Add this build to stg.fedoraproject.org (ansible syncStatic.sh.stg) to
test the pages online.

Release Date:

* Merge the fXX-alpha branch to master and correct conflicts manually
* Remove the redirect of prerelease pages in ansible: edit
ansible/playbooks/include/proxies-redirects.yml and ask a sysadmin-main
to run the playbook
* When ready and about 90 minutes before Release Time push to master
* Tag the commit as a new release and push it too: $ git tag -a
FXX-Alpha -m 'Releasing Fedora XX Alpha' $ git push --tags
* If needed follow "Fire in the hole" below.

=== 2.2 For Beta

[loweralpha]
. Create the fXX-beta branch from master:
+
....
fedora-web git:(master) git push origin master:refs/heads/f22-beta
....
+
and checkout to the new branch:
+
....
fedora-web git:(master) git checkout -t -b f22-beta origin/f22-beta
....
. Update the global variables. Change curr_state to Beta for all arches.
. Add the Beta banner. Upload the FXX-Beta banner to
static/images/banners/f22beta.png, which should appear in every
$\{PRODUCT}/download/index.html page. Make sure the banner is shown in
all sidebars, also in labs, spins, and arm.
. Check all Download links and paths in
$\{PRODUCT}/prerelease/index.html. You can find all paths in bapp01
(sudo su - mirrormanager first) or you can look at the download page
http://dl.fedoraproject.org/pub/alt/stage
. Add CHECKSUM files to static/checksums and verify that the paths are
correct. The files should be on sundries and you can query them with: $
find /pub/fedora/linux/releases/test/17-Beta/ -type f -name '*CHECKSUM*'
-exec cp '\{}' . \; Remember to add the right checksums to the right
websites (same path).
. Add EC2 AMI IDs for Beta. All IDs now are in the globalvar.py file. We
get all data from there, even the redirect path to track the AMI IDs. We
now also have a script which is useful to get all the AMI IDs uploaded
with fedimg. Execute it to get the latest uploads, but don't run the
script too early, as new builds are added constantly. fedora-web
git:(fXX-beta) python ~/fedora-web/tools/get_ami.py
. Add CHECKSUM files also to http://spins.fedoraproject.org in
static/checksums. Verify the paths are correct in
data/content/verify.html. (See point e) to query them on sundries01.)
Same for labs.fpo and arm.fpo.
. Remove static/checksums/Fedora-XX-Alpha* in all websites.
. Verify all paths and links on http://spins.fpo, labs.fpo and arm.fpo.
. Update Beta image sizes and pre_cloud_composedate in
./build.d/globalvar.py. Verify they are right in the Cloud images and
Docker image.
. Update the new POT files and push them to Zanata (ask a maintainer to
do so) every time you change text strings.
. Add this build to stg.fedoraproject.org (ansible syncStatic.sh.stg) to
test the pages online.

Release Date:

* Merge the fXX-beta branch to master and correct conflicts manually
* When ready and about 90 minutes before Release Time push to master
* Tag the commit as a new release and push it too: $ git tag -a FXX-Beta
-m 'Releasing Fedora XX Beta' $ git push --tags
* If needed follow "Fire in the hole" below.

=== 2.3 For GA

[loweralpha]
. Create the fXX branch from master:
+
....
fedora-web git:(master) git push origin master:refs/heads/f22
....
+
and checkout to the new branch:
+
....
fedora-web git:(master) git checkout -t -b f22 origin/f22
....
. Update the global variables. Change curr_state for all arches.
. Check all Download links and paths in $\{PRODUCT}/download/index.html.
You can find all paths in bapp01 (sudo su - mirrormanager first) or you
can look at the download page http://dl.fedoraproject.org/pub/alt/stage
. Add CHECKSUM files to static/checksums and verify that the paths are
correct. The files should be on sundries01 and you can query them with:
$ find /pub/fedora/linux/releases/17/ -type f -name '*CHECKSUM*' -exec
cp '\{}' . \; Remember to add the right checksums to the right websites
(same path).
. At some point freeze translations. Add an empty PO_FREEZE file to
every website's directory you want to freeze.
. Add EC2 AMI IDs for GA. All IDs now are in the globalvar.py file. We
get all data from there, even the redirect path to track the AMI IDs. We
now also have a script which is useful to get all the AMI IDs uploaded
with fedimg. Execute it to get the latest uploads, but don't run the
script too early, as new builds are added constantly. fedora-web
git:(fXX) python ~/fedora-web/tools/get_ami.py
. Add CHECKSUM files also to http://spins.fedoraproject.org in
static/checksums. Verify the paths are correct in
data/content/verify.html. (See point d) to query them on sundries01.)
Same for labs.fpo and arm.fpo.
. Remove static/checksums/Fedora-XX-Beta* in all websites.
. Verify all paths and links on http://spins.fpo, labs.fpo and arm.fpo.
. Update GA image sizes and cloud_composedate in
./build.d/globalvar.py. Verify they are right in the Cloud images and
Docker image.
. Update static/js/checksum.js and check if the paths and checksums
still match.
. Update the new POT files and push them to Zanata (ask a maintainer to
do so) every time you change text strings.
. Add this build to stg.fedoraproject.org (ansible syncStatic.sh.stg) to
test the pages online.

Release Date:

* Merge the fXX branch to master and correct conflicts manually
* Add the redirect of prerelease pages in ansible: edit
ansible/playbooks/include/proxies-redirects.yml and ask a sysadmin-main
to run the playbook
* Unfreeze translations by deleting the PO_FREEZE files
* When ready and about 90 minutes before Release Time push to master
* Update the short links for the Cloud Images for 'Fedora XX', 'Fedora
XX-1' and 'Latest'
* Tag the commit as a new release and push it too: $ git tag -a FXX -m
'Releasing Fedora XX' $ git push --tags
* If needed follow "Fire in the hole" below.

== 3. Fire in the hole

We now use ansible for everything, and normally use a regular build to
make the websites live. If something is not happening as expected, you
should get in contact with a sysadmin-main to run the ansible playbook
again.

All our stuff, such as the SyncStatic.sh and SyncTranslation.sh scripts,
is now also in ansible!

Staging server app02 and production server bapp01 do not exist anymore;
now our staging websites are on sundries01.stg and the production ones
on sundries01. Change your scripts accordingly and as sysadmin-web you
should have access to those servers as before.

== 4. Tips

=== 4.1 Merging branches

Suggested by Ricky. This can be useful if you're _sure_ all new changes
on the devel branch should go into the master branch. Conflicts will be
solved by directly accepting only the changes in the devel branch. If
you're not 100% sure do a normal merge and fix conflicts manually!

....
$ git merge f22-beta
$ git checkout --theirs f22-beta [list of conflicting po files]
$ git commit
....

204 modules/sysadmin_guide/pages/fmn.adoc Normal file
@@ -0,0 +1,204 @@
= FedMsg Notifications (FMN) SOP

Route individualized notifications to fedora contributors over email and
irc.

== Contact Information

=== Owner

* Messaging SIG
* Fedora Infrastructure Team

=== Contact

* #fedora-apps for FMN development
* #fedora-fedmsg for an IRC feed of all fedmsgs
* #fedora-admin for problems with the deployment of FMN
* #fedora-noc for outage/crisis alerts

=== Servers

Production servers:

* notifs-backend01.phx2.fedoraproject.org (RHEL 7)
* notifs-web01.phx2.fedoraproject.org (RHEL 7)
* notifs-web02.phx2.fedoraproject.org (RHEL 7)

Staging servers:

* notifs-backend01.stg.phx2.fedoraproject.org (RHEL 7)
* notifs-web01.stg.phx2.fedoraproject.org (RHEL 7)
* notifs-web02.stg.phx2.fedoraproject.org (RHEL 7)

=== Purpose

Route notifications to users

== Description

fmn is a pair of systems intended to route fedmsg notifications to
Fedora contributors and users.

There is a web interface running on notifs-web01 and notifs-web02 that
allows users to login and configure their preferences to select this or
that type of message.

There is a backend running on notifs-backend01 where most of the work is
done.

The backend process is a 'fedmsg-hub' daemon, controlled by systemd.

== Hosts

=== notifs-backend

This host runs:

* `fedmsg-hub.service`
* One or more `fmn-worker@.service`. Currently notifs-backend01 runs
`fmn-worker@\{1-4}.service`
* `fmn-backend@1.service`
* `fmn-digests@1.service`
* `rabbitmq-server.service`, an AMQP broker used to communicate between
the services.
* `redis.service`, used for caching.

This host relies on a PostgreSQL database running on
db01.phx2.fedoraproject.org.

=== notifs-web

This host runs:

* A Python WSGI application via Apache httpd that serves the
https://apps.fedoraproject.org/notifications[FMN web user interface].

This host relies on a PostgreSQL database running on
db01.phx2.fedoraproject.org.

== Deployment

Once upstream releases a new version of
https://github.com/fedora-infra/fmn[fmn],
https://github.com/fedora-infra/fmn.web[fmn-web], or
https://github.com/fedora-infra/fmn.sse[fmn-sse] by creating a Git tag,
a new version can be built and deployed into Fedora infrastructure.

=== Building

FMN is packaged in Fedora and EPEL as
https://admin.fedoraproject.org/pkgdb/package/rpms/python-fmn/[python-fmn]
(the backend),
https://admin.fedoraproject.org/pkgdb/package/rpms/python-fmn-web/[python-fmn-web]
(the frontend), and the optional
https://admin.fedoraproject.org/pkgdb/package/rpms/python-fmn-sse/[python-fmn-sse].

Since all the hosts run RHEL 7, you need to build all these packages for
EPEL 7.

=== Configuration

If there are any configuration updates required by the new version of
FMN, update the `notifs` Ansible roles on
batcave01.phx2.fedoraproject.org. Remember to use:

....
{% if env == 'staging' %}
<new config here>
{% else %}
<retain old config>
{% endif %}
....

when deploying the update to staging. You can apply configuration
updates to staging by running:

....
$ sudo rbac-playbook -l staging groups/notifs-backend.yml
$ sudo rbac-playbook -l staging groups/notifs-web.yml
....

Simply drop the `-l staging` to update the production configuration.

=== Upgrading

To upgrade the
https://admin.fedoraproject.org/pkgdb/package/rpms/python-fmn/[python-fmn],
https://admin.fedoraproject.org/pkgdb/package/rpms/python-fmn-web/[python-fmn-web],
and
https://admin.fedoraproject.org/pkgdb/package/rpms/python-fmn-sse/[python-fmn-sse]
packages, apply configuration changes, and restart the services, you
should use the manual upgrade playbook:

....
$ sudo rbac-playbook -l staging manual/upgrade/fmn.yml
....

Again, drop the `-l staging` flag to upgrade production.

Be aware that the FMN services take a significant amount of time to
start up as they pre-heat their caches before starting work.

== Service Administration

Disable an account (on notifs-backend01):

....
$ sudo -u fedmsg /usr/local/bin/fmn-disable-account USERNAME
....

Restart:

....
$ sudo systemctl restart fedmsg-hub
....

Watch logs:

....
$ sudo journalctl -u fedmsg-hub -f
....

Configuration:

....
$ ls /etc/fedmsg.d/
$ sudo fedmsg-config | less
....

Monitor performance:
http://threebean.org/fedmsg-health-day.html#FMN

Upgrade (from batcave):

....
$ sudo -i ansible-playbook /srv/web/infra/ansible/playbooks/manual/upgrade/fmn.yml
....

== Mailing Lists
|
||||||
|
|
||||||
|
We use FMN as a way to forward certain kinds of messages to mailing
|
||||||
|
lists so people can read them the good old fashioned way that they like
|
||||||
|
to. To accomplish this, we create 'bot' FAS accounts with their own FMN
|
||||||
|
profiles and we set their email addresses to the lists in question.
|
||||||
|
|
||||||
|
If you need to change the way some set of messages are forwarded, you
|
||||||
|
can do it from the FMN web interface (if you are an FMN admin as defined
|
||||||
|
in the config file in roles/notifs/frontend/). You can navigate to
|
||||||
|
https://apps.fedoraproject.org/notifications/USERNAME.id.fedoraproject.org
|
||||||
|
to do this.
|
||||||
|
|
||||||
|
If the account exists as a FAS user already (for instance, the
|
||||||
|
`virtmaint` user) but it does not yet exist in FMN, you can add it to
|
||||||
|
the FMN database by logging in to notifs-backend01 and running
|
||||||
|
`fmn-create-user --email DESTINATION@EMAIL.COM --create-defaults FAS_USERNAME`.
|
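
For example, to wire up the hypothetical case of the `virtmaint` bot
account (the list address here is made up for illustration):

....
$ fmn-create-user --email virtmaint-owner@lists.example.org --create-defaults virtmaint
....
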
100 modules/sysadmin_guide/pages/fpdc.adoc Normal file
@@ -0,0 +1,100 @@
= FPDC SOP

Fedora Product Definition Center is a service that aims to replace
https://pdc.fedoraproject.org/[PDC] in Fedora. It is meant to be a
database with REST API access, used to store data needed by other
services.

== Contact Information

Owner::
  Infrastructure Team
Contact::
  #fedora-apps, #fedora-admin
Persons::
  cverna, abompard
Location::
  Phoenix (OpenShift)
Public addresses::
  * fpdc.fedoraproject.org
  * fpdc.stg.fedoraproject.org
Servers::
  * os.fedoraproject.org
  * os.stg.fedoraproject.org
Purpose::
  Centralize metadata and facilitate access.

== Systems

FPDC is built using the Django REST Framework and uses a PostgreSQL
database to store the metadata. The application runs on OpenShift and
uses the source-to-image technology to build the container directly
from the https://github.com/fedora-infra/fpdc[git repository].

In the staging and production environments, the application is
automatically rebuilt for every new commit to the `staging` or
`production` branch. This is achieved by configuring a GitHub webhook
to trigger an OpenShift deployment.

For example, a new deployment to staging would look like this:

....
$ git clone git@github.com:fedora-infra/fpdc.git
$ cd fpdc
$ git checkout staging
$ git rebase master
$ git push origin staging
....

The initial OpenShift project deployment is manual and is done using
the following Ansible playbook:

....
sudo rbac-playbook openshift-apps/fpdc.yml
....

This will create a new fpdc project in OpenShift with all the needed
configuration.

== Logs

Logs can be retrieved using the OpenShift command line:

....
$ oc login os-master01.phx2.fedoraproject.org
You must obtain an API token by visiting https://os.fedoraproject.org/oauth/token/request

$ oc login os-master01.phx2.fedoraproject.org --token=<Your token here>
$ oc -n fpdc get pods
fpdc-28-bfj52   1/1   Running   522   28d
$ oc logs fpdc-28-bfj52
....

== Database migrations

FPDC uses the `recreate` deployment configuration of OpenShift, which
means that OpenShift will bring down the currently running pods and
recreate new ones with the new version of the application. In the
phase between the old pods going down and the new pods coming up, the
database migrations are run in an independent pod.

== Things that could go wrong

Hopefully not much. If something goes wrong, it is currently advised
to kill the pods to trigger a fresh deployment:

....
$ oc login os-master01.phx2.fedoraproject.org
You must obtain an API token by visiting https://os.fedoraproject.org/oauth/token/request

$ oc login os-master01.phx2.fedoraproject.org --token=<Your token here>
$ oc -n fpdc get pods
fpdc-28-bfj52   1/1   Running   522   28d
$ oc delete pod fpdc-28-bfj52
....

It is also possible to roll back to a previous version:

....
$ oc -n fpdc get dc
NAME   REVISION   DESIRED   CURRENT   TRIGGERED BY
fpdc   39         1         1         config,image(fpdc:latest)
$ oc -n fpdc rollback fpdc
....

261 modules/sysadmin_guide/pages/freemedia.adoc Normal file
@@ -0,0 +1,261 @@
= FreeMedia Infrastructure SOP

This page defines the SOP for the Fedora FreeMedia Program. It covers
infrastructural as well as procedural matters.

== Contents

[arabic]
. Location of Resources
. Location on Ansible
. Opening of the form
. Closing of the Form
. Tentative timeline
. How to
.. Open
.. Close
. Handling of tickets
.. Login
.. Rejecting Invalid Tickets
.. Accepting Valid Tickets
. Handling of non-fulfilled requests
. How to handle membership applications

== Location of Resources

* The web form is at
https://fedoraproject.org/freemedia/FreeMedia-form.html
* The Trac instance is at https://fedorahosted.org/freemedia/report

== Location on Ansible

$PWD = `roles/freemedia/files`

Freemedia form::
  FreeMedia-form.html
Backup form::
  FreeMedia-form.html.orig
Closed form::
  FreeMedia-close.html
Backend processing script::
  process.php
Error Document::
  FreeMedia-error.html

== Opening of the form

The form will be opened on the first day of each month.

== Closing of the Form

=== Tentative timeline

The form will be closed after a couple of days. This may vary
according to capacity.

== How to

* The form is available at `roles/freemedia/files/FreeMedia-form.html`
and `roles/freemedia/files/FreeMedia-form.html.orig`
* The closed form is at `roles/freemedia/files/FreeMedia-close.html`

=== Open

* Go to roles/freemedia/tasks
* Open `main.yml`
* Go to line 32.
* To open, change the line to read: `src="FreeMedia-form.html"`
* After opening the form, go to Trac and grant the "Ticket Create and
Ticket View" privilege to "Anonymous".

=== Close

* Go to roles/freemedia/tasks
* Open `main.yml`
* Go to line 32.
* To close, change the line to read: `src="FreeMedia-close.html"`
(see the illustrative sketch after this list)
* After closing the form, go to Trac and remove the "Ticket Create and
Ticket View" privilege from "Anonymous".
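
Since the steps above only say "change line 32", here is a purely
illustrative sketch of what that task looks like; the real task in
`roles/freemedia/tasks/main.yml` (and its destination path) may
differ:

....
# roles/freemedia/tasks/main.yml (around line 32)
- name: install the FreeMedia form
  copy:
    src: FreeMedia-form.html    # switch to FreeMedia-close.html to close
    dest: /srv/web/FreeMedia-form.html
....
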
[NOTE]
.Note
====
* Have to check about monthly cron.
* Have to write about changing init.pp for closing and opening.
====

== Handling of tickets

=== Login

* Contributors are requested to visit
https://fedorahosted.org/freemedia/report
* Please login with your FAS account.

=== Rejecting Invalid Tickets

* If a ticket is invalid, don't accept the request. Go to "resolve
as:", select "invalid" and then press "Submit Changes".
* A ticket is invalid if:
** No valid email-id is provided.
** The region does not match the country.
** No proper address is given.
* If a ticket is a duplicate, accept one copy and close the others as
duplicates: go to "resolve as:", select "duplicate" and then press
"Submit Changes".

=== Accepting Valid Tickets

* If you wish to fulfill a request, first check it against the section
above to ensure it is not liable to be discarded.
* Now "Accept" the ticket from the "Action" field at the bottom, and
press the "Submit Changes" button.
* These accepted tickets will be available from
https://fedorahosted.org/freemedia/report under both "My Tickets" and
"Accepted Tickets for XX" (XX = your region, e.g. APAC).
* When you ship the request, go to the ticket again, go to "resolve
as:" in the "Action" field, select "Fixed" and then press "Submit
Changes".
* If an accepted ticket is not finalised by the end of the month, it
should be closed with "shipping status unknown" in a comment.

=== Handling of non-fulfilled requests

We shall close all the pending requests by the end of the month.

* Please check your region.

=== How to handle membership applications

Steps to become a member of the Free-media group:

[arabic]
. Create an account in the Fedora Account System (FAS).
. Create a user page in the Fedora wiki with contact data, like
User:<nick-name>. There are templates.
. Apply to the Free-Media group in FAS.
. Apply for the Free-Media mailing list subscription.

==== Rules for deciding over membership applications

[cols=",,,,",options="header",]
|===
|Case |Applied to Free-Media Group |User Page Created |Applied to Free-Media List |Action
|1 |Yes |Yes |Yes |Approve group and mailing list applications
|2 |Yes |Yes |No |Put on hold + write asking them to subscribe to the list within a week
|3 |Yes |No |whatever |Put on hold + write asking them to make a User Page within a week
|4 |No |No |Yes |Reject
|===

[NOTE]
.Note
====
[arabic]
. As you need a FAS account for steps 2 and 3, this is not included in
the decision rules above.
. The time to be on hold is one week. If no action is taken after one
week, the application has to be rejected.
. When writing to ask that steps be fulfilled, send CC to the other
Free-media sponsors to let them know the application has been
reviewed.
====

72 modules/sysadmin_guide/pages/freenode-irc-channel.adoc Normal file
@@ -0,0 +1,72 @@
= Freenode IRC Channel Infrastructure SOP

Fedora uses the freenode IRC network for its IRC communications. If
you want to make a new Fedora-related IRC channel, please follow these
guidelines.

== Contents

[arabic]
. Contact Information
. Is a new channel needed?
. Adding new channel
. Recovering/fixing an existing channel

== Contact Information

Owner:::
  Fedora Infrastructure Team
Contact:::
  #fedora-admin
Location:::
  freenode
Servers:::
  none
Purpose:::
  Provides a channel for Fedora contributors to use.

== Is a new channel needed?

First you should see if one of the existing Fedora channels will meet
your needs. Adding a new channel can give you a less noisy place to
focus on something, but at the cost of fewer people being involved. If
your topic/area is development related, perhaps the main #fedora-devel
channel will meet your needs?

== Adding new channel

* Make sure the channel is in the #fedora-* namespace. This allows the
Fedora Group Coordinator to make changes to it if needed.
* Found the channel. You do this by /join #channelname, then /msg
chanserv register #channelname
* Set up GUARD mode. This allows ChanServ to be in the channel for
easier management: `/msg chanserv set #channel GUARD on`
* Add some other operators/managers to the access list. This would
allow them to manage the channel if you are asleep or absent:
+
....
/msg chanserv access #channel add NICK +ARfiorstv
....

You can see what the various flags mean at
http://toxin.jottit.com/freenode_chanserv_commands#cs03

You may want to consider adding some or all of the folks in
#fedora-ops who manage other channels to help you with yours. You can
see this list with `/msg chanserv access #fedora-ops list`

* Set default modes: `/msg chanserv set mlock #channel +Ccnt` (the t
for topic lock is optional; drop it if your channel would like to have
people change the topic often).
* If your channel is of general interest, add it to the main
communicate page of IRC channels, and possibly announce it to your
target audience.
* You may want to request that zodbot join your channel if you need
its functions. You can request that in #fedora-admin.

== Recovering/fixing an existing channel

If there is an existing channel in the #fedora-* namespace that has a
missing founder/operator, please contact the Fedora Group Coordinator,
User:Spot, and request it be reassigned. Follow the above procedure on
the channel once done so it's set up and has enough
operators/managers to not need reassigning again.

149 modules/sysadmin_guide/pages/freshmaker.adoc Normal file
@@ -0,0 +1,149 @@
= Freshmaker SOP

[NOTE]
.Note
====
Freshmaker is very new and changing rapidly. We'll try to keep this up
to date as best we can.
====

Freshmaker is a service that watches message bus activity and tries
to rebuild _compound_ artifacts when their constituent pieces change.

== Contact Information

Owner::
  Factory2 Team, Release Engineering Team, Infrastructure Team
Contact::
  #fedora-modularity, #fedora-admin, #fedora-releng
Persons::
  jkaluza, cqi, qwan, sochotni, threebean
Location::
  Phoenix
Public addresses::
  * freshmaker.fedoraproject.org
Servers::
  * freshmaker-frontend0[1-2].phx2.fedoraproject.org
  * freshmaker-backend01.phx2.fedoraproject.org
Purpose::
  Rebuild compound artifacts. See description for more detail.

== Description

See also
http://fedoraproject.org/wiki/Infrastructure/Factory2/Focus/Freshmaker
for some of the original (old) thinking on Freshmaker.

As per the summary above, Freshmaker is a bus-oriented system that
watches for changes to smaller pieces of content, and triggers
rebuilds of larger pieces of content.

It doesn't do the actual _builds_ itself, but instead requests
rebuilds in our existing build systems.

It handles a number of different content types. In Fedora, we would
like to roll out rebuilds in the following order:

=== Module Builds

When a spec file changes on a particular dist-git branch, trigger
rebuilds of all modules that declare dependencies on that rpm branch.

Consider the _traditional workflow_ today. You make a patch to the
`f27` branch of your package, and you know you need to build that
patch for f27, and then later submit an update for this single build.
Packagers know what to do.

Consider the _modular workflow_. You make a patch to the `2.2` branch
of your package, but now, which modules do you rebuild? Maybe you had
one in mind that you wanted to fix, but are there others that you
forgot about -- that you don't even know about? Kevin could maintain a
module that pulls in my rpm branch and he never told me. Even if he
did, I have to now maintain a list of modules that depend on my rpm,
and request rebuilds of them every time I patch my .spec file. This is
unmanageable.

Freshmaker deals with this by watching the bus for dist-git fedmsg
messages. When it sees a change on a branch, it looks up the list of
modules that depend on that branch, and requests rebuilds of them in
the MBS.

=== Container Slow Flow

When a traditional rpm or modular rpm is _shipped stable_, this
triggers rebuilds of all containers that ever included previous
versions of this rpm.

This applies to both modular and non-modular contexts. Today, you
build an rpm that fixes a CVE, but _some other person_ maintains a
container that includes your RPM. Maybe they never told you about
this. Maybe they didn't notice your CVE fix. Their container will
remain outdated and vulnerable.. forever?

Freshmaker deals with this by watching the bus for dist-git messages
about rpms being shipped to the stable updates repo. When they're
shipped, it looks up all containers that ever included previous
versions of the rpm in question, and it triggers rebuilds of them.

_Waiting_ until the rpm ships to stable is _necessary_ because the
container build process doesn't know about unshipped content. This is
how containers are built manually today, and it is annoying. Which
brings us to the more complicated...

=== Container Fast Flow

When a traditional rpm or modular rpm is _signed_, generate a repo
containing it and rebuild all containers that ever included that rpm
before. This is the better version of the slow flow, but is more
complicated, so we're deferring it until after we've proved the first
two cases out.

Freshmaker will do this by requesting an interim build repo from ODCS
(the On Demand Compose Service). ODCS can be given the appropriate
koji tag and will produce a repo of (pre-signed) rpms. Freshmaker will
request a rebuild of the container and will pass the ODCS repo url in.
This gives us an auditable trail of disposable repos.

== Systems

There is a frontend and a backend.

Everything in the previous section describes the backend behavior.

The frontend exists to provide an HTTP API that can be queried to find
out the status of the backend: What is it doing? What is it planning
to do? What has it done already?

== Observing Freshmaker Behavior

There is currently no command line tool to query Freshmaker, but
Freshmaker provides a REST API which can be used to observe Freshmaker
behavior. This is available at the following URLs:

* https://freshmaker.fedoraproject.org/api/1/events
* https://freshmaker.fedoraproject.org/api/1/builds

The first `/events` URL should return a list of events that Freshmaker
has noticed, recorded, and is handling. Handled events should produce
associated builds.

The second `/builds` URL should return a list of builds that
Freshmaker has submitted and is monitoring. Each build should be
traceable back to the event that triggered it.
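
Since these are plain HTTP endpoints, a quick way to inspect them from
a terminal (just a convenience example):

....
$ curl -s https://freshmaker.fedoraproject.org/api/1/events | python -m json.tool
....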

== Logs

The frontend logs are on freshmaker-frontend0[1-2] in
`/var/log/httpd/error_log`.

The backend logs are on freshmaker-backend01. Look in the journal for
the `fedmsg-hub` service.

== Upgrading

The package in question is `freshmaker`. Please use the
`playbooks/manual/upgrade/freshmaker.yml` playbook.

== Things that could go wrong

TODO. We don't know yet. Probably lots of things.

39 modules/sysadmin_guide/pages/gather-easyfix.adoc Normal file
@@ -0,0 +1,39 @@
= Fedora gather easyfix SOP

Fedora-gather-easyfix, as the name says, gathers tickets marked as
easyfix from multiple sources (currently pagure, github and
fedorahosted), providing a single place for newcomers to find small
tasks to work on.

== Contents

[arabic]
. Contact Information
. Documentation Links

== Contact Information

Owner::
  Fedora Infrastructure Team
Contact::
  #fedora-admin
Location::
  http://fedoraproject.org/easyfix/
Servers::
  sundries01, sundries02, sundries01.stg
Purpose::
  Gather easyfix tickets from multiple sources.

Upstream sources are hosted on github at:
https://github.com/fedora-infra/fedora-gather-easyfix/

The files are then mirrored to our ansible repo, under the
`easyfix/gather` role.

The project is a simple script, `gather_easyfix.py`, gathering
information from the projects listed on the
https://fedoraproject.org/wiki/Easyfix[Fedora wiki] and outputting a
single html file. This html file is then improved via the css and
javascript files present in the sources.

The generated html file, together with the css and js files, is then
synced to the proxies for public consumption :)

121 modules/sysadmin_guide/pages/gdpr_delete.adoc Normal file
@@ -0,0 +1,121 @@
= GDPR Delete SOP

This SOP covers how Fedora Infrastructure handles General Data
Protection Regulation (GDPR) Delete Requests. It contains information
about how system administrators will use tooling to respond to Delete
requests, as well as how application developers can integrate their
applications with that tooling.

== Contact Information

Owner::
  Fedora Infrastructure Team
Contact::
  #fedora-admin
Persons::
  nirik
Location::
  Phoenix
Servers::
  batcave01.phx2.fedoraproject.org, plus various application servers,
  which will run scripts to delete data.
Purpose::
  Respond to Delete requests.

== Responding to a Deletion Request

This section covers how a system administrator will use our
`gdpr-delete.yml` playbook to respond to a Delete request.

When processing a Delete request, perform the following steps:

[arabic, start=0]
. Verify that the requester is who they say they are. If the request
came in email, ask them to file an issue at
https://pagure.io/fedora-pdr/new_issue. Use the following in the email
reply to them:
+
`In order to verify your identity, please file a new issue at https://pagure.io/fedora-pdr/new_issue using the appropriate issue type. Please note this form requires you to sign in to your account to verify your identity.`
+
If the request has come via Red Hat internal channels as an explicit
request to delete, mark the ticket with the tag `rh`. This tag will
help delineate requests for any future reporting needs.
+
If they do not have a FAS account, indicate to them that there is no
data to be deleted. Use this response:
+
`Your request for deletion has been reviewed. Since there is no related account in the Fedora Account System, the Fedora infrastructure does not store data relevant for this deletion request. Note that some public content related to Fedora you may have previously submitted without an account, such as to public mailing lists, is not deleted since accurate maintenance of this data serves Fedora's legitimate business interests, the public interest, and the interest of the open source community.`
. Identify the user's FAS account name. The Delete playbook will use
this FAS account to delete the required data. Update the fedora-pdr
issue saying the request has been received. There is a 'quick
response' in the pagure issue tracker to note this.
. Login to FAS and clear the `Telephone number` entry, set Country to
`Other`, clear `Latitude`, `Longitude`, `IRC Nick` and `GPG Key ID`,
set `Time Zone` to UTC and `Locale` to `en`, and set the user status
to `disabled`. If the user is not in cla_done plus one group, you are
done; update the ticket and close it. This step will be folded into
the following one once we implement it.
. If the user is in cla_done plus one group, they may have additional
data: run the gdpr delete playbook on `batcave01`. You will need to
define one Ansible variable for the playbook: `gdpr_delete_fas_user`
will be the FAS username of the user.
+
....
$ sudo ansible-playbook playbooks/manual/gdpr/delete.yml -e gdpr_delete_fas_user=bowlofeggs
....
+
After the script completes, update the ticket that the request is
completed and close it. There is a 'quick response' in the pagure
issue tracker to note this.

== Integrating an application with our delete playbook

This section covers how an infrastructure application can be
configured to integrate with our `delete.yml` playbook. To integrate,
you must create a script and Ansible variables so that your
application is compatible with this playbook.

=== Script

You need to create a script and have your project's Ansible role
install that script somewhere (most likely on a host from your project
- for example fedocal's is going on `fedocal01`). It's not a bad idea
to put your script into your upstream project. This script should
accept one environment variable as input: `GDPR_DELETE_USERNAME`. This
will be a FAS username.

Some scripts may need secrets embedded in them - if you must do this,
be careful to install the script with `0700` permissions, ensuring
that only `gdpr_delete_script_user` (defined below) can run them.
Bodhi worked around this concern by having the script run as `apache`
so it could read Bodhi's server config file to get the secrets, so it
does not have secrets in its script.
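
A minimal sketch of such a script; the application name and its CLI
below are hypothetical, so substitute whatever your application
actually provides:

....
#!/bin/sh
# Delete all data exampleapp stores about the given FAS user.
# GDPR_DELETE_USERNAME is supplied by the gdpr-delete.yml playbook.
exampleapp-cli delete-user --username "$GDPR_DELETE_USERNAME"
....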

=== Variables

In addition to writing a script, you need to define some Ansible
variables for the host that will run your script:

[cols=",,",options="header",]
|===
|Variable |Description |Example
|`gdpr_delete_script` |The full path to the script. |`/usr/bin/fedocal-delete`
|`gdpr_delete_script_user` |The user the script should be run as. |`apache`
|===

You also need to add the host that the script should run on to the
`[gdpr_delete]` group in `inventory/inventory`:

....
[gdpr_delete]
fedocal01.phx2.fedoraproject.org
....

153 modules/sysadmin_guide/pages/gdpr_sar.adoc Normal file
@@ -0,0 +1,153 @@
= GDPR SAR SOP

This SOP covers how Fedora Infrastructure handles General Data
Protection Regulation (GDPR) Subject Access Requests (SARs). It
contains information about how system administrators will use tooling
to respond to SARs, as well as how application developers can
integrate their applications with that tooling.

== Contact Information

Owner::
  Fedora Infrastructure Team
Contact::
  #fedora-admin
Persons::
  bowlofeggs
Location::
  Phoenix
Servers::
  batcave01.phx2.fedoraproject.org, plus various application servers,
  which will run scripts to collect SAR data.
Purpose::
  Respond to SARs.

== Responding to a SAR

This section covers how a system administrator will use our `sar.yml`
playbook to respond to a SAR.

When processing a SAR, perform the following steps:

[arabic, start=0]
. Verify that the requester is who they say they are. If the request
came in email and the user has a FAS account, ask them to file an
issue at https://pagure.io/fedora-pdr/new_issue. Use the following in
the email reply to them:
+
`In order to verify your identity, please file a new issue at https://pagure.io/fedora-pdr/new_issue using the appropriate issue type. Please note this form requires you to sign in to your account to verify your identity.`
+
If the request has come via Red Hat internal channels as an explicit
request, mark the ticket with the tag `rh`. This tag will help
delineate requests for any future reporting needs.
. Identify an e-mail address for the requester, and if applicable,
their FAS account name. The SAR playbook will use both of these since
some applications have data associated with FAS accounts and others
have data associated with e-mail addresses. Update the fedora-pdr
issue saying the request has been received. There is a 'quick
response' in the pagure issue tracker to note this.
. Run the SAR playbook on `batcave01`. You will need to define three
Ansible variables for the playbook. `sar_fas_user` will be the FAS
username, if applicable; this may be omitted if the requester does not
have a FAS account. `sar_email` will be the e-mail address associated
with the user. `sar_tar_output_path` will be the path you want the
playbook to write the resulting tarball to, and should have a
`.tar.gz` extension. For example, if `bowlofeggs` submitted a SAR and
his e-mail address is `bowlof@eggs.biz`, you might run the playbook
like this:
+
....
$ sudo ansible-playbook playbooks/manual/gdpr/sar.yml -e sar_fas_user=bowlofeggs \
    -e sar_email=bowlof@eggs.biz -e sar_tar_output_path=/home/bowlofeggs/bowlofeggs.tar.gz
....
. Generate a random sha512 with something like
`openssl rand 512 | sha512sum` and then move the output file to
/srv/web/infra/pdr/the-sha512.tar.gz (see the combined sketch after
this list).
. Update the ticket to fixed/processed on pdr requests with a link to
https://infrastructure.fedoraproject.org/infra/pdr/the-sha512.tar.gz
and tell them it will be available for one week.
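
Steps 3 and 4 combined might look like this in practice (the hash is
truncated here; use the full value that `sha512sum` prints):

....
$ openssl rand 512 | sha512sum
3a7bd3e2360a...  -
$ sudo mv /home/bowlofeggs/bowlofeggs.tar.gz /srv/web/infra/pdr/3a7bd3e2360a....tar.gz
....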

== Integrating an application with our SAR playbook

This section covers how an infrastructure application can be
configured to integrate with our `sar.yml` playbook. To integrate, you
must create a script and Ansible variables so that your application is
compatible with this playbook.

=== Script

You need to create a script and have your project's Ansible role
install that script somewhere (most likely on a host from your project
- for example Bodhi's is going on `bodhi-backend02`). It's not a bad
idea to put your script into your upstream project - there are plans
for upstream Bodhi to ship `bodhi-sar`, for example. This script
should accept two environment variables as input: `SAR_USERNAME` and
`SAR_EMAIL`. Not all applications will use both, so do what makes
sense for your application. The first will be a FAS username and the
second will be an e-mail address. Your script should gather the
required information related to those identifiers and print it in a
machine readable format to stdout. Bodhi, for example, prints
information to stdout in `JSON`.

Some scripts may need secrets embedded in them - if you must do this,
be careful to install the script with `0700` permissions, ensuring
that only `sar_script_user` (defined below) can run them. Bodhi worked
around this concern by having the script run as `apache` so it could
read Bodhi's server config file to get the secrets, so it does not
have secrets in its script.

=== Variables

In addition to writing a script, you need to define some Ansible
variables for the host that will run your script:

[cols=",,",options="header",]
|===
|Variable |Description |Example
|`sar_script` |The full path to the script. |`/usr/bin/bodhi-sar`
|`sar_script_user` |The user the script should be run as. |`apache`
|`sar_output_file` |The name of the file to write into the output tarball. |`bodhi.json`
|===

You also need to add the host that the script should run on to the
`[sar]` group in `inventory/inventory`:

....
[sar]
bodhi-backend02.phx2.fedoraproject.org
....

=== Variables for OpenShift apps

When you need to add an OpenShift app to the SAR playbook, you need to
add the following variables to the existing `sar_openshift`
dictionary:

[cols=",,",options="header",]
|===
|Variable |Description |Example
|`sar_script` |The full path to the script. |`/usr/local/bin/sar.py`
|`sar_output_file` |The name of the file to write into the output tarball. |`anitya.json`
|`openshift_namespace` |The namespace in which the application is running. |`release-monitoring`
|`openshift_pod` |The pod name in which the script will be run. |`release-monitoring-web`
|===

The `sar_openshift` dictionary is located in
`inventory/group_vars/os_masters`:

....
sar_openshift:
  # Name of the app
  release-monitoring:
    sar_script: /usr/local/bin/sar.py
    sar_output_file: anitya.json
    openshift_namespace: release-monitoring
    openshift_pod: release-monitoring-web
....

62 modules/sysadmin_guide/pages/geoip-city-wsgi.adoc Normal file
@@ -0,0 +1,62 @@
= geoip-city-wsgi SOP

A simple web service that returns geoip information as a
JSON-formatted dictionary in utf-8. In particular, it's used by
anaconda[1] to get the most probable territory code, based on the
public IP of the caller.

== Contents

[arabic]
. Contact Information
. Basic Function
. Ansible Roles
. Apps depending on geoip-city-wsgi
. Documentation Links

== Contact Information

Owner::
  Fedora Infrastructure Team
Contact::
  #fedora-apps, #fedora-admin, #fedora-noc
Location::
  https://geoip.fedoraproject.org
Servers::
  sundries*, sundries*-stg
Purpose::
  A simple web service that returns geoip information as a
  JSON-formatted dictionary in utf-8. In particular, it's used by
  anaconda[1] to get the most probable territory code, based on the
  public IP of the caller.

== Basic Function

* Users go to https://geoip.fedoraproject.org/city
* The website is exposed via
`/etc/httpd/conf.d/geoip-city-wsgi-proxy.conf`.
* Returns a string with geoip information, formatted as a
JSON-formatted dict in utf8.
* It also currently accepts one override: ?ip=xxx.xxx.xxx.xxx, e.g.
https://geoip.fedoraproject.org/city?ip=18.0.0.1 which then uses the
passed IP address instead of the determined IP address of the client
(see the example after this list).
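
For a quick check from the command line (just a convenience example;
the response is the JSON dict described above):

....
$ curl -s "https://geoip.fedoraproject.org/city?ip=18.0.0.1"
....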

== Ansible Roles

The geoip-city-wsgi role
https://pagure.io/fedora-infra/ansible/blob/main/f/roles/geoip-city-wsgi
is present in the sundries playbook
https://pagure.io/fedora-infra/ansible/blob/main/f/playbooks/groups/sundries.yml

The proxy tasks are present in
https://pagure.io/fedora-infra/ansible/blob/main/f/playbooks/include/proxies-reverseproxy.yml

== Apps depending on geoip-city-wsgi

unknown

== Documentation Links

app: https://geoip.fedoraproject.org +
source: https://github.com/fedora-infra/geoip-city-wsgi +
bugs: https://github.com/fedora-infra/geoip-city-wsgi/issues +
role: https://pagure.io/fedora-infra/ansible/blob/main/f/tree/roles/geoip-city-wsgi

[1] https://fedoraproject.org/wiki/Anaconda

67 modules/sysadmin_guide/pages/github.adoc Normal file
@@ -0,0 +1,67 @@
= Using github for Infra Projects

We're presently using github to host git repositories and issue
tracking for some infrastructure projects. Anything we need to know
should be recorded here.

== Setting up a new repo

Create projects inside of the fedora-infra group:

https://github.com/fedora-infra

That will allow us to more easily track what projects we have.

[TODO] How do we create a new project and import it?

* After creating a new repo, click on the Settings tab to set up some
fancy things.
+
If using git-flow for your project:
** Set the default branch from 'master' to 'develop'. Having the
default branch be develop is nice: new contributors will automatically
start committing there if they're not paying attention to what branch
they're on. You almost never want to commit directly to the master
branch.
+
If there does not exist a develop branch, you should create one by
branching off of master:
+
....
$ git clone GIT_URL
$ git checkout -b develop
$ git push --all
....
** Set up an IRC hook for notifications. From the "settings" tab,
click on "Webhooks & Services". Under the "Add Service" dropdown, find
"IRC" and click it. You might need to enter your password. In the
form, you probably want the following values:
*** Server, irc.freenode.net
*** Port, 6697
*** Room, #fedora-apps
*** Nick, <nothing>
*** Branch Regexes, <nothing>
*** Password, <nothing>
*** Ssl, <on>
*** Message Without Join, <on>
*** No Colors, <off>
*** Long Url, <off>
*** Notice, <on>
*** Active, <on>

== Add an EasyFix label

The EasyFix label is used to mark bugs that are potentially fixable by
new contributors getting used to our source code or relatively new to
python programming. GitHub doesn't provide this label automatically,
so we have to add it. You can add the label from the issues page of
the repository or use this curl command to add it:

....
curl -k -u '$GITHUB_USERNAME:$GITHUB_PASSWORD' https://api.github.com/repos/fedora-infra/python-fedora/labels -H "Content-Type: application/json" -d '{"name":"EasyFix","color":"3b6eb4"}'
....

Please try to use the same color for consistency between Fedora
Infrastructure Projects. You can then add the github repo to the list
that easyfix.fedoraproject.org scans for easyfix tickets here:

https://fedoraproject.org/wiki/Easyfix

50 modules/sysadmin_guide/pages/github2fedmsg.adoc Normal file
@@ -0,0 +1,50 @@
= github2fedmsg SOP

Bridge github events onto our fedmsg bus.

App: https://apps.fedoraproject.org/github2fedmsg/ +
Source: https://github.com/fedora-infra/github2fedmsg/

== Contact Information

Owner::
  Fedora Infrastructure Team
Contact::
  #fedora-apps, #fedora-admin, #fedora-noc
Servers::
  github2fedmsg01
Purpose::
  Bridge github events onto our fedmsg bus.

== Description

github2fedmsg is a small Python Pyramid app that bridges github events
onto our fedmsg bus by way of github's "webhooks" feature. It is what
allows us to have IRC notifications of github activity via fedmsg. It
has two phases of operation:

* Infrequently, a user will log in to github2fedmsg via Fedora OpenID.
They then push a button to also log in to github.com. They are then
logged in to github2fedmsg with _both_ their FAS account and their
github account.
+
They are then presented with a list of their github repositories. They
can toggle each one: "on" or "off". When they turn a repo on, our
webapp makes a request to github.com to install a "webhook" for that
repo with a callback URL to our app.
* When events happen to that repo on github.com, github looks up our
callback URL and makes an http POST request to us, informing us of the
event. Our github2fedmsg app receives that, validates it, and then
republishes the content to our fedmsg bus.

== What could go wrong?

* Restarting the app or rebooting the host shouldn't cause a problem.
It should come right back up.
* Our database could die. We have a db with a list of all the repos we
have turned on and off. We would want to restore that from backup.
* If github gets compromised, they might have to revoke all of their
application credentials. In that case, our app would fail to work.
There are _lots_ of private secrets set in our private repo that allow
our app to talk to github.com. There are inline comments there with
instructions about how to generate new keys and secrets.

26 modules/sysadmin_guide/pages/gitweb.adoc Normal file
@@ -0,0 +1,26 @@
= Gitweb Infrastructure SOP

Gitweb-caching is the web interface we use to expose git to the web
at http://git.fedorahosted.org/git/

== Contact Information

Owner::
  Fedora Infrastructure Team
Contact::
  #fedora-admin, sysadmin-hosted
Location::
  Serverbeach
Servers::
  hosted[1-2]
Purpose::
  Http access to git sources.

== Basic Function

* Users go to http://git.fedorahosted.org/git/
* Pages are generated from cache stored in `/var/cache/gitweb-caching/`.
* The website is exposed via
`/etc/httpd/conf.d/git.fedoraproject.org.conf`.
* The main config file is `/var/www/gitweb-caching/gitweb_config.pl`.
This pulls git repos from /git/.

112 modules/sysadmin_guide/pages/greenwave.adoc Normal file
@@ -0,0 +1,112 @@
= Greenwave SOP

== Contact Information

Owner::
  Factory2 Team, Fedora QA Team, Infrastructure Team
Contact::
  #fedora-qa, #fedora-admin
Persons::
  gnaponie (giulia), mprahl, lucarval, ralph (threebean)
Location::
  Phoenix
Public addresses::
  * https://greenwave-web-greenwave.app.os.fedoraproject.org/api/v1.0/version
  * https://greenwave-web-greenwave.app.os.fedoraproject.org/api/v1.0/policies
  * https://greenwave-web-greenwave.app.os.fedoraproject.org/api/v1.0/decision
Servers::
  * In OpenShift.
Purpose::
  Provide gating decisions.

== Description

* See
http://fedoraproject.org/wiki/Infrastructure/Factory2/Focus/Greenwave[the
focus document] for background.
* See https://pagure.io/docs/greenwave/[the upstream docs] for more
detailed info.

Greenwave's job is:

* answering yes/no questions (or making decisions)
* about artifacts (RPM packages, source tarballs, …)
* at certain gating points in our pipeline
* based on test results
* according to some policy

In particular, we'll be using Greenwave to provide yes/no gating
decisions _to Bodhi_ about rpms in each update. Greenwave will do this
by consulting resultsdb and waiverdb for individual test results and
then combining those results into an aggregate decision.

The _policies_ for how those results should be combined or ignored are
defined in ansible in
`roles/openshift-apps/greenwave/templates/configmap.yml`. We expect to
grow these over time to cover new use cases (rawhide compose gating,
etc.)

== Observing Greenwave Behavior

Login to `os-master01.phx2.fedoraproject.org` as `root` (or
authenticate remotely with openshift using
`oc login https://os.fedoraproject.org`), and run:

....
$ oc project greenwave
$ oc status -v
$ oc logs -f dc/greenwave-web
....

== Database

Greenwave currently has no database (and we'd like to keep it that
way). It relies on `resultsdb` and `waiverdb` for information.

== Upgrading

You can roll out configuration changes by changing the files in
`roles/openshift-apps/greenwave/` and running the
`playbooks/openshift-apps/greenwave.yml` playbook.

To understand how the software is deployed, take a look at these two
files:

* `roles/openshift-apps/greenwave/templates/imagestream.yml`
* `roles/openshift-apps/greenwave/templates/buildconfig.yml`

See that we build a fedora-infra specific image on top of an app image
published by upstream. The `latest` tag is automatically deployed to
staging. This should represent the latest commit to the `master`
branch of the upstream git repo that passed its unit and functional
tests.

The `prod-fedora` tag is manually controlled. To upgrade prod to match
what is in stage, move the `prod-fedora` tag to point to the same
image as the `latest` tag. Our buildconfig is configured to poll that
tag, so a new os.fp.o build and deployment should be automatically
created.
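
Moving the tag can be done with `oc tag` (a sketch, assuming the
imagestream is named `greenwave` in the `greenwave` namespace):

....
$ oc -n greenwave tag greenwave:latest greenwave:prod-fedora
....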
You can watch the build and deployment with `oc` commands.
|
||||||
|
|
||||||
|
You can poll this URL to see what version is live at the moment:
|
||||||
|
https://greenwave-web-greenwave.app.os.fedoraproject.org/api/v1.0/version
|
||||||
|
|
||||||
|
== Troubleshooting

In case of problems with greenwave messaging, check the logs of the container dc/greenwave-fedmsg-consumers to see if there is something wrong:

....
$ oc logs -f dc/greenwave-fedmsg-consumers
....

It is also possible to check whether greenwave is actually publishing messages by looking at https://apps.fedoraproject.org/datagrepper/raw?category=greenwave&delta=127800&rows_per_page=1[this link] and checking the time of the last message.

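The same check can be scripted; a sketch, assuming the usual datagrepper JSON layout and that `jq` is installed:

....
$ curl -s 'https://apps.fedoraproject.org/datagrepper/raw?category=greenwave&delta=127800&rows_per_page=1' \
    | jq '.raw_messages[0].timestamp'
....
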
In case of problems with the greenwave webapp, check the logs of the container dc/greenwave-web:

....
$ oc logs -f dc/greenwave-web
....

134 modules/sysadmin_guide/pages/guestdisk.adoc Normal file
@@ -0,0 +1,134 @@

= Guest Disk Resize SOP

Resize disks in our kvm guests

== Contents

[arabic]
. Contact Information
. How to do it

____
[arabic]
. KVM/libvirt Guests
____

== Contact Information

Owner:::
Fedora Infrastructure Team
Contact:::
#fedora-admin, sysadmin-main
Location:::
PHX, Tummy, ibiblio, Telia, OSUOSL
Servers:::
All xen servers, kvm/libvirt servers.
Purpose:::
Resize guest disks

== How to do it

=== KVM/libvirt Guests

[arabic]
. SSH to the kvm server and resize the guest's logical volume. If you want to be extra careful, make a snapshot of the LV first:
+
....
lvcreate -n [guest name]-snap -L 10G -s /dev/VolGroup00/[guest name]
....
+
This is optional, but it is always good to be careful.
. Shutdown the guest:
+
....
sudo virsh shutdown [guest name]
....
. Disable the guest's lv:
+
....
lvchange -an /dev/VolGroup00/[guest name]
....
. Resize the lv:
+
....
lvresize -L [NEW TOTAL SIZE]G /dev/VolGroup00/[guest name]

or

lvresize -L +XG /dev/VolGroup00/[guest name]
(to add X GB to the disk)
....
. Enable the lv:
+
....
lvchange -ay /dev/VolGroup00/[guest name]
....
. Bring the guest back up:
+
....
sudo virsh start [guest name]
....
. Login into the guest:
+
....
sudo virsh console [guest name]

You may wish to boot single user mode to avoid services coming up and going down again
....
. On the guest, run:
+
....
fdisk /dev/vda
....
. Delete the LVM partition on the guest you want to add space to and recreate it with the maximum size. Make sure to set its type to LV (8e):
+
....
p to list partitions
d to delete selected partition
n to create new partition (default values should be ok)
t to change partition type (set to 8e)
w to write changes
....
. Run partprobe:
+
....
partprobe
....
. Check the size of the partition:
+
....
fdisk -l /dev/vdaN
....
+
If this still reflects the old size, then reboot the guest and verify that its size changed correctly when it comes up again.
. Login to the guest again, and run:
+
....
pvresize /dev/vdaN
....
. A vgs should now show the new size. Use lvresize to resize the root lv:
+
....
lvresize -L [new root partition size]G /dev/GuestVolGroup00/root

(pvs will tell you how much space is available)
....
. Finally, resize the root filesystem:
+
....
resize2fs /dev/GuestVolGroup00/root
(if the root fs is ext4)

or

xfs_growfs /dev/GuestVolGroup00/root
(if the root fs is xfs)
....
+
Verify that everything worked out, and delete the snapshot you made, if you made one.

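A quick way to verify the resize afterwards (a sketch; run on the guest, assuming the volume names used above):

....
vgs
lvs
df -h /
....
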
80 modules/sysadmin_guide/pages/guestedit.adoc Normal file
@@ -0,0 +1,80 @@

= Guest Editing SOP

Various virsh commands

== Contents

[arabic]
. Contact Information
. How to do it
+
____
[arabic]
.. add/remove cpus
.. resize memory
____

== Contact Information

Owner:::
Fedora Infrastructure Team
Contact:::
#fedora-admin, sysadmin-main
Location:::
PHX, Tummy, ibiblio, Telia, OSUOSL
Servers:::
All xen servers, kvm/libvirt servers.
Purpose:::
Resize guest cpu and memory allocations

== How to do it

=== Add cpu

[arabic]
. SSH to the virthost server
. Calculate the number of CPUs the system needs
. `sudo virsh setvcpus <guest> <num_of_cpus> --config` - i.e.:
+
....
sudo virsh setvcpus bapp01 16 --config
....
. Shutdown the virtual system
. Start the virtual system

[NOTE]
.Note
====
Using `virsh reboot` is insufficient. You have to actually stop the domain and start it with `virsh destroy <guest>` and `virsh start <guest>` for the change to take effect.
====

[arabic, start=6]
. Login and check that the cpu count matches
. *Remember to update the group_vars in ansible* to match the new value you set, if appropriate.

=== Resize memory

[arabic]
. SSH to the virthost server
. Calculate the amount of memory the system needs in kb
. `sudo virsh setmem <guest> <num_in_kilobytes> --config` - i.e.:
+
....
sudo virsh setmem bapp01 16777216 --config
....
. Shutdown the virtual system
. Start the virtual system

[NOTE]
.Note
====
Using `virsh reboot` is insufficient. You have to actually stop the domain and start it with `virsh destroy <guest>` and `virsh start <guest>` for the change to take effect.
====

[arabic, start=6]
. Login and check that the memory matches
. *Remember to update the group_vars in ansible* to match the new value you set, if appropriate.

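A quick way to confirm the new values from the virthost (a sketch; `bapp01` is just the example guest used above):

....
sudo virsh vcpucount bapp01
sudo virsh dominfo bapp01 | grep -i memory
....
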
143 modules/sysadmin_guide/pages/haproxy.adoc Normal file
@@ -0,0 +1,143 @@

= Haproxy Infrastructure SOP

haproxy is an application that does load balancing at the tcp layer or at the http layer. It can do generic tcp balancing but it specializes in http balancing. Our proxy servers are still running apache and that is what our users connect to. But instead of using mod_proxy_balancer and ProxyPass balancer://, we do a ProxyPass to http://localhost:10001/ or http://localhost:10002/. haproxy must be told to listen to an individual port for each farm. All haproxy farms are listed in /etc/haproxy/haproxy.cfg.

== Contents

[arabic]
. Contact Information
. How it works
. Configuration example
. Stats
. Advanced Usage

== Contact Information

Owner:::
Fedora Infrastructure Team
Contact:::
#fedora-admin, sysadmin-main, sysadmin-web group
Location:::
Phoenix, Tummy, Telia
Servers:::
proxy1, proxy2, proxy3, proxy4, proxy5
Purpose:::
Provides load balancing from the proxy layer to our application layer.

== How it works

haproxy is a load balancer. If you're already familiar with load balancers, this section won't be that interesting. In its normal usage, haproxy acts just like a web server: it listens on a port for requests. Unlike most webservers, though, it then sends that request to one of our back end application servers and sends the response back. This is referred to as reverse proxying. We typically configure haproxy to send a check to a specific url and look for the response code. If this url isn't set, it just does basic checks to /. In most of our configurations we're using round robin balancing, i.e. request 1 goes to app1, request 2 goes to app2, request 3 goes to app3, request 4 goes to app1, and the whole process repeats.

[WARNING]
.Warning
====
These checks do add load, as well as additional connections, to the app servers. Be smart about which url you're checking as it gets checked often. Also be sure to verify the application servers can handle your new settings, and monitor them closely for the hour or two after you make changes.
====

== Configuration example

The below example is how our fedoraproject wiki could be configured. Each application should have its own farm. Even though it may have an identical configuration to another farm, this allows easy addition and subtraction of specific nodes when we need them:

....
listen fpo-wiki 0.0.0.0:10001
    balance roundrobin
    server app1 app1.fedora.phx.redhat.com:80 check inter 2s rise 2 fall 5
    server app2 app2.fedora.phx.redhat.com:80 check inter 2s rise 2 fall 5
    server app4 app4.fedora.phx.redhat.com:80 backup check inter 2s rise 2 fall 5
    option httpchk GET /wiki/Infrastructure
....

* The first line "listen ...." says to create a farm called 'fpo-wiki', listening on all IPs on port 10001. fpo-wiki can be arbitrary but make it something obvious. Aside from that the important bit is :10001. Always make sure that when creating a new farm, it's listening on a unique port. In Fedora's case we're starting at 10001 and moving up by one. Just check the config file for the lowest open port above 10001.
* The next line "balance roundrobin" says to use round robin balancing.
* The server lines each add a new node to the balancer farm. In this case the wiki is being served from app1, app2 and app4. If the wiki is available at http://app1.fedora.phx.redhat.com/wiki/ then this config would be used in conjunction with "RewriteRule ^/wiki/(.*) http://localhost:10001/wiki/$1 [P,L]".
* 'server' means we're adding a new node to the farm.
* 'app1' is the worker name; it is analogous to fpo-wiki but should match the short hostname of the node to make it easy to follow.
* 'app1.fedora.phx.redhat.com:80' is the hostname and port to be contacted.
* 'check' means to check via the bottom line "option httpchk GET /wiki/Infrastructure", which will use /wiki/Infrastructure to verify the wiki is working. If that URL fails, that entire node will be taken out of the farm mix.
* 'inter 2s' means to check every 2 seconds. 2s is the same as 2000 in this case.
* 'rise 2' means to not put this node back in the mix until it has had two successful connections in a row. haproxy will continue to check every 2 seconds whether a node is up or down.
* 'fall 5' means to take a node out of the farm after 5 failures.
* 'backup' You'll notice that app4 has a 'backup' option. We don't actually use this for the wiki but do for other farms. It basically means to continue checking and treat this node like any other node, but don't send it any production traffic unless the other two nodes are down.

All of these options can be tweaked so keep that in mind when changing or building a new farm. There are other configuration options in this file that are global. Please see the haproxy documentation for more info:

....
/usr/share/doc/haproxy-1.3.14.6/haproxy-en.txt
....

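Before reloading haproxy after changing a farm, it can be worth validating the config file first; a minimal sketch:

....
haproxy -c -f /etc/haproxy/haproxy.cfg
....
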
== Stats

In order to view the stats for a farm please see the stats page. Each proxy server has its own stats page since each one is running its own haproxy server. To view the stats point your browser to https://admin.fedoraproject.org/haproxy/shorthostname/ so proxy1 is at https://admin.fedoraproject.org/haproxy/proxy1/ The trailing / is important.

* https://admin.fedoraproject.org/haproxy/proxy1/
* https://admin.fedoraproject.org/haproxy/proxy2/
* https://admin.fedoraproject.org/haproxy/proxy3/
* https://admin.fedoraproject.org/haproxy/proxy4/
* https://admin.fedoraproject.org/haproxy/proxy5/

== Advanced Usage

haproxy has some more advanced usage that we've not needed to worry about yet but is worth mentioning. For example, one could send users to just one app server based on session id. If user A happened to hit app1 first and user B happened to hit app4 first, all subsequent requests for user A would go to app1 and user B would go to app4. This is handy for applications that cannot normally be balanced because of shared storage needs or other locking issues. This won't solve all problems though, and it can have negative effects; for example, when app1 goes down user A would either lose their session or be unable to work until app1 comes back up. Please test thoroughly before looking into this option.

191 modules/sysadmin_guide/pages/hosted_git_to_svn.adoc Normal file
@@ -0,0 +1,191 @@

= Fedorahosted migrations

Migrating hosted repositories to that of another type.

== Contents

[arabic]
. Contact Information
. Description
. SVN to GIT migration

____
[arabic]
. Questions left to be answered with this SOP
____

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-hosted
Location::
Serverbeach
Servers::
hosted1, hosted2
Purpose::
Migrate hosted SCM repositories to that of another SCM.

== Description

fedorahosted.org can be used to host open source projects. Occasionally those projects want to change the SCM they utilize. This document provides documentation for doing so.

[arabic]
. An scm for maintaining the code. The currently supported scm's include Mercurial, Git, Bazaar, or SVN. Note: there is no cvs.
. A trac instance, which provides a mini-wiki for hosting information and also provides a ticketing system.
. A mailing list

[IMPORTANT]
.Important
====
This page is for administrators only. People wishing to request a hosted project should use the Ticketing System; see the new project request template. (Requires Fedora Account)
====

== SVN to GIT migration

=== FAS User Prep

Currently you must manually generate $PROJECTNAME-users.txt by grabbing a list of people in the FAS group and recording them in the following format:

....
$fasusername = FirstName LastName <$emailaddress>
....

This is error prone, and will stop the git-svn fetch below if an author appears that doesn't exist in the list of users:

....
svn log --quiet | awk '/^r/ {print $3}' | sort -u
....

The above will generate a list of users in the svn repo.

If all users are FAS users you can use the following script to create a users file (written by tmz, Todd Zullinger):

....
#!/bin/bash

if [ -z "$1" ]; then
    echo "usage: $0 <svn repo>" >&2
    exit 1
fi

svnurl=file:///svn/$1

if ! svn info $svnurl &>/dev/null; then
    echo "$1 is not a valid svn repo." >&2
    exit 1
fi

svn log -q $svnurl | awk '/^r[0-9]+/ {print $3}' | sort -u | while read user; do
    name=$( (getent passwd $user 2>/dev/null | awk -F: '{print $5}') || '' )
    [ -z "$name" ] && name=$user
    email="$user@fedoraproject.org"
    echo "$user=$name <$email>"
done
....

=== Doing the conversion

[arabic]
. Log into hosted1
. Make a temporary directory to convert the repos in:
+
....
$ sudo mkdir /tmp/tmp-$PROJECTNAME.git
$ cd /tmp/tmp-$PROJECTNAME.git
....
. Create a git repo ready to receive the migrated SVN data:
+
....
$ sudo git-svn init http://svn.fedorahosted.org/svn/$PROJECTNAME --no-metadata
....
. Tell git to fetch and convert the repository:
+
....
$ git svn fetch
....
+
[NOTE]
====
This creation of a temporary repository is necessary because SVN leaves a number of items floating around that git can ignore, and we want those essentially ignored.
====
. From here, you'll want to follow "Creating a new git repo" as if cloning an existing git repository to Fedorahosted.
. After that process is done, kindly remove the temporary repo that was created:
+
....
$ sudo rm -rf /tmp/tmp-$PROJECTNAME.git
....

=== Doing the conversion (alternate)

Alternately, here's another way to do this (tmz):

Setup a working dir:

....
[tmz@hosted1 tmp (master)]$ mkdir im-chooser-conversion && cd im-chooser-conversion
....

Create an authors file mapping svn usernames to the Name <email> form git uses:

....
[tmz@hosted1 im-chooser-conversion (master)]$ ~tmz/svn-to-git-authors im-chooser > authors
....

Convert svn to git:

....
[tmz@hosted1 im-chooser-conversion (master)]$ git svn clone -s -A authors --no-metadata file:///svn/im-chooser
....

Move svn branches and tags into the proper locations for the new git repo. (git-svn leaves them as 'remote' branches/tags.):

....
[tmz@hosted1 im-chooser-conversion (master)]$ cd im-chooser
[tmz@hosted1 im-chooser (master)]$ mv .git/refs/remotes/tags/* .git/refs/tags/ && rmdir .git/refs/remotes/tags
[tmz@hosted1 im-chooser (master)]$ mv .git/refs/remotes/* .git/refs/heads/
....

Now 'git branch' and 'git tag' should display the branches/tags.

Create a bare repo from the converted git repo. Using `file://$(pwd)` here ensures that git copies all objects to the new bare repo:

....
[tmz@hosted1 im-chooser-conversion (master)]$ git clone --bare --shared file://$(pwd)/im-chooser im-chooser.git
....

Follow the steps in https://fedoraproject.org/wiki/Hosted_repository_setup to finish setting proper modes and permissions for the repo. Don't forget to update the description file.

[NOTE]
.Note
====
This still leaves moving the converted bare repo (im-chooser.git) to /git and fixing up the user/group.
====

== Questions left to be answered with this SOP

* Obviously we need to have the requestor review the migration and confirm it's ok.
* Do we then delete the old SCM contents?
* Do we need to change the FAS-group type to grant them access to pull/push from it?

51 modules/sysadmin_guide/pages/hotfix.adoc Normal file
@@ -0,0 +1,51 @@

= HOTFIXES SOP

From time to time we have to quickly patch a problem or issue in applications in our infrastructure. This process allows us to do that, track what changed, and be ready to remove it when the issue is fixed upstream.

== Ansible based items

For ansible, hotfixes should be placed after the task that installs the package to be changed or modified, either in roles or tasks.

Hotfix tasks should be called "HOTFIX description". They should also link in comments to any upstream bug or ticket, and they should have tags of 'hotfix'.

The process is:

* Create a diff of any files changed in the fix.
* Check in the _original_ files and change to the role/task.
* Now check in your diffs of those same files.
* ansible will replace the files on the affected machines completely with the fixed versions.
* If you need to back it out, you can revert the diff step, wait and then remove the first checkin.

Example:

....
<task that installs the httpd package>

#
# install hash randomization hotfix
# See bug https://bugzilla.redhat.com/show_bug.cgi?id=812398
#
- name: hotfix - copy over new httpd init script
  copy: src="{{ files }}/hotfix/httpd/httpd.init" dest=/etc/init.d/httpd
        owner=root group=root mode=0755
  notify:
  - restart apache
  tags:
  - config
  - hotfix
  - apache
....

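To back a hotfix out, reverting the diff commit and re-running the relevant playbook is usually enough; a minimal sketch (the commit hash and playbook path are placeholders):

....
git revert <sha-of-hotfix-diff-commit>
sudo rbac-playbook groups/proxies.yml
....
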
== Upstream changes

Also, if at all possible a bug should be filed with the upstream application to get the fix into the next version. Hotfixes are something we should strive to carry for only a short time.

147 modules/sysadmin_guide/pages/hotness.adoc Normal file
@@ -0,0 +1,147 @@

= The New Hotness

https://github.com/fedora-infra/the-new-hotness/[the-new-hotness] is a https://fedora-messaging.readthedocs.io/en/stable/[fedora messaging consumer] that subscribes to https://release-monitoring.org/[release-monitoring.org] fedora messaging notifications to determine when a package in Fedora should be updated. For more details on the-new-hotness, consult the http://the-new-hotness.readthedocs.io/[project documentation].

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin #fedora-apps
Persons::
zlopez
Location::
iad2.fedoraproject.org
Servers::
Production
+
* hotness01.iad2.fedoraproject.org
+
Staging
+
* hotness01.stg.iad2.fedoraproject.org
Purpose::
File issues when upstream projects release new versions of a package

== Hosts

The current deployment consists of the the-new-hotness OpenShift namespace.

[[the-new-hotness-1]]
=== the-new-hotness

This OpenShift namespace runs the following pods:

* A fedora messaging consumer

This OpenShift project relies on:

* `anitya-sop` as message publisher
* Fedora messaging RabbitMQ hub for consuming messages
* Koji for scratch builds
* Bugzilla for issue reporting

== Releasing

The release process is described in https://the-new-hotness.readthedocs.io/en/stable/dev-guide.html#release-guide[the-new-hotness documentation].

=== Deploying

The staging deployment of the-new-hotness is deployed in OpenShift on os-master01.stg.iad2.fedoraproject.org.

To deploy the staging instance of the-new-hotness, push changes to the staging branch on https://github.com/fedora-infra/the-new-hotness[the-new-hotness GitHub]. A GitHub webhook will then automatically deploy the new version of the-new-hotness on staging.

The production deployment of the-new-hotness is deployed in OpenShift on os-master01.iad2.fedoraproject.org.

To deploy the production instance of the-new-hotness, push changes to the production branch on https://github.com/fedora-infra/the-new-hotness[the-new-hotness GitHub]. A GitHub webhook will then automatically deploy the new version of the-new-hotness on production.

==== Configuration

To deploy the new configuration, you need https://fedora-infra-docs.readthedocs.io/en/latest/sysadmin-guide/sops/sshaccess.html[ssh access] to batcave01.iad2.fedoraproject.org and https://fedora-infra-docs.readthedocs.io/en/latest/sysadmin-guide/sops/ansible.html[permissions to run the Ansible playbook].

All the following commands should be run from batcave01.

First, ensure there are no configuration changes required for the new update. If there are, update the Ansible role(s) and optionally run the playbook:

....
$ sudo rbac-playbook openshift-apps/the-new-hotness.yml
....

The configuration changes could be limited to staging only using:

....
$ sudo rbac-playbook openshift-apps/the-new-hotness.yml -l staging
....

This is recommended for testing new configuration changes.

==== Upgrading

===== Staging

To deploy a new version of the-new-hotness, push changes to the staging branch on https://github.com/fedora-infra/the-new-hotness[the-new-hotness GitHub]. A GitHub webhook will then automatically deploy the new version of the-new-hotness on staging.

===== Production

To deploy a new version of the-new-hotness, push changes to the production branch on https://github.com/fedora-infra/the-new-hotness[the-new-hotness GitHub]. A GitHub webhook will then automatically deploy the new version of the-new-hotness on production.

Congratulations! The new version should now be deployed.

== Monitoring Activity

It can be nice to check up on the-new-hotness to make sure it's behaving correctly. You can see all the Bugzilla activity using the https://bugzilla.redhat.com/page.cgi?id=user_activity.html[user activity query] (staging uses https://partner-bugzilla.redhat.com/page.cgi?id=user_activity.html[partner-bugzilla.redhat.com]) and querying for the `upstream-release-monitoring@fedoraproject.org` user.

You can also view all the Koji tasks dispatched by the-new-hotness. For example, you can see the https://koji.fedoraproject.org/koji/tasks?state=failed&owner=hotness[failed tasks] it has created.

To monitor the pods of the-new-hotness you can connect to the Fedora infra OpenShift and look at the state of the pods.

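A quick CLI alternative (a sketch; assumes you are already logged in with `oc` and that the namespace is named `the-new-hotness`):

....
$ oc -n the-new-hotness get pods
....
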
For staging look at the `the-new-hotness` namespace in the https://os.stg.fedoraproject.org/console/project/release-monitoring/overview[staging OpenShift instance].

For production look at the `the-new-hotness` namespace in the https://os.fedoraproject.org/console/project/release-monitoring/overview[production OpenShift instance].

144 modules/sysadmin_guide/pages/hubs.adoc Normal file
@@ -0,0 +1,144 @@

= Fedora Hubs SOP

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main, sysadmin-tools, sysadmin-hosted
Location::
?
Servers::
<prod-srv-hostname>, <stg-srv-hostname>, hubs-dev.fedorainfracloud.org
Purpose::
Contributor and team portal.

== Description

Fedora Hubs aggregates user and team activity throughout the Fedora infrastructure (and elsewhere) to show what a user or a team is doing. It helps new people find a place to contribute.

=== Components

Fedora Hubs has the following components:

* a SQL database like PostgreSQL (in the Fedora infra we're using the shared database).
* a Redis server that is used as a message bus (it is not critical if the content is lost). System service: `redis`.
* a MongoDB server used to store the contents of the activity feeds. It's JSON data, limited to 100 entries per user or group. Service: `mongod`.
* a Flask-based WSGI app served by Apache + mod_wsgi, that will also serve the JS front end as static files. System service: `httpd`.
* a Fedmsg listener that receives messages from the fedmsg bus and puts them in Redis. System service: `fedmsg-hub`.
* a set of "triage" workers that pull the raw messages from Redis, process them using SQL queries and put work items in another Redis queue. System service: `fedora-hubs-triage@`.
* a set of "worker" daemons that pull from this other Redis queue, work on the items by making SQL queries and external HTTP requests (to Github for example), and put reload notifications in the SSE Redis queue. They also access the caching system, which can be local files or memcached. System service: `fedora-hubs-worker@`.
* The SSE server (Twisted-based) that pulls from that Redis queue and sends reload notifications to the connected browsers. It handles long-lived HTTP connections but there is little activity: only the notifications and a "keepalive ping" message every 30 seconds to every connected browser. System service: `fedora-hubs-sse`. Apache is configured to proxy the `/sse` path to this server.

== Managing the services

Restarting all the services:

....
systemctl restart fedmsg-hub fedora-hubs-\*
....

By default, 4 `triage` daemons and 4 `worker` daemons are enabled. To add another `triage` daemon and another `worker` daemon, you can run:

....
systemctl enable --now fedora-hubs-triage@5.service
systemctl enable --now fedora-hubs-worker@5.service
....

It is not necessary to have the same number of `triage` and `worker` daemons; in fact it is expected that more `worker` than `triage` daemons will be necessary, as they do more time-consuming work.

== Hubs-specific operations

Other Hubs-specific operations are done using the `fedora-hubs` command:

....
$ fedora-hubs
Usage: fedora-hubs [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  cache  Cache-related operations.
  db     Database-related operations.
  fas    FAS-related operations.
  run    Run daemon processes.
....

=== Manipulating the cache

The `cache` subcommand is used to do cache-related operations:

....
$ fedora-hubs cache
Usage: fedora-hubs cache [OPTIONS] COMMAND [ARGS]...

  Cache-related operations.

Options:
  --help  Show this message and exit.

Commands:
  clean     Clean the specified WIDGETs (id or name).
  coverage  Check the cache coverage.
  list      List widgets for which there is cached data.
....

For example, to check the cache coverage:

....
$ fedora-hubs cache coverage
107 cached values found, 95 are missing.
52.97 percent cache coverage.
....

The cache coverage value is an interesting metric that could be used in a Nagios check. A value below 50% could be taken as a sign of application slowdowns and could thus generate a warning.

=== Interacting with FAS

The `fas` subcommand is used to get information from FAS:

....
$ fedora-hubs fas
Usage: fedora-hubs fas [OPTIONS] COMMAND [ARGS]...

  FAS-related operations.

Options:
  --help  Show this message and exit.

Commands:
  create-team  Create the team hub NAME from FAS.
  sync-teams   Sync all the team hubs NAMEs from FAS.
....

To add a new team hub for a FAS group, run:

....
$ fedora-hubs fas create-team <fas-group-name>
....

60 modules/sysadmin_guide/pages/ibm_rsa_ii.adoc Normal file
@@ -0,0 +1,60 @@

= IBM RSA II Infrastructure SOP

Many of our physical machines use RSA II cards for remote management.

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main
Location::
PHX, ibiblio
Servers::
All physical IBM machines
Purpose::
Provide remote management for our physical IBM machines

== Restarting the RSA II card

Normally, the RSA II can be restarted from the web/ssh interface. If you are locked out of any outside access to the RSA II, follow these instructions on the physical machine.

If the machine can be rebooted without issue, cut off all power to the machine, wait a few seconds, and restart everything.

Otherwise, to restart the card without rebooting the machine:

[arabic]
. Download and install the IBM Remote Supervisor Adapter II Daemon
+
____
[arabic]
.. `yum install usbutils libusb-devel` # (needed by the RSA II daemon)
.. Download the correct tarball from http://www-947.ibm.com/systems/support/supportsite.wss/docdisplay?lndocid=MIGR-5071676&brandind=5000008 (TODO: check if this can be packaged in Fedora)
.. Extract the tarball and run `sudo ./install.sh --update`
____
. Download and extract the IBM Advanced Settings Utility from http://www-947.ibm.com/systems/support/supportsite.wss/docdisplay?lndocid=TOOL-ASU&brandind=5000016
+
____
[WARNING]
.Warning
====
This tarball dumps files in the current working directory.
====
____
. Issue a `sudo ./asu64 rebootrsa` to reboot the RSA II.
. Clean up: `yum remove ibmusbasm64`

== Other Resources

http://www.redbooks.ibm.com/abstracts/sg246495.html may be a useful resource to refer to when working with this.

73 modules/sysadmin_guide/pages/index.adoc Normal file
@@ -0,0 +1,73 @@

= System Administrator Guide

Welcome to the Fedora Infrastructure system administration guide.

[[sysadmin-getting-started]]
== Getting Started

If you haven't already, you should complete the general `getting-started` guide. Once you've completed that, you're ready to get involved in the https://admin.fedoraproject.org/accounts/group/view/fi-apprentice[Fedora Infrastructure Apprentice] group.

=== Fedora Infrastructure Apprentice

The https://admin.fedoraproject.org/accounts/group/view/fi-apprentice[Fedora Infrastructure Apprentice] group in the Fedora Account System grants read-only access to many Fedora infrastructure machines. This group is used for new folks to look around at the infrastructure setup, check machines and processes and see where they might like to contribute moving forward. This also allows apprentices to examine and gather info on problems, then propose solutions.

[NOTE]
.Note
====
This group will be pruned often of inactive folks who miss the monthly email check-in on the https://lists.fedoraproject.org/admin/lists/infrastructure.lists.fedoraproject.org/[infrastructure mailing list]. There's nothing personal in this and you're welcome to re-join later when you have more time; we just want to make sure the group only has active members.
====

Members of the https://admin.fedoraproject.org/accounts/group/view/fi-apprentice[Fedora Infrastructure Apprentice] group have ssh/shell access to many machines, but no sudo rights or ability to commit to the https://pagure.io/fedora-infra/ansible/[Ansible repository] (they do have read-only access). Apprentices can, however, contribute to the infrastructure documentation by making a pull request to the https://pagure.io/infra-docs/[infra-docs] repository. Access is via the bastion.fedoraproject.org machine and from there to each machine. See the `ssh-sop` for instructions on how to set up SSH. You can see a list of hosts that allow apprentice access by using:

....
$ ./scripts/hosts_with_var_set -i inventory/ -o ipa_client_shell_groups=fi-apprentice
....

from a checkout of the https://pagure.io/fedora-infra/ansible/[Ansible repository]. The Ansible repository is hosted on pagure.io at `https://pagure.io/fedora-infra/ansible.git`.

=== Selecting a Ticket

Start by checking out the https://pagure.io/fedora-infrastructure/issues?status=Open&tags=easyfix[easyfix tickets]. Tickets marked with this tag are a good place for apprentices to learn how things are set up, and also to contribute a fix.

Since apprentices do not have commit access to the https://pagure.io/fedora-infra/ansible/[Ansible repository], you should make your change, produce a patch with `git diff`, and attach it to the infrastructure ticket you are working on. It will then be reviewed.

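A minimal sketch of that workflow (the ticket number is a placeholder):

....
$ git diff > ticket-XXXX.patch
....
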
[[sops]]
== Standard Operating Procedures

Below is a table of contents containing all the standard operating procedures for Fedora Infrastructure applications. For information on how to write a new standard operating procedure, consult the guide on `develop-sops`.

55 modules/sysadmin_guide/pages/infra-git-repo.adoc Normal file
@@ -0,0 +1,55 @@

= Infrastructure Git Repos

Setting up an infrastructure git repo, and the push mechanisms for the magicks.

We have a number of git repos (in /git on batcave) that manage files for ansible, our docs, our common host info database and our kickstarts. This is a doc on how to set up a new one of these, if it is needed.

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main
Location::
Phoenix
Servers::
batcave01.phx2.fedoraproject.org, batcave-comm01.qa.fedoraproject.org

== Steps

Create the bare repo:

....
mkdir $git_dir
setfacl -m d:g:$yourgroup:rwx -m d:g:$othergroup:rwx \
    -m g:$yourgroup:rwx -m g:$othergroup:rwx $git_dir

cd $git_dir
git init --bare
....

Edit up the config - add these lines to the bottom:

....
[hooks]
# (normally sysadmin-members@fedoraproject.org)
mailinglist = emailaddress@yourdomain.org
emailprefix =
maildomain = fedoraproject.org
reposource = /path/to/this/dir
repodest = /path/to/where/you/want/the/files/dumped
....

Edit up the description - make it something useful. Then set up the hooks:

....
cd hooks
rm -f *.sample
cp hooks from /git/infra-docs/hooks/ on batcave01 to this path
....

Modify sudoers so that users in whatever groups can commit to this repo can run /usr/local/bin/syncgittree.sh without inputting a password.

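A minimal sketch of such a sudoers entry (the group name is only an example):

....
%sysadmin-web ALL = NOPASSWD: /usr/local/bin/syncgittree.sh
....
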
115 modules/sysadmin_guide/pages/infra-hostrename.adoc Normal file
@@ -0,0 +1,115 @@

= Infrastructure Host Rename SOP

This page is intended to guide you through the process of renaming a virtual node.

== Contents

[arabic]
. Introduction
. Finding out where the host is
. Preparation
. Renaming the Logical Volume
. Doing the actual rename
. Telling ansible about the new host
. VPN Stuff

== Introduction

Throughout this SOP, we will refer to the old hostname as $oldhostname and the new hostname as $newhostname. We will refer to the Dom0 host that the vm resides on as $vmhost.

If this process is being followed so that a temporary-named host can replace a production host, please be sure to follow the Infrastructure retire machine SOP to properly decommission the old host before continuing.

== Finding out where the host is

In order to rename the host, you must have access to the Dom0 (host) on which the virtual server resides. To find out which host that is, log in to batcave01, and run:

....
grep $oldhostname /var/log/virthost-lists.out
....

The first column of the output will be the Dom0 of the virtual node.

== Preparation

SSH to $oldhostname. If the new name is replacing a production box, change the IP Address that it binds to, in `/etc/sysconfig/network-scripts/ifcfg-eth0`.

Also change the hostname in `/etc/sysconfig/network`.

At this point, you can `sudo poweroff` $oldhostname.

Open an ssh session to $vmhost, and make sure that the node is listed as `shut off`. If it is not, you can force it off with:

....
virsh destroy $oldhostname
....

== Renaming the Logical Volume

Find out the name of the logical volume (on $vmhost):

....
virsh dumpxml $oldhostname | grep 'source dev'
....

This will give you a line that looks like `<source dev='/dev/VolGroup00/$oldhostname'/>`, which tells you that `/dev/VolGroup00/$oldhostname` is the path to the logical volume.

Run `/usr/sbin/lvrename` with the path that you found above, followed by the same path with $newhostname at the end instead of $oldhostname.

For example::::
/usr/sbin/lvrename /dev/VolGroup00/noc03-tmp /dev/VolGroup00/noc01

== Doing the actual rename

Now that the logical volume has been renamed, we can rename the host in libvirt.

Dump the configuration of $oldhostname into an xml file, by running:

....
virsh dumpxml $oldhostname > $newhostname.xml
....

Open up $newhostname.xml, and change all instances of $oldhostname to $newhostname.

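The substitution can also be done in one step; a minimal sketch:

....
sed -i "s/$oldhostname/$newhostname/g" $newhostname.xml
....
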
Save the file and run:

....
virsh define $newhostname.xml
....

If there are no errors above, you can undefine $oldhostname:

....
virsh undefine $oldhostname
....

Power on $newhostname, with:

....
virsh start $newhostname
....

And remember to set it to autostart:

....
virsh autostart $newhostname
....

== VPN Stuff

TODO

75 modules/sysadmin_guide/pages/infra-raidmismatch.adoc Normal file
@@ -0,0 +1,75 @@

= Infrastructure/SOP/Raid Mismatch Count

What to do when a raid device has a mismatch count

== Contents

[arabic]
. Contact Information
. Description
. Correction

____
[arabic]
. Step 1
. Step 2
____

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main
Location::
All
Servers::
Physical hosts
Purpose::
Correct raid mismatch counts on our physical hosts

== Description

In some situations a raid device may indicate there is a count mismatch, as listed in:

....
/sys/block/mdX/md/mismatch_cnt
....

Anything other than 0 is considered not good, though if the number is low it's probably nothing to worry about. To correct this situation try the directions below.

== Correction

More than anything these steps are to A) verify there is no problem and B) make the error go away. If step 1 and step 2 don't correct the problems, PROCEED WITH CAUTION. The steps below, however, should be relatively safe.

Issue a repair (replace mdX with the questionable raid device):

....
echo repair > /sys/block/mdX/md/sync_action
....

Depending on the size of the array and disk speed this can take a while. Watch the progress with:

....
cat /proc/mdstat
....

Issue a check. It's this check that will reset the mismatch count if there are no problems. Again, replace mdX with your actual raid device:

....
echo check > /sys/block/mdX/md/sync_action
....

Just as before, you can watch the progress with:

....
cat /proc/mdstat
....

113 modules/sysadmin_guide/pages/infra-repo.adoc Normal file
@@ -0,0 +1,113 @@

= Infrastructure Yum Repo SOP

In some cases RPMs in Fedora need to be rebuilt for the Infrastructure team to suit our needs. This repo is provided to the public (except for the RHEL RPMs). Rebuilds go into this repo, which is stored on the netapp and shared via the proxy servers after being built on koji.

For basic instructions, read the standard documentation on the Fedora wiki:

- https://fedoraproject.org/wiki/Using_the_Koji_build_system

This document will only outline the differences between the "normal" repos and the infra repos.

== Contents

[arabic]
. Contact Information
. Building an RPM
. Tagging an existing build
. Promoting a staging build
. Koji package list

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Location::
PHX https://kojipkgs.fedoraproject.org/repos-dist/
Servers::
koji batcave01 / Proxy Servers
Purpose::
Provides infrastructure repo for custom Fedora Infrastructure rebuilds

== Building an RPM

Building an RPM for Infrastructure is significantly easier than building an RPM for Fedora. Basically get your SRPM ready, then submit it to koji for building to the $repo-infra target (e.g. epel7-infra).

Example:

....
rpmbuild --define "dist .el7.infra" -bs test.spec
koji build epel7-infra test-1.0-1.el7.infra.src.rpm
....

[NOTE]
.Note
====
Remember to build it for every dist / arch you need to deploy it on.
====

After it has been built, you will see it is tagged as $repo-infra-candidate, which means that it is a candidate for being signed. The automatic signing system will pick it up and sign the package for you without any further intervention. You can track when this is done by checking the build info: when it is moved from $repo-infra-candidate to $repo-infra-stg, it has been signed. You can check this on the web interface (look under "Tags"), or via:

....
koji buildinfo test-1.0-1.el7.infra
....

After the build has been tagged into the $repo-infra-stg tag, tag2distrepo will automatically create a distrepo task, which will update the repository so that the package is available on staging hosts. After this time, you can yum clean all and then install the packages via yum install or yum update.

== Tagging existing builds

If you already have a real build and want to use it in the infrastructure before it has landed in stable, you can tag it into the respective infra-candidate tag. For example, if you have an epel7 build of test2-1.0-1.el7.infra, run:

....
koji tag epel7-infra-candidate test2-1.0-1.el7.infra
....

And then the same autosigning and repogen from the previous section applies.

== Promoting a staging build

After getting autosigned, builds will land in the respective infra-stg tag, for example epel7-infra-stg. These tags go into repos that are enabled on staging machines, but not on production. If you decide, after testing, that the build is good enough for production, you can promote it by running:

....
koji move epel7-infra-stg epel7-infra test2-1.0-1.el7.infra
....

== Koji package list

If you try to build a package into the infra tags, and koji says something like `BuildError: package test not in list for tag epel7-infra-candidate`, that means that the package has not been added to the list for building in that particular tag. Either add the package to the respective Fedora/EPEL branches (this is the preferred method, since we should always aim to get everything packaged for Fedora/EPEL), or add the package to the listing for the respective tag.

To add a package to an infra tag, run:

....
koji add-pkg $tag $package --owner=$user
....

44 modules/sysadmin_guide/pages/infra-retiremachine.adoc Normal file
@@ -0,0 +1,44 @@

|
||||||
|
= Infrastructure retire machine SOP
|
||||||
|
|
||||||
|
== Introduction
|
||||||
|
|
||||||
|
When a machine (be it virtual instance or real physical hardware is
|
||||||
|
decommisioned, a set of steps must be followed to ensure that the
|
||||||
|
machine is properly removed from the set of machines we manage and
|
||||||
|
doesn't cause problems down the road.
|
||||||
|
|
||||||
|
== Retire process
|
||||||
|
|
||||||
|
[arabic]
|
||||||
|
. {blank}
|
||||||
|
+
|
||||||
|
Ensure that the machine is no longer used for anything. Use git-grep,::
|
||||||
|
stop services, etc.
|
||||||
|
. {blank}
|
||||||
|
+
|
||||||
|
Remove the machine from ansible. Make sure you not only remove the
|
||||||
|
main::
|
||||||
|
machine name, but also any aliases it might have (or move them to an
|
||||||
|
active server if they are active services. Make sure to search for the
|
||||||
|
IP address(s) of the machine as well. Ensure dns is updated to remove
|
||||||
|
the machine.
|
||||||
|
. {blank}
|
||||||
|
+
|
||||||
|
Remove the machine from any labels in hardware devices like consoles
|
||||||
|
or::
|
||||||
|
the like.
|
||||||
|
. Revoke the ansible cert for the machine.
|
||||||
|
. {blank}
|
||||||
|
+
|
||||||
|
Move the machine xml defintion to ensure it does NOT start on boot.
|
||||||
|
You::
|
||||||
|
can move it to 'name-retired-YYYY-MM-DD'.
|
||||||
|
. {blank}
|
||||||
|
+
|
||||||
|
Ensure any backend storage the machine was using is freed or renamed
|
||||||
|
to::
|
||||||
|
name-retired-YYYY-MM-DD
|
||||||
|
|
||||||
|
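For the first two steps, something along these lines can help locate
leftover references; a minimal sketch, where the hostname and IP are
placeholders for the machine being retired:

....
# from a checkout of the ansible repo
git grep -ni 'retiredbox01' .    # placeholder hostname and aliases
git grep -n '10.5.126.99' .      # placeholder IP address
....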
== TODO

fill in commands
140 modules/sysadmin_guide/pages/infra-yubikey.adoc Normal file

@@ -0,0 +1,140 @@

= Infrastructure/SOP/Yubikey

This document describes how yubikey authentication works.

== Contents

[arabic]
. Contact Information
. User Information
. Host Admins
.. pam_yubico
. Server Admins
.. Basic architecture
.. ykval
.. ykksm
.. Physical Yubikey info
. fas integration

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main
Location::
Phoenix
Servers::
fas*, db02
Purpose::
Provides yubikey authentication in Fedora

== Config Files

* `/etc/httpd/conf.d/yk-ksm.conf`
* `/etc/httpd/conf.d/yk-val.conf`
* `/etc/ykval/ykval-config.php`
* `/etc/ykksm/ykksm-config.php`
* `/etc/fas.cfg`

== User Information

See Infrastructure/Yubikey.

== Host Admins

=== pam_yubico

Generated from fas, the /etc/yubikeyid works like an authorized_keys
file and maps valid keys to users. It is downloaded from FAS:

https://admin.fedoraproject.org/accounts/yubikey/dump
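If the map ever needs to be refreshed by hand, the download amounts to
fetching that URL into place; a minimal sketch of the normally automated
step:

....
# fetch the current key-to-user map from FAS
curl -o /etc/yubikeyid https://admin.fedoraproject.org/accounts/yubikey/dump
....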
== Server Admins

=== Basic architecture

Yubikey authentication takes place in 3 basic phases:

[arabic]
. User presses the yubikey, which generates a one time password.
. The one time password makes its way to the yk-val application, which
verifies it is not a replay.
. yk-val passes that otp on to the yk-ksm application, which verifies
the key itself is a valid key.

If all of those steps succeed, the ykval application sends back an OK
and authentication is considered successful. The two applications are
described below; if either of them is unavailable, yubikey
authentication will fail.

==== ykval

Database: db02:ykval

The database contains 3 tables:

* clients: just a valid client. These are not users; these are systems
able to authenticate against ykval. In our case Fedora is the only
client, so there's just one entry here.
* queue: used for distributed setups (we don't do this).
* yubikeys: maps which yubikey belongs to which user.

ykval is installed on fas* and is located at:
http://localhost/yk-val/verify

Purpose: to map keys to users and protect against replay attacks.

==== ykksm

Database: db02:ykksm

The database contains one table, yubikeys: it maps who created keys,
what key was created, when, the public name and serial number, whether
it's active, etc.

ykksm is installed on fas* at http://localhost/yk-ksm

Purpose: verify whether a key is a valid known key or not. Nothing
contacts this service directly except for ykval. This should be
considered the "high security" portion of the system, as access to this
table would allow users to make their own yubikeys.

==== Physical Yubikey info

The actual yubikey contains information to generate a one time password.
The important bits to know are that the beginning of the otp contains
the identifier of the key (used similar to how ssh uses authorized_keys)
and that the rest of it contains lots of bits of information, including
an incrementing serial.

Sample key: `ccccfcdaivjrvdhvzfljbbievftnvncljhibkulrftt`

Breaking this up, the first 12 characters are the identifier. This can
be considered 'public':

ccccfcdaivjr vdhvzfljbbievftnvncljhibkulrftt

The second half is the otp part.
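As a quick illustration of that split, using the sample key above and
plain shell substring expansion (purely illustrative):

....
otp="ccccfcdaivjrvdhvzfljbbievftnvncljhibkulrftt"
echo "${otp:0:12}"   # public identifier portion
echo "${otp:12}"     # changing one-time-password portion
....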
== fas integration

Fas integration has two main parts. The first is key generation, the
next is activation. The fas-plugin-yubikey contains the bits for both,
plus verification. Users call on this page to generate the key info:

https://admin.fedoraproject.org/accounts/yubikey/genkey

The fas password field automatically detects whether someone is using an
otp or a regular password. It then sends otp requests to yk-val for
verification.
225 modules/sysadmin_guide/pages/ipsilon.adoc Normal file

@@ -0,0 +1,225 @@

= Ipsilon Infrastructure SOP

== Contents

[arabic]
. Contact Information
. Description
. Known Issues
. Restarting
. Configuration
. Common actions
.. Registering OpenID Connect Scopes
.. Generate an OpenID Connect token
.. Create OpenID Connect secrets for apps

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Primary upstream contact::
Patrick Uiterwijk - FAS: puiterwijk
Backup upstream contact::
Simo Sorce - FAS: simo (irc: simo), Howard Johnson - FAS: merlinthp
(irc: MerlinTHP), Rob Crittenden - FAS: rcritten (irc: rcrit)
Location::
Phoenix
Servers::
ipsilon01.phx2.fedoraproject.org, ipsilon02.phx2.fedoraproject.org,
ipsilon01.stg.phx2.fedoraproject.org
Purpose::
Ipsilon is our central authentication service that is used to
authenticate users against FAS. It is separate from FAS.

== Description

Ipsilon is our central authentication agent that is used to authenticate
users against FAS. It is separate from FAS. The only service that is not
using this currently is the wiki. It is a web service that is presented
via httpd and is load balanced by our standard haproxy setup.

== Known issues

No known issues at this time. There is not currently a logout option for
ipsilon, but that is not considered an issue. If group memberships are
updated in ipsilon, the user will need to wait a few minutes for them to
replicate to all the systems.

== Restarting

To restart the application you simply need to ssh to the servers for the
problematic region and issue a 'service httpd restart'. This should
rarely be required.

== Configuration

Configuration is handled by the ipsilon.yaml playbook in Ansible. This
can also be used to reconfigure the application, if that becomes
necessary.

== Common actions

This section describes some common configuration actions.

=== OpenID Connect Scope Registration

As documented on
https://fedoraproject.org/wiki/Infrastructure/Authentication,
application developers can request their own scopes. When a request for
this comes in, look in ansible/roles/ipsilon/files/oidc_scopes/ and copy
an example module. Copy this to a new file, so we have a file per scope
set. Fill in the information:

* name is an Ipsilon-internal name. This should not include any spaces.
* display_name is the display name for this category of scopes, shown to
the user.
* scopes is a dictionary with the full scope identifier (with namespace)
as keys. The values are dicts with the following keys:
** display_name: the complete display name for this scope. This is what
the user gets shown to accept/reject.
** claims: a list of additional "claims" (pieces of user information) an
application will get when the user consents to this scope. For most
scopes, this will be the empty list.

In ansible/roles/ipsilon/tasks/main.yml, add the name of the new file
(without .py) to the with_items of "Copy OpenID Connect scope
registrations". To enable, open
ansible/roles/ipsilon/templates/configuration.conf, and look for the
lines starting with "openidc enabled extensions". Add the name of the
plugin (in the "name" field of the file) to the environment this
scopeset has been requested for. Run the ansible ipsilon.yml playbook.
=== Generate an OpenID Connect token

There is a handy script in the Ansible project under
`scripts/generate-oidc-token` that can help you generate an OIDC token.
It has a self-explanatory `--help` argument, and it will print out some
SQL that you can run against Ipsilon's database, as well as the token
that you seek.

The `SERVICE_NAME` (the required positional argument) is the name of the
application that wants to use the token to perform actions against
another service.

To generate the scopes, you can visit our authentication docs and find
the service you want the token to be used for. Each service has a base
namespace (a URL) and one or more scopes for that namespace. To form a
scope for this script, you concatenate the namespace of the service with
the scope you want to grant the service. You can provide the script the
-s flag multiple times if you want to grant more than one scope to the
same token.

As an example, to give Bodhi access to create waivers in WaiverDB, you
can see that the base namespace is
`https://waiverdb.fedoraproject.org/oidc/` and that there is a
`create-waiver` scope. You can run this to generate Ipsilon SQL and a
token with that scope:

....
[bowlofeggs@batcave01 ansible][PROD]$ ./scripts/generate-oidc-token bodhi -e 365 -s https://waiverdb.fedoraproject.org/oidc/create-waiver

Run this SQL against Ipsilon's database:

--------START CUTTING HERE--------
BEGIN;
insert into token values ('2a5f2dff-4e93-4a8d-8482-e62f40dce046','username','bodhi@service');
insert into token values ('2a5f2dff-4e93-4a8d-8482-e62f40dce046','security_check','-ptBqVLId-kUJquqkVyhvR0DbDULIiKp1eqbXqG_dfVK9qACU6WwRBN3-7TRfoOn');
insert into token values ('2a5f2dff-4e93-4a8d-8482-e62f40dce046','client_id','bodhi');
insert into token values ('2a5f2dff-4e93-4a8d-8482-e62f40dce046','expires_at','1557259744');
insert into token values ('2a5f2dff-4e93-4a8d-8482-e62f40dce046','type','Bearer');
insert into token values ('2a5f2dff-4e93-4a8d-8482-e62f40dce046','issued_at','1525723744');
insert into token values ('2a5f2dff-4e93-4a8d-8482-e62f40dce046','scope','["openid", "https://someapp.fedoraproject.org/"]');
COMMIT;
-------- END CUTTING HERE --------


Token: 2a5f2dff-4e93-4a8d-8482-e62f40dce046_-ptBqVLId-kUJquqkVyhvR0DbDULIiKp1eqbXqG_dfVK9qACU6WwRBN3-7TRfoOn
....

Once you have the SQL, you can run it against Ipsilon's database, and
you can provide the token to the application through some secure means
(such as putting it into Ansible's secrets and telling the requestor the
Ansible variable they can use to access it).
=== Create OpenID Connect secrets for apps

Applications wanting to use OpenID Connect need to register against our
OpenID Connect server (Ipsilon). Since we do not allow self-registration
(except on iddev.fedorainfracloud.org) for obvious reasons, the secrets
need to be created and configured per application and environment
(production vs staging).

To do so:

* Go to the private ansible repository.
* Edit the file `files/ipsilon/openidc.{{env}}.static`.
* At the bottom of this file, add the information concerning the
application you are adding. This will look something like:

....
fedocal client_name="fedocal"
fedocal client_secret="<long random string>"
fedocal redirect_uris=["https://calendar.stg.fedoraproject.org/oidc_callback"]
fedocal client_uri="https://calendar.stg.fedoraproject.org/"
fedocal ipsilon_internal={"type":"static","client_id":"fedocal","trusted":true}
fedocal contacts=["admin@fedoraproject.org"]
fedocal client_id=null
fedocal policy_uri="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
fedocal grant_types="authorization_code"
fedocal response_types="code"
fedocal application_type="web"
fedocal subject_type="pairwise"
fedocal logo_uri=null
fedocal tos_uri=null
fedocal jwks_uri=null
fedocal jwks=null
fedocal sector_identifier_uri=null
fedocal request_uris=[]
fedocal require_auth_time=null
fedocal token_endpoint_auth_method="client_secret_post"
fedocal id_token_signed_response_alg="RS256"
fedocal request_object_signing_alg="none"
fedocal initiate_login_uri=null
fedocal default_max_age=null
fedocal default_acr_values=null
fedocal client_secret_expires_at=0
....

In most situations, only the first 5 lines (up to `ipsilon_internal`)
will change. If the application is not using flask-oidc or is not
maintained by the Fedora Infrastructure, the first 11 lines (up to
`application_type`) may change. The remaining lines require a deeper
understanding of OpenID Connect and Ipsilon.

[NOTE]
.Note
====
`client_id` in `ipsilon_internal` must match the beginning of the line,
and the `client_id` field must either match the beginning of the line or
be `null` as in the example here.
====

[NOTE]
.Note
====
In our OpenID Connect server, OIDC.user_getfield('nickname') will return
the FAS username, which we know from FAS is unique. However, not all
OpenID Connect servers enforce this constraint, so the application code
may rely on the `sub`, which is the only key that is sure to be unique.
If the application relies on `sub` and wants `sub` to return the FAS
username, then the configuration should be adjusted with:
`subject_type="public"`.
====
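The `<long random string>` used for `client_secret` just needs to be
long and random; one possible way to produce one (a sketch, not a
mandated procedure) is:

....
# generate a random secret suitable for client_secret
openssl rand -hex 32
....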
After adjusting this file, you will need to make the `client_secret`
available to the application via ansible. For this, simply add it to
`vars.yml` as we do for the other private variables and provide the
variable name to the person who requested it.

Finally, commit and push the changes to both files and run the
`ipsilon.yml` playbook.
149 modules/sysadmin_guide/pages/iscsi.adoc Normal file

@@ -0,0 +1,149 @@

= iSCSI

iscsi allows one to share and mount block devices using the scsi
protocol over a network. Fedora currently connects to a netapp that has
an iscsi export.

== Contents

[arabic]
. Contact Information
. Typical uses
. iscsi basics
.. Terms
.. iscsi's basic login / logout procedure
.. Logging in
.. Logging out
.. Important note about creating new logical volumes

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main
Location::
Phoenix
Servers::
xen[1-15]
Purpose::
Provides iscsi connectivity to our netapp.

== Typical uses

The best uses for Fedora are for servers that are not part of a farm or
live replicated. For example, we wouldn't put app1 on the iscsi share
because we don't gain anything from it. Shutting down app1 to move it
isn't an issue because app1 is part of our application server farm.

noc1, however, is not replicated. It's a stand-alone box that, at best,
would have a non-live failover. By placing this host on an iscsi share,
we can make it more highly available, as it allows us to move that box
around our virtualization infrastructure without rebooting it or even
taking it down.

== iscsi basics

=== Terms

* initiator means client
* target means server
* swab means mop
* deck means floor

=== iscsi's basic login / logout procedure

[arabic]
. Notify your client that a new target is available (similar to editing
/etc/fstab for a new nfs mount).
. Login to the iscsi target (similar to running "mount /my/nfs").
. Logout from the iscsi target (similar to running "umount /my/nfs").
. Delete the target from the client (similar to removing the nfs mount
from /etc/fstab).

==== Logging in

Most mounts are covered by ansible so this should be automatic. In the
event that something goes wrong though, the best way to fix this is:

* Notify the client of the target:
+
....
iscsiadm --mode node --targetname iqn.1992-08.com.netapp:sn.118047036 --portal 10.5.88.21:3260 -o new
....
* Log in to the new target:
+
....
iscsiadm --mode node --targetname iqn.1992-08.com.netapp:sn.118047036 --portal 10.5.88.21:3260 --login
....
* Scan and activate lvm:
+
....
pvscan
vgscan
vgchange -ay xenGuests
....

Once this is done, one should be able to run "lvs" to see the logical
volumes.

==== Logging out

Logging out isn't normally needed; for example, rebooting a machine
automatically logs the initiator out. Should a problem arise though,
here are the steps:

* Disable the logical volumes:
+
....
vgchange -an xenGuests
....
* Log out:
+
....
iscsiadm --mode node --targetname iqn.1992-08.com.netapp:sn.118047036 --portal 10.5.88.21:3260 --logout
....

[NOTE]
.Note
====
`Cannot deactivate volume group`

If the vgchange command fails with an error about not being able to
deactivate the volume group, this means that one of the logical volumes
is still in use. By running "lvs" you can get a list of logical volumes.
Look in the Attr column. There are 6 attrs listed. The 5th column
usually has a '-' or an 'a'; 'a' means it's active, '-' means it is not.
To the right of that (the last column) you will see a '-' or an 'o'. If
you see an 'o', that means that logical volume is still mounted and in
use (see the sketch after this note).
====
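To spot the open volume quickly, something like the following can help;
a sketch that assumes the 'o' (open) flag appears in the lv_attr string
as described above:

....
# print logical volumes whose attribute string contains the 'o'
# (open/in-use) flag; confirm with lvdisplay before acting on it
lvs -o vg_name,lv_name,lv_attr --noheadings | awk '$3 ~ /o/'
....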
[IMPORTANT]
.Important
====
Note about creating new logical volumes

At present we do not have logical volume locking on the xen servers.
This is dangerous and being worked on. Basically, when you create a new
volume on a host, you need to run:

....
pvscan
vgscan
lvscan
....

on the other virtualization servers.
====
40 modules/sysadmin_guide/pages/jenkins-fedmsg.adoc Normal file

@@ -0,0 +1,40 @@

= Jenkins Fedmsg SOP

Send information about Jenkins builds to fedmsg.

== Contact Information

Owner::
Ricky Elrod, Fedora Infrastructure Team
Contact::
#fedora-apps

== Reinstalling when it disappears

For an as-yet unknown reason, the plugin sometimes seems to disappear,
though it still shows as "installed" on Jenkins.

To re-install it, grab `fedmsg.hpi` from
`/srv/web/infra/bigfiles/jenkins`. Go to the Jenkins web interface and
log in. Click `Manage Jenkins` -> `Manage Plugins` -> `Advanced`. Upload
the plugin and, on the page that comes up, check the box to have Jenkins
restart when running jobs are finished.

== Configuration Values

These are written here in case the Jenkins configuration ever gets lost.
This is how to configure the jenkins-fedmsg-emit plugin.

Assume the plugin is already installed.

Go to "Configure Jenkins" -> "System Configuration".

Towards the bottom, look for "Fedmsg Emitter".

Values:

Signing::
Checked
Fedmsg Endpoint::
tcp://209.132.181.16:9941
Environment Shortname::
prod
Certificate File::
/etc/pki/fedmsg/jenkins-jenkins.fedorainfracloud.org.crt
Keystore File::
/etc/pki/fedmsg/jenkins-jenkins.fedorainfracloud.org.key
60 modules/sysadmin_guide/pages/kerneltest-harness.adoc Normal file

@@ -0,0 +1,60 @@

= Kerneltest-harness SOP

The kerneltest-harness is the web application used to gather and present
statistics about kernel test results.

== Contents

[arabic]
. Contact Information
. Documentation Links

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin
Location::
https://apps.fedoraproject.org/kerneltest/
Servers::
kerneltest01, kerneltest01.stg
Purpose::
Provide a system to gather and present kernel test results

== Add a new Fedora release

* Login.
* On the front page, in the menu on the left side, if there is a
`Fedora Rawhide` release, click on `(edit)`.
* Bump the `Release number` on `Fedora Rawhide` to avoid conflicts with
the new release you're creating.
* Back on the index page, click on `New release`.
* Complete the form:
+
Release number::
This is the integer version of the Fedora release, for example 24 for
Fedora 24.
Support::
The current status of the Fedora release
+
** Rawhide for Fedora Rawhide
** Test for a branched release
** Release for a released Fedora
** Retired for a retired release of Fedora

== Upload new test results

The kernel tests are available in the
https://git.fedorahosted.org/cgit/kernel-tests.git/[kernel-tests] git
repository.

Once run with `runtests.sh`, you can upload the resulting file either
using `fedora_submit.py` or the UI.
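For context, running the suite locally looks roughly like this (a
sketch; the exact clone URL is whatever the repository page lists - the
cgit URL here is an assumption):

....
# fetch and run the test suite; results land in a log file to upload
git clone https://git.fedorahosted.org/cgit/kernel-tests.git kernel-tests
cd kernel-tests
./runtests.sh
....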
If you choose the UI, the steps are simply:

* Login
* Click on `Upload` in the main menu at the top
* Select the result file generated by running the tests
* Submit
171 modules/sysadmin_guide/pages/kickstarts.adoc Normal file

@@ -0,0 +1,171 @@

= Kickstart Infrastructure SOP

Kickstart scripts provide our install infrastructure. We have a plethora
of different kickstarts to best match the system you are trying to
install.

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main
Location::
Everywhere we have machines.
Servers::
batcave01 (stores kickstarts and install media)
Purpose::
Provides our install infrastructure

== Introduction

Our kickstart infrastructure lives on batcave01. All install media and
kickstart scripts are located on batcave01. Because the RHEL binaries
are not public we have these bits blocked. You can add needed IPs to
(from batcave01):

....
ansible/roles/batcave/files/allows
....

== Physical Machine (kvm virthost)

[NOTE]
.Note
====
PXE Booting: If PXE booting, just follow the prompt after doing the pxe
boot (most hosts will pxeboot via console by hitting f12).
====

=== Prep

This only works on an already booted box; many boxes at our colocations
may have to be rebuilt by the people in those locations first. Also make
sure the IP you are about to boot to install from is allowed to our IP
restricted infrastructure.fedoraproject.org as noted above (in
Introduction).

Download the vmlinuz and initrd images.

For a rhel6 install:

....
wget https://infrastructure.fedoraproject.org/repo/rhel/RHEL6-x86_64/images/pxeboot/vmlinuz \
    -O /boot/vmlinuz-install
wget https://infrastructure.fedoraproject.org/repo/rhel/RHEL6-x86_64/images/pxeboot/initrd.img \
    -O /boot/initrd-install.img

grubby --add-kernel=/boot/vmlinuz-install \
    --args="ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-6-nohd \
    repo=https://infrastructure.fedoraproject.org/repo/rhel/RHEL6-x86_64/ \
    ksdevice=link ip=$IP gateway=$GATEWAY netmask=$NETMASK dns=$DNS" \
    --title="install el6" --initrd=/boot/initrd-install.img
....

For a rhel7 install:

....
wget https://infrastructure.fedoraproject.org/repo/rhel/RHEL7-x86_64/images/pxeboot/vmlinuz -O /boot/vmlinuz-install
wget https://infrastructure.fedoraproject.org/repo/rhel/RHEL7-x86_64/images/pxeboot/initrd.img -O /boot/initrd-install.img
....

For phx2 hosts:

....
grubby --add-kernel=/boot/vmlinuz-install \
    --args="ks=http://10.5.126.23/repo/rhel/ks/hardware-rhel-7-nohd \
    repo=http://10.5.126.23/repo/rhel/RHEL7-x86_64/ \
    net.ifnames=0 biosdevname=0 bridge=br0:eth0 ksdevice=br0 \
    ip={{ br0_ip }}::{{ gw }}:{{ nm }}:{{ hostname }}:br0:none" \
    --title="install el7" --initrd=/boot/initrd-install.img
....

(You will need to set up the br1 device, if any, after install.)

For non phx2 hosts:

....
grubby --add-kernel=/boot/vmlinuz-install \
    --args="ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-ext \
    repo=https://infrastructure.fedoraproject.org/repo/rhel/RHEL7-x86_64/ \
    net.ifnames=0 biosdevname=0 bridge=br0:eth0 ksdevice=br0 \
    ip={{ br0_ip }}::{{ gw }}:{{ nm }}:{{ hostname }}:br0:none" \
    --title="install el7" --initrd=/boot/initrd-install.img
....

Fill in the br0 ip, gateway, etc.
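As a concrete illustration, here is the non-phx2 template with
placeholder values substituted (the 192.0.2.x addresses and the hostname
are documentation placeholders, not real values):

....
grubby --add-kernel=/boot/vmlinuz-install \
    --args="ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-ext \
    repo=https://infrastructure.fedoraproject.org/repo/rhel/RHEL7-x86_64/ \
    net.ifnames=0 biosdevname=0 bridge=br0:eth0 ksdevice=br0 \
    ip=192.0.2.10::192.0.2.1:255.255.255.0:host01.example.org:br0:none" \
    --title="install el7" --initrd=/boot/initrd-install.img
....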
The default here is to use the hardware-rhel-7-nohd config, which
requires you to connect via VNC to the box and configure its drives. If
this is a new machine, or you are fine with blowing everything away, you
can instead use
https://infrastructure.fedoraproject.org/rhel/ks/hardware-rhel-6-minimal
as your kickstart.

If you know the number of hard drives the system has, there are other
kickstarts which can be used.

2 disk system::::
ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-02disk
or external::::
ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-02disk-ext
4 disk system::::
ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-04disk
or external::::
ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-04disk-ext
6 disk system::::
ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-06disk
or external::::
ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-06disk-ext
8 disk system::::
ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-08disk
or external::::
ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-08disk-ext
10 disk system::::
ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-10disk
or external::::
ks=https://infrastructure.fedoraproject.org/repo/rhel/ks/hardware-rhel-7-10disk-ext

Double and triple check your configuration settings (on RHEL-6
`cat /boot/grub/menu.lst` and on RHEL-7 `cat /boot/grub2/grub.cfg`),
especially your IP information. In places like ServerBeach not all hosts
have the same netmask or gateway. Once everything is ready, run the
commands to get it set up to boot on the next boot.

RHEL-6:

....
echo "savedefault --default=0 --once" | grub --batch
shutdown -r now
....

RHEL-7:

....
grub2-reboot 0
shutdown -r now
....

=== Installation

Once the box logs you out, start pinging the IP address. It will
disappear and come back. Once you can ping it again, try to open up a
VNC session. It can take a couple of minutes after the box is back up
for it to actually allow vnc sessions. The VNC password is in the
kickstart script on batcave01:

....
grep vnc /mnt/fedora/app/fi-repo/rhel/ks/hardware-rhel-7-nohd

vncviewer $IP:1
....

If using the standard kickstart script, one can watch as the install
completes itself; there should be no need to do anything. If using the
hardware-rhel-6-nohd script, one will need to configure the drives. The
password is in the kickstart file in the kickstart repo.

=== Post Install

Run ansible on the box asap to set root passwords and other security
features. Don't leave a newly installed box sitting around.
33 modules/sysadmin_guide/pages/koji-archive.adoc Normal file

@@ -0,0 +1,33 @@

This SOP documents how to archive Fedora EOL'd builds from the DEFAULT
volume to the archived volume.

Before archiving the builds, identify whether any of the EOL'd release
builds are still being used in the current releases. For example, to
test if f28 builds are still being used in f32, use:

....
$ koji list-tagged f32 | grep fc28
....

Tag all these builds with koji's do-not-archive-yet tag, so that they
won't be archived. To do that, first add the packages to the
do-not-archive-yet tag:

....
$ koji add-pkg do-not-archive-yet --owner <username> pkg1 pkg2 ...
....

Then tag the builds into the do-not-archive-yet tag:

....
$ koji tag-build do-not-archive-yet build1 build2 ...
....

Then update the archive policy which is available in the releng repo
(https://pagure.io/releng/blob/master/f/koji-archive-policy).

Run the following from compose-x86-01.phx2.fedoraproject.org:

....
$ cd
$ wget https://pagure.io/releng/raw/master/f/koji-archive-policy
$ git clone https://pagure.io/koji-tools/
$ cd koji-tools
$ ./koji-change-volumes -p compose_koji -v ~/archive-policy
....

In any case, if you need to move a build back to the DEFAULT volume:

....
$ koji add-pkg do-not-archive-yet --owner <username> pkg1
$ koji tag-build do-not-archive-yet build1
$ koji set-build-volume DEFAULT <n-v-r>
....
118 modules/sysadmin_guide/pages/koji-builder-setup.adoc Normal file

@@ -0,0 +1,118 @@

= Setup Koji Builder SOP

== Contents

* Setting up a new koji builder
* Resetting/installing an old koji builder

== Builder Setup

Setting up a new koji builder involves a goodly number of steps:

=== Network Overview

[arabic]
. First get an instance spun up following the kickstart sop.
. Define a hostname for it on the .125 network and a $hostname-nfs name
for it on the .127 network.
. Make sure the instance has 2 network connections:
* eth0 should be on the .125 network
* eth1 should be on the .127 network

For a VM, eth0 should be on br0 and eth1 on br1 on the vmhost.

=== Setup Overview

* install the system as normal:
+
....
virt-install -n $builder_fqdn -r $memsize \
    -f $path_to_lvm --vcpus=$numprocs \
    -l http://10.5.126.23/repo/rhel/RHEL6-x86_64/ \
    -x "ksdevice=eth0 ks=http://10.5.126.23/repo/rhel/ks/kvm-rhel-6 \
    ip=$ip netmask=$netmask gateway=$gw dns=$dns \
    console=tty0 console=ttyS0" \
    --network=bridge=br0 --network=bridge=br1 \
    --vnc --noautoconsole
....
* run `python /root/tmp/setup-nfs-network.py`; this should print out the
-nfs hostname that you made above
* change the root pw
* disable selinux on the machine in /etc/sysconfig/selinux
* reboot
* setup the ssl cert into private/builders - use the fqdn of the host as
the DN:
** login to fas01 as root
** `cd /var/lib/fedora-ca`
** `./kojicerthelper.py normal --outdir=/tmp/ --name=$fqdn_of_the_new_builder --cadir=. --caname=Fedora`
** info for the cert should be like this:
+
....
Country Name (2 letter code) [US]:
State or Province Name (full name) [North Carolina]:
Locality Name (eg, city) [Raleigh]:
Organization Name (eg, company) [Fedora Project]:
Organizational Unit Name (eg, section) []:Fedora Builders
Common Name (eg, your name or your servers hostname) []:$fqdn_of_new_builder
Email Address []:buildsys@fedoraproject.org
....
** scp the file in `/tmp/${fqdn}_key_and_cert.pem` over to batcave01
** put the file in the private repo under `private/builders/${fqdn}.pem`
** `git add` + `git commit`
** `git push`
* run `./sync-hosts` in the infra-hosts repo; `git commit; git push`
* as a koji admin run:
+
....
koji add-host $fqdn i386 x86_64
....
+
(note: those are yum basearchs on the end - season to taste)

=== Resetting/installing an old koji builder

* disable the builder in koji (ask a koji admin)
* halt the old system (halt -p)
* undefine the vm instance on the buildvmhost:
+
....
virsh undefine $builder_fqdn
....
* reinstall it - from the buildvmhost run:
+
....
virt-install -n $builder_fqdn -r $memsize \
    -f $path_to_lvm --vcpus=$numprocs \
    -l http://10.5.126.23/repo/rhel/RHEL6-x86_64/ \
    -x "ksdevice=eth0 ks=http://10.5.126.23/repo/rhel/ks/kvm-rhel-6 \
    ip=$ip netmask=$netmask gateway=$gw dns=$dns \
    console=tty0 console=ttyS0" \
    --network=bridge=br0 --network=bridge=br1 \
    --vnc --noautoconsole
....
* watch the install via vnc:
+
....
vncviewer -via bastion.fedoraproject.org $builder_fqdn:1
....
* when the install finishes:
** start the instance on the buildvmhost:
+
....
virsh start $builder_fqdn
....
** set it to autostart on the buildvmhost:
+
....
virsh autostart $builder_fqdn
....
* when the guest comes up:
** login via ssh using the temp root password
** run `python /root/tmp/setup-nfs-network.py`
** change the root password
** disable selinux in /etc/sysconfig/selinux
** reboot
** ask a koji admin to re-enable the host
224 modules/sysadmin_guide/pages/koji.adoc Normal file

@@ -0,0 +1,224 @@

= Koji Infrastructure SOP

[NOTE]
.Note
====
We are transitioning from two buildsystems, koji for Fedora and plague
for EPEL, to just using koji. This page documents both.
====

Koji and plague are our buildsystems. They share some of the same
machines to do their work.

== Contents

[arabic]
. Contact Information
. Description
. Add packages into Buildroot
. Troubleshooting and Resolution
.. Restarting Koji
.. kojid won't start or some builders won't connect
.. OOM (Out of Memory) Issues
... Increase Memory
... Decrease weight
.. Disk Space Issues
. Should there be mention of being sure filesystems in chroots are
unmounted before you delete the chroots?

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-build group
Persons::
mbonnet, dgilmore, f13, notting, mmcgrath, SmootherFrOgZ
Location::
Phoenix
Servers::
* koji.fedoraproject.org
* buildsys.fedoraproject.org
* xenbuilder[1-4]
* hammer1, ppc[1-4]
Purpose::
Build packages for Fedora.

== Description

Users submit builds to koji.fedoraproject.org or
buildsys.fedoraproject.org. From there they get passed on to the
builders.

[IMPORTANT]
.Important
====
At present plague and koji are unaware of each other. A result of this
may be an overloaded builder. An easy fix for this is not clear at this
time.
====

== Add packages into Buildroot

Some contributors may need to build packages against freshly built
packages which are not in the buildroot yet. Koji has override tags, set
up as an inheritance of the build tag, to include such packages in the
buildroot; a package can be added with:

....
koji tag-pkg dist-$release-override <package_nvr>
....

== Troubleshooting and Resolution

=== Restarting Koji

If for some reason koji needs to be restarted, make sure to restart the
koji master first, then the builders. If the koji master has been down
for a short enough time, the builders do not need to be restarted:

....
service httpd restart
service kojira restart
service kojid restart
....

[IMPORTANT]
.Important
====
If postgres becomes interrupted in some way, koji will need to be
restarted. As long as the koji master daemon gets restarted, the
builders should reconnect automatically. If the db server has been
restarted and the builders don't seem to be building, restart their
daemons as well.
====

=== kojid won't start or some builders won't connect

In the event that some items are able to connect to koji while some are
not, please make sure that the database is not filled up on connections.
This is common if koji crashes and the db connections aren't properly
cleared. Upon restart many of the connections are full so koji cannot
reconnect. Clearing old connections is easy: guess about how long the
new koji has been up, pick a number of minutes larger than that, and
kill those queries. From db3 as postgres run:

....
echo "select procpid from pg_stat_activity where usename='koji' and now() - query_start \
>= '00:40:00' order by query_start;" | psql koji | grep "^ " | xargs kill
....

=== OOM (Out of Memory) Issues

Out of memory issues occur from time to time on the build machines.
There are a couple of options for correction. The first fix is to just
restart the machine and hope it was a one time thing. If the problem
continues, please choose from one of the following options.

==== Increase Memory

The xen machines can have memory increased on their corresponding xen
hosts. At present this is the table:

[width="34%",cols="44%,56%",]
|===
|xen3 |xenbuilder1
|xen4 |xenbuilder2
|disabled |xenbuilder3
|xen8 |xenbuilder4
|===

Edit `/etc/xen/xenbuilder[1-4]` and add more memory.

==== Decrease weight

Each builder has a weight as to how much work can be given to it.
Presently the only way to alter the weight is actually changing the
database on db3:

....
$ sudo su - postgres
-bash-2.05b$ psql koji
koji=# select * from host limit 1;
 id | user_id | name                   | arches    | task_load | capacity | ready | enabled
----+---------+------------------------+-----------+-----------+----------+-------+---------
  6 |     130 | ppc3.fedora.redhat.com | ppc ppc64 |       1.5 |        4 | t     | t
(1 row)
koji=# update host set capacity=2 where name='ppc3.fedora.redhat.com';
....

Simply update capacity to a lower number.

=== Disk Space Issues

The builders use a lot of temporary storage. Failed builds also get left
on the builders; most should get cleaned, but plague's are not. The
easiest thing to do is remove some older cache dirs.

Step one is to turn off both koji and plague:

....
/etc/init.d/plague-builder stop
/etc/init.d/kojid stop
....

Next check to see what file system is full:

....
df -h
....

[IMPORTANT]
.Important
====
If any one of the following directories is full, send an outage
notification as outlined in Infrastructure/OutageTemplate to the
fedora-infrastructure-list and fedora-devel-list, then contact Mike
McGrath.

* /mnt/koji
* /mnt/ntap-fedora1/scratch
* /pub/epel
* /pub/fedora
====

Typically just / will be full. The next thing to do is determine if
we have any extremely large builds left on the builder. Typical
locations include /var/lib/mock and /mnt/build (/mnt/build actually is
on the local filesystem):

....
du -sh /var/lib/mock/* /mnt/build/*
....

`/var/lib/mock/dist-f8-build-10443-1503`::
a classic koji build
`/var/lib/mock/fedora-6-ppc-core-57cd31505683ef1afa533197e91608c5a2c52864`::
a classic plague build

If nothing jumps out immediately, just start deleting files older than
one week; one way to list candidates is sketched below.
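A possible way to list week-old leftovers (listing only - nothing here
deletes; mind the warning about chroot filesystems in the Unmounting
section below before removing anything):

....
# show top-level build/chroot dirs not modified in over 7 days
find /var/lib/mock /mnt/build -maxdepth 1 -mtime +7 -exec ls -ld {} \;
....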
Once enough space has been freed, start koji and plague back up:

....
/etc/init.d/plague-builder start
/etc/init.d/kojid start
....

=== Unmounting

[WARNING]
.Warning
====
Should there be mention of being sure filesystems in chroots are
unmounted before you delete the chroots?

Res ipsa loquitur.
====
208 modules/sysadmin_guide/pages/koschei.adoc Normal file

@@ -0,0 +1,208 @@

= Koschei SOP

Koschei is a continuous integration system for RPM packages. Koschei
runs package scratch builds after a dependency change or after time
elapses, and reports package buildability status to interested parties.

Production instance: https://apps.fedoraproject.org/koschei

Staging instance: https://apps.stg.fedoraproject.org/koschei

== Contact Information

Owner::
mizdebsk, msimacek
Contact::
#fedora-admin
Location::
Fedora Cloud
Purpose::
continuous integration system

== Deployment

Koschei deployment is managed by two Ansible playbooks:

....
sudo rbac-playbook groups/koschei-backend.yml
sudo rbac-playbook groups/koschei-web.yml
....

== Description

Koschei is deployed on two separate machines - `koschei-backend` and
`koschei-web`.

The frontend (`koschei-web`) is a Flask WSGI application running with
httpd. It displays information to users and allows editing package
groups and changing priorities.

The backend (`koschei-backend`) consists of multiple services:

* `koschei-watcher` - listens to fedmsg events for complete builds and
changes build states in the database
* `koschei-repo-resolver` - resolves package dependencies in a given
repo using hawkey and compares them with the previous iteration to get a
dependency diff. It resolves all packages in the newest repo available
in Koji. The output is a base for scheduling new builds
* `koschei-build-resolver` - resolves complete builds in the repo in
which they were done in Koji. Produces the dependency differences
visible in the frontend
* `koschei-scheduler` - schedules new builds based on multiple criteria:
** dependency priority - dependency changes since the last build, valued
by their distance in the dependency graph
** manual and static priorities - set manually in the frontend. Manual
priority is reset after each build, static priority persists
** time priority - time elapsed since the last build
* `koschei-polling` - polls the same types of events as koschei-watcher
without reliance on fedmsg. Additionally takes care of package list
synchronization and other regularly executed tasks

== Configuration

Koschei configuration is in `/etc/koschei/config-backend.cfg` and
`/etc/koschei/config-frontend.cfg`, and is merged with the default
configuration in `/usr/share/koschei/config.cfg` (the ones in `/etc`
override the defaults in `/usr`). Note the merge is recursive. The
configuration contains all configurable items for all Koschei services
and the frontend. Alterations to the configuration that aren't temporary
should be done through the ansible playbook. Configuration changes have
no effect on already running services -- they need to be restarted,
which happens automatically when using the playbook.

== Disk usage

Koschei doesn't keep anything on disk that couldn't be recreated easily
- all important data is stored in the PostgreSQL database, configuration
is managed by Ansible, code is installed by RPM, and so on.

To speed up operation and reduce load on external servers, Koschei
caches some data obtained from services it integrates with. Most
notably, YUM repositories downloaded from Koji are kept in
`/var/cache/koschei/repodata`. Each repository takes about 100 MB of
disk space. The maximal number of repositories kept at a time is
controlled by the `cache_l2_capacity` parameter in `config-backend.cfg`
(`config-backend.cfg.j2` in Ansible). If the repodata cache starts to
consume too much disk space, that value can be decreased - after a
restart, `koschei-*-resolver` will remove least recently used cache
entries to respect the configured cache capacity.

== Database

Koschei needs to connect to a PostgreSQL database; other database
systems are not supported. The database connection is specified in the
configuration under the `database_config` key, which can contain the
following keys: `username, password, host, port, database`.

After an update of koschei, the database needs to be migrated to the new
schema. This happens automatically when using the upgrade playbook.
Alternatively, it can be executed manually using:

....
koschei-admin alembic upgrade head
....

The backend services need to be stopped during the migration.

== Managing koschei services

Koschei services are systemd units managed through `systemctl`. They can
be started and stopped independently in any order. The frontend is run
using httpd.
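For example, using the service names listed in the Description section
above, restarting a single backend service looks like:

....
# restart one backend service and check that it came back up
systemctl restart koschei-scheduler
systemctl status koschei-scheduler
....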
== Suspespending koschei operation
|
||||||
|
|
||||||
|
For stopping builds from being scheduled, stopping the
|
||||||
|
`koschei-scheduler` service is enough. For planned Koji outages, it's
|
||||||
|
recommended to stop `koschei-scheduler`. It is not necessary, as koschei
|
||||||
|
can recover from Koji errors and network errors automatically, but when
|
||||||
|
Koji builders are stopped, it may cause unexpected build failures that
|
||||||
|
would be reported to users. Other services can be left running as they
|
||||||
|
automatically restart themselves on Koji and network errors.
|
||||||
|
|
||||||
|
== Limiting Koji usage
|
||||||
|
|
||||||
|
Koschei is by default limited to 30 concurrently running builds. This
|
||||||
|
limit can be changed in the configuration under `koji_config.max_builds`
|
||||||
|
key. There's also Koji load monitoring, that prevents builds from being
|
||||||
|
scheduled when Koji load is higher that certain threshold. That should
|
||||||
|
prevent scheduling builds during mass rebuilds, so it's not necessary to
|
||||||
|
stop scheduling during those.
|
||||||
|
|
||||||
|
== Fedmsg notifications
|
||||||
|
|
||||||
|
Koschei optionally supports sending fedmsg notifications for package
|
||||||
|
state changes. The fedmsg dispatch can be turned on and off in the
|
||||||
|
configuration (key `fedmsg-publisher.enabled`). Koschei doesn't supply
|
||||||
|
configuration for fedmsg, it lets the library to load it's own (in
|
||||||
|
`/etc/fedmsg.d/`).
|
||||||
|

== Setting admin announcement

Koschei can display an announcement in the web UI. This is mostly useful
to inform users about outages or other problems.

To set an announcement, run as the koschei user:

....
koschei-admin set-notice "Koschei operation is currently suspended due to scheduled Koji outage"
....

or:

....
koschei-admin set-notice "Submitting scratch builds by Koschei is currently disabled due to Fedora 23 mass rebuild"
....

To clear the announcement, run as the koschei user:

....
koschei-admin clear-notice
....

== Adding package groups

Packages can be added to one or more groups.

To add a new group named "mynewgroup", run as the koschei user:

....
koschei-admin add-group mynewgroup
....

To add a new group named "mynewgroup" and populate it with some
packages, run as the koschei user:

....
koschei-admin add-group mynewgroup pkg1 pkg2 pkg3
....

== Set package static priority

Some packages are more or less important and can have a higher or lower
priority. Any user can change the manual priority, which is reset after
the package is rebuilt. Admins can additionally set a static priority,
which is not affected by package rebuilds.

To set the static priority of package "foo" to the value "100", run as
the koschei user:

....
koschei-admin --collection f27 set-priority --static foo 100
....

== Branching a new Fedora release

After branching occurs and Koji build targets have been created, Koschei
should be updated to reflect the new state. There is a special admin
command for this purpose, which takes care of copying the configuration
and also the last builds from the history.

To branch the collection from Fedora 27 to Fedora 28, use the following:

....
koschei-admin branch-collection f27 f28 -d 'Fedora 27' -t f28 --bugzilla-version 27
....

Then you can optionally verify that the collection configuration is
correct by visiting https://apps.fedoraproject.org/koschei/collections
and examining the configuration of the newly branched collection.
281 modules/sysadmin_guide/pages/layered-image-buildsys.adoc Normal file
@@ -0,0 +1,281 @@
= Layered Image Build System

The
https://docs.pagure.org/releng/layered_image_build_service.html[Fedora
Layered Image Build System], often referred to as
https://github.com/projectatomic/osbs-client[OSBS] (OpenShift Build
Service) after the upstream project it is based on, is used to build
layered container images in the Fedora Infrastructure via Koji.

== Contents

[arabic]
. Contact Information
. Overview
. Setup
. Outage

== Contact Information

Owner::
Clement Verna (cverna)
Contact::
#fedora-admin, #fedora-releng, #fedora-noc, sysadmin-main,
sysadmin-releng
Location::
osbs-control01, osbs-master01, osbs-node01, osbs-node02,
registry.fedoraproject.org, candidate-registry.fedoraproject.org
+
osbs-control01.stg, osbs-master01.stg, osbs-node01.stg,
osbs-node02.stg, registry.stg.fedoraproject.org,
candidate-registry.stg.fedoraproject.org
+
x86_64 koji buildvms
Purpose::
Layered Container Image Builds

== Overview

The build system is set up such that Fedora layered image maintainers
submit a build to Koji via the `fedpkg container-build` command from a
`container` namespace within
https://src.fedoraproject.org/projects/container/*[DistGit]. This
triggers the build to be scheduled in
https://www.openshift.org/[OpenShift] via the
https://github.com/projectatomic/osbs-client[osbs-client] tooling, which
creates a custom
https://docs.openshift.org/latest/dev_guide/builds.html[OpenShift Build]
that uses the pre-made buildroot container image we have created. The
https://github.com/projectatomic/atomic-reactor[Atomic Reactor]
(`atomic-reactor`) utility runs within the buildroot and preps the build
container where the actual build action executes. It also handles
uploading the
https://fedoraproject.org/wiki/Koji/ContentGenerators[Content Generator]
metadata back to https://fedoraproject.org/wiki/Koji[Koji] and uploads
the built image to the candidate docker registry. This runs on a host
with iptables rules restricting access to the docker bridge; this is how
we further limit the access of the buildroot to the outside world,
verifying that all sources of information come from Fedora.

Completed layered image builds are hosted in a candidate docker
registry, which is then used to pull the image and perform tests.

== Setup

The Layered Image Build System setup is currently as follows (a more
detailed view is available in the
https://docs.pagure.org/releng/layered_image_build_service.html[RelEng
Architecture Document]):

....
=== Layered Image Build System Overview ===

+--------------+ +-----------+
| | | |
| koji hub +----+ | batcave |
| | | | |
+--------------+ | +----+------+
| |
V |
+----------------+ V
| | +----------------+
| koji builder | | +-----------+
| | | osbs-control01 +--------+ |
+-+--------------+ | +-----+ | |
| +----------------+ | | |
| | | |
| | | |
| | | |
V | | |
+----------------+ | | |
| | | | |
| osbs-master01 +------------------------------+ [ansible]
| +-------+ | | | |
+----------------+ | | | | |
^ | | | | |
| | | | | |
| V V | | |
| +-----------------+ +----------------+ | | |
| | | | | | | |
| | osbs-node01 | | osbs-node02 | | | |
| | | | | | | |
| +-----------------+ +----------------+ | | |
| ^ ^ | | |
| | | | | |
| | +-----------+ | |
| | | |
| +------------------------------------------+ |
| |
+-------------------------------------------------------------+
....

=== Deployment

From batcave you can run the following:

....
$ sudo rbac-playbook groups/osbs/deploy-cluster.yml
....

This is going to deploy the OpenShift clusters used by OSBS. Currently
the playbook deploys 2 clusters (x86_64 and aarch64). Ansible tags can
be used to deploy only one of these if needed, for example
[.title-ref]#osbs-x86-deploy-openshift#.
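
As a sketch, passing that tag through to the playbook run would look
roughly like the following (verify the exact tag names against the
playbook itself):

....
$ sudo rbac-playbook groups/osbs/deploy-cluster.yml -t osbs-x86-deploy-openshift
....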

If the openshift-ansible playbook fails, it can be easier to run it
directly from osbs-control01 and use the verbose mode.

....
$ ssh osbs-control01.iad2.fedoraproject.org
$ sudo -i
# cd /root/openshift-ansible
# ansible-playbook -i cluster-inventory playbooks/prerequisites.yml
# ansible-playbook -i cluster-inventory playbooks/deploy_cluster.yml
....

Once these playbooks have been successful, you can configure OSBS on the
cluster. For that, use the following playbook:

....
$ sudo rbac-playbook groups/osbs/configure-osbs.yml
....

When this is done, we need to get the new koji service token and update
its value in the private repository:

....
$ ssh osbs-master01.iad2.fedoraproject.org
$ sudo -i
# oc -n osbs-fedora sa get-token koji
dsjflksfkgjgkjfdl ....
....

The token needs to be saved in the private ansible repo in
[.title-ref]#files/osbs/production/x86-64-osbs-koji#. Once this is done,
you can run the builder playbook to update that token:

....
$ sudo rbac-playbook groups/buildvm.yml -t osbs
....

=== Operation

Koji Hub schedules the containerBuild on a koji builder via the
koji-containerbuild-hub plugin. The builder then submits the build in
OpenShift via the koji-containerbuild-builder plugin, which uses the
osbs-client python API that wraps the OpenShift API along with a custom
OpenShift Build JSON payload.

The Build is then scheduled in OpenShift and its logs are captured by
the koji plugins. Inside the buildroot, atomic-reactor uploads the built
container image as well as providing the metadata to koji's content
generator.

== Outage

If Koji is down, builds can't be scheduled, but repairing Koji is
outside the scope of this document.

The same applies if either the candidate-registry.fedoraproject.org or
registry.fedoraproject.org container registry is unavailable; repairing
those is also outside the scope of this document.

=== OSBS Failures

The OpenShift Build System itself can have various types of failures
that are known about; the recovery procedures are listed below.

==== Ran out of disk space

Docker uses a lot of disk space, and while the osbs-nodes have been
allotted what is considered to be ample disk space for builds (since
they are automatically cleaned up periodically), it is possible this
will run out.

To resolve this, run the following commands:

....
# These commands will clean up old/dead docker containers from old
# OpenShift Pods

$ for i in $(sudo docker ps -a | awk '/Exited/ { print $1 }'); do sudo docker rm $i; done

$ for i in $(sudo docker images -q -f 'dangling=true'); do sudo docker rmi $i; done


# This command should only be run on osbs-master01 (it won't work on the
# nodes)
#
# This command will clean up old builds and related artifacts in OpenShift
# that are older than 30 days (We can get more aggressive about this if
# necessary, the main reason these still exist is in the event we need to
# debug something. All build info we care about is stored in Koji.)

$ oadm prune builds --orphans --keep-younger-than=720h0m0s --confirm
....

==== A node is broken, how to remove it from the cluster?

If a node is having an issue, the following command will effectively
remove it from the cluster temporarily by marking it unschedulable.

In this example, we are removing osbs-node01:

....
$ oadm manage-node osbs-node01.phx2.fedoraproject.org --schedulable=false
....

==== Container Builds are unable to access resources on the network

Sometimes the Container Builds will fail and the logs will show that the
buildroot is unable to access networked resources (docker registry, dnf
repos, etc).

This is because of a bug in OpenShift v1.3.1 (the current upstream
release at the time of this writing) where an OpenVSwitch flow is left
behind when a Pod is destroyed, instead of the flow being deleted along
with the Pod.

The method to confirm the issue is unfortunately multi-step, since it's
not a cluster-wide issue but one isolated to the node experiencing the
problem.

First, in the koji createContainer task there is a log file called
openshift-incremental.log, and in there you will find a key:value pair
in some JSON output similar to the following:

....
'openshift_build_selflink': u'/oapi/v1/namespaces/default/builds/cockpit-f24-6'
....

The last field of the value, in this example `cockpit-f24-6`, is the
OpenShift build identifier. We need to ssh into `osbs-master01` and get
information about which node that build ran on.

....
# On osbs-master01
# Note: the output won't be pretty, but it gives you the info you need

$ sudo oc get build cockpit-f24-6 -o yaml | grep osbs-node
....

Once you know which machine you need, ssh into it and run the following:

....
$ sudo docker run --rm -ti buildroot /bin/bash

# now attempt to run a curl command

$ curl https://google.com
# This should get refused, but if this node is experiencing the networking
# issue then this command will hang and eventually time out
....

How to fix:

Reboot the affected node that's experiencing the issue. When the node
comes back up, OpenShift will rebuild the flow tables on OpenVSwitch and
things will be back to normal.

....
systemctl reboot
....
27 modules/sysadmin_guide/pages/librariesio2fedmsg.adoc Normal file
@@ -0,0 +1,27 @@
= librariesio2fedmsg SOP

librariesio2fedmsg is a small service that converts Server-Sent Events
from https://libraries.io/[libraries.io] to fedmsgs.

librariesio2fedmsg is an instance of
https://github.com/fedora-infra/sse2fedmsg[sse2fedmsg] using the
http://firehose.libraries.io/events[libraries.io firehose]. It runs on
https://os.fedoraproject.org/[OpenShift] and publishes its fedmsgs
through the busgateway01.phx2.fedoraproject.org relay using the
`org.fedoraproject.prod.sse2fedmsg.librariesio` topic.

== Updating

sse2fedmsg is installed directly from its git repository, so once a new
release is tagged in sse2fedmsg, just update the tag in the git URL
provided to pip in the
https://infrastructure.fedoraproject.org/infra/ansible/roles/openshift-apps/librariesio2fedmsg/files/[build
config].
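
For illustration, such a pip-style git URL with a tag pin typically
looks like the line below; the tag value is a placeholder, and the
authoritative line lives in the build config linked above:

....
git+https://github.com/fedora-infra/sse2fedmsg.git@<new-tag>
....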

== Deploying

Run the playbook to apply the new OpenShift configuration:

....
$ sudo rbac-playbook openshift-apps/librariesio2fedmsg.yml
....
77 modules/sysadmin_guide/pages/linktracking.adoc Normal file
@@ -0,0 +1,77 @@
= Link tracking

Using link tracking is an easy way for us to find out how people are
getting to our download page. People might click over to our download
page from any of a number of areas, and knowing the relative usage of
those links can help us understand which of the materials we're
producing are more effective than others.

== Adding links

Each link should be constructed by adding ? to the URL, followed by a
short code that includes:

* an indicator for the link source (such as the wiki release notes)
* an indicator for the specific Fedora release (such as F15 for the
final, or F15a for the Alpha test release)

So a link to get.fp.o from the one-page release notes would become
http://get.fedoraproject.org/?opF15.

== FAQ

I want to copy a link to my status update for social networking, or my
blog.::
If you're posting a status update to identi.ca, for example, use the
link tracking code for status updates. Don't copy a link straight from
an announcement that includes link tracking from the announcement. You
can copy the link itself, but remember to change the portion after the
? to instead use the st code for status updates and blogs, followed by
the Fedora release version (such as F16a, F16b, or F16), like this:
+
....
http://fedoraproject.org/get-prerelease?stF16a
....

I want to point people to the announcement from my blog. Should I use
the announcement link tracking code?::
The actual URL link itself is the announcement URL. Add the link
tracking code for blogs, which would start with ?st and end with the
Fedora release version, like this:
+
....
http://fedoraproject.org/wiki/F16_release_announcement?stF16a
....

== The codes

[NOTE]
.Note
====
Additions to this table are welcome.
====

[cols=",",options="header",]
|===
|Link source |Code
|Email announcements |an
|Wiki announcements |wkan
|Front page |fp
|Front page of wiki |wkfp
|The press release Red Hat makes |rhpr
|http://redhat.com/fedora |rhf
|Test phase release notes on the wiki |wkrn
|Official release notes |rn
|Official installation guide |ig
|One-page release notes |op
|Status links (blogs, social media) |st
|===
148 modules/sysadmin_guide/pages/loopabull.adoc Normal file
@@ -0,0 +1,148 @@
= Loopabull

https://github.com/maxamillion/loopabull[Loopabull] is an event-driven
https://www.ansible.com/[Ansible]-based automation engine. It is used
for various tasks and was originally slated for
https://pagure.io/releng-automation[Release Engineering Automation].

== Contents

[arabic]
. Contact Information
. Overview
. Setup
. Outage

== Contact Information

Owner::
Adam Miller (maxamillion), Pierre-Yves Chibon (pingou)
Contact::
#fedora-admin, #fedora-releng, #fedora-noc, sysadmin-main,
sysadmin-releng
Location::
loopabull01.phx2.fedoraproject.org
loopabull01.stg.phx2.fedoraproject.org
Purpose::
Event Driven Automation of tasks within the Fedora Infrastructure and
Fedora Release Engineering

== Overview

The https://github.com/maxamillion/loopabull[loopabull] system is set up
such that when an event takes place within the infrastructure and a
http://www.fedmsg.com/en/latest/[fedmsg] is sent, loopabull consumes
that message, triggers an https://www.ansible.com/[Ansible]
http://docs.ansible.com/ansible/playbooks.html[playbook] that shares a
name with the fedmsg topic, and provides the payload of the fedmsg to
the playbook as
https://github.com/ansible/ansible/blob/devel/docs/man/man1/ansible-playbook.1.asciidoc.in[extra
variables].
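
As a hypothetical illustration of that mechanism (the topic, variable
names, and values here are invented for the example, not taken from the
Fedora configuration), the dispatch for a message would be roughly
equivalent to:

....
$ ansible-playbook org.fedoraproject.prod.buildsys.build.state.change.yml \
    --extra-vars '{"msg": {"name": "gcc", "new": 1}}'
....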

== Setup

The setup is relatively simple; the Overview above describes it, and a
more detailed version can be found in the [.title-ref]#releng docs#.

....
+-----------------+ +-------------------------------+
| | | |
| fedmsg +------------>| Looper |
| | | (fedmsg handler plugin) |
| | | |
+-----------------+ +-------------------------------+
|
|
+-------------------+ |
| | |
| | |
| Loopabull +<-------------+
| (Event Loop) |
| |
+---------+---------+
|
|
|
|
V
+----------+-----------+
| |
| ansible-playbook |
| |
+----------------------+
....

=== Deployment

Loopabull is deployed on two hosts, one for the production instance:
`loopabull01.prod.phx2.fedoraproject.org`, and one for the staging
instance: `loopabull01.stg.phx2.fedoraproject.org`.

Each host is running loopabull with 5 workers reacting to fedmsg
notifications.

== Expanding loopabull

Expanding loopabull's usage is documented at:
https://pagure.io/Fedora-Infra/loopabull-tasks

== Outage

In the event that loopabull isn't responding or isn't running playbooks
as it should be, the following scenarios should be approached.

=== What is going on?

There are a few commands that may help in figuring out what is going on:

* Check the status of the different services:

....
systemctl | grep loopabull
....

* Follow the logs of the different services:

....
journalctl -lfu loopabull -u loopabull@1 -u loopabull@2 -u loopabull@3 \
    -u loopabull@4 -u loopabull@5
....

If a playbook returns a non-zero error code, the worker running it will
be stopped. If that happens, you may want to carefully review the logs
to assess what led to this situation so it can be prevented in the
future.

* Monitoring the queue size

The loopabull service listens to the fedmsg bus and puts the messages,
as they come, into a rabbitmq/amqp queue for the workers to process. If
you want to see the number of messages pending to be processed by the
workers, you can check the queue size using:

....
rabbitmqctl list_queues
....

The output will be something like:

....
Listing queues ...
workers 489989
...done.
....

Where `workers` is the name of the queue used by loopabull and `489989`
is the number of messages in that queue (yes, that day we were
recovering from a several-day long outage).

=== Network Interruption

Sometimes, if the network is interrupted, the loopabull service will
hang because the fedmsg listener will hold a dead socket open. The
service and its workers simply need to be restarted at that point.

....
systemctl restart loopabull loopabull@1 loopabull@2 loopabull@3 \
    loopabull@4 loopabull@5
....
115 modules/sysadmin_guide/pages/mailman.adoc Normal file
@@ -0,0 +1,115 @@
= Mailman Infrastructure SOP

== Contact Information

Owner::
Fedora Infrastructure Team
Contact::
#fedora-admin, sysadmin-main, sysadmin-tools, sysadmin-hosted
Location::
phx2
Servers::
mailman01, mailman02, mailman01.stg
Purpose::
Provides mailing list services.

== Description

Mailing list services for Fedora projects are located on the
mailman01.phx2.fedoraproject.org server.

== Common Tasks

=== Creating a new mailing list

* Log into mailman01
* `sudo -u mailman mailman3 create <listname>@lists.fedora(project|hosted).org --owner <username>@fedoraproject.org --notify`
(a concrete example is shown after this list)
+
[IMPORTANT]
.Important
====
Please make sure to add a valid description to the newly created list
(to avoid [no description available] on the listinfo index).
====
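
For instance, creating a hypothetical list `mynewlist` owned by user
`jdoe` would look like this (both names are placeholders):

....
sudo -u mailman mailman3 create mynewlist@lists.fedoraproject.org --owner jdoe@fedoraproject.org --notify
....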

== Removing content from archives

We don't.

It's not easy to remove content from the archives, and it's generally
useless as well, because the archives are often mirrored by third
parties as well as being in the INBOXs of all of the people on the
mailing list at that time. Here's an example message to send to someone
who requests removal of archived content:

....
Greetings,

We're sorry to say that we don't remove content from the mailing list archives.
Doing so is a non-trivial amount of work and usually doesn't achieve anything
because the content has already been disseminated to a wide audience that we do
not control. The emails have gone out to all of the subscribers of the mailing
list at that time and also (for a great many of our lists) been copied by third
parties (for instance: http://markmail.org and http://gmane.org).

Sorry we cannot help further,

Mailing lists and their owners
....

== Checking Membership

Do you need to check who owns a certain mailing list without having to
search around on the lists' frontpages?

Mailman has a nice tool that will help us list members by type.

Get a full list of all the mailing lists hosted on the server:

....
sudo -u mailman mailman3 lists
....

Get the list of regular members for example@example.com:

....
sudo -u mailman mailman3 members example@example.com
....

Get the list of owners for example@example.com:

....
sudo -u mailman mailman3 members -R owner example@example.com
....

Get the list of moderators for example@example.com:

....
sudo -u mailman mailman3 members -R moderator example@example.com
....

== Troubleshooting and Resolution

=== List Administration

Specific users are marked as 'site admins' in the database.

Please file an issue if you feel you need to have this access.

=== Restart Procedure

If the server needs to be restarted, mailman should come back on its
own. Otherwise, each service on it can be restarted:

....
sudo service mailman3 restart
sudo service postfix restart
....

== How to delete a mailing list

Delete a list, but keep the archives:

....
sudo -u mailman mailman3 remove <listname>
....
53 modules/sysadmin_guide/pages/making-ssl-certificates.adoc Normal file
@@ -0,0 +1,53 @@
= SSL Certificate Creation SOP

Every now and then you will need to create an SSL certificate for a
Fedora Service.

== Creating a CSR for a new server

Know your hostname, i.e. `lists.fedoraproject.org`:

....
export ssl_name=<fqdn of host>
....

Create the key and CSR. 8192-bit keys do not work with various boxes, so
we currently use 4096:

....
openssl genrsa -out ${ssl_name}.pem 4096
openssl req -new -key ${ssl_name}.pem -out ${ssl_name}.csr

Country Name (2 letter code) [XX]:US
State or Province Name (full name) []:NM
Locality Name (eg, city) [Default City]:Raleigh
Organization Name (eg, company) [Default Company Ltd]:Red Hat
Organizational Unit Name (eg, section) []:Fedora Project
Common Name (eg, your name or your server's hostname)
[]:lists.fedorahosted.org
Email Address []:admin@fedoraproject.org

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:
....
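
Before sending the CSR off, it can be worth sanity-checking its
contents; a standard openssl invocation for that (an addition to the
original procedure, not part of it) is:

....
# print the CSR's subject and key details for review
openssl req -noout -text -in ${ssl_name}.csr
....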

Send the CSR to the signing authority and wait for a cert. Place all
three files into the private directory so that you can make certs in the
future.

== Creating a temporary self-signed certificate

Repeat the steps above, but add in the following:

....
openssl x509 -req -days 30 -in ${ssl_name}.csr -signkey ${ssl_name}.pem -out ${ssl_name}.cert
Signature ok
subject=/C=US/ST=NM/L=Raleigh/O=Red Hat/OU=Fedora
Project/CN=lists.fedorahosted.org/emailAddress=admin@fedoraproject.org
Getting Private key
....

We only want a self-signed certificate to be good for a short time, so
30 days sounds good.
418 modules/sysadmin_guide/pages/massupgrade.adoc Normal file
@@ -0,0 +1,418 @@
= Mass Upgrade Infrastructure SOP

Every once in a while, we need to apply mass upgrades to our servers for
various security and other upgrades.

== Contents

[arabic]
. Contact Information
. Preparation
. Staging
. Special Considerations
+
____
* Disable builders
* Post reboot action
* Schedule autoqa01 reboot
* Bastion01 and Bastion02 and openvpn server
* Special yum directives
____
. Update Leader
. Group A reboots
. Group B reboots
. Group C reboots
. Doing the upgrade
. Doing the reboot
. Aftermath

== Contact Information

Owner:::
Fedora Infrastructure Team
Contact:::
#fedora-admin, sysadmin-main, infrastructure@lists.fedoraproject.org,
#fedora-noc
Location:::
All over the world.
Servers:::
all
Purpose:::
Apply kernel/other upgrades to all of our servers

== Preparation

[arabic]
. Determine which host group you are going to be doing updates/reboots
on.
+
Group "A"::
servers that end users will see or note being down, and anything that
depends on them.
Group "B"::
servers that contributors will see or note being down, and anything
that depends on them.
Group "C"::
servers that infrastructure will notice are down, or that are redundant
enough that some can be rebooted while others take the load.
. Appoint an 'Update Leader' for the updates.
. Follow the Outage Infrastructure SOP and send advance notification
to the appropriate lists. Try to schedule the update at a time when many
admins are around to help/watch for problems and when the impact on the
affected group is lower. Do NOT do multiple groups on the same day if
possible.
. Plan an order for rebooting the machines considering two factors:
+
____
* Location of systems on the kvm or xen hosts. [You will normally reboot
all systems on a host together.]
* Impact of systems going down on other services, operations, and users.
Since the database servers and nfs servers are the backbone of many
other systems, they and the systems on the same xen boxes would be
rebooted before other boxes.
____
. To aid in organizing a mass upgrade/reboot with many people helping,
it may help to create a checklist of machines in a gobby document.
. Schedule downtime in nagios.
. Make doubly sure that various app owners are aware of the reboots.

== Staging

____
Any updates that can be tested in staging or a pre-production
environment should be tested there first, including new kernels, updates
to core database applications / libraries, web applications, libraries,
etc.
____

== Special Considerations

While this may not be a complete list, here are some special things that
must be taken into account before rebooting certain systems:

=== Disable builders

Before the following machines are rebooted, all koji builders should be
disabled and all running jobs allowed to complete:

____
* db04
* nfs01
* kojipkgs02
____

Builders can be removed from koji, updated, and re-added. Use:

....
koji disable-host NAME

and

koji enable-host NAME
....

[NOTE]
.Note
====
You must be a koji admin.
====

Additionally, rel-eng and builder boxes may need a special version of
rpm. Make sure to check with rel-eng on any rpm upgrades for them.

=== Post reboot action

The following machines require post-boot actions (mostly entering
passphrases). Make sure admins that have the passphrases are on hand for
the reboot:

____
* backup-2 (LUKS passphrase on boot)
* sign-vault01 (NSS passphrase for sigul service)
* sign-bridge01 (NSS passphrase for sigul bridge service)
* serverbeach* (requires fixing firewall rules):
____

Each serverbeach host needs 3 or 4 iptables rules added anytime it's
rebooted or libvirt is upgraded:

....
iptables -I FORWARD -o virbr0 -j ACCEPT
iptables -I FORWARD -i virbr0 -j ACCEPT
iptables -t nat -I POSTROUTING -s 192.168.122.3/32 -j SNAT --to-source 66.135.62.187
....

[NOTE]
.Note
====
The source is the internal guest IP; the to-source is the external IP
that maps to that guest IP. If there are multiple guests, each one needs
the above SNAT rule inserted.
====

=== Schedule autoqa01 reboot

There is currently an autoqa01.c host on cnode01. Check with QA folks
before rebooting this guest/host.

=== Bastion01 and Bastion02 and openvpn server

We need one of the bastion machines to be up to provide openvpn for all
machines. Before rebooting bastion02, modify the
`manifests/nodes/bastion0*.phx2.fedoraproject.org.pp` files to start the
openvpn server on bastion01, wait for all clients to re-connect, reboot
bastion02, and then revert back to it as the openvpn hub.

=== Special yum directives

Sometimes we will wish to exclude or otherwise modify the yum.conf on a
machine. For this purpose, all machines have an include, making them
read
http://infrastructure.fedoraproject.org/infra/hosts/FQHN/yum.conf.include
from the infrastructure repo. If you need to make such changes, add them
to the infrastructure repo before doing updates.
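
As an illustration, such a yum.conf.include fragment might hold back
certain packages on one host with a standard yum exclude directive (the
package pattern here is a made-up example):

....
# yum.conf.include for one host (illustrative)
exclude=kernel*
....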

== Update Leader

Each update should have a Leader appointed. This person will be in
charge of doing any read-write operations and delegating tasks to
others. If you aren't specifically asked by the Leader to reboot or
change something, please don't. The Leader will assign machine groups to
people to reboot, or ask specific people to look at machines that didn't
come back up from reboot or aren't working right after reboot. It's
important to avoid multiple people operating on a single machine in a
read-write manner and interfering with each other's changes.

== Group A reboots

Group A machines are end-user critical ones. Outages here should be
planned at least a week in advance and announced to the announce list.

List of machines currently in the A group (note: this is going to be
automated):

These hosts are grouped based on the virt host they reside on:

* torrent02.fedoraproject.org
* ibiblio02.fedoraproject.org
* people03.fedoraproject.org
* ibiblio03.fedoraproject.org
* collab01.fedoraproject.org
* serverbeach09.fedoraproject.org
* db05.phx2.fedoraproject.org
* virthost03.phx2.fedoraproject.org
* db01.phx2.fedoraproject.org
* virthost04.phx2.fedoraproject.org
* db-fas01.phx2.fedoraproject.org
* proxy01.phx2.fedoraproject.org
* virthost05.phx2.fedoraproject.org
* ask01.phx2.fedoraproject.org
* virthost06.phx2.fedoraproject.org

These are the rest:

* bapp02.phx2.fedoraproject.org
* bastion02.phx2.fedoraproject.org
* app05.fedoraproject.org
* backup02.fedoraproject.org
* bastion01.phx2.fedoraproject.org
* fas01.phx2.fedoraproject.org
* fas02.phx2.fedoraproject.org
* log02.phx2.fedoraproject.org
* memcached03.phx2.fedoraproject.org
* noc01.phx2.fedoraproject.org
* ns02.fedoraproject.org
* ns04.phx2.fedoraproject.org
* proxy04.fedoraproject.org
* smtp-mm03.fedoraproject.org
* batcave02.phx2.fedoraproject.org
* mm3test.fedoraproject.org
* packages02.phx2.fedoraproject.org

== Group B reboots

This group contains machines that contributors use. Announcements of
outages here should be made at least a week in advance and sent to the
devel-announce list.

These hosts are grouped based on the virt host they reside on:

* db04.phx2.fedoraproject.org
* bvirthost01.phx2.fedoraproject.org
* nfs01.phx2.fedoraproject.org
* bvirthost02.phx2.fedoraproject.org
* pkgs01.phx2.fedoraproject.org
* bvirthost03.phx2.fedoraproject.org
* kojipkgs02.phx2.fedoraproject.org
* bvirthost04.phx2.fedoraproject.org

These are the rest:

* koji04.phx2.fedoraproject.org
* releng03.phx2.fedoraproject.org
* releng04.phx2.fedoraproject.org

== Group C reboots

Group C contains machines that infrastructure uses, or that can be
rebooted in such a way as to continue to provide services to others via
multiple machines. Outages here should be announced on the
infrastructure list.

Group C hosts that have proxy servers on them:

* proxy02.fedoraproject.org
* ns05.fedoraproject.org
* hosted-lists01.fedoraproject.org
* internetx01.fedoraproject.org
* app01.dev.fedoraproject.org
* darkserver01.dev.fedoraproject.org
* fakefas01.fedoraproject.org
* proxy06.fedoraproject.org
* osuosl01.fedoraproject.org
* proxy07.fedoraproject.org
* bodhost01.fedoraproject.org
* proxy03.fedoraproject.org
* smtp-mm02.fedoraproject.org
* tummy01.fedoraproject.org
* app06.fedoraproject.org
* noc02.fedoraproject.org
* proxy05.fedoraproject.org
* smtp-mm01.fedoraproject.org
* telia01.fedoraproject.org
* app08.fedoraproject.org
* proxy08.fedoraproject.org
* coloamer01.fedoraproject.org

Other Group C hosts:

* ask01.stg.phx2.fedoraproject.org
* app02.stg.phx2.fedoraproject.org
* proxy01.stg.phx2.fedoraproject.org
* releng01.stg.phx2.fedoraproject.org
* value01.stg.phx2.fedoraproject.org
* virthost13.phx2.fedoraproject.org
* db-fas01.stg.phx2.fedoraproject.org
* pkgs01.stg.phx2.fedoraproject.org
* packages01.stg.phx2.fedoraproject.org
* virthost11.phx2.fedoraproject.org
* app01.stg.phx2.fedoraproject.org
* koji01.stg.phx2.fedoraproject.org
* db02.stg.phx2.fedoraproject.org
* fas01.stg.phx2.fedoraproject.org
* virthost10.phx2.fedoraproject.org
* autoqa01.qa.fedoraproject.org
* autoqa-stg01.qa.fedoraproject.org
* bastion-comm01.qa.fedoraproject.org
* batcave-comm01.qa.fedoraproject.org
* virthost-comm01.qa.fedoraproject.org
* compose-x86-01.phx2.fedoraproject.org
* compose-x86-02.phx2.fedoraproject.org
* download01.phx2.fedoraproject.org
* download02.phx2.fedoraproject.org
* download03.phx2.fedoraproject.org
* download04.phx2.fedoraproject.org
* download05.phx2.fedoraproject.org
* download-rdu01.vpn.fedoraproject.org
* download-rdu02.vpn.fedoraproject.org
* download-rdu03.vpn.fedoraproject.org
* fas03.phx2.fedoraproject.org
* secondary01.phx2.fedoraproject.org
* memcached04.phx2.fedoraproject.org
* virthost01.phx2.fedoraproject.org
* app02.phx2.fedoraproject.org
* value03.phx2.fedoraproject.org
* virthost07.phx2.fedoraproject.org
* app03.phx2.fedoraproject.org
* value04.phx2.fedoraproject.org
* ns03.phx2.fedoraproject.org
* darkserver01.phx2.fedoraproject.org
* virthost08.phx2.fedoraproject.org
* app04.phx2.fedoraproject.org
* packages02.phx2.fedoraproject.org
* virthost09.phx2.fedoraproject.org
* hosted03.fedoraproject.org
* serverbeach06.fedoraproject.org
* hosted04.fedoraproject.org
* serverbeach07.fedoraproject.org
* collab02.fedoraproject.org
* serverbeach08.fedoraproject.org
* dhcp01.phx2.fedoraproject.org
* relepel01.phx2.fedoraproject.org
* sign-bridge02.phx2.fedoraproject.org
* koji03.phx2.fedoraproject.org
* bvirthost05.phx2.fedoraproject.org
* (disable each builder in turn, update, and re-enable)
* ppc11.phx2.fedoraproject.org
* ppc12.phx2.fedoraproject.org
* backup03

== Doing the upgrade

If possible, system upgrades should be done in advance of the reboot
(with relevant testing of the new packages on staging). To do the
upgrades, make sure that the Infrastructure RHEL repo is updated as
necessary to pull in the new packages (see the Infrastructure Yum Repo
SOP).

On batcave01, as root run:

....
func-yum [--host=hostname] update
....

[NOTE]
.Note
====
--host can be specified multiple times and takes wildcards.
====

Ping people as necessary if you are unsure about any packages.

Additionally, you can see which machines still need a reboot with:

....
sudo func-command --timeout=10 --oneline /usr/local/bin/needs-reboot.py | grep yes
....

You can also see which machines would need a reboot if all updates were
applied with:

....
sudo func-command --timeout=10 --oneline /usr/local/bin/needs-reboot.py after-updates | grep yes
....

== Doing the reboot

In the order determined above, reboots will usually be grouped by the
virtualization hosts that the servers are on. You can see the guests per
virt host on batcave01 in /var/log/virthost-lists.out.

To reboot sets of boxes based on which virthost they are on, we've
written a special script which facilitates it:

....
func-vhost-reboot virthost-fqdn
....

For example:

....
sudo func-vhost-reboot virthost13.phx2.fedoraproject.org
....

== Aftermath

[arabic]
. Make sure that everything's running fine.
. Re-enable nagios notifications as needed.
. Make sure to perform any manual post-boot setup (such as entering
passphrases for encrypted volumes).
. Close the outage ticket.

=== Non virthost reboots

If you need to reboot specific hosts and make sure they recover,
consider using:

....
sudo func-host-reboot hostname hostname1 hostname2 ...
....

If you want to reboot the hosts one at a time, waiting for each to come
back before rebooting the next, pass -o to func-host-reboot.
79 modules/sysadmin_guide/pages/mastermirror.adoc Normal file
@@ -0,0 +1,79 @@
= Master Mirror Infrastructure SOP

== Contents

[arabic]
. Contact Information
. PHX Master Mirror Setup
. RDU I2 Master Mirror Setup
. Raising Issues

== Contact Information

Owner:::
Red Hat IS
Contact:::
#fedora-admin, Red Hat ticket
Location:::
PHX
Servers:::
server[1-5].download.phx.redhat.com
Purpose:::
Provides the master mirrors for Fedora distribution

== PHX Master Mirror Setup

The master mirrors are accessible from the outside as:

....
download1.fedora.redhat.com -> CNAME to download3.fedora.redhat.com
download2.fedora.redhat.com -> currently no DNS entry
download3.fedora.redhat.com -> 209.132.176.20
download4.fedora.redhat.com -> 209.132.176.220
download5.fedora.redhat.com -> 209.132.176.221
....

download.fedora.redhat.com is a round robin to the above IPs.

The external IPs correspond to internal load balancer IPs that balance
between server[1-5]:

....
209.132.176.20 -> 10.9.24.20
209.132.176.220 -> 10.9.24.220
209.132.176.221 -> 10.9.24.221
....

The load balancers then balance between the below Fedora IPs on the
rsync servers:

....
10.8.24.21 (fedora1.download.phx.redhat.com) - server1.download.phx.redhat.com
10.8.24.22 (fedora2.download.phx.redhat.com) - server2.download.phx.redhat.com
10.8.24.23 (fedora3.download.phx.redhat.com) - server3.download.phx.redhat.com
10.8.24.24 (fedora4.download.phx.redhat.com) - server4.download.phx.redhat.com
10.8.24.25 (fedora5.download.phx.redhat.com) - server5.download.phx.redhat.com
....

== RDU I2 Master Mirror Setup

[NOTE]
.Note
====
This section is awaiting confirmation from RH - information here may not
be 100% accurate yet.
====

download-i2.fedora.redhat.com (rhm-i2.redhat.com) is a round robin
between:

....
204.85.14.3 - 10.11.45.3
204.85.14.5 - 10.11.45.5
....

== Raising Issues

Issues with any of this setup should be raised in a helpdesk ticket.
217 modules/sysadmin_guide/pages/mbs.adoc Normal file
@@ -0,0 +1,217 @@
= Module Build Service Infra SOP

The MBS is a build orchestrator on top of Koji for "modules".

https://fedoraproject.org/wiki/Changes/ModuleBuildService

== Contact Information

Owner::
Release Engineering Team, Infrastructure Team
Contact::
#fedora-modularity, #fedora-admin, #fedora-releng
Persons::
jkaluza, fivaldi, breilly, mikem
Location::
Phoenix
Public addresses::
* mbs.fedoraproject.org
Servers::
* mbs-frontend0[1-2].phx2.fedoraproject.org
* mbs-backend01.phx2.fedoraproject.org
Purpose::
Build modules for Fedora.

== Description

Users submit builds to mbs.fedoraproject.org referencing their modulemd
file in dist-git. (In the future, users will not submit their own module
builds. The [.title-ref]#freshmaker# daemon (running in infrastructure)
will watch for .spec file changes and modulemd.yaml file changes -- it
will submit the relevant module builds to the MBS on behalf of users.)

The request to build a module is received by the MBS flask app running
on the mbs-frontend nodes.

Cursory validation of the submitted modulemd is performed on the
frontend: are the named packages valid? Are their branches valid? The
MBS keeps a copy of the modulemd and appends additional data describing
which branches pointed to which hashes at the time of submission.

A fedmsg from the frontend triggers the backend to start building the
module. First, tags and build/srpm-build groups are created. Then, a
module-build-macros package is synthesized and submitted as an srpm
build. When it is complete and available in the buildroot, the rest of
the rpm builds are submitted.

These are grouped and limited in two ways (a configuration sketch
follows the list):

* First, there is a global NUM_CONCURRENT_BUILDS config option that
controls how many koji builds the MBS is allowed to have open at any
time. It serves as a throttle.
* Second, a given module may specify that its components should have a
certain "build order". If there are 50 components, it may say that the
first 25 of them are in one buildorder batch, and the second 25 are in
another buildorder batch. The first batch will be submitted and, when
complete, tagged back into the buildroot. Only after they are available
will the second batch of 25 begin.
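
A sketch of how that throttle might appear in the configuration template
this SOP mentions later
([.title-ref]#roles/mbs/common/templates/config.py#); the value shown is
illustrative, not the production setting:

....
# config.py (excerpt; illustrative value)
NUM_CONCURRENT_BUILDS = 10
....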

When the last component is complete, the MBS backend marks the build as
"done", and then marks it again as "ready". (There is currently no
meaning to the "ready" state beyond "done". We reserved that state for
future CI interactions.)
== Observing MBS Behavior
|
||||||
|
|
||||||
|
=== The mbs-build command
|
||||||
|
|
||||||
|
The https://pagure.io/fm-orchestrator[fm-orchestrator repo] and the
|
||||||
|
[.title-ref]#module-build-service# package provide an
|
||||||
|
[.title-ref]#mbs-build# command with a few subcommands. For general
|
||||||
|
help:
|
||||||
|
|
||||||
|
....
|
||||||
|
$ mbs-build --help
|
||||||
|
....
|
||||||
|
|
||||||
|
To generate a report of all currently active module builds:
|
||||||
|
|
||||||
|
....
|
||||||
|
$ mbs-build overview
|
||||||
|
ID State Submitted Components Owner Module
|
||||||
|
---- ------- -------------------- ------------ ------- -----------------------------------
|
||||||
|
570 build 2017-06-01T17:18:11Z 35/134 psabata shared-userspace-f26-20170601141014
|
||||||
|
569 build 2017-06-01T14:18:04Z 14/15 mkocka mariadb-f26-20170601141728
|
||||||
|
....
|
||||||
|
|
||||||
|
To generate a report of an individual module build, given its ID:

....
$ mbs-build info 569
NVR                                             State     Koji Task
----------------------------------------------  --------  ------------------------------------------------------------
libaio-0.3.110-7.module_414736cc                COMPLETE  https://koji.fedoraproject.org/koji/taskinfo?taskID=19803741
                                                BUILDING  https://koji.fedoraproject.org/koji/taskinfo?taskID=19804081
libedit-3.1-17.20160618cvs.module_414736cc      COMPLETE  https://koji.fedoraproject.org/koji/taskinfo?taskID=19803745
compat-openssl10-1.0.2j-6.module_414736cc       COMPLETE  https://koji.fedoraproject.org/koji/taskinfo?taskID=19803746
policycoreutils-2.6-5.module_414736cc           COMPLETE  https://koji.fedoraproject.org/koji/taskinfo?taskID=19803513
selinux-policy-3.13.1-255.module_414736cc       COMPLETE  https://koji.fedoraproject.org/koji/taskinfo?taskID=19803748
systemtap-3.1-5.module_414736cc                 COMPLETE  https://koji.fedoraproject.org/koji/taskinfo?taskID=19803742
libcgroup-0.41-11.module_ea91dfb0               COMPLETE  https://koji.fedoraproject.org/koji/taskinfo?taskID=19685834
net-tools-2.0-0.42.20160912git.module_414736cc  COMPLETE  https://koji.fedoraproject.org/koji/taskinfo?taskID=19804010
time-1.7-52.module_414736cc                     COMPLETE  https://koji.fedoraproject.org/koji/taskinfo?taskID=19803747
desktop-file-utils-0.23-3.module_ea91dfb0       COMPLETE  https://koji.fedoraproject.org/koji/taskinfo?taskID=19685835
libselinux-2.6-6.module_ea91dfb0                COMPLETE  https://koji.fedoraproject.org/koji/taskinfo?taskID=19685833
module-build-macros-0.1-1.module_414736cc       COMPLETE  https://koji.fedoraproject.org/koji/taskinfo?taskID=19803333
checkpolicy-2.6-1.module_414736cc               COMPLETE  https://koji.fedoraproject.org/koji/taskinfo?taskID=19803514
dbus-glib-0.108-2.module_ea91dfb0               COMPLETE  https://koji.fedoraproject.org/koji/taskinfo?taskID=19685836
....

To actively watch a module build in flight, given its ID:

....
$ mbs-build watch 570
Still building:
   libXrender https://koji.fedoraproject.org/koji/taskinfo?taskID=19804885
   libXdamage https://koji.fedoraproject.org/koji/taskinfo?taskID=19805153
Failed:
   libXxf86vm https://koji.fedoraproject.org/koji/taskinfo?taskID=19804903

Summary:
   2 components in the BUILDING state
   34 components in the COMPLETE state
   1 components in the FAILED state
   97 components in the undefined state
psabata's build #570 of shared-userspace-f26 is in the "build" state
....

=== The releng repo

There are more tools located in the [.title-ref]#scripts/mbs/# directory
of the releng repo: https://pagure.io/releng/blob/master/f/scripts/mbs

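To poke at those tools locally, cloning the repo is enough (a minimal
sketch; the clone URL is inferred from the Pagure page above):

....
$ git clone https://pagure.io/releng.git
$ ls releng/scripts/mbs/
....
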
== Cancelling a module build

Users can cancel their own module builds with:

....
$ mbs-build cancel $BUILD_ID
....

MBS admins can also cancel builds submitted by any user.

[NOTE]
====
MBS admins are defined as members of the groups listed in the
[.title-ref]#ADMIN_GROUPS# configuration option in
[.title-ref]#roles/mbs/common/templates/config.py#.
====

== Logs

The frontend logs are on mbs-frontend0[1-2] in
`/var/log/httpd/error_log`.

The backend logs are on mbs-backend01. Look in the journal for the
[.title-ref]#fedmsg-hub# service.

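For example, to follow the backend logs live (a sketch, assuming the
systemd unit is named fedmsg-hub as above):

....
$ sudo journalctl -u fedmsg-hub -f
....
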
== Upgrading

The package in question is [.title-ref]#module-build-service#. Please
use the [.title-ref]#playbooks/manual/upgrade/mbs.yml# playbook.

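A typical invocation from a checkout of the ansible repo might look
like this (a sketch; inventory, limits, and any site-specific wrapper
are omitted):

....
$ ansible-playbook playbooks/manual/upgrade/mbs.yml
....
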
== Managing Bootstrap Modules

In general, modules use other modules to define their buildroots, but
what defines the buildroot of the very first module? For this, we use
manually selected "bootstrap" modules. For some history on this, see
these tickets:

* https://pagure.io/releng/issue/6791
* https://pagure.io/fedora-infrastructure/issue/6097

The tag for a bootstrap module needs to be manually created and
populated by Release Engineering. Builds for that tag are curated and
selected from other Fedora tags, taking care to add only as many builds
as are needed.

The existence of the tag is not enough for the bootstrap module to be
usable by MBS. MBS discovers the bootstrap module as a possible
dependency for other yet-to-be-built modules by querying PDC. During
normal operation, these entries in PDC are automatically created by
pdc-updater on pdc-backend02, but for the bootstrap tag they need to be
manually created and linked to the new bootstrap tag.

The fm-orchestrator repo has a
https://pagure.io/fm-orchestrator/blob/master/f/bootstrap[bootstrap/]
directory with tools that we used to create the first bootstrap entries.
If you need to create a new bootstrap entry or modify an existing one,
use these tools for inspiration. They are not general purpose and will
likely have to be modified to do what is needed. In particular, see
[.title-ref]#import-to-pdc.py# as an example of creating a new entry and
[.title-ref]#activate-in-pdc.py# as an example of editing an existing
entry.

To use them, you'll need a token with rights to speak to staging/prod
PDC. See the PDC SOP for information on client configuration in
[.title-ref]#/etc/pdc.d/# and on where to find those tokens.

== Things that could go wrong

=== Overloading koji

If koji is overloaded, it should be acceptable to _stop_ the fedmsg-hub
daemon on mbs-backend01 at any time.

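For example (a sketch, assuming fedmsg-hub runs as a systemd unit of
that name on the backend):

....
$ sudo systemctl stop fedmsg-hub
....
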
[NOTE]
====
As builds finish in koji, they will be _missed_ by the backend, but
when it restarts it should find them in datagrepper. If that fails as
well, the MBS backend has a poller that starts about 5 minutes after
startup and checks koji for anything it may have missed, at which point
it will resume functioning.
====

If koji continues to be overloaded after startup, try decreasing the
[.title-ref]#NUM_CONCURRENT_BUILDS# option in the config file in
[.title-ref]#roles/mbs/common/templates/#.

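To check the current value before lowering it, something like the
following should work (a sketch; the exact file name under that
templates directory and the value shown are illustrative):

....
$ grep NUM_CONCURRENT_BUILDS roles/mbs/common/templates/config.py
NUM_CONCURRENT_BUILDS = 5
....
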
modules/sysadmin_guide/pages/memcached.adoc (new file)

= Memcached Infrastructure SOP

Our memcached setup is currently only used for wiki sessions. With
mediawiki, sessions stored in files over NFS or in the DB are very slow.
Memcached is a non-blocking solution for our session storage.

== Contents

[arabic]
. Contact Information
. Checking Status
. Flushing Memcached
. Restarting Memcached
. Configuring Memcached

== Contact Information

Owner::
  Fedora Infrastructure Team
Contact::
  #fedora-admin, sysadmin-main, sysadmin-web groups
Location::
  PHX
Servers::
  memcached03, memcached04
Purpose::
  Provide caching for Fedora web applications.

== Checking Status

Our memcached instances are currently firewalled to only allow access
from the wiki application servers. To check the status of an instance,
run the following from an allowed host:

....
echo stats | nc memcached03 11211
echo stats | nc memcached04 11211
....

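To spot-check a single metric, such as the number of open connections,
filter the stats output (the stat name comes from the standard
memcached protocol; the value shown is illustrative):

....
echo stats | nc memcached03 11211 | grep curr_connections
STAT curr_connections 123
....
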
== Flushing Memcached

Sometimes bad contents get cached and the cache needs to be flushed. To
do this, run the following from an allowed host:

....
echo flush_all | nc memcached03 11211
echo flush_all | nc memcached04 11211
....

== Restarting Memcached

Note that restarting a memcached instance drops all sessions stored on
that instance. Since mediawiki uses hashing to distribute sessions
across multiple instances, restarting one out of two instances will
drop about half of the total sessions.

To restart memcached:

....
sudo /etc/init.d/memcached restart
....

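On hosts that have moved to systemd, the equivalent would presumably be
(an assumption; the unit name may differ):

....
sudo systemctl restart memcached
....
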
== Configuring Memcached

Memcached is currently set up as a role in the ansible git repo. The
two main tunables are MAXCONN (the maximum number of concurrent
connections) and CACHESIZE (the amount of memory to use for storage).
These can be set through the $memcached_maxconn and
$memcached_cachesize variables in ansible. Additionally, other options
(as described in the memcached manpage) can be set via
$memcached_options.

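On stock Fedora/RHEL installs these tunables are rendered into
/etc/sysconfig/memcached, so you can verify what ansible wrote (a
sketch; the values shown are illustrative defaults):

....
$ cat /etc/sysconfig/memcached
PORT="11211"
USER="memcached"
MAXCONN="1024"
CACHESIZE="64"
OPTIONS=""
....
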
modules/sysadmin_guide/pages/message-tagging-service.adoc (new file)

= Message Tagging Service SOP

== Contact Information

Owner::
  Factory2 Team, Fedora QA Team, Infrastructure Team
Contact::
  #fedora-qa, #fedora-admin
Persons::
  cqi, lucarval, vmaljulin
Location::
  Phoenix
Servers::
  * In OpenShift.
Purpose::
  Tag module builds

== Description

Message Tagging Service, aka MTS, is an event-driven microservice that
tags module builds in response to MBS-specific events.

MTS listens on the message bus for the MBS event
`mbs.build.state.change`. Once a message is received, the module build
represented by that message is tested against a set of predefined
rules. Each rule definition has a destination tag; if a rule matches
the build, the destination tag is applied to that build. For now, only
module builds in the ready state are handled by MTS.

== Observing Behavior

Log in to `os-master01.phx2.fedoraproject.org` as `root` (or
authenticate remotely against OpenShift with
`oc login https://os.fedoraproject.org`), and run:

....
oc project mts
oc status -v
oc logs -f dc/mts
....

== Database

MTS does not use a database.

== Configuration

Please remember to increase `MTS_CONFIG_VERSION` so that OpenShift
creates a new pod after the playbook runs.

== Deployment

You can roll out configuration changes by changing the files in
`roles/openshift-apps/message-tagging-service/` and running the
`playbooks/openshift-apps/message-tagging-service.yml` playbook.

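A bare-bones invocation might look like this (a sketch; inventory and
environment selection are omitted):

....
$ ansible-playbook playbooks/openshift-apps/message-tagging-service.yml
....
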
=== Stage

The MTS docker image is built automatically and pushed upstream to
quay.io. By default, a fresh image is tagged `latest`. To deploy to
staging, apply the `stg` tag to the image, then run the
`playbooks/openshift-apps/message-tagging-service.yml` playbook with
the `staging` environment.

=== Prod

If everything works well, apply the `prod` tag to the docker image in
quay.io, then run the playbook with the `prod` environment.

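One way to retag the image between environments is with skopeo (a
sketch; the quay.io repository path shown here is hypothetical):

....
$ skopeo copy docker://quay.io/factory2/message-tagging-service:stg \
              docker://quay.io/factory2/message-tagging-service:prod
....
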
== Update Rules

The
https://pagure.io/fedora-infra/ansible/blob/main/f/roles/openshift-apps/message-tagging-service/files/mts-rules.yml[rules
file] is managed alongside the playbook role in the same repository.

For detailed information on the rules format, please refer to the
https://pagure.io/modularity/blob/master/f/drafts/module-tagging-service/format.md[documentation]
under Modularity.

== Troubleshooting

In case of problems with MTS, check the logs:

....
oc logs -f dc/mts
....

modules/sysadmin_guide/pages/mirrorhiding.adoc (new file)

= Mirror hiding Infrastructure SOP

At times, such as release day, there may be a conflict between Red Hat
trying to release content for RHEL and Fedora trying to release Fedora.
One way to limit the pain to Red Hat on release day is to hide
download.fedora.redhat.com from the publiclist and the mirrorlist
redirector, which will keep most people from downloading the content
from Red Hat directly.

== Contact Information

Owner::
  Fedora Infrastructure Team
Contact::
  #fedora-admin, sysadmin-main, sysadmin-web group
Location::
  Phoenix
Servers::
  app3, app4
Purpose::
  Hide public mirrors from the publiclist / mirrorlist redirector

== Description

To hide a public mirror so that it doesn't appear on the publiclist or
the mirrorlist, go into the MirrorManager administrative web user
interface at https://admin.fedoraproject.org/mirrormanager. Fedora
sysadmins can see all Sites and Hosts. Each Site and Host has a
checkbox marked "private" which, if set, hides that Site (and all its
Hosts), or just that single Host, so that it won't appear on the
public lists.

To make a private-marked mirror public, simply clear the "private"
checkbox again.

This change takes effect at the top of each hour.

= AWS Mirrors

Fedora Infrastructure mirrors EPEL content (/pub/epel) into Amazon
Simple Storage Service (S3) in multiple regions, to make it fast for
EC2 CentOS/RHEL users to get EPEL content from an effectively local
mirror.

For this to work, we have private mirror entries in MirrorManager, one
for each region, which include the EC2 netblocks for that region.

Amazon updates their list of network blocks roughly monthly, as they
consume additional address space, so we need to make the corresponding
changes to MirrorManager's entries.

Amazon publishes their list of network blocks on their forum site,
with the subject "Announcement: Amazon EC2 Public IP Ranges". As of
November 2014, this was https://forums.aws.amazon.com/ann.jspa?annID=1701

As of November 19, 2014, Amazon also publishes the list as a JSON file
we can download; see
http://docs.aws.amazon.com/general/latest/gr/aws-ip-ranges.html

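For example, the EC2 prefixes for a single region can be pulled out of
that JSON file like so (a sketch; the download URL is the one
referenced by the page above, and the region is illustrative):

....
$ curl -s https://ip-ranges.amazonaws.com/ip-ranges.json \
    | jq -r '.prefixes[] | select(.service=="EC2" and .region=="us-east-1") | .ip_prefix'
....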