Since we are moving to matrix, lets drop reference to irc. I may have missed a few of these and I left the Zodbot SOP alone for now until we replace it with the new matrix one. Signed-off-by: Kevin Fenzi <kevin@scrye.com>

= Working with Fedora Infrastructure

This document explains how to efficiently work with the Fedora Infrastructure team.
Your close attention to this document will help both you and us
do the work you are asking us to do.

== Our Workflow

=== Security related issues
Is your issue/problem related to the security of an application or service we
run?

* Send an email to infra-security@fedoraproject.org.

=== Emergency/Authentication issues
Is your issue/problem urgent (an important service is down, or you need
a change as soon as possible)? Or is your issue/problem such that you cannot
file a ticket (authentication problems, no account, ticketing system down)?

* Log in to a Matrix account, join the `#admin:fedoraproject.org` channel,
say `!oncall`, and explain the issue or problem to the oncall person.

* If no one is available there:
** If you cannot authenticate to link:https://pagure.io/[], send an email
to admin@fedoraproject.org.
** Otherwise, go to the next step.

=== Ticket tracking

By default, the infrastructure team tracks its work in tickets at
link:https://pagure.io/fedora-infrastructure/issues/[].
If you need something from us, please
link:https://pagure.io/fedora-infrastructure/new_issue[open a new ticket] with
as much information as you think is needed to process the request.

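The routing described so far (a security email, the Matrix oncall person for emergencies, and a ticket for everything else) can be summarized as a small decision function. This is only an illustrative Python sketch: the `Issue` class, its field names, and `contact_steps` are hypothetical and not part of any Fedora tooling. Checking security first and collapsing every "cannot file a ticket" case into one flag are simplifying assumptions.

```python
from dataclasses import dataclass

# Addresses and URLs are taken from this document.
SECURITY_EMAIL = "infra-security@fedoraproject.org"
ADMIN_EMAIL = "admin@fedoraproject.org"
MATRIX_ROOM = "#admin:fedoraproject.org"
NEW_TICKET_URL = "https://pagure.io/fedora-infrastructure/new_issue"


@dataclass
class Issue:
    """Hypothetical description of a request; the field names are assumptions."""
    security_related: bool = False
    urgent: bool = False
    # False covers: cannot authenticate, no account, ticketing system down.
    can_file_ticket: bool = True


def contact_steps(issue: Issue) -> list[str]:
    """Return the contact steps to try, in order."""
    # Assumption: security issues are routed by email regardless of urgency.
    if issue.security_related:
        return [f"email {SECURITY_EMAIL}"]
    if issue.urgent or not issue.can_file_ticket:
        steps = [f"say !oncall in {MATRIX_ROOM} and explain the problem"]
        if issue.can_file_ticket:
            steps.append(f"if no one is available: open a ticket at {NEW_TICKET_URL}")
        else:
            steps.append(f"if no one is available: email {ADMIN_EMAIL}")
        return steps
    # Default path: everything else goes through the ticket tracker.
    return [f"open a ticket at {NEW_TICKET_URL}"]


if __name__ == "__main__":
    for step in contact_steps(Issue(urgent=True)):
        print(step)
```

Returning an ordered list rather than a single channel keeps the fallback visible: Matrix first, then email or a ticket if nobody answers.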

Once created, your ticket will follow this flow:

.Daily Process Ticket Flow
[#img-ticket-flow]
[caption="Figure 2: "]
image::daily_process.png[750,750, link="../_images/daily_process.png"]


A few notes:

* Make sure to note if there is a deadline or if this issue blocks you.
* We review tickets during the two stand-ups we hold Monday through Thursday
(one friendlier to European timezones and one friendlier to US timezones).
* There is no need to ping team members or notify us about the newly filed
ticket.

* Your ticket will be triaged by a team member and moved to a new state:
** Gain and Pain levels will be added to the ticket; team members use these
to prioritize their work. (You can find the definition of each
level in the https://docs.fedoraproject.org/en-US/cpe/glossary/[glossary].)
** If it's moved to _Waiting on assignee_, it's waiting for a team member to
start working on it.
** If it's moved to _Waiting on reporter_, you need to answer the
questions posed in the ticket before it can be worked on.
** If the ticket is closed with _initiative_, see
https://docs.fedoraproject.org/en-US/cpe/initiatives/[New Initiative Workflow].
** If the ticket is otherwise closed, it will be with an explanation from
a team member.

* If you have an update to your issue/task or want to know when it might be
worked on:
** Comment in the ticket, adding that information or asking for a time frame.

* When a team member is available, your ticket will be assigned to them to
work on.
** Watch for progress reports or for the ticket being marked done.

* If the work is not fully completed as required, please re-open the ticket
and indicate this.
** Go back to the previous step for additional work.

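As a quick reference, the ticket states named in the notes above can be mapped to what the reporter is expected to do next. The following Python sketch is illustrative only: the enum, its member names, and `reporter_action` are hypothetical, and each action paraphrases this document rather than any tracker configuration.

```python
from enum import Enum


class TicketState(Enum):
    """States mentioned in this document; the identifiers are assumptions."""
    WAITING_ON_ASSIGNEE = "Waiting on assignee"
    WAITING_ON_REPORTER = "Waiting on reporter"
    CLOSED_INITIATIVE = "Closed with initiative"
    CLOSED_OTHER = "Closed with an explanation"
    DONE_BUT_INCOMPLETE = "Marked done, work not fully completed"


# What the reporter should do in each state, paraphrasing the notes above.
REPORTER_ACTIONS = {
    TicketState.WAITING_ON_ASSIGNEE:
        "wait for a team member; comment in the ticket if you have an update",
    TicketState.WAITING_ON_REPORTER:
        "answer the questions posed in the ticket",
    TicketState.CLOSED_INITIATIVE:
        "follow the New Initiative Workflow",
    TicketState.CLOSED_OTHER:
        "read the explanation left by the team member",
    TicketState.DONE_BUT_INCOMPLETE:
        "re-open the ticket and say what is still missing",
}


def reporter_action(state: TicketState) -> str:
    """Look up the reporter's next step for a given ticket state."""
    return REPORTER_ACTIONS[state]


if __name__ == "__main__":
    for state in TicketState:
        print(f"{state.value}: {reporter_action(state)}")
```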

== The "Oncall" Role in Our Team

One team member is always designated “oncall”. The assigned person changes
every week. You can find out who is currently assigned by using
`!oncall` in any of our Matrix channels, such as `#admin:fedoraproject.org`.

When available, this person:

. Accepts urgent work items for the team, such as an important
or high SLE service being down or causing issues. In any case, a ticket should
be filed by the reporter or the oncall person to track this work.

. Shields other team members from distracting pings and less
urgent tickets, deciding when an issue is important enough to
interrupt another team member.

. Triages incoming tickets for urgent items that need work outside
of the normal triage process.

== Initiatives

All tasks involving new applications, major deployments, major development work,
or the like will be asked to follow the
https://docs.fedoraproject.org/en-US/cpe/initiatives/[New Initiative Workflow].
They will then be scoped and prioritized from there.

== General Ticket Considerations

Please provide as much information as you can in your ticket to avoid
back-and-forth requests for information. If you know your issue is going to
cause a lot of discussion, start a mailing list or discussion thread for it.

Make sure your ticket:

* Explains the problem or issue you are having, with URLs where
possible to the services or applications involved.
* Tells us how important or urgent this is to you.
* Includes any error messages or output you see.

It is your responsibility as ticket reporter to follow your ticket, provide
information that is asked for, and keep us aware of any urgency you may have.
Do not simply file and forget your ticket.


Your ticket may take a while to process, depending on the team's current
workload and how important we think it is. If your ticket
is blocking you, make sure you note that in the ticket, but keep in
mind that we may already be working on tickets that are blocking more people.


Every now and then, we will go through our old tickets. When this happens, we may
ask you to check if the issue still exists (it could be that a complementary
change fixed it, that it was just an intermittent issue, or simply that it got
fixed without us knowing). In those situations, we kindly ask that you reply to
our question/ping within two weeks; otherwise we reserve the right to close the
ticket (knowing that you can always re-open it or open a new one if the issue
persists or reappears).

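The two-week reply window can be expressed as simple date arithmetic. This Python sketch is an illustration, not a description of any automation the team runs: `REPLY_WINDOW`, `may_be_closed`, and the example dates are hypothetical.

```python
from datetime import date, timedelta

# The reply window stated in this document.
REPLY_WINDOW = timedelta(weeks=2)


def may_be_closed(pinged_on: date, today: date, reporter_replied: bool) -> bool:
    """True once more than two weeks have passed since the ping without a reply."""
    return not reporter_replied and (today - pinged_on) > REPLY_WINDOW


if __name__ == "__main__":
    pinged = date(2024, 3, 1)  # hypothetical date of the team's question
    print(may_be_closed(pinged, date(2024, 3, 10), reporter_replied=False))  # False
    print(may_be_closed(pinged, date(2024, 3, 20), reporter_replied=False))  # True
    print(may_be_closed(pinged, date(2024, 3, 20), reporter_replied=True))   # False
```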

== Matrix

Matrix is a great way to communicate, but please do not ping team members
directly. Instead, update your ticket with any new information you have. When
the team members working on that issue have time, they
may contact you on Matrix for more interactive debugging or testing.

== Direct emails

Email is also a great communication method, but if you mail work items or
information to one person directly, they cannot easily hand off the issue, and
you must wait for them to have time to address it (when others could
perhaps have already solved it). So, please avoid direct emails and
instead update tickets with any information you want to add.

== RFC 1149

link:https://tools.ietf.org/html/rfc1149[Pigeons are too slow] for most work
items, and require facilities (e.g. dovecots) that most team members do not
have. Even if the oncall member has a free dovecot and feed and is trained
in handling carrier pigeons, sending a pigeon to a single team member has the
same problems as using Matrix or email for the same purpose, which means tickets
are still the correct way to report problems.

In other words, please don't send us any birds.