Fedora Contributor Activity Statistics
======================================

Purpose
-------

To understand quantitatively how contributor activity has changed over the years, and to
support the guiding star of the Fedora Project Strategy 2028, which is to double the
number of contributors who are active every week, it is important to have a service that
tracks contributor activity statistics. This measurement would make the strategy goal
meaningful. It would also help the Fedora Council and the related bodies understand how
far they have progressed towards it and identify the specific problems that stand in the
way of achieving this objective.

Background
----------

The `Fedora Council <https://docs.fedoraproject.org/en-US/council/>`_ Face To Face 2023
Hackfest was held in Frankfurt, Germany and was attended by the Fedora Council members,
`Akashdeep Dhar <https://accounts.fedoraproject.org/user/t0xic0der>`_,
`Alexandra Fedorova <https://accounts.fedoraproject.org/user/bookwar>`_, `Ben Cotton
<https://accounts.fedoraproject.org/user/bcotton>`_, `David Cantrell
<https://accounts.fedoraproject.org/user/dcantrell>`_, `Justin W. Flory
<https://accounts.fedoraproject.org/user/jflory7>`_, `Matthew Miller
<https://accounts.fedoraproject.org/user/mattdm>`_, `Sumantro Mukherjee
<https://accounts.fedoraproject.org/user/sumantrom>`_ and `Vipul Siddharth
<https://accounts.fedoraproject.org/user/siddharthvipul1>`_. Among the strategy goals
discussed and decided upon there, the core driving goal for the five-year strategy plan
was to foster a community environment in which the number of contributors who are active
every week doubles.

Such a service was previously proposed in a Fedora Infrastructure `ticket
<https://pagure.io/fedora-infrastructure/issue/11149>`_ by `Michal Konecny
<https://accounts.fedoraproject.org/user/zlopez>`_ on Matthew Miller's request and
addressed by Akashdeep Dhar in a project called `Fedora User Activity Statistics
<https://github.com/t0xic0der/fuas>`_. During the `Community Platform Engineering
<https://docs.fedoraproject.org/en-US/cpe/>`_ `Face To Face Meeting 2023
<https://fedoramagazine.org/the-community-platform-engineering-f2f-2023-experience-part-i/>`_
in Barcelona, Spain, the scope of the project was revisited by Akashdeep Dhar, `Adam
Saleh <https://accounts.fedoraproject.org/user/asaleh>`_, `David Kirwan
<https://accounts.fedoraproject.org/user/dkirwan>`_, `Kevin Fenzi
<https://accounts.fedoraproject.org/user/nirik>`_ and Matthew Miller, which led to the
refinement of the project's purpose and an increase in the deliverable requirements.

Following the expanded scope of the project, the previously provided solution no longer
addressed the updated set of requirements. Adam Saleh and Akashdeep Dhar discussed
efficient methods of extracting information from the Datanommer service. The project was
proposed as an initiative in `this ticket
<https://pagure.io/cpe/initiatives-proposal/issue/27>`_ by `Aoife Moloney
<https://accounts.fedoraproject.org/user/amoloney>`_. The project was then scoped for
ARC investigation during Q2 2023, before being sent for implementation by the initiative
team assigned to the project.

Functional requirements
-----------------------

This section details the requirements for the project in two tiers: the bare minimum
outcome needed to call the project a success, and the list of nice-to-have wishes that
constitute the absolute maximum outcome. Please note that these requirements should be
taken as recommendations; changes introduced to them during the implementation phase of
the project, when required by the circumstances, are acceptable.

Minimal
~~~~~~~

- Processing - A collector service for legitimate human-owned/run accounts
- Output - Statistical information created in JSON format
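
The two minimal requirements above can be illustrated with a small sketch. It is not the
project's implementation: the ``BOT_ACCOUNTS`` denylist, the ``weekly_active_contributors``
helper and the sample events are all hypothetical, and a real collector would need an
authoritative way to tell human-run accounts from automation. The sketch only shows the
shape of the idea, which is to drop non-human accounts, count distinct contributors per
ISO week and emit the result as JSON.

```python
import json
from collections import defaultdict
from datetime import datetime

# Hypothetical denylist of automation accounts (an assumption for this sketch).
BOT_ACCOUNTS = {"example-bot", "example-ci"}


def weekly_active_contributors(events):
    """Map ISO week labels to the number of distinct human accounts seen.

    `events` is an iterable of (username, ISO 8601 timestamp) pairs.
    """
    users_by_week = defaultdict(set)
    for username, timestamp in events:
        if username in BOT_ACCOUNTS:
            continue  # skip accounts that are not human-owned
        year, week, _ = datetime.fromisoformat(timestamp).isocalendar()
        users_by_week[f"{year}-W{week:02d}"].add(username)
    # Sort by week label so the JSON output is stable and chronological.
    return {week: len(users) for week, users in sorted(users_by_week.items())}


# Made-up sample data, for illustration only.
SAMPLE_EVENTS = [
    ("alice", "2023-05-01T10:00:00"),
    ("alice", "2023-05-02T09:30:00"),
    ("bob", "2023-05-03T14:15:00"),
    ("example-bot", "2023-05-03T14:16:00"),
    ("bob", "2023-05-09T08:00:00"),
]

REPORT = json.dumps(weekly_active_contributors(SAMPLE_EVENTS), indent=2)
print(REPORT)
```

Counting distinct usernames per week, rather than raw events, matches the strategy goal
more closely, since the goal is stated in terms of contributors who are active in a week
and not in terms of how many actions they perform.
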

Maximal
~~~~~~~

- Processing - Analyzing activity from meetbot logs
- Output - Report automatically being generated on a weekly basis

Resources
---------

- `Fedora User Activity Statistics <https://github.com/t0xic0der/fuas>`_
- `Datagrepper <https://apps.fedoraproject.org/datagrepper/>`_
- `Monitor Dashboard
  <https://monitor-dashboard-monitor-gating.apps.ocp.fedoraproject.org/>`_
- `Datanommer <https://github.com/fedora-infra/datanommer>`_
- `Original Fedora Infrastructure ticket
  <https://pagure.io/fedora-infrastructure/issue/11149>`_
- `Renewed Initiative Proposal ticket
  <https://pagure.io/cpe/initiatives-proposal/issue/27>`_

Index
-----

.. toctree::
    :maxdepth: 2

    creation_workflow
    creation_gram
    creation_fail
    solution_datanote
    solution_dataeplt
    solution_examples
    solution_probntec
    solution_techtool

Conclusions
-----------

After understanding how effective the project can be in helping the Fedora Council
achieve its strategic objective of doubling the number of active contributors over a
given period of time, the options for making the service as useful as possible were
explored. It was concluded that the historical data collected by Datanommer from the
Fedora Messaging bus would indeed be helpful in tracking contribution activities and
detailing contribution statistics, and that it should be theoretically possible for the
team to implement such a service.

Roadmap
-------

- **Step 1** - Connect with the data scientists to understand which data elements need
  to be focused on
- **Step 2** - Author the codebase to obtain details on human-run and human-owned
  legitimate accounts
- **Step 3** - Author SQL queries for obtaining historical contribution statistics per
  username
- **Step 4** - Author SQL queries for obtaining historical contribution statistics per
  service
- **Step 5** - Adapt the queries to create a service to obtain current and future
  statistics
- **Step 6** - Expose necessary endpoints or integrations on the dashboard for the
  analytics
- **Step 7** - Set up the staging environment for the dashboard in a limited testing
  environment for inspection
- **Step 8** - Deploy to the production environment after ironing out the edge cases
  for statistics and... PROFIT?
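
As a rough illustration of Steps 3 and 4, the sketch below runs two aggregate queries
against an in-memory SQLite database. The ``messages`` table, its columns and the sample
rows are assumptions made for this example. They do not reflect the actual Datanommer
schema, which would have to be consulted before writing the real queries.

```python
import sqlite3

# Hypothetical, simplified table (an assumption, not the Datanommer schema).
SCHEMA = """
CREATE TABLE messages (
    username TEXT NOT NULL,
    service  TEXT NOT NULL,
    sent_at  TEXT NOT NULL
)
"""

# Step 3 (sketch): number of recorded contributions per username.
PER_USERNAME = """
SELECT username, COUNT(*) AS total
FROM messages
GROUP BY username
ORDER BY total DESC, username
"""

# Step 4 (sketch): contributions and distinct contributors per service.
PER_SERVICE = """
SELECT service, COUNT(*) AS total, COUNT(DISTINCT username) AS contributors
FROM messages
GROUP BY service
ORDER BY total DESC, service
"""

# Made-up sample rows, for illustration only.
SAMPLE_ROWS = [
    ("alice", "pagure", "2023-05-01"),
    ("alice", "bodhi", "2023-05-02"),
    ("bob", "pagure", "2023-05-03"),
    ("alice", "pagure", "2023-05-04"),
]

connection = sqlite3.connect(":memory:")
connection.execute(SCHEMA)
connection.executemany("INSERT INTO messages VALUES (?, ?, ?)", SAMPLE_ROWS)

per_username = connection.execute(PER_USERNAME).fetchall()
per_service = connection.execute(PER_SERVICE).fetchall()
connection.close()

print(per_username)
print(per_service)
```

Keeping the per-username and per-service statistics as separate queries mirrors the
split between Steps 3 and 4, and either query could later gain a date filter when it is
adapted for current and future statistics in Step 5.
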

Estimate of work
----------------

As this service makes active use of technologies that are already created and
maintained, such as Fedora Messaging, Datagrepper, Datanommer and FASJSON, and assuming
that the team that works on it has people who are experienced in these technologies, the
service should take no longer than two quarters to reach the staging environment and one
more quarter to reach the production one.