arc/docs/fcas/solution_dataeplt.rst
Ryan Lerch ba720c3d77 fix parsing errors and sphinx warnings
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2023-11-20 13:04:34 +00:00


.. _solution_dataeplt.rst:

Data Exploration and Significance
=================================


The service would examine the following information once it is deployed. The list
covers both the information made available to service users for consumption and the
information available only to the service itself for computation and analysis. The list
is not exhaustive, and there can be more such information apart from the items below.

1. Activity entry from Datanommer (For computation only)
2. Username of the "subject" i.e. owner of the contribution (For computation only)
3. Username of the "object" i.e. involved in the contribution (For computation only)
4. Datetime data of a specific contribution activity (For computation only)
5. Datetime data of a grouped contribution activity (For consumption only)
6. Service where a specific contribution activity happened (For computation only)
7. Service where a grouped contribution activity happened (For consumption only)
8. Activity trends per username (For computation only)


Activity Entry from Datanommer
------------------------------

This data forms the most basic functional entity of a "contribution record". An
occurrence of an activity means that a contribution was made by the "subject" member on
the "object" member and/or service, with the "predicament" describing the nature of the
contribution and the "time" recording when it happened. A computed collection of these
records can form wider statistics, for example the trend of contribution by a certain
"subject" member or the trend of contribution on a certain "service". Such statistics
allow us to answer questions like "which services are most active (and why) and least
active (and why)?" and "what period of time attracts the most contributions (and
why)?". As individual entries are fine-grained, they only serve their purpose when a
computed group of them forms statistics, not when they are singled out. That is why
this data is used for computational purposes only.

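
As a rough illustration of the elements named above, the sketch below models a single
contribution record. The class ``ContributionRecord`` and all of its field names are
hypothetical and chosen for readability only. They do not reflect the actual Datanommer
schema.

.. code-block:: python

    from dataclasses import dataclass
    from datetime import datetime, timezone
    from typing import Optional


    @dataclass(frozen=True)
    class ContributionRecord:
        """Hypothetical shape of one activity entry (not the Datanommer schema)."""

        subject: str                   # member who made the contribution
        predicament: str               # nature of the contribution
        time: datetime                 # when the contribution happened
        service: str                   # service where the contribution happened
        object_: Optional[str] = None  # member involved in the contribution, if any


    # Example values are invented for illustration.
    record = ContributionRecord(
        subject="alice",
        predicament="commented on a pull request",
        time=datetime(2023, 11, 20, 13, 4, 34, tzinfo=timezone.utc),
        service="pagure",
        object_="bob",
    )
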

Username of the "subject"
-------------------------

Alternatively, the owner of the contribution.

This data is a part of the previously stated "activity entry from Datanommer" data. To
protect the privacy of the members involved, this information is anonymized as a hash.
Because it serves its purpose only when a computed group of such entries forms
statistics, and not when it is singled out, this data is used for computational
purposes only.

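
The text does not specify which hashing scheme is used. As a minimal sketch, assuming a
keyed hash (HMAC-SHA256) from the Python standard library, usernames could be replaced
with stable tokens as follows. The function name ``anonymize_username`` and the key
``SECRET_KEY`` are hypothetical.

.. code-block:: python

    import hashlib
    import hmac

    # Hypothetical key; a real deployment would load it from protected configuration.
    SECRET_KEY = b"replace-with-a-secret-value"


    def anonymize_username(username: str, key: bytes = SECRET_KEY) -> str:
        """Return a stable token for a username using HMAC-SHA256."""
        return hmac.new(key, username.encode("utf-8"), hashlib.sha256).hexdigest()


    token = anonymize_username("alice")

The same username always maps to the same token, so grouped statistics can still be
computed without storing the username itself. A keyed hash is used in this sketch
because an unkeyed hash of a short, guessable username can be reversed by hashing
candidate names.
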

Username of the "object"
------------------------

Alternatively, the member involved in the contribution.

This data is a part of the previously stated "activity entry from Datanommer" data. To
protect the privacy of the members involved, this information is anonymized as a hash.
Because it serves its purpose only when a computed group of such entries forms
statistics, and not when it is singled out, this data is used for computational
purposes only.


Datetime data of a specific contribution activity
-------------------------------------------------

This data is a part of the previously stated "activity entry from Datanommer" data.
Because it serves its purpose only when a computed group of such entries forms
statistics, and not when it is singled out, this data is used for computational
purposes only.


Datetime data of a grouped contribution activity
------------------------------------------------

This is a derivative statistic obtained from a computed group of the previously stated
"datetime data of a specific contribution activity". It can be used to understand the
trend of contributions of a certain "nature" over a period of "time", the trend of
contributions on a certain "service" over a period of "time" and so on. This
understanding would help us answer questions like which timelines attract the most
contributions and which timelines see few contributions. It would also help gauge the
success of activities such as events and workshops by showing whether they brought in
contributions right after their commencement time. As a result, this data is available
for user consumption by the service.

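
The grouping described above can be sketched as a count of activities per time bucket.
The example below assumes weekly buckets based on ISO week numbers. Both the bucket
size and the sample timestamps are assumptions made for illustration.

.. code-block:: python

    from collections import Counter
    from datetime import datetime

    # Hypothetical activity timestamps; real values would come from activity entries.
    timestamps = [
        datetime(2023, 11, 6, 9, 30),
        datetime(2023, 11, 7, 14, 0),
        datetime(2023, 11, 14, 11, 15),
        datetime(2023, 11, 15, 16, 45),
        datetime(2023, 11, 16, 8, 5),
    ]


    def count_per_week(moments):
        """Count activities per ISO (year, week) bucket."""
        counts = Counter()
        for moment in moments:
            year, week, _ = moment.isocalendar()
            counts[(year, week)] += 1
        return dict(counts)


    weekly_counts = count_per_week(timestamps)

Comparing the count for the week of an event against the weeks before it is one simple
way to check whether contributions rose after the event started.
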

Service where a specific contribution activity happened
-------------------------------------------------------

This data is a part of the previously stated "activity entry from Datanommer" data. As
individual entries are fine-grained, this data only serves its purpose when a computed
group of such entries forms statistics, not when it is singled out. That is why this
data is used for computational purposes only.


Service where a grouped contribution activity happened
------------------------------------------------------

This is a derivative statistic obtained from a computed group of the previously stated
"service where a specific contribution activity happened". It can be used to understand
the trend of contribution on a certain service and to compare services against one
another to see how they fare in contribution activity. This understanding would help us
answer questions like which services are most active in terms of contributions and
which are not. It would also help gauge the usability of those services by showing what
makes them desirable (i.e. inferred from favourable contribution statistics) or
undesirable (i.e. inferred from unfavourable contribution statistics), and so direct
which services should be contributed to. As a result, this data is available for user
consumption by the service.

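
A comparison between services can be sketched as a ranking of activity counts. The
service names and counts below are invented for illustration.

.. code-block:: python

    from collections import Counter

    # Hypothetical service names attached to individual activity entries.
    services = ["pagure", "bodhi", "pagure", "koji", "pagure", "bodhi"]

    activity_by_service = Counter(services)
    ranked = activity_by_service.most_common()  # most active service first
    most_active = ranked[0]
    least_active = ranked[-1]
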

Activity trends per username
----------------------------

This is a derivative statistic obtained from a computed group of the previously stated
"activity entry from Datanommer". It can be used to understand the trend of
contribution for a certain user. This understanding would help us answer questions like
which fields a certain member contributes to and, if they are transitioning from one
field to another, what reasons have led them to do so. To protect the privacy of the
members involved, this information is anonymized as a hash. Because it serves its
purpose only when a computed group of such entries forms statistics, and not when it is
singled out, this data is used for computational purposes only.
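
One way to spot a transition between fields is to track the service a member
contributes to most often in each period. The sketch below assumes monthly periods and
rows of ``(hashed_user, period, service)``. The function name, row layout and sample
values are hypothetical.

.. code-block:: python

    from collections import Counter, defaultdict

    # Hypothetical (hashed_user, period, service) rows derived from activity entries.
    rows = [
        ("3f2a", "2023-10", "pagure"),
        ("3f2a", "2023-10", "pagure"),
        ("3f2a", "2023-10", "bodhi"),
        ("3f2a", "2023-11", "bodhi"),
        ("3f2a", "2023-11", "bodhi"),
    ]


    def dominant_service_per_period(entries):
        """Map each hashed user to their most frequent service in each period."""
        counts = defaultdict(Counter)
        for user, period, service in entries:
            counts[(user, period)][service] += 1
        result = defaultdict(dict)
        for (user, period), per_service in counts.items():
            result[user][period] = per_service.most_common(1)[0][0]
        return {user: dict(periods) for user, periods in result.items()}


    trends = dominant_service_per_period(rows)

A change in the dominant service between consecutive periods indicates a possible
transition. The reasons behind it would still need to be investigated separately.
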