Add documentation about data exploration

Signed-off-by: Akashdeep Dhar <akashdeep.dhar@gmail.com>
2023-05-22 13:24:50 +05:30 · 2023-05-22 13:24:50 +05:30 · 90d38d58a8
commit 90d38d58a8
parent 589b2ec10f
2 changed files with 121 additions and 0 deletions
--- a/docs/fcas/index.rst
+++ b/docs/fcas/index.rst
@@ -99,5 +99,6 @@ Index
    creation_gram
    creation_fail
    solution_datanote
+    solution_dataeplt
    solution_examples
    solution_techtool
--- a/docs/fcas/solution_dataeplt.rst
+++ b/docs/fcas/solution_dataeplt.rst
@@ -0,0 +1,120 @@
+.. _solution_dataeplt.rst:
+
+Data Exploration and Significance
+=================================
+
+The following is the set of information that the service would look into once
+it is deployed. Please note that this list covers both the information that
+would be available for consumption by the service users and the information
+that would be available only to the service itself for computation and
+analysis, not to the service users. The list is not exhaustive, and there can
+be more such information apart from the items listed below.
+
+1. Activity entry from Datanommer (For computation only)
+2. Username of the "subject" i.e. the owner of the contribution (For computation only)
+3. Username of the "object" i.e. the member involved in the contribution (For computation only)
+4. Datetime data of a specific contribution activity (For computation only)
+5. Datetime data of a grouped contribution activity (For consumption only)
+6. Service where a specific contribution activity happened (For computation only)
+7. Service where a grouped contribution activity happened (For consumption only)
+8. Activity trends per username (For computation only)
+
+
+Activity Entry from Datanommer
+------------------------------
+
+This data forms the most basic functional entity of a "contribution record". An
+occurrence of an activity means that a contribution was made by the "subject"
+member on the "object" member and/or service with the "predicament" nature of
+the contribution at the "time" of it happening. A computed collection of these
+data can help form wider statistics, for example, the trend of contribution by
+a certain "subject" member or the trend of contribution on a certain "service",
+allowing us to answer questions like "which services are most active (and why)
+and least active (and why)?" and "what period of time attracts most
+contributions (and why)?". As this data is intricate, it serves its purpose
+only when a computed group of such entries forms statistics and not when it is
+singled out, which is why this data is used for computational purposes only.
+
+Username of the "subject"
+-------------------------
+
+Alternatively, the owner of the contribution.
+
+This data is a part of the previously stated "activity entry from Datanommer"
+data. In order to protect the privacy of the members involved in the
+aforementioned data, this information is anonymized as a hash. As this data
+serves its purpose only when a computed group of such entries forms statistics
+and not when it is singled out, this data is used for computational purposes
+only.
+
+Username of the "object"
+------------------------
+
+Alternatively, the member involved in the contribution.
+
+This data is a part of the previously stated "activity entry from Datanommer"
+data. In order to protect the privacy of the members involved in the
+aforementioned data, this information is anonymized as a hash. As this data
+serves its purpose only when a computed group of such entries forms statistics
+and not when it is singled out, this data is used for computational purposes
+only.
+
+Datetime data of a specific contribution activity
+-------------------------------------------------
+
+This data is a part of the previously stated "activity entry from Datanommer"
+data. As this data serves its purpose only when a computed group of such
+entries forms statistics and not when it is singled out, this data is used for
+computational purposes only.
+
+Datetime data of a grouped contribution activity
+------------------------------------------------
+
+Being a derivative statistic obtained from a computed group of the previously
+stated "datetime data of a specific contribution activity", this can be used to
+understand the trend of contribution over a period of "time" for contributions
+of a certain "nature", for contributions on a certain "service" etc. This
+understanding would help us answer questions like which timelines attract the
+most contributions and which timelines do not see many contributions. It would
+also help gauge the success of activities such as events and workshops by
+showing whether those were able to bring in contributions right after their
+commencement time. As a result, this data is made available by the service for
+user consumption.
+
+Service where a specific contribution activity happened
+-------------------------------------------------------
+
+This data is a part of the previously stated "activity entry from Datanommer"
+data. As this data is intricate, it serves its purpose only when a computed
+group of such entries forms statistics and not when it is singled out, which is
+why this data is used for computational purposes only.
+
+Service where a grouped contribution activity happened
+------------------------------------------------------
+
+Being a derivative statistic obtained from a computed group of the previously
+stated "service where a specific contribution activity happened", this can be
+used to understand the trend of contribution on a certain service and to
+compare it against that of another service to see how they fare in the
+contribution activities. This understanding would help us answer questions like
+which services are most active in terms of contributions and which services are
+not. It would also help gauge the usability of those services by knowing what
+makes them desirable (i.e. inferred from favourable contribution statistics) or
+undesirable (i.e. inferred from unfavourable contribution statistics), to help
+direct which service to contribute to. As a result, this data is made available
+by the service for user consumption.
+
+Activity trends per username
+----------------------------
+
+Being a derivative statistic obtained from a computed group of the previously
+stated "activity entry from Datanommer", this can be used to understand the
+trend of contribution for a certain user. This understanding would help us
+answer questions like what fields a certain member contributes to and, if they
+are transitioning from one field to another, what reasons have led them to do
+that. In order to protect the privacy of the members involved in the
+aforementioned data, this information is anonymized as a hash. As this data
+serves its purpose only when a computed group of such entries forms statistics
+and not when it is singled out, this data is used for computational purposes
+only.
+