From 90d38d58a8f44ef24ca32284169e16568d10b32a Mon Sep 17 00:00:00 2001
From: Akashdeep Dhar
Date: Mon, 22 May 2023 13:24:50 +0530
Subject: [PATCH] Add documentation about data exploration

Signed-off-by: Akashdeep Dhar
---
 docs/fcas/index.rst             |   1 +
 docs/fcas/solution_dataeplt.rst | 120 ++++++++++++++++++++++++++++++++
 2 files changed, 121 insertions(+)
 create mode 100644 docs/fcas/solution_dataeplt.rst

diff --git a/docs/fcas/index.rst b/docs/fcas/index.rst
index 910c06c..4f050a4 100644
--- a/docs/fcas/index.rst
+++ b/docs/fcas/index.rst
@@ -99,5 +99,6 @@ Index
     creation_gram
     creation_fail
     solution_datanote
+    solution_dataeplt
     solution_examples
     solution_techtool
diff --git a/docs/fcas/solution_dataeplt.rst b/docs/fcas/solution_dataeplt.rst
new file mode 100644
index 0000000..3062d80
--- /dev/null
+++ b/docs/fcas/solution_dataeplt.rst
@@ -0,0 +1,120 @@
+.. _solution_dataeplt.rst:
+
+Data Exploration and Significance
+=================================
+
+The following is a list of the information that the service would look into
+once it is deployed. Please note that this list covers both the information
+that would be available for consumption by the service users and the
+information that would be available for computation and analysis to the
+service itself but not to the service users. There can be more such
+information apart from the items listed below.
+
+1. Activity entry from Datanommer (For computation only)
+2. Username of the "subject" i.e. owner of the contribution (For computation only)
+3. Username of the "object" i.e. involved in the contribution (For computation only)
+4. Datetime data of a specific contribution activity (For computation only)
+5. Datetime data of a grouped contribution activity (For consumption only)
+6. Service where a specific contribution activity happened (For computation only)
+7. Service where a grouped contribution activity happened (For consumption only)
+8. Activity trends per username (For computation only)
+
+
+Activity Entry from Datanommer
+------------------------------
+
+This data forms the most basic functional entity of a "contribution record". An
+occurrence of an activity means that a contribution was made by the "subject"
+member on the "object" member and/or service with the "predicament" nature of
+the contribution at the "time" of it happening. A computed collection of these
+data can help form wider statistics, for example, the trend of contribution by
+a certain "subject" member or the trend of contribution on a certain "service",
+allowing us to answer questions like "which services are most active (and why)
+and least active (and why)?" and "what period of time attracts most
+contributions (and why)?". As this data is intricate, it serves its purpose
+only when a computed group of entries forms statistics and not when an entry is
+singled out, which is why this data is used for computational purposes only.
+
+Username of the "subject"
+-------------------------
+
+Alternatively, the owner of the contribution.
+
+This data is a part of the previously stated "activity entry from Datanommer"
+data. In order to protect the privacy of the members involved in the
+aforementioned data, this information is anonymized as a hash. As this data
+serves its purpose only when a computed group of entries forms statistics and
+not when an entry is singled out, it is used for computational purposes
+only.
+
+Username of the "object"
+------------------------
+
+Alternatively, the member involved in the contribution.
+
+This data is a part of the previously stated "activity entry from Datanommer"
+data. In order to protect the privacy of the members involved in the
+aforementioned data, this information is anonymized as a hash. As this data
+serves its purpose only when a computed group of entries forms statistics and
+not when an entry is singled out, it is used for computational purposes
+only.
+
+Datetime data of a specific contribution activity
+-------------------------------------------------
+
+This data is a part of the previously stated "activity entry from Datanommer"
+data. As this data serves its purpose only when a computed group of entries
+forms statistics and not when an entry is singled out, it is used for
+computational purposes only.
+
+Datetime data of a grouped contribution activity
+------------------------------------------------
+
+Being a derivative statistic obtained from a computed group of the previously
+stated "datetime data of a specific contribution activity", this can be used to
+understand the trend of contribution over a period of "time" for contributions
+of a certain "nature", for contributions on a certain "service" etc. This
+understanding would help us answer questions like which timelines attract the
+most contributions and which timelines do not have many contributions. It would
+also help gauge the success of activities such as events and workshops by
+helping answer whether those were able to bring in contributions right after
+their commencement time. As a result, this data is made available for user
+consumption by the service.
+
+Service where a specific contribution activity happened
+-------------------------------------------------------
+
+This data is a part of the previously stated "activity entry from Datanommer"
+data. As this data is intricate, it serves its purpose only when a computed
+group of entries forms statistics and not when an entry is singled out, which
+is why this data is used for computational purposes only.
+
+Service where a grouped contribution activity happened
+------------------------------------------------------
+
+Being a derivative statistic obtained from a computed group of the previously
+stated "service where a specific contribution activity happened", this can be
+used to understand the trend of contribution on a certain service and to
+compare services against one another to see how they fare in contribution
+activities. This understanding would help us answer questions like which
+services are most active in terms of contributions and which services are
+not. It would also help gauge the usability of those services by knowing what
+makes them desirable (i.e. inferred from favourable contribution statistics)
+or undesirable (i.e. inferred from unfavourable contribution statistics), and
+so help direct which services should be contributed to. As a result, this
+data is made available for user consumption by the service.
+
+Activity trends per username
+----------------------------
+
+Being a derivative statistic obtained from a computed group of the previously
+stated "activity entry from Datanommer", this can be used to understand the
+trend of contribution for a certain user. This understanding would help us
+answer questions like which fields a certain member contributes to and, if they
+are transitioning from one field to another, what reasons have led them to do
+that. In order to protect the privacy of the members involved in the
+aforementioned data, this information is anonymized as a hash. As this data
+serves its purpose only when a computed group of entries forms statistics and
+not when an entry is singled out, it is used for computational purposes
+only.
+
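
The anonymization and grouping that the document describes could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the record fields (`subject`, `service`, `time`), the helper names, the sample entries and the choice of a salted SHA-256 digest are all hypothetical and are not taken from the patch or from Datanommer's actual schema.

```python
import hashlib
from collections import Counter
from datetime import datetime

# Hypothetical salt; a real deployment would keep this value secret.
SALT = b"example-salt"


def anonymize(username: str) -> str:
    """Return a salted SHA-256 hex digest standing in for the username."""
    return hashlib.sha256(SALT + username.encode("utf-8")).hexdigest()


def group_by_month_and_service(records):
    """Count activities per (year-month, service) pair."""
    counts = Counter()
    for record in records:
        month = record["time"].strftime("%Y-%m")
        counts[(month, record["service"])] += 1
    return counts


def trends_per_user(records):
    """Count activities per (anonymized subject, service) pair."""
    counts = Counter()
    for record in records:
        counts[(anonymize(record["subject"]), record["service"])] += 1
    return counts


# Made-up activity entries, for illustration only.
records = [
    {"subject": "alice", "service": "service-a", "time": datetime(2023, 5, 1)},
    {"subject": "alice", "service": "service-a", "time": datetime(2023, 5, 20)},
    {"subject": "bob", "service": "service-b", "time": datetime(2023, 6, 2)},
]

grouped = group_by_month_and_service(records)  # grouped view, no usernames
per_user = trends_per_user(records)            # keyed by hash, internal only
```

In this sketch only `grouped` would be exposed to service users, while `per_user` stays internal and is keyed by the hash rather than the username, mirroring the consumption and computation split listed in the document.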