From e625a7405a06f25d34587087315857566d9d81a6 Mon Sep 17 00:00:00 2001
From: Adam Saleh
Date: Wed, 7 Jul 2021 18:33:36 +0200
Subject: [PATCH] Added notes on dnf counting/mirrors-countme initiative.

---
 docs/initiatives.rst           |  1 +
 docs/mirrors-countme/index.rst | 69 ++++++++++++++++++++++++++++++++++
 2 files changed, 70 insertions(+)
 create mode 100644 docs/mirrors-countme/index.rst

diff --git a/docs/initiatives.rst b/docs/initiatives.rst
index 01c4c33..5d686d7 100644
--- a/docs/initiatives.rst
+++ b/docs/initiatives.rst
@@ -8,3 +8,4 @@ Initiatives
    monitoring_metrics/index
    pdc/index
    fmn/index
+   mirrors-countme/index
diff --git a/docs/mirrors-countme/index.rst b/docs/mirrors-countme/index.rst
new file mode 100644
index 0000000..0dda01d
--- /dev/null
+++ b/docs/mirrors-countme/index.rst
@@ -0,0 +1,69 @@
+Improving reliability of mirrors-countme scripts
+================================================
+
+Notes on the current deployment
+-------------------------------
+
+To investigate or deploy, you need to be a member of the
+sysadmin-analysis group.
+
+The code lives at https://pagure.io/mirrors-countme/.
+
+The deployment configuration is stored in the ansible repo and applied
+through the playbook playbooks/groups/logserver.yml, mostly in the role
+roles/web-data-analysis.
+
+The scripts run on log01.iad2.fedoraproject.org. Members of
+sysadmin-analysis can ssh in and have root there.
+
+There are several cron jobs responsible for running the scripts:
+
+* syncHttpLogs - in /etc/cron.daily/, rsyncs the logs to
+  /var/log/hosts/$HOST/$YEAR/$MONTH/$DAY/http
+* combineHttp - in /etc/cron.d/, runs /usr/local/bin/combineHttpLogs.sh
+  every day at 6. It combines the logs from /var/log/hosts into
+  /mnt/fedora_stats/combined-http, split by project. It uses
+  /usr/share/awstats/tools/logresolvemerge.pl, and it is not clear we are
+  using that tool correctly.
+* condense-mirrorlogs - in /etc/cron.d/, runs every day at 6 and does some
+  sort of analysis, possibly one of the older scripts. It seems to sort
+  the logs yet again.
+* countme-update - in /etc/cron.d/, runs two scripts every day at 9:
+  countme-update-rawdb.sh, which parses the logs and fills the raw
+  database, and countme-update-totals.sh, which uses the raw database to
+  calculate the statistics. The totals are then copied to a web folder and
+  published at https://data-analysis.fedoraproject.org/csv-reports/countme/
+
+Notes on avenues of improvement
+-------------------------------
+
+There are several areas we need to improve:
+
+* downloading and syncing the logs can sometimes fail or hang
+* combining the logs is error-prone
+* installation of the scripts has been a source of trouble during updates;
+  currently we just pull the git repo and run pip install
+
+Notes on replacing with off-the-shelf solutions
+-----------------------------------------------
+
+Since the raw data we base our statistics on is just the access logs from
+our proxy servers, we should be able to find an off-the-shelf solution
+that could replace our brittle scripts.
+
+Two solutions present themselves: the ELK stack, and Loki with Promtail
+from Grafana.
+
+We already run an ELK stack on our OpenShift, but our experience so far is
+that the Elasticsearch deployment is even more brittle than the scripts.
+
+We did some experiments with Loki. The technology seems promising: it is
+much simpler than the ELK stack, and the size of the stored data looks
+comparable to the raw logs.
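+
+As a rough illustration, an experiment of this kind can be reproduced
+locally with just the two binaries and a copy of the combined logs. The
+config file names and the job label below are placeholders, not our actual
+test setup::
+
+    # Start a local Loki, then feed it logs through promtail.
+    # loki-local.yaml and promtail-local.yaml are hypothetical configs;
+    # the promtail one would point at a copy of the combined access logs.
+    ./loki -config.file=loki-local.yaml &
+    ./promtail -config.file=promtail-local.yaml
+
+    # logcli (the Loki command-line client) can then be used to spot-check
+    # what made it into the store; "httpd" is an assumed job label.
+    logcli --addr=http://localhost:3100 query '{job="httpd"}' --limit 10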
+
+Moreover, promtail, which does the parsing and uploading of the logs, can
+both add labels to log lines, which are then indexed and queryable in the
+database, and collect statistics from the log lines directly, to be
+scraped by Prometheus.
+
+You can query the logs with LogQL, a language similar to PromQL (see the
+query sketch at the end of these notes).
+
+We are not going to use it because:
+
+* it does not deal well with historical data, so any initial import of the
+  old logs is painful
+* promtail-generated metrics would not help us with the double-counting of
+  people hitting different proxy servers
+* the configuration is fiddly and tricky to test
+* changing a batch process into a soft-realtime one sounds like a headache
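+
+For the record, a query over the imported access logs would look roughly
+like this. The label names and the line filter are illustrative
+assumptions, not our real configuration::
+
+    # Count, per repository, how many countme-tagged requests were ingested
+    # over the last day. "repo" and "job" are hypothetical labels that a
+    # promtail pipeline would have to attach while parsing the access logs.
+    logcli --addr=http://localhost:3100 query \
+        'sum by (repo) (count_over_time({job="httpd"} |= "countme=" [24h]))'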