193 commits

Author SHA1 Message Date
Stephen Smoogen
bfb1320bc9 Add a countme for CentOS Stream 9 2022-01-05 13:48:34 -05:00
Stephen Smoogen
b0f46d9ce7 remove getfedora statistics. not run in over a year.
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
2022-01-05 13:48:34 -05:00
Nils Philippsen
337248e4eb Muffle one more cron job
Signed-off-by: Nils Philippsen <nils@redhat.com>
2021-10-04 21:45:04 +00:00
Stephen Smoogen
f79e5d1b43 Add in a message to dbus to see if we can get why files aren't getting sorted
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
2021-09-30 13:45:39 -04:00
Stephen Smoogen
bb9864402f For some reason we are not sorting the files even though the script says we are. Try to get a result to find out why
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
2021-09-29 18:16:43 +00:00
Stephen Smoogen
3e4a02a427 Update gnuplot dates to 2021-12-31. The fact that no one asked for this says that hotspot and getfedora data is not watched 2021-09-29 18:16:43 +00:00
Stephen Smoogen
8dc88ade71 Add some of the EPEL-9 items into things. We need to clean up the code to add in -next counting at some point.
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
2021-09-29 18:16:43 +00:00
Stephen Smoogen
960de34413 Fix non-zeroed data
Matthew Miller found that running the awk script over multiple days
caused the counts for releases newer than F33 to go up forever. This
fix should zero out all the new variables.

Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
2021-09-29 18:16:43 +00:00
Stephen Smoogen
07076b9f9a Fix traceback in old mirror counting program.
When a corrupt date is found in the log file, I have the program
default to 1970-01-01, but because a month lookup is used it needed to
be 1970-Jan-01, which then gets replaced by 1970-01-01.

Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
2021-09-29 18:16:43 +00:00
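The fix described in 07076b9f9a can be illustrated with a minimal Python sketch. The names `MONTHS`, `DEFAULT_DATE`, and `normalize_date` are hypothetical, and the `YYYY-Mon-DD` input shape is an assumption drawn from the commit message, not code from the mirror counting program. The point is that the fallback value passes through the same month lookup as every other date, so it has to carry a month name.

```python
# Hypothetical illustration; not the actual mirror counting code.
MONTHS = {
    "Jan": "01", "Feb": "02", "Mar": "03", "Apr": "04",
    "May": "05", "Jun": "06", "Jul": "07", "Aug": "08",
    "Sep": "09", "Oct": "10", "Nov": "11", "Dec": "12",
}

# The default must use a month *name*: a default of "1970-01-01" would make
# the MONTHS lookup below fail with a KeyError on "01".
DEFAULT_DATE = "1970-Jan-01"


def normalize_date(raw):
    """Convert 'YYYY-Mon-DD' to 'YYYY-MM-DD'; corrupt input maps to the epoch."""
    parts = raw.split("-")
    if len(parts) != 3 or parts[1] not in MONTHS:
        parts = DEFAULT_DATE.split("-")
    year, month_name, day = parts
    return f"{year}-{MONTHS[month_name]}-{day}"
```

Under these assumptions, `normalize_date("2021-Sep-29")` gives `"2021-09-29"` and any unparseable string gives `"1970-01-01"`.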
Timothée Ravier
4d43b7e377 kinoite.fedoraproject.org: Add site and pipeline
Signed-off-by: Timothée Ravier <tim@siosm.fr>
2021-09-27 19:01:44 +00:00
Nils Philippsen
a0d70d7e4f Move executable scripts into /etc/cron.daily
Signed-off-by: Nils Philippsen <nils@redhat.com>
2021-09-23 10:41:57 +02:00
Nils Philippsen
369487a3bb Don't write normal operational message to stderr
This prevents sending out unnecessary mails when run from the related
cron job:

   condense-mirrorlogs.cron
-> condense-mirrorlogs.sh > /dev/null
-> mirrorlist.py

Additionally, report the failing file name in the case of an error.

Signed-off-by: Nils Philippsen <nils@redhat.com>
2021-09-22 10:49:30 +02:00
Nils Philippsen
6ecdab22c0 Clean up messages issued from sync-http-logs.py
Previously, the script was very talkative by default. Make the default
to be silent for log levels < WARNING and allow logging (at different
levels) to syslog. Additionally, configure the cronjob to log everything
of levels >= INFO to syslog.

Signed-off-by: Nils Philippsen <nils@redhat.com>
2021-09-22 10:49:30 +02:00
Nils Philippsen
27b41a491e Deploy mirrors-countme (only) as RPM package
This also ensures that the previously cloned git repository and local
installation of the Python package and associated scripts are removed.

Signed-off-by: Nils Philippsen <nils@redhat.com>
2021-09-15 14:41:41 +00:00
Nils Philippsen
cb61463c26 Allow presets for message bodies
This lets users of simple_message_to_bus predefine items which should be
present in all message bodies this way:

export MSGBODY_PRESET="key1=value1 key2=value2"

This doesn't work with spaces in either keys or values; any quotation
marks will be used verbatim.

Signed-off-by: Nils Philippsen <nils@redhat.com>
2021-09-09 12:40:56 +02:00
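The limitation noted in cb61463c26 follows from plain whitespace splitting. The sketch below is a Python illustration of that splitting rule only; `parse_preset` is a hypothetical helper, and nothing here is taken from simple_message_to_bus itself.

```python
import os


def parse_preset(preset):
    """Split 'key1=value1 key2=value2' into a dict.

    Items are separated on whitespace and each item on its first '=',
    so spaces inside keys or values break the pairing and quotation
    marks are kept as literal characters.
    """
    body = {}
    for item in preset.split():
        key, _, value = item.partition("=")
        body[key] = value
    return body


# Read the preset from the environment, defaulting to no preset items.
preset_body = parse_preset(os.environ.get("MSGBODY_PRESET", ""))
```

With this rule, `key1=value1 key2=value2` yields two entries, while `key="a b"` is split into the two items `key="a` and `b"`, which is why quoting does not help.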
Nils Philippsen
a3203d29d9 Get rid of implicit message topic prefix
Callers of simple_message_to_bus need to set and export MSGTOPIC_PREFIX
explicitly.

This decouples the fedora-messaging-utils and web-data-analysis roles.

Additionally, don't assume /bin/sh is /bin/bash.

Signed-off-by: Nils Philippsen <nils@redhat.com>
2021-09-09 12:40:56 +02:00
Nils Philippsen
0b518a7e88 Fix typo
Signed-off-by: Nils Philippsen <nils@redhat.com>
2021-09-09 10:38:38 +00:00
Nils Philippsen
ecd8ab8383 Merge syncing and combining logs into one cronjob
This should prevent race conditions in which combining the logs is
attempted before syncing those of individual hosts has finished.

Signed-off-by: Nils Philippsen <nils@redhat.com>
2021-09-09 10:38:38 +00:00
Nils Philippsen
a766ec6416 Merge awstats role into web-data-analysis
This is to enable running the syncing and combining scripts in
series rather than from independently scheduled cron jobs.

Signed-off-by: Nils Philippsen <nils@redhat.com>
2021-09-09 10:38:38 +00:00
Nils Philippsen
5e09dce82d Import fedora-messaging-utils role
Importing the role rather than listing it in the playbook lets its tasks
have the tags used in the importing role, i.e. should ensure they are
run when the things that need simple_message_to_bus are installed.

Additionally, don't attempt to install it manually from
web-data-analysis (it isn't found because it lives in a different role).

Signed-off-by: Nils Philippsen <nils@redhat.com>
2021-08-16 06:02:37 +00:00
Nils Philippsen
6e62fcbe69 Don't drop temporary files all over the place
When renaming a file over another which is the same hard link, the
rename is a no-op. This left many temporary files in /var/log/hosts
because syncing of a file is attempted (and it is thus hard-linked between
dated and undated file names) over a couple of days. The solution to
this is how the `ln` command does it: rename, then unlink the temporary
file.

Signed-off-by: Nils Philippsen <nils@redhat.com>
2021-08-12 09:45:49 +00:00
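The behaviour described in 6e62fcbe69 can be reproduced in a few lines of Python on a POSIX system. This is a sketch under stated assumptions: `rename_and_cleanup`, `demo`, and the file names are hypothetical and not taken from the syncing script. `os.rename` maps to rename(2), which does nothing when both names are hard links to the same file, so the temporary name has to be unlinked explicitly afterwards.

```python
import os
import tempfile


def rename_and_cleanup(tmp_path, dest_path):
    """Rename tmp_path over dest_path, then make sure tmp_path is gone."""
    os.rename(tmp_path, dest_path)
    try:
        # Still present only if the rename was a no-op (same hard link).
        os.unlink(tmp_path)
    except FileNotFoundError:
        # Normal case: the rename already removed the temporary name.
        pass


def demo():
    """Return the directory listing after renaming a hard link over its twin."""
    with tempfile.TemporaryDirectory() as workdir:
        dest = os.path.join(workdir, "access.log")
        tmp = os.path.join(workdir, ".access.log.tmp")
        with open(dest, "w") as handle:
            handle.write("data")
        os.link(dest, tmp)  # tmp and dest now name the same file
        rename_and_cleanup(tmp, dest)
        return sorted(os.listdir(workdir))
```

Without the `os.unlink` call, the listing returned by `demo()` would still contain the temporary file.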
Adam Saleh
db936062b3 Add more message-based tracing to log01 scripts 2021-08-11 11:18:17 +00:00
Nils Philippsen
f703e7a771 Add and use optimized http log syncing script
The previous one synced all hosts serially and ran rsync for each log
file. This reimplements the shell script in Python, with these changes:

- Run rsync on whole directories of log files, with much reduced
  overhead.
- Use a pool of five workers which process hosts in parallel.

Additionally, remove download-rdu01.vpn.fedoraproject.org from the list
of synced hosts.

Signed-off-by: Nils Philippsen <nils@redhat.com>
2021-08-05 16:44:47 +00:00
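The two changes listed in f703e7a771 (one rsync run per directory, and a pool of five workers) can be sketched as follows. This is not the deployed script: `sync_host`, `sync_all`, the remote path `/var/log/httpd/`, and the per-host destination layout are assumptions made for illustration. The `run` parameter exists only so the sketch can be exercised without invoking rsync.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def sync_host(host, run=subprocess.run):
    """Sync one host's whole log directory with a single rsync invocation."""
    # Hypothetical source and destination paths.
    cmd = ["rsync", "-a", f"{host}:/var/log/httpd/", f"/var/log/hosts/{host}/"]
    return host, run(cmd).returncode


def sync_all(hosts, run=subprocess.run, workers=5):
    """Process hosts in parallel and return {host: rsync exit code}."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(lambda host: sync_host(host, run), hosts))
```

Threads are sufficient here because each worker spends its time waiting on an external rsync process rather than running Python code.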
Stephen Smoogen
33df23d457 this will give copies of these emails to asaleh and nils so they can see how the cron jobs are working 2021-08-05 06:46:17 -04:00
Stephen Smoogen
b78179ed3c remove an email to smooge@smoogespace.com as debug is done 2021-08-04 08:38:42 -04:00
Adam Saleh
7a013fe511 Send tracing messages to the bus in syncHttpLogs
In the course, fix a typo which reduces stdout spam.

Signed-off-by: Adam Saleh <asaleh@redhat.com>
Signed-off-by: Nils Philippsen <nils@redhat.com>
2021-07-28 11:23:10 +02:00
Nils Philippsen
c782eceae1 Move syncHttpLogs.sh into web-data-analysis role
Signed-off-by: Nils Philippsen <nils@redhat.com>
2021-07-23 13:06:23 +02:00
Stephen Smoogen
9a54f23d1e Fix a lot of unknown arches in mirrorlist.py. Take a stab at fixing some of the graphs in mirrors-data.gp. Let the team figure out a better way to fix the rest. 2021-07-15 05:06:56 -04:00
Kevin Fenzi
b7a031c9fd fedoraloveskde.org: add site and pipeline to deploy it and dns zone
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-06-14 12:49:11 -07:00
Stephen Smoogen
e21c30a720 Make the git countme email me at home so I can see if there is an error
Signed-off-by: Stephen Smoogen <smooge@smoogespace.com>
2021-05-27 12:37:51 -04:00
Stephen Smoogen
255b10c922 Add in roles for f34-f39 and epel9 for counting with old stat program 2021-04-06 13:11:31 -04:00
Stephen Smoogen
625441f66b remove wwoods and put mattdm as owner of this script.
Signed-off-by: Stephen Smoogen <smooge@smoogespace.com>
2021-03-29 08:43:25 -04:00
Stephen Smoogen
e1c6dccd5b update the dates for web data analysis gnuplots til 2021-06-30. should give enough time for next project 2020-12-08 17:56:37 -05:00
Will Woods
9a4201efc1 suppress 'nothing added to commit..' messages from countme-update.sh
Right now countme-update.sh tries to `git commit -a` whether or not
anything has changed, which results in this output whenever there's no
new changes to commit:

    On branch master
    Untracked files:
      (use "git add <file>..." to include in what will be committed)
            raw.db
            totals.db

    nothing added to commit but untracked files present (use "git add" to track)

This commit tweaks `countme-update.sh` so that it only attempts `git commit`
if there are changes to be committed - i.e. when `git diff` returns 1.

Signed-off-by: Will Woods <wwoods@redhat.com>
2020-11-17 13:31:52 -05:00
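The check described in 9a4201efc1 can be sketched in Python as below. `commit_if_changed` is a hypothetical helper, not the contents of countme-update.sh, and the sketch assumes `git diff --quiet`, which implies `--exit-code` and therefore exits with 1 when tracked files differ and 0 when they do not. Untracked files such as raw.db do not affect that exit status.

```python
import subprocess


def commit_if_changed(repo_dir, message, run=subprocess.run):
    """Run `git commit -a` only when tracked files have changed."""
    diff = run(["git", "-C", repo_dir, "diff", "--quiet"])
    if diff.returncode != 1:
        # 0 means no changes; any other status is treated as "do not commit".
        return False
    run(["git", "-C", repo_dir, "commit", "-a", "-m", message], check=True)
    return True
```

The `run` parameter defaults to `subprocess.run` and is injectable only to make the sketch easy to try without a repository.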
Will Woods
3dadedeb26 web-data-analysis: fix countme-update
So it turns out that pip3 installs scripts to /usr/local/bin and cron
jobs don't have /usr/local/bin in the path.

This commit adds /usr/local/bin to PATH in countme-update.sh.

For Maximum Correctness we should probably get pip to tell us where it
installed countme-update-{rawdb,totals}.sh but this'll work just fine
as long as pip keeps installing scripts to /usr/bin or /usr/local/bin.

Signed-off-by: Will Woods <wwoods@redhat.com>
2020-11-10 17:57:56 -05:00
Stephen Smoogen
8d7d4f5389 when trying to find out why something is failing.. remove the /dev/null so that you can see why it is failing. Also add wwoods so he can delight in this. 2020-11-06 10:15:17 -05:00
Stephen Smoogen
1096f9b35f try to get the cron output to see why this is failing. 2020-11-06 10:07:15 -05:00
Kevin Fenzi
1cd00567f2 web-data-analysis: pip is pip3
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-10-28 08:33:02 -07:00
Kevin Fenzi
1736849caa web-data-analysis: add countme group before trying to add it to countme user
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-10-28 08:25:53 -07:00
Will Woods
f46768ec6b countme: add .gitconfig
This gives the web-data-analysis `countme` user a .gitconfig file so the
commits it makes in its local git repo have a proper user name and
email address. (Also it makes git stop complaining..)

The email address might not actually be valid, but this repo doesn't
currently go anywhere public so it shouldn't really matter.
2020-10-13 16:17:00 +00:00
Will Woods
f8a5720535 add 'countme' stuff to web-data-analysis role
This should automate running the "countme" scripts every day to parse
new log data and publish updated totals.

Here's what I've added to the ansible role:

* install package deps for `mirrors-countme`
* make "countme" user with home /srv/countme
* clone 'prod' branch of https://pagure.io/mirrors-countme to /srv/countme
  * if changed: pip install /srv/countme/mirrors-countme
* make web subdir /var/www/html/csv-reports/countme
* make local data dir /var/lib/countme
* install `countme-update.sh` to /usr/local/bin
* install `countme-update.cron` to /etc/cron.d
  * runs /usr/local/bin/countme-update.sh daily, as user `countme`

That should make sure `countme-update.sh` runs every day.
That script works like this:

1. Run `countme-update-rawdb.sh`
  * parse new mirrors.fp.o logs in /var/log/hosts/proxy*
  * write data to /var/lib/countme/raw.db
2. Run `countme-update-totals.sh`
  * parse raw data from /var/lib/countme/raw.db
  * write updated totals to /var/lib/countme/totals.{db,csv}
3. Track changes in updated totals
  * set up /var/lib/countme as git repo (if needed)
  * commit new `totals.csv` (if changed)
4. Make updated totals public
  * Copy totals.{db,csv} to /var/www/html/csv-reports/countme

For safety's sake, I've tried to set up everything so it runs as the
`countme` user rather than running everything as `root`. This might be
an unnecessary complication but it seemed like the right thing to do.

Similarly, keeping totals.csv in a git repo isn't _required_, but it
seemed like a good idea to keep historical records in case we want/need
to change the counting algorithm or something.

I checked the YAML with ansible-lint and tested that all the scripts
work as expected when run as `wwoods`, so unless I've missed something
this should do the trick.
2020-10-13 16:17:00 +00:00
Stephen Smoogen
28aa22994e this file was needed for older log analysis and not for the current one 2020-08-18 14:01:12 -04:00
Stephen Smoogen
e86fb420fd remove a playbook and job which should not be run on log01 2020-08-04 08:14:48 -04:00
Stephen Smoogen
ebfeeecc83 remove moving average from hotspot as it will not work in EL8 2020-08-03 15:11:38 -04:00
Stephen Smoogen
a694ee6042 again remove ending backslash from last line before unset 2020-08-03 07:25:15 -04:00
Stephen Smoogen
11b735ce27 move all latest start dates to be the same of 2018-01-01 2020-07-31 18:20:46 -04:00
Stephen Smoogen
348f7e76cd unknown_releases are currently getting a large amount due to the open cisco repository getting seen. need to rerun 6 months of data 2020-07-31 18:19:15 -04:00
Stephen Smoogen
64f85b6657 remove moving average graphs as they don't work 2020-07-31 18:08:26 -04:00
Stephen Smoogen
e60973c6cb fix a trailing \, in the gnuplot which does not look well 2020-07-31 18:03:12 -04:00
Stephen Smoogen
98b38667f0 add fedora 30 to 33 to the mirrors-data.gp. These are the last releases which can be added to this software without some major changes in how the csv and db are stored. 2020-07-31 16:41:21 -04:00