Right now `countme-update.sh` tries to `git commit -a` whether or not
anything has changed, which results in this output whenever there are no
new changes to commit:
    On branch master
    Untracked files:
      (use "git add <file>..." to include in what will be committed)
            raw.db
            totals.db
    nothing added to commit but untracked files present (use "git add" to track)
This commit tweaks `countme-update.sh` so that it only attempts `git commit`
if there are changes to be committed - i.e. when `git diff` exits with
status 1 (which it only does when run with `--exit-code` or `--quiet`).
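A minimal sketch of that kind of check, assuming only changes to already-tracked files matter. The helper name `commit_if_changed` and its arguments are illustrative and not taken from `countme-update.sh`:

```shell
#!/bin/sh
# Hypothetical helper: commit tracked changes in a repo only when there are any.
# Usage: commit_if_changed <repo-dir> <commit-message>
commit_if_changed() {
    repo=$1
    msg=$2
    # `git diff --quiet` exits 0 when tracked files match the index and 1 when
    # they differ. Untracked files (raw.db, totals.db above) are ignored.
    if git -C "$repo" diff --quiet; then
        return 0    # nothing changed, so skip the commit entirely
    fi
    git -C "$repo" commit --quiet -a -m "$msg"
}
```

Called as `commit_if_changed /var/lib/countme "update totals"`, this stays silent on days when nothing changed. One caveat: it compares the working tree to the index, so changes that were already staged by hand would not be noticed.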
Signed-off-by: Will Woods <wwoods@redhat.com>
So it turns out that pip3 installs scripts to /usr/local/bin, and cron
jobs don't have /usr/local/bin in their PATH.
This commit adds /usr/local/bin to PATH in countme-update.sh.
For Maximum Correctness we should probably get pip to tell us where it
installed countme-update-{rawdb,totals}.sh but this'll work just fine
as long as pip keeps installing scripts to /usr/bin or /usr/local/bin.
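Roughly, the change amounts to something like the sketch below. Whether the real script prepends or appends the directory is an assumption here, and the `case` guard against duplicate entries is an extra that is not described in this commit:

```shell
#!/bin/sh
# cron usually starts jobs with a minimal PATH, so make sure /usr/local/bin
# (where pip3 put the countme scripts) is searched as well.
case ":$PATH:" in
    *:/usr/local/bin:*) ;;                  # already present, nothing to do
    *) PATH="/usr/local/bin:$PATH" ;;       # assumed order: prepend
esac
export PATH
```

Wrapping `$PATH` in colons lets a single pattern match the directory whether it is first, last, or in the middle of the list.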
Signed-off-by: Will Woods <wwoods@redhat.com>
This gives the web-data-analysis `countme` user a .gitconfig file so the
commits it makes in its local git repo have a proper user name and
email address. (It also makes git stop complaining.)
The email address might not actually be valid, but this repo doesn't
currently go anywhere public so it shouldn't really matter.
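For illustration only, here is one way such a file could be produced by hand. The helper name and the name/email values are placeholders, and this is not how the role itself installs the file:

```shell
#!/bin/sh
# Hypothetical helper: record a user name and email in a gitconfig file.
# Usage: write_git_identity <config-file> <name> <email>
write_git_identity() {
    git config --file "$1" user.name "$2" &&
    git config --file "$1" user.email "$3"
}

# Example call with placeholder values (not the real ones):
# write_git_identity "$HOME/.gitconfig" "countme" "countme@example.invalid"
```

With `user.name` and `user.email` set in the user's `.gitconfig`, git no longer has to guess an identity from the hostname when committing.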
This should automate running the "countme" scripts every day to parse
new log data and publish updated totals.
Here's what I've added to the ansible role:
* install package deps for `mirrors-countme`
* make "countme" user with home /srv/countme
* clone 'prod' branch of https://pagure.io/mirrors-countme to /srv/countme
* if changed: pip install /srv/countme/mirrors-countme
* make web subdir /var/www/html/csv-reports/countme
* make local data dir /var/lib/countme
* install `countme-update.sh` to /usr/local/bin
* install `countme-update.cron` to /etc/cron.d
  * runs /usr/local/bin/countme-update.sh daily, as user `countme`
That should make sure `countme-update.sh` runs every day.
That script works like this:
1. Run `countme-update-rawdb.sh`
   * parse new mirrors.fp.o logs in /var/log/hosts/proxy*
   * write data to /var/lib/countme/raw.db
2. Run `countme-update-totals.sh`
   * parse raw data from /var/lib/countme/raw.db
   * write updated totals to /var/lib/countme/totals.{db,csv}
3. Track changes in updated totals
   * set up /var/lib/countme as git repo (if needed)
   * commit new `totals.csv` (if changed)
4. Make updated totals public
   * copy totals.{db,csv} to /var/www/html/csv-reports/countme
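The four steps above could be sketched roughly as follows. Only the paths and the two `countme-update-*.sh` script names come from this message; the function names, the overridable variables and the commit message are assumptions, and the arguments the real scripts take are left out because they are not spelled out here:

```shell
#!/bin/sh
# Sketch only: paths are from the commit message, everything else is assumed.
LOCAL_DIR="${LOCAL_DIR:-/var/lib/countme}"
PUBLIC_DIR="${PUBLIC_DIR:-/var/www/html/csv-reports/countme}"
UPDATE_RAWDB="${UPDATE_RAWDB:-countme-update-rawdb.sh}"
UPDATE_TOTALS="${UPDATE_TOTALS:-countme-update-totals.sh}"

track_totals() {
    # Step 3: create the repo on first run, then commit totals.csv if changed.
    [ -d "$LOCAL_DIR/.git" ] || git init --quiet "$LOCAL_DIR" || return 1
    git -C "$LOCAL_DIR" add totals.csv || return 1
    # After `git add`, `diff --cached --quiet` exits 1 only when the staged
    # totals.csv differs from the last commit (or there is no commit yet).
    git -C "$LOCAL_DIR" diff --cached --quiet ||
        git -C "$LOCAL_DIR" commit --quiet -m "Update totals.csv"
}

publish_totals() {
    # Step 4: copy the totals into the web-visible directory.
    cp "$LOCAL_DIR/totals.db" "$LOCAL_DIR/totals.csv" "$PUBLIC_DIR/"
}

countme_update() {
    # Steps 1 and 2 run the real scripts; their arguments are omitted here.
    # Each step runs only if the previous one succeeded.
    "$UPDATE_RAWDB" &&
    "$UPDATE_TOTALS" &&
    track_totals &&
    publish_totals
}
```

Chaining the steps with `&&` means a failed log parse never publishes stale or half-written totals. Only `totals.csv` is added to the repo, so `raw.db` and `totals.db` stay untracked.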
For safety's sake, I've tried to set up everything so it runs as the
`countme` user rather than running everything as `root`. This might be
an unnecessary complication but it seemed like the right thing to do.
Similarly, keeping totals.csv in a git repo isn't _required_, but it
seemed like a good idea to keep historical records in case we want/need
to change the counting algorithm or something.
I checked the YAML with ansible-lint and tested that all the scripts
work as expected when run as `wwoods`, so unless I've missed something
this should do the trick.