Fedora Infrastructure Ansible Repository
Adrian Reber 7146663478 mm2-crawler: reduce database load during crawling
The two crawlers used to start 25 threads (each) every 12 hours to crawl
the mirrors. The second crawler was already started two hours later to
reduce the database load. This commit increases the delay of the second
crawler to 6 hours, so that every crawler has the MM database for 6
hours on its own. At the same time the number of parallel crawlers is
reduced from 25 to 20 to also reduce the load on the database.

In addition, the crawl timeout has been increased from 3 to 4 hours. This
is because pub/archive in particular has grown, as has pub/fedora with
the addition of the modular tree. Crawl timeouts can now be seen more
often, which can lead to mirrors being auto-disabled.

The main reason for these changes is that the logs show that the actual
crawling of the mirrors does not always take most of the time; updating
the state of all directories of each mirror in the database can take a
very long time. By reducing the number of parallel accesses to the
database, in the best case from 50 to 20, the crawling should get
faster (hopefully).

Signed-off-by: Adrian Reber <adrian@lisas.de>
2017-10-08 13:15:48 +02:00
callback_plugins Cross the Ts 2017-08-16 20:13:27 +00:00
files Move the infra tags over to koji dist-repos 2017-10-06 21:53:44 +02:00
filter_plugins Consider the qa network too when building the stg fedmsg routing policy. 2016-03-23 19:14:43 +00:00
handlers Rename this handler. 2017-08-27 20:02:56 +00:00
inventory Revert "this is br2 now" 2017-10-07 23:54:23 +00:00
library builders: initial pass at a ARMv7 virt builder 2017-04-13 17:58:50 +00:00
playbooks Build wcidff on prod sundries. 2017-10-07 09:55:55 +00:00
roles mm2-crawler: reduce database load during crawling 2017-10-08 13:15:48 +02:00
scripts add resultsdb to public db dumps, clean up script to use fqdn 2017-10-07 21:24:51 +00:00
tasks do not replace base repo on branched instance 2017-09-10 20:11:11 +00:00
vars add fedora26 image here 2017-08-16 06:08:41 +00:00
.gitignore Migrate a bunch of things to roles. Thanks to misc! 2013-08-19 20:12:26 +00:00
CONVENTIONS Fix typo in doc 2016-04-04 14:33:21 +00:00
master.yml Add preliminary definitions for Ansible Magazine 2017-09-14 14:44:19 +00:00
README Undo test 2016-11-01 09:42:09 +00:00
README.cloud Update instructions for deleting transient cloud nodes. 2016-04-01 14:19:23 +00:00
TODO restructure virt to be more like cloud creation 2013-05-03 16:56:38 +00:00

== ansible repository/structure ==

files - files and templates for use in playbooks/tasks
      - subdirs for specific tasks/dirs highly recommended

inventory - where the inventory and additional vars are stored
          - All files in this directory are in ini format
          - They are added together to form the total inventory
  group_vars: 
          - per group variables set here in a file per group 
  host_vars: 
          - per host variables set here in a file per host 

library - library of custom local ansible modules

playbooks - collections of plays we want to run on systems

  groups: groups of hosts configured from one playbook.
  
  hosts: playbooks for single hosts. 

  manual: playbooks that are only run manually by an admin as needed.

tasks - snippets of tasks that should be included in plays

roles - specific roles to be used in playbooks.
        Each role has its own files/templates/vars

filter_plugins - Jinja filters

master.yml - This is the master playbook, consisting of all
             current group and host playbooks. Note that the
             daily cron doesn't run this; it also runs over
             playbooks that are not yet included in master.
             This playbook is useful for making changes over
             multiple groups/hosts, usually with -t (tag).

== Paths ==

public path for everything is:

 /srv/web/infra/ansible

private path, which is accessible only to sysadmin-main, is:

 /srv/private/ansible

In general, to run any ansible playbook you will want to run:

sudo -i ansible-playbook /path/to/playbook.yml
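
As a rough illustration of how the public path and the -t option fit
together, the sketch below only prints command lines and does not run
anything. The playbook_cmd helper, the "sometag" tag and the
groups/example.yml playbook are hypothetical names made up for this
example; -t, --check and --diff are standard ansible-playbook options.

```shell
# Sketch only: prints the command lines instead of executing them.
# ANSIBLE_PUBLIC is the public path given above; the rest is illustrative.
ANSIBLE_PUBLIC=/srv/web/infra/ansible

# playbook_cmd PLAYBOOK [OPTIONS...]
# PLAYBOOK is relative to the public path; OPTIONS are passed through as-is.
playbook_cmd() {
    pb="$1"
    shift
    # Strip the trailing space that is left when no options were given.
    echo "sudo -i ansible-playbook $ANSIBLE_PUBLIC/$pb $*" | sed 's/ *$//'
}

# Run the master playbook, limited to one (hypothetical) tag.
playbook_cmd master.yml -t sometag

# Preview what a (hypothetical) group playbook would change, without applying it.
playbook_cmd playbooks/groups/example.yml --check --diff
```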

== Scheduled check-diff ==

Every night a cron job runs over all playbooks under playbooks/{groups,hosts}
with the ansible --check --diff options. A report from this is sent to
sysadmin-logs. In the ideal state this report would be empty.
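
The nightly pass could look roughly like the loop below. This is a
sketch, not the actual cron job: it assumes playbooks end in .yml, and
the ANSIBLE_ROOT and ANSIBLE_PLAYBOOK variables are hypothetical knobs
introduced here so the loop can be tried without touching any host.

```shell
# Sketch of a nightly check/diff pass; NOT the real cron job.
# ANSIBLE_ROOT and ANSIBLE_PLAYBOOK are hypothetical overrides.
ANSIBLE_ROOT="${ANSIBLE_ROOT:-/srv/web/infra/ansible}"
ANSIBLE_PLAYBOOK="${ANSIBLE_PLAYBOOK:-ansible-playbook}"

check_diff_all() {
    for pb in "$ANSIBLE_ROOT"/playbooks/groups/*.yml \
              "$ANSIBLE_ROOT"/playbooks/hosts/*.yml; do
        # Skip the literal pattern when a directory has no matching files.
        [ -e "$pb" ] || continue
        # --check reports what would change; --diff shows the differences.
        $ANSIBLE_PLAYBOOK --check --diff "$pb"
    done
}
```

Setting ANSIBLE_PLAYBOOK=echo before calling check_diff_all prints the
arguments for each playbook instead of running ansible-playbook.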

== Idempotency ==

All playbooks should be idempotent. I.e., if run once they should bring the
machine(s) to the desired state, and if run again N times after that they
should make 0 changes (because the machine(s) are already in the desired state).
Please make sure your playbooks are idempotent.
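
One rough way to check this is to run a playbook a second time and look
at the "changed=N" counters that ansible-playbook prints in its PLAY
RECAP lines. The recap_is_clean helper below is a hypothetical name and
only a sketch: it looks at the changed counters alone and ignores
failed or unreachable hosts.

```shell
# Sketch: decide from ansible-playbook output whether a run changed anything.
# recap_is_clean is a hypothetical helper, not part of ansible.
recap_is_clean() {
    # Read playbook output on stdin; succeed only if no "changed=" counter
    # is non-zero (changed=0 does not match, changed=3 or changed=10 does).
    ! grep -Eq 'changed=[1-9]'
}

# Usage, on the second run of a playbook that should be idempotent:
#   sudo -i ansible-playbook /path/to/playbook.yml | recap_is_clean \
#       || echo "second run still made changes"
```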

== Can be run anytime ==

When a playbook or change is checked into ansible you should assume
that it could be run at ANY TIME. Always make sure the checked-in state
is the desired state. Always test changes when they land so they don't
surprise you later.