2021-07-26 10:39:47 +02:00
|
|
|
= Ansible infrastructure SOP/Information.
|
|
|
|
|
|
|
|
== Background
|
|
|
|
|
|
|
|
Fedora infrastructure used to use func and puppet for system change
|
|
|
|
management. We are now using ansible for all system change mangement and
|
|
|
|
ad-hoc tasks.
|
|
|
|
|
|
|
|
== Overview
|
|
|
|
|
|
|
|
Ansible runs from batcave01 or backup01. These hosts run a ssh-agent
|
|
|
|
that has unlocked the ansible root ssh private key. (This is unlocked
|
|
|
|
manually by a human with the passphrase each reboot, the passphrase
|
|
|
|
itself is not stored anywhere on the machines). Using 'sudo -i',
|
|
|
|
sysadmin-main members can use this agent to access any machines with the
|
|
|
|
ansible root ssh public key setup, either with 'ansible' for one-off
|
|
|
|
commands or 'ansible-playbook' to run playbooks.
|
|
|
|
|
|
|
|
Playbooks are idempotent (or should be). Meaning you should be able to
|
|
|
|
re-run the same playbook over and over and it should get to a state
|
|
|
|
where 0 items are changing.
|
|
|
|
|
|
|
|
Additionally (see below) there is a rbac wrapper that allows members of
|
|
|
|
some other groups to run playbooks against specific hosts.
|
|
|
|
|
|
|
|
=== GIT repositories
|
|
|
|
|
|
|
|
There are 2 git repositories associated with Ansible:
|
|
|
|
|
|
|
|
* The Fedora Infrastructure Ansible repository and replicas.
|
2021-08-16 18:24:53 +02:00
|
|
|
|
2021-07-26 10:39:47 +02:00
|
|
|
[CAUTION]
|
|
|
|
====
|
|
|
|
This is a public repository. Never commit private data to this repo.
|
|
|
|
====
|
2021-08-16 18:24:53 +02:00
|
|
|
|
2021-07-26 10:39:47 +02:00
|
|
|
image:ansible-repositories.png[image]
|
2021-08-16 18:24:53 +02:00
|
|
|
|
2021-07-26 10:39:47 +02:00
|
|
|
This repository exists as several copies or replicas:
|
2021-08-16 18:24:53 +02:00
|
|
|
|
2021-07-26 10:39:47 +02:00
|
|
|
** The "upstream" repository on Pagure.
|
2021-08-16 18:24:53 +02:00
|
|
|
|
2021-07-26 10:39:47 +02:00
|
|
|
https://pagure.io/fedora-infra/ansible
|
2021-08-16 18:24:53 +02:00
|
|
|
|
2021-07-26 10:39:47 +02:00
|
|
|
This repository is the public facing place where people can contribute
|
|
|
|
(e.g. pull requests) as well as the authoritative source. Members of the
|
|
|
|
`sysadmin` FAS group or the `fedora-infra` Pagure group have commit
|
|
|
|
access to this repository.
|
2021-08-16 18:24:53 +02:00
|
|
|
|
2021-07-26 10:39:47 +02:00
|
|
|
To contribute changes, fork the repository on Pagure and submit a Pull
|
|
|
|
Request. Someone from the aforementioned groups can then review and
|
|
|
|
merge them.
|
2021-08-16 18:24:53 +02:00
|
|
|
|
2021-07-26 10:39:47 +02:00
|
|
|
It is recommended that you configure git to use `pull --rebase` by
|
|
|
|
default by running `git config --bool pull.rebase true` in your ansible
|
|
|
|
clone directory. This configuration prevents unneeded merges which can
|
|
|
|
occur if someone else pushes changes to the remote repository while you
|
|
|
|
are working on your own local changes.
|
2021-08-16 18:24:53 +02:00
|
|
|
|
|
|
|
** Two bare mirrors on _batcave01_, `/srv/git/ansible.git`
|
2021-07-26 10:39:47 +02:00
|
|
|
and `/srv/git/mirrors/ansible.git`
|
2021-08-16 18:24:53 +02:00
|
|
|
|
2021-07-26 10:39:47 +02:00
|
|
|
[CAUTION]
|
|
|
|
====
|
|
|
|
These are public repositories. Never commit private data to these
|
|
|
|
repositories. Don't commit or push to these repos directly, unless
|
|
|
|
Pagure is unavailable.
|
|
|
|
====
|
2021-08-16 18:24:53 +02:00
|
|
|
|
|
|
|
The `mirror_pagure_ansible` service on _batcave01_ receives
|
2021-07-26 10:39:47 +02:00
|
|
|
bus messages about changes in the repository on Pagure, fetches these
|
|
|
|
into `/srv/git/mirrors/ansible.git` and pushes from there to
|
|
|
|
`/srv/git/ansible.git`. When this happens, various actions are triggered
|
|
|
|
via git hooks:
|
2021-08-16 18:24:53 +02:00
|
|
|
|
2021-07-26 10:39:47 +02:00
|
|
|
*** The working copy at `/srv/web/infra/ansible` is updated.
|
2021-08-16 18:24:53 +02:00
|
|
|
|
|
|
|
*** A mail about the changes is sent to _sysadmin-members_.
|
|
|
|
|
2021-07-26 10:39:47 +02:00
|
|
|
*** The changes are announced on the message bus, which in turn triggers
|
|
|
|
announcements on IRC.
|
2021-08-16 18:24:53 +02:00
|
|
|
|
|
|
|
You can check out the repo locally on _batcave01_ with:
|
|
|
|
|
2021-07-26 10:39:47 +02:00
|
|
|
....
|
|
|
|
git clone /srv/git/ansible.git
|
|
|
|
....
|
2021-08-16 18:24:53 +02:00
|
|
|
|
2021-07-26 10:39:47 +02:00
|
|
|
If the Ansible repository on Pagure is unavailable, members of the
|
2021-08-16 18:24:53 +02:00
|
|
|
_sysadmin_ group may commit directly, provided this
|
2021-07-26 10:39:47 +02:00
|
|
|
procedure is followed:
|
|
|
|
[arabic]
|
|
|
|
. The synchronization service is stopped and disabled:
|
2021-08-16 18:24:53 +02:00
|
|
|
|
2021-07-26 10:39:47 +02:00
|
|
|
....
|
|
|
|
sudo systemctl disable --now mirror_pagure_ansible.service
|
|
|
|
....
|
2021-08-16 18:24:53 +02:00
|
|
|
. Changes are applied to the repository on _batcave01_.
|
2021-07-26 10:39:47 +02:00
|
|
|
. After Pagure is available again, the changes are pushed to the
|
|
|
|
repository there.
|
|
|
|
. The synchronization service is enabled and started:
|
2021-08-16 18:24:53 +02:00
|
|
|
|
2021-07-26 10:39:47 +02:00
|
|
|
....
|
|
|
|
sudo systemctl enable --now mirror_pagure_ansible.service
|
|
|
|
....
|
2021-08-16 18:24:53 +02:00
|
|
|
** `/srv/web/infra/ansible` on _batcave01_, the working copy
|
2021-07-26 10:39:47 +02:00
|
|
|
from which playbooks are run.
|
2021-08-16 18:24:53 +02:00
|
|
|
|
2021-07-26 10:39:47 +02:00
|
|
|
[CAUTION]
|
|
|
|
====
|
|
|
|
This is a public repository. Never commit private data to this repo.
|
|
|
|
Don't commit or push to this repo directly, unless Pagure is
|
|
|
|
unavailable.
|
|
|
|
====
|
|
|
|
+
|
|
|
|
You can access it also via a cgit web interface at:
|
|
|
|
https://pagure.io/fedora-infra/ansible/
|
|
|
|
|
2021-08-16 18:24:53 +02:00
|
|
|
* `/srv/git/ansible-private` on _batcave01_.
|
|
|
|
|
2021-07-26 10:39:47 +02:00
|
|
|
[CAUTION]
|
|
|
|
====
|
|
|
|
This is a private repository for passwords and other sensitive data. It
|
|
|
|
is not available in cgit, nor should it be cloned or copied remotely.
|
|
|
|
====
|
2021-08-16 18:24:53 +02:00
|
|
|
|
2021-07-26 10:39:47 +02:00
|
|
|
This repository is only accessible to members of 'sysadmin-main'.
|
|
|
|
|
|
|
|
=== Cron job/scheduled runs
|
|
|
|
|
|
|
|
With use of run_ansible-playbook_cron.py that is run daily via cron we
|
2021-08-16 18:24:53 +02:00
|
|
|
walk through playbooks and run them with _--check --diff_
|
2021-07-26 10:39:47 +02:00
|
|
|
params to perform a dry-run.
|
|
|
|
|
|
|
|
This way we make sure all the playbooks are idempotent and there is no
|
|
|
|
unexpected changes on servers (or playbooks).
|
|
|
|
|
|
|
|
=== Logging
|
|
|
|
|
|
|
|
We have in place a callback plugin that stores history for any
|
|
|
|
ansible-playbook runs and then sends a report each day to
|
|
|
|
sysadmin-logs-members with any CHANGED or FAILED actions. Additionally,
|
|
|
|
there's a fedmsg plugin that reports start and end of ansible playbook
|
|
|
|
runs to the fedmsg bus. Ansible also logs to syslog verbose reporting of
|
|
|
|
when and what commands and playbooks were run.
|
|
|
|
|
|
|
|
=== role based access control for playbooks
|
|
|
|
|
2021-08-16 18:24:53 +02:00
|
|
|
There's a wrapper script on _batcave01_ called 'rbac-playbook' that allows
|
2021-07-26 10:39:47 +02:00
|
|
|
non sysadmin-main members to run specific playbooks against specific
|
|
|
|
groups of hosts. This is part of the ansible_utils package. The upstream
|
|
|
|
for ansible_utils is: https://bitbucket.org/tflink/ansible_utils
|
|
|
|
|
|
|
|
To add a new group:
|
|
|
|
|
|
|
|
[arabic]
|
|
|
|
. add the playbook name and sysadmin group to the rbac-playbook
|
|
|
|
(ansible-private repo)
|
|
|
|
. add that sysadmin group to sudoers on batcave01 (also in
|
|
|
|
ansible-private repo)
|
|
|
|
|
|
|
|
To use the wrapper:
|
|
|
|
|
|
|
|
....
|
|
|
|
sudo rbac-playbook playbook.yml
|
|
|
|
....
|
|
|
|
|
|
|
|
== Directory setup
|
|
|
|
|
|
|
|
=== Inventory
|
|
|
|
|
|
|
|
The inventory directory tells ansible all the hosts that are managed by
|
|
|
|
it and the groups they are in. All files in this dir are concatenated
|
|
|
|
together, so you can split out groups/hosts into separate files for
|
|
|
|
readability. They are in ini file format.
|
|
|
|
|
|
|
|
Additionally under the inventory directory are host_vars and group_vars
|
|
|
|
subdirectories. These are files named for the host or group and
|
|
|
|
containing variables to set for that host or group. You should strive to
|
|
|
|
set variables in the highest level possible, and precedence is in:
|
|
|
|
global, group, host order.
|
|
|
|
|
|
|
|
=== Vars
|
|
|
|
|
|
|
|
This directory contains global variables as well as OS specific
|
|
|
|
variables. Note that in order to use the OS specific ones you must have
|
|
|
|
'gather_facts' as 'True' or ansible will not have the facts it needs to
|
|
|
|
determine the OS.
|
|
|
|
|
|
|
|
=== Roles
|
|
|
|
|
|
|
|
Roles are a collection of tasks/files/templates that can be used on any
|
|
|
|
host or group of hosts that all share that role. In other words, roles
|
|
|
|
should be used except in cases where configuration only applies to a
|
|
|
|
single host. Roles can be reused between hosts and groups and are more
|
|
|
|
portable/flexable than tasks or specific plays.
|
|
|
|
|
|
|
|
=== Scripts
|
|
|
|
|
|
|
|
In the ansible git repo under scripts are a number of utilty scripts for
|
|
|
|
sysadmins.
|
|
|
|
|
|
|
|
=== Playbooks
|
|
|
|
|
|
|
|
In the ansible git repo there's a directory for playbooks. The top level
|
|
|
|
contains utility playbooks for sysadmins. These playbooks perform
|
|
|
|
one-off functions or gather information. Under this directory are hosts
|
|
|
|
and groups playbooks. These playbooks are for specific hosts and groups
|
|
|
|
of hosts, from provision to fully configured. You should only use a host
|
|
|
|
playbook in cases where there will never be more than one of that thing.
|
|
|
|
|
|
|
|
=== Tasks
|
|
|
|
|
|
|
|
This directory contains one-off tasks that are used in playbooks. Some
|
|
|
|
of these should be migrated to roles (we had this setup before roles
|
|
|
|
existed in ansible). Those that are truely only used on one host/group
|
|
|
|
could stay as isolated tasks.
|
|
|
|
|
|
|
|
=== Syntax
|
|
|
|
|
|
|
|
Ansible now warns about depreciated syntax. Please fix any cases you see
|
|
|
|
related to depreciation warnings.
|
|
|
|
|
|
|
|
Templates use the jinja2 syntax.
|
|
|
|
|
|
|
|
== Libvirt virtuals
|
|
|
|
|
|
|
|
* TODO: add steps to make new libvirt virtuals in staging and production
|
|
|
|
* TODO: merge in new-hosts.txt
|
|
|
|
|
|
|
|
== Cloud Instances
|
|
|
|
|
|
|
|
* TODO: add how to make new cloud instances
|
|
|
|
* TODO: merge in from ansible README file.
|
|
|
|
|
|
|
|
== rdiff-backups
|
|
|
|
|
|
|
|
see:
|
|
|
|
https://fedora-infra-docs.readthedocs.io/en/latest/sysadmin-guide/sops/rdiff-backup.html
|
|
|
|
|
|
|
|
== Additional Reading/Resources
|
|
|
|
|
|
|
|
Upstream docs:::
|
|
|
|
https://docs.ansible.com/
|
|
|
|
Example repo with all kinds of examples:::
|
|
|
|
* https://github.com/ansible/ansible-examples
|
|
|
|
* https://gist.github.com/marktheunissen/2979474
|
|
|
|
Jinja2 docs:::
|
|
|
|
http://jinja.pocoo.org/docs/
|