Added the infra SOPs ported to asciidoc.
parent
8a7f111a12
commit
a0301e30f1
148 changed files with 18575 additions and 17 deletions
133
modules/sysadmin_guide/pages/rabbitmq.adoc
Normal file
@@ -0,0 +1,133 @@
= RabbitMQ SOP

link:https://www.rabbitmq.com/[RabbitMQ] is the message broker Fedora uses to let applications
send messages to each other (or to themselves).

== Contact Information

=== Owner

Fedora Infrastructure Team

=== Contact

#fedora-admin

=== Servers

* rabbitmq0[1-3].phx2.fedoraproject.org
* rabbitmq0[1-3].stg.phx2.fedoraproject.org

=== Purpose

General-purpose publish-subscribe message broker, also used for
application-specific messaging.

== Description

RabbitMQ is a message broker written in Erlang that offers a number of
interfaces, including AMQP 0.9.1, AMQP 1.0, STOMP, and MQTT. At this time
only AMQP 0.9.1 is made available to clients.

Fedora uses the RabbitMQ packages provided by the Red Hat OpenStack
repository because it has a more up-to-date version.

=== The Cluster

RabbitMQ supports link:https://www.rabbitmq.com/clustering.html[clustering] a set of hosts into a single logical
message broker. The Fedora cluster is composed of 3 nodes,
rabbitmq01-03, in both staging and production. `groups/rabbitmq.yml` is
the playbook that deploys the cluster.

=== Virtual Hosts

The cluster contains a number of virtual hosts. Each virtual host has
its own set of resources (exchanges, bindings, and queues), and users are
given permissions per virtual host.
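
To see what a given virtual host actually contains, the per-vhost listing commands of `rabbitmqctl` can be looped over. The following is only a sketch: the helper name `show_vhost` is hypothetical, and it assumes it is run on a cluster node by a user allowed to call `rabbitmqctl`.

```shell
# Hypothetical helper (not part of the playbooks): print the exchanges,
# bindings, queues and user permissions of a single virtual host.
show_vhost() {
    vhost="$1"
    for item in exchanges bindings queues permissions; do
        echo "== ${item} in ${vhost} =="
        # list_exchanges, list_bindings, list_queues and list_permissions
        # all accept -p to select the virtual host.
        rabbitmqctl "list_${item}" -p "${vhost}"
    done
}

# Example:
#   show_vhost /pubsub
```

Because permissions are granted per virtual host, running the helper once for each vhost shows which users can reach which resources.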

==== /pubsub

The /pubsub virtual host is the generic publish-subscribe virtual host
used by most applications. Messages published via AMQP are sent to the
"amq.topic" exchange. Messages being bridged from fedmsg into AMQP are
sent via "zmq.topic".

==== /public_pubsub

This virtual host has the "amq.topic" and "zmq.topic" exchanges from
/pubsub link:https://www.rabbitmq.com/federation.html[federated] to it, and we allow anyone on the Internet to
connect to this virtual host. For the moment it is on the same broker
cluster, but if people abuse it, it can be moved to a separate cluster.

=== Authentication

Clients authenticate to the broker using X.509 certificates. The common
name of the certificate needs to match the username of a user in
RabbitMQ.
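
When a client cannot log in, a quick first check is whether the certificate's common name really equals the intended RabbitMQ username. The sketch below assumes the subject is printed with `openssl x509 -noout -subject -nameopt RFC2253`; the helper names, `client.crt`, and `myapp` are hypothetical, and values containing escaped commas are not handled.

```shell
# Print the common name (CN) found in an RFC 2253 subject line on stdin,
# e.g. "subject=CN=myapp,O=Example".
subject_cn() {
    # Put each name component on its own line, drop the "subject=" prefix,
    # then print only the component that starts with "CN=".
    tr ',' '\n' | sed -n 's/^subject= *//; s/^CN=//p'
}

# Succeed if the CN read from stdin equals the username given as $1.
cn_matches_user() {
    [ "$(subject_cn)" = "$1" ]
}

# Example (hypothetical file and user names):
#   openssl x509 -in client.crt -noout -subject -nameopt RFC2253 \
#       | cn_matches_user myapp && echo "CN matches"
#   rabbitmqctl list_users    # confirm the user exists on the broker
```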

== Troubleshooting

RabbitMQ offers a CLI, `rabbitmqctl`, which you can use on any node in the
cluster. It also offers a web interface for management and monitoring,
but that is not currently configured.

=== Network Partition

In case of a network partition, the RabbitMQ cluster should handle it and
recover on its own. If it has not recovered once the network situation is
fixed, the partition can be diagnosed with `rabbitmqctl cluster_status`.
The output should include the line `{partitions,[]},` (empty array).

If the array is not empty, the first nodes in the array can be
restarted one by one, but make sure you give them plenty of time to
sync messages after restart (this can be watched in the
`/var/log/rabbitmq/rabbit.log` file).
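
The partition check described above can be scripted by searching the status output for the literal empty-partitions term. This is a minimal sketch: the function name `no_partitions` is hypothetical, and it assumes `rabbitmqctl cluster_status` prints Erlang terms in the format quoted in this section.

```shell
# Read `rabbitmqctl cluster_status` output on stdin and succeed only if it
# contains the literal term {partitions,[]}, i.e. no partition is reported.
no_partitions() {
    grep -q -F '{partitions,[]}'
}

# Example, on any cluster node:
#   if rabbitmqctl cluster_status | no_partitions; then
#       echo "no partition reported"
#   else
#       echo "partition detected, see the restart procedure in this section"
#   fi
```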

=== Federation Status

Federation is the process of copying messages from the internal
`/pubsub` vhost to the external `/public_pubsub` vhost. During network
partitions, the federation relaying process has been observed not to
come back up. The federation status can be checked with the command
`rabbitmqctl eval 'rabbit_federation_status:status().'` on `rabbitmq01`.
It should not return the empty array (`[]`) but something like:

....
[[{exchange,<<"amq.topic">>},
  {upstream_exchange,<<"amq.topic">>},
  {type,exchange},
  {vhost,<<"/public_pubsub">>},
  {upstream,<<"pubsub-to-public_pubsub">>},
  {id,<<"b40208be0a999cc93a78eb9e41531618f96d4cb2">>},
  {status,running},
  {local_connection,<<"<rabbit@rabbitmq01.phx2.fedoraproject.org.2.8709.481>">>},
  {uri,<<"amqps://rabbitmq01.phx2.fedoraproject.org/%2Fpubsub">>},
  {timestamp,{{2020,3,11},{16,45,18}}}],
 [{exchange,<<"zmq.topic">>},
  {upstream_exchange,<<"zmq.topic">>},
  {type,exchange},
  {vhost,<<"/public_pubsub">>},
  {upstream,<<"pubsub-to-public_pubsub">>},
  {id,<<"c1e7747425938349520c60dda5671b2758e210b8">>},
  {status,running},
  {local_connection,<<"<rabbit@rabbitmq01.phx2.fedoraproject.org.2.8718.481>">>},
  {uri,<<"amqps://rabbitmq01.phx2.fedoraproject.org/%2Fpubsub">>},
  {timestamp,{{2020,3,11},{16,45,17}}}]]
....

If the empty array is returned, the following commands will restart the
federation (again on `rabbitmq01`):

....
rabbitmqctl clear_policy -p /public_pubsub pubsub-to-public_pubsub
rabbitmqctl set_policy -p /public_pubsub --apply-to exchanges pubsub-to-public_pubsub "^(amq|zmq)\.topic$" '{"federation-upstream":"pubsub-to-public_pubsub"}'
....

After that, the federation link status can be checked with the same
command as before.
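
The check and the restart can be combined in a small wrapper. This is a sketch under assumptions: the function names are hypothetical, and the test for two running links is inferred from the sample output in this section (one link each for `amq.topic` and `zmq.topic`), not from any documented guarantee.

```shell
# Read the output of
#   rabbitmqctl eval 'rabbit_federation_status:status().'
# on stdin and succeed if at least two links report {status,running}.
federation_running() {
    [ "$(grep -c -F '{status,running}')" -ge 2 ]
}

# Re-create the federation policy with the two commands from this SOP.
restart_federation() {
    rabbitmqctl clear_policy -p /public_pubsub pubsub-to-public_pubsub
    rabbitmqctl set_policy -p /public_pubsub --apply-to exchanges \
        pubsub-to-public_pubsub '^(amq|zmq)\.topic$' \
        '{"federation-upstream":"pubsub-to-public_pubsub"}'
}

# Example, on rabbitmq01:
#   rabbitmqctl eval 'rabbit_federation_status:status().' \
#       | federation_running || restart_federation
```

The pattern is single-quoted here so the shell passes `\.` and `$` through unchanged; it is the same regular expression as in the commands above.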

https://www.rabbitmq.com/

https://www.rabbitmq.com/clustering.html

https://www.rabbitmq.com/federation.html