infra-docs-fpo/modules/sysadmin_guide/pages/rabbitmq.adoc
Nils Philippsen b4afb2f945 DC move: iad => rdu3, 10.3. => 10.16.
And remove some obsolete things.

Signed-off-by: Nils Philippsen <nils@redhat.com>
2025-07-04 16:32:42 +02:00


= RabbitMQ SOP

https://www.rabbitmq.com/[RabbitMQ] is the message broker Fedora uses to allow applications
to send each other (or themselves) messages.

== Contact Information

=== Owner

Fedora Infrastructure Team

=== Contact

#fedora-admin

=== Servers

* rabbitmq0[1-3].rdu3.fedoraproject.org
* rabbitmq0[1-3].stg.rdu3.fedoraproject.org

=== Purpose

General purpose publish-subscribe message broker as well as
application-specific messaging.

== Description
RabbitMQ is a message broker written in Erlang that offers a number of
interfaces including AMQP 0.9.1, AMQP 1.0, STOMP, and MQTT. At this time
only AMQP 0.9.1 is made available to clients.
Fedora uses the RabbitMQ packages provided by the Red Hat OpenStack
repository because it ships a more up-to-date version of RabbitMQ.
=== The Cluster
RabbitMQ supports https://www.rabbitmq.com/clustering.html[clustering]
a set of hosts into a single logical message broker. The Fedora cluster
is composed of three nodes, rabbitmq01 through rabbitmq03, in both
staging and production. `groups/rabbitmq.yml` is the playbook that
deploys the cluster.
=== Virtual Hosts
The cluster contains a number of virtual hosts. Each virtual host has
its own set of resources (exchanges, bindings, and queues), and
permissions are granted to users per virtual host.
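
Because permissions are scoped to a virtual host, it can help to list them one virtual host at a time. The following is a minimal sketch, not part of the deployed tooling: the function name `show_vhost_permissions` is hypothetical, and it assumes it is run on a cluster node by a user allowed to call `rabbitmqctl`. It relies only on the standard `rabbitmqctl list_permissions -p <vhost>` subcommand.

```shell
# Hypothetical helper: print the permissions granted in each virtual host
# passed as an argument. Assumes rabbitmqctl is usable by the current user.
show_vhost_permissions() {
    for vhost in "$@"; do
        echo "== ${vhost} =="
        # One line per user: user name, then configure, write and read patterns.
        rabbitmqctl list_permissions -p "${vhost}"
    done
}
```

For the virtual hosts described in this document, it could be called as `show_vhost_permissions /pubsub /public_pubsub`.
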
==== /pubsub
The /pubsub virtual host is the generic publish-subscribe virtual host
used by most applications. Messages published via AMQP are sent to the
"amq.topic" exchange.
==== /public_pubsub
This virtual host has the "amq.topic" and "zmq.topic" exchanges from
/pubsub https://www.rabbitmq.com/federation.html[federated] to it,
and anyone on the Internet is allowed to connect to it. For the moment
it is on the same broker cluster, but if it is abused it can be moved
to a separate cluster.
=== Authentication
Clients authenticate to the broker using X.509 certificates. The common
name of the certificate must match the username of a user in
RabbitMQ.
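
When a client fails to authenticate, one thing worth checking is whether the certificate's common name really corresponds to a RabbitMQ user. The sketch below illustrates that comparison under a few assumptions: the function names are hypothetical, the certificate is a PEM file readable by `openssl`, and the check runs on a cluster node where `rabbitmqctl list_users` works. It does not verify the certificate chain or the user's permissions.

```shell
# Extract the common name from a PEM certificate (path given as $1).
# Relies on `openssl x509 -nameopt multiline`, which prints one subject
# attribute per line, e.g. "    commonName                = someuser".
cert_common_name() {
    openssl x509 -in "$1" -noout -subject -nameopt multiline |
        sed -n 's/^ *commonName *= *//p' | head -n 1
}

# Succeed if a RabbitMQ user with the given name exists.
# `rabbitmqctl list_users` prints the user name in the first column.
rabbitmq_user_exists() {
    rabbitmqctl list_users |
        awk -v user="$1" '$1 == user { found = 1 } END { exit !found }'
}

# Combine both checks: status 0 means the common name maps to a user.
check_cert_user() {
    cn="$(cert_common_name "$1")"
    if [ -z "$cn" ]; then
        echo "no commonName found in $1" >&2
        return 2
    fi
    if rabbitmq_user_exists "$cn"; then
        echo "user '$cn' exists"
    else
        echo "no RabbitMQ user named '$cn'" >&2
        return 1
    fi
}
```

A call such as `check_cert_user /path/to/client.crt` (the path is a placeholder) prints the matching user name or an error on standard error.
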
== Troubleshooting
RabbitMQ offers a CLI, rabbitmqctl, which you can use on any node in the
cluster. It also offers a web interface for management and monitoring,
but that is not currently configured.
=== Network Partition
In case of a network partition, the RabbitMQ cluster should handle it
and recover on its own. If it has not recovered once the network
situation is fixed, the partition can be diagnosed with
`rabbitmqctl cluster_status`. The output should include the line
`{partitions,[]},` (empty array).
If the array is not empty, the first nodes in the array can be
restarted one by one. Give each node plenty of time to sync messages
after its restart; progress can be watched in the
`/var/log/rabbitmq/rabbit.log` file.
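
The partition check described above can be scripted, for example for use in a monitoring probe. This is a minimal sketch with a hypothetical function name, `check_partitions`. It assumes the Erlang-term output format quoted above, in which a healthy cluster prints `{partitions,[]}`; other RabbitMQ versions may format `cluster_status` differently, in which case the function reports a warning rather than guessing.

```shell
# Read `rabbitmqctl cluster_status` output on stdin and report partitions.
# Assumes the Erlang-term format that contains "{partitions,[]}" when healthy.
check_partitions() {
    status="$(cat)"
    if printf '%s\n' "$status" | grep -qF '{partitions,[]}'; then
        echo "OK: no partitions reported"
        return 0
    fi
    echo "WARNING: partitions reported or status format not recognised:" >&2
    # Show whatever the status says about partitions, if anything.
    printf '%s\n' "$status" | grep 'partitions' >&2 || true
    return 1
}
```

On a cluster node it would be used as `rabbitmqctl cluster_status | check_partitions`, with a non-zero exit status meaning the output needs a closer look.
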
=== Federation Status
Federation is the process of copying messages from the internal
`/pubsub` vhost to the external `/public_pubsub` vhost. During network
partitions, the federation relaying process has been observed not to
come back up. The federation status can be checked with the command
`rabbitmqctl eval 'rabbit_federation_status:status().'` on `rabbitmq01`.
It should not return the empty array (`[]`) but something like:
....
[[{exchange,<<"amq.topic">>},
{upstream_exchange,<<"amq.topic">>},
{type,exchange},
{vhost,<<"/public_pubsub">>},
{upstream,<<"pubsub-to-public_pubsub">>},
{id,<<"b40208be0a999cc93a78eb9e41531618f96d4cb2">>},
{status,running},
{local_connection,<<"<rabbit@rabbitmq01.rdu3.fedoraproject.org.2.8709.481>">>},
{uri,<<"amqps://rabbitmq01.rdu3.fedoraproject.org/%2Fpubsub">>},
{timestamp,{{2020,3,11},{16,45,18}}}],
[{exchange,<<"zmq.topic">>},
{upstream_exchange,<<"zmq.topic">>},
{type,exchange},
{vhost,<<"/public_pubsub">>},
{upstream,<<"pubsub-to-public_pubsub">>},
{id,<<"c1e7747425938349520c60dda5671b2758e210b8">>},
{status,running},
{local_connection,<<"<rabbit@rabbitmq01.rdu3.fedoraproject.org.2.8718.481>">>},
{uri,<<"amqps://rabbitmq01.rdu3.fedoraproject.org/%2Fpubsub">>},
{timestamp,{{2020,3,11},{16,45,17}}}]]
....
If the empty array is returned, the following commands restart the
federation (again on `rabbitmq01`):
....
rabbitmqctl clear_policy -p /public_pubsub pubsub-to-public_pubsub
rabbitmqctl set_policy -p /public_pubsub --apply-to exchanges pubsub-to-public_pubsub "^(amq|zmq)\.topic$" '{"federation-upstream":"pubsub-to-public_pubsub"}'
....
Afterwards, check the federation link status again with the same
command as before.
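
The check-then-restart procedure above can be combined into one script. The following is only a sketch: the function names are hypothetical, it is meant to be run on `rabbitmq01` like the manual commands, and it re-applies the policy only when the status command succeeds and returns the empty array. If the status command itself fails, it does nothing, on the assumption that the broker needs attention first.

```shell
# Classify the federation status as "up", "empty" or "error".
federation_state() {
    if ! status="$(rabbitmqctl eval 'rabbit_federation_status:status().')"; then
        echo "error"
        return
    fi
    # Strip whitespace so that "[]" is recognised regardless of formatting.
    case "$(printf '%s' "$status" | tr -d '[:space:]')" in
        "[]") echo "empty" ;;
        "")   echo "error" ;;
        *)    echo "up" ;;
    esac
}

# Re-create the federation policy with the same commands as in this SOP.
restart_federation() {
    rabbitmqctl clear_policy -p /public_pubsub pubsub-to-public_pubsub
    rabbitmqctl set_policy -p /public_pubsub --apply-to exchanges \
        pubsub-to-public_pubsub '^(amq|zmq)\.topic$' \
        '{"federation-upstream":"pubsub-to-public_pubsub"}'
}

# Restart federation only when no links are reported.
ensure_federation() {
    case "$(federation_state)" in
        up)    echo "federation links present" ;;
        empty) echo "no federation links, re-applying policy"
               restart_federation ;;
        *)     echo "could not read federation status" >&2
               return 1 ;;
    esac
}
```

Running `ensure_federation` and then repeating the status command shows whether the links came back with `{status,running}`.
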