infra-docs-fpo/modules/sysadmin_guide/pages/fedmsg-gateway.adoc
Michal Konečný c43693340d Review fedmsg-gateway SOP
Signed-off-by: Michal Konečný <mkonecny@redhat.com>
2021-08-19 11:31:04 +02:00

= fedmsg-gateway SOP
Outgoing raw ZeroMQ message stream.
[NOTE]
====
See also: <<fedmsg-websocket.adoc#>>
====
== Contact Information
Owner:::
Messaging SIG, Fedora Infrastructure Team
Contact:::
#fedora-apps, #fedora-admin, #fedora-noc
Servers:::
busgateway01, proxy0*
Purpose:::
Expose raw ZeroMQ messages outside the FI environment.
== Description
Users outside Fedora Infrastructure can listen to the production
message bus by connecting to specific public addresses. External users
need this to run their own local hubs and message processors ("Consumers").
The specific public endpoints are:
production::
tcp://hub.fedoraproject.org:9940
staging::
tcp://stg.fedoraproject.org:9940
_fedmsg-gateway_, the daemon running on _busgateway01_, listens to the
FI production fedmsg bus and relays every message it receives to a
dedicated ZMQ PUB endpoint bound to port 9940. haproxy mediates
connections to the _fedmsg-gateway_ daemon.
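As a rough illustration of what an outside consumer does with these endpoints, here is a minimal Python sketch. It assumes the third-party `pyzmq` package is installed and that each message arrives as two frames, a topic followed by a JSON body; the function names `decode_frames` and `listen` are made up for this example and are not part of fedmsg.

```python
import json


def decode_frames(frames):
    """Split a two-frame ZeroMQ message into (topic, body).

    Assumption: frame 0 is the UTF-8 topic and frame 1 is a UTF-8 JSON document.
    """
    if len(frames) != 2:
        raise ValueError("expected 2 frames, got %d" % len(frames))
    topic = frames[0].decode("utf-8")
    body = json.loads(frames[1].decode("utf-8"))
    return topic, body


def listen(endpoint="tcp://hub.fedoraproject.org:9940", limit=5):
    """Print the topics of the first `limit` messages seen on `endpoint`."""
    import zmq  # pyzmq; imported here so decode_frames works without it

    context = zmq.Context()
    socket = context.socket(zmq.SUB)
    socket.setsockopt(zmq.SUBSCRIBE, b"")  # empty prefix: receive every topic
    socket.connect(endpoint)
    try:
        for _ in range(limit):
            topic, _body = decode_frames(socket.recv_multipart())
            print(topic)
    finally:
        socket.close()
        context.term()
```

Calling `listen()` blocks until `limit` messages have arrived. Passing the staging address as `endpoint` should exercise the staging gateway instead.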
== Connection Flow
Clients that connect through haproxy on `proxy0*:9940` are redirected to
`busgateway0*:9940`. The relevant configuration is the `haproxy.cfg` entry
for `listen fedmsg-raw-zmq 0.0.0.0:9940`.
This is different from the apache reverse proxy setup we have for
the _app0*_ and _packages0*_ machines. _That_ flow looks something like
this:
....
Client -> apache(proxy01) -> haproxy(proxy01) -> apache(app01)
....
The flow for the raw zmq stream provided by _fedmsg-gateway_ looks
something like this:
....
Client -> haproxy(proxy01) -> fedmsg-gateway(busgateway01)
....
_haproxy_ is listening on a public port.
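When debugging the flow above, a plain TCP connect against port 9940 is often enough to tell whether a given hop is reachable. The helper below is a small sketch using only the Python standard library; `port_is_open` is a hypothetical name. A successful connect to a proxy only shows that _haproxy_ accepted the connection, not that _fedmsg-gateway_ behind it is healthy.

```python
import socket


def port_is_open(host, port=9940, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, `port_is_open("hub.fedoraproject.org")` probes the public production endpoint.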
At the time of this writing, _haproxy_ does not actually load balance
ZeroMQ session requests across multiple _busgateway0*_ machines, but
nothing stops us from adding more. New hosts can be added in
ansible and provisioned from _busgateway01_'s template. Add them to the
`fedmsg-raw-zmq` listen section in _haproxy_'s config and it should Just Work.
== Increasing the Maximum Number of Concurrent Connections
HTTP requests are typically very short (a few seconds at most). This
means that the number of concurrent TCP connections we require for most
of our services is quite low (1024 is overkill). ZeroMQ TCP connections,
on the other hand, are expected to live for quite a long time.
Consequently, we needed to raise the number of possible concurrent TCP
connections.
All of this is in ansible and should be handled for us automatically if
we bring up new nodes.
* The pam_limits user limit for the fedmsg user was increased from 1024
to 160000 on _busgateway01_.
* The pam_limits user limit for the haproxy user was increased from 1024
to 160000 on the _proxy0*_ machines.
* The zeromq High Water Mark (HWM) was increased to 160000 on
_busgateway01_.
* The maximum number of connections allowed was increased in
`haproxy.cfg`.
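To confirm that a raised limit is actually in effect for a process, something like the following sketch can be run as the relevant user. It assumes the pam_limits setting in question is the open-file (`nofile`) limit, which this page does not state explicitly; `nofile_limit_ok` is a hypothetical helper name.

```python
import resource


def nofile_limit_ok(required=160000, limits=None):
    """Return True if the soft open-file limit is at least `required`.

    `limits` is an optional (soft, hard) pair; by default the limits of
    the current process are read.
    """
    if limits is None:
        limits = resource.getrlimit(resource.RLIMIT_NOFILE)
    soft, _hard = limits
    # RLIM_INFINITY means "no limit", which always satisfies the requirement.
    return soft == resource.RLIM_INFINITY or soft >= required
```

Each open TCP connection consumes a file descriptor, which is why this limit bounds the number of concurrent clients.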
== Nagios
Nagios checks were added for this service. They alert when the number of
concurrent connections through haproxy approaches the maximum number
allowed.
You can check these numbers by hand by inspecting the _haproxy_ web
interface: https://admin.fedoraproject.org/haproxy/proxy1#fedmsg-raw-zmq
Look at the "Sessions" section. "Cur" is the current number of sessions,
"Max" is the highest number seen at the same time, and "Limit" is the
maximum number of concurrent connections allowed.
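The same comparison can be scripted. haproxy can export its statistics as CSV, where the columns `scur`, `smax` and `slim` correspond to "Cur", "Max" and "Limit". The sketch below parses such an export and reports how close the current session count is to the limit. It is not the actual nagios check; `session_usage` is a hypothetical name, and whether the CSV export is reachable on our stats page is an assumption.

```python
import csv
import io


def session_usage(csv_text, proxy="fedmsg-raw-zmq", server="FRONTEND"):
    """Return cur/max/limit and the cur/limit ratio for one stats row."""
    # haproxy prefixes the CSV header line with "# "; strip it for DictReader.
    reader = csv.DictReader(io.StringIO(csv_text.lstrip("# ")))
    for row in reader:
        if row["pxname"] == proxy and row["svname"] == server:
            cur = int(row["scur"])
            peak = int(row["smax"])
            limit = int(row["slim"])
            ratio = cur / limit if limit else None
            return {"cur": cur, "max": peak, "limit": limit, "ratio": ratio}
    raise KeyError("no stats row for %s/%s" % (proxy, server))
```

A check built on this could warn once `ratio` crosses a chosen threshold, for example 0.8.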
== RHIT
We had RHIT open port 9940 to _proxy01.iad2_ specifically for this.