Infrastructure/fedora-infrastructure

Add monitoring for the queue used by resultsdb #8392

Closed

opened 2019-11-18 17:06:50 +00:00 by pingou · 10 comments

pingou commented

2019-11-18 17:06:50 +00:00

resultsdb01.qa has a fedora-messaging listener that has its own queue.

Monitoring how that queue is doing in nagios would alert us when the consumer misbehaves (such as the incident we had this weekend, caused by a message without a body).
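For illustration, a queue check of this kind could look roughly like the sketch below. It is not the check that was eventually deployed; it assumes the RabbitMQ management plugin is enabled and reachable, and the thresholds and the helper names `fetch_queue_stats` and `evaluate_queue` are hypothetical. The exit codes follow the usual nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import base64
import json
import urllib.parse
import urllib.request

# Conventional nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def fetch_queue_stats(base_url, vhost, queue, user, password):
    """Fetch one queue's stats from the RabbitMQ management HTTP API.

    base_url, vhost, queue and the credentials are placeholders that the
    caller has to supply; nothing here is specific to resultsdb.
    """
    url = "{}/api/queues/{}/{}".format(
        base_url.rstrip("/"),
        urllib.parse.quote(vhost, safe=""),  # "/" must be sent as %2F
        urllib.parse.quote(queue, safe=""),
    )
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": "Basic " + token})
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def evaluate_queue(stats, warn=100, crit=1000):
    """Turn queue stats into a (nagios_exit_code, message) pair.

    The warn/crit thresholds are illustrative assumptions, not tuned values.
    """
    messages = stats.get("messages")
    consumers = stats.get("consumers")
    if messages is None or consumers is None:
        return UNKNOWN, "UNKNOWN: queue stats lack 'messages' or 'consumers'"
    if consumers == 0:
        return CRITICAL, f"CRITICAL: no consumer attached ({messages} messages queued)"
    if messages >= crit:
        return CRITICAL, f"CRITICAL: {messages} messages queued"
    if messages >= warn:
        return WARNING, f"WARNING: {messages} messages queued"
    return OK, f"OK: {messages} messages queued, {consumers} consumer(s)"
```

A wrapper script would call `fetch_queue_stats(...)`, pass the result to `evaluate_queue`, print the message, and exit with the returned code. Treating a queue with zero consumers as critical is a design choice here: a stuck or crashed listener shows up even before the backlog grows.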

kevin commented

2019-11-19 18:18:29 +00:00

Is this in its own rabbitmq instance, or in the main one?

@abompard has done some rabbit monitoring work...

kevin commented

2019-11-19 18:18:29 +00:00

Metadata Update from @kevin:

Issue priority set to: Waiting on Assignee (was: Needs Review)

pingou commented

2019-11-20 08:11:54 +00:00

> Is this in its own rabbitmq instance, or in the main one?

It is in the main one

mizdebsk commented

2019-11-26 20:46:28 +00:00

Metadata Update from @mizdebsk:

Issue tagged with: easyfix, monitoring

ekulik commented

2020-01-13 14:25:53 +00:00

I imagine https://infrastructure.fedoraproject.org/cgit/ansible.git/commit/?id=ee4a746691a0cb6edd81505a275ce82a33a40542 fixes this (sans any threshold adjustments)?

kevin commented

2020-01-15 16:37:53 +00:00

That would alert if it got too large, but wouldn't alert if there were no messages coming in from it.

We need something like the checks we have that use datanommer to query for the last message from that service.

https://apps.fedoraproject.org/datagrepper/raw?category=resultsdb

There are some nagios checks for other services like that.
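A "last message" check along those lines might be sketched as follows. This is not the existing datanommer-based nagios check nor the check from the commits linked in this thread; it assumes the datagrepper `raw` endpoint accepts `category`, `rows_per_page` and `order` parameters and returns JSON with a `raw_messages` list whose entries carry an epoch `timestamp`. The age thresholds and the names `fetch_latest` and `evaluate_freshness` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Conventional nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

DATAGREPPER_RAW = "https://apps.fedoraproject.org/datagrepper/raw"


def fetch_latest(category, base_url=DATAGREPPER_RAW):
    """Ask datagrepper for the single most recent message in a category."""
    query = urllib.parse.urlencode(
        {"category": category, "rows_per_page": 1, "order": "desc"}
    )
    with urllib.request.urlopen(f"{base_url}?{query}", timeout=30) as response:
        return json.load(response)


def evaluate_freshness(payload, now, warn_age=4 * 3600, crit_age=12 * 3600):
    """Map the age of the newest message to a (nagios_exit_code, message) pair.

    warn_age and crit_age are in seconds and are illustrative assumptions.
    """
    messages = payload.get("raw_messages") or []
    if not messages:
        return CRITICAL, "CRITICAL: no messages found for this category"
    try:
        age = now - float(messages[0]["timestamp"])
    except (KeyError, TypeError, ValueError):
        return UNKNOWN, "UNKNOWN: newest message has no usable timestamp"
    if age >= crit_age:
        return CRITICAL, f"CRITICAL: last message was {age:.0f}s ago"
    if age >= warn_age:
        return WARNING, f"WARNING: last message was {age:.0f}s ago"
    return OK, f"OK: last message was {age:.0f}s ago"
```

A wrapper would call `evaluate_freshness(fetch_latest("resultsdb"), time.time())`, print the message, and exit with the returned code. Passing `now` in explicitly keeps the decision logic testable without touching the network.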

ekulik commented

2020-01-16 08:09:38 +00:00

Sorry if I sound like a broken record, but what does https://infrastructure.fedoraproject.org/cgit/ansible.git/commit/?id=93d0eeaf54ff57fea9c8942f25618c826b454c54 achieve, then?

kevin commented

2020-01-16 16:09:26 +00:00

oh man, you are in fact right. I somehow was looking at another commit or something. ;(

My bad... you are right, this does what we want. :)

Sorry about that.

kevin commented

2020-01-16 16:09:27 +00:00

Metadata Update from @kevin: