diff --git a/docs/datanommer_datagrepper/pg_timescaledb.rst b/docs/datanommer_datagrepper/pg_timescaledb.rst
index cf377ff..030a505 100644
--- a/docs/datanommer_datagrepper/pg_timescaledb.rst
+++ b/docs/datanommer_datagrepper/pg_timescaledb.rst
@@ -66,6 +66,7 @@ timescaledb uses table partitioning as well. This leads to the same issue with
 the foreign key constraints that we have seen in the plain partitioning approach
 we took.
 
+
 Foreign key considerations
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -93,6 +94,22 @@ database is mostly about inserts and has no updates or deletes, we don't foresee
 much problems with this.
 
 
+Duplicated messages
+~~~~~~~~~~~~~~~~~~~
+
+When testing datagrepper and datanommer in our test instance with the timescaledb
+plugin, we saw a number of duplicated messages showing up in the `/raw` endpoint.
+Checking if we could fix this server side, we found out that the previous database
+schema had a `UNIQUE` constraint on the `msg_id` field. However, with the timescaledb
+plugin, that constraint is now on both the `msg_id` and `timestamp` fields, meaning
+a message can be inserted twice in the database if there is a small delay between
+the two inserts.
+
+However, migrating datanommer from fedmsg to fedora-messaging should resolve that
+issue client side, as rabbitmq will ensure there is only one consumer at a time
+handling a message.
+
+
 Open questions
 --------------
 