fedora-image-uploader: deploy as multiple containers #2592
Reference: Infrastructure/ansible#2592
In the beginning, this just handled Azure images. Now it handles Azure,
AWS, GCP, and containers. Currently, it processes images serially, which
is mostly okay. However, it means that whichever service is handled
last has to wait for all the others to succeed before it starts, and if
the handler for any one platform fails, all the images are retried. For
most platforms the retry is a no-op (or a few inexpensive calls), but it
does have to re-download the image from Koji to checksum it.
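To make the retry cost concrete, here is a toy model of the two approaches. It is only a sketch: the handler names, the `ConnectionError` failure, and the retry loop are illustrative assumptions and not the actual fedora-image-uploader code or fedora-messaging's redelivery mechanism.

```python
from collections import Counter

calls = Counter()  # how many times each (hypothetical) handler was invoked


def make_handler(name, failures=0):
    """Return a fake upload handler that fails `failures` times, then succeeds."""
    remaining = {"n": failures}

    def handler(image):
        calls[name] += 1
        if remaining["n"] > 0:
            remaining["n"] -= 1
            raise ConnectionError(f"{name} unavailable")

    return handler


def run_until_success(process, image, max_attempts=5):
    """Re-run `process` until it stops raising; stands in for message redelivery."""
    for attempt in range(1, max_attempts + 1):
        try:
            process(image)
            return attempt
        except ConnectionError:
            continue
    raise RuntimeError("gave up")


def serial_calls():
    """One consumer runs every handler in order; a failure retries them all."""
    calls.clear()
    handlers = [make_handler("azure"), make_handler("aws"), make_handler("gcp", failures=1)]

    def process(image):
        for handler in handlers:
            handler(image)

    run_until_success(process, "image.qcow2")
    return dict(calls)


def per_queue_calls():
    """Each handler has its own consumer; only the failing one is retried."""
    calls.clear()
    for handler in [make_handler("azure"), make_handler("aws"), make_handler("gcp", failures=1)]:
        run_until_success(handler, "image.qcow2")
    return dict(calls)
```

In this model a single GCP failure makes the serial consumer call the Azure and AWS handlers twice, while the per-queue layout repeats only the GCP call.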
This adds an AMQP message queue for each content type we handle, and
produces a fedora-messaging config for each content type. The deployment
is now made up of 4 containers: azure-image-uploader,
aws-image-uploader, container-image-uploader, and
google-cloud-image-uploader. They differ only in the secrets injected
into them and the fedora-messaging config file they use. The end result
is that images should be available faster and the deployment is more
resilient to remote services being down.
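As a rough illustration of "one queue and one config per content type", the sketch below builds a fedora-messaging-style config dictionary for each container. The suffix strings, the queue naming scheme, the callback path, and the routing key are all assumptions made for the example; the real values live in the Ansible templates in this PR.

```python
# Hypothetical suffixes, one per container named in the description.
CONTENT_TYPES = ["azure", "aws", "container", "google-cloud"]


def messaging_config(suffix, base_queue="fedora-image-uploader"):
    """Build a config dict for one content type (all values are illustrative)."""
    queue = f"{base_queue}_{suffix}"  # assumed naming scheme
    return {
        # Hypothetical callback path; the real entry point may differ.
        "callback": f"example_uploader.handlers:{suffix.replace('-', '_')}",
        "bindings": [
            {
                "queue": queue,
                "exchange": "amq.topic",
                # Assumed topic for compose status changes.
                "routing_keys": ["org.fedoraproject.*.pungi.compose.status.change"],
            }
        ],
        "queues": {
            queue: {
                "durable": True,
                "auto_delete": False,
                "exclusive": False,
                "arguments": {},
            }
        },
    }


# One independent config (and therefore one queue) per container.
configs = {suffix: messaging_config(suffix) for suffix in CONTENT_TYPES}
```

Because every container binds its own queue to the same topic, each one receives its own copy of every compose message and can succeed or be retried without affecting the others.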
Finally, it's worth noting that this bumps the warning threshold for
queue sizes. It can take some services (Azure and AWS) upwards of 30
minutes to replicate the images around the world, and since we subscribe
to any compose status changes, it's not unreasonable for 5-10 messages
to stack up when we hit a compose change that is "FINISHED" with images.
Also note: the templating in here feels a little unpleasant. I wasn't sure how to loop on a role for queue suffixes, and the suffixes passed to the role are disconnected from the queue suffix definitions in the config template. If there's a better way to handle it, I'm all ears.
Build failed. More information on how to proceed and troubleshoot errors available at https://fedoraproject.org/wiki/Zuul-based-ci
https://fedora.softwarefactory-project.io/zuul/buildset/badd086f2f464e14bc9fd4e92dedcb3f
+1 for this as it makes the deployment much cleaner.
Seems reasonable to me.
Would you like me to merge/deploy this? Or would you like to?
I have thus far avoided needing privileges to run Ansible stuff (although perhaps that is becoming increasingly unreasonable), so please do merge + deploy. I'm around to fix anything I've messed up.
rebased onto
240aa7b8e0
ok. No problem.
Pull-Request has been merged by kevin
Build failed. More information on how to proceed and troubleshoot errors available at https://fedoraproject.org/wiki/Zuul-based-ci
https://fedora.softwarefactory-project.io/zuul/buildset/5dd31710249049afb1321dc1d64887ce