Turns out the lookaside cache is not as clean as I expected.
Specifically, it contains some files where we'd expect directories, for
example:
/srv/cache/lookaside/pkgs/GFS-kernel/@13013.1e77f453ba1c86cd7616087d0643bbd8e
/srv/cache/lookaside/pkgs/openswan/tmpLRV5Gn5556cb2fcea6ba862ce14e1debf98b6d
This commit makes the script print an error instead of crashing on an
OSError in such a case.
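A minimal sketch of that error handling, assuming the script walks the
cache with os.listdir. The list_entries name is hypothetical and not
taken from the actual script:

```python
import os
import sys


def list_entries(path):
    """Return the sorted entries of `path`, or [] if it cannot be listed."""
    try:
        return sorted(os.listdir(path))
    except OSError as e:
        # Raised, for instance, when `path` is a stray file sitting where
        # a directory is expected.
        print("Error: could not list %s: %s" % (path, e), file=sys.stderr)
        return []
```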
We are migrating from the following path scheme:
/%(srpmname)s/%(filename)s/%(hash)s/%(filename)s
To:
/%(srpmname)s/%(filename)s/%(hashtype)s/%(hash)s/%(filename)s
As a result, we need to hardlink all the files existing under the old
path to their new path.
This script does just that.
Given that it should only ever be run once anyway, it is added as a
file to the distgit role, but not set to be installed anywhere.
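For illustration, the per-file step could look roughly like the sketch
below. The function names are hypothetical, and it assumes every file
under the old scheme is keyed by its md5 hash:

```python
import os


def old_path(root, srpmname, filename, hashvalue):
    # /%(srpmname)s/%(filename)s/%(hash)s/%(filename)s
    return os.path.join(root, srpmname, filename, hashvalue, filename)


def new_path(root, srpmname, filename, hashtype, hashvalue):
    # /%(srpmname)s/%(filename)s/%(hashtype)s/%(hash)s/%(filename)s
    return os.path.join(root, srpmname, filename, hashtype, hashvalue,
                        filename)


def migrate_file(root, srpmname, filename, hashvalue, hashtype="md5"):
    """Hardlink a file from its old path to its new path and return the latter."""
    src = old_path(root, srpmname, filename, hashvalue)
    dst = new_path(root, srpmname, filename, hashtype, hashvalue)
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    try:
        os.link(src, dst)
    except FileExistsError:
        # Already migrated, e.g. on a second run of the script.
        pass
    return dst
```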
This avoids some race conditions, as testing for a directory's existence
before creating it is racy.
The best way is to try creating it no matter what, and ignore errors
when the directory already exists.
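The pattern described above can be sketched as follows. The ensure_dir
name is hypothetical; on Python 3, os.makedirs(path, exist_ok=True)
gives roughly the same behaviour:

```python
import errno
import os


def ensure_dir(path):
    """Create `path` and its parents, tolerating an already-existing directory."""
    try:
        os.makedirs(path)
    except OSError as e:
        # Only swallow "already exists", and only if it really is a
        # directory; anything else (permissions, a file in the way) is
        # still an error.
        if e.errno != errno.EEXIST or not os.path.isdir(path):
            raise
```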
The script currently checks for the file at the new location only.
As a result, it will report that the file is missing if it had only been
uploaded to the old location, which will prompt the client to reupload.
With this change, the script will check at the new location, and if it
doesn't find the file it will try checking for it at the old location as
well.
If the file is found at the old location, we hardlink it to the new
location, and report the file is available.
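A rough sketch of that lookup, with a hypothetical file_available
helper that takes both paths already computed by the caller:

```python
import os


def file_available(old, new):
    """Return True if the file exists at `new`, linking it from `old` if needed."""
    if os.path.exists(new):
        return True
    if not os.path.exists(old):
        # Missing at both locations: the client has to upload it.
        return False
    os.makedirs(os.path.dirname(new), exist_ok=True)
    try:
        os.link(old, new)
    except FileExistsError:
        # Another request created the link in the meantime.
        pass
    return True
```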
There is a send_error method, which sends the error message back to the
client (pyrpkg in our case).
Unfortunately, that method doesn't send back an error HTTP status code,
so I'm assuming the response is interpreted as a "200 OK" status.
pyrpkg completely ignores the text sent back by the server unless the
status code is not 200, which means all those errors are silently
ignored.
This commit makes sure the send_error method will always return an error
status ("500 Internal Server Error" by default), and moves all the error
handling code to use that method, specifying their own status code if
needed.
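As an illustration only, a CGI script can set the response status
through a "Status:" header. The signature below (in particular the
`out` parameter, added here to make the function testable) is an
assumption and not necessarily that of the real method:

```python
import sys


def send_error(message, status="500 Internal Server Error", out=None):
    """Write a CGI error response carrying an explicit HTTP status."""
    out = sys.stdout if out is None else out
    # The web server turns the CGI "Status:" header into the HTTP status line.
    out.write("Status: %s\r\n" % status)
    out.write("Content-Type: text/plain\r\n")
    out.write("\r\n")
    out.write(message + "\n")
```

A caller wanting a different code would pass it explicitly, for
instance send_error("No such package", status="404 Not Found").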
Without this, the file could exist at both the old and new paths,
taking up disk space twice.
This forces a hardlink if the file already existed at the old path.
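One way to force such a link is sketched below. The force_hardlink
helper and its temporary-name scheme are assumptions made for the
example, not the code from this commit:

```python
import os


def force_hardlink(src, dst):
    """Make `dst` a hardlink to `src`, replacing any separate copy at `dst`."""
    if os.path.exists(dst) and os.path.samefile(src, dst):
        # Already the same inode, nothing to do.
        return
    # Link under a temporary name first, then rename over `dst`, so that
    # `dst` never goes missing in between.
    tmp = "%s.%d.tmp" % (dst, os.getpid())
    os.link(src, tmp)
    os.replace(tmp, dst)
```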
Currently, the CGI script is set to upload files:
- to the old path if the upload uses md5
- to the new path if the upload uses sha512
The old path is as follows:
/%(srpmname)s/%(filename)s/%(hash)s/%(filename)s
The new path is:
/%(srpmname)s/%(filename)s/%(hashtype)s/%(hash)s/%(filename)s
This was meant to ensure compatibility with current fedpkg which
always downloads from the old path, but will eventually download from
the new path when we move to sha512.
However, working more on this, I now think it would make for a smoother
transition if we instead always stored the files at the new path, but
just hardlinked to the old path if the upload is using md5.
This is what this patch achieves.
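In sketch form, with a hypothetical store_upload function that receives
the uploaded bytes directly (the real CGI script reads them from the
request):

```python
import os


def store_upload(root, srpmname, filename, hashtype, hashvalue, data):
    """Store `data` at the new path; for md5 uploads, also link the old path."""
    new = os.path.join(root, srpmname, filename, hashtype, hashvalue,
                       filename)
    os.makedirs(os.path.dirname(new), exist_ok=True)
    with open(new, "wb") as f:
        f.write(data)
    if hashtype == "md5":
        # Keep clients that only know the old scheme working.
        old = os.path.join(root, srpmname, filename, hashvalue, filename)
        os.makedirs(os.path.dirname(old), exist_ok=True)
        try:
            os.link(new, old)
        except FileExistsError:
            pass
    return new
```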
With this deployed in production, fedpkg could be patched to try
downloading from the new path, and fall back to the old one if necessary,
which decouples the migration to the new path from the migration to the
new hash.
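On the client side, the fallback could look something like the sketch
below. This is not fedpkg code: the function names are made up, and
`fetch` stands in for whatever download routine the client uses,
assumed to return None when the file is not found:

```python
def download_urls(base_url, srpmname, filename, hashtype, hashvalue):
    """Return the URLs to try, in order: new path first, then old path."""
    urls = ["/".join([base_url, srpmname, filename, hashtype, hashvalue,
                      filename])]
    if hashtype == "md5":
        # Only md5 uploads are also reachable under the old scheme.
        urls.append("/".join([base_url, srpmname, filename, hashvalue,
                              filename]))
    return urls


def download(urls, fetch):
    """Return the first result `fetch` finds, or None if every URL misses."""
    for url in urls:
        data = fetch(url)
        if data is not None:
            return data
    return None
```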