Update splitter to fedora modules upstream and improve documentation.
The grobisplitter parts need some documentation to explain what they are doing and for whom. This is a first attempt at getting that right.

Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
parent 8b4c38e29e
commit ddb13e640a
4 changed files with 451 additions and 135 deletions
roles/grobisplitter/README.md | 183 (new file)
@@ -0,0 +1,183 @@
# Grobisplitter

### Or how I learned to stop worrying and love modules

## Where are the sources

The current master git repository for the grobisplitter program is
https://github.com/fedora-modularity/GrobiSplitter . The program
depends upon python3 and some other programs:

* gobject-introspection
* libmodulemd-2.5.0
* libmodulemd1-1.8.11
* librepo
* python3-gobject-base
* python3-hawkey
* python3-librepo

## What does Grobisplitter splitter.py do?

Grobisplitter was born out of the addition of modules to Fedora and
RHEL-8. A module is a virtual rpm repository inside of a standard rpm
repository, from which a sysadmin can choose which virtual repositories
are used on a system. This allows for useful choices without having
to add more repository configs, but it adds a complexity that the koji
build system does not understand. While the MBS system can help
handle this for packages it knows it built, it cannot do so for
external ones, which is the case when building CentOS or EPEL
packages.

Grobisplitter was created by Patrick Uiterwijk to deal with part of
this while permanent solutions were created in MBS and koji.
Grobisplitter takes a modular repository (for example, a reposync
copy of RHEL-8) and 'flattens' it out, with each module becoming its
own independent repository. The options to the command are:

``` shell
[smooge@batcave01 RHEL-8-001]$ /usr/local/bin/splitter.py --help
usage: splitter.py [-h] [--action {hardlink,symlink,copy}] [--target TARGET]
                   [--skip-missing] [--create-repos] [--only-defaults]
                   repository

Split repositories up

positional arguments:
  repository            The repository to split

optional arguments:
  -h, --help            show this help message and exit
  --action {hardlink,symlink,copy}
                        Method to create split repos files
  --target TARGET       Target directory for split repos
  --skip-missing        Skip missing packages
  --create-repos        Create repository metadatas
  --only-defaults       Only output default modules
```
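The three `--action` choices correspond to the usual filesystem operations. A minimal sketch of that dispatch (the helper name here is ours, mirroring the splitter's behaviour, not its exact code):

```python
import os
import shutil

def perform_action(src, dst, action):
    """Copy, hardlink, or symlink src to dst, the way splitter.py's
    --action option chooses how split repos share package files."""
    if action == 'copy':
        shutil.copy(src, dst)
    elif action == 'hardlink':
        os.link(src, dst)       # same inode, no extra disk space
    elif action == 'symlink':
        os.symlink(src, dst)    # pointer back to the original file
    else:
        raise ValueError("unknown action: %s" % action)
```

Hardlinks and symlinks are what make splitting a 60000+ package repository affordable on disk.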

To save disk space, one can use different methods to copy packages,
target a specific directory, only allow for default modules, and
create repos for each of the virtual repositories separately.

Each module is split out into a directory named for its modular data
(name:stream:version:context:arch); for example, as of 2020-12-03,
here are the httpd modules of RHEL-8 split out:

``` shell
[smooge@batcave01 RHEL-8-001]$ ls -1d httpd*
httpd:2.4:8000020190405071959:55190bc5:x86_64/
httpd:2.4:8000020190829150747:f8e95b4e:x86_64/
httpd:2.4:8010020190829143335:cdc1202b:x86_64/
httpd:2.4:8020020200122152618:6a468ee4:x86_64/
httpd:2.4:8020020200824162909:4cda2c84:x86_64/
httpd:2.4:8030020200818000036:30b713e6:x86_64/
```
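Each directory name above is an NSVCA tuple (name, stream, version, context, architecture) joined with colons. A small sketch of pulling one apart (the helper name is illustrative, not part of the splitter):

```python
def parse_nsvca(dirname):
    """Split a module directory name such as
    'httpd:2.4:8030020200818000036:30b713e6:x86_64/' into its
    name/stream/version/context/arch fields, padding any missing
    trailing fields with None."""
    parts = dirname.rstrip('/').split(':')
    parts += [None] * (5 - len(parts))
    name, stream, version, context, arch = parts
    return {'name': name, 'stream': stream, 'version': version,
            'context': context, 'arch': arch}
```

The same padding idea appears later in this commit as splitter.py's `_pad_svca` helper.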

The reason that multiple versions of each module are kept, versus just
the latest, was due to problems in knowing which 'latest' module should
be used. The splitter needs to know about all the packages in the
upstream repositories for modular decisions to be made. This means that
the staged data will be a complete copy of the RHN repository.

``` shell
total 4980
-rw-r--r--. 1 root sysadmin-main 1463679 2020-11-03 09:18 httpd-2.4.37-30.module+el8.3.0+7001+0766b9e7.x86_64.rpm
-rw-r--r--. 1 root sysadmin-main  224591 2020-11-03 09:18 httpd-devel-2.4.37-30.module+el8.3.0+7001+0766b9e7.x86_64.rpm
-rw-r--r--. 1 root sysadmin-main   37599 2020-11-03 09:18 httpd-filesystem-2.4.37-30.module+el8.3.0+7001+0766b9e7.noarch.rpm
-rw-r--r--. 1 root sysadmin-main 2486719 2020-11-03 09:18 httpd-manual-2.4.37-30.module+el8.3.0+7001+0766b9e7.noarch.rpm
-rw-r--r--. 1 root sysadmin-main  106479 2020-11-03 09:18 httpd-tools-2.4.37-30.module+el8.3.0+7001+0766b9e7.x86_64.rpm
-rw-r--r--. 1 root sysadmin-main  157763 2020-11-03 09:18 mod_http2-1.15.7-2.module+el8.3.0+7670+8bf57d29.x86_64.rpm
-rw-r--r--. 1 root sysadmin-main   84163 2020-11-03 09:18 mod_ldap-2.4.37-30.module+el8.3.0+7001+0766b9e7.x86_64.rpm
-rw-r--r--. 1 root sysadmin-main  189343 2020-11-03 09:18 mod_md-2.0.8-8.module+el8.3.0+6814+67d1e611.x86_64.rpm
-rw-r--r--. 1 root sysadmin-main   60531 2020-11-03 09:18 mod_proxy_html-2.4.37-30.module+el8.3.0+7001+0766b9e7.x86_64.rpm
-rw-r--r--. 1 root sysadmin-main   72475 2020-11-03 09:18 mod_session-2.4.37-30.module+el8.3.0+7001+0766b9e7.x86_64.rpm
-rw-r--r--. 1 root sysadmin-main  135799 2020-11-03 09:18 mod_ssl-2.4.37-30.module+el8.3.0+7001+0766b9e7.x86_64.rpm
```

All non-modular rpms from the repository are put in a directory called
`non-modular`, which can also have its own repodata set up for it.

## What does rhel8-split.sh do?

While the splitter command does the hard work of splitting out the
packages, the rhel8-split.sh shell script does the 'business' work of
setting up the repositories so that koji can consume them for EPEL-8
and other builds.

The first part of this is done by a cron job which reposyncs the
various packages down from access.redhat.com for the architectures
Fedora Infrastructure needs. The data is synced down into
subdirectories of `/mnt/fedora/app/fi-repo/rhel/rhel8` which match
the RHEL BaseOS, AppStream, and CodeReadyBuilder channels as needed.

Next, a new destination directory is made in
`/mnt/fedora/app/fi-repo/rhel/rhel8/koji/` with the date of the cron
job run, so that we can always roll back to an older external Red
Hat repo if needed. Afterwards we begin breaking apart the repos per
architecture. The splitter is then called for each channel that is
wanted in EPEL. The BaseOS and AppStream channels only split out the
'default' modules, while Code Ready Builder splits out all modules,
as many are non-default.
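The dated-directory guard described above can be sketched roughly as follows (the function name and layout are illustrative; the real logic lives in rhel8-split.sh):

```python
import datetime
import os

def make_dated_dir(base):
    """Create a koji/<date> staging directory, refusing to reuse an
    existing one, the way rhel8-split.sh guards its DATEDIR so that
    older trees remain available for rollback."""
    date = datetime.date.today().isoformat()
    datedir = os.path.join(base, 'koji', date)
    if os.path.isdir(datedir):
        raise RuntimeError("Directory already exists. Please remove or fix")
    os.makedirs(datedir)
    return datedir
```

Keeping each run in its own dated tree is what makes "roll back to an older external Red Hat repo" a matter of repointing a symlink.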

After the files have been copied into a single tree, `createrepo_c`
is run over the data. This creates a 'flattened' repository; however,
the modular data from all these repos is currently lost.

Once the data has been synced and flattened for all repositories, a
series of links is set up that koji can point to. At this point a
last reposync cycle is done using dnf to pull in only the newest
rpms. This effectively cleans up a large number of older packages and
makes sure the builders have an easier time deciding which package to
use. [As of 2020-12-03, the staged repo has 66130 packages in it,
and the latest tree shrinks that down to 26530.]

Koji is then pointed to the trees on batcave served from
`/mnt/fedora/app/fi-repo/rhel/rhel8/koji/latest/${arch}/RHEL-8-001`.

TODO:

1. Currently the RHEL-8-001 is a consequence of the rhel8-split.sh
   script. We split each repo into its own tree and then copy them
   into one final one. This should be done better.
2. A way to clean up the 'empty' directory names in latest would help
   make it easier to see what is actually being 'used' by koji.

``` shell
[smooge@batcave01 latest]$ ls -1d x86_64/RHEL-8-001/go-toolset\:rhel8\:80*
x86_64/RHEL-8-001/go-toolset:rhel8:8000020190509153318:b9255456:x86_64/
x86_64/RHEL-8-001/go-toolset:rhel8:8000120190520160856:4a778a88:x86_64/
x86_64/RHEL-8-001/go-toolset:rhel8:8000120190828225436:14bc675c:x86_64/
x86_64/RHEL-8-001/go-toolset:rhel8:8010020190829001136:ccff3eb7:x86_64/
x86_64/RHEL-8-001/go-toolset:rhel8:8010020191220185136:0ed30617:x86_64/
x86_64/RHEL-8-001/go-toolset:rhel8:8020020200128163444:0ab52eed:x86_64/
x86_64/RHEL-8-001/go-toolset:rhel8:8020020200817154239:02f7cb7a:x86_64/
x86_64/RHEL-8-001/go-toolset:rhel8:8030020200827141259:13702366:x86_64/
```

This makes it look like there are lots of files; however, only one
tree actually has files in it:

``` shell
[smooge@batcave01 latest]$ find x86_64/RHEL-8-001/go-toolset\:rhel8\:80*
x86_64/RHEL-8-001/go-toolset:rhel8:8000020190509153318:b9255456:x86_64
x86_64/RHEL-8-001/go-toolset:rhel8:8000120190520160856:4a778a88:x86_64
x86_64/RHEL-8-001/go-toolset:rhel8:8000120190828225436:14bc675c:x86_64
x86_64/RHEL-8-001/go-toolset:rhel8:8010020190829001136:ccff3eb7:x86_64
x86_64/RHEL-8-001/go-toolset:rhel8:8010020191220185136:0ed30617:x86_64
x86_64/RHEL-8-001/go-toolset:rhel8:8020020200128163444:0ab52eed:x86_64
x86_64/RHEL-8-001/go-toolset:rhel8:8020020200817154239:02f7cb7a:x86_64
x86_64/RHEL-8-001/go-toolset:rhel8:8030020200827141259:13702366:x86_64
x86_64/RHEL-8-001/go-toolset:rhel8:8030020200827141259:13702366:x86_64/delve-1.4.1-1.module+el8.3.0+7840+63dfb1ed.x86_64.rpm
x86_64/RHEL-8-001/go-toolset:rhel8:8030020200827141259:13702366:x86_64/go-toolset-1.14.7-1.module+el8.3.0+7840+63dfb1ed.x86_64.rpm
x86_64/RHEL-8-001/go-toolset:rhel8:8030020200827141259:13702366:x86_64/golang-1.14.7-2.module+el8.3.0+7840+63dfb1ed.x86_64.rpm
x86_64/RHEL-8-001/go-toolset:rhel8:8030020200827141259:13702366:x86_64/golang-bin-1.14.7-2.module+el8.3.0+7840+63dfb1ed.x86_64.rpm
x86_64/RHEL-8-001/go-toolset:rhel8:8030020200827141259:13702366:x86_64/golang-docs-1.14.7-2.module+el8.3.0+7840+63dfb1ed.noarch.rpm
x86_64/RHEL-8-001/go-toolset:rhel8:8030020200827141259:13702366:x86_64/golang-misc-1.14.7-2.module+el8.3.0+7840+63dfb1ed.noarch.rpm
x86_64/RHEL-8-001/go-toolset:rhel8:8030020200827141259:13702366:x86_64/golang-race-1.14.7-2.module+el8.3.0+7840+63dfb1ed.x86_64.rpm
x86_64/RHEL-8-001/go-toolset:rhel8:8030020200827141259:13702366:x86_64/golang-src-1.14.7-2.module+el8.3.0+7840+63dfb1ed.noarch.rpm
x86_64/RHEL-8-001/go-toolset:rhel8:8030020200827141259:13702366:x86_64/golang-tests-1.14.7-2.module+el8.3.0+7840+63dfb1ed.noarch.rpm
```

@@ -1,12 +0,0 @@
-The Current Master Git Repository for the grobisplitter program is
-https://github.com/smooge/GrobiSplitter.git to be moved under a
-Community Infrastructure repository later. The program depends upon
-python3 and other programs.
-
-gobject-introspection
-libmodulemd-2.5.0
-libmodulemd1-1.8.11
-librepo
-python3-gobject-base
-python3-hawkey
-python3-librepo

rhel8-split.sh:

@@ -1,4 +1,6 @@
 #!/bin/bash

+## Setup basic environment variables.
 HOMEDIR=/mnt/fedora/app/fi-repo/rhel/rhel8
 BINDIR=/usr/local/bin

@@ -7,6 +9,10 @@ DATE=$(date -Ih | sed 's/+.*//')
 DATEDIR=${HOMEDIR}/koji/${DATE}

+##
+## Make a directory for where the new tree will live. Use a new date
+## so that we can roll back to an older release or stop updates for
+## some time if needed.
 if [ -d ${DATEDIR} ]; then
     echo "Directory already exists. Please remove or fix"
     exit

@@ -14,6 +20,9 @@ else
     mkdir -p ${DATEDIR}
 fi

+##
+## Go through each architecture and
+##
 for ARCH in ${ARCHES}; do
     # The archdir is where we daily download updates for rhel8
     ARCHDIR=${HOMEDIR}/${ARCH}

splitter.py:

@@ -12,32 +12,33 @@ import tempfile
 import os
 import subprocess
 import sys
+import logging

 # Look for a specific version of modulemd. The 1.x series does not
 # have the tools we need.
 try:
     gi.require_version('Modulemd', '2.0')
-    from gi.repository import Modulemd
-except:
-    print("We require newer vesions of modulemd than installed..")
-    sys.exit(0)
+    from gi.repository import Modulemd as mmd
+except ValueError:
+    print("libmodulemd 2.0 is not installed..")
+    sys.exit(1)

-mmd = Modulemd
-
-# This code is from Stephen Gallagher to make my other caveman code
-# less icky.
-def _get_latest_streams (mymod, stream):
+# We only want to load the module metadata once. It can be reused as
+# often as required.
+_idx = None
+
+
+def _get_latest_streams(mymod, stream):
     """
     Routine takes a modulemd object and a stream name.
     Finds the latest stream from that and returns it as a stream
     object.
     """
     all_streams = mymod.search_streams(stream, 0)
     latest_streams = mymod.search_streams(stream,
                                           all_streams[0].props.version)

     return latest_streams


 def _get_repoinfo(directory):
     """
     A function which goes into the given directory and sets up the

@@ -54,6 +55,46 @@ def _get_repoinfo(directory):
     r = h.perform()
     return r.getinfo(librepo.LRR_YUM_REPO)


+def _get_modulemd(directory=None, repo_info=None):
+    """
+    Retrieve the module metadata from this repository.
+    :param directory: The path to the repository. Must contain
+                      repodata/repomd.xml and modules.yaml.
+    :param repo_info: An already-acquired repo_info structure
+    :return: A Modulemd.ModulemdIndex object containing the module
+             metadata from this repository.
+    """
+
+    # Return the cached value
+    global _idx
+    if _idx:
+        return _idx
+
+    # If we don't have a cached value, we need either directory or repo_info
+    assert directory or repo_info
+
+    if directory:
+        directory = os.path.abspath(directory)
+        repo_info = _get_repoinfo(directory)
+
+    if 'modules' not in repo_info:
+        return None
+
+    _idx = mmd.ModuleIndex.new()
+
+    with gzip.GzipFile(filename=repo_info['modules'], mode='r') as gzf:
+        mmdcts = gzf.read().decode('utf-8')
+        res, failures = _idx.update_from_string(mmdcts, True)
+        if len(failures) != 0:
+            raise Exception("YAML FAILURE: FAILURES: %s" % failures)
+        if not res:
+            raise Exception("YAML FAILURE: res != True")
+
+    # Ensure that every stream in the index is using v2
+    _idx.upgrade_streams(mmd.ModuleStreamVersionEnum.TWO)
+
+    return _idx
+
+
 def _get_hawkey_sack(repo_info):
     """
     A function to pull in the repository sack from hawkey.

@@ -66,9 +107,10 @@ def _get_hawkey_sack(repo_info):

     primary_sack = hawkey.Sack()
     primary_sack.load_repo(hk_repo, build_cache=False)
+
     return primary_sack


 def _get_filelist(package_sack):
     """
     Determine the file locations of all packages in the sack. Use the

@@ -77,10 +119,12 @@ def _get_filelist(package_sack):
     """
     pkg_list = {}
     for pkg in hawkey.Query(package_sack):
-        nevr="%s-%s:%s-%s.%s"% (pkg.name,pkg.epoch,pkg.version,pkg.release,pkg.arch)
+        nevr = "%s-%s:%s-%s.%s" % (pkg.name, pkg.epoch,
+                                   pkg.version, pkg.release, pkg.arch)
         pkg_list[nevr] = pkg.location
     return pkg_list


 def _parse_repository_non_modular(package_sack, repo_info, modpkgset):
     """
     Simple routine to go through a repo, and figure out which packages

@@ -97,20 +141,14 @@ def _parse_repository_non_modular(package_sack, repo_info, modpkgset):
             pkgs.add(pkg.location)
     return pkgs

-def _parse_repository_modular(repo_info,package_sack):
+
+def _parse_repository_modular(repo_info, package_sack):
     """
     Returns a dictionary of packages indexed by the modules they are
     contained in.
     """
     cts = {}
-    idx = mmd.ModuleIndex()
-    with gzip.GzipFile(filename=repo_info['modules'], mode='r') as gzf:
-        mmdcts = gzf.read().decode('utf-8')
-        res, failures = idx.update_from_string(mmdcts, True)
-        if len(failures) != 0:
-            raise Exception("YAML FAILURE: FAILURES: %s" % failures)
-        if not res:
-            raise Exception("YAML FAILURE: res != True")
-
+    idx = _get_modulemd(repo_info=repo_info)
     pkgs_list = _get_filelist(package_sack)
     idx.upgrade_streams(2)

@@ -124,14 +162,14 @@ def _parse_repository_modular(repo_info, package_sack):
         else:
             continue
         cts[stream.get_NSVCA()] = templ

     return cts


 def _get_modular_pkgset(mod):
     """
     Takes a module and goes through the moduleset to determine which
     packages are inside it.
     Returns a list of packages
     """
     pkgs = set()

@@ -142,6 +180,7 @@ def _get_modular_pkgset(mod):

     return list(pkgs)

+
 def _perform_action(src, dst, action):
     """
     Performs either a copy, hardlink or symlink of the file src to the

@@ -160,6 +199,7 @@ def _perform_action(src, dst, action):
     elif action == 'symlink':
         os.symlink(src, dst)

+
 def validate_filenames(directory, repoinfo):
     """
     Take a directory and repository information. Test each file in

@@ -176,107 +216,175 @@ def validate_filenames(directory, repoinfo):
     return isok


-def get_default_modules(directory):
-    """
-    Work through the list of modules and come up with a default set of
-    modules which would be the minimum to output.
-    Returns a set of modules
-    """
-    directory = os.path.abspath(directory)
-    repo_info = _get_repoinfo(directory)
-
-    provides = set()
-    contents = set()
-    if 'modules' not in repo_info:
-        return contents
-    idx = mmd.ModuleIndex()
-    with gzip.GzipFile(filename=repo_info['modules'], mode='r') as gzf:
-        mmdcts = gzf.read().decode('utf-8')
-        res, failures = idx.update_from_string(mmdcts, True)
-        if len(failures) != 0:
-            raise Exception("YAML FAILURE: FAILURES: %s" % failures)
-        if not res:
-            raise Exception("YAML FAILURE: res != True")
-
-    idx.upgrade_streams(2)
-    # OK this is cave-man no-sleep programming. I expect there is a
-    # better way to do this that would be a lot better. However after
-    # a long long day.. this is what I have.
-
-    # First we oo through the default streams and create a set of
-    # provides that we can check against later.
-    for modname in idx.get_default_streams():
-        mod = idx.get_module(modname)
-        # Get the default streams and loop through them.
-        stream_set = mod.get_streams_by_stream_name(
-            mod.get_defaults().get_default_stream())
-        for stream in stream_set:
-            tempstr = "%s:%s" % (stream.props.module_name,
-                                 stream.props.stream_name)
-            provides.add(tempstr)
-
-    # Now go through our list and build up a content lists which will
-    # have only modules which have their dependencies met
-    tempdict = {}
-    for modname in idx.get_default_streams():
-        mod = idx.get_module(modname)
-        # Get the default streams and loop through them.
-        # This is a sorted list with the latest in it. We could drop
-        # looking at later ones here in a future version. (aka lines
-        # 237 to later)
-        stream_set = mod.get_streams_by_stream_name(
-            mod.get_defaults().get_default_stream())
-        for stream in stream_set:
-            ourname = stream.get_NSVCA()
-            tmp_name = "%s:%s" % (stream.props.module_name,
-                                  stream.props.stream_name)
-            # Get dependencies is a list of items. All of the modules
-            # seem to only have 1 item in them, but we should loop
-            # over the list anyway.
-            for deps in stream.get_dependencies():
-                isprovided = True  # a variable to say this can be added.
-                for mod in deps.get_runtime_modules():
-                    tempstr=""
-                    # It does not seem easy to figure out what the
-                    # platform is so just assume we will meet it.
-                    if mod != 'platform':
-                        for stm in deps.get_runtime_streams(mod):
-                            tempstr = "%s:%s" %(mod,stm)
-                        if tempstr not in provides:
-                            # print( "%s : %s not found." % (ourname,tempstr))
-                            isprovided = False
-                if isprovided:
-                    if tmp_name in tempdict:
-                        # print("We found %s" % tmp_name)
-                        # Get the stream version we are looking at
-                        ts1=ourname.split(":")[2]
-                        # Get the stream version we stored away
-                        ts2=tempdict[tmp_name].split(":")[2]
-                        # See if we got a newer one. We probably
-                        # don't as it is a sorted list but we
-                        # could have multiple contexts which would
-                        # change things.
-                        if ( int(ts1) > int(ts2) ):
-                            # print ("%s > %s newer for %s", ts1,ts2,ourname)
-                            tempdict[tmp_name] = ourname
-                    else:
-                        # print("We did not find %s" % tmp_name)
-                        tempdict[tmp_name] = ourname
-    # OK we finally got all our stream names we want to send back to
-    # our calling function. Read them out and add them to the set.
-    for indx in tempdict:
-        contents.add(tempdict[indx])
-
-    return contents
+def _get_recursive_dependencies(all_deps, idx, stream, ignore_missing_deps):
+    if stream.get_NSVCA() in all_deps:
+        # We've already encountered this NSVCA, so don't go through it again
+        logging.debug('Already included {}'.format(stream.get_NSVCA()))
+        return
+
+    # Store this NSVCA/NS pair
+    local_deps = all_deps
+    local_deps.add(stream.get_NSVCA())
+
+    logging.debug("Recursive deps: {}".format(stream.get_NSVCA()))
+
+    # Loop through the dependencies for this stream
+    deps = stream.get_dependencies()
+
+    # At least one of the dependency array entries must exist in the repo
+    found_dep = False
+    for dep in deps:
+        # Within an array entry, all of the modules must be present in the
+        # index
+        found_all_modules = True
+        for modname in dep.get_runtime_modules():
+            # Ignore "platform" because it's special
+            if modname == "platform":
+                logging.debug('Skipping platform')
+                continue
+            logging.debug('Processing dependency on module {}'.format(modname))
+
+            mod = idx.get_module(modname)
+            if not mod:
+                # This module wasn't present in the index.
+                found_module = False
+                continue
+
+            # Within a module, at least one of the requested streams must be
+            # present
+            streamnames = dep.get_runtime_streams(modname)
+            found_stream = False
+            for streamname in streamnames:
+                stream_list = _get_latest_streams(mod, streamname)
+                for inner_stream in stream_list:
+                    try:
+                        _get_recursive_dependencies(
+                            local_deps, idx, inner_stream, ignore_missing_deps)
+                    except FileNotFoundError as e:
+                        # Could not find all of this stream's dependencies in
+                        # the repo
+                        continue
+                    found_stream = True
+
+            # None of the streams were found for this module
+            if not found_stream:
+                found_all_modules = False
+
+        # We've iterated through all of the modules; if it's still True, this
+        # dependency is consistent in the index
+        if found_all_modules:
+            found_dep = True
+
+    # We were unable to resolve the dependencies for any of the array entries.
+    # raise FileNotFoundError
+    if not found_dep and not ignore_missing_deps:
+        raise FileNotFoundError(
+            "Could not resolve dependencies for {}".format(
+                stream.get_NSVCA()))
+
+    all_deps.update(local_deps)
+
+
+def get_default_modules(directory, ignore_missing_deps):
+    """
+    Work through the list of modules and come up with a default set of
+    modules which would be the minimum to output.
+    Returns a set of modules
+    """
+    all_deps = set()
+
+    idx = _get_modulemd(directory)
+    if not idx:
+        return all_deps
+
+    for modname, streamname in idx.get_default_streams().items():
+        # Only the latest version of a stream is important, as that is the
+        # only one that DNF will consider in its transaction logic. We still
+        # need to handle each context individually.
+        mod = idx.get_module(modname)
+        stream_set = _get_latest_streams(mod, streamname)
+        for stream in stream_set:
+            # Different contexts have different dependencies
+            try:
+                logging.debug("Processing {}".format(stream.get_NSVCA()))
+                _get_recursive_dependencies(all_deps, idx, stream,
+                                            ignore_missing_deps)
+                logging.debug("----------")
+            except FileNotFoundError as e:
+                # Not all dependencies could be satisfied
+                print(
+                    "Not all dependencies for {} could be satisfied. {}. Skipping".format(
+                        stream.get_NSVCA(), e))
+                continue
+
+    logging.debug('Default module streams: {}'.format(all_deps))
+
+    return all_deps
+
+
+def _pad_svca(svca, target_length):
+    """
+    If the split() doesn't return all values (e.g. arch is missing), pad it
+    with `None`
+    """
+    length = len(svca)
+    svca.extend([None] * (target_length - length))
+    return svca
+
+
+def _dump_modulemd(modname, yaml_file):
+    idx = _get_modulemd()
+    assert idx
+
+    # Create a new index to hold the information about this particular
+    # module and stream
+    new_idx = mmd.ModuleIndex.new()
+
+    # Add the module streams
+    module_name, *svca = modname.split(':')
+    stream_name, version, context, arch = _pad_svca(svca, 4)
+
+    logging.debug("Dumping YAML for {}, {}, {}, {}, {}".format(
+        module_name, stream_name, version, context, arch))
+
+    mod = idx.get_module(module_name)
+    streams = mod.search_streams(stream_name, int(version), context, arch)
+
+    # This should usually be a single item, but we'll be future-compatible
+    # and account for the possibility of having multiple streams here.
+    for stream in streams:
+        new_idx.add_module_stream(stream)
+
+    # Add the module defaults
+    defs = mod.get_defaults()
+    if defs:
+        new_idx.add_defaults(defs)
+
+    # libmodulemd doesn't currently expose the get_translation()
+    # function, but that will be added in 2.8.0
+    try:
+        # Add the translation object
+        translation = mod.get_translation()
+        if translation:
+            new_idx.add_translation(translation)
+    except AttributeError as e:
+        # This version of libmodulemd does not yet support this function.
+        # Just ignore it.
+        pass
+
+    # Write out the file
+    try:
+        with open(yaml_file, 'w') as output:
+            output.write(new_idx.dump_to_string())
+    except PermissionError as e:
+        logging.error("Could not write YAML to file: {}".format(e))
+        raise

def perform_split(repos, args, def_modules):
    for modname in repos:
        if args.only_defaults and modname not in def_modules:
            continue

        targetdir = os.path.join(args.target, modname)
        os.mkdir(targetdir)

@@ -287,8 +395,12 @@ def perform_split(repos, args, def_modules):
                os.path.join(targetdir, pkgfile),
                args.action)

        # Extract the modular metadata for this module
        if modname != 'non_modular':
            _dump_modulemd(modname, os.path.join(targetdir, 'modules.yaml'))

def create_repos(target, repos, def_modules, only_defaults):
    """
    Routine to create repositories. Input is target directory and a
    list of repositories.
@@ -297,9 +409,19 @@ def create_repos(target, repos,def_modules, only_defaults):
    for modname in repos:
        if only_defaults and modname not in def_modules:
            continue

        targetdir = os.path.join(target, modname)

        subprocess.run([
            'createrepo_c', targetdir,
            '--no-database'])

        if modname != 'non_modular':
            subprocess.run([
                'modifyrepo_c',
                '--mdtype=modules',
                os.path.join(targetdir, 'modules.yaml'),
                os.path.join(targetdir, 'repodata')
            ])

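Per module directory, `create_repos` shells out to the `createrepo_c` tools; the equivalent commands look roughly like this (the `/srv/split/perl:5.26` path is a placeholder, not a path from the source):

```
createrepo_c /srv/split/perl:5.26 --no-database
modifyrepo_c --mdtype=modules \
    /srv/split/perl:5.26/modules.yaml \
    /srv/split/perl:5.26/repodata
```

The `modifyrepo_c` step injects the module metadata extracted by `_dump_modulemd` into the new repo's repodata, which is what makes the split repo a valid modular repository for dnf.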
def parse_args():
@@ -309,6 +431,8 @@ def parse_args():
    """
    parser = argparse.ArgumentParser(description='Split repositories up')
    parser.add_argument('repository', help='The repository to split')
    parser.add_argument('--debug', help='Enable debug logging',
                        action='store_true', default=False)
    parser.add_argument('--action', help='Method to create split repos files',
                        choices=('hardlink', 'symlink', 'copy'),
                        default='hardlink')
@@ -319,6 +443,11 @@ def parse_args():
                        action='store_true', default=False)
    parser.add_argument('--only-defaults', help='Only output default modules',
                        action='store_true', default=False)
    parser.add_argument('--ignore-missing-default-deps',
                        help='When using --only-defaults, do not skip '
                             'default streams whose dependencies cannot be '
                             'resolved within this repository',
                        action='store_true', default=False)
    return parser.parse_args()

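Putting the new flags together, a typical invocation might look like the following (the script path and repository paths are illustrative placeholders):

```
python3 splitter.py /srv/reposync/rhel-8-appstream \
    --target /srv/split/appstream \
    --action hardlink --only-defaults --create-repos
```

This splits a reposync'd tree into one repo per default module stream plus a `non_modular` repo, hardlinking the packages and running createrepo_c on each result.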
@@ -337,6 +466,7 @@ def setup_target(args):
    else:
        os.mkdir(args.target)

def parse_repository(directory):
    """
    Parse a specific directory, returning a dict with keys module NSVC's and
@@ -353,45 +483,51 @@ def parse_repository(directory):
    # If we have a repository with no modules we do not want our
    # script to error out but just remake the repository with
    # everything in a known sack (aka non_modular).

    if 'modules' in repo_info:
        mod = _parse_repository_modular(repo_info, package_sack)
        modpkgset = _get_modular_pkgset(mod)
    else:
        mod = dict()
        modpkgset = set()

    non_modular = _parse_repository_non_modular(package_sack, repo_info,
                                                modpkgset)
    mod['non_modular'] = non_modular

    # We should probably go through our default modules here and
    # remove them from our mod. This would cut down some code paths.
    return mod

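The dict `parse_repository` returns drives both `perform_split` and `create_repos`. An illustrative sketch of its shape (the NSVCA key and package names are made-up examples, not real data): module NSVCA strings map to package sets, with a catch-all `non_modular` key for everything outside any module.

```python
# Made-up example of the structure; only the shape matters.
repos = {
    'nodejs:10:8020020191001:cdc1202b:x86_64': {
        'nodejs-10.16.3-1.module+el8.x86_64.rpm',
    },
    'non_modular': {'bash-4.4.19-10.el8.x86_64.rpm'},
}
```

Keying the non-modular leftovers under a reserved name lets the rest of the script treat them like just another "module" while skipping the modules.yaml steps for it.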
def main():
    # Determine what the arguments are.
    args = parse_args()

    if args.debug:
        logging.basicConfig(level=logging.DEBUG)

    # Go through arguments and act on their values.
    setup_target(args)

    repos = parse_repository(args.repository)

    if args.only_defaults:
        def_modules = get_default_modules(args.repository,
                                          args.ignore_missing_default_deps)
    else:
        def_modules = set()

    def_modules.add('non_modular')

    if not args.skip_missing:
        if not validate_filenames(args.repository, repos):
            raise ValueError("Package files were missing!")
    if args.target:
        perform_split(repos, args, def_modules)
    if args.create_repos:
        create_repos(args.target, repos, def_modules, args.only_defaults)


if __name__ == '__main__':
    main()