From e0799069a7878052036e4356626078e3426c727a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michal=20Kone=C4=8Dn=C3=BD?=
Date: Thu, 19 Jan 2023 16:58:48 +0100
Subject: [PATCH] Add investigation of GitLab import from file option
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This investigation is trying to figure out if the GitLab import/export
feature could be used for migrating Pagure projects.

Signed-off-by: Michal Konečný
---
 docs/pagure2gitlab/gitlab_file_import.rst | 144 ++++++++++++++++++++++
 docs/pagure2gitlab/index.rst              |   3 +-
 2 files changed, 146 insertions(+), 1 deletion(-)
 create mode 100644 docs/pagure2gitlab/gitlab_file_import.rst

diff --git a/docs/pagure2gitlab/gitlab_file_import.rst b/docs/pagure2gitlab/gitlab_file_import.rst
new file mode 100644
index 0000000..310aaa2
--- /dev/null
+++ b/docs/pagure2gitlab/gitlab_file_import.rst
@@ -0,0 +1,144 @@
+.. _gitlab_file_import:
+
+GitLab import from a file
+=========================
+
+GitLab provides an option to `import a new project from a file
+`_. This option was created for migrating
+projects from one GitLab instance to another. For the Pagure to GitLab importer we would need to adapt
+Pagure data to the file format used by GitLab. This document investigates that option.
+
+
+GitLab file format
+------------------
+
+To investigate the GitLab export format I exported the `test project
+`_ I created during the investigation of GitLab API.
+See :doc:`gitlab`.
+
+The export generates one archive in `tar.gz` format. This archive contains the following
+directory structure::
+
+    2023-01-20_11-48-813_testgroup519_arc_export
+    ├── GITLAB_REVISION
+    ├── GITLAB_VERSION
+    ├── lfs-objects
+    │   └── 45a5d77993d525cdda15d08e63c34339a1bf49a43756a05908082bb04b4c4087
+    ├── lfs-objects.json
+    ├── project.bundle
+    ├── project.design.bundle
+    ├── snippets
+    ├── tree
+    │   ├── project
+    │   │   ├── auto_devops.ndjson
+    │   │   ├── boards.ndjson
+    │   │   ├── ci_cd_settings.ndjson
+    │   │   ├── ci_pipelines.ndjson
+    │   │   ├── container_expiration_policy.ndjson
+    │   │   ├── custom_attributes.ndjson
+    │   │   ├── error_tracking_setting.ndjson
+    │   │   ├── external_pull_requests.ndjson
+    │   │   ├── issues.ndjson
+    │   │   ├── labels.ndjson
+    │   │   ├── merge_requests.ndjson
+    │   │   ├── metrics_setting.ndjson
+    │   │   ├── milestones.ndjson
+    │   │   ├── pipeline_schedules.ndjson
+    │   │   ├── project_badges.ndjson
+    │   │   ├── project_feature.ndjson
+    │   │   ├── project_members.ndjson
+    │   │   ├── prometheus_metrics.ndjson
+    │   │   ├── protected_branches.ndjson
+    │   │   ├── protected_environments.ndjson
+    │   │   ├── protected_tags.ndjson
+    │   │   ├── push_rule.ndjson
+    │   │   ├── releases.ndjson
+    │   │   ├── security_setting.ndjson
+    │   │   ├── service_desk_setting.ndjson
+    │   │   └── snippets.ndjson
+    │   └── project.json
+    ├── uploads
+    │   └── 8b4f7247f154d0b77c5d7d13e16cb904
+    │       └── Infra___Releng_2022.jpg
+    └── VERSION
+
+    7 directories, 35 files
+
+The following explains some of the files found in the archive:
+
+* GitLab metadata files (version and revision)
+
+* `.bundle` file, which is created by the `git bundle `_ command.
+  You can easily inspect the content of a `.bundle` file by using the `git clone` command.
+
+* `.design.bundle` contains all the attachments from issues and merge requests. It is
+  a repository bundled by the `git bundle `_ command.
+
+* `lfs-objects.json` contains a list of hashes of designs and their mapping to an issue id or merge
+  request id. This is something we can skip, because Pagure doesn't have this feature.
+
+* `VERSION` file contains a version, but I was not able to determine what this version refers to. My assumption is
+  that it's the version of the export tool.
+
+* `lfs-objects/` folder contains all the designs named by hash. This is something we can skip,
+  because Pagure doesn't have this feature.
+
+* `snippets/` folder contains `GitLab snippets `_.
+
+* `tree/project.json` file contains all the project metadata in JSON format.
+
+* `tree/project/` contains files in `ndjson format `_ describing various
+  objects defined in the GitLab project. For the purpose of this investigation only `issues.ndjson` and
+  `merge_requests.ndjson` are important for us.
+
+* `uploads/` folder contains all the attachments from issues or merge requests.
+
+
+Conversion of Pagure project to GitLab file formats
+---------------------------------------------------
+
+For the purpose of the investigation I tried to convert the `ARC project `_
+hosted on Pagure to the GitLab import format. To do this I started with the export generated by GitLab
+and changed the files to correspond to what I wanted to import.
+
+Here is the list of all the files that I needed to prepare, with an explanation of their content:
+
+* `project.bundle` is a binary bundle file created by the `git bundle `_
+  command. It was created by running `git bundle create project.bundle --all` inside the ARC project repository.
+
+* `tree/project/issues.ndjson` contains the issue descriptions in `ndjson format `_.
+  The file has `author_id` set to 0, instead it contains an `author` object with the
+  FAS username and public FAS e-mail. Unfortunately, if the `author_id` isn't recognized by GitLab, it
+  will create the issue or comment as the user who is running the import, completely ignoring the `author`
+  object in the JSON.
+
+  .. code-block:: json
+
+    {"title":"Investigate the GitLab API for Pagure to Gitlab importer","author_id":0,"author":{"username": "zlopez","email": "michal.konecny@pacse.eu"},"project_id":42729361,"created_at":"2023-01-19T11:41:40.000Z","updated_at":"2023-01-19T14:06:47.659Z","description":"Investigate the GitLab API for Pagure to Gitlab importer ARC investigation. This ticket will also work as a test ticket in investigation.","iid":1,"updated_by_id":null,"weight":null,"confidential":false,"due_date":null,"lock_version":0,"time_estimate":0,"relative_position":513,"last_edited_at":null,"last_edited_by_id":null,"discussion_locked":null,"closed_at":"2023-01-19T14:06:47.641Z","closed_by_id":3072529,"health_status":null,"external_key":null,"issue_type":"issue","state":"closed","events":[{"project_id":42729361,"author_id":3072529,"created_at":"2023-01-19T13:07:11.164Z","updated_at":"2023-01-19T13:07:11.164Z","action":"created","target_type":"Issue","fingerprint":null},{"project_id":42729361,"author_id":3072529,"created_at":"2023-01-19T14:06:47.712Z","updated_at":"2023-01-19T14:06:47.712Z","action":"closed","target_type":"Issue","fingerprint":null}],"timelogs":[],"notes":[{"note":"Here's a sample comment as you requested @zlopez.","noteable_type":"Issue","author_id":3072529,"created_at":"2023-01-19T12:59:59.000Z","updated_at":"2023-01-19T12:59:59.000Z","project_id":42729361,"attachment":{"url":null},"line_code":null,"commit_id":null,"st_diff":null,"system":false,"updated_by_id":null,"type":null,"position":null,"original_position":null,"resolved_at":null,"resolved_by_id":null,"discussion_id":"f98cdeabaaec68ae453e1dbf5d9e535fbbcede0a","change_position":null,"resolved_by_push":null,"confidential":null,"last_edited_at":"2023-01-19T12:59:59.000Z","author":{"name":"Zlopez"},"award_emoji":[],"events":[{"project_id":42729361,"author_id":3072529,"created_at":"2023-01-19T13:13:21.071Z","updated_at":"2023-01-19T13:13:21.071Z","action":"commented","target_type":"Note","fingerprint":null}]}],"label_links":[],"resource_label_events":[],"resource_milestone_events":[],"resource_state_events":[{"user_id":3072529,"created_at":"2023-01-19T14:06:47.734Z","state":"closed","source_commit":null,"close_after_error_tracking_resolve":false,"close_auto_resolve_prometheus_alert":false}],"designs":[],"design_versions":[],"issue_assignees":[],"zoom_meetings":[],"award_emoji":[],"resource_iteration_events":[]}
+    {"title":"Test open issue","author_id":0,"author":{"username": "akashdeep","email": "akashdeep.dhar@gmail.com"},"project_id":42729361,"created_at":"2023-01-19T14:07:05.823Z","updated_at":"2023-01-20T11:48:02.495Z","description":"Test open issue","iid":2,"updated_by_id":null,"weight":null,"confidential":false,"due_date":null,"lock_version":0,"time_estimate":0,"relative_position":1026,"last_edited_at":null,"last_edited_by_id":null,"discussion_locked":null,"closed_at":null,"closed_by_id":null,"health_status":null,"external_key":null,"issue_type":"issue","state":"opened","events":[{"project_id":42729361,"author_id":3072529,"created_at":"2023-01-19T14:07:05.930Z","updated_at":"2023-01-19T14:07:05.930Z","action":"created","target_type":"Issue","fingerprint":null}],"timelogs":[],"notes":[{"note":"![Infra___Releng_2022](/uploads/8b4f7247f154d0b77c5d7d13e16cb904/Infra___Releng_2022.jpg)","noteable_type":"Issue","author_id":3072529,"created_at":"2023-01-20T11:48:02.435Z","updated_at":"2023-01-20T11:48:02.435Z","project_id":42729361,"attachment":{"url":null},"line_code":null,"commit_id":null,"st_diff":null,"system":false,"updated_by_id":null,"type":null,"position":null,"original_position":null,"resolved_at":null,"resolved_by_id":null,"discussion_id":"30302c7dee98663fcfca845a2ec2715eb3e35e4f","change_position":null,"resolved_by_push":null,"confidential":null,"last_edited_at":"2023-01-20T11:48:02.435Z","author":{"name":"Zlopez"},"award_emoji":[],"events":[{"project_id":42729361,"author_id":3072529,"created_at":"2023-01-20T11:48:02.617Z","updated_at":"2023-01-20T11:48:02.617Z","action":"commented","target_type":"Note","fingerprint":null}]},{"note":"added [1 design](/testgroup519/arc/-/issues/2/designs?version=490993)","noteable_type":"Issue","author_id":3072529,"created_at":"2023-01-19T14:07:45.310Z","updated_at":"2023-01-19T14:07:45.315Z","project_id":42729361,"attachment":{"url":null},"line_code":null,"commit_id":null,"st_diff":null,"system":true,"updated_by_id":null,"type":null,"position":null,"original_position":null,"resolved_at":null,"resolved_by_id":null,"discussion_id":"e15e7c584cc7e6c7e298529f034f0b55eeacca90","change_position":null,"resolved_by_push":null,"confidential":null,"last_edited_at":"2023-01-19T14:07:45.315Z","author":{"name":"Zlopez"},"award_emoji":[],"system_note_metadata":{"commit_count":null,"action":"designs_added","created_at":"2023-01-19T14:07:45.343Z","updated_at":"2023-01-19T14:07:45.343Z"},"events":[]}],"label_links":[],"resource_label_events":[],"resource_milestone_events":[],"resource_state_events":[],"designs":[{"project_id":42729361,"filename":"Infra___Releng_2022.jpg","relative_position":null,"iid":1,"notes":[]}],"design_versions":[{"sha":"69419c215f53d401c1b3c451e6fc08e3351d2679","created_at":"2023-01-19T14:07:45.233Z","author_id":3072529,"actions":[{"event":"creation","design":{"project_id":42729361,"filename":"Infra___Releng_2022.jpg","relative_position":null,"iid":1}}]}],"issue_assignees":[],"zoom_meetings":[],"award_emoji":[],"resource_iteration_events":[]}
+
+
+Importing the archive to GitLab
+-------------------------------
+
+The archive for the migration is prepared by executing the `tar -czvf test_arc_export.tar.gz .` command. This needs
+to be executed in the root folder of the prepared file structure, otherwise the import will fail with
+`No such file or directory`.
+
+To import the archive, a GitLab `API call `_
+could be used. Here is the full API call made with `curl`::
+
+    curl --request POST --header "PRIVATE-TOKEN: XXX" --form "namespace=testgroup519" --form "path=arc2" --form "file=@test_arc_export.tar.gz" "https://gitlab.com/api/v4/projects/import"
+
+To check for any errors in the import, use the GitLab `import status API call
+`_. This can be done with `curl`::
+
+    curl --header "PRIVATE-TOKEN: XXX" "https://gitlab.com/api/v4/projects//import"
+
+
+Conclusion
+----------
+
+At this point I ended the investigation, because the situation is the same as when using the API,
+which is much more convenient to use and provides a better response in case of errors (I spent two days trying
+to debug the `No such file or directory [FILTERED]` error message).
diff --git a/docs/pagure2gitlab/index.rst b/docs/pagure2gitlab/index.rst
index 3e6c3ff..a653951 100644
--- a/docs/pagure2gitlab/index.rst
+++ b/docs/pagure2gitlab/index.rst
@@ -40,13 +40,14 @@ List of features that would be nice to have.
 Investigation
 -------------
 
-Following are the investigations of Pagure options to export the data and GitLab API investigation.
+Following are the investigations of Pagure options to export and GitLab options to import.
 
 .. toctree::
    :maxdepth: 1
 
    pagure
    gitlab
+   gitlab_file_import
 
 
 Conclusions