arc/docs/pagure2gitlab/gitlab_file_import.rst
Ryan Lerch ba720c3d77 fix parsing errors and sphinx warnings
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2023-11-20 13:04:34 +00:00


.. _gitlab_file_import:

GitLab import from a file
=========================
GitLab provides an option to `import a new project from a file
<https://docs.gitlab.com/ee/api/project_import_export.html>`_. This option was created
for migrating projects from one GitLab instance to another. For the Pagure to GitLab
importer we would need to adapt the Pagure data to the file format used by GitLab. This
document investigates that option.
GitLab file format
------------------
To investigate the GitLab export format I exported the `test project
<https://gitlab.com/testgroup519/arc>`_ that I created during the investigation of the
GitLab API. See :doc:`gitlab`.

The export generates a single archive in `tar.gz` format. This archive contains the
following directory structure:
.. code-block::

    2023-01-20_11-48-813_testgroup519_arc_export
    ├── GITLAB_REVISION
    ├── GITLAB_VERSION
    ├── lfs-objects
    │   └── 45a5d77993d525cdda15d08e63c34339a1bf49a43756a05908082bb04b4c4087
    ├── lfs-objects.json
    ├── project.bundle
    ├── project.design.bundle
    ├── snippets
    ├── tree
    │   ├── project
    │   │   ├── auto_devops.ndjson
    │   │   ├── boards.ndjson
    │   │   ├── ci_cd_settings.ndjson
    │   │   ├── ci_pipelines.ndjson
    │   │   ├── container_expiration_policy.ndjson
    │   │   ├── custom_attributes.ndjson
    │   │   ├── error_tracking_setting.ndjson
    │   │   ├── external_pull_requests.ndjson
    │   │   ├── issues.ndjson
    │   │   ├── labels.ndjson
    │   │   ├── merge_requests.ndjson
    │   │   ├── metrics_setting.ndjson
    │   │   ├── milestones.ndjson
    │   │   ├── pipeline_schedules.ndjson
    │   │   ├── project_badges.ndjson
    │   │   ├── project_feature.ndjson
    │   │   ├── project_members.ndjson
    │   │   ├── prometheus_metrics.ndjson
    │   │   ├── protected_branches.ndjson
    │   │   ├── protected_environments.ndjson
    │   │   ├── protected_tags.ndjson
    │   │   ├── push_rule.ndjson
    │   │   ├── releases.ndjson
    │   │   ├── security_setting.ndjson
    │   │   ├── service_desk_setting.ndjson
    │   │   └── snippets.ndjson
    │   └── project.json
    ├── uploads
    │   └── 8b4f7247f154d0b77c5d7d13e16cb904
    │       └── Infra___Releng_2022.jpg
    └── VERSION

    7 directories, 35 files
Following is an explanation of some of the files found in the archive:

- `GITLAB_REVISION` and `GITLAB_VERSION` are GitLab metadata files (version and
  revision).
- The `.bundle` file is created by the `git bundle <https://git-scm.com/docs/git-bundle>`_
  command. You can easily look at the content of the `.bundle` file by using the
  `git clone` command.
- `.design.bundle` contains all the attachments from issues and merge requests. It is a
  repository file bundled by the `git bundle <https://git-scm.com/docs/git-bundle>`_
  command.
- `lfs-objects.json` contains a list of hashes of designs and their mapping to an issue
  id or merge request id. This is something we can skip, because Pagure doesn't have
  this feature.
- The `VERSION` file contains a version, but I was not able to find out what this
  version refers to. My assumption is that it's the version of the export tool.
- The `lfs-objects/` folder contains all the designs named by hash. This is something
  we can skip, because Pagure doesn't have this feature.
- The `snippets/` folder contains `GitLab snippets
  <https://docs.gitlab.com/ee/user/snippets.html>`_.
- The `tree/project.json` file contains all the project metadata in JSON format.
- `tree/project/` contains files in `ndjson format <http://ndjson.org/>`_ describing
  various objects defined in the GitLab project. For the purpose of this investigation
  only `issues.ndjson` and `merge_requests.ndjson` are important for us.
- The `uploads/` folder contains all the attachments from issues or merge requests.
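The contents described above can also be checked without unpacking the archive. The following is a minimal sketch using Python's standard `tarfile` module; it was not part of the original investigation, the function name `summarize_export` is hypothetical, and the only file it parses is `tree/project/issues.ndjson`.

```python
import json
import tarfile


def summarize_export(archive_path):
    """Return (member names, parsed issues) for a GitLab export archive."""
    issues = []
    with tarfile.open(archive_path, "r:gz") as archive:
        members = archive.getmembers()
        names = [member.name for member in members]
        for member in members:
            # Member names may carry a leading "./", so match on the suffix.
            if member.isfile() and member.name.endswith("tree/project/issues.ndjson"):
                with archive.extractfile(member) as handle:
                    # ndjson: one JSON document per non-empty line.
                    for line in handle:
                        if line.strip():
                            issues.append(json.loads(line))
    return names, issues
```

Calling `summarize_export("export.tar.gz")` on a downloaded export (the file name here is only an example) returns the list of archive members and the issues as Python dictionaries, which makes it easy to compare a hand-made archive with one generated by GitLab.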
Conversion of Pagure project to GitLab file formats
---------------------------------------------------
For the purpose of the investigation I tried to convert the `ARC project
<https://pagure.io/fedora-infra/arc>`_ hosted on Pagure to the GitLab import format. I
started with the export generated by GitLab and changed the files to correspond to what
I wanted to import.

Here is the list of all the files that I needed to prepare, with an explanation of
their content:

- `project.bundle` is a binary bundle file created by the `git bundle
  <https://git-scm.com/docs/git-bundle>`_ command. It was created by running
  `git bundle create project.bundle --all` inside the ARC project repository.
- `tree/project/issues.ndjson` contains the issue descriptions in `ndjson format
  <http://ndjson.org/>`_. In this file `project_id` or `author_id` is set to 0, and an
  `author` object with the FAS username and public FAS e-mail is added instead.
  Unfortunately, if the `author_id` isn't recognized by GitLab, the issue or comment is
  created as the user who is running the import, and the `author` object in the JSON is
  completely ignored.
.. code-block:: json

    {"title":"Investigate the GitLab API for Pagure to Gitlab importer","author_id":0,"author":{"username": "zlopez","email": "michal.konecny@pacse.eu"},"project_id":42729361,"created_at":"2023-01-19T11:41:40.000Z","updated_at":"2023-01-19T14:06:47.659Z","description":"Investigate the GitLab API for Pagure to Gitlab importer ARC investigation. This ticket will also work as a test ticket in investigation.","iid":1,"updated_by_id":null,"weight":null,"confidential":false,"due_date":null,"lock_version":0,"time_estimate":0,"relative_position":513,"last_edited_at":null,"last_edited_by_id":null,"discussion_locked":null,"closed_at":"2023-01-19T14:06:47.641Z","closed_by_id":3072529,"health_status":null,"external_key":null,"issue_type":"issue","state":"closed","events":[{"project_id":42729361,"author_id":3072529,"created_at":"2023-01-19T13:07:11.164Z","updated_at":"2023-01-19T13:07:11.164Z","action":"created","target_type":"Issue","fingerprint":null},{"project_id":42729361,"author_id":3072529,"created_at":"2023-01-19T14:06:47.712Z","updated_at":"2023-01-19T14:06:47.712Z","action":"closed","target_type":"Issue","fingerprint":null}],"timelogs":[],"notes":[{"note":"Here's a sample comment as you requested @zlopez.","noteable_type":"Issue","author_id":3072529,"created_at":"2023-01-19T12:59:59.000Z","updated_at":"2023-01-19T12:59:59.000Z","project_id":42729361,"attachment":{"url":null},"line_code":null,"commit_id":null,"st_diff":null,"system":false,"updated_by_id":null,"type":null,"position":null,"original_position":null,"resolved_at":null,"resolved_by_id":null,"discussion_id":"f98cdeabaaec68ae453e1dbf5d9e535fbbcede0a","change_position":null,"resolved_by_push":null,"confidential":null,"last_edited_at":"2023-01-19T12:59:59.000Z","author":{"name":"Zlopez"},"award_emoji":[],"events":[{"project_id":42729361,"author_id":3072529,"created_at":"2023-01-19T13:13:21.071Z","updated_at":"2023-01-19T13:13:21.071Z","action":"commented","target_type":"Note","fingerprint":null}]}],"label_links":[],"resource_label_events":[],"resource_milestone_events":[],"resource_state_events":[{"user_id":3072529,"created_at":"2023-01-19T14:06:47.734Z","state":"closed","source_commit":null,"close_after_error_tracking_resolve":false,"close_auto_resolve_prometheus_alert":false}],"designs":[],"design_versions":[],"issue_assignees":[],"zoom_meetings":[],"award_emoji":[],"resource_iteration_events":[]}
    {"title":"Test open issue","author_id":0,"author":{"username": "akashdeep","email": "akashdeep.dhar@gmail.com"},"project_id":42729361,"created_at":"2023-01-19T14:07:05.823Z","updated_at":"2023-01-20T11:48:02.495Z","description":"Test open issue","iid":2,"updated_by_id":null,"weight":null,"confidential":false,"due_date":null,"lock_version":0,"time_estimate":0,"relative_position":1026,"last_edited_at":null,"last_edited_by_id":null,"discussion_locked":null,"closed_at":null,"closed_by_id":null,"health_status":null,"external_key":null,"issue_type":"issue","state":"opened","events":[{"project_id":42729361,"author_id":3072529,"created_at":"2023-01-19T14:07:05.930Z","updated_at":"2023-01-19T14:07:05.930Z","action":"created","target_type":"Issue","fingerprint":null}],"timelogs":[],"notes":[{"note":"![Infra___Releng_2022](/uploads/8b4f7247f154d0b77c5d7d13e16cb904/Infra___Releng_2022.jpg)","noteable_type":"Issue","author_id":3072529,"created_at":"2023-01-20T11:48:02.435Z","updated_at":"2023-01-20T11:48:02.435Z","project_id":42729361,"attachment":{"url":null},"line_code":null,"commit_id":null,"st_diff":null,"system":false,"updated_by_id":null,"type":null,"position":null,"original_position":null,"resolved_at":null,"resolved_by_id":null,"discussion_id":"30302c7dee98663fcfca845a2ec2715eb3e35e4f","change_position":null,"resolved_by_push":null,"confidential":null,"last_edited_at":"2023-01-20T11:48:02.435Z","author":{"name":"Zlopez"},"award_emoji":[],"events":[{"project_id":42729361,"author_id":3072529,"created_at":"2023-01-20T11:48:02.617Z","updated_at":"2023-01-20T11:48:02.617Z","action":"commented","target_type":"Note","fingerprint":null}]},{"note":"added [1 design](/testgroup519/arc/-/issues/2/designs?version=490993)","noteable_type":"Issue","author_id":3072529,"created_at":"2023-01-19T14:07:45.310Z","updated_at":"2023-01-19T14:07:45.315Z","project_id":42729361,"attachment":{"url":null},"line_code":null,"commit_id":null,"st_diff":null,"system":true,"updated_by_id":null,"type":null,"position":null,"original_position":null,"resolved_at":null,"resolved_by_id":null,"discussion_id":"e15e7c584cc7e6c7e298529f034f0b55eeacca90","change_position":null,"resolved_by_push":null,"confidential":null,"last_edited_at":"2023-01-19T14:07:45.315Z","author":{"name":"Zlopez"},"award_emoji":[],"system_note_metadata":{"commit_count":null,"action":"designs_added","created_at":"2023-01-19T14:07:45.343Z","updated_at":"2023-01-19T14:07:45.343Z"},"events":[]}],"label_links":[],"resource_label_events":[],"resource_milestone_events":[],"resource_state_events":[],"designs":[{"project_id":42729361,"filename":"Infra___Releng_2022.jpg","relative_position":null,"iid":1,"notes":[]}],"design_versions":[{"sha":"69419c215f53d401c1b3c451e6fc08e3351d2679","created_at":"2023-01-19T14:07:45.233Z","author_id":3072529,"actions":[{"event":"creation","design":{"project_id":42729361,"filename":"Infra___Releng_2022.jpg","relative_position":null,"iid":1}}]}],"issue_assignees":[],"zoom_meetings":[],"award_emoji":[],"resource_iteration_events":[]}
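Generating such records from Pagure data could be sketched as follows. This is only an illustration under stated assumptions: the input field names (`id`, `title`, `content`, `status`, `date_created`, `last_updated`, `closed_at`, `user`, `comments`) are assumed to match what the Pagure API returns for an issue, the helper names are hypothetical, and only a subset of the GitLab fields from the example above is filled in. Whether GitLab accepts records with the remaining fields missing was not verified.

```python
import json
from datetime import datetime, timezone


def to_gitlab_time(epoch):
    """Convert a Unix timestamp (int or numeric string) to an ISO 8601 UTC string."""
    moment = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%S.000Z")


def pagure_issue_to_gitlab(issue, project_id=0):
    """Map one Pagure issue dict to a dict shaped like a GitLab export record."""
    closed = issue["status"] != "Open"
    return {
        "title": issue["title"],
        # 0 is a placeholder; the real GitLab user id is unknown at this point.
        "author_id": 0,
        "author": {"username": issue["user"]["name"]},
        "project_id": project_id,
        "created_at": to_gitlab_time(issue["date_created"]),
        "updated_at": to_gitlab_time(issue["last_updated"]),
        "description": issue["content"],
        "iid": issue["id"],
        "issue_type": "issue",
        "state": "closed" if closed else "opened",
        "closed_at": (
            to_gitlab_time(issue["closed_at"])
            if closed and issue.get("closed_at")
            else None
        ),
        "notes": [
            {
                "note": comment["comment"],
                "noteable_type": "Issue",
                "author_id": 0,
                "author": {"name": comment["user"]["name"]},
                "created_at": to_gitlab_time(comment["date_created"]),
                "updated_at": to_gitlab_time(comment["date_created"]),
                "system": False,
            }
            for comment in issue.get("comments", [])
        ],
    }


def dump_ndjson(records):
    """Serialize records as ndjson: one compact JSON document per line."""
    return "".join(json.dumps(r, separators=(",", ":")) + "\n" for r in records)
```

The output of `dump_ndjson` would be written to `tree/project/issues.ndjson`. Because `json.dumps` escapes newlines inside strings, each issue stays on a single line, which the ndjson format requires.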
Importing the archive to GitLab
-------------------------------
The archive for the migration is prepared by executing the
`tar -czvf test_arc_export.tar.gz .` command. It needs to be executed in the root folder
of the prepared file structure, otherwise the import will fail with `No such file or
directory`.
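The same packaging step could be done from Python with the standard `tarfile` module. The sketch below is an assumption about how an importer script might do it, not something that was tested against GitLab; `build_import_archive` is a hypothetical name. The point it illustrates is that every entry is stored relative to the root of the prepared structure, which mirrors running `tar` inside that folder.

```python
import os
import tarfile


def build_import_archive(source_dir, archive_path):
    """Pack the contents of source_dir so they sit at the root of the archive."""
    with tarfile.open(archive_path, "w:gz") as archive:
        for entry in sorted(os.listdir(source_dir)):
            # arcname drops the source_dir prefix, so the archive has no
            # extra top-level folder wrapping the export files.
            archive.add(os.path.join(source_dir, entry), arcname=entry)
    return archive_path
```

It is safest to write the archive to a path outside `source_dir`, so the output file cannot end up among the entries being packed.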
To import the archive to GitLab, the `API call
<https://docs.gitlab.com/ee/api/project_import_export.html#import-a-file>`_ can be
used. Here is the full API call made with `curl`:

.. code-block::

    curl --request POST --header "PRIVATE-TOKEN: XXX" \
         --form "namespace=testgroup519" --form "path=arc2" \
         --form "file=@test_arc_export.tar.gz" \
         "https://gitlab.com/api/v4/projects/import"
To check for any errors in the import, use the GitLab `import status API call
<https://docs.gitlab.com/ee/api/project_import_export.html#import-status>`_. It can
also be made with `curl`:

.. code-block::

    curl --header "PRIVATE-TOKEN: XXX" \
         "https://gitlab.com/api/v4/projects/<id_returned_by_import_call>/import"
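Because the import runs asynchronously, the status call usually has to be repeated until the import ends. A minimal polling sketch in Python is shown below. It assumes that the response is a JSON object with an `import_status` field that eventually becomes `finished` or `failed`, and an `import_error` field describing the failure, which is how I read the GitLab API documentation. The function names and default values are hypothetical.

```python
import json
import time
import urllib.request


def fetch_import_status(base_url, project_id, token):
    """GET <base_url>/api/v4/projects/<project_id>/import and decode the JSON body."""
    request = urllib.request.Request(
        f"{base_url}/api/v4/projects/{project_id}/import",
        headers={"PRIVATE-TOKEN": token},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_import(fetch, attempts=30, delay=10.0):
    """Call fetch() until the import finishes or fails; return the last status."""
    status = {}
    for _ in range(attempts):
        status = fetch()
        # Assumed terminal states; any other value means the import is pending.
        if status.get("import_status") in ("finished", "failed"):
            break
        time.sleep(delay)
    return status
```

A caller could pass `lambda: fetch_import_status("https://gitlab.com", project_id, token)` as `fetch` and then inspect `import_error` in the returned dictionary when the status is `failed`. Passing the fetch function in as a parameter keeps the polling loop testable without network access.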
Conclusion
----------
At this point I ended the investigation, because the situation is the same as when
using the API, which is much more convenient to use and provides a better response in
case of errors (I spent two days trying to debug the `No such file or directory
[FILTERED]` error message).