It is unfortunately all mixed up, because refreshing the data, means breaking the tests. And changing the code means needing fresh data.
- tests: ignore some more headers and sort the rest when dumping http responses
- code: fixed#10234 by requesting the latest issues first.
- tests: created a new repo to replace the disappeared repo, needed for the skip-numbers test
- refreshed the testdata.
- follow-up fixes to get the tests green.
- including a cherry-pick of https://github.com/go-gitea/gitea/pull/36295 and #11272
Co-authored-by: Joakim Olsson <joakim@unbound.se>
Co-authored-by: Robert Wolff <mahlzahn@posteo.de>
Reviewed-on: https://codeberg.org/forgejo/forgejo/pulls/11282
Reviewed-by: Robert Wolff <mahlzahn@posteo.de>
Reviewed-by: Gusted <gusted@noreply.codeberg.org>
Reviewed-by: patdyn <patdyn@noreply.codeberg.org>
Co-authored-by: oliverpool <git@olivier.pfad.fr>
Co-committed-by: oliverpool <git@olivier.pfad.fr>
For the previous code with the Page attribute present in
ListCursorOptions for page 1, github would not return an "After" cursor,
such that the request for page 2 would request what effectively is the
content of page 1 a second time.
This would lead to an attempt to insert the same issues twice.
Note that this is not the only reason why this can happen with the
current code base.
We fix this particular issue by not using the Page attribute so github
does return an "After" cursor.
Fixes#10794
Reviewed-on: https://codeberg.org/forgejo/forgejo/pulls/10798
Reviewed-by: Gusted <gusted@noreply.codeberg.org>
Co-authored-by: Nils Goroll <nils.goroll@uplex.de>
Co-committed-by: Nils Goroll <nils.goroll@uplex.de>
This is a successor to #10805, which simply did not work. It is also much simpler and basically a one line change to enable an existing feature in [go-github](https://github.com/google/go-github).
Fixes#10845
With this fix and #10798 in place, a migration of a repo with ~3K issues and ~1.3k pull requests finally completed successfully.
## Patch
We use SleepUntilPrimaryRateLimitResetWhenRateLimited to instruct the go-github code to wait until the retry time and retry the request when the primary rate limit gets hit.
## Test case
TestGitHubDownloadRepo() has been modified such that 403 rate limit errors are injected every 7 requests with a retry time of one second, resulting in the rate limit condition being hit twice with the current tests. The test case confirms that the migration code itself is in fact unaffected by the rate limit being hit.
## Scope
This change does not affect secondary rate limits.
If the server is restarted during the wait for the rate limit refresh, the migration likely still fails when retried, because inserts for already present database objects will be attempted.
This approach effectively puts the task's goroutine to sleep until the retry time, which implies that the respective resources stay allocated.
A better approach might be to add the necessary infrastructure to support restarts of migration tasks at a later time, but this is much more involved, because the migration state would need to be saved and/or re-created based on already pulled data. This would also require adding support for database upserts.
Reviewed-on: https://codeberg.org/forgejo/forgejo/pulls/10846
Reviewed-by: Gusted <gusted@noreply.codeberg.org>
Co-authored-by: Nils Goroll <nils.goroll@uplex.de>
Co-committed-by: Nils Goroll <nils.goroll@uplex.de>
As mentioned in https://codeberg.org/forgejo/forgejo/issues/8131 and https://codeberg.org/forgejo/forgejo/issues/9018:
The github API changed and they now use cursor based pagination. So migration of issues could fail if there were about 10k resources to migrate.
What was done:
* Added a test for reproduction of the bug
* Updated the go-github library to v74
* Update api usage for Reactions
* Added a struct to GithubDownloaderV3 which holds cursorPagination related info
* Updated GetIssues to use cursorPagination
Caveats:
* So far, only listing issues supports the cursor method
* The test requires a valid access token to github as we need to access a repository with **a lot** of issues to test the issue
* We may want to skip this test in the pipeline
Reviewed-on: https://codeberg.org/forgejo/forgejo/pulls/9348
Reviewed-by: Gusted <gusted@noreply.codeberg.org>
Co-authored-by: erik <erik_se@posteo.de>
Co-committed-by: erik <erik_se@posteo.de>
Related to https://codeberg.org/Codeberg/Community/issues/1944
* Allowed the githubdownloaderv3 to know whether issues and, or PRs are requested to migrate
* Used this information to decide to filter for "/pulls/" or "/issues"
* Or not to filter at all if issues == true && prs == true
* Added isolated test for the downloader and for the uploader
* Created a new test_repo in github.com/forgejo and set it up properly together with @Gusted
* Updated github_downloader_test with the new URLs and test data from the repo
* Recorded the API calls for local testing
* Added a minimal gitbucket test (which uses the github downloader under the hood)
Reviewed-on: https://codeberg.org/forgejo/forgejo/pulls/8892
Reviewed-by: Gusted <gusted@noreply.codeberg.org>
Co-authored-by: patdyn <patdyn@noreply.codeberg.org>
Co-committed-by: patdyn <patdyn@noreply.codeberg.org>
Git authorization was not taking into account multiple token feature,
leading to auth failures
Closes: https://github.com/go-gitea/gitea/issues/34141
---------
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
(cherry picked from commit 8a6df00c532becd4d10efb70827ccf80b2bf74e2)
* cleanup: remove not used properties
* feat: implement migration of website field from gogs
* feat: implement dumping and restoring website field
* feat: implement migration of website field from gitea
* feat: implement migration of homepage/website field from github
* feat: implement website properties for repository migration
Gogs migration is untested for now.
Reviewed-on: https://codeberg.org/forgejo/forgejo/pulls/6474
Reviewed-by: Otto <otto@codeberg.org>
Co-authored-by: ThomasBoom89 <thomasboom89@noreply.codeberg.org>
Co-committed-by: ThomasBoom89 <thomasboom89@noreply.codeberg.org>
Change all license headers to comply with REUSE specification.
Fix#16132
Co-authored-by: flynnnnnnnnnn <flynnnnnnnnnn@github>
Co-authored-by: John Olheiser <john.olheiser@gmail.com>
GitHub migration tests will be skipped if the secret for the GitHub API
token hasn't been set.
This change should make all tests pass (or skip in the case of this one)
for anyone running the pipeline on their own infrastructure without
further action on their part.
Resolves https://github.com/go-gitea/gitea/issues/21739
Signed-off-by: Gary Moon <gary@garymoon.net>
Storing the foreign identifier of an imported issue in the database is a prerequisite to implement idempotent migrations or mirror for issues. It is a baby step towards mirroring that introduces a new table.
At the moment when an issue is created by the Gitea uploader, it fails if the issue already exists. The Gitea uploader could be modified so that, instead of failing, it looks up the database to find an existing issue. And if it does it would update the issue instead of creating a new one. However this is not currently possible because an information is missing from the database: the foreign identifier that uniquely represents the issue being migrated is not persisted. With this change, the foreign identifier is stored in the database and the Gitea uploader will then be able to run a query to figure out if a given issue being imported already exists.
The implementation of mirroring for issues, pull requests, releases, etc. can be done in three steps:
1. Store an identifier for the element being mirrored (issue, pull request...) in the database (this is the purpose of these changes)
2. Modify the Gitea uploader to be able to update an existing repository with all it contains (issues, pull request...) instead of failing if it exists
3. Optimize the Gitea uploader to speed up the updates, when possible.
The second step creates code that does not yet exist to enable idempotent migrations with the Gitea uploader. When a migration is done for the first time, the behavior is not changed. But when a migration is done for a repository that already exists, this new code is used to update it.
The third step can use the code created in the second step to optimize and speed up migrations. For instance, when a migration is resumed, an issue that has an update time that is not more recent can be skipped and only newly created issues or updated ones will be updated. Another example of optimization could be that a webhook notifies Gitea when an issue is updated. The code triggered by the webhook would download only this issue and call the code created in the second step to update the issue, as if it was in the process of an idempotent migration.
The ForeignReferences table is added to contain local and foreign ID pairs relative to a given repository. It can later be used for pull requests and other artifacts that can be mirrored. Although the foreign id could be added as a single field in issues or pull requests, it would need to be added to all tables that represent something that can be mirrored. Creating a new table makes for a simpler and more generic design. The drawback is that it requires an extra lookup to obtain the information. However, this extra information is only required during migration or mirroring and does not impact the way Gitea currently works.
The foreign identifier of an issue or pull request is similar to the identifier of an external user, which is stored in reactions, issues, etc. as OriginalPosterID and so on. The representation of a user is however different and the ability of users to link their account to an external user at a later time is also a logic that is different from what is involved in mirroring or migrations. For these reasons, despite some commonalities, it is unclear at this time how the two tables (foreign reference and external user) could be merged together.
The ForeignID field is extracted from the issue migration context so that it can be dumped in files with dump-repo and later restored via restore-repo.
The GetAllComments downloader method is introduced to simplify the implementation and not overload the Context for the purpose of pagination. It also clarifies in which context the comments are paginated and in which context they are not.
The Context interface is no longer useful for the purpose of retrieving the LocalID and ForeignID since they are now both available from the PullRequest and Issue struct. The Reviewable and Commentable interfaces replace and serve the same purpose.
The Context data member of PullRequest and Issue becomes a DownloaderContext to clarify that its purpose is not to support in memory operations while the current downloader is acting but is not otherwise persisted. It is, for instance, used by the GitLab downloader to store the IsMergeRequest boolean and sort out issues.
---
[source](https://lab.forgefriends.org/forgefriends/forgefriends/-/merge_requests/36)
Signed-off-by: Loïc Dachary <loic@dachary.org>
Co-authored-by: Loïc Dachary <loic@dachary.org>