Overzealous sanitisation of migration URL #12137

Closed
opened 2025-11-02 09:59:45 -06:00 by GiteaMirror · 3 comments
Owner

Originally created by @arifer612 on GitHub (Dec 1, 2023).

Description

Background

Using the migration feature, I am trying to create a pull mirror of a git repository hosted on Overleaf. There are two ways to pull the repository:

  1. Using the project's URL (password-auth URL), where the authentication is accomplished using the Overleaf username and password.
  2. Using the project's token URL (token-auth URL), where the authentication is accomplished using the Overleaf username and an access token.

Screenshot from 2023-12-02 02-25-35-obfuscated
Two ways of cloning the Overleaf repository. The top URL is the password-auth URL, and the bottom URL is the token-auth URL.

The problem

Creating mirror clones using the first method, where I authenticated with my username and password, can be done without an issue. However, whenever I try to create mirror clones using the second method, it fails. Of course, I have verified that there is nothing wrong with the link or the access token, as cloning it on my computers worked without any issue. So it seems that the problem may lie in how Gitea clones the repository.

The issue

After a closer inspection of the logs and the process, I figured out what was actually going on. The token-auth URL, https://git@git.overleaf.com/PROJECT_ID, is always sanitised to https://git.overleaf.com/PROJECT_ID, the password-auth URL, which needs to be authenticated with a password and not an access token!
Screenshot from 2023-12-02 02-46-59-obfuscated
Gitea log entries detailing the failure to authenticate

To verify my suspicions, I tried once again to create a migration using the token-auth URL, but this time with my Overleaf username and password as the authentication details. The result was as expected: a successful migration and a working pull mirror. In other words, the migration URL is being overzealously sanitised.
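
As a rough illustration of the sanitisation described above, here is a minimal Go sketch. It is not Gitea's actual code: `stripUserInfo` is a made-up helper, and the only assumption is that the sanitising step amounts to dropping the user-info part (the text before the `@`) from the URL. Under that assumption, the token-auth URL collapses into the password-auth URL.

```go
package main

import (
	"fmt"
	"net/url"
)

// stripUserInfo is a hypothetical helper for illustration only.
// It removes the user-info part (everything before "@") from a URL.
func stripUserInfo(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	u.User = nil // drop "git@" (or "user:pass@") from the URL
	return u.String(), nil
}

func main() {
	out, err := stripUserInfo("https://git@git.overleaf.com/PROJECT_ID")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	// Prints: https://git.overleaf.com/PROJECT_ID
	fmt.Println(out)
}
```

Once the `git@` prefix is gone, the remote has no way to tell that token authentication was intended unless the username is supplied some other way.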

A possible solution

I am unsure where to start looking in the source code to fix this issue, or whether it is even right for me to say that migration URLs that take the form of the token-auth URL, i.e. https://git@DOMAIN/REPO, should not be sanitised. However, if there is no strong reason for the sanitisation to be in place, then I think it should be removed.

This issue is reproducible on the demo site, but there is no link I can provide since the repository cannot be migrated to begin with.

Gitea Version

1.21.1

Can you reproduce the bug on the Gitea demo site?

Yes

Log Gist

No response

Screenshots

No response

Git Version

2.40.1

Operating System

Slackware

How are you running Gitea?

As a Docker container.

Database

MySQL/MariaDB

GiteaMirror added the type/bug label 2025-11-02 09:59:45 -06:00

@arifer612 commented on GitHub (Dec 1, 2023):

How to reproduce this

  1. Create an authentication token on Overleaf.
  2. Create a new document on Overleaf.
  3. Get the token-auth URL for the Overleaf document (https://git@git.overleaf.com/PROJECT_ID)
  4. Begin a new migration on Gitea:
    • token-auth URL as the migration URL.
    • Overleaf username as the authentication username.
    • Authentication token generated in (1) as the authentication password.
  5. "Migrate repository" <<< It will fail here.
  6. Begin a new migration on Gitea:
    • token-auth URL as the migration URL.
    • Overleaf username as the authentication username.
    • Overleaf password as the authentication password.
  7. "Migrate repository" <<< It will succeed here.

@lng2020 commented on GitHub (Dec 4, 2023):

The backend will assume your user is git and strip out the username. Indeed, it could be more flexible, but that would require a redesign.
FYI
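https://github.com/go-gitea/gitea/blob/ec1feedbf582b05b6a5e8c59fb2457f25d053ba2/services/mirror/mirror_pull.go#L41-L45
https://github.com/go-gitea/gitea/blob/ec1feedbf582b05b6a5e8c59fb2457f25d053ba2/modules/util/sanitize.go#L35-L74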

Have you tried setting username to git and password to your token in that credential box? Not sure if it will work.


@arifer612 commented on GitHub (Dec 4, 2023):

Thanks for identifying where the issue lies!

> Have you tried setting username to git and password to your token in that credential box? Not sure if it will work.

I tried it, and it works! Wow, I didn't expect it to be that straightforward. I suppose this makes sense: after the token-auth URL is sanitised, the server should expect either username + password, or git + user token.
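
To make the workaround concrete, here is a small Go sketch of the two credential pairs mentioned above. It is not how Gitea builds its clone URL internally: `withCredentials` is a hypothetical helper, and `ACCESS_TOKEN` is a placeholder. It only shows what the credentials look like when attached to the sanitised URL.

```go
package main

import (
	"fmt"
	"net/url"
)

// withCredentials is a hypothetical helper for illustration only.
// It attaches a username and password (or token) to a URL as user-info.
func withCredentials(raw, username, password string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	u.User = url.UserPassword(username, password)
	return u.String(), nil
}

func main() {
	// Sanitised migration URL plus "git" as the username and the
	// access token as the password (the working combination).
	out, err := withCredentials("https://git.overleaf.com/PROJECT_ID", "git", "ACCESS_TOKEN")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	// Prints: https://git:ACCESS_TOKEN@git.overleaf.com/PROJECT_ID
	fmt.Println(out)
}
```

Putting the Overleaf username in the username field together with the token fails because, as far as this thread shows, the token is only accepted when the username is `git`.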

This workaround solves my immediate issue, so I'll be closing the issue. However, you're right to say that there should be more flexibility, but if it requires a redesign, then there's no need to push for it right now.

Once again, thanks for your help!


Reference: github-starred/gitea#12137