Dashes in Wiki page titles are replaced with spaces #3647

Closed
opened 2025-11-02 05:20:28 -06:00 by GiteaMirror · 17 comments

Originally created by @qwertfisch on GitHub (Jul 22, 2019).

Description

The Gitea internal wiki does not allow dash characters in page titles. I can enter it on creation, but it is replaced with a space on display.

I know that the direct cause is that spaces are replaced with dashes on filename level (also in the URL), so there is no distinction between spaces and dashes when reopening a page. Hence it will all be escaped and displayed as a space character.
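The ambiguity described above can be sketched in Go. This is a minimal model of the behaviour reported in this issue, not Gitea's actual code; the helper names `titleToFilename` and `filenameToTitle` are hypothetical, and the use of `url.QueryEscape`/`url.QueryUnescape` is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// titleToFilename models the reported behaviour (hypothetical helper):
// spaces become dashes, then the result is URL-escaped.
func titleToFilename(title string) string {
	return url.QueryEscape(strings.ReplaceAll(title, " ", "-")) + ".md"
}

// filenameToTitle models the reverse direction (hypothetical helper):
// the name is URL-unescaped, then every dash becomes a space.
func filenameToTitle(filename string) (string, error) {
	base := strings.TrimSuffix(filename, ".md")
	unescaped, err := url.QueryUnescape(base)
	if err != nil {
		return "", err
	}
	return strings.ReplaceAll(unescaped, "-", " "), nil
}

func main() {
	// Two different titles that collapse onto the same filename.
	for _, title := range []string{"Page with dash here", "Page-with-dash-here"} {
		filename := titleToFilename(title)
		back, err := filenameToTitle(filename)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%q -> %q -> %q\n", title, filename, back)
	}
}
```

Because both titles map to the same filename, the reverse conversion has no way to tell which one was meant, so it always picks the version with spaces.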

Attempted workaround with URL-escaped character

All non-ASCII Unicode characters are displayed as URL-escaped variants of their Unicode code point, and even / is replaced with %2F. So I would expect that a filename containing %2D (which stands for the dash) could be loaded and displayed with its correct title. I manually created such a file and pushed it, but the %2D-escaped dash is still converted to a space. Even worse: the URL of the page won’t load anymore.

This looks like a deeper error in escape handling with the dash character.

Environment:

  • Gitea version (or commit ref): 1.9.0+
  • Git version: 2.22
  • Operating system: Linux
  • Database: MySQL
  • Reproducible at https://try.gitea.io: yes, see https://try.gitea.io/qwertfisch/TestRepo/wiki/Page-with-dash-here
GiteaMirror added the type/proposal label 2025-11-02 05:20:28 -06:00

@zeripath commented on GitHub (Jul 22, 2019):

If what I suspect is happening is right, then I bet it will work if you use %252D.

However that's not really a solution...


@lunny commented on GitHub (Jul 23, 2019):

It seems we already have a PR to fix this.


@qwertfisch commented on GitHub (Jul 24, 2019):

@lunny I could not find anything related in the list of open PRs. Can you reference the PR you have in mind?


@qwertfisch commented on GitHub (Jul 24, 2019):

@zeripath Sorry, that does not work either. The replacement of a dash character with a space seems to be executed right after URI decoding, but there is no double decoding.

I made an overview table to show the different behaviour. The last column indicates whether the page can be loaded at the given URL, which does not work in the second and third cases. Those pages are only visible in the list of all pages.

| Description | Filename | Resulting URL | Display | Page is found |
| --- | --- | --- | --- | ---: |
| Normal URI encoded chars | `colon%3A-try-it.md` | `colon%3A-try-it` | colon: try it | ✔ |
| Space URI encoded | `space%20test.md` | `space-test` | space test | ✘ |
| Dash URI encoded | `page-with-one%2Ddash.md` | `page-with-one-dash` | page with one dash | ✘ |
| Dash double URI encoded | `page-with-one%252Ddash.md` | `page-with-one%2Ddash` | page with one%2Ddash | ✔ |

The second and third rows are kind of confusing. It looks as if %20 and %2D are URI decoded from the filename, and then the default dash/space conversion is applied to create the page title. But the URL seems to be created not from the filename, but from the (previously decoded) page title. That of course results in - characters for all spaces (and no escaped dash at all), so neither the file with %2D nor the one with %20 can be found.
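The explanation above can be checked with a small model. The following Go sketch assumes that the title is derived from the filename (unescape, then dash to space) and that the link is derived from the title (space to dash, then escape). The helper names are hypothetical and this is not Gitea's actual code. The model is meant to reproduce the "Display" and "Page is found" columns; the URL strings it produces are not guaranteed to match the "Resulting URL" column character for character.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// filenameToTitle (hypothetical): unescape the stored name, then turn dashes into spaces.
func filenameToTitle(filename string) (string, error) {
	unescaped, err := url.QueryUnescape(strings.TrimSuffix(filename, ".md"))
	if err != nil {
		return "", err
	}
	return strings.ReplaceAll(unescaped, "-", " "), nil
}

// titleToURL (hypothetical): build the link from the decoded title, not from the filename.
func titleToURL(title string) string {
	return url.QueryEscape(strings.ReplaceAll(title, " ", "-"))
}

// linkResolves reports whether the generated link, with ".md" appended,
// names the same file the title was derived from.
func linkResolves(filename string) (bool, error) {
	title, err := filenameToTitle(filename)
	if err != nil {
		return false, err
	}
	return titleToURL(title)+".md" == filename, nil
}

func main() {
	files := []string{
		"colon%3A-try-it.md",
		"space%20test.md",
		"page-with-one%2Ddash.md",
		"page-with-one%252Ddash.md",
	}
	for _, f := range files {
		title, err := filenameToTitle(f)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		found, _ := linkResolves(f)
		fmt.Printf("%-28s display=%q found=%v\n", f, title, found)
	}
}
```

In this model a page is only reachable when escaping the decoded title happens to reproduce the original filename, which is the case for the first and fourth rows but not for the second and third.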

Suggestion

Can’t we just skip the erroneous space/dash conversion and store the page with URI encoding? This would solve every problem case, with the slight disadvantage of spaces being encoded as %20 in the URL …


@mrsdizzie commented on GitHub (Jul 24, 2019):

Haven't looked too close but this is probably happening here:

https://github.com/go-gitea/gitea/blob/d4667a4949729ad73b7dc0c633bd6daca4508073/models/wiki.go#L43-L53

So it's first unescaping the file name and then running it through something that just replaces - with " ", which matches the described behavior.

I agree the filenames should always have been stored encoded, but unfortunately they weren't, and changing that would put dashes in the titles of everyone's pages where they weren't there before. It would be a breaking change to consider.


@zeripath commented on GitHub (Jul 24, 2019):

Make it a per repo configurable?


@mrsdizzie commented on GitHub (Jul 24, 2019):

I hesitate to add yet more config options, particularly to keep a behavior that is not good (current replacing of space with dash).

I'd rather try and have some sort of code to handle legacy cases, or even just consider all literal dashes as spaces (breaking the less common case of a dash in the title) and then escape everything going forward and handle that properly.

I guess that would maybe look like first replacing all literal dashes and then escaping the filename. It would only break existing titles with a dash in them, which already seem to not work properly anyway per this issue.
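One possible reading of this proposal, sketched in Go: new filenames escape everything (spaces as %20, dashes as %2D), and when reading a filename any literal dash is first treated as a legacy space before unescaping. The helper names `encodeTitle` and `decodeFilename` are hypothetical, and the choice of `url.PathUnescape` (which leaves a literal + untouched) is an assumption, not something settled in this thread.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// encodeTitle (hypothetical) escapes the whole title, including
// spaces (%20) and dashes (%2D), so nothing is ambiguous on disk.
func encodeTitle(title string) string {
	escaped := url.QueryEscape(title) // spaces become "+", a literal "+" becomes "%2B"
	escaped = strings.ReplaceAll(escaped, "+", "%20")
	escaped = strings.ReplaceAll(escaped, "-", "%2D")
	return escaped + ".md"
}

// decodeFilename (hypothetical) treats literal dashes as spaces,
// which covers legacy filenames, and then unescapes the rest.
func decodeFilename(filename string) (string, error) {
	base := strings.TrimSuffix(filename, ".md")
	base = strings.ReplaceAll(base, "-", " ")
	return url.PathUnescape(base)
}

func main() {
	for _, name := range []string{encodeTitle("well-known page"), "Page-with-dash-here.md"} {
		title, err := decodeFilename(name)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%q -> %q\n", name, title)
	}
}
```

With this scheme, legacy files keep the titles they display today, while newly created pages can carry a real dash.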


@zeripath commented on GitHub (Jul 24, 2019):

Hmm what happens to + ?


@mrsdizzie commented on GitHub (Jul 24, 2019):

In what situation? I think that when saving a new filename we should make sure to encode + as %2B and not leave it as +.

Are you asking what happens if somebody already has a + in their filename? If so, that should still just work, since decoding wouldn't mess with it.


@mrsdizzie commented on GitHub (Jul 24, 2019):

Probably the real problem to solve would be preserving current links or making sure they still work somehow


@qwertfisch commented on GitHub (Jul 25, 2019):

@zeripath + and ? are escaped and can be used properly. In fact every ASCII special character is usable. Only -._~ are not encoded but put directly in the filename / URL. (And of course the space is converted to dash before storing as file. This should be %20 normally.)

  • page title: chartest !"#$%&'()*+, ./:;<=>?@[\]^_`{|}~§
  • filename: chartest-%21%22%23%24%25%26%27%28%29%2A%2B%2C-.%2F%3A%3B%3C%3D%3E%3F%40%5B%5C%5D%5E_%60%7B%7C%7D~%C2%A7.md
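The filename above is what Go's `url.QueryEscape` yields once spaces have been replaced by dashes, which can be checked with a short sketch. That Gitea builds the filename exactly this way is an assumption here, and `wikiFilename` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// wikiFilename (hypothetical) replaces spaces with dashes and then
// URL-escapes the result; only letters, digits and -._~ stay as they are.
func wikiFilename(title string) string {
	return url.QueryEscape(strings.ReplaceAll(title, " ", "-")) + ".md"
}

func main() {
	title := "chartest !\"#$%&'()*+, ./:;<=>?@[\\]^_`{|}~§"
	fmt.Println(wikiFilename(title))
}
```

The two spaces in the title end up as literal dashes, which is why they cannot be told apart from a real dash afterwards.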

@zeripath commented on GitHub (Jul 25, 2019):

@mrsdizzie @qwertfisch I meant that a plain '+' in a URL should map to ' ' by convention, and in my testing it appears that this doesn't get mapped to '-' but rather stays a ' ' when passed through, so you should be able to reach a file with spaces that way.
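The '+' convention belongs to query-string (form) encoding. In Go's `net/url` package it is honoured by `url.QueryUnescape` but not by `url.PathUnescape`, as the short sketch below shows. Which of the two the wiki route actually uses is not stated in this thread, and `decodeBoth` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/url"
)

// decodeBoth (hypothetical) decodes the same raw string once as a
// query component and once as a path segment.
func decodeBoth(raw string) (asQuery, asPath string, err error) {
	if asQuery, err = url.QueryUnescape(raw); err != nil {
		return "", "", err
	}
	if asPath, err = url.PathUnescape(raw); err != nil {
		return "", "", err
	}
	return asQuery, asPath, nil
}

func main() {
	q, p, err := decodeBoth("space+test")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("query: %q, path: %q\n", q, p)
}
```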

Yeesh, this is so broken. I've been ignoring the wiki, like the diff page, because I've been suspicious that it needs a thorough overhaul. This confirms my fears.

@mrsdizzie I agree that adding configuration should be avoided as much as possible. If we don't want to add configuration, another option is to do things correctly from 1.10 onwards, but fall back to the previous behaviour if the old way would point to a different file?
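A rough Go sketch of such a fallback, under the assumption that "doing things correctly" means escaping spaces and dashes, and that the old behaviour is tried only when the strictly encoded file does not exist. All helper names (`strictFilename`, `legacyFilename`, `resolveFilename`) are hypothetical, and the `exists` callback stands in for a lookup in the wiki repository.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// strictFilename (hypothetical): spaces become %20 and dashes become %2D.
func strictFilename(title string) string {
	escaped := strings.ReplaceAll(url.QueryEscape(title), "+", "%20")
	return strings.ReplaceAll(escaped, "-", "%2D") + ".md"
}

// legacyFilename (hypothetical): the old behaviour, spaces become dashes.
func legacyFilename(title string) string {
	return url.QueryEscape(strings.ReplaceAll(title, " ", "-")) + ".md"
}

// resolveFilename prefers the strict name and falls back to the
// legacy name when only that file exists.
func resolveFilename(title string, exists func(string) bool) (string, bool) {
	if name := strictFilename(title); exists(name) {
		return name, true
	}
	if name := legacyFilename(title); exists(name) {
		return name, true
	}
	return "", false
}

func main() {
	// Stand-in for the files present in a wiki repository.
	repo := map[string]bool{
		"Page-with-dash-here.md": true, // created before the change
		"well%2Dknown%20page.md": true, // created after the change
	}
	exists := func(name string) bool { return repo[name] }

	for _, title := range []string{"Page with dash here", "well-known page", "missing"} {
		name, ok := resolveFilename(title, exists)
		fmt.Printf("%q -> %q (found=%v)\n", title, name, ok)
	}
}
```

Existing links keep working through the legacy lookup, while pages created after the change are stored unambiguously.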


@mrsdizzie commented on GitHub (Jul 26, 2019):

Perhaps -- probably something to think about if redesigning this. I'll note that Gitea appears to have copied this exact behavior from GitHub, which does the same thing with dashes in titles/filenames, so you can't create a page with dashes in the title there either.


@icetiger1974 commented on GitHub (May 19, 2021):

Hello,
has this issue been solved,
or is it still in the same state?


@unchaynd commented on GitHub (Dec 17, 2022):

You can't put a dash in a Wiki page title?... Seriously??


@zeripath commented on GitHub (Dec 18, 2022):

> You can't put a dash in a Wiki page title?... Seriously??

See https://github.com/go-gitea/gitea/issues/7570#issuecomment-515313767

Can you put a dash in a wiki page title on GitHub or GitLab? If you can, we should make that work, but otherwise we won't be compatible with gh.


@rrrutledge commented on GitHub (Jan 12, 2023):

😢


Reference: github-starred/gitea#3647