Wiki filenames for non-Latin characters are escaped / not UTF-8 #10908

Open
opened 2025-11-02 09:21:52 -06:00 by GiteaMirror · 4 comments

Originally created by @bserem on GitHub (May 24, 2023).

Description

When creating a wiki page with non-Latin characters in the title (Greek in my case), the resulting MD file name uses escaped characters. Please see the screenshot below.
It seems that wiki filenames are not using UTF-8?

Gitea Version

1.19.3

Can you reproduce the bug on the Gitea demo site?

Yes

Log Gist

No response

Screenshots

image (https://github.com/go-gitea/gitea/assets/520237/a6134cf0-e009-4792-b1b1-e38e940f05d9)

Git Version

2.38.5

Operating System

Linux gitea-gitea1 4.4.180+ #42962 SMP Sat Apr 8 00:14:24 CST 2023 x86_64 Linux

How are you running Gitea?

Official Docker image from: https://hub.docker.com/r/gitea/gitea

Database

SQLite

GiteaMirror added the type/enhancement and topic/wiki labels 2025-11-02 09:21:52 -06:00

@wxiaoguang commented on GitHub (May 24, 2023):

That's a legacy problem. Historically, the wiki filename system was not well designed. The current rule is: most characters are encoded in URL parameter format.
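
To illustrate the rule, here is a minimal Go sketch of what "URL parameter format" does to a Greek title. The helper names `titleToFilename` and `filenameToTitle` are hypothetical and this is not Gitea's actual code; the sketch only assumes that the title is passed through `url.QueryEscape` and given an `.md` suffix.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// titleToFilename is a hypothetical helper (not Gitea's real function).
// Assumption: the title is escaped like a URL query parameter and ".md" is appended.
func titleToFilename(title string) string {
	return url.QueryEscape(title) + ".md"
}

// filenameToTitle reverses titleToFilename under the same assumption.
func filenameToTitle(filename string) (string, error) {
	return url.QueryUnescape(strings.TrimSuffix(filename, ".md"))
}

func main() {
	name := titleToFilename("αβγ")
	fmt.Println(name) // %CE%B1%CE%B2%CE%B3.md

	title, err := filenameToTitle(name)
	if err != nil {
		panic(err)
	}
	fmt.Println(title) // αβγ
}
```

Each Greek letter here takes two bytes in UTF-8 and every byte becomes one `%XX` triplet, which would explain filenames that look escaped even though the underlying bytes are UTF-8.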


@bserem commented on GitHub (May 24, 2023):

@wxiaoguang thanks for the response.

Is it safe to say the issue lies in this function?
https://github.com/go-gitea/gitea/blob/main/services/wiki/wiki.go#L49

	foundEscaped := false
	for _, filename := range filesInIndex {
		switch filename {
		case unescaped:
			// if we find the unescaped file return it
			return true, unescaped, nil
		case gitPath:
			foundEscaped = true
		}
	}

As I am not familiar with Gitea decisions, I'll ask: is there a reason to keep this as is? Will it break things if it gets updated?
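
For readers who want to experiment with the quoted loop, a self-contained sketch of the same lookup pattern might look like the following. `findWikiFile` and its parameters are hypothetical names, and the fallback after the loop is an assumption, since the quoted fragment stops before that point.

```go
package main

import "fmt"

// findWikiFile is a hypothetical, simplified stand-in for the quoted loop.
// It prefers a file stored under the unescaped name and otherwise reports
// whether a file with the escaped name exists.
func findWikiFile(filesInIndex []string, unescaped, escaped string) (bool, string) {
	foundEscaped := false
	for _, filename := range filesInIndex {
		switch filename {
		case unescaped:
			// An unescaped file wins as soon as it is seen.
			return true, unescaped
		case escaped:
			foundEscaped = true
		}
	}
	// Assumption: if only the escaped name was seen, that name is returned.
	return foundEscaped, escaped
}

func main() {
	index := []string{"Home.md", "%CE%B1.md"}
	found, name := findWikiFile(index, "α.md", "%CE%B1.md")
	fmt.Println(found, name) // true %CE%B1.md
}
```

Note that this loop only decides which existing file to open; it does not choose the encoding used when a new page is written.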


@wxiaoguang commented on GitHub (May 24, 2023):

The real problem is more complicated than that.

If you have a chance to work on 1.20, you can take a look at my PR for the details (especially wiki_path.go):

Make wiki title supports dashes and improve wiki name related features #24143

(The purpose of that PR is just to clarify the details; it doesn't change the legacy behavior much, because there was already a lot of technical debt.)


@silverwind commented on GitHub (May 24, 2023):

Note that we cannot dump the given title as-is into the file system because not all strings are valid filenames, for example names containing / or :. URL-encoding them is the safe option. See here for regexes that match invalid filenames:

https://github.com/sindresorhus/filename-reserved-regex/blob/main/index.js
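
As a rough illustration of why URL-encoding is the safe option, the Go sketch below checks a few titles against a set of characters that are commonly rejected in filenames, before and after escaping. The character set and the helper names `hasReservedChar` and `makeSafe` are assumptions for this example, not taken from Gitea or copied from the linked regex.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// reservedChars is an illustrative assumption: characters that are commonly
// problematic in filenames on Unix-like systems or Windows.
const reservedChars = `<>:"/\|?*`

// hasReservedChar reports whether name contains any of reservedChars.
func hasReservedChar(name string) bool {
	return strings.ContainsAny(name, reservedChars)
}

// makeSafe is a hypothetical helper that escapes a title like a URL query parameter.
func makeSafe(title string) string {
	return url.QueryEscape(title)
}

func main() {
	for _, title := range []string{"a/b", "10:30 notes", "αβγ"} {
		safe := makeSafe(title)
		fmt.Printf("%q reserved=%v -> %q reserved=%v\n",
			title, hasReservedChar(title), safe, hasReservedChar(safe))
	}
}
```

With this approach `a/b` becomes `a%2Fb`, so the slash can no longer be mistaken for a path separator. The trade-off is that non-ASCII letters are escaped as well, which is the behavior reported in this issue.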

Also, on Windows, IIRC, there may be different issues because NTFS uses UTF-16 in filenames, but the UI sends UTF-8.
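
To make the UTF-8 versus UTF-16 point concrete, here is a small Go sketch that compares the size of the same title in both encodings. `utf16Units` is a hypothetical helper; the sketch only shows that the two encodings represent the same text differently and does not reproduce any Windows-specific behavior.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// utf16Units returns how many 16-bit code units s needs when encoded as UTF-16.
func utf16Units(s string) int {
	return len(utf16.Encode([]rune(s)))
}

func main() {
	title := "αβγ"
	fmt.Println(len(title))        // 6 (UTF-8 bytes)
	fmt.Println(utf16Units(title)) // 3 (UTF-16 code units)
}
```

Any length limit or byte-level comparison on a filename can therefore give different results depending on which encoding the layer in question uses.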

Reference: github-starred/gitea#10908