4 byte unicode returns a 500 w/mysql #1535

Closed
opened 2025-11-02 04:04:11 -06:00 by GiteaMirror · 2 comments

Originally created by @philfry on GitHub (Feb 14, 2018).

  • Gitea version (or commit ref): 1.4.0rc1
  • Git version: 2.16.1
  • Operating system: CentOS 7
  • Database (use [x]):
    • [ ] PostgreSQL
    • [x] MySQL
    • [ ] MSSQL
    • [ ] SQLite
  • Can you reproduce the bug at https://try.gitea.io:
    • [ ] Yes (provide example URL)
    • [x] No
    • [ ] Not relevant
  • Log gist:

Description

When using 4-byte Unicode characters, such as emoji, in an issue (subject, content, ...), Gitea returns a 500. This is because Gitea created the database as utf8 and also requests the utf8 character set when connecting to it. MySQL's utf8 stores at most three bytes per character, so 4-byte characters cannot be stored.
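To check whether a given title or comment would trigger the problem, here is a minimal Go sketch that looks for code points needing four bytes in UTF-8. The helper name hasFourByteRune is hypothetical and not part of Gitea; it only uses the standard unicode/utf8 package.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// hasFourByteRune reports whether s contains a code point that needs
// four bytes in UTF-8 (anything above U+FFFF, such as most emoji).
// Hypothetical helper for illustration; it is not part of Gitea.
func hasFourByteRune(s string) bool {
	for _, r := range s {
		if utf8.RuneLen(r) == 4 {
			return true
		}
	}
	return false
}

func main() {
	for _, title := range []string{"plain ASCII", "naïve café", "ship it 🚀"} {
		fmt.Printf("%q needs utf8mb4: %v\n", title, hasFourByteRune(title))
	}
}
```

Only the last sample title contains a 4-byte character; the accented letters in the second one fit in two bytes each and are fine with MySQL's utf8.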
This issue can be solved by

  1. altering all tables/columns from utf8 to utf8mb4
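# quick and dirty: rewrite the dump instead of running ALTER TABLE on every table
# (note: \b does not match inside names such as utf8_general_ci, and a standalone "utf8" in the data is rewritten too)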
mysqldump gitea | sed 's/\butf8\b/utf8mb4/g' > gitea_utf8mb4.sql
mysql gitea < gitea_utf8mb4.sql

and

  2. patching Gitea to use utf8mb4 when connecting to the database:
diff --git i/models/models.go w/models/models.go
index 7738e1a3..9693b88c 100644
--- i/models/models.go
+++ w/models/models.go
@@ -205,10 +205,10 @@ func getEngine() (*xorm.Engine, error) {
        switch DbCfg.Type {
        case "mysql":
                if DbCfg.Host[0] == '/' { // looks like a unix socket
-                       connStr = fmt.Sprintf("%s:%s@unix(%s)/%s%scharset=utf8&parseTime=true",
+                       connStr = fmt.Sprintf("%s:%s@unix(%s)/%s%scharset=utf8mb4&parseTime=true",
                                DbCfg.User, DbCfg.Passwd, DbCfg.Host, DbCfg.Name, Param)
                } else {
-                       connStr = fmt.Sprintf("%s:%s@tcp(%s)/%s%scharset=utf8&parseTime=true",
+                       connStr = fmt.Sprintf("%s:%s@tcp(%s)/%s%scharset=utf8mb4&parseTime=true",
                                DbCfg.User, DbCfg.Passwd, DbCfg.Host, DbCfg.Name, Param)
                }
        case "postgres":
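The same change can be sketched as a standalone function, which makes the resulting connection string easy to inspect. This is only an illustration under assumptions: buildMySQLConnStr and mysqlCharset are hypothetical names, the param argument stands in for whatever Param holds in models.go (assumed here to be "?"), and the credentials and socket path are made up.

```go
package main

import "fmt"

// mysqlCharset is the character set requested for the connection.
// Hypothetical constant; the patch above hard-codes the value instead.
const mysqlCharset = "utf8mb4"

// buildMySQLConnStr mirrors the two branches of the patch: a host that
// starts with '/' is treated as a unix socket, anything else as TCP.
// Hypothetical helper for illustration; it is not part of Gitea.
func buildMySQLConnStr(user, passwd, host, name, param string) string {
	network := "tcp"
	if len(host) > 0 && host[0] == '/' { // looks like a unix socket
		network = "unix"
	}
	return fmt.Sprintf("%s:%s@%s(%s)/%s%scharset=%s&parseTime=true",
		user, passwd, network, host, name, param, mysqlCharset)
}

func main() {
	fmt.Println(buildMySQLConnStr("gitea", "secret", "127.0.0.1:3306", "gitea", "?"))
	fmt.Println(buildMySQLConnStr("gitea", "secret", "/var/lib/mysql/mysql.sock", "gitea", "?"))
}
```

Unlike the original code, the sketch checks the host length before indexing, so an empty host falls through to the TCP branch instead of panicking.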

Unfortunately, I'm not familiar enough with the part of the Gitea source code that creates the database (I'm pretty sure it's related to the types/structs in models/*.go, though), so sorry for not providing a patch for that.

GiteaMirror added the type/enhancement label 2025-11-02 04:04:11 -06:00

@philfry commented on GitHub (Feb 14, 2018):

related to #2711


@philfry commented on GitHub (May 23, 2018):

see #3516

Reference: github-starred/gitea#1535