About DB connection random lost #3269

Closed
opened 2025-11-02 05:06:05 -06:00 by GiteaMirror · 19 comments
Owner

Originally created by @cevin on GitHub (Apr 30, 2019).

  • Gitea version (or commit ref): 1.8.0
  • Git version: 2.9.5
  • Operating system: CentOS 7
  • Database (use [x]):
    • [ ] PostgreSQL
    • [x] MySQL
    • [ ] MSSQL
    • [ ] SQLite
  • Can you reproduce the bug at https://try.gitea.io:
    • [ ] Yes (provide example URL)
    • [x] No
    • [ ] Not relevant
  • Log gist:
2019/04/30 12:59:41 [.../gogits/cron/cron.go:92 Run()] [E] MirrorUpdate: invalid connection
2019/04/30 13:01:36 [...user/notification.go:33 GetNotificationCount()] [E] GetNotificationCount: invalid connection
2019/04/30 13:02:12 [...s/context/context.go:238 func1()] [E] GetUserById: invalid connection
2019/04/30 13:05:34 [...s/context/context.go:238 func1()] [E] GetUserById: invalid connection

Description

Similar to #5736.
Aliyun Cloud ( www.aliyun.com ) & Aliyun RDS MySQL ( www.aliyun.com/product/rds/mysql )
As far as I know, Aliyun RDS MySQL recycles idle connections very quickly, so....

GiteaMirror added the type/bug label 2025-11-02 05:06:05 -06:00

@bobemoe commented on GitHub (May 13, 2019):

I'm getting the same kinds of errors:

==> /home/gitea/log/gitea.log <==
2019/05/13 19:41:15 [...s/context/context.go:238 func1()] [E] GetUserById: invalid connection
2019/05/13 19:41:15 [...ules/context/repo.go:299 func1()] [E] GetUserByName: invalid connection

It seems to be fairly random, manifesting as 500/404 errors, sometimes on git push/pull and sometimes while navigating the web frontend. A retry/refresh of the same URL always seems to succeed straight away.

I'm using Gitea 1.8.1
mysql Ver 14.14 Distrib 5.6.43, for Linux (x86_64) using EditLine wrapper
PHP 7.3.5-1+0~20190503093827.38+stretch~1.gbp60a41b (cli) (built: May 3 2019 09:38:28) ( NTS

I'm not seeing anything in the MySQL log, but maybe I haven't enabled verbose enough logging.

Seeing similar reports in #5736 and #5612

@lunny commented on GitHub (May 14, 2019):

I think this is probably a go-sql-driver/mysql problem.

@bobemoe commented on GitHub (May 14, 2019):

How does Go load the driver? Is it all packaged into the binary? I didn't have to install any dependencies, so I'm guessing there isn't anything I can update at the OS level, apart from MySQL itself perhaps?

@bobemoe commented on GitHub (May 14, 2019):

I've upgraded MySQL to 5.7.26 and am still experiencing this.

The error in mysql.log is:
2019-05-14T16:13:22.312637Z 3045 [Note] Aborted connection 3045 to db: 'gitea' user: 'gitea' host: 'localhost' (Got timeout reading communication packets)

@lafriks commented on GitHub (May 14, 2019):

Could it be that xorm does not close connections in some cases (https://github.com/go-sql-driver/mysql/issues/556)?

@bobemoe commented on GitHub (May 14, 2019):

Yeah I reckon it could be.

I've been sitting tailing the logs for a bit, and it seems that the MySQL timeout comes FIRST. Then on the next page refresh I get the 404/500 error accompanied by the Go error "GetUserById: invalid connection", and this is 100% repeatable.

I'm now also watching the MySQL process list. Here's what appears to be happening:

  1. When I hit ANY frontend page, a Gitea connection appears in the MySQL processlist.
  2. This connection returns its data and then enters "sleep" mode.
  3. As time passes, the sleep count increases.
  4. If I hit another page within 60 secs, the connection is reused and the sleep count resets to 0.
  5. If the sleep count reaches 60, the connection times out and the MySQL timeout error is logged in mysql.log.
  6. On the next page hit, the Go error "GetUserById: invalid connection" appears in gitea.log and the frontend 404s.
  7. A new connection now appears; go to step 3.

So, I'm guessing Gitea is (efficiently) only opening one connection, and keeping it idle until it needs to use it again?

However, when that time comes and the connection is no longer available, this is not handled properly by reopening the connection and retrying the last query.

I can increase my MySQL timeout, but this will only make the errors less frequent. Unless Gitea closes the connection after a set time, in which case making the MySQL timeout greater than that should solve it?

Should Gitea be closing its connection after use or is it by design keeping it open?

@lafriks commented on GitHub (May 14, 2019):

I think the Go MySQL library keeps a pool of open connections to reuse, but it should reopen a new one if the current one has become invalid, IMHO.

@bobemoe commented on GitHub (May 14, 2019):

I've tried setting the MySQL timeout on the server to 120, 180, and 300, and each time the connection times out I still get the error when I hit a page. I've now set the timeout to 3000 and have been watching the logs and processes. It seems Gitea has an internal cron that makes a query every 10 minutes; this is essentially acting as a keep-alive heartbeat that keeps resetting the connection's sleep time.

2019/05/14 20:25:13 [I] [SQL] SELECT `id`, `repo_id`, `interval`, `enable_prune`, `updated_unix`, `next_update_unix` FROM `mirror` WHERE (next_update_unix<=?) AND (next_update_unix!=0) []interface {}{1557861913}

This looks like a maintenance task and not a real keep-alive implementation, so it's a bit of a crazy workaround, but I'm not getting the 404/500 errors any more :)

So how do I proceed? How can I see which version of go-sql-driver/mysql is in the binary? The issue go-sql-driver/mysql#556 was closed several months ago with no real solution other than "don't forget to db.Close()" or "disable strict mode".

@lunny commented on GitHub (May 15, 2019):

This has been packaged into the Gitea binary.

@cevin commented on GitHub (May 15, 2019):

The problem is very annoying.

@bobemoe commented on GitHub (May 15, 2019):

I've set my timeouts in /etc/mysql/my.cnf to just over 10 mins and can confirm that the issue has cleared up.

wait_timeout = 650         
interactive_timeout = 650

This is a shared server so not an ideal solution.

@lunny how can I check which version is packaged into the binary? Is it worth me trying Gitea@dev-master?

@cevin Are you able to try this on your cloud instance, or check the MySQL logs for that timeout error? It would help confirm whether we are describing the same issue. I guess you don't really have that level of access on a cloud instance?

@bobemoe commented on GitHub (May 15, 2019):

So it looks like Gitea 1.8.1 is using Go MySQL driver v1.4.
v1.4.1 is out now (2018-11-14) with a potential fix:
https://github.com/go-sql-driver/mysql/blob/v1.4.1/CHANGELOG.md
https://github.com/go-sql-driver/mysql/issues/836 (committed Oct 1, 2018)

The Gitea 1.8.1 branch looks like it's using driver v1.4.
The Gitea 1.9.0+dev-254-g710245e81 branch looks like it's using driver c45f530f8e7fe40f4687eaa50d0c8c5f1b66f9e0, which was committed Dec 18, 2018, so it should include that fix??

My testing on 1.9.0+dev-254-g710245e81 still has the issue :(

So either my calculations are wrong and the fix has not been included (can anyone confirm this?), or it has not worked.

Hmmm :/

@sapk commented on GitHub (May 15, 2019):

We overrode the previous tag 1.4.0 with a commit dated 2018-12-18 that should contain this fix 🤔 https://github.com/go-gitea/gitea/blame/710245e81e0d65c72231dbb3b5c9f860cdc71899/go.mod#L140

@cevin commented on GitHub (May 16, 2019):

@bobemoe Yep, but I can set wait_timeout & interactive_timeout in the RDS (Aliyun MySQL cloud database) online dashboard. But that doesn't look like a good idea?

@mrsdizzie commented on GitHub (May 17, 2019):

Perhaps relevant here:

https://github.com/gogs/gogs/commit/0a176df6fb4109793c7efddb493b1f2f4ba1b1b2

Which claims to fix similar problem and has more discussion/links here:

https://github.com/gogs/gogs/issues/5532

@lunny commented on GitHub (May 18, 2019):

gogs/gogs@0a176df should be good practice; we could take it.

@lafriks commented on GitHub (May 24, 2019):

That should probably be set only if the database is MySQL.

@lunny commented on GitHub (May 29, 2019):

closed by #7071

@bobemoe commented on GitHub (Jun 9, 2019):

Just to confirm: I've tried 1.8.2 and set my MySQL timeouts back to 60s. All seems to be functioning well.

Looking at the MySQL processlist, I can see that the behaviour has changed from my previous investigation: Gitea no longer keeps a connection open. This solves the issue for me. Thanks everyone :)

Although I see no connection kept open, the MySQL connection IDs now jump up by about 10 each time I hit a page, which suggests Gitea is not reusing its MySQL connections at all and is opening a new one per query. Just wanted to make sure this won't be a performance hit on busier sites?


Reference: github-starred/gitea#3269