Long running queries cause freezing when using sqlite #12386

Open
opened 2025-11-02 10:08:09 -06:00 by GiteaMirror · 12 comments
Owner

Originally created by @PeterSurda on GitHub (Jan 25, 2024).

Description

When using the sqlite backend, a long running query will freeze both the web UI as well as the ssh server. During the freeze, there is nothing in the log, and one CPU core is pegged at 100%. After moving to MariaDB, the query (see below) will still take a very long time, and mariadbd will still fully utilise one core or more, but due to more fine grained locking there is no freezing anymore.

The freezing happened during startup (about 2 hours), and when the "Cron: Check all repository statistics" job was running (between 10 minutes and 2 hours). I was only able to obtain the problematic query (for the cron job) after moving to MariaDB. I don't know if in both cases it's the same query:

SELECT `label`.id FROM `label` WHERE `label`.num_closed_issues!=(SELECT COUNT(*) FROM `issue_label`,`issue` WHERE `issue_label`.label_id=`label`.id AND `issue_label`.issue_id=`issue`.id AND `issue`.is_closed=?)

My attempts to debug didn't provide much helpful results, as I said during the freeze the UI is unresponsive and I can't git pull (throws an error after 1 minute). The logs don't change, the files on the filesystem don't change other than the session files. There is negligible disk I/O. My server is way overpowered for my gitea installation, with 24 cores and more RAM than the whole gitea data volume, and a SSD RAID. I tried a strace, and all I could see was threads waiting for futexes. I was running a slightly older version of gitea, and I tried fixing it by upgrading to the latest docker image, but it only got worse (query took longer to run). Only the educated guess this could be a long running query combined with sqlite finally helped me.

The way I see it, there are two issues:

  • there is a long running query which perhaps could be optimised
  • if there is a long running query and you're using sqlite, gitea interaction freezes

It's not really a problem for me anymore since after moving to MariaDB there is no observable freeze anymore, but perhaps someone else will experience this, so it would be helpful to have some sort of detection or mitigation of this.

Gitea Version

1.21.4

Can you reproduce the bug on the Gitea demo site?

No

Log Gist

No response

Screenshots

No response

Git Version

gitea internal

Operating System

Ubuntu 22.04

How are you running Gitea?

docker compose, official image

Database

SQLite

Originally created by @PeterSurda on GitHub (Jan 25, 2024). ### Description When using the sqlite backend, a long running query will freeze both the web UI as well as the ssh server. During the freeze, there is nothing in the log, and one CPU core is pegged at 100%. After moving to MariaDB, the query (see below) will still take a very long time, and `mariadbd` will still fully utilise one core or more, but due to more fine grained locking there is no freezing anymore. The freezing happened during startup (about 2 hours), and when the "Cron: Check all repository statistics" job was running (between 10 minutes and 2 hours). I was only able to obtain the problematic query (for the cron job) after moving to MariaDB. I don't know if in both cases it's the same query: ``` SELECT `label`.id FROM `label` WHERE `label`.num_closed_issues!=(SELECT COUNT(*) FROM `issue_label`,`issue` WHERE `issue_label`.label_id=`label`.id AND `issue_label`.issue_id=`issue`.id AND `issue`.is_closed=?) ``` My attempts to debug didn't provide much helpful results, as I said during the freeze the UI is unresponsive and I can't git pull (throws an error after 1 minute). The logs don't change, the files on the filesystem don't change other than the session files. There is negligible disk I/O. My server is way overpowered for my gitea installation, with 24 cores and more RAM than the whole gitea data volume, and a SSD RAID. I tried a `strace`, and all I could see was threads waiting for futexes. I was running a slightly older version of gitea, and I tried fixing it by upgrading to the latest docker image, but it only got worse (query took longer to run). Only the educated guess this could be a long running query combined with sqlite finally helped me. The way I see it, there are two issues: - there is a long running query which perhaps could be optimised - if there is a long running query and you're using sqlite, gitea interaction freezes It's not really a problem for me anymore since after moving to MariaDB there is no observable freeze anymore, but perhaps someone else will experience this, so it would be helpful to have some sort of detection or mitigation of this. ### Gitea Version 1.21.4 ### Can you reproduce the bug on the Gitea demo site? No ### Log Gist _No response_ ### Screenshots _No response_ ### Git Version gitea internal ### Operating System Ubuntu 22.04 ### How are you running Gitea? docker compose, official image ### Database SQLite
GiteaMirror added the performance/speedtype/bug labels 2025-11-02 10:08:09 -06:00
Author
Owner

@KN4CK3R commented on GitHub (Jan 26, 2024):

Could you please tell how many rows your issue, issue_label and label have? As all involved fields are indexes, there should not be a problem.

@KN4CK3R commented on GitHub (Jan 26, 2024): Could you please tell how many rows your `issue`, `issue_label` and `label` have? As all involved fields are indexes, there should not be a problem.
Author
Owner

@PeterSurda commented on GitHub (Jan 27, 2024):

The answer is

  • issue: 65192
  • issue_label: 169506
  • label: 5006
@PeterSurda commented on GitHub (Jan 27, 2024): The answer is - issue: **65192** - issue_label: **169506** - label: **5006**
Author
Owner

@PeterSurda commented on GitHub (Feb 25, 2024):

Update: the symptoms occasionally appear when using MariaDB too but they are less frequent, whereas with sqlite they happened every day.

@PeterSurda commented on GitHub (Feb 25, 2024): Update: the symptoms occasionally appear when using MariaDB too but they are less frequent, whereas with sqlite they happened every day.
Author
Owner

@somera commented on GitHub (Aug 1, 2024):

I would check the output for

analyse <sql>

with an MariaDB SQL Client.

@somera commented on GitHub (Aug 1, 2024): I would check the output for `analyse <sql>` with an MariaDB SQL Client.
Author
Owner

@PeterSurda commented on GitHub (Aug 3, 2024):

Here's the output from ANALYZE:

MariaDB [gitea]> ANALYZE SELECT `label`.id FROM `label` WHERE `label`.num_closed_issues!=(SELECT COUNT(*) FROM `issue_label`,`issue` WHERE `issue_label`.label_id=`label`.id AND `issue_label`.issue_id=`issue`.id AND `issue`.is_closed=1);
+------+--------------------+-------------+--------+-----------------------------+---------------------+---------+-------------------------------+-------+----------+----------+------------+-------------+
| id   | select_type        | table       | type   | possible_keys               | key                 | key_len | ref                           | rows  | r_rows   | filtered | r_filtered | Extra       |
+------+--------------------+-------------+--------+-----------------------------+---------------------+---------+-------------------------------+-------+----------+----------+------------+-------------+
|    1 | PRIMARY            | label       | ALL    | NULL                        | NULL                | NULL    | NULL                          | 4682  | 4642.00  |   100.00 |       0.00 | Using where |
|    2 | DEPENDENT SUBQUERY | issue       | ref    | PRIMARY,IDX_issue_is_closed | IDX_issue_is_closed | 2       | const                         | 25468 | 23110.00 |   100.00 |     100.00 | Using index |
|    2 | DEPENDENT SUBQUERY | issue_label | eq_ref | UQE_issue_label_s           | UQE_issue_label_s   | 18      | gitea.issue.id,gitea.label.id | 1     | 0.00     |   100.00 |     100.00 | Using index |
+------+--------------------+-------------+--------+-----------------------------+---------------------+---------+-------------------------------+-------+----------+----------+------------+-------------+
3 rows in set (3 min 45.244 sec)
@PeterSurda commented on GitHub (Aug 3, 2024): Here's the output from `ANALYZE`: ``` MariaDB [gitea]> ANALYZE SELECT `label`.id FROM `label` WHERE `label`.num_closed_issues!=(SELECT COUNT(*) FROM `issue_label`,`issue` WHERE `issue_label`.label_id=`label`.id AND `issue_label`.issue_id=`issue`.id AND `issue`.is_closed=1); +------+--------------------+-------------+--------+-----------------------------+---------------------+---------+-------------------------------+-------+----------+----------+------------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra | +------+--------------------+-------------+--------+-----------------------------+---------------------+---------+-------------------------------+-------+----------+----------+------------+-------------+ | 1 | PRIMARY | label | ALL | NULL | NULL | NULL | NULL | 4682 | 4642.00 | 100.00 | 0.00 | Using where | | 2 | DEPENDENT SUBQUERY | issue | ref | PRIMARY,IDX_issue_is_closed | IDX_issue_is_closed | 2 | const | 25468 | 23110.00 | 100.00 | 100.00 | Using index | | 2 | DEPENDENT SUBQUERY | issue_label | eq_ref | UQE_issue_label_s | UQE_issue_label_s | 18 | gitea.issue.id,gitea.label.id | 1 | 0.00 | 100.00 | 100.00 | Using index | +------+--------------------+-------------+--------+-----------------------------+---------------------+---------+-------------------------------+-------+----------+----------+------------+-------------+ 3 rows in set (3 min 45.244 sec) ```
Author
Owner

@somera commented on GitHub (Aug 3, 2024):

Hm ... 1xusing where and 2x using index.

Get you the same result with

SELECT `label`.id FROM `label` WHERE `label`.num_closed_issues!=(SELECT COUNT(`issue_label`.label_id) FROM `issue_label`,`issue` WHERE `issue_label`.label_id=`label`.id AND `issue_label`.issue_id=`issue`.id AND `issue`.is_closed=1);

and how fast ist this query?

@somera commented on GitHub (Aug 3, 2024): Hm ... 1xusing where and 2x using index. Get you the same result with ```SELECT `label`.id FROM `label` WHERE `label`.num_closed_issues!=(SELECT COUNT(`issue_label`.label_id) FROM `issue_label`,`issue` WHERE `issue_label`.label_id=`label`.id AND `issue_label`.issue_id=`issue`.id AND `issue`.is_closed=1);``` and how fast ist this query?
Author
Owner

@PeterSurda commented on GitHub (Aug 3, 2024):

No change in duration:

MariaDB [gitea]> ANALYZE SELECT `label`.id FROM `label` WHERE `label`.num_closed_issues!=(SELECT COUNT(`issue_label`.label_id) FROM `issue_label`,`issue` WHERE
`issue_label`.label_id=`label`.id AND `issue_label`.issue_id=`issue`.id AND `issue`.is_closed=1);
+------+--------------------+-------------+--------+-----------------------------+---------------------+---------+-------------------------------+-------+----------+----------+------------+-------------+
| id   | select_type        | table       | type   | possible_keys               | key                 | key_len | ref                           | rows  | r_rows   | filtered | r_filtered | Extra       |
+------+--------------------+-------------+--------+-----------------------------+---------------------+---------+-------------------------------+-------+----------+----------+------------+-------------+
|    1 | PRIMARY            | label       | ALL    | NULL                        | NULL                | NULL    | NULL                          | 4682  | 4642.00  |   100.00 |       0.00 | Using where |
|    2 | DEPENDENT SUBQUERY | issue       | ref    | PRIMARY,IDX_issue_is_closed | IDX_issue_is_closed | 2       | const                         | 25468 | 23110.00 |   100.00 |     100.00 | Using index |
|    2 | DEPENDENT SUBQUERY | issue_label | eq_ref | UQE_issue_label_s           | UQE_issue_label_s   | 18      | gitea.issue.id,gitea.label.id | 1     | 0.00     |   100.00 |     100.00 | Using index |
+------+--------------------+-------------+--------+-----------------------------+---------------------+---------+-------------------------------+-------+----------+----------+------------+-------------+
3 rows in set (3 min 47.192 sec)

I also tried to create an index on label.num_closed_issues and the duration is still the same.

@PeterSurda commented on GitHub (Aug 3, 2024): No change in duration: ``` MariaDB [gitea]> ANALYZE SELECT `label`.id FROM `label` WHERE `label`.num_closed_issues!=(SELECT COUNT(`issue_label`.label_id) FROM `issue_label`,`issue` WHERE `issue_label`.label_id=`label`.id AND `issue_label`.issue_id=`issue`.id AND `issue`.is_closed=1); +------+--------------------+-------------+--------+-----------------------------+---------------------+---------+-------------------------------+-------+----------+----------+------------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra | +------+--------------------+-------------+--------+-----------------------------+---------------------+---------+-------------------------------+-------+----------+----------+------------+-------------+ | 1 | PRIMARY | label | ALL | NULL | NULL | NULL | NULL | 4682 | 4642.00 | 100.00 | 0.00 | Using where | | 2 | DEPENDENT SUBQUERY | issue | ref | PRIMARY,IDX_issue_is_closed | IDX_issue_is_closed | 2 | const | 25468 | 23110.00 | 100.00 | 100.00 | Using index | | 2 | DEPENDENT SUBQUERY | issue_label | eq_ref | UQE_issue_label_s | UQE_issue_label_s | 18 | gitea.issue.id,gitea.label.id | 1 | 0.00 | 100.00 | 100.00 | Using index | +------+--------------------+-------------+--------+-----------------------------+---------------------+---------+-------------------------------+-------+----------+----------+------------+-------------+ 3 rows in set (3 min 47.192 sec) ``` I also tried to create an index on label.num_closed_issues and the duration is still the same.
Author
Owner

@somera commented on GitHub (Aug 3, 2024):

I'm using PostgreSQL with Gitea and I don't have issues in my setup. I can't check this statement on my setup.

The question is, how is you MariaDB setup? Perhaps is there to less memory?

https://www.cloudways.com/blog/mariadb-performance-tuning/#Optimizing-MariaDB

What happens when the query is running? CPU usage? I/O on SSD? Memory usage?

@somera commented on GitHub (Aug 3, 2024): I'm using PostgreSQL with Gitea and I don't have issues in my setup. I can't check this statement on my setup. The question is, how is you MariaDB setup? Perhaps is there to less memory? https://www.cloudways.com/blog/mariadb-performance-tuning/#Optimizing-MariaDB What happens when the query is running? CPU usage? I/O on SSD? Memory usage?
Author
Owner

@PeterSurda commented on GitHub (Aug 3, 2024):

I'm using the MariaDB container from docker hub, the only custom parameter I added is --thread_stack=1048576 (I forgot why exactly I had to add this). The server has 256GB RAM, only 10GB is used, the rest the OS uses as buffers/cache. The whole MariaDB container volume is only 1.5GB. The issue, label and issue_label database files together occupy less than 100MB. During the query, one CPU core is fully loaded. There is negligible I/O on the mariadbd process, whether the query is running or not, maxing out at around 200kB/s, and even this negligible I/O is only write I/O.

@PeterSurda commented on GitHub (Aug 3, 2024): I'm using the MariaDB container from docker hub, the only custom parameter I added is `--thread_stack=1048576` (I forgot why exactly I had to add this). The server has 256GB RAM, only 10GB is used, the rest the OS uses as buffers/cache. The whole MariaDB container volume is only 1.5GB. The `issue`, `label` and `issue_label` database files together occupy less than 100MB. During the query, one CPU core is fully loaded. There is negligible I/O on the `mariadbd` process, whether the query is running or not, maxing out at around 200kB/s, and even this negligible I/O is only write I/O.
Author
Owner

@somera commented on GitHub (Aug 3, 2024):

Perhaps ... https://releem.com/docs/mysql-performance-tuning/thread_stack

@somera commented on GitHub (Aug 3, 2024): Perhaps ... https://releem.com/docs/mysql-performance-tuning/thread_stack
Author
Owner

@PeterSurda commented on GitHub (Aug 3, 2024):

I increased thread_stack to 268435456, which is almost 1000 times the default value. There is no change in the query time.

@PeterSurda commented on GitHub (Aug 3, 2024): I increased `thread_stack` to 268435456, which is almost 1000 times the default value. There is no change in the query time.
Author
Owner

@PeterSurda commented on GitHub (Aug 4, 2024):

If I rewrite the query using HAVING, it works ok:

MariaDB [gitea]> SELECT `label`.id, `label`.num_closed_issues
                              FROM `label`,`issue_label`,`issue`
                              WHERE `issue_label`.label_id=`label`.id
                                  AND `issue_label`.issue_id=`issue`.id 
                                  AND `issue`.is_closed=1
                              GROUP BY label.id
                              HAVING `label`.num_closed_issues != COUNT(*);
Empty set (0.063 sec)
@PeterSurda commented on GitHub (Aug 4, 2024): If I rewrite the query using `HAVING`, it works ok: ``` MariaDB [gitea]> SELECT `label`.id, `label`.num_closed_issues FROM `label`,`issue_label`,`issue` WHERE `issue_label`.label_id=`label`.id AND `issue_label`.issue_id=`issue`.id AND `issue`.is_closed=1 GROUP BY label.id HAVING `label`.num_closed_issues != COUNT(*); Empty set (0.063 sec) ```
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/gitea#12386