cron.update_mirrors broken in Gitea 1.16.0 #8497

Closed
opened 2025-11-02 08:08:42 -06:00 by GiteaMirror · 48 comments
Owner

Originally created by @somera on GitHub (Feb 4, 2022).

Gitea Version

1.16.0

Git Version

2.25.1

Operating System

Ubuntu 20.04.3

How are you running Gitea?

Precompiled gitea-1.16.0-linux-amd64

Database

PostgreSQL

Can you reproduce the bug on the Gitea demo site?

No

Log Gist

No response

Description

I'm hosting a lot of mirrors on my Gitea instance. Some days ago I updated to 1.16.0.

image

I see -> it started at 04:00 AM, but ...

Last run with Gitea 1.15.x:

image

With Gitea 1.15.x, cron.update_mirrors needed ~3h to update all mirrors. The cron runs on Fridays at 04:00 AM, and today was the first run with 1.16.0.

In Gitea I saw that the cron had run -> the run count was 1. But the monitoring data shows no activity (RAM usage, CPU usage, disk usage) between 04:00 and 07:00 AM.

Then I started the cron 4-6 times from the Gitea admin web console. Each run lasted a few seconds, and I saw some activity.

But there is no date for the last run of the cron:

image

After the 4-6 runs, sorting by last update time doesn't show any changes:

image

All the projects with a Feb 04 date were created today. But I can't see the updated mirrors when I change the sort order.

Why is cron.update_mirrors not updating all my mirrors? I can't see any errors in the Gitea logs.

Screenshots

No response


@somera commented on GitHub (Feb 4, 2022):

@zeripath should I now set the new parameters from #16982?

Because I have now added

PULL_LIMIT = 100

and it runs longer.

After an update (1.15.x -> 1.16.0) I would rather expect the old behavior.


@somera commented on GitHub (Feb 4, 2022):

I increased it to

PULL_LIMIT = 2000

and cron.update_mirrors runs for ~8-10 minutes. The second time it took ~12 minutes. For me that means: 2 runs -> 2 x 2000 = 4000 updated mirrors.

But ~20 minutes for 4000 mirrors is too fast for me.

Questions:

  • Am I right or wrong?
  • Are there some optimizations? If yes -> which?

Sometimes I see what Gitea is doing

image

and sometimes not

image

The git call is just missing for a lot of projects.

And it would be good to know that this cron ran (start date, finish date) and how many repos were updated.

image

You show some information for some crons, but not for the mirror cron.

I just have to know: is the process working fine or not? Can I lose data?


@somera commented on GitHub (Feb 4, 2022):

I ran the cron 4 times manually: once with PULL_LIMIT = 100 and 3 times with PULL_LIMIT = 2000.

And in the DB I see

image

I expected ~6100 repos with updated_unix = 2022-02-04. But there are only 1675.


@somera commented on GitHub (Feb 4, 2022):

Now I'm sure. The new process is broken.

image

6 runs: 1 time with PULL_LIMIT = 100 and 5 times with PULL_LIMIT = 2000

But

image

And if I start the cron again -> nothing happens.

And in the mirror table (next_update_unix column) I can see how many mirrors are due for an update:

image


@somera commented on GitHub (Feb 5, 2022):

Since yesterday I have clicked "Now sync" several times -> nothing happens

image


@somera commented on GitHub (Feb 5, 2022):

Today I ran the cron manually. Gitea ran some mirror updates, but the mirror.updated_unix column shows ZERO changes after the run.

image

Now I run the curl call

curl -X 'POST' 'http://nuc-mini-celeron:3000/api/v1/repos/gaphor/in-app-notification-demo/mirror-sync' -H 'accept: application/json' -H 'Authorization: token xxxxx' -d ''

for every mirror again, and mirror.updated_unix is changing:

image
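
The per-mirror call above can be scripted. This is only a minimal Python sketch: it assumes the caller supplies the list of (owner, repo) pairs, and `BASE_URL`, `TOKEN`, `build_sync_request` and `sync_all` are illustrative names, not part of Gitea. Only the endpoint path and the two headers come from the curl command.

```python
import urllib.request

# Hypothetical values; replace with your own instance URL and API token.
BASE_URL = "http://nuc-mini-celeron:3000"
TOKEN = "xxxxx"


def build_sync_request(base_url, owner, repo, token):
    """Build the POST request for /api/v1/repos/{owner}/{repo}/mirror-sync."""
    url = f"{base_url}/api/v1/repos/{owner}/{repo}/mirror-sync"
    return urllib.request.Request(
        url,
        data=b"",  # empty body, like `-d ''` in the curl call
        method="POST",
        headers={
            "accept": "application/json",
            "Authorization": f"token {token}",
        },
    )


def sync_all(repos, send=urllib.request.urlopen):
    """Request a sync for every (owner, repo) pair and return the failures."""
    failed = []
    for owner, repo in repos:
        request = build_sync_request(BASE_URL, owner, repo, TOKEN)
        try:
            with send(request, timeout=30):
                pass  # the response body is not needed here
        except OSError as exc:  # urllib.error.URLError is a subclass of OSError
            failed.append((owner, repo, str(exc)))
    return failed
```

Calling `sync_all([("gaphor", "in-app-notification-demo")])` would send the same request as the curl command; the returned list holds the mirrors whose request failed, so they can be retried.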

But the whole mirror update process looks broken.


@somera commented on GitHub (Feb 5, 2022):

After spending some hours with Gitea 1.16.0 I'm disappointed. Disappointed because mirror sync in Gitea is not working like it did in 1.15.x. Yes, there were changes. But the changes shouldn't alter the existing behavior.

I sent ~40,000 curl calls (4x for all mirrors) to get this mirror_update status:

image

1785 mirrors are not up to date.

This is my current config:

[repository]
ROOT = /var/lib/gitea/repositories
; Mirror sync queue length, increase if mirror syncing starts hanging
; DEPRECATED!!!
MIRROR_QUEUE_LENGTH = 20000

[mirror]
DEFAULT_INTERVAL = 8h
LENGTH = 20000

; Update mirrors
[cron.update_mirrors]
SCHEDULE = 0 0 4 * * 5 ; -> on 04.02.2022 04:00:00 it runs only for 8-10 minutes
PULL_LIMIT = -1
PUSH_LIMIT = -1

If I now start the update_mirrors cron manually, the oldest mirrors are not updated. Nothing happens. Why?

If I open the repo with the oldest sync date

image

and click on sync, it doesn't sync. Why not? How long do I have to wait?


@somera commented on GitHub (Feb 6, 2022):

Same status after the very short cron run today at 04:00AM

image


@lunny commented on GitHub (Feb 6, 2022):

Are there any logs?


@somera commented on GitHub (Feb 6, 2022):

> Are there any logs?

Hm ... if I switch to DEBUG I see a lot of SQL statements.

Or this: https://github.com/go-gitea/gitea/issues/16982#issuecomment-1030449782

I think the "problem" is the new feature implemented in #16982. Or you made other big changes.


@somera commented on GitHub (Feb 6, 2022):

I added

[queue.mirror]
LENGTH = 20000

and then ran

./gitea manager flush-queues

and restarted Gitea. After running the mirror cron manually I see no changes.

image

And I bet that if I made my ~9000 curl calls, some of the mirrors would be synced.

The 1 repo with mirror.updated_unix = 2022-02-06 is a new repo which I added today.


@zeripath commented on GitHub (Feb 6, 2022):

I've held off replying to this issue because I'm struggling to completely understand what is the problem and what is actually happening.

This is despite you having posted 11 comments...

So can we please have a succinct three line description as to what is happening, what you want to happen and what your configuration is? (that includes cron configuration.)


I will also remind you that you SPECIFICALLY asked for LIMIT_SIZE - I checked with you as to how it was supposed to work and I wrote a very long explanation as to what it was doing.

There have been many changes to mirroring; this is not the only thing.


@somera commented on GitHub (Feb 6, 2022):

> I've held off replying to this issue because I'm struggling to completely understand what is the problem and what is actually happening.
>
> This is despite you having posted 11 comments...

Sorry ... I was testing and posted every attempt. And there are some problems now.

I started the cron two times after the Gitea restart, but no last start date is set:

image

> So can we please have a succinct three line description as to what is happening, what you want to happen and what your configuration is? (that includes cron configuration.)
>
> I will also remind you that you SPECIFICALLY asked for LIMIT_SIZE - I checked with you as to how it was supposed to work and I wrote a very long explanation as to what it was doing.

Yes, this was a wish. But this implementation should not change the "normal process". It should only be a small optimization for people with a lot of mirrors on their Gitea instance.

> There have been many changes to mirroring; this is not the only thing.

Short version of my problem:

I have 9000+ mirrors. The mirror sync cron ran once a week and needed 3h to update all mirrors. After the update to Gitea 1.16.0 it's not working anymore. The cron runs for only 8-10 seconds.

To update ~90% of the repos, yesterday I sent 9000+ curl calls 4 times:

curl -X 'POST' 'http://nuc-mini-celeron:3000/api/v1/repos/gaphor/in-app-notification-demo/mirror-sync' -H 'accept: application/json' -H 'Authorization: token xxxxx' -d ''

In this case the sync worked.

And now, after flush-queues (because the queue could be corrupt), if I start the mirror cron manually, not a single mirror is updated. I tried this 3 times.

My config now:

[repository]
ROOT = /var/lib/gitea/repositories
; Mirror sync queue length, increase if mirror syncing starts hanging
; DEPRECATED
MIRROR_QUEUE_LENGTH = 20000

[mirror]
DEFAULT_INTERVAL = 8h
LENGTH = 20000

[queue.mirror]
LENGTH = 20000

; Update mirrors
[cron.update_mirrors]
; Every day at 4AM
SCHEDULE = 0 0 4 * * *
PULL_LIMIT = -1
PUSH_LIMIT = -1

@somera commented on GitHub (Feb 6, 2022):

@zeripath when I change the loglevel to

[log]
LEVEL            = Info

then I see 9000+ SELECT statements (one for every mirror) in the log

2022/02/06 20:24:32 models/repo/repo.go:635:getRepositoryByID() [I] [SQL] SELECT "id", "owner_id", "owner_name", "lower_name", "name", "description", "website", "original_service_type", "original_url", "default_branch", "num_watches", "num_stars", "num_forks", "num_issues", "num_closed_issues", "num_pulls", "num_closed_pulls", "num_milestones", "num_closed_milestones", "num_projects", "num_closed_projects", "is_private", "is_empty", "is_archived", "is_mirror", "status", "is_fork", "fork_id", "is_template", "template_id", "size", "is_fsck_enabled", "close_issues_via_commit_in_any_branch", "topics", "trust_model", "avatar", "created_unix", "updated_unix" FROM "repository" WHERE "id"=$1 LIMIT 1 [4445] - 848.612µs

and then this

2022/02/06 20:24:32 ...s/repo/pushmirror.go:111:PushMirrorsIterate() [I] [SQL] SELECT "id", "repo_id", "remote_name", "interval", "created_unix", "last_update", "last_error" FROM "push_mirror" WHERE (last_update + ("interval" / $1) <= $2) AND ("interval" != 0) ORDER BY last_update ASC [1s 1644175472] - 1.891748ms

SELECT. And not a single mirror is updated.

These are the 8-10 seconds that cron.update_mirrors needs to run and finish.
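
A rough check of where those seconds go, as a back-of-the-envelope sketch only. It assumes that each of the 9000+ repository lookups takes about as long as the single SELECT logged above (848.612µs) and uses 9000 as the mirror count; the variable names are illustrative.

```python
MIRRORS = 9000                 # "9000+" mirrors, rounded down
SELECT_SECONDS = 848.612e-6    # duration of the one SELECT shown in the log
OLD_RUN_SECONDS = 3 * 3600     # ~3h per full run with Gitea 1.15.x

# Time spent if the cron only issues one lookup per mirror.
lookup_total_s = MIRRORS * SELECT_SECONDS

# Average time per mirror during the old 3h runs.
old_per_mirror_s = OLD_RUN_SECONDS / MIRRORS

print(f"lookups only: {lookup_total_s:.1f} s")
print(f"1.15.x run:   {old_per_mirror_s:.1f} s per mirror")
```

Under these assumptions the lookups alone add up to about 7.6 s, close to the observed 8-10 seconds, while the old runs spent about 1.2 s per mirror. That fits a cron run that only looks the repositories up and does not wait for any git transfer itself.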


@zeripath commented on GitHub (Feb 6, 2022):

OK let's look at that configuration first.

You have way too many mirrors for a persistable-channel queue to ever be the correct queue for you.

Therefore change your update_checker queue to use a level queue and get rid of the channel queue related things.

[repository]
ROOT = /var/lib/gitea/repositories

[mirror]
DEFAULT_INTERVAL = 8h 

[queue.mirror]
TYPE=level; <- You have way too many mirrors for a channel or persistent-channel to be the right queue for you

; Update mirrors
[cron.update_mirrors]
; Every day at 4AM
SCHEDULE = 0 0 4 * * *
PULL_LIMIT = -1
PUSH_LIMIT = -1

Next I think we need to address what the update_mirrors cron task does.

update_mirrors looks through the list of mirrors which are due to be updated - (e.g. next_update_unix <= Now & next_update_unix != 0) and sorts them into a last_updated order. It will then iterate through them.

It will then queue PULL_LIMIT pull mirrors and PUSH_LIMIT push mirrors for update.

The updates will be done by the workers on the other end of the queue. This will depend on your general queue configuration but often this is a scaling worker pool up to a maximum of 10 workers.
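
The selection step described above can be sketched as follows. This is not Gitea's code, only a minimal Python illustration: the `Mirror` class and `select_due` function are hypothetical, the column names come from the mirror table, and treating a negative limit as "no limit" (and 0 as "queue nothing") is an assumption about how PULL_LIMIT is interpreted.

```python
from dataclasses import dataclass


@dataclass
class Mirror:
    repo_id: int
    updated_unix: int      # time of the last sync
    next_update_unix: int  # 0 means periodic updates are disabled


def select_due(mirrors, now, limit):
    """Return the mirrors that one cron run would hand to the queue.

    Assumption: a negative limit means "no limit", 0 queues nothing.
    """
    due = [
        m for m in mirrors
        if m.next_update_unix != 0 and m.next_update_unix <= now
    ]
    due.sort(key=lambda m: m.updated_unix)  # oldest sync first
    return due if limit < 0 else due[:limit]
```

With `limit=100`, a single run would queue at most the 100 due mirrors with the oldest `updated_unix`; the actual git work then happens later in the queue workers, not inside the cron run.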


So ... If you are finding that nothing is being queued ... It would be useful to check the values of next_update_unix in the mirror table.

The PULL_LIMIT and PUSH_LIMIT code is working perfectly as described and so your problem is elsewhere.


Now to explain why we changed the default PULL_LIMIT and PUSH_LIMIT.

The vast majority of installs will benefit from this change as most people do not:

  1. Change their cron configuration to only check things once per day (instead of changing the repository mirror update time.)
  2. Have 9000 mirrors.

Your situation represents an edgecase of edgecases.

But I personally have tried to provide you with ways to make your personal situation easier.

The change to use a normal queue for mirrors (#17326) gives you the option of using a level queue as your underlying queue, which prevents Gitea from seizing up due to a blocked queue. The PULL_LIMIT and PUSH_LIMIT options, which you requested, give you another option: consider changing your cron configuration back to /10 minutes (but you might actually need PULL_LIMIT/PUSH_LIMIT to be percentages of the total number of mirrors (not sure here.))

You're not running Gitea in a normal way and that means you will always need to carefully think about things. In your situation you need to tune things properly and that is what we have provided for you.


@somera commented on GitHub (Feb 6, 2022):

> OK let's look at that configuration first.
>
> You have way too many mirrors for a persistable-channel queue to ever be the correct queue for you.
>
> Therefore change your update_checker queue to use a level queue and get rid of the channel queue related things.
>
> [repository]
> ROOT = /var/lib/gitea/repositories
>
> [mirror]
> DEFAULT_INTERVAL = 8h
>
> [queue.mirror]
> TYPE=level; <- You have way too many mirrors for a channel or persistent-channel to be the right queue for you

This is not defined in

https://github.com/go-gitea/gitea/blob/main/custom/conf/app.example.ini

and

https://docs.gitea.io/en-us/config-cheat-sheet/#mirror-mirror

> ; Update mirrors
> [cron.update_mirrors]
> ; Every day at 4AM
> SCHEDULE = 0 0 4 * * *
> PULL_LIMIT = -1
> PUSH_LIMIT = -1
>
> Next I think we need to address what the `update_mirrors` cron task does.
>
> `update_mirrors` looks through the list of mirrors which are due to be updated - (e.g. `next_update_unix` <= Now & `next_update_unix != 0`) and sorts them into a last_updated order. It will then iterate through them.
>
> It will then queue PULL_LIMIT pull mirrors and PUSH_LIMIT push mirrors for update.
>
> The updates will be done by the workers on the other end of the queue. This will depend on your general queue configuration but often this is a scaling worker pool up to a maximum of 10 workers.
>
> So ... If you are finding that nothing is being queued ... It would be useful to check the values of `next_update_unix` in the mirror table.

My current status for the mirrors

image

This means that all 9000+ mirrors should be synced now.

> The PULL_LIMIT and PUSH_LIMIT code is working perfectly as described and so your problem is elsewhere.
>
> Now to explain why we changed the default PULL_LIMIT and PUSH_LIMIT.
>
> The vast majority of installs will benefit from this change as most people do not:
>
>   1. Change their cron configuration to only check things once per day (instead of changing the repository mirror update time.)
>   2. Have 9000 mirrors.
>
> Your situation represents an edgecase of edgecases.

;)

> But I personally have tried to provide you with ways to make your personal situation easier.

Thx!

> The change to use a normal queue for mirrors (#17326) gives you the option of using a level queue as your underlying queue, which prevents Gitea from seizing up due to a blocked queue. The PULL_LIMIT and PUSH_LIMIT options, which you requested, give you another option: consider changing your cron configuration back to /10 minutes (but you might actually need PULL_LIMIT/PUSH_LIMIT to be percentages of the total number of mirrors (not sure here.))
>
> You're not running Gitea in a normal way and that means you will always need to carefully think about things. In your situation you need to tune things properly and that is what we have provided for you.

I added

[queue.mirror]
TYPE=level; 

and restarted my Gitea. But the cron does the same: nothing.

The sync update with curl and directly in repo settings is working.

@somera commented on GitHub (Feb 6, 2022): > OK let's look at that configuration first. > > You have way too many mirrors to for a persistable-channel queue to ever be the correct queue for you. > > Therefore change your update_checker queue to use a level queue and get rid of the channel queue related things. > > ```ini > [repository] > ROOT = /var/lib/gitea/repositories > > [mirror] > DEFAULT_INTERVAL = 8h > > [queue.mirror] > TYPE=level; <- You have way too many mirrors for a channel or persistent-channel to be the right queue for you This is not defined in https://github.com/go-gitea/gitea/blob/main/custom/conf/app.example.ini and https://docs.gitea.io/en-us/config-cheat-sheet/#mirror-mirror > > ; Update mirrors > [cron.update_mirrors] > ; Every day at 4AM > SCHEDULE = 0 0 4 * * * > PULL_LIMIT = -1 > PUSH_LIMIT = -1 > ``` > > Next I think we need address what the `update_mirrors` cron task does, > > `update_mirrors` looks through the list of mirrors which are due to be updated - (e.g. `next_update_unix` <= Now & `next_update_unix != 0`) and sorts them into a last_updated order. It will then iterate through them. > > It will then queue PULL_LIMIT pull mirrors and PUSH_LIMIT push mirrors for update. > > The updates will be done by the workers on the other end of the queue. This will depend on your general queue configuration but often this is a scaling worker pool up to a maximum of 10 workers. > > So ... If you are finding that nothing is being queued ... It would be useful to check the values of `next_update_unix` in the mirror table. My current status for the mirrors ![image](https://user-images.githubusercontent.com/8334250/152698867-75d2e476-b2d1-4b78-a9e2-fdfdc42d204c.png) This means, that all 9000+ mirrors should be synced now. > The PULL_LIMIT and PUSH_LIMIT code is working perfectly as described and so your problem is elsewhere. > > Now to explain why we changed the default PULL_LIMIT and PUSH_LIMIT. 
> > The vast majority of installs will benefit from this change as most people do not: > > 1. Change their cron configuration to only check things once per day (instead of changing the repository mirror update time.) > 2. Have 9000 mirrors. > > Your situation represents an edgecase of edgecases. ;) > But I personally have tried to provide you with ways to make your personal situation easier. Thx! > The change to use a normal queue for mirrors (#17326) allows you the option to use a level queue for your underlying queue thus prevent gitea from seizing up due to a blocked queue. The PULL_LIMIT and PUSH_LIMIT options were requested by you gives you other options to consider changing your cron configuration back to /10 minutes (but you might actually need a PULL_LIMIT/PUSH_LIMIT to be percentages of the total number of mirrors (not sure here.)) > > You're not running Gitea in a normal way and that means you will always need to carefully think about things. In your situation you need to tune things properly and that is what we have provided for you. I added ``` [queue.mirror] TYPE=level; ``` and restarted my Gitea. But the cron do the same: nothing. The sync update with curl and directly in repo settings is working.

@zeripath commented on GitHub (Feb 6, 2022):

> @zeripath when I change the loglevel to
>
> [log]
> LEVEL            = Info

Change from what? Info is one of the lowest log levels and it will not be giving us any special information.

[log]
MODE=console, traceconsole
LEVEL=info

[log.traceconsole]
MODE=console
LEVEL=trace
EXPRESSION=services/mirror

Would be more useful.

OR even whilst gitea is running you can simply run:

./gitea manager logging add console --name traceconsole --level TRACE --expression services/mirror

And it will add a trace level console logger that will emit TRACE level logs from events in the services/mirror files.

> than I see 9000+ (each for every mirror) select statements in the log

2022/02/06 20:24:32 models/repo/repo.go:635:getRepositoryByID() [I] [SQL] SELECT "id", "owner_id", "owner_name", "lower_name", "name", "description", "website", "original_service_type", "original_url", "default_branch", "num_watches", "num_stars", "num_forks", "num_issues", "num_closed_issues", "num_pulls", "num_closed_pulls", "num_milestones", "num_closed_milestones", "num_projects", "num_closed_projects", "is_private", "is_empty", "is_archived", "is_mirror", "status", "is_fork", "fork_id", "is_template", "template_id", "size", "is_fsck_enabled", "close_issues_via_commit_in_any_branch", "topics", "trust_model", "avatar", "created_unix", "updated_unix" FROM "repository" WHERE "id"=$1 LIMIT 1 [4445] - 848.612µs

That is your 9000 repositories being loaded and then added to the update queue.

> and than this

2022/02/06 20:24:32 ...s/repo/pushmirror.go:111:PushMirrorsIterate() [I] [SQL] SELECT "id", "repo_id", "remote_name", "interval", "created_unix", "last_update", "last_error" FROM "push_mirror" WHERE (last_update + ("interval" / $1) <= $2) AND ("interval" != 0) ORDER BY last_update ASC [1s 1644175472] - 1.891748ms

These are push mirrors and it is clear that you have none.

> select. and no one mirror will be updated.
>
> This are the 8-10 seconds the cron.update_mirrors needs to run and finish.

Yes this is correct because the cron.update_mirrors task is simply adding mirrors to the queue to be updated - that is all it has ever done. It has never represented the actual work of doing the updating.

The work of updating a mirror will be done by workers on the queue. The mirror queue will scale its workers to account for the number of items in the queue.


> But the cron do the same: nothing

I've tried to explain this to you before - the cron task update_mirrors DOES NOT represent the actual work of doing the updating. It has never done that. Previously you've had a proxy of this because in your situation you've been blocking the whole queue due to the number of mirrors you have.
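The split described above (the cron task only queues mirrors, the queue's workers do the updating) can be pictured with a minimal Go sketch. This is not Gitea's code: `enqueueDue`, `runWorkers` and `syncAll` are hypothetical names, and a plain buffered channel stands in for the real mirror queue.

```go
package main

import (
	"fmt"
	"sync"
)

// enqueueDue stands in for the cron task: it only pushes repository IDs onto
// the queue and returns. It performs no mirror updates itself.
func enqueueDue(queue chan<- int64, repoIDs []int64) {
	for _, id := range repoIDs {
		queue <- id
	}
	close(queue)
}

// runWorkers stands in for the queue's worker pool: each worker pulls IDs
// from the queue and calls syncFn, which is where the slow work would happen.
func runWorkers(queue <-chan int64, workers int, syncFn func(id int64)) {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range queue {
				syncFn(id)
			}
		}()
	}
	wg.Wait()
}

// syncAll wires both halves together and returns how many repos were synced.
func syncAll(repoIDs []int64, workers int) int {
	queue := make(chan int64, len(repoIDs))
	enqueueDue(queue, repoIDs) // fast: this only queues work

	var mu sync.Mutex
	synced := 0
	runWorkers(queue, workers, func(id int64) {
		mu.Lock()
		synced++ // a real worker would fetch from the remote here
		mu.Unlock()
	})
	return synced
}

func main() {
	fmt.Println(syncAll([]int64{1, 2, 3, 4, 5}, 2))
}
```

In this picture a cron run that finishes in a few seconds is expected; whether mirrors actually get updated depends on workers being present on the other end of the queue.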


@somera commented on GitHub (Feb 6, 2022):

> @zeripath when I change the loglevel to
> You're not running Gitea in a normal way and that means you will always need to carefully think about things. In your situation you need to tune things properly and that is what we have provided for you.

What do you mean with "You're not running Gitea in a normal way"? ;)

If someone (not me) is running a Gitea instance that is used by a lot of people with a lot of repos and mirrors, then that person will get the same problems.

> I've tried to explain this to you before - the cron task update_mirrors DOES NOT represent the actual work of doing the updating. It has never done that. Previously you've had a proxy of this because in your situation you've been blocking the whole queue due to the number of mirrors you have.

Is this a design "problem"?

Either way, I don't understand what exactly is blocked. The workers for the mirror queue?

{
  "Name": "mirror",
  "DataDir": "/var/lib/gitea/data/queues/common",
  "BatchLength": 20,
  "QueueLength": 20000,
  "Timeout": 90000000000,
  "MaxAttempts": 10,
  "Workers": 0,
  "MaxWorkers": 10,
  "BlockTimeout": 1000000000,
  "BoostTimeout": 300000000000,
  "BoostWorkers": 1
}

It worked with Gitea 1.15.x. Is the old way not implemented anymore?

And what is the solution now? At the moment I can send the curl calls once a week to update the mirrors.
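For reading the queue configuration dump above, here is a small Go sketch that decodes the timeouts. It assumes the numeric values are Go `time.Duration` values serialized as nanoseconds; the thread does not state the unit, so treat that as an assumption. The `queueConfig` struct is a hypothetical subset of the fields shown, not Gitea's own type.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// queueConfig is a hypothetical subset of the fields in the dump above.
// Fields not listed here are ignored by encoding/json.
type queueConfig struct {
	Name         string
	Timeout      time.Duration // assumed to be nanoseconds
	BlockTimeout time.Duration // assumed to be nanoseconds
	BoostTimeout time.Duration // assumed to be nanoseconds
	Workers      int
	MaxWorkers   int
	BoostWorkers int
}

// sample is the mirror queue configuration as posted in the thread.
const sample = `{
  "Name": "mirror",
  "DataDir": "/var/lib/gitea/data/queues/common",
  "BatchLength": 20,
  "QueueLength": 20000,
  "Timeout": 90000000000,
  "MaxAttempts": 10,
  "Workers": 0,
  "MaxWorkers": 10,
  "BlockTimeout": 1000000000,
  "BoostTimeout": 300000000000,
  "BoostWorkers": 1
}`

// parseQueueConfig decodes the JSON dump into queueConfig.
func parseQueueConfig(raw string) (queueConfig, error) {
	var cfg queueConfig
	err := json.Unmarshal([]byte(raw), &cfg)
	return cfg, err
}

func main() {
	cfg, err := parseQueueConfig(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println("timeout:", cfg.Timeout)
	fmt.Println("block timeout:", cfg.BlockTimeout)
	fmt.Println("boost timeout:", cfg.BoostTimeout)
	fmt.Println("workers:", cfg.Workers, "max:", cfg.MaxWorkers, "boost:", cfg.BoostWorkers)
}
```

Under that assumption the dump reads as a 90 second timeout, a 1 second block timeout and a 5 minute boost timeout, with zero permanent workers and up to 10 boosted ones.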


@zeripath commented on GitHub (Feb 6, 2022):

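  • next_update_unix in the mirror table is the key thing for things to be added by the Cron task to the queue. Not updated_unix and certainly not updated_unix on the repository.

f393bc82cb/models/repo/mirror.go (L124-L128)

  • The 9k select repository statements say that:

f393bc82cb/services/mirror/mirror.go (L115)

Happens for each repo

Thus you get a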

f393bc82cb/services/mirror/mirror.go (L67-L70)

This will get pushed to the queue unless it is already in the queue.

f393bc82cb/services/mirror/mirror.go (L92-L99)

f393bc82cb/services/mirror/mirror.go (L101-L104)


Now you assert that calling sync with curl works.

So let's follow what that does:

f393bc82cb/routers/api/v1/repo/mirror.go (L17)

f393bc82cb/routers/api/v1/repo/mirror.go (L51)

f393bc82cb/services/mirror/mirror.go (L151-L165)

Which pushes to the same queue

Now you might argue that that push doesn't have a Has wrapped around it but... There's a Has internally in the push.
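The idea that a push is skipped when the item is already queued, whether the caller checks Has first or not, can be sketched as follows. This is a toy in-memory queue written for illustration, not Gitea's unique queue; `uniqueQueue` and its methods are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// uniqueQueue is a toy in-memory queue that refuses duplicate entries.
type uniqueQueue struct {
	mu      sync.Mutex
	pending map[int64]struct{}
	items   []int64
}

func newUniqueQueue() *uniqueQueue {
	return &uniqueQueue{pending: make(map[int64]struct{})}
}

// Has reports whether id is currently waiting in the queue.
func (q *uniqueQueue) Has(id int64) bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	_, ok := q.pending[id]
	return ok
}

// Push adds id unless it is already queued. The membership check happens
// inside Push, so callers do not need to call Has first.
func (q *uniqueQueue) Push(id int64) bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	if _, ok := q.pending[id]; ok {
		return false
	}
	q.pending[id] = struct{}{}
	q.items = append(q.items, id)
	return true
}

// Len returns the number of queued items.
func (q *uniqueQueue) Len() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return len(q.items)
}

func main() {
	q := newUniqueQueue()
	fmt.Println(q.Push(4445)) // first push is accepted
	fmt.Println(q.Push(4445)) // duplicate is rejected
	fmt.Println(q.Len())
}
```

One consequence worth keeping in mind: if stale entries remain in a persisted queue, repeated pushes for the same repository would be silently dropped in a design like this.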


So what have we found:

  • PULL_LIMIT -1 must be working, otherwise you wouldn't have 9K SQL selects.
  • Similarly, the 9K selects imply that at least at one point the next_update_unix query found all the cases.
  • Your own assertion about how curl syncs things means the queue infrastructure works at doing the syncing.
  • Which leaves the Has test at line 99 above, but there's a Has in the push anyway.

I guess the question I have is what is making you think this isn't working?


So... One thing you could do is simply change the next_update_unix for a mirror manually. Turn on the trace logger I suggested above, then click the Cron task button and follow what the logs do.
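The selection rule discussed in this thread (a mirror is due when next_update_unix is non-zero and not in the future, due mirrors are ordered by last update, and at most PULL_LIMIT of them are queued) can be sketched in Go. The `mirror` struct and `dueMirrors` function are hypothetical; treating a negative limit as "no limit" follows the PULL_LIMIT = -1 usage above, while the handling of 0 is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// mirror is a hypothetical, reduced view of a row in the mirror table.
type mirror struct {
	RepoID         int64
	UpdatedUnix    int64
	NextUpdateUnix int64
}

// dueMirrors returns the mirrors that are due at time now, oldest update
// first, capped at limit. A negative limit means no cap.
func dueMirrors(all []mirror, now int64, limit int) []mirror {
	due := make([]mirror, 0, len(all))
	for _, m := range all {
		// Due means: a next update is scheduled and its time has passed.
		if m.NextUpdateUnix != 0 && m.NextUpdateUnix <= now {
			due = append(due, m)
		}
	}
	sort.SliceStable(due, func(i, j int) bool {
		return due[i].UpdatedUnix < due[j].UpdatedUnix
	})
	if limit >= 0 && len(due) > limit {
		due = due[:limit]
	}
	return due
}

func main() {
	mirrors := []mirror{
		{RepoID: 1, UpdatedUnix: 30, NextUpdateUnix: 100},
		{RepoID: 2, UpdatedUnix: 10, NextUpdateUnix: 50},
		{RepoID: 3, UpdatedUnix: 20, NextUpdateUnix: 0},   // never scheduled
		{RepoID: 4, UpdatedUnix: 5, NextUpdateUnix: 500}, // not due yet
	}
	for _, m := range dueMirrors(mirrors, 200, -1) {
		fmt.Println(m.RepoID)
	}
}
```

This is why setting next_update_unix by hand for one mirror is a useful experiment: it isolates whether the selection step or the queue step is dropping the mirror.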

@zeripath commented on GitHub (Feb 6, 2022): * next_update_unix in the mirror table is the key thing for things to be added by the Cron task to the queue. Not updated_unix and certainly not updated_unix on the repository. https://github.com/go-gitea/gitea/blob/f393bc82cbf83ab890a55ecdb7f41583e583ddad/models/repo/mirror.go#L124-L128 * The 9k select repository statements says that: https://github.com/go-gitea/gitea/blob/f393bc82cbf83ab890a55ecdb7f41583e583ddad/services/mirror/mirror.go#L115 Happens for each repo Thus you get a https://github.com/go-gitea/gitea/blob/f393bc82cbf83ab890a55ecdb7f41583e583ddad/services/mirror/mirror.go#L67-L70 This will get pushed to the queue unless it is already in the queue. https://github.com/go-gitea/gitea/blob/f393bc82cbf83ab890a55ecdb7f41583e583ddad/services/mirror/mirror.go#L92-L99 https://github.com/go-gitea/gitea/blob/f393bc82cbf83ab890a55ecdb7f41583e583ddad/services/mirror/mirror.go#L101-L104 --- Now you assert that calling sync with curl works. So let's follow what that does: https://github.com/go-gitea/gitea/blob/f393bc82cbf83ab890a55ecdb7f41583e583ddad/routers/api/v1/repo/mirror.go#L17 https://github.com/go-gitea/gitea/blob/f393bc82cbf83ab890a55ecdb7f41583e583ddad/routers/api/v1/repo/mirror.go#L51 https://github.com/go-gitea/gitea/blob/f393bc82cbf83ab890a55ecdb7f41583e583ddad/services/mirror/mirror.go#L151-L165 Which pushes to the same queue Now you might argue that that push doesn't have a Has wrapped around it but... There's a Has internally in the push. --- So what have we found: * PULL_LIMIT -1 must be working otherwise you wouldn't have 9K sql selects * Similarly the 9K selects imply that at least one point the next_update_unix query found all the cases. * Your own assertion about how curl syncs things means the queue infrastructure works at doing the syncing. * Which leaves the Has test at lines 99 above but there's a has in the push anyway. 
--- I guess the question I have is what is making you think this isn't working? --- So... One thing you could do is simply change the next_update_unix for a mirror manually. Put the tracer logger I suggested above on. And then click the Cron task button and follow what the logs do.

@somera commented on GitHub (Feb 6, 2022):

Results for next_update_unix

image

If I use curl, it does something, but not for all curl calls.

I made 547 curl calls for 547 repos. After Gitea was done, 266 repos still were not updated. Then I made a second run for the 266 repos.

This shows me that the mirror sync process isn't working, because otherwise the manual cron start would update all my mirrors.

Before new manual start

image

image

Manual start ...

And these are the changes

image

Can you explain this? Gitea updates only 4 mirrors with mirror_next_update_unix = '2022-02-06'

image

And if I start the cron a second time ... not a single repo is updated.


@zeripath commented on GitHub (Feb 6, 2022):

Go to monitor and tell me how many workers you have in your mirror queue right now. (Not initial configuration)


@somera commented on GitHub (Feb 6, 2022):

{
  "Name": "mirror",
  "DataDir": "/var/lib/gitea/data/queues/common",
  "BatchLength": 20,
  "QueueLength": 20000,
  "Timeout": 90000000000,
  "MaxAttempts": 10,
  "Workers": 0,
  "MaxWorkers": 10,
  "BlockTimeout": 1000000000,
  "BoostTimeout": 300000000000,
  "BoostWorkers": 1
}

{
  "QueueLength": 20000,
  "BatchLength": 20,
  "BlockTimeout": 1000000000,
  "BoostTimeout": 300000000000,
  "BoostWorkers": 1,
  "MaxWorkers": 10,
  "Workers": 0,
  "Name": "mirror-channel"
}

But the number of workers is not important. If I put 1000 items into the queue and have only 10 workers, it just takes longer. And Gitea is not updating more than 1 repo in a row.


@zeripath commented on GitHub (Feb 6, 2022):

That is initial configuration


@somera commented on GitHub (Feb 6, 2022):

If I refresh the config page for the mirror queue I don't see any changes.


@somera commented on GitHub (Feb 6, 2022):

image

"This queue surrounds other queues and has no worker pool itself."


@zeripath commented on GitHub (Feb 6, 2022):

So... Mirror-channel?


@somera commented on GitHub (Feb 6, 2022):

image

image


@somera commented on GitHub (Feb 7, 2022):

I made 48 curl calls and saw this

image


@zeripath commented on GitHub (Feb 7, 2022):

Hmm... I wonder... Is it possible that the queue worker is timing out and then not being replaced?


@zeripath commented on GitHub (Feb 7, 2022):

I guess a trick for that is to wait until that worker was due to timeout and then add another worker manually yourself and see if that finishes off the work


@zeripath commented on GitHub (Feb 7, 2022):

Actually a flush worker would be better


@zeripath commented on GitHub (Feb 7, 2022):

Yup, I bet this is the problem. When the zero worker times out, the lack of a worker won't be noticed unless there's a push.

Workaround: just set WORKERS=1 in [queue.mirror] or flush the queue. I'll have a think - likely when the managedQueue loses its final worker and there's something in the queue it should zeroboost again.
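The suspected failure mode can be illustrated with a toy state model in Go. This is only a sketch of the reasoning in this comment, not Gitea's queue code: the `pool` type, its fields and the `reboost` flag are hypothetical, and the model ignores real timing and item processing.

```go
package main

import "fmt"

// pool is a toy model of a worker pool that starts with zero workers.
type pool struct {
	workers int  // workers currently running
	pending int  // items waiting in the queue
	reboost bool // hypothetical fix: boost again when the last worker exits
}

// push queues one item and boosts a temporary worker if none is running.
func (p *pool) push() {
	p.pending++
	if p.workers == 0 {
		p.workers = 1
	}
}

// workerTimedOut models a boosted worker exiting before the queue is empty.
func (p *pool) workerTimedOut() {
	if p.workers > 0 {
		p.workers--
	}
	// Proposed behaviour: if the final worker is gone and items remain,
	// start another worker instead of waiting for the next push.
	if p.reboost && p.workers == 0 && p.pending > 0 {
		p.workers = 1
	}
}

// stalled reports whether items are waiting with nobody to process them.
func (p *pool) stalled() bool {
	return p.workers == 0 && p.pending > 0
}

func main() {
	for _, reboost := range []bool{false, true} {
		p := &pool{reboost: reboost}
		p.push()
		p.push()
		p.workerTimedOut()
		fmt.Printf("reboost=%v stalled=%v\n", reboost, p.stalled())
	}
}
```

In this model, setting WORKERS=1 corresponds to a pool that never drops to zero workers, which is why it works as a stopgap.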


@somera commented on GitHub (Feb 7, 2022):

> Yup I bet this is this problem. When the zero worker times out unless there's a push the lack of worker won't be noticed.
>
> Workaround just set workers=1 in [Queue.mirrors] or flush the queue. I'll have a think - likely when the managedQueue loses its final worker and if there's something in the queue it should zeroboost again.

I added this

[queue.mirror]
LENGTH = 20000
WORKERS = 1

and set the limits to 100 (only for testing)

[cron.update_mirrors]
; Every day at 4AM
SCHEDULE = 0 0 4 * * *
PULL_LIMITS = 100
PUSH_LIMITS = 100

Then I restarted Gitea

image

And start the cron manually ... no repo was synced.

I see this

image

and no other activity. After 8-10 seconds this task is finished.

I made 4 tests

2022/02/07 01:22:00 ...ces/mirror/mirror.go:56:Update() [T] Doing: Update
2022/02/07 01:22:09 ...ces/mirror/mirror.go:129:Update() [T] Finished: Update
2022/02/07 01:24:19 ...ces/mirror/mirror.go:56:Update() [T] Doing: Update
2022/02/07 01:24:28 ...ces/mirror/mirror.go:129:Update() [T] Finished: Update
2022/02/07 01:28:16 ...ces/mirror/mirror.go:56:Update() [T] Doing: Update
2022/02/07 01:28:25 ...ces/mirror/mirror.go:129:Update() [T] Finished: Update
2022/02/07 01:31:05 ...ces/mirror/mirror.go:56:Update() [T] Doing: Update
2022/02/07 01:31:14 ...ces/mirror/mirror.go:129:Update() [T] Finished: Update

@somera commented on GitHub (Feb 7, 2022):

> Yup I bet this is this problem. When the zero worker times out unless there's a push the lack of worker won't be noticed.
>
> Workaround just set workers=1 in [Queue.mirrors] or flush the queue. I'll have a think - likely when the managedQueue loses its final worker and if there's something in the queue it should zeroboost again.

But what is the difference to curl?

curl is an external call which puts the mirror into the queue. And it works.

What is the difference from the cron call? If you try to explain this, you will find the "problem". ;)
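For reference, the kind of external sync call mentioned here can be sketched in Go. The thread does not show the exact curl command, so this assumes it targets the Gitea API endpoint `POST /api/v1/repos/{owner}/{repo}/mirror-sync` with a token in the `Authorization` header; the host, owner, repo and token values are placeholders, and `newMirrorSyncRequest` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// newMirrorSyncRequest builds (but does not send) a request asking a Gitea
// instance to queue a sync for one mirror repository.
func newMirrorSyncRequest(baseURL, owner, repo, token string) (*http.Request, error) {
	endpoint := fmt.Sprintf("%s/api/v1/repos/%s/%s/mirror-sync",
		strings.TrimRight(baseURL, "/"),
		url.PathEscape(owner),
		url.PathEscape(repo))

	req, err := http.NewRequest(http.MethodPost, endpoint, nil)
	if err != nil {
		return nil, err
	}
	// Assumed auth scheme: an API token passed as "token <value>".
	req.Header.Set("Authorization", "token "+token)
	return req, nil
}

func main() {
	// Placeholder values; replace them with a real instance and token.
	req, err := newMirrorSyncRequest("https://gitea.example.com", "org", "repo", "TOKEN")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	// To actually send it: resp, err := http.DefaultClient.Do(req)
}
```

As traced earlier in the thread, such a call ends up pushing to the same mirror queue as the cron task, so both paths depend on the same workers.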


@lunny commented on GitHub (Feb 7, 2022):

OK. I found something. It seems the mirror address is not right, which may be caused by #15157. PR #18649


@somera commented on GitHub (Feb 7, 2022):

> OK. I found something. It seems the mirror address is not right. Which may caused by #15157 . PR #18649

What does it mean?

Should cron.update_mirrors be disabled? And can something go wrong here?


@somera commented on GitHub (Feb 11, 2022):

After

Run:

gitea manager flush-queues

Wait for it to finish.

Shutdown Gitea and delete the /data/queues/common folder.

Restart.

Gitea is syncing only ~220 mirrors.


@zeripath commented on GitHub (Feb 11, 2022):

Are you running 1.16-head?


@somera commented on GitHub (Feb 11, 2022):

> Are you running 1.16-head?

No. I can't run this on my instance.


@zeripath commented on GitHub (Feb 11, 2022):

I'm not suggesting that you run 1.17/main - I am simply suggesting that you move to 1.16-dev or 1.16, which tracks the backports and bug fixes that will become 1.16.2 in the next week.

Are you at least running with WORKERS=1 so that there is a permanent worker for the mirror queue?


@somera commented on GitHub (Feb 11, 2022):

Powered by Gitea Version: 1.16.1

lrwxrwxrwx 1 git git        24 Feb  6 17:20 gitea -> gitea-1.16.1-linux-amd64
-rwxr-xr-x 1 git git 106011280 Feb  6 14:44 gitea-1.16.1-linux-amd64

@somera commented on GitHub (Feb 11, 2022):

> I'm not suggesting that you run 1.17/main - I am simply suggesting that you move the 1.16-dev or 1.16 which tracks the backports and bug fixes that will become 1.16.2 in future in the next week.
>
> Are you at least running with WORKERS=1 so that there is a permanent worker for the mirror queue?

Yes.

I restarted, repeated flush-queues ... and started the mirror process again.


@somera commented on GitHub (Feb 11, 2022):

It looks better now. And Gitea is syncing the mirrors.


@somera commented on GitHub (Feb 11, 2022):

Is it possible to stop the whole mirror sync process?

If I press the trash icon marked in yellow, will this stop only the one mirror or the whole cron?

image


@zeripath commented on GitHub (Feb 11, 2022):

Just one mirror - if you want to stop all mirroring you'll need to stop the queue worker.


@somera commented on GitHub (Feb 12, 2022):

> Just one mirror - if you want to stop all mirroring you'll need to stop the queue worker.

Thx. I will try this another time.

Some update ...

After last try:

  • flush-queues
  • stop gitea
  • remove ./../data/queues/common folder
  • start gitea

With a manual cron start I could update all mirrors in one go. Good. Looks like this helps. But ...

About 16-18 hours (DEFAULT_INTERVAL = 8h) later I wanted to see what happens if I start the cron again.

My current configuration is:

[mirror]
; Default interval as a duration between each check
DEFAULT_INTERVAL = 8h
; Min interval as a duration must be > 1m
MIN_INTERVAL = 10m
LENGTH = 20000
TYPE = level

[queue.mirror]
LENGTH = 20000
WORKERS = 1

[cron.update_mirrors]
; Every day at 4AM
SCHEDULE = 0 0 4 * * *
PULL_LIMITS = 1000
PUSH_LIMITS = 1000

But "nothing" happens. Gitea updates only ~10 mirrors.


@somera commented on GitHub (Feb 18, 2022):

Only ~50 mirrors are updated every day on the 04:00AM run

image

With the configuration which I posted in the comment above.


@somera commented on GitHub (Feb 24, 2022):

@zeripath I opened #18895

Reference: github-starred/gitea#8497