Gitea doesn't index a specific repo #5608

Closed
opened 2025-11-02 06:30:42 -06:00 by GiteaMirror · 3 comments

Originally created by @u3shit on GitHub (Jun 22, 2020).

  • Gitea version (or commit ref): 1.12.1
  • Git version: 2.26.2
  • Operating system: alpine linux
  • Database (use [x]):
    • [ ] PostgreSQL
    • [ ] MySQL
    • [ ] MSSQL
    • [x] SQLite
  • Can you reproduce the bug at https://try.gitea.io:
    • [ ] Yes (provide example URL)
    • [x] No: repo indexer not enabled
    • [ ] Not relevant
  • Log gist:

Description

I've enabled repo indexer in my config:

[indexer]
REPO_INDEXER_ENABLED = true
UPDATE_BUFFER_LEN = 20
MAX_FILE_SIZE = 1048576

It works fine for every project on my Gitea server except a single private one. I tried the repository health check, a forced gc, pushing to the repo, and creating a commit from the web interface, with no change. After some investigation I found that in the SQLite database's repo_indexer_status table, that one repo has indexer_type=1 while every other repo has 0. In gitea.log I found only this one relevant log line:

2020/06/22 10:19:45 ...ndexer/code/queue.go:39:processRepoIndexerOperationQueue() [E] indexer.Index: key too large

and a bunch of SQL related logs.
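
The database check described above could be scripted roughly as follows. This is a minimal sketch, not Gitea's own tooling: only the table name repo_indexer_status and the indexer_type column come from the report, while the repo_id column and the helper function names are assumptions made for illustration.

```python
import sqlite3
from collections import defaultdict


def indexer_types_by_repo(conn):
    """Map each repo id to the sorted list of indexer_type values stored for it."""
    # Assumed columns: repo_id and indexer_type (only indexer_type is named in the report).
    rows = conn.execute(
        "SELECT repo_id, indexer_type FROM repo_indexer_status "
        "ORDER BY repo_id, indexer_type"
    )
    result = defaultdict(list)
    for repo_id, indexer_type in rows:
        result[repo_id].append(indexer_type)
    return dict(result)


def repos_missing_type(conn, wanted_type=0):
    """Return the repo ids that have no row with the given indexer_type."""
    return sorted(
        repo_id
        for repo_id, types in indexer_types_by_repo(conn).items()
        if wanted_type not in types
    )
```

Running repos_missing_type(sqlite3.connect(path)) against a copy of the database file (path is a placeholder) would list the repos that, like the one described here, have no row with indexer_type=0.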

I retried this inside Docker on the current gitea/gitea:latest image. I only added the REPO_INDEXER_ENABLED = true line to app.ini and otherwise kept the default settings, then created a new public repo and pushed my repo there. Gitea wrote a bunch of truncated git cat-file output to the console (including things like raw PNG files) and then just stopped. Possibly relevant lines from the log:

# grep Index /data/gitea/log/gitea.log
2020/06/22 12:50:23 ...er/issues/indexer.go:142:func2() [I] PID 17: Initializing Issue Indexer: bleve
2020/06/22 12:50:23 ...exer/code/indexer.go:66:func2() [I] PID: 17 Initializing Repository Indexer at: /data/gitea/indexers/repos.bleve
2020/06/22 12:50:24 ...xer/stats/indexer.go:38:populateRepoIndexer() [I] Populating the repo stats indexer with existing repositories
2020/06/22 12:50:24 ...er/issues/indexer.go:221:func3() [I] Issue Indexer Initialization took 724.725022ms
2020/06/22 12:50:24 ...ndexer/code/queue.go:79:populateRepoIndexer() [I] Populating the repo indexer with existing repositories
2020/06/22 12:50:24 ...exer/code/indexer.go:121:func3() [I] Repository Indexer Initialization took 724.706653ms
2020/06/22 12:51:11 ...n/indexer/indexer.go:131:NotifyPushCommits() [E] stats_indexer.UpdateRepoIndexer(1) failed: already in queue
2020/06/22 12:51:12 ...ndexer/code/queue.go:39:processRepoIndexerOperationQueue() [E] indexer.Index: key too large
2020/06/22 12:51:13 ...ndexer/code/queue.go:39:processRepoIndexerOperationQueue() [E] indexer.Index: key too large
2020/06/22 12:53:04 ...n/indexer/indexer.go:131:NotifyPushCommits() [E] stats_indexer.UpdateRepoIndexer(1) failed: already in queue
2020/06/22 12:53:05 ...ndexer/code/queue.go:39:processRepoIndexerOperationQueue() [E] indexer.Index: key too large
2020/06/22 12:53:05 ...ndexer/code/queue.go:39:processRepoIndexerOperationQueue() [E] indexer.Index: key too large

Unfortunately the repo is not public so I can't post a link to it.
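
To see at a glance which errors repeat in an excerpt like the one above, the [E] lines can be tallied by message. The following is a minimal sketch that assumes every line has the shape "date time source [level] message" seen in this log; the regular expression and the function name count_errors are illustrative and not part of Gitea.

```python
import re
from collections import Counter

# Matches lines shaped like the excerpt above:
# "<date> <time> <source> [<level>] <message>"
_LINE = re.compile(
    r"^\S+ \S+ (?P<source>\S+) \[(?P<level>[A-Z])\] (?P<message>.*)$"
)


def count_errors(lines):
    """Count error-level ([E]) log lines, grouped by their message text."""
    counts = Counter()
    for line in lines:
        match = _LINE.match(line.strip())
        if match and match.group("level") == "E":
            counts[match.group("message")] += 1
    return counts
```

Passing an open log file, for example count_errors(open(path)) with path as a placeholder, returns a Counter keyed by message, which makes the repeated key too large entries stand out from the informational lines.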

GiteaMirror added the issue/confirmed, type/bug, type/upstream labels 2025-11-02 06:30:42 -06:00

@u3shit commented on GitHub (Jun 22, 2020):

Okay, I think I've narrowed down the problem. Clone this repository and try to push it to a newly created Gitea repository: https://github.com/u3shit/gitea-indexer-failure
I get the indexer.Index: key too large error after the push. Also, searching for foo returns no results, when it should find a match in the bar file. Just a guess, but maybe the indexer chokes on the huge base64 string in data/human_chibitest.gltf?
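
One way to test that guess on a private repo is to scan a checkout for files containing a very long unbroken run of non-whitespace bytes, such as an embedded base64 blob. This is a rough sketch under stated assumptions: the default threshold of 32768 bytes is borrowed from the maximum key size of the bbolt key/value store (whose error message is also "key too large"), on the unconfirmed assumption that the indexer's storage layer is what rejects the key. A whitespace-delimited run is only a proxy for whatever terms the indexer actually produces, and the function names are illustrative.

```python
import os
import re

# Assumed, illustrative threshold; the real limit that applies to the
# indexer is not stated in this thread.
DEFAULT_THRESHOLD = 32768

_TOKEN = re.compile(rb"\S+")


def longest_token_length(data: bytes) -> int:
    """Length in bytes of the longest run of non-whitespace bytes."""
    return max((len(m.group()) for m in _TOKEN.finditer(data)), default=0)


def find_long_token_files(root, threshold=DEFAULT_THRESHOLD):
    """Yield (relative_path, longest_token_length) for files above the threshold."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Git's own metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                length = longest_token_length(fh.read())
            if length > threshold:
                yield os.path.relpath(path, root), length
```

If list(find_long_token_files(checkout_dir)) reports only a file like data/human_chibitest.gltf, that would be consistent with the guess above, though it would not prove that this file is what triggers the error.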

@lunny commented on GitHub (Jun 23, 2020):

@u3shit Thank you for your investigation. It seems the error is reported by bleve.

@stale[bot] commented on GitHub (Aug 23, 2020):

This issue has been automatically marked as stale because it has not had recent activity. I am here to help clear issues left open even if solved or waiting for more insight. This issue will be closed if no further activity occurs during the next 2 weeks. If the issue is still valid just add a comment to keep it alive. Thank you for your contributions.

Reference: github-starred/gitea#5608