Incomplete issue search results in repository with many issues #10829

Closed
opened 2025-11-02 09:19:24 -06:00 by GiteaMirror · 1 comment
Owner

Originally created by @brechtvl on GitHub (May 11, 2023).

Description

There are multiple ways to reproduce this, but one way:

  • Create 50 issues with same title
  • Create 1 pull request with the same title as the issues
  • Searching for the title will return either 0 results in pull requests search, or only 49 results in issue search

Another way:

  • Create 60 issues with same title
  • Apply one label to 30 of them, and another label to the other 30
  • Search for the title, filter by one of the labels, and it will return fewer than 30 results

The reason is that the indexers index the title, content and comments, and return up to 50 search results based on those alone. Filtering by issue or PR, open or closed, labels, author, etc. happens afterwards, on those results only. Note that pagination as in #22704 does not solve this problem.
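To illustrate the mechanism, here is a minimal Go sketch of the first reproduction case. It is not Gitea's code: `Issue`, `searchIndexer`, `filterByType` and `buildRepo` are hypothetical names, and matching is simplified to exact title equality. It only shows how capping the results before filtering loses matches.

```go
package main

import "fmt"

// Issue is a simplified, hypothetical stand-in for an indexed issue or pull request.
type Issue struct {
	ID     int
	Title  string
	IsPull bool
}

// indexerLimit mirrors the cap of 50 results described above.
const indexerLimit = 50

// searchIndexer returns the IDs of at most limit items whose title equals the keyword.
// Like the indexers described above, it knows nothing about issue vs. PR, state or labels.
func searchIndexer(all []Issue, keyword string, limit int) []int {
	ids := []int{}
	for _, it := range all {
		if len(ids) == limit {
			break
		}
		if it.Title == keyword {
			ids = append(ids, it.ID)
		}
	}
	return ids
}

// filterByType runs afterwards and only sees the IDs the indexer returned.
func filterByType(all []Issue, ids []int, wantPull bool) []int {
	byID := make(map[int]Issue, len(all))
	for _, it := range all {
		byID[it.ID] = it
	}
	out := []int{}
	for _, id := range ids {
		if it, ok := byID[id]; ok && it.IsPull == wantPull {
			out = append(out, id)
		}
	}
	return out
}

// buildRepo creates 50 issues followed by 1 pull request, all with the same title.
func buildRepo() []Issue {
	repo := make([]Issue, 0, 51)
	for i := 1; i <= 50; i++ {
		repo = append(repo, Issue{ID: i, Title: "same title"})
	}
	repo = append(repo, Issue{ID: 51, Title: "same title", IsPull: true})
	return repo
}

func main() {
	repo := buildRepo()
	ids := searchIndexer(repo, "same title", indexerLimit)
	fmt.Println("pull requests found:", len(filterByType(repo, ids, true)))
	fmt.Println("issues found:", len(filterByType(repo, ids, false)))
}
```

In this sketch the pull request is always ranked last, so the pull request search finds nothing. With a real indexer the ranking decides which item is cut off, which is why the result is either 0 pull requests or 49 issues.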

The solution could be to make all indexers index and filter by all of these issue fields. That would require adding quite a bit of code to all indexers though, since every filtering option would need to be implemented in every indexer.
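A rough Go sketch of that idea, using the second reproduction case. Again, this is not Gitea's indexer interface: `IndexedIssue`, `SearchOptions` and `search` are hypothetical, and only a single label filter is modelled. The point is that the filter is applied inside the indexer, before the result limit.

```go
package main

import "fmt"

// IndexedIssue is a hypothetical index document that stores filter fields
// next to the searchable text.
type IndexedIssue struct {
	ID       int
	Title    string
	LabelIDs []int
}

// SearchOptions is a hypothetical query: a keyword plus a filter the indexer applies itself.
type SearchOptions struct {
	Keyword string
	LabelID int // 0 means "no label filter"
	Limit   int
}

// hasLabel reports whether the issue carries the given label ID.
func hasLabel(it IndexedIssue, labelID int) bool {
	for _, id := range it.LabelIDs {
		if id == labelID {
			return true
		}
	}
	return false
}

// search applies keyword and label filter together, and only then the limit,
// so the limit can no longer discard items that would have passed the filter.
func search(index []IndexedIssue, opts SearchOptions) []int {
	ids := []int{}
	for _, it := range index {
		if len(ids) == opts.Limit {
			break
		}
		if it.Title != opts.Keyword {
			continue
		}
		if opts.LabelID != 0 && !hasLabel(it, opts.LabelID) {
			continue
		}
		ids = append(ids, it.ID)
	}
	return ids
}

// buildIndex creates 60 issues with the same title: the first 30 have label 1,
// the other 30 have label 2.
func buildIndex() []IndexedIssue {
	index := make([]IndexedIssue, 0, 60)
	for i := 1; i <= 60; i++ {
		label := 1
		if i > 30 {
			label = 2
		}
		index = append(index, IndexedIssue{ID: i, Title: "same title", LabelIDs: []int{label}})
	}
	return index
}

func main() {
	ids := search(buildIndex(), SearchOptions{Keyword: "same title", LabelID: 2, Limit: 50})
	fmt.Println("issues with label 2 found:", len(ids))
}
```

Here all 30 issues with label 2 are found, even though 20 of them sit beyond the first 50 title matches. The cost is the one mentioned above: each real indexer backend would need its own equivalent of this filter for every supported option.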

Alternative solutions with worse performance would be to get an unlimited number of results from the indexers, or to compute a list of filter-matching issue IDs to give to the indexers.

Gitea Version

60e7963 (main)

Can you reproduce the bug on the Gitea demo site?

Yes

Log Gist

No response

Screenshots

No response

Git Version

No response

Operating System

No response

How are you running Gitea?

Own build, using meilisearch. But it should happen anywhere, with any indexer.

Database

None

GiteaMirror added the issue/critical and type/bug labels 2025-11-02 09:19:24 -06:00
Author
Owner

@lunny commented on GitHub (May 12, 2023):

I think this is an indexer design problem. The indexer only stores content, repo_id and issue_id. So a search with a keyword, labels and other conditions takes two steps: first, search for the keyword in the indexer to get issue IDs; then combine those issue IDs with the other conditions and pagination to query the database.

So I think we may need to store almost all of the content in the indexer to resolve the problem.


Reference: github-starred/gitea#10829