Code search needs enhancement #1293

Closed
opened 2025-11-02 03:55:44 -06:00 by GiteaMirror · 12 comments

Originally created by @ofnhkb1 on GitHub (Nov 29, 2017).

Gitea version: 1.3 rc2

When I enable the code index and start Gitea, the process starts normally, but the web interface does not come up. It stays in the index creation phase.

Suggestions:

  • Start the web interface first, then create the index.
  • Output the current repo name during index creation.

GiteaMirror added the type/enhancement label 2025-11-02 03:55:44 -06:00

@ofnhkb1 commented on GitHub (Nov 29, 2017):

Total size of the repos: 1.5 GB
Number of repos: 78


@lafriks commented on GitHub (Nov 29, 2017):

The code search index takes a lot of disk space (about 5x the repository size), which is why it is currently disabled by default. Indexing can also take a long time.


@ofnhkb1 commented on GitHub (Nov 29, 2017):

My idea is to start the web interface first and then start a new process or thread that indexes the repos one by one, instead of blocking web startup as it does now.
While the index for a repo is being generated, push operations to that repo could be paused.
I also have another idea: add a setting that determines which repos get an index, because not all repos need one.
Thank you
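
A minimal Go sketch of the "start the web first, index in the background" idea described above. It is not Gitea's actual implementation: `indexInBackground` and the `index` and `logf` callbacks are hypothetical names used only for illustration. The indexing loop runs in its own goroutine and reports the current repo name, while the caller is free to start serving requests right away.

```go
package main

import "fmt"

// indexInBackground indexes the given repos one at a time in a separate
// goroutine and returns immediately. The returned channel is closed once
// every repo has been processed. Both callbacks are hypothetical stand-ins:
// index does the real indexing work, logf reports progress.
func indexInBackground(repos []string, index func(repo string), logf func(msg string)) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		for _, repo := range repos {
			// Report the current repo name, as suggested in the issue.
			logf("indexing repository: " + repo)
			index(repo)
		}
	}()
	return done
}

func main() {
	repos := []string{"org/alpha", "org/beta"}
	done := indexInBackground(
		repos,
		func(repo string) { /* real indexing work would go here */ },
		func(msg string) { fmt.Println(msg) },
	)

	// The web server would be started here, without waiting for the index.
	fmt.Println("web interface is starting")

	// Only this demo waits; a real server would simply keep running.
	<-done
}
```

With this shape, a slow initial index delays search results but not the availability of the web interface.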


@lunny commented on GitHub (Nov 29, 2017):

Currently, we have a global config option to enable or disable code search. A better improvement would be for every repository to have its own option to enable or disable code search.


@lunny commented on GitHub (Nov 29, 2017):

It could even be configurable which file types are indexed.
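
A rough Go sketch of what such a per-repository setting could look like. Everything here is an assumption made for illustration: `RepoIndexConfig`, its fields, and `ShouldIndex` are hypothetical and do not correspond to existing Gitea options.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// RepoIndexConfig is a hypothetical per-repository code search setting.
type RepoIndexConfig struct {
	// Enabled turns code search indexing on or off for this repository.
	Enabled bool
	// Extensions lists the file extensions to index, lower-case and with
	// a leading dot (for example ".go"). An empty list means all files.
	Extensions []string
}

// ShouldIndex reports whether the file at path should be indexed.
func (c RepoIndexConfig) ShouldIndex(path string) bool {
	if !c.Enabled {
		return false
	}
	if len(c.Extensions) == 0 {
		return true
	}
	ext := strings.ToLower(filepath.Ext(path))
	for _, allowed := range c.Extensions {
		if allowed == ext {
			return true
		}
	}
	return false
}

func main() {
	cfg := RepoIndexConfig{Enabled: true, Extensions: []string{".go", ".java"}}
	for _, path := range []string{"cmd/web.go", "src/Main.JAVA", "assets/logo.png"} {
		fmt.Println(path, cfg.ShouldIndex(path))
	}
}
```

Filtering by extension before indexing would also reduce index size, since binary and generated files could be skipped.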


@laoshaw commented on GitHub (Jan 7, 2018):

Using the newest Gitea, I still do not see how the indexing works. I can only search repositories and cannot search the code at all. How can I enable code search?


@sapk commented on GitHub (Jan 7, 2018):

@laoshaw You need to activate it via the app.ini configuration file. See https://github.com/go-gitea/gitea/blob/master/custom/conf/app.ini.sample#L211. Those options are currently missing from the documentation at https://docs.gitea.io/en-us/config-cheat-sheet/ and should be added in #3324.


@ghost commented on GitHub (Jan 16, 2018):

Just tested repo indexing. It works well, but there are 2 problems:

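  • It doesn't show the searched text for JSP and JRXML files, even though the file lines seem to be right (an example follows):
    [screenshot: https://user-images.githubusercontent.com/6648129/34980746-b5a7cb98-faa5-11e7-91e8-0391041409dc.png]
  • Is it possible to reduce the disk size of the repo index file? I get a size of 131MB for just 1 repo.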

@lafriks commented on GitHub (Jan 16, 2018):

It might be worth rewriting it to use the Google codesearch code, as Etsy/hound does, instead of the bleve index. With codesearch, the index size is about 1/3 of the code file size.


@ghost commented on GitHub (Jan 16, 2018):

The code of my project is about 41.6MB, so I can confirm that the index occupies about 3 times the space of the code.


@thehowl commented on GitHub (Jan 16, 2018):

I think this should be closed. #3366 addresses the issue posted by the OP - it will do the initial indexing alongside the web frontend.

For the other issues posted in here:

  • The space consumption problem is known and pretty clearly documented:
    [screenshot: https://user-images.githubusercontent.com/4681308/34992108-54b6bc6e-facc-11e7-88db-16c35be44c90.png]
    For now there is no workaround: the problem is caused by our indexer itself (bleve). We might one day change the indexing library, as suggested by @lafriks.
  • @giudon your JSP/JRXML issue seems interesting. Would you mind creating an issue about it, possibly with a link to your repository, so that we can clone it and reproduce the problem, if you are able to reproduce it?

@lafriks commented on GitHub (Jan 16, 2018):

Closing as resolved by #3366

Reference: github-starred/gitea#1293