[GH-ISSUE #16887] feat: add env var support for pgvector hnsw index type #56753

Closed
opened 2026-05-05 20:03:25 -05:00 by GiteaMirror · 1 comment

Originally created by @dlamoris on GitHub (Aug 25, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/16887

Check Existing Issues

  • I have searched the existing issues and discussions.

Problem Description

Currently the pgvector db init code is hardcoded to use the IVFFLAT index type:

            # Create an index on the vector column if it doesn't exist
            self.session.execute(
                text(
                    "CREATE INDEX IF NOT EXISTS idx_document_chunk_vector "
                    "ON document_chunk USING ivfflat (vector vector_cosine_ops) WITH (lists = 100);"
                )
            )

The HNSW index type has been available since pgvector 0.5.0 and seems to be a better choice (ref https://stormatics.tech/blogs/understanding-indexes-in-pgvector and https://aws.amazon.com/blogs/database/optimize-generative-ai-applications-with-pgvector-indexing-a-deep-dive-into-ivfflat-and-hnsw-techniques/), so the index type should be configurable via an env var.

TL;DR: an ivfflat index is best built once the table already contains data, and it needs to be reindexed periodically as new data arrives, while an hnsw index can be created at any time and gives more accurate results (with the drawback of using more memory).

Desired Solution you'd like

Add env vars to the pgvector db options to configure the vector index type and any related index options (like m and ef_construction for hnsw).
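A minimal sketch of what this could look like, not the project's implementation. The env var names (`PGVECTOR_INDEX_METHOD`, `PGVECTOR_HNSW_M`, `PGVECTOR_HNSW_EF_CONSTRUCTION`, `PGVECTOR_IVFFLAT_LISTS`) and the helper `build_vector_index_sql` are hypothetical. The defaults keep the current behavior (ivfflat with lists = 100) and use pgvector's documented HNSW defaults (m = 16, ef_construction = 64).

```python
import os

# Hypothetical env var names; none of them are existing Open WebUI settings.
ALLOWED_METHODS = {"ivfflat", "hnsw"}


def _int_env(env, name, default):
    """Read a positive integer from env, falling back to default when unset."""
    raw = env.get(name, "").strip()
    if not raw:
        return default
    value = int(raw)  # raises ValueError on non-numeric input
    if value <= 0:
        raise ValueError(f"{name} must be a positive integer, got {value}")
    return value


def build_vector_index_sql(env=None):
    """Build the CREATE INDEX statement from env vars (assumed names)."""
    env = os.environ if env is None else env
    method = env.get("PGVECTOR_INDEX_METHOD", "ivfflat").strip().lower()
    # Whitelist the method so nothing from the environment is
    # interpolated into the SQL string unchecked.
    if method not in ALLOWED_METHODS:
        raise ValueError(f"Unsupported index method: {method!r}")

    if method == "hnsw":
        m = _int_env(env, "PGVECTOR_HNSW_M", 16)
        ef = _int_env(env, "PGVECTOR_HNSW_EF_CONSTRUCTION", 64)
        options = f"m = {m}, ef_construction = {ef}"
    else:
        lists = _int_env(env, "PGVECTOR_IVFFLAT_LISTS", 100)
        options = f"lists = {lists}"

    return (
        "CREATE INDEX IF NOT EXISTS idx_document_chunk_vector "
        f"ON document_chunk USING {method} (vector vector_cosine_ops) "
        f"WITH ({options});"
    )


if __name__ == "__main__":
    print(build_vector_index_sql({"PGVECTOR_INDEX_METHOD": "hnsw"}))
```

The returned string could be passed to the existing `self.session.execute(text(...))` call. Because the statement uses `IF NOT EXISTS` with a fixed index name, changing the env var on a database that already has the index would not rebuild it; the old index would have to be dropped first.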

Alternatives Considered

Do nothing and let the deployer delete, recreate, and maintain the indexes directly in Postgres.
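For comparison, the manual route would roughly amount to running statements like the ones below by hand. This is only an illustration: `manual_index_statements` is a hypothetical helper that returns SQL strings, and the index and table names are taken from the snippet in the problem description.

```python
INDEX_NAME = "idx_document_chunk_vector"


def manual_index_statements(target="hnsw"):
    """Return the SQL a deployer might run by hand (illustrative only)."""
    if target == "hnsw":
        # Replace the hardcoded ivfflat index with an hnsw one,
        # using pgvector's default m and ef_construction.
        return [
            f"DROP INDEX IF EXISTS {INDEX_NAME};",
            f"CREATE INDEX {INDEX_NAME} ON document_chunk "
            "USING hnsw (vector vector_cosine_ops);",
        ]
    if target == "ivfflat":
        # Keep ivfflat but rebuild it after new data has been loaded.
        return [f"REINDEX INDEX {INDEX_NAME};"]
    raise ValueError(f"Unsupported index method: {target!r}")


if __name__ == "__main__":
    for statement in manual_index_statements("hnsw"):
        print(statement)
```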

Additional Context

No response


@tjbck commented on GitHub (Aug 25, 2025):

PR welcome!


Reference: github-starred/open-webui#56753