[GH-ISSUE #8] Documentation for Environment Variables? How to change API Host? #27428

Closed
opened 2026-04-25 02:06:47 -05:00 by GiteaMirror · 5 comments

Originally created by @coolaj86 on GitHub (Oct 21, 2023).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/8

Originally assigned to: @tjbck on GitHub.

How do I change the API host from `https://localhost:11434` to something else, such as `https://api.example.com` or just `/api` on the same server?

What other environment variables can I set before running `npm run build`?


@tjbck commented on GitHub (Oct 21, 2023):

Hi,

Currently, our web UI supports the `OLLAMA_ENDPOINT` variable in the format `OLLAMA_ENDPOINT="http://[insert your ollama URL]"`, where you must include the port number as well. For instance: `OLLAMA_ENDPOINT="http://192.168.4.1:11434/"`.

If you have any further questions or need assistance, please don't hesitate to ask.


@coolaj86 commented on GitHub (Oct 21, 2023):

I had it running with the ENVs like this, but it didn't work:

```sh
SERVER_ENDPOINT='http://localhost:11434' npm run dev
```

I think that's because I chose `SERVER_ENDPOINT`, which was only hardcoded.

Do you know how to change the build step to make use of ENVs? It can't be through `process.env.*` since that doesn't exist at build time... unless it does?

Maybe `PageServerLoad` *can* use `process.env.*` during the build phase?
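For what it's worth, the distinction can be illustrated with a minimal sketch (assuming a SvelteKit `+page.server.ts`; the `PageServerLoad` type import and SvelteKit wiring are omitted so the snippet stands alone, and `OLLAMA_API_ENDPOINT` is used only as an illustrative variable name):

```typescript
// Sketch only, not Open WebUI's actual code. A SvelteKit server load
// function executes in Node at request time, so process.env IS readable
// there; it is only code bundled for the browser that lacks it. (For
// values baked in at build time, Vite exposes import.meta.env instead.)
export function load(): { endpoint: string } {
  // Fall back to the default Ollama address when the variable is unset.
  return {
    endpoint: process.env.OLLAMA_API_ENDPOINT ?? 'http://localhost:11434'
  };
}
```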


@tjbck commented on GitHub (Oct 22, 2023):

With the latest Pull Request #10 commit, you can use the following command:

```sh
OLLAMA_API_ENDPOINT='http://localhost:11434' npm run dev
```

Let me know if you need further assistance.


@ralyodio commented on GitHub (Dec 18, 2023):

How do I change the PORT of the web UI?


@jsboige commented on GitHub (Mar 3, 2026):

Phase 2 Update: Local Qwen Model + Async Fixes

Changes (commit `0b731a7e2`)

  • Model switch: Default bot model changed from `OpenAI.gpt-4o-mini` to `Local.qwen3.5-35b-a3b-fast` (local Qwen 3.5 35B)

    • Free inference (electricity only) vs. paid API
    • Better quality for French responses
    • ~26s response time on local vLLM (3x RTX 4090)
  • Async timeout fixes:

    • `aiohttp.ClientTimeout(total=300, sock_read=180)` — local models need longer for the initial response
    • Message processing moved to `asyncio.create_task` — prevents Socket.IO event loop blocking
    • Typing indicator made fire-and-forget to avoid potential deadlocks
  • Model visibility fix (all 7 tenants):

    • Set `base_model_id` on `expert-analyste`, `redacteur-technique`, `vision-expert`, `Local.qwen3.5-35b-a3b-fast`
    • These custom models are now visible to all non-admin users (was broken because `base_model_id=None`)
  • Upstream sync: Merged v0.8.8 from upstream (504 commits, including Open Terminal fixes)
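The async fixes above can be sketched with the standard library alone. This is a hypothetical stand-in, not the bot's actual code: `call_model` simulates the slow inference request, which in the real bot would go through aiohttp with `aiohttp.ClientTimeout(total=300, sock_read=180)`; the point is that the event handler schedules the work with `asyncio.create_task` and returns immediately, so the Socket.IO event loop is never blocked.

```python
import asyncio

async def call_model(prompt: str, replies: list[str]) -> None:
    # Stand-in for the aiohttp request to the local vLLM server.
    await asyncio.sleep(0)
    replies.append(f"echo: {prompt}")

def on_channel_message(prompt: str, replies: list[str]) -> asyncio.Task:
    # Fire-and-forget: hand the work to the running loop and return at once,
    # instead of awaiting the (possibly ~26s) model call inside the handler.
    return asyncio.create_task(call_model(prompt, replies))

async def demo() -> list[str]:
    replies: list[str] = []
    task = on_channel_message("bonjour", replies)
    # The handler has already returned, before the model replied...
    assert replies == []
    await task  # ...and the reply arrives once the scheduled task completes.
    return replies
```

One caveat with this pattern: the caller should keep a reference to the returned task (or add a done-callback) so it is not garbage-collected mid-flight.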

Verified

✓ FAQ bot responds using Local.qwen3.5-35b-a3b-fast (26-35s response time)
✓ Replies posted as thread replies in channels
✓ <think> tags stripped from model output
✓ Typing indicator sent without blocking
✓ Bot reconnects automatically after disconnection
✓ Model visible to bot user (26 models total)
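The `<think>`-tag stripping verified above might look like the following minimal sketch (the regex and function name are illustrative, not the project's actual code): remove `<think>...</think>` reasoning blocks, as emitted by Qwen-family models, before the reply is posted to the channel.

```python
import re

# DOTALL lets the pattern span multi-line reasoning blocks; the
# non-greedy .*? stops at the first closing tag.
THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)

def strip_think(reply: str) -> str:
    # Drop the reasoning block, then trim leftover surrounding whitespace.
    return THINK_RE.sub("", reply).strip()
```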

Remaining for Phase 3

  • Multi-tenant deployment (bot per tenant or multi-connection)
  • RAG integration with knowledge bases
  • Production monitoring
Reference: github-starred/open-webui#27428