When generating responses, CPU and memory usage surge to around 70%, while GPU usage stays below 10%. #1576

Closed
opened 2025-11-11 14:47:41 -06:00 by GiteaMirror · 0 comments
Owner

Originally created by @ddxl123 on GitHub (Jul 21, 2024).

When generating responses, CPU and memory usage surge to around 70%, while GPU usage stays below 10%.
Docker Desktop 4.32.0
Ollama v0.2.7
Model: llama3:70b
Open WebUI v0.3.10
CUDA v12.5

PC:
CPU: Intel i9-14900K
GPU: RTX 4080 SUPER, NVIDIA driver version 32.0.15.5599
RAM: 64 GB
OS: Windows 11 x64
Browser: Edge v126.0.2592.113
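For context (not part of the original report): a back-of-envelope check of the hardware above already suggests why the GPU sits idle. Assuming the default ~Q4_0 quantization (about 4.5 bits per weight), llama3:70b is far larger than the card's 16 GB of VRAM, so Ollama offloads most layers to the CPU:

```python
def quantized_size_gb(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Approximate weight size of a quantized model in GB.

    bits_per_weight ~= 4.5 is a rough figure for Q4_0-style quantization.
    """
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

model_gb = quantized_size_gb(70)  # llama3:70b
vram_gb = 16                      # RTX 4080 SUPER
print(f"model ≈ {model_gb:.1f} GB vs {vram_gb} GB VRAM -> fits: {model_gb < vram_gb}")
# → model ≈ 39.4 GB vs 16 GB VRAM -> fits: False
```

With roughly 40 GB of weights against 16 GB of VRAM, low GPU utilization and high CPU/RAM load is the expected behavior rather than a bug, unless a much smaller model shows the same pattern.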

Docker Desktop container log:

2024-07-22 06:00:54 Loading WEBUI_SECRET_KEY from file, not provided as an environment variable.
2024-07-22 06:00:54 Generating WEBUI_SECRET_KEY
2024-07-22 06:00:54 Loading WEBUI_SECRET_KEY from .webui_secret_key
2024-07-22 06:00:54 CUDA is enabled, appending LD_LIBRARY_PATH to include torch/cudnn & cublas libraries.
2024-07-22 06:01:09 /app
2024-07-22 06:01:09 
2024-07-22 06:01:09   ___                    __        __   _     _   _ ___ 
2024-07-22 06:01:09  / _ \ _ __   ___ _ __   \ \      / /__| |__ | | | |_ _|
2024-07-22 06:01:09 | | | | '_ \ / _ \ '_ \   \ \ /\ / / _ \ '_ \| | | || | 
2024-07-22 06:01:09 | |_| | |_) |  __/ | | |   \ V  V /  __/ |_) | |_| || | 
2024-07-22 06:01:09  \___/| .__/ \___|_| |_|    \_/\_/ \___|_.__/ \___/|___|
2024-07-22 06:01:09       |_|                                               
2024-07-22 06:01:09 
2024-07-22 06:01:09       
2024-07-22 06:01:09 v0.3.10 - building the best open-source AI user interface.
2024-07-22 06:01:09 
2024-07-22 06:01:09 https://github.com/open-webui/open-webui
2024-07-22 06:01:01 USER_AGENT environment variable not set, consider setting it to identify your requests.
2024-07-22 06:01:08 INFO:     Started server process [1]
2024-07-22 06:01:08 INFO:     Waiting for application startup.
2024-07-22 06:01:09 INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
2024-07-22 06:01:09 INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
2024-07-22 06:01:09 INFO:     Application startup complete.
2024-07-22 06:01:09 INFO:     Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
2024-07-22 06:01:09 
2024-07-22 06:01:24 INFO  [apps.openai.main] get_all_models()
2024-07-22 06:01:24 INFO  [apps.ollama.main] get_all_models()
2024-07-22 06:01:24 INFO:     172.17.0.1:52334 - "GET /static/splash.png HTTP/1.1" 200 OK
2024-07-22 06:01:24 INFO:     127.0.0.1:52592 - "GET /health HTTP/1.1" 200 OK
2024-07-22 06:01:24 INFO:     172.17.0.1:52334 - "GET /manifest.json HTTP/1.1" 200 OK
2024-07-22 06:01:24 INFO:     172.17.0.1:52334 - "GET /api/config HTTP/1.1" 200 OK
2024-07-22 06:01:24 INFO:     172.17.0.1:52334 - "GET /ws/socket.io/?EIO=4&transport=polling&t=P3NK5Qx HTTP/1.1" 200 OK
2024-07-22 06:01:24 INFO:     172.17.0.1:52344 - "GET /api/v1/auths/ HTTP/1.1" 401 Unauthorized
2024-07-22 06:01:25 INFO:     ('172.17.0.1', 40278) - "WebSocket /ws/socket.io/?EIO=4&transport=websocket&sid=C5u-BpFM8ibdiddbAAAA" [accepted]
2024-07-22 06:01:25 INFO:     connection open
2024-07-22 06:01:25 INFO:     172.17.0.1:52344 - "POST /ws/socket.io/?EIO=4&transport=polling&t=P3NK5Rt&sid=C5u-BpFM8ibdiddbAAAA HTTP/1.1" 200 OK
2024-07-22 06:01:25 INFO:     172.17.0.1:52334 - "GET /ws/socket.io/?EIO=4&transport=polling&t=P3NK5Ru&sid=C5u-BpFM8ibdiddbAAAA HTTP/1.1" 200 OK
2024-07-22 06:01:25 INFO:     172.17.0.1:52334 - "GET /static/favicon.png HTTP/1.1" 200 OK
2024-07-22 06:01:26 INFO  [apps.webui.models.auths] authenticate_user: 1033839760@qq.com
2024-07-22 06:01:26 INFO:     172.17.0.1:52334 - "POST /api/v1/auths/signin HTTP/1.1" 200 OK
2024-07-22 06:01:26 user-join AlgSpev31drlO-w3AAAB {'auth': {'token': 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpZCI6Ijc4MmI3ZWM5LWM2NmEtNDNlYy1hNmQ4LTg2YmI2M2Y1MjYyOSJ9.QRUbWeJDLAwZQdinCT32LWSqgWnWSEImWOvsFZHdWtY'}}
2024-07-22 06:01:26 user long(782b7ec9-c66a-43ec-a6d8-86bb63f52629) connected with session ID AlgSpev31drlO-w3AAAB
2024-07-22 06:01:26 INFO:     172.17.0.1:52334 - "GET /api/changelog HTTP/1.1" 200 OK
2024-07-22 06:01:26 INFO:     172.17.0.1:52344 - "GET /api/v1/users/user/settings HTTP/1.1" 200 OK
2024-07-22 06:01:26 INFO  [apps.openai.main] get_all_models()
2024-07-22 06:01:26 INFO  [apps.ollama.main] get_all_models()
2024-07-22 06:01:26 INFO:     172.17.0.1:52334 - "GET /api/v1/prompts/ HTTP/1.1" 200 OK
2024-07-22 06:01:26 INFO:     172.17.0.1:52344 - "GET /api/models HTTP/1.1" 200 OK
2024-07-22 06:01:26 INFO:     172.17.0.1:40284 - "GET /api/v1/documents/ HTTP/1.1" 200 OK
2024-07-22 06:01:26 INFO:     172.17.0.1:40302 - "GET /api/v1/configs/banners HTTP/1.1" 200 OK
2024-07-22 06:01:26 INFO:     172.17.0.1:40292 - "GET /api/v1/functions/ HTTP/1.1" 200 OK
2024-07-22 06:01:26 INFO:     172.17.0.1:40288 - "GET /api/v1/tools/ HTTP/1.1" 200 OK
2024-07-22 06:01:26 INFO:     172.17.0.1:52334 - "GET /api/v1/chats/tags/all HTTP/1.1" 200 OK
2024-07-22 06:01:26 INFO:     172.17.0.1:52334 - "POST /api/v1/chats/tags HTTP/1.1" 200 OK
2024-07-22 06:01:26 INFO:     172.17.0.1:40288 - "GET /ollama/api/version HTTP/1.1" 200 OK
2024-07-22 06:01:26 INFO:     172.17.0.1:40292 - "GET /api/v1/users/user/settings HTTP/1.1" 200 OK
2024-07-22 06:01:26 INFO:     172.17.0.1:40292 - "GET /api/v1/chats/ HTTP/1.1" 200 OK
2024-07-22 06:01:35 INFO:     172.17.0.1:54990 - "POST /api/v1/chats/new HTTP/1.1" 200 OK
2024-07-22 06:01:35 INFO:     172.17.0.1:54990 - "GET /api/v1/chats/ HTTP/1.1" 200 OK
2024-07-22 06:01:35 INFO  [apps.ollama.main] url: http://host.docker.internal:11434
2024-07-22 06:01:54 INFO:     127.0.0.1:60406 - "GET /health HTTP/1.1" 200 OK
2024-07-22 06:02:15 INFO:     172.17.0.1:54990 - "POST /ollama/api/chat HTTP/1.1" 200 OK
2024-07-22 06:02:24 INFO:     127.0.0.1:37392 - "GET /health HTTP/1.1" 200 OK
2024-07-22 06:02:47 INFO:     172.17.0.1:54990 - "POST /api/chat/completed HTTP/1.1" 200 OK
2024-07-22 06:02:47 INFO:     172.17.0.1:54990 - "POST /api/v1/chats/4601439b-225d-4c58-abf2-4570ed59a740 HTTP/1.1" 200 OK
2024-07-22 06:02:47 INFO:     172.17.0.1:54990 - "GET /api/v1/chats/ HTTP/1.1" 200 OK
2024-07-22 06:02:47 INFO:     172.17.0.1:54990 - "POST /api/v1/chats/4601439b-225d-4c58-abf2-4570ed59a740 HTTP/1.1" 200 OK
2024-07-22 06:02:47 INFO:     172.17.0.1:54990 - "GET /api/v1/chats/ HTTP/1.1" 200 OK
2024-07-22 06:02:47 INFO  [apps.ollama.main] url: http://host.docker.internal:11434
2024-07-22 06:02:53 generate_title
2024-07-22 06:02:53 llama3:70b
2024-07-22 06:02:53 generate_ollama_chat_completion
2024-07-22 06:02:53 INFO:     172.17.0.1:54990 - "POST /api/task/title/completions HTTP/1.1" 200 OK
2024-07-22 06:02:53 INFO:     172.17.0.1:54990 - "POST /api/v1/chats/4601439b-225d-4c58-abf2-4570ed59a740 HTTP/1.1" 200 OK
2024-07-22 06:02:53 INFO:     172.17.0.1:54990 - "GET /api/v1/chats/ HTTP/1.1" 200 OK
2024-07-22 06:02:53 INFO:     172.17.0.1:54990 - "GET /api/v1/chats/ HTTP/1.1" 200 OK
2024-07-22 06:02:54 INFO:     127.0.0.1:53210 - "GET /health HTTP/1.1" 200 OK
2024-07-22 06:03:24 ['AlgSpev31drlO-w3AAAB']
2024-07-22 06:03:24 INFO:     127.0.0.1:37912 - "GET /health HTTP/1.1" 200 OK
2024-07-22 06:03:54 INFO:     127.0.0.1:46856 - "GET /health HTTP/1.1" 200 OK
2024-07-22 06:04:24 INFO:     127.0.0.1:41006 - "GET /health HTTP/1.1" 200 OK
2024-07-22 06:04:54 INFO:     127.0.0.1:50412 - "GET /health HTTP/1.1" 200 OK

Running `nvidia-smi` inside the Docker Desktop container:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.52.01              Driver Version: 555.99         CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4080 ...    On  |   00000000:01:00.0  On |                  N/A |
| 35%   34C    P8             21W /  320W |    2025MiB /  16376MiB |      2%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A         1      C   /python3.11                                 N/A      |
+-----------------------------------------------------------------------------------------+
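One way to confirm how much of the model actually landed in VRAM is Ollama's `/api/ps` endpoint, whose entries report `size` and `size_vram` per loaded model (the endpoint and host below assume the default `localhost:11434`; this is a diagnostic sketch, not part of the original report):

```python
import json
from urllib import request

def gpu_share(entry: dict) -> float:
    """Fraction of a loaded model's weights resident in VRAM,
    given one model entry from Ollama's /api/ps response."""
    size = entry.get("size", 0)
    return entry.get("size_vram", 0) / size if size else 0.0

if __name__ == "__main__":
    try:
        # Default Ollama port; adjust if your server listens elsewhere.
        with request.urlopen("http://localhost:11434/api/ps", timeout=5) as resp:
            for m in json.load(resp).get("models", []):
                print(f"{m['name']}: {gpu_share(m):.0%} of weights in VRAM")
    except OSError as e:
        print(f"could not reach Ollama: {e}")
```

A share well below 100% (equivalently, `ollama ps` showing a CPU/GPU split) would match the symptoms: most layers running on the CPU, GPU utilization in single digits.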
Reference: github-starred/open-webui#1576