[GH-ISSUE #16886] issue: STT preview of the transcript not working #33615

Closed
opened 2026-04-25 07:31:06 -05:00 by GiteaMirror · 8 comments
Owner

Originally created by @strizi9 on GitHub (Aug 25, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/16886

Check Existing Issues

  • I have searched the existing issues and discussions.
  • I am using the latest version of Open WebUI.

Installation Method

Pip Install

Open WebUI Version

v0.6.25

Ollama Version (if applicable)

No response

Operating System

Debian 12

Browser (if applicable)

No response

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

You upload the audio file in the chat window and when you click on the audio file, a transcript is displayed.

Actual Behavior

Since v0.6.23, with both the OpenAI and the Local STT engine, the preview shows no content.
However, once the file is sent to a model, the transcript is there. It is only missing from the preview in the first step.

Steps to Reproduce

  1. Log in
  2. Start a new chat
  3. Upload an audio file
  4. Click on the uploaded file -> You see an audio player and, below it, "No content"
  5. Click the "Send message" button
  6. Click on the file under the response -> The content is filled in as it should be.

Logs & Screenshots

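Image: https://github.com/user-attachments/assets/7b654d7e-ac95-4a8c-a287-c34c84699d66
Image: https://github.com/user-attachments/assets/a69371e5-7721-4d8f-ba8d-b6e9693517bd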

Additional Information

No response

GiteaMirror added the bug label 2026-04-25 07:31:06 -05:00

@tjbck commented on GitHub (Aug 25, 2025):

Should be addressed in dev, testing wanted here!


@rgaricano commented on GitHub (Aug 25, 2025):

@tjbck,
works OK with commit 630cea105e,
also tested with xls, csv, json & txt

Closed PR https://github.com/open-webui/open-webui/pull/16874 in favour of 630cea105e


@strizi9 commented on GitHub (Aug 26, 2025):

It also works perfectly for me with the dev branch.

Thank you very much!


@GlisseManTV commented on GitHub (Aug 27, 2025):

Hi!

Did you all also test this in the Notes section?
Even though the model seems to retrieve the content in the chat section, in Notes it does not work.
Does the mentioned commit solve it in Notes too?


@rgaricano commented on GitHub (Aug 28, 2025):

No. In Notes, audio files don't show the content, and the File Modal doesn't open them as audio; it just provides a download link.
Recordings are not transcribed either, and they also open in the FileModal as a download link.
But the embedding is done and it appears in the logs.


@GlisseManTV commented on GitHub (Aug 28, 2025):

On my side (PGSQL DB), in Notes, even though I can see the transcription in the logs, the model doesn't retrieve the content.

2025-08-28 06:49:18.869 | INFO | open_webui.routers.files:upload_file_handler:159 - file.content_type: audio/wav
2025-08-28 06:49:18.884 | INFO | uvicorn.protocols.http.httptools_impl:send:476 - PUBLICIP:0 - "POST /api/v1/files/ HTTP/1.1" 200
2025-08-28 06:49:18.902 | INFO | open_webui.routers.audio:transcribe:824 - transcribe: /app/backend/data/uploads/307f163d-73aa-4df3-920c-5c96fe3ebfa8_839f091c-abd3-4764-bbcb-eb0bd556ee52.wav {'language': 'en'}
2025-08-28 06:49:18.915 | INFO | uvicorn.protocols.http.httptools_impl:send:476 - PUBLICIP:0 - "GET /api/v1/files/307f163d-73aa-4df3-920c-5c96fe3ebfa8/process/status?stream=true HTTP/1.1" 200
2025-08-28 06:49:19.214 | DEBUG | pydub.logging_utils:log_conversion:9 - subprocess.call(['ffmpeg', '-y', '-f', 'wav', '-i', '/tmp/tmp8zrt_o84', '-f', 'mp3', '/tmp/tmp_9h_wmam'])
2025-08-28 06:49:19.416 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b'ffmpeg version 5.1.6-0+deb12u1 Copyright (c) 2000-2024 the FFmpeg developers'
2025-08-28 06:49:19.416 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b' built with gcc 12 (Debian 12.2.0-14)'
2025-08-28 06:49:19.416 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b' configuration: --prefix=/usr --extra-version=0+deb12u1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libglslang --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librist --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --disable-sndio --enable-libjxl --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-libplacebo --enable-librav1e --enable-shared'
2025-08-28 06:49:19.416 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b' libavutil 57. 28.100 / 57. 28.100'
2025-08-28 06:49:19.416 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b' libavcodec 59. 37.100 / 59. 37.100'
2025-08-28 06:49:19.417 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b' libavformat 59. 27.100 / 59. 27.100'
2025-08-28 06:49:19.417 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b' libavdevice 59. 7.100 / 59. 7.100'
2025-08-28 06:49:19.417 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b' libavfilter 8. 44.100 / 8. 44.100'
2025-08-28 06:49:19.417 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b' libswscale 6. 7.100 / 6. 7.100'
2025-08-28 06:49:19.417 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b' libswresample 4. 7.100 / 4. 7.100'
2025-08-28 06:49:19.418 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b' libpostproc 56. 6.100 / 56. 6.100'
2025-08-28 06:49:19.418 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b'Guessed Channel Layout for Input Stream #0.0 : mono'
2025-08-28 06:49:19.418 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b"Input #0, wav, from '/tmp/tmp8zrt_o84':"
2025-08-28 06:49:19.418 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b' Duration: 00:00:06.40, bitrate: 384 kb/s'
2025-08-28 06:49:19.418 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b' Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 24000 Hz, mono, s16, 384 kb/s'
2025-08-28 06:49:19.419 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b'Stream mapping:'
2025-08-28 06:49:19.419 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b' Stream #0:0 -> #0:0 (pcm_s16le (native) -> mp3 (libmp3lame))'
2025-08-28 06:49:19.419 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b'Press [q] to stop, [?] for help'
2025-08-28 06:49:19.419 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b"Output #0, mp3, to '/tmp/tmp_9h_wmam':"
2025-08-28 06:49:19.419 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b' Metadata:'
2025-08-28 06:49:19.419 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b' TSSE : Lavf59.27.100'
2025-08-28 06:49:19.419 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b' Stream #0:0: Audio: mp3, 24000 Hz, mono, s16p'
2025-08-28 06:49:19.419 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b' Metadata:'
2025-08-28 06:49:19.419 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b' encoder : Lavc59.37.100 libmp3lame'
2025-08-28 06:49:19.419 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b'size= 0kB time=-00:00:00.02 bitrate=N/A speed=N/A'
2025-08-28 06:49:19.420 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b'size= 25kB time=00:00:06.40 bitrate= 32.5kbits/s speed= 118x'
2025-08-28 06:49:19.420 | DEBUG | pydub.logging_utils:log_subprocess_output:14 - subprocess output: b'video:0kB audio:25kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.917751%'
2025-08-28 06:49:19.420 | INFO | open_webui.routers.audio:convert_audio_to_mp3:116 - Converted /app/backend/data/uploads/307f163d-73aa-4df3-920c-5c96fe3ebfa8_839f091c-abd3-4764-bbcb-eb0bd556ee52.wav to /app/backend/data/uploads/307f163d-73aa-4df3-920c-5c96fe3ebfa8_839f091c-abd3-4764-bbcb-eb0bd556ee52.mp3
Chunk paths: ['/app/backend/data/uploads/307f163d-73aa-4df3-920c-5c96fe3ebfa8_839f091c-abd3-4764-bbcb-eb0bd556ee52.mp3']
2025-08-28 06:49:19.422 | DEBUG | urllib3.connectionpool:_new_conn:241 - Starting new HTTP connection (1): 192.168.0.60:9200
2025-08-28 06:49:20.451 | DEBUG | urllib3.connectionpool:_make_request:544 - http://192.168.0.60:9200/ "POST /audio/transcriptions HTTP/1.1" 200 114
2025-08-28 06:49:20.457 | DEBUG | open_webui.routers.retrieval:process_file:1500 - text_content: This is a sample voice test in English. The quick brown fox jumps over the lazy dog on
2025-08-28 06:49:20.473 | INFO | open_webui.routers.retrieval:save_docs_to_vector_db:1182 - save_docs_to_vector_db: document 839f091c-abd3-4764-bbcb-eb0bd556ee52.wav file-307f163d-73aa-4df3-920c-5c96fe3ebfa8
2025-08-28 06:49:20.484 | INFO | open_webui.routers.retrieval:save_docs_to_vector_db:1298 - adding to collection file-307f163d-73aa-4df3-920c-5c96fe3ebfa8
2025-08-28 06:49:20.484 | DEBUG | open_webui.retrieval.utils:generate_ollama_batch_embeddings:845 - generate_ollama_batch_embeddings:model nomic-embed-text:v1.5 batch size: 1
2025-08-28 06:49:20.486 | DEBUG | urllib3.connectionpool:_new_conn:241 - Starting new HTTP connection (1): 192.168.0.60:11434
2025-08-28 06:49:20.549 | DEBUG | urllib3.connectionpool:_make_request:544 - http://192.168.0.60:11434/ "POST /api/embed HTTP/1.1" 200 None
2025-08-28 06:49:20.608 | DEBUG | chromadb.config:start:337 - Starting component PersistentLocalHnswSegment
2025-08-28 06:49:21.169 | INFO | uvicorn.protocols.http.httptools_impl:send:476 - PUBLICIP:0 - "POST /api/v1/notes/2335a3da-51d0-4415-b545-220d2cfcfdb3/update HTTP/1.1" 200
2025-08-28 06:49:36.056 | INFO | uvicorn.protocols.http.httptools_impl:send:476 - PUBLICIP:0 - "GET /api/v1/files/307f163d-73aa-4df3-920c-5c96fe3ebfa8/content HTTP/1.1" 200
2025-08-28 06:49:37.415 | INFO | uvicorn.protocols.http.httptools_impl:send:476 - PUBLICIP:0 - "GET /_app/version.json HTTP/1.1" 200
2025-08-28 06:50:18.090 | DEBUG | open_webui.utils.middleware:process_chat_payload:739 - form_data: {'model': 'hf.co/unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF:Q3_K_M', 'stream': True, 'messages': [{'role': 'system', 'content': "Enhance existing notes using additional context provided from audio transcription or uploaded file content in the content's primary language. Your task is to make the notes more useful and comprehensive by incorporating relevant information from the provided context.\n\nInput will be provided within <notes> and <context> XML tags, providing a structure for the existing notes and context respectively.\n\n# Output Format\n\nProvide the enhanced notes in markdown format. Use markdown syntax for headings, lists, task lists ([ ]) where tasks or checklists are strongly implied, and emphasis to improve clarity and presentation. Ensure that all integrated content from the context is accurately reflected. Return only the markdown formatted note.\n"}, {'role': 'user', 'content': '<notes></notes>\n<context>839f091c-abd3-4764-bbcb-eb0bd556ee52.wav: Could not extract content\n</context>'}]

So we can see both the successful transcription:
2025-08-28 06:49:20.457 | DEBUG | open_webui.routers.retrieval:process_file:1500 - text_content: This is a sample voice test in English. The quick brown fox jumps over the lazy dog on
and the failed extraction in the payload sent to the model:
<context>839f091c-abd3-4764-bbcb-eb0bd556ee52.wav: Could not extract content\n</context>
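
For anyone checking their own logs for the same symptom, here is a rough sketch of how the two lines above could be picked out automatically. It assumes the Loguru-style line layout seen in this log (`timestamp | LEVEL | module:function:line - message`) and that the transcript is logged by `process_file` with a `text_content:` prefix, as in the excerpt. The function name `find_transcript_mismatch` and the returned dictionary keys are made up for illustration and are not part of Open WebUI.

```python
import re

# Layout assumed from the log excerpt in this thread:
# "<timestamp> | <LEVEL> | <module>:<function>:<lineno> - <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \| "
    r"(?P<level>[A-Z]+)\s*\| "
    r"(?P<module>[\w.]+):(?P<func>\w+):(?P<lineno>\d+) - (?P<msg>.*)$"
)


def find_transcript_mismatch(lines):
    """Report logged transcripts alongside 'Could not extract content' hits.

    Hypothetical helper: it only scans text, it does not talk to Open WebUI.
    """
    transcripts = []  # text logged by process_file after "text_content:"
    failed_at = []    # timestamps of lines carrying the extraction failure
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            # Lines such as "Chunk paths: [...]" do not follow the layout.
            continue
        msg = match.group("msg")
        if match.group("func") == "process_file" and msg.startswith("text_content:"):
            transcripts.append(msg[len("text_content:"):].strip())
        elif "Could not extract content" in msg:
            failed_at.append(match.group("ts"))
    return {
        "transcripts": transcripts,
        "failed_at": failed_at,
        # Mismatch: a transcript was produced, yet extraction still failed later.
        "mismatch": bool(transcripts) and bool(failed_at),
    }
```

Feeding it the lines of a saved log file (for example `find_transcript_mismatch(open("webui.log"))`, where the file name is only a placeholder) should flag `mismatch` as true when a run shows the same pattern as above.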


@rgaricano commented on GitHub (Aug 28, 2025):

fixed in dev 630cea105e


@GlisseManTV commented on GitHub (Aug 28, 2025):

> fixed in dev 630cea1

Thanks for checking!


Reference: github-starred/open-webui#33615