[GH-ISSUE #23148] feat: improve diarized audio workflow in knowledge bases #58564

Closed
opened 2026-05-05 23:26:38 -05:00 by GiteaMirror · 0 comments

Originally created by @fugufisch on GitHub (Mar 27, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/23148

Goal

Improve the knowledge base workflow for uploaded audio so diarized transcripts are preserved, easy to read, and easy to edit after upload.

Context

Diarization already works well for audio uploads in chat, but the knowledge base workflow is still much less usable for speaker-separated audio.

When audio is uploaded to a knowledge base, the transcript should preserve speaker segmentation and make it practical to clean up speaker names before the content is used downstream.

Proposal

  • Trigger the same diarization-capable transcription flow for audio uploads into knowledge bases.
  • Render diarized knowledge base transcripts in the editor as readable markdown instead of a flat text blob.
  • Format each segment so speaker, timestamp range, and segment text are clearly visible and easy to edit.
  • Add a lightweight speaker rename workflow in the editor, ideally as search-and-replace for speaker labels so users can quickly replace names like SPEAKER_00 with real names across the document.
  • Keep the editing workflow practical for long transcripts and make sure the edited result is what gets persisted/indexed.
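As a rough illustration of the segment formatting proposed above, here is a minimal Python sketch. The `Segment` shape, the `segments_to_markdown` helper, and the exact markdown layout (bold speaker label, bracketed `HH:MM:SS` range, text on its own paragraph) are assumptions made for illustration only. They are not taken from the Open WebUI codebase or from any particular STT backend's output format.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical diarized segment; real backends may use other field names."""
    speaker: str
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio
    text: str


def format_timestamp(seconds: float) -> str:
    """Render a second offset as HH:MM:SS, truncating fractional seconds."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def segments_to_markdown(segments: list[Segment]) -> str:
    """Turn diarized segments into one markdown block per segment."""
    blocks = []
    for seg in segments:
        # Speaker label and timestamp range share one line so they are easy to scan.
        header = (
            f"**{seg.speaker}** "
            f"[{format_timestamp(seg.start)} - {format_timestamp(seg.end)}]"
        )
        blocks.append(f"{header}\n\n{seg.text.strip()}")
    # A blank line between blocks keeps each segment a separate paragraph.
    return "\n\n".join(blocks) + "\n"
```

Keeping the speaker label as plain text on its own header line means a later search-and-replace over the document can rename speakers without touching any hidden metadata.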

Acceptance criteria

  • Audio uploads in knowledge bases trigger diarization-aware transcription when supported by the configured STT backend.
  • The resulting transcript is shown in the knowledge base editor as readable markdown rather than a flat text blob.
  • Each segment is clearly separated and easy to scan and edit.
  • Users can quickly rename speakers via an easy search/replace workflow in the editor.
  • Edited speaker names are preserved and reflected in the stored/indexed transcript content.
  • Tests updated/added as appropriate.
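The speaker rename criterion could be covered by something like the following sketch. The `rename_speakers` helper and its mapping argument are hypothetical and not an existing Open WebUI API. The sketch assumes speaker labels appear verbatim in the transcript text and replaces them in a single pass, so that swapping two names does not chain, and so that a label such as `SPEAKER_00` is not matched inside a longer token.

```python
import re


def rename_speakers(markdown: str, mapping: dict[str, str]) -> str:
    """Replace whole-word speaker labels according to `mapping` in one pass."""
    if not mapping:
        return markdown
    # Longest labels first so the alternation prefers the most specific match.
    labels = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(label) for label in labels) + r")\b")
    # A single substitution pass means a replaced name is never re-matched.
    return pattern.sub(lambda m: mapping[m.group(1)], markdown)


transcript = "**SPEAKER_00** [00:00:00 - 00:00:04]\n\nHello.\n\n**SPEAKER_01** [00:00:04 - 00:00:09]\n\nHi.\n"
renamed = rename_speakers(transcript, {"SPEAKER_00": "Alice", "SPEAKER_01": "Bob"})
```

Whatever form the rename takes in the editor, the renamed text is what would need to be persisted and indexed for the last two criteria to hold.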

Out of scope

  • New standalone transcript editing UI outside the existing knowledge base editor.
  • Changes to chat upload diarization behavior, which already works well.
Reference: github-starred/open-webui#58564