[PR #22617] [CLOSED] fix(xlsx): normalise ZIP entry case to handle xl/SharedStrings.xml on case-sensitive filesystems #49830

Closed
opened 2026-04-30 02:12:24 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/22617
Author: @gambletan
Created: 3/12/2026
Status: Closed

Base: main ← Head: fix/xlsx-case-sensitive-sharedstrings


📝 Commits (2)

  • 078663b fix(tool-server): add AbortController timeout to openapi.json fetch
  • 145f9b8 fix(xlsx): normalise ZIP entry case before passing to openpyxl

📊 Changes

2 files changed (+87 additions, -6 deletions)

📝 backend/open_webui/retrieval/loaders/main.py (+70 -1)
📝 src/lib/apis/index.ts (+17 -5)

📄 Description

Summary

  • Fixes #22613
  • Some .xlsx generators (e.g. certain Windows tools) write the shared-strings entry as xl/SharedStrings.xml (capital S) instead of the xl/sharedStrings.xml spelling that openpyxl expects
  • On case-sensitive file-systems (Linux / macOS) this causes openpyxl to raise KeyError: "There is no item named 'xl/sharedStrings.xml' in the archive" when the file is uploaded and processed
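
The failure can be reproduced with the standard library alone, because `zipfile` looks up archive members by their exact name. The following is a minimal sketch, not the code path openpyxl actually takes: the in-memory archive and the helper `read_shared_strings` are illustrative assumptions.

```python
import io
import zipfile

# Build a tiny in-memory archive that mimics the mis-cased entry
# (illustrative stand-in for a real .xlsx file).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("xl/SharedStrings.xml", "<sst/>")


def read_shared_strings(data: bytes) -> bytes:
    """Hypothetical helper: read the entry under the spelling openpyxl expects."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.read("xl/sharedStrings.xml")


try:
    read_shared_strings(buf.getvalue())
except KeyError as exc:
    # zipfile reports the missing member by name
    print(exc)
```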

Fix

Added _normalize_xlsx_zip_entries() in backend/open_webui/retrieval/loaders/main.py:

  1. Opens the xlsx ZIP and checks whether any known case-variant entries are present (currently covers xl/SharedStrings.xml → xl/sharedStrings.xml)
  2. If a mismatch is found, re-packages the ZIP into a temp file with corrected entry names and returns that path
  3. If no mismatch is detected, the original path is returned unchanged, so well-formed files are not repackaged and no temp file is written
  4. A BadZipFile exception is caught and re-raised so that it reaches the existing error-handling path
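
The steps above can be sketched roughly as follows. This is not the code from the PR: the function name without the leading underscore, the `CANONICAL_NAMES` mapping, and the temp-file handling are assumptions made for illustration. The sketch also lets `zipfile.BadZipFile` propagate to the caller instead of catching and re-raising it.

```python
import os
import tempfile
import zipfile

# Hypothetical mapping: lower-cased entry name -> spelling openpyxl expects.
CANONICAL_NAMES = {"xl/sharedstrings.xml": "xl/sharedStrings.xml"}


def normalize_xlsx_zip_entries(path: str) -> str:
    """Return `path` unchanged, or the path of a repackaged temp copy."""
    # zipfile.BadZipFile raised here is left to the caller's error handling.
    with zipfile.ZipFile(path) as src:
        renames = {}
        for name in src.namelist():
            canonical = CANONICAL_NAMES.get(name.lower())
            if canonical is not None and name != canonical:
                renames[name] = canonical

        if not renames:
            # Well-formed archive: nothing is rewritten.
            return path

        fd, tmp_path = tempfile.mkstemp(suffix=".xlsx")
        os.close(fd)
        try:
            with zipfile.ZipFile(tmp_path, "w", zipfile.ZIP_DEFLATED) as dst:
                for info in src.infolist():
                    target = renames.get(info.filename, info.filename)
                    dst.writestr(target, src.read(info.filename))
        except Exception:
            # Do not leave a half-written temp file behind.
            os.unlink(tmp_path)
            raise
        return tmp_path
```

A caller would pass the returned path to the loader and delete it afterwards if it differs from the input path. The sketch does not handle an archive that contains both spellings of the same entry.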

The fix is applied only in the xls/xlsx branch of _get_loader(), so it has no impact on other file types.

Test plan

  • Upload the test.xlsx attachment from issue #22613 — processing should succeed without the KeyError
  • Upload a normal xlsx file created by a standard tool — should work as before (no temp file written)
  • Verify on a Linux (case-sensitive) host
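
If the attachment from the issue is not at hand, a comparable fixture could be derived from any well-formed workbook. The helper below is a hypothetical sketch (`make_miscased_copy` is not part of the PR). It renames only the ZIP entry, so it may not match files from the affected generators in every other respect.

```python
import zipfile


def make_miscased_copy(src_path: str, dst_path: str) -> None:
    """Copy an xlsx, renaming xl/sharedStrings.xml to xl/SharedStrings.xml."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            name = info.filename
            if name == "xl/sharedStrings.xml":
                # Introduce the capital-S spelling described in the summary.
                name = "xl/SharedStrings.xml"
            dst.writestr(name, src.read(info.filename))
```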

🤖 Generated with Claude Code


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-30 02:12:24 -05:00
Reference: github-starred/open-webui#49830