mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-07 11:28:35 -05:00
[GH-ISSUE #11448] feat: enable language configuration for Tika OCR, as paperless-ngx does #54898
Originally created by @rakehell1986 on GitHub (Mar 9, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/11448
Check Existing Issues
Problem Description
After I integrated Tika with Open WebUI, scanned Chinese documents are all converted to English letters. For example, the source text
仲裁申请书 申请人:额家的,女,汉族,身份证号码:3332324324525452426 住所:圣诞节啊客服经理撒打
has been converted to
AFH a FA Fig A: REND . 2, DK, AGES: 3332324324525452426 FERT: PINTS RIAN 4 HEE 105 Bs ZSERBA: WWIII SS KEM. KAKA RIM. WA AHBL
So how can Chinese OCR be enabled in Tika?
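For illustration, here is a minimal sketch of what such a language setting could look like on the request side. It relies on Tika Server's `X-Tika-OCRLanguage` request header, which sets the Tesseract language for a single request. The helper names, the default URL `http://localhost:9998/tika`, and the default language `chi_sim+eng` are assumptions made for this example, not existing Open WebUI code. It also assumes the Tesseract language data for Simplified Chinese (`chi_sim`) is installed where Tika runs.

```python
import urllib.request


def tika_ocr_headers(ocr_language: str) -> dict:
    """Build request headers asking Tika Server to OCR with the given
    Tesseract language code(s), e.g. "chi_sim" or "chi_sim+eng"."""
    return {
        "Accept": "text/plain",              # ask for plain extracted text
        "X-Tika-OCRLanguage": ocr_language,  # per-request Tesseract language
    }


def build_tika_request(
    data: bytes,
    tika_url: str = "http://localhost:9998/tika",  # assumed local Tika Server
    ocr_language: str = "chi_sim+eng",             # hypothetical default
) -> urllib.request.Request:
    """Prepare (but do not send) a PUT request for Tika's /tika endpoint."""
    return urllib.request.Request(
        tika_url,
        data=data,
        headers=tika_ocr_headers(ocr_language),
        method="PUT",
    )


# Usage, with a Tika Server running and a scanned file on disk:
#   with open("scan.pdf", "rb") as f:
#       req = build_tika_request(f.read())
#   text = urllib.request.urlopen(req).read().decode("utf-8")
```

If the language pack is missing on the Tika side, the header alone will not help, so both parts would need to be in place.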