PDF document not working. #732

Closed
opened 2025-11-11 14:30:04 -06:00 by GiteaMirror · 1 comment

Originally created by @HougeLangley on GitHub (Apr 26, 2024).

Bug Report

Description

Bug Summary:
Click on the document and, after opening the document settings, choose the local Ollama. From there, select the model you want to use; in this case, llama3:8b-text-q6_K. For scanned PDF images processed with OCR, click 'Save.'

https://github.com/open-webui/open-webui/assets/1161594/51748463-0042-4e23-b70e-39403aa81d4e

When interacting with the model about a PDF after these operations, you initially see a normal output section. Afterwards, however, a mass of numerical content appears, and the response never includes the PDF content that actually matters.
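For context: the mass of numbers visible in the log below is the raw embedding vector returned by Ollama. In a working RAG pipeline that vector stays internal; it is compared against stored document-chunk vectors, typically by cosine similarity, roughly like this (a minimal sketch with toy vectors, not Open WebUI's actual code):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for the query embedding and a stored chunk embedding.
query_vec = [2.19, -2.93, 2.07, -5.22]
chunk_vec = [2.20, -2.90, 2.10, -5.20]
print(cosine_similarity(query_vec, chunk_vec))  # close to 1.0 for near-identical vectors
```

The vector itself is meaningless to a user, so its appearance in chat output indicates the retrieval step failed downstream of embedding.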

Steps to Reproduce:
This is the log:


╭─hougelangley at Arch-Legion in ~ 24-04-26 - 9:53:01
╰─○ source open-webui/bin/activate
(open-webui) ╭─hougelangley at Arch-Legion in ~ 24-04-26 - 9:53:04
╰─(open-webui) ○ cd open-webui/backend
(open-webui) ╭─hougelangley at Arch-Legion in ~/open-webui/backend on main✘✘✘ 24-04-26 - 9:53:12
╰─(open-webui) ⠠⠵ ./start.sh
No WEBUI_SECRET_KEY provided
Loading WEBUI_SECRET_KEY from .webui_secret_key

  ___                    __        __   _     _   _ ___ 
 / _ \ _ __   ___ _ __   \ \      / /__| |__ | | | |_ _|
| | | | '_ \ / _ \ '_ \   \ \ /\ / / _ \ '_ \| | | || | 
| |_| | |_) |  __/ | | |   \ V  V /  __/ |_) | |_| || | 
 \___/| .__/ \___|_| |_|    \_/\_/ \___|_.__/ \___/|___|
      |_|                                               

      
v0.1.121 - building the best open-source AI user interface.      
https://github.com/open-webui/open-webui

INFO:     Started server process [18324]
INFO:     Waiting for application startup.
INFO:apps.litellm.main:start_litellm_background
INFO:apps.litellm.main:run_background_process
INFO:apps.litellm.main:Executing command: ['litellm', '--port', '14365', '--host', '127.0.0.1', '--telemetry', 'False', '--config', '/home/hougelangley/open-webui/backend/data/litellm/config.yaml']
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
INFO:apps.litellm.main:Subprocess started successfully.
INFO:     127.0.0.1:51478 - "GET /api/v1/chats/b67e83e3-5990-402d-8b42-dd089cabeaba/tags HTTP/1.1" 200 OK
INFO:     127.0.0.1:51478 - "DELETE /api/v1/chats/b67e83e3-5990-402d-8b42-dd089cabeaba HTTP/1.1" 200 OK
INFO:     127.0.0.1:51478 - "GET /api/v1/chats/ HTTP/1.1" 200 OK
INFO:     127.0.0.1:51478 - "GET /api/v1/chats/c5be3985-ff85-4132-8e1a-52856c6e5d4e/tags HTTP/1.1" 200 OK
INFO:     127.0.0.1:51478 - "DELETE /api/v1/chats/c5be3985-ff85-4132-8e1a-52856c6e5d4e HTTP/1.1" 200 OK
INFO:     127.0.0.1:51478 - "GET /api/v1/chats/ HTTP/1.1" 200 OK
INFO:     127.0.0.1:51478 - "GET /rag/api/v1/config HTTP/1.1" 200 OK
INFO:     127.0.0.1:51478 - "GET /rag/api/v1/embedding HTTP/1.1" 200 OK
INFO:     127.0.0.1:51478 - "GET /rag/api/v1/query/settings HTTP/1.1" 200 OK
INFO:apps.rag.main:Updating embedding model: sentence-transformers/all-MiniLM-L6-v2 to llama3:8b-text-q6_K
INFO:     127.0.0.1:37064 - "POST /rag/api/v1/embedding/update HTTP/1.1" 200 OK
INFO:     127.0.0.1:37064 - "GET /rag/api/v1/scan HTTP/1.1" 200 OK
INFO:     127.0.0.1:37064 - "GET /api/v1/documents/ HTTP/1.1" 200 OK
INFO:     127.0.0.1:37064 - "POST /rag/api/v1/config/update HTTP/1.1" 200 OK
INFO:     127.0.0.1:37064 - "POST /rag/api/v1/query/settings/update HTTP/1.1" 200 OK
INFO:apps.ollama.main:get_all_models()
INFO:     127.0.0.1:37064 - "GET /ollama/api/version HTTP/1.1" 200 OK
INFO:     127.0.0.1:37064 - "GET /rag/api/v1/config HTTP/1.1" 200 OK
INFO:     127.0.0.1:37064 - "GET /rag/api/v1/embedding HTTP/1.1" 200 OK
INFO:     127.0.0.1:37064 - "GET /rag/api/v1/query/settings HTTP/1.1" 200 OK
INFO:     127.0.0.1:58690 - "POST /rag/api/v1/config/update HTTP/1.1" 200 OK
INFO:     127.0.0.1:58690 - "POST /rag/api/v1/query/settings/update HTTP/1.1" 200 OK
INFO:     127.0.0.1:58694 - "GET /ollama/api/version HTTP/1.1" 200 OK
INFO:     127.0.0.1:53044 - "POST /api/v1/chats/new HTTP/1.1" 200 OK
INFO:     127.0.0.1:53044 - "GET /api/v1/chats/ HTTP/1.1" 200 OK
INFO:apps.ollama.main:generate_ollama_embeddings model='llama3:8b-text-q6_K' prompt='Please summarize the content of the abstract' options=None keep_alive=None
INFO:apps.ollama.main:url: http://localhost:11434
INFO:apps.ollama.main:generate_ollama_embeddings {'embedding': [2.1947214603424072, -2.932509660720825, 2.073728084564209, -5.224618911743164, 1.382140040397644, 2.2824342250823975, 3.4977519512176514, 0.5653916597366333, -0.894767701625824, 2.269526481628418, 4.411888599395752, 3.484095335006714, -2.7565343379974365, -2.373713493347168, 0.8957738876342773, -0.35808876156806946, -2.1128158569335938, 0.4305896461009979, -1.2045246362686157, -0.6428171992301941, 0.5761190056800842, 1.1275595426559448, 3.5981876850128174, 1.2606561183929443, -2.435670852661133, -1.2396458387374878, 0.8457074761390686, -2.251805067062378, 2.385505437850952, -2.955451488494873, -3.17526912689209, 0.6163533926010132, -0.31827104091644287, -0.4506198763847351, -0.031493011862039566, 0.1525883674621582, -1.3778184652328491, 0.5786625146865845, 1.8393068313598633, 1.9458340406417847, 0.6579092741012573, -2.08508038520813, 4.452488422393799, -2.106900691986084, 0.15614448487758636, 0.9804162383079529, -1.5114785432815552, 9480896, 
(…truncated: many more values like this…)
0.5994812846183777, 0.20023688673973083, -0.12643122673034668, 0.6465083360671997, 2.57788348197937, -1.0518397092819214, -2.0232694149017334, 3.4005234241485596, -2.7176740169525146, 1.0232971906661987, 0.5144201517105103, -0.08496495336294174, 0.16579796373844147, 3.5441904067993164, 2.9036624431610107, -2.2864043712615967, 2.9468822479248047, 0.35332009196281433, -0.6708813905715942, 0.010072560980916023, -0.8060532212257385, 1.7510640621185303, -0.9851429462432861, -3.8921525478363037, 0.09995416551828384, 3.418785333633423, -0.14655126631259918, -2.024711847305298, -0.36550089716911316, -1.4673113822937012, 0.23684842884540558, -2.6708171367645264, -0.9139055013656616, 0.19241254031658173, 0.46356919407844543, 0.9632868766784668, 2.4714832305908203, 0.35406458377838135, -0.15255975723266602, 5.02350378036499, -1.7834197282791138, 0.3971042037010193, -0.654962956905365, 0.9539296627044678, -1.085590124130249, -6.017014503479004, -0.17840878665447235, -0.7410923838615417, 1.9772791862487793, 0.48041802644729614, -1.1314371824264526, 0.9293527603149414, 1.7110549211502075, -3.120006561279297, -1.770501971244812, 1.56062650680542, 1.1377243995666504, 1.1113053560256958, 1.720503330230713, 0.8741684556007385, -0.8017598986625671, -3.0359792709350586, -1.2533948421478271, -1.2245848178863525, 3.549250841140747, -1.568366527557373, -4.240169048309326, -2.467161178588867, -2.19775390625, 0.2725611925125122, -1.8473856449127197, -2.0005741119384766, 0.08446794003248215, 0.10625369101762772, 2.025334358215332, -1.8846698999404907, -2.8463125228881836, 1.2140922546386719, 0.5268837809562683, -0.10961505025625229, -0.7391000390052795, 2.5180606842041016, -11.837689399719238, 2.447671413421631, -0.21508343517780304, -2.853151798248291, 1.1251693964004517, 0.42470479011535645, 2.1673436164855957, -1.021291971206665, -0.4356257915496826, -0.9569611549377441, -0.5594292283058167, -0.8161688446998596, -1.2634739875793457, -2.064866781234741, -1.0514392852783203, 
-0.7314847111701965, -1.6614344120025635, -0.9996548295021057, 3.2197721004486084, 4.058717727661133, -2.063504457473755, 1.2857601642608643, -2.871476173400879, -2.3635313510894775, 1.938535451889038, -6.708254814147949, 2.713595151901245, 2.1142969131469727, -0.7022862434387207, 0.44848451018333435, 0.7791779041290283, -2.4912893772125244, 1.3050071001052856, 0.7817772626876831, -1.8905012607574463, -2.1542818546295166, -1.4717713594436646, 0.2577506899833679, 0.06584770232439041, 0.052498482167720795, 1.997460126876831]
ERROR:apps.rag.utils:no such column: collections.topic
Traceback (most recent call last):
  File "/home/hougelangley/open-webui/backend/apps/rag/utils.py", line 185, in rag_messages
    context = query_embeddings_doc(
              ^^^^^^^^^^^^^^^^^^^^^
  File "/home/hougelangley/open-webui/backend/apps/rag/utils.py", line 32, in query_embeddings_doc
    raise e
  File "/home/hougelangley/open-webui/backend/apps/rag/utils.py", line 22, in query_embeddings_doc
    collection = CHROMA_CLIENT.get_collection(name=collection_name)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hougelangley/open-webui/lib/python3.11/site-packages/chromadb/api/client.py", line 218, in get_collection
    return self._server.get_collection(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hougelangley/open-webui/lib/python3.11/site-packages/chromadb/telemetry/opentelemetry/__init__.py", line 127, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/home/hougelangley/open-webui/lib/python3.11/site-packages/chromadb/api/segment.py", line 245, in get_collection
    existing = self._sysdb.get_collections(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hougelangley/open-webui/lib/python3.11/site-packages/chromadb/telemetry/opentelemetry/__init__.py", line 127, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/home/hougelangley/open-webui/lib/python3.11/site-packages/chromadb/db/mixins/sysdb.py", line 435, in get_collections
    rows = cur.execute(sql, params).fetchall()
           ^^^^^^^^^^^^^^^^^^^^^^^^
sqlite3.OperationalError: no such column: collections.topic
INFO:apps.ollama.main:url: http://localhost:11434
INFO:     127.0.0.1:53044 - "POST /ollama/api/chat HTTP/1.1" 200 OK
INFO:     127.0.0.1:53044 - "POST /api/v1/chats/9e7a82f5-19b4-4f4a-8c6c-9b006a8be664 HTTP/1.1" 200 OK
INFO:     127.0.0.1:53044 - "GET /api/v1/chats/ HTTP/1.1" 200 OK
INFO:apps.ollama.main:url: http://localhost:11434
INFO:     127.0.0.1:53044 - "POST /ollama/v1/chat/completions HTTP/1.1" 200 OK
INFO:     127.0.0.1:53044 - "POST /api/v1/chats/9e7a82f5-19b4-4f4a-8c6c-9b006a8be664 HTTP/1.1" 200 OK
INFO:     127.0.0.1:53044 - "GET /api/v1/chats/ HTTP/1.1" 200 OK
INFO:     127.0.0.1:53044 - "GET /api/v1/chats/ HTTP/1.1" 200 OK
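The `no such column: collections.topic` failure in the log above suggests the chromadb client and the on-disk `chroma.sqlite3` were created by different chromadb versions: the client's SQL references a `collections.topic` column that the stored database does not have. The failure mode can be reproduced in isolation (a sketch; the table definition is invented, not chromadb's real schema):

```python
import sqlite3

# Simulate an on-disk 'collections' table created by a schema that lacks a
# 'topic' column (hypothetical DDL, not chromadb's real schema).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE collections (id TEXT PRIMARY KEY, name TEXT)")

err = None
try:
    # A client issuing SQL that expects the extra column fails exactly like
    # the traceback in the log above.
    con.execute("SELECT collections.id, collections.topic FROM collections")
except sqlite3.OperationalError as e:
    err = str(e)
print(err)  # no such column: collections.topic
```

If this is the cause, rebuilding the vector store (deleting the backend's vector-store data so it is recreated) or pinning chromadb to the version that created the database would likely resolve it; both are assumptions, since the report does not state the chromadb version in use.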

Expected Behavior:
After I search for content within a PDF file, the model should respond with the content found in the PDF itself. Both the conversation and the PDF's content are in English, so this should be achievable.

Actual Behavior:
Instead, the output typically reports a failed search or an inability to summarize, rather than providing the PDF's content. For instance:

"Sorry, but the content within your PDF did not match any search terms or could not be summarized at this time."

Environment

  • Open WebUI Version: 0.1.121

  • Ollama (if applicable): 0.1.32
    Screenshot (屏幕截图_20240426_100139): https://github.com/open-webui/open-webui/assets/1161594/dbb34d06-301e-41d7-92f5-3387b3e062df

  • Operating System: Arch Linux

  • Browser (if applicable): Firefox 125.0.2

Reproduction Details

Confirmation:

  • [ Y ] I have read and followed all the instructions provided in the README.md.
  • [ Y ] I am on the latest version of both Open WebUI and Ollama.
  • [ Y ] I have included the browser console logs.
  • [ N/A ] I am not using a Docker container, so Docker container logs do not apply.

Logs and Screenshots

Browser Console Logs:
[Include relevant browser console logs, if applicable]

Docker Container Logs:
[Include relevant Docker container logs, if applicable]

Screenshots (if applicable):
[Attach any relevant screenshots to help illustrate the issue]

Installation Method

[Describe the method you used to install the project, e.g., manual installation, Docker, package manager, etc.]

Additional Information

[Include any additional details that may help in understanding and reproducing the issue. This could include specific configurations, error messages, or anything else relevant to the bug.]

Note

If the bug report is incomplete or does not follow the provided instructions, it may not be addressed. Please ensure that you have followed the steps outlined in the README.md and troubleshooting.md documents, and provide all necessary information for us to reproduce and address the issue. Thank you!


@andrescevp commented on GitHub (Apr 26, 2024):

On my side I get:

2024-04-26 18:00:52 INFO:     172.17.0.1:44020 - "GET /api/v1/documents/ HTTP/1.1" 200 OK
2024-04-26 18:00:53 INFO:apps.rag.main:file.content_type: application/pdf
2024-04-26 18:00:59 ERROR:apps.rag.main:'/Filter'
2024-04-26 18:00:59 Traceback (most recent call last):
2024-04-26 18:00:59   File "/app/backend/apps/rag/main.py", line 624, in store_doc
2024-04-26 18:00:59     data = loader.load()
2024-04-26 18:00:59            ^^^^^^^^^^^^^
2024-04-26 18:00:59   File "/usr/local/lib/python3.11/site-packages/langchain_core/document_loaders/base.py", line 29, in load
2024-04-26 18:00:59     return list(self.lazy_load())
2024-04-26 18:00:59            ^^^^^^^^^^^^^^^^^^^^^^
2024-04-26 18:00:59   File "/usr/local/lib/python3.11/site-packages/langchain_community/document_loaders/pdf.py", line 193, in lazy_load
2024-04-26 18:00:59     yield from self.parser.parse(blob)
2024-04-26 18:00:59                ^^^^^^^^^^^^^^^^^^^^^^^
2024-04-26 18:00:59   File "/usr/local/lib/python3.11/site-packages/langchain_core/document_loaders/base.py", line 125, in parse
2024-04-26 18:00:59     return list(self.lazy_parse(blob))
2024-04-26 18:00:59            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-04-26 18:00:59   File "/usr/local/lib/python3.11/site-packages/langchain_community/document_loaders/parsers/pdf.py", line 96, in lazy_parse
2024-04-26 18:00:59     yield from [
2024-04-26 18:00:59                ^
2024-04-26 18:00:59   File "/usr/local/lib/python3.11/site-packages/langchain_community/document_loaders/parsers/pdf.py", line 99, in <listcomp>
2024-04-26 18:00:59     + self._extract_images_from_page(page),
2024-04-26 18:00:59       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-04-26 18:00:59   File "/usr/local/lib/python3.11/site-packages/langchain_community/document_loaders/parsers/pdf.py", line 114, in _extract_images_from_page
2024-04-26 18:00:59     if xObject[obj]["/Filter"][1:] in _PDF_FILTER_WITHOUT_LOSS:
2024-04-26 18:00:59        ~~~~~~~~~~~~^^^^^^^^^^^
2024-04-26 18:00:59   File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 409, in __getitem__
2024-04-26 18:00:59     return dict.__getitem__(self, key).get_object()
2024-04-26 18:00:59            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-04-26 18:00:59 KeyError: '/Filter'
2024-04-26 18:00:59 INFO:     172.17.0.1:44010 - "POST /rag/api/v1/doc HTTP/1.1" 400 Bad Request
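This traceback is a different bug: langchain_community's `_extract_images_from_page` indexes the image XObject's `/Filter` key directly (`xObject[obj]["/Filter"]`), but a PDF image stream is not required to carry a `/Filter` entry, so such a PDF raises `KeyError: '/Filter'` and the upload fails with 400. The failure and the obvious defensive lookup can be sketched on a plain dict standing in for the pypdf object:

```python
# A plain dict standing in for a pypdf image XObject with no /Filter entry.
x_object = {"/Subtype": "/Image", "/Width": 100, "/Height": 100}

# Direct indexing, as in langchain_community's PDF parser, raises KeyError.
err = None
try:
    _ = x_object["/Filter"]
except KeyError as e:
    err = str(e)
print(err)  # '/Filter'

# A defensive .get() avoids the crash and lets the loader skip the image.
filter_entry = x_object.get("/Filter")
print(filter_entry)  # None
```

Upgrading langchain-community may include such a guard, but that is an assumption; the sketch only illustrates the failure mode and the defensive lookup.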
