mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-06 02:48:13 -05:00
[GH-ISSUE #19556] issue: PostgreSQL search crashes on invalid Unicode surrogate pairs in chat data #34453
Reference in New Issue
Block a user
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Originally created by @rbb-dev on GitHub (Nov 27, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/19556
Check Existing Issues
Installation Method
Docker
Open WebUI Version
0.6.40
Ollama Version (if applicable)
No response
Operating System
ghcr.io/open-webui/open-webui:main
Browser (if applicable)
No response
Confirmation
README.md.Expected Behavior
When searching for chats via the /api/v1/chats/search endpoint, the application should:
Database content should not be able to crash the application. Even if data becomes corrupted (e.g., from AI model responses with malformed UTF-16 sequences, migration issues, or data corruption), the application should handle it defensively.
Actual Behavior
The search endpoint crashes with a 500 Internal Server Error when the PostgreSQL database contains invalid Unicode surrogate pairs in chat JSON data. Specifically:
sqlalchemy.exc.DataError: (psycopg2.errors.InvalidTextRepresentation) invalid input syntax for type json DETAIL: Unicode low surrogate must follow a high surrogate. CONTEXT: JSON data, line 1: ...s, privately draft a concise checklist of 3\udc96...This makes the chat search functionality completely unusable, and users cannot search through their chats at all. Note: I investigated issue #15616 which has similar symptoms (PostgreSQL search crashes), but that issue is specifically about null bytes (\u0000), not surrogate pairs. The fix for #15616 (lines 850-853 in chats.py) already filters null bytes but does not address invalid surrogate characters.
Steps to Reproduce
Prerequisites
Reproduction Steps
Root Cause
The issue is in backend/open_webui/models/chats.py, in the get_chats_by_user_id_and_search_text() method around lines 845-871. The current PostgreSQL safety filters only check for null bytes (from issue #15616):
However, PostgreSQL cannot cast JSON to text when it contains invalid Unicode surrogate pairs:
When the query executes Chat.chat::text (line 850) or message->>'content' (line 860), PostgreSQL throws InvalidTextRepresentation because these characters cannot be converted to valid UTF-8 text.
Logs & Screenshots
2025-11-28` 09:13:18.735 | INFO | uvicorn.protocols.http.httptools_impl:send:476 - xx.xx.xx.xx:0 - "GET /api/v1/chats/search?text=request+model&page=1 HTTP/1.1" 500
Exception in ASGI application
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1964, in _exec_single_context
self.dialect.do_execute(
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 942, in do_execute
cursor.execute(statement, parameters)
psycopg2.errors.InvalidTextRepresentation: invalid input syntax for type json
DETAIL: Unicode low surrogate must follow a high surrogate.
CONTEXT: JSON data, line 1: ...s, privately draft a concise checklist of 3\udc96...
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 409, in run_asgi
result = await app( # type: ignore[func-returns-value]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in call
return await self.app(scope, receive, send)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/fastapi/applications.py", line 1133, in call
await super().call(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/applications.py", line 113, in call
await self.middleware_stack(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in call
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in call
await self.app(scope, receive, _send)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/sessions.py", line 85, in call
await self.app(scope, receive, send_wrapper)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/cors.py", line 85, in call
await self.app(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 182, in call
with recv_stream, send_stream, collapse_excgroups():
File "/usr/local/lib/python3.11/contextlib.py", line 158, in exit
self.gen.throw(typ, value, traceback)
File "/usr/local/lib/python3.11/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 184, in call
response = await self.dispatch_func(request, call_next)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/backend/open_webui/main.py", line 1349, in inspect_websocket
return await call_next(request)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 159, in call_next
raise app_exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 144, in coro
await self.app(scope, receive_or_disconnect, send_no_error)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 182, in call
with recv_stream, send_stream, collapse_excgroups():
File "/usr/local/lib/python3.11/contextlib.py", line 158, in exit
self.gen.throw(typ, value, traceback)
File "/usr/local/lib/python3.11/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 184, in call
response = await self.dispatch_func(request, call_next)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/backend/open_webui/main.py", line 1328, in check_url
response = await call_next(request)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 159, in call_next
raise app_exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 144, in coro
await self.app(scope, receive_or_disconnect, send_no_error)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 182, in call
with recv_stream, send_stream, collapse_excgroups():
File "/usr/local/lib/python3.11/contextlib.py", line 158, in exit
self.gen.throw(typ, value, traceback)
File "/usr/local/lib/python3.11/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 184, in call
response = await self.dispatch_func(request, call_next)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/backend/open_webui/main.py", line 1314, in commit_session_after_request
response = await call_next(request)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 159, in call_next
raise app_exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 144, in coro
await self.app(scope, receive_or_disconnect, send_no_error)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 182, in call
with recv_stream, send_stream, collapse_excgroups():
File "/usr/local/lib/python3.11/contextlib.py", line 158, in exit
self.gen.throw(typ, value, traceback)
File "/usr/local/lib/python3.11/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 184, in call
response = await self.dispatch_func(request, call_next)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/backend/open_webui/main.py", line 1305, in dispatch
response = await call_next(request)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 159, in call_next
raise app_exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 144, in coro
await self.app(scope, receive_or_disconnect, send_no_error)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 182, in call
with recv_stream, send_stream, collapse_excgroups():
File "/usr/local/lib/python3.11/contextlib.py", line 158, in exit
self.gen.throw(typ, value, traceback)
File "/usr/local/lib/python3.11/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 184, in call
response = await self.dispatch_func(request, call_next)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/backend/open_webui/utils/security_headers.py", line 11, in dispatch
response = await call_next(request)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 159, in call_next
raise app_exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 144, in coro
await self.app(scope, receive_or_disconnect, send_no_error)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 182, in call
with recv_stream, send_stream, collapse_excgroups():
File "/usr/local/lib/python3.11/contextlib.py", line 158, in exit
self.gen.throw(typ, value, traceback)
File "/usr/local/lib/python3.11/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 184, in call
response = await self.dispatch_func(request, call_next)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/backend/open_webui/main.py", line 1261, in dispatch
response = await call_next(request)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 159, in call_next
raise app_exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 144, in coro
await self.app(scope, receive_or_disconnect, send_no_error)
File "/usr/local/lib/python3.11/site-packages/starlette_compress/init.py", line 92, in call
return await self._zstd(scope, receive, send)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/starlette_compress/_zstd_legacy.py", line 100, in call
await self.app(scope, receive, wrapper)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 63, in call
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
await app(scope, receive, sender)
File "/usr/local/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in call
await self.app(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 716, in call
await self.middleware_stack(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 736, in app
await route.handle(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 290, in handle
await self.app(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 123, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
await app(scope, receive, sender)
File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 109, in app
response = await f(request)
^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 387, in app
raw_response = await run_endpoint_function(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 290, in run_endpoint_function
return await run_in_threadpool(dependant.call, **values)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/starlette/concurrency.py", line 38, in run_in_threadpool
return await anyio.to_thread.run_sync(func)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2485, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 976, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/backend/open_webui/routers/chats.py", line 179, in search_user_chats
for chat in Chats.get_chats_by_user_id_and_search_text(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/backend/open_webui/models/chats.py", line 908, in get_chats_by_user_id_and_search_text
all_chats = query.offset(skip).limit(limit).all()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/query.py", line 2699, in all
return self._iter().all() # type: ignore
^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/query.py", line 2853, in _iter
result: Union[ScalarResult[_T], Result[_T]] = self.session.execute(
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 2365, in execute
return self._execute_internal(
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 2251, in _execute_internal
result: Result[Any] = compile_state_cls.orm_execute_statement(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/context.py", line 305, in orm_execute_statement
result = conn.execute(
^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1416, in execute
return meth(
^^^^^
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/sql/elements.py", line 516, in _execute_on_connection
return connection._execute_clauseelement(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1638, in _execute_clauseelement
ret = self._execute_context(
^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1843, in _execute_context
return self._exec_single_context(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1983, in _exec_single_context
self._handle_dbapi_exception(
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2352, in _handle_dbapi_exception
raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1964, in _exec_single_context
self.dialect.do_execute(
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 942, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.DataError: (psycopg2.errors.InvalidTextRepresentation) invalid input syntax for type json
DETAIL: Unicode low surrogate must follow a high surrogate.
CONTEXT: JSON data, line 1: ...s, privately draft a concise checklist of 3\udc96...
[SQL: SELECT chat.id AS chat_id, chat.user_id AS chat_user_id, chat.title AS chat_title, chat.chat AS chat_chat, chat.created_at AS chat_created_at, chat.updated_at AS chat_updated_at, chat.share_id AS chat_share_id, chat.archived AS chat_archived, chat.pinned AS chat_pinned, chat.meta AS chat_meta, chat.folder_id AS chat_folder_id
FROM chat
WHERE chat.user_id = %(user_id_1)s AND chat.archived = false AND Chat.chat::text NOT LIKE '%%\u0000%%' AND Chat.title::text NOT LIKE '%%\x00%%' AND (chat.title ILIKE %(title_key)s OR
EXISTS (
SELECT 1
FROM json_array_elements(Chat.chat->'messages') AS message
WHERE json_typeof(message->'content') = 'string'
AND LOWER(message->>'content') LIKE '%%' || %(content_key)s || '%%'
)
) ORDER BY chat.updated_at DESC
LIMIT %(param_1)s OFFSET %(param_2)s]
[parameters: {'user_id_1': '714affbb-d092-4e39-af42-0c59ee82ab8d', 'title_key': '%request model%', 'content_key': 'request model', 'param_1': 60, 'param_2': 0}]
(Background on this error at: https://sqlalche.me/e/20/9h9h)
SQL Query That Fails
The query fails when PostgreSQL attempts to execute:
Both operations require PostgreSQL to convert JSON to text, which fails when the JSON contains invalid surrogate pairs.
Additional Information
UntranslatableCharacterInvalidTextRepresentation\u0000)\udc00-\udfff,\ud800-\udbff)\u0000 cannot be converted to textUnicode low surrogate must follow a high surrogateThe fix for #15616 added filters for null bytes, but did not account for invalid surrogate characters. Both types of invalid Unicode need to be filtered for robust PostgreSQL support.
@tjbck commented on GitHub (Nov 28, 2025):
Open to reviewing PRs!
@rgaricano commented on GitHub (Nov 28, 2025):
My proposal: https://github.com/open-webui/open-webui/issues/15616#issuecomment-3324388885
@rbb-dev commented on GitHub (Dec 2, 2025):
After investigating, ive found that cleaning up these records cannot be done reliably using SQL alone. Given how rarely these data issues occur, patching Open WebUI itself is not the best approach.
Instead, I have written a small pipeline that scans and repairs affected chats, specifically handling the currently reported issues with null and stray UTF-16 bytes. You can find it here:
https://github.com/rbb-dev/Open-WebUI-Chat-Repair
@rgaricano commented on GitHub (Dec 2, 2025):
( @rbb-dev, if you want add more functionality, take a look to the tool I madel for check & repair chat history: https://github.com/open-webui/open-webui/issues/15189#issuecomment-3333298297 )
@rbb-dev commented on GitHub (Dec 4, 2025):
This is still work in progress, but its a start.
db clealn up
storage clean up
inline image detachment to reduce DB size..
and repairs.
https://github.com/rbb-dev/Open-WebUI-maintenance
@Classic298 commented on GitHub (Dec 21, 2025):
https://github.com/open-webui/open-webui/pull/20072 should help here to prevent this in the future
if you already have contaminated data you'll need a script to clean your database