[GH-ISSUE #19556] issue: PostgreSQL search crashes on invalid Unicode surrogate pairs in chat data #18924

Closed
opened 2026-04-20 01:12:57 -05:00 by GiteaMirror · 6 comments
Owner

Originally created by @rbb-dev on GitHub (Nov 27, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/19556

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!).
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

0.6.40

Ollama Version (if applicable)

No response

Operating System

ghcr.io/open-webui/open-webui:main

Browser (if applicable)

No response

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

When searching for chats via the /api/v1/chats/search endpoint, the application should:

  1. Handle malformed Unicode data in the database gracefully
  2. Either skip chats with invalid Unicode or sanitize the data during retrieval
  3. Return valid search results or an empty list
  4. Log a warning about data integrity issues
  5. Never crash with a 500 error due to database content

Database content should not be able to crash the application. Even if data becomes corrupted (e.g., from AI model responses with malformed UTF-16 sequences, migration issues, or data corruption), the application should handle it defensively.
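
As a rough illustration of points 3 to 5 above, the search call could be wrapped so that a data error yields an empty result and a warning instead of a 500. This is only a sketch: `search_or_empty`, `run_query`, and `data_errors` are hypothetical names, not part of Open WebUI. In the real application the caught type would presumably be `sqlalchemy.exc.DataError` (the exception shown in the logs), and the failed database transaction would likely also need to be rolled back.

```python
import logging
from typing import Callable, TypeVar

log = logging.getLogger(__name__)
T = TypeVar("T")


def search_or_empty(
    run_query: Callable[[], list[T]],
    data_errors: tuple[type[BaseException], ...] = (ValueError,),
) -> list[T]:
    """Run a search callable; on a data error, log a warning and return [].

    `data_errors` is a placeholder default. A real caller would pass the
    database driver's data-error type instead of ValueError.
    """
    try:
        return run_query()
    except data_errors as exc:
        # Report the integrity problem without failing the whole request.
        log.warning("Chat search skipped malformed stored data: %s", exc)
        return []
```

This trades completeness for availability: one corrupted chat hides results for that request, but the endpoint stays usable while the data is repaired.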

Actual Behavior

The search endpoint crashes with a 500 Internal Server Error when the PostgreSQL database contains invalid Unicode surrogate pairs in chat JSON data. Specifically:

sqlalchemy.exc.DataError: (psycopg2.errors.InvalidTextRepresentation) invalid input syntax for type json
DETAIL: Unicode low surrogate must follow a high surrogate.
CONTEXT: JSON data, line 1: ...s, privately draft a concise checklist of 3\udc96...

This makes chat search completely unusable: users cannot search their chats at all.

Note: I investigated issue #15616, which has similar symptoms (PostgreSQL search crashes), but that issue is specifically about null bytes (\u0000), not surrogate pairs. The fix for #15616 (lines 850-853 in chats.py) filters null bytes but does not address invalid surrogate characters.

Steps to Reproduce

Prerequisites

  • PostgreSQL database backend (not SQLite)
  • Open WebUI running in Docker
  • At least one chat in the database

Reproduction Steps

  1. Create malformed data: Insert a chat with invalid Unicode surrogate pairs into the database. This can happen naturally through:
    • AI model responses that generate malformed UTF-16 sequences
    • Data migration from other systems
    • Unicode normalization issues
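
One plausible way such a value ends up in the database, assuming the payload is serialized with Python's standard `json` module at its default settings (the issue does not confirm which serializer is involved): a string holding a lone surrogate cannot be encoded as UTF-8, yet `json.dumps` emits it as a syntactically valid `\udc96` escape. The sample text and the helper name `is_utf8_encodable` are illustrative only.

```python
import json

# A string containing a lone low surrogate, as in the error message.
text_with_lone_surrogate = "checklist of 3\udc96"

# With the default ensure_ascii=True, the surrogate is written as the
# six ASCII characters \udc96, so the JSON document itself looks valid.
encoded = json.dumps({"content": text_with_lone_surrogate})


def is_utf8_encodable(value: str) -> bool:
    """Return True if the string can be encoded as UTF-8."""
    try:
        value.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# Python's decoder accepts the escape and returns the lone surrogate again.
round_tripped = json.loads(encoded)["content"]
```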

Root Cause

The issue is in backend/open_webui/models/chats.py, in the get_chats_by_user_id_and_search_text() method around lines 845-871. The current PostgreSQL safety filters only check for null bytes (from issue #15616):

# Line 850: Safety filter for null bytes in JSON
query = query.filter(text("Chat.chat::text NOT LIKE '%\\\\u0000%'"))

# Line 853: Safety filter for null bytes in title
query = query.filter(text("Chat.title::text NOT LIKE '%\\x00%'"))

However, PostgreSQL cannot cast JSON to text when it contains invalid Unicode surrogate pairs:

  • Lone low surrogates: U+DC00 through U+DFFF (e.g., \udc96)
  • Lone high surrogates: U+D800 through U+DBFF

When the query executes Chat.chat::text (line 850) or message->>'content' (line 860), PostgreSQL throws InvalidTextRepresentation because these characters cannot be converted to valid UTF-8 text.
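
To find affected rows, the raw JSON text can be scanned for escapes in the two ranges listed above. The following is a minimal sketch, assuming the surrogates are stored as `\uXXXX` escapes (as the CONTEXT line of the error suggests). The function name `find_lone_surrogate_escapes` is hypothetical. A high surrogate immediately followed by a low surrogate is treated as a valid pair, and an escaped backslash (`\\`) is skipped so that literal text such as `\\udc96` is not misreported.

```python
import re

# Alternatives are tried in order at each position:
#   1. an escaped backslash (two characters), which is skipped;
#   2. a well-formed high+low surrogate pair;
#   3. any remaining surrogate escape, which must therefore be lone.
_TOKEN = re.compile(
    r"\\\\"
    r"|(?P<pair>\\u[dD][89abAB][0-9a-fA-F]{2}\\u[dD][c-fC-F][0-9a-fA-F]{2})"
    r"|(?P<lone>\\u[dD][89a-fA-F][0-9a-fA-F]{2})"
)


def find_lone_surrogate_escapes(raw_json: str) -> list[str]:
    """Return every unpaired \\uD800-\\uDFFF escape found in raw JSON text."""
    return [
        match.group("lone")
        for match in _TOKEN.finditer(raw_json)
        if match.group("lone")
    ]
```

Running this over an exported copy of the `chat` column would show which rows need repair before the search endpoint can read them.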

Logs & Screenshots

2025-11-28 09:13:18.735 | INFO | uvicorn.protocols.http.httptools_impl:send:476 - xx.xx.xx.xx:0 - "GET /api/v1/chats/search?text=request+model&page=1 HTTP/1.1" 500
Exception in ASGI application
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1964, in _exec_single_context
self.dialect.do_execute(
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 942, in do_execute
cursor.execute(statement, parameters)
psycopg2.errors.InvalidTextRepresentation: invalid input syntax for type json
DETAIL: Unicode low surrogate must follow a high surrogate.
CONTEXT: JSON data, line 1: ...s, privately draft a concise checklist of 3\udc96...

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 409, in run_asgi
result = await app( # type: ignore[func-returns-value]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
return await self.app(scope, receive, send)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/fastapi/applications.py", line 1133, in __call__
await super().__call__(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/applications.py", line 113, in __call__
await self.middleware_stack(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
await self.app(scope, receive, _send)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/sessions.py", line 85, in __call__
await self.app(scope, receive, send_wrapper)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/cors.py", line 85, in __call__
await self.app(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 182, in __call__
with recv_stream, send_stream, collapse_excgroups():
File "/usr/local/lib/python3.11/contextlib.py", line 158, in __exit__
self.gen.throw(typ, value, traceback)
File "/usr/local/lib/python3.11/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 184, in __call__
response = await self.dispatch_func(request, call_next)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/backend/open_webui/main.py", line 1349, in inspect_websocket
return await call_next(request)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 159, in call_next
raise app_exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 144, in coro
await self.app(scope, receive_or_disconnect, send_no_error)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 182, in __call__
with recv_stream, send_stream, collapse_excgroups():
File "/usr/local/lib/python3.11/contextlib.py", line 158, in __exit__
self.gen.throw(typ, value, traceback)
File "/usr/local/lib/python3.11/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 184, in __call__
response = await self.dispatch_func(request, call_next)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/backend/open_webui/main.py", line 1328, in check_url
response = await call_next(request)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 159, in call_next
raise app_exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 144, in coro
await self.app(scope, receive_or_disconnect, send_no_error)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 182, in __call__
with recv_stream, send_stream, collapse_excgroups():
File "/usr/local/lib/python3.11/contextlib.py", line 158, in __exit__
self.gen.throw(typ, value, traceback)
File "/usr/local/lib/python3.11/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 184, in __call__
response = await self.dispatch_func(request, call_next)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/backend/open_webui/main.py", line 1314, in commit_session_after_request
response = await call_next(request)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 159, in call_next
raise app_exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 144, in coro
await self.app(scope, receive_or_disconnect, send_no_error)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 182, in __call__
with recv_stream, send_stream, collapse_excgroups():
File "/usr/local/lib/python3.11/contextlib.py", line 158, in __exit__
self.gen.throw(typ, value, traceback)
File "/usr/local/lib/python3.11/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 184, in __call__
response = await self.dispatch_func(request, call_next)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/backend/open_webui/main.py", line 1305, in dispatch
response = await call_next(request)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 159, in call_next
raise app_exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 144, in coro
await self.app(scope, receive_or_disconnect, send_no_error)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 182, in __call__
with recv_stream, send_stream, collapse_excgroups():
File "/usr/local/lib/python3.11/contextlib.py", line 158, in __exit__
self.gen.throw(typ, value, traceback)
File "/usr/local/lib/python3.11/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 184, in __call__
response = await self.dispatch_func(request, call_next)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/backend/open_webui/utils/security_headers.py", line 11, in dispatch
response = await call_next(request)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 159, in call_next
raise app_exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 144, in coro
await self.app(scope, receive_or_disconnect, send_no_error)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 182, in __call__
with recv_stream, send_stream, collapse_excgroups():
File "/usr/local/lib/python3.11/contextlib.py", line 158, in __exit__
self.gen.throw(typ, value, traceback)
File "/usr/local/lib/python3.11/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 184, in __call__
response = await self.dispatch_func(request, call_next)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/backend/open_webui/main.py", line 1261, in dispatch
response = await call_next(request)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 159, in call_next
raise app_exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 144, in coro
await self.app(scope, receive_or_disconnect, send_no_error)
File "/usr/local/lib/python3.11/site-packages/starlette_compress/__init__.py", line 92, in __call__
return await self._zstd(scope, receive, send)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/starlette_compress/_zstd_legacy.py", line 100, in __call__
await self.app(scope, receive, wrapper)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 63, in __call__
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
await app(scope, receive, sender)
File "/usr/local/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
await self.app(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 716, in __call__
await self.middleware_stack(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 736, in app
await route.handle(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 290, in handle
await self.app(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 123, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
await app(scope, receive, sender)
File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 109, in app
response = await f(request)
^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 387, in app
raw_response = await run_endpoint_function(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 290, in run_endpoint_function
return await run_in_threadpool(dependant.call, **values)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/starlette/concurrency.py", line 38, in run_in_threadpool
return await anyio.to_thread.run_sync(func)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2485, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 976, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/backend/open_webui/routers/chats.py", line 179, in search_user_chats
for chat in Chats.get_chats_by_user_id_and_search_text(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/backend/open_webui/models/chats.py", line 908, in get_chats_by_user_id_and_search_text
all_chats = query.offset(skip).limit(limit).all()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/query.py", line 2699, in all
return self._iter().all() # type: ignore
^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/query.py", line 2853, in _iter
result: Union[ScalarResult[_T], Result[_T]] = self.session.execute(
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 2365, in execute
return self._execute_internal(
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 2251, in _execute_internal
result: Result[Any] = compile_state_cls.orm_execute_statement(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/context.py", line 305, in orm_execute_statement
result = conn.execute(
^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1416, in execute
return meth(
^^^^^
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/sql/elements.py", line 516, in _execute_on_connection
return connection._execute_clauseelement(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1638, in _execute_clauseelement
ret = self._execute_context(
^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1843, in _execute_context
return self._exec_single_context(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1983, in _exec_single_context
self._handle_dbapi_exception(
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2352, in _handle_dbapi_exception
raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1964, in _exec_single_context
self.dialect.do_execute(
File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 942, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.DataError: (psycopg2.errors.InvalidTextRepresentation) invalid input syntax for type json
DETAIL: Unicode low surrogate must follow a high surrogate.
CONTEXT: JSON data, line 1: ...s, privately draft a concise checklist of 3\udc96...

[SQL: SELECT chat.id AS chat_id, chat.user_id AS chat_user_id, chat.title AS chat_title, chat.chat AS chat_chat, chat.created_at AS chat_created_at, chat.updated_at AS chat_updated_at, chat.share_id AS chat_share_id, chat.archived AS chat_archived, chat.pinned AS chat_pinned, chat.meta AS chat_meta, chat.folder_id AS chat_folder_id
FROM chat
WHERE chat.user_id = %(user_id_1)s AND chat.archived = false AND Chat.chat::text NOT LIKE '%%\u0000%%' AND Chat.title::text NOT LIKE '%%\x00%%' AND (chat.title ILIKE %(title_key)s OR
EXISTS (
SELECT 1
FROM json_array_elements(Chat.chat->'messages') AS message
WHERE json_typeof(message->'content') = 'string'
AND LOWER(message->>'content') LIKE '%%' || %(content_key)s || '%%'
)
) ORDER BY chat.updated_at DESC
LIMIT %(param_1)s OFFSET %(param_2)s]
[parameters: {'user_id_1': '714affbb-d092-4e39-af42-0c59ee82ab8d', 'title_key': '%request model%', 'content_key': 'request model', 'param_1': 60, 'param_2': 0}]
(Background on this error at: https://sqlalche.me/e/20/9h9h)

SQL Query That Fails

SELECT chat.id, chat.user_id, chat.title, chat.chat, [...]
FROM chat
WHERE chat.user_id = '<user-id>'
  AND chat.archived = false
  AND Chat.chat::text NOT LIKE '%\\u0000%'      -- Line 850: filters null bytes
  AND Chat.title::text NOT LIKE '%\x00%'        -- Line 853: filters null bytes
  AND (chat.title ILIKE '%request model%' OR
    EXISTS (
        SELECT 1
        FROM json_array_elements(Chat.chat->'messages') AS message
        WHERE json_typeof(message->'content') = 'string'
        AND LOWER(message->>'content') LIKE '%' || 'request model' || '%'  -- This line fails
    )
  )
ORDER BY chat.updated_at DESC
LIMIT 60 OFFSET 0

The query fails when PostgreSQL attempts to execute:

  1. Chat.chat::text on line 850 (when checking for null bytes)
  2. message->>'content' inside the EXISTS clause (when searching message content)

Both operations require PostgreSQL to convert JSON to text, which fails when the JSON contains invalid surrogate pairs.

Additional Information

| Aspect | Issue #15616 | This Issue |
|---|---|---|
| Error Type | UntranslatableCharacter | InvalidTextRepresentation |
| Invalid Character | Null bytes (\u0000) | Surrogate pairs (\udc00-\udfff, \ud800-\udbff) |
| Error Detail | \u0000 cannot be converted to text | Unicode low surrogate must follow a high surrogate |
| Current Status | Fixed in lines 850-853 | Not addressed |

The fix for #15616 added filters for null bytes, but did not account for invalid surrogate characters. Both types of invalid Unicode need to be filtered for robust PostgreSQL support.
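
An alternative to filtering at query time is to clean both kinds of character before a chat is written. The sketch below is one possible shape, not the project's actual fix. `sanitize_text` and `sanitize_json` are hypothetical helpers that drop NUL characters and replace any surrogate code point with U+FFFD. It relies on the fact that a decoded Python string holds a well-formed pair as a single code point, so any code point left in U+D800 to U+DFFF is unpaired.

```python
import re
from typing import Any

# Any code point in the surrogate range that survives in a Python str
# cannot be encoded as UTF-8.
_SURROGATES = re.compile("[\ud800-\udfff]")


def sanitize_text(value: str) -> str:
    """Remove NUL characters and replace surrogates with U+FFFD."""
    return _SURROGATES.sub("\ufffd", value.replace("\x00", ""))


def sanitize_json(value: Any) -> Any:
    """Recursively sanitize every string in a JSON-like structure."""
    if isinstance(value, str):
        return sanitize_text(value)
    if isinstance(value, list):
        return [sanitize_json(item) for item in value]
    if isinstance(value, dict):
        return {
            (sanitize_text(key) if isinstance(key, str) else key): sanitize_json(item)
            for key, item in value.items()
        }
    # Numbers, booleans, and None pass through unchanged.
    return value
```

Replacing rather than deleting the surrogate keeps a visible marker where the original character was lost. Rows already stored with bad data would still need a one-off cleanup.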

"/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 976, in run result = context.run(func, *args) ^^^^^^^^^^^^^^^^^^^^^^^^ File "/app/backend/open_webui/routers/chats.py", line 179, in search_user_chats for chat in Chats.get_chats_by_user_id_and_search_text( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/app/backend/open_webui/models/chats.py", line 908, in get_chats_by_user_id_and_search_text all_chats = query.offset(skip).limit(limit).all() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/query.py", line 2699, in all return self._iter().all() # type: ignore ^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/query.py", line 2853, in _iter result: Union[ScalarResult[_T], Result[_T]] = self.session.execute( ^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 2365, in execute return self._execute_internal( ^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 2251, in _execute_internal result: Result[Any] = compile_state_cls.orm_execute_statement( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/context.py", line 305, in orm_execute_statement result = conn.execute( ^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1416, in execute return meth( ^^^^^ File "/usr/local/lib/python3.11/site-packages/sqlalchemy/sql/elements.py", line 516, in _execute_on_connection return connection._execute_clauseelement( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1638, in _execute_clauseelement ret = self._execute_context( ^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1843, in _execute_context return self._exec_single_context( ^^^^^^^^^^^^^^^^^^^^^^^^^^ File 
"/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1983, in _exec_single_context self._handle_dbapi_exception( File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2352, in _handle_dbapi_exception raise sqlalchemy_exception.with_traceback(exc_info[2]) from e File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1964, in _exec_single_context self.dialect.do_execute( File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 942, in do_execute cursor.execute(statement, parameters)
sqlalchemy.exc.DataError: (psycopg2.errors.InvalidTextRepresentation) invalid input syntax for type json
DETAIL: Unicode low surrogate must follow a high surrogate.
CONTEXT: JSON data, line 1: ...s, privately draft a concise checklist of 3\udc96...
[SQL: SELECT chat.id AS chat_id, chat.user_id AS chat_user_id, chat.title AS chat_title, chat.chat AS chat_chat, chat.created_at AS chat_created_at, chat.updated_at AS chat_updated_at, chat.share_id AS chat_share_id, chat.archived AS chat_archived, chat.pinned AS chat_pinned, chat.meta AS chat_meta, chat.folder_id AS chat_folder_id FROM chat WHERE chat.user_id = %(user_id_1)s AND chat.archived = false AND Chat.chat::text NOT LIKE '%%\\u0000%%' AND Chat.title::text NOT LIKE '%%\x00%%' AND (chat.title ILIKE %(title_key)s OR EXISTS ( SELECT 1 FROM json_array_elements(Chat.chat->'messages') AS message WHERE json_typeof(message->'content') = 'string' AND LOWER(message->>'content') LIKE '%%' || %(content_key)s || '%%' ) ) ORDER BY chat.updated_at DESC LIMIT %(param_1)s OFFSET %(param_2)s]
[parameters: {'user_id_1': '714affbb-d092-4e39-af42-0c59ee82ab8d', 'title_key': '%request model%', 'content_key': 'request model', 'param_1': 60, 'param_2': 0}]
(Background on this error at: https://sqlalche.me/e/20/9h9h)

SQL Query That Fails

`SELECT chat.id, chat.user_id, chat.title, chat.chat, [...]
FROM chat
WHERE chat.user_id = '<user-id>'
  AND chat.archived = false
  AND Chat.chat::text NOT LIKE '%\\u0000%'  -- Line 850: filters null bytes
  AND Chat.title::text NOT LIKE '%\x00%'  -- Line 853: filters null bytes
  AND (chat.title ILIKE '%request model%' OR EXISTS (
    SELECT 1
    FROM json_array_elements(Chat.chat->'messages') AS message
    WHERE json_typeof(message->'content') = 'string'
      AND LOWER(message->>'content') LIKE '%' || 'request model' || '%'  -- This line fails
  ))
ORDER BY chat.updated_at DESC
LIMIT 60 OFFSET 0`

The query fails when PostgreSQL attempts to execute:

1. Chat.chat::text on line 850 (when checking for null bytes)
2. message->>'content' inside the EXISTS clause (when searching message content)

Both operations require PostgreSQL to convert JSON to text, which fails when the JSON contains invalid surrogate pairs.

### Additional Information

| Aspect | Issue #15616 | This Issue |
|--------|--------|--------|
| Error Type | `UntranslatableCharacter` | `InvalidTextRepresentation` |
| Invalid Character | Null bytes (`\u0000`) | Surrogate pairs (`\udc00-\udfff`, `\ud800-\udbff`) |
| Error Detail | `\u0000 cannot be converted to text` | `Unicode low surrogate must follow a high surrogate` |
| Current Status | Fixed in lines 850-853 | **Not addressed** |

The fix for #15616 added filters for null bytes, but did not account for invalid surrogate characters. Both types of invalid Unicode need to be filtered for robust PostgreSQL support.
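
To illustrate how such a value can end up in stored chat data in the first place, here is a minimal Python sketch. It assumes the content arrives as JSON text and is decoded with the standard `json` module, which accepts a lone `\udc96` escape that PostgreSQL later rejects. The helper name `find_lone_surrogates` and the sample payload are hypothetical and are not taken from the Open WebUI code base.

```python
import json
import re

# Any code point in U+D800..U+DFFF inside a Python str is an unpaired
# surrogate: json.loads() already merges a valid high/low escape pair
# into a single character, so only stray halves remain in this range.
_SURROGATE = re.compile(r"[\ud800-\udfff]")


def find_lone_surrogates(text: str) -> list[tuple[int, str]]:
    """Return (index, escaped form) for every unpaired surrogate in text."""
    return [
        (m.start(), f"\\u{ord(m.group()):04x}")
        for m in _SURROGATE.finditer(text)
    ]


# Hypothetical message content modeled on the CONTEXT line of the error.
raw = '{"content": "checklist of 3\\udc96"}'
content = json.loads(raw)["content"]  # Python decodes this without complaint

print(find_lone_surrogates(content))
```

A check like this could run before chat content is written, so that the bad value is caught in the application rather than surfacing later as a `DataError` during search.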
GiteaMirror added the bug label 2026-04-20 01:12:57 -05:00
Author
Owner

@tjbck commented on GitHub (Nov 28, 2025):

Open to reviewing PRs!

Author
Owner

@rgaricano commented on GitHub (Nov 28, 2025):

My proposal: https://github.com/open-webui/open-webui/issues/15616#issuecomment-3324388885

Author
Owner

@rbb-dev commented on GitHub (Dec 2, 2025):

After investigating, I've found that cleaning up these records cannot be done reliably using SQL alone. Given how rarely these data issues occur, patching Open WebUI itself is not the best approach.

Instead, I have written a small pipeline that scans and repairs affected chats, specifically handling the currently reported issues with null bytes and stray UTF-16 surrogates. You can find it here:
https://github.com/rbb-dev/Open-WebUI-Chat-Repair
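
As a rough illustration of what a scan-and-repair pass outside SQL might look like (this is not the code from the linked repository), the sketch below decodes a chat's JSON in Python, replaces null characters and unpaired surrogates in every string, and re-serializes the result. The names `sanitize` and `repair_chat_json` are hypothetical, and substituting U+FFFD rather than deleting the character is an assumption made for the example.

```python
import json
import re
from typing import Any

# NUL plus the whole surrogate range; in a decoded Python str, any
# surrogate code point is unpaired (valid pairs were already merged).
_INVALID = re.compile(r"[\x00\ud800-\udfff]")


def sanitize(value: Any, replacement: str = "\ufffd") -> Any:
    """Recursively replace invalid characters in all strings of a JSON value."""
    if isinstance(value, str):
        return _INVALID.sub(replacement, value)
    if isinstance(value, list):
        return [sanitize(v, replacement) for v in value]
    if isinstance(value, dict):
        return {
            sanitize(k, replacement): sanitize(v, replacement)
            for k, v in value.items()
        }
    return value  # numbers, booleans and None pass through unchanged


def repair_chat_json(raw: str) -> tuple[str, bool]:
    """Return (cleaned JSON text, whether anything was changed)."""
    data = json.loads(raw)
    cleaned = sanitize(data)
    return json.dumps(cleaned), cleaned != data


# Hypothetical chat payload with a NUL, a lone low surrogate and a valid emoji.
raw = '{"messages": [{"content": "a\\u0000b 3\\udc96 \\ud83d\\ude00"}]}'
fixed, changed = repair_chat_json(raw)
print(changed, fixed)
```

Doing the repair after decoding avoids having to pattern-match escape sequences in raw JSON text, where an escaped backslash followed by `u0000` would be easy to misread as a null escape.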

Author
Owner

@rgaricano commented on GitHub (Dec 2, 2025):

(@rbb-dev, if you want to add more functionality, take a look at the tool I made to check and repair chat history: https://github.com/open-webui/open-webui/issues/15189#issuecomment-3333298297)

Author
Owner

@rbb-dev commented on GitHub (Dec 4, 2025):

This is still a work in progress, but it's a start:
DB cleanup
storage cleanup
inline image detachment to reduce DB size
and repairs.
https://github.com/rbb-dev/Open-WebUI-maintenance

Author
Owner

@Classic298 commented on GitHub (Dec 21, 2025):

https://github.com/open-webui/open-webui/pull/20072 should help here to prevent this in the future

if you already have contaminated data you'll need a script to clean your database

Reference: github-starred/open-webui#18924