[GH-ISSUE #24360] feat: reasoning support for opus 4.7? #58946

Closed
opened 2026-05-06 00:35:08 -05:00 by GiteaMirror · 8 comments
Owner

Originally created by @zplizzi on GitHub (May 4, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/24360

Check Existing Issues

  • I have searched for all existing open AND closed issues and discussions for similar requests. I have found none that is comparable to my request.

Verify Feature Scope

  • I have read through and understood the scope definition for feature requests in the Issues section. I believe my feature request meets the definition and belongs in the Issues section instead of the Discussions.

Problem Description

I can't figure out how to enable thinking mode for Opus 4.7. It seems they changed the settings for this recently: https://platform.claude.com/docs/en/build-with-claude/extended-thinking

But it also looks like Open WebUI may be using their OpenAI-compatible endpoint, which I think doesn't expose settings to configure thinking and doesn't return the thinking tokens (?). That would be a major limitation imo. But if this project's stance is to only support OpenAI-compatible providers, I would understand.
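For reference, this is roughly what the native Messages API request body needs to look like based on those docs. The `thinking` and `output_config` field names here are my reading of the docs and may be off; the point is that the OpenAI-compatible endpoint has no equivalent fields:

```python
# Sketch only: builds a request body for Anthropic's native /v1/messages
# endpoint with extended thinking enabled. The "thinking" and "output_config"
# field names are assumptions taken from the linked docs, not verified.
def build_messages_payload(messages, effort="high", show_thinking=True):
    return {
        "model": "claude-opus-4-7",
        "messages": messages,
        "max_tokens": 16000,
        "thinking": {
            "type": "adaptive",
            # "summarized" returns a thinking summary; "omitted" hides it
            "display": "summarized" if show_thinking else "omitted",
        },
        "output_config": {"effort": effort},
    }
```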

Desired Solution you'd like

Support for configuring thinking and for seeing reasoning traces.

Alternatives Considered

No response

Additional Context

No response

Author
Owner

@owui-terminator[bot] commented on GitHub (May 4, 2026):

🔍 Similar Issues Found

I found some existing issues that might be related. Please check if any of these are duplicates or contain helpful solutions:

  1. #23260 feat: support of reasoning field for reasoning content of the LLM
    by joazker-bit

  2. #19328 feat: Support reasoning_details across tool steps (currently breaks Gemini 3 pro + native function calling)
    by ciejer

  3. #18381 feat: Qwen3-Next reasoning support
    by R3tr0ooo

  4. #14963 feat: Support Magistral models (keeping reasoning traces)
    by Tureti

  5. #14867 feat: Add support to toggle thinking (sending reasoning_effort) for supported models
    by fkrauthan


💡 If this is a duplicate, consider closing it and adding details to the existing issue.

This comment was generated automatically. React with 👍 if helpful, 👎 if not.

Author
Owner

@zplizzi commented on GitHub (May 4, 2026):

Here's a mostly working attempt to get it working through a "function". Add it under Admin -> Functions -> Add New, then put your API key in the function settings (not in the code), and enable it manually. New model options will then appear in the dropdown.

The "stop" button doesn't seem to work, though. And I'd love it if this worked natively; telling someone to copy in some Python code to get one of the most popular model families working is not good UX (or good safety).

"""
title: Anthropic Opus 4.7 (Adaptive Thinking)
author: you
version: 0.2.0
license: MIT
required_open_webui_version: 0.5.0
"""

import os
import json
import requests
from typing import Union, Generator, Iterator
from pydantic import BaseModel, Field
from open_webui.utils.misc import pop_system_message


class Pipe:
    class Valves(BaseModel):
        ANTHROPIC_API_KEY: str = Field(default="", description="Your Anthropic API key")
        EFFORT: str = Field(
            default="high",
            description="Default effort if model id has no suffix: low, medium, high, xhigh, max",
        )
        DISPLAY_THINKING: bool = Field(
            default=True,
            description="If true, request summarized thinking; if false, omit it",
        )
        MAX_TOKENS: int = Field(
            default=16000,
            description="Max output tokens. Bump to 64000+ for xhigh/max effort.",
        )

    def __init__(self):
        self.type = "manifold"
        self.id = "anthropic_opus_47"
        self.name = "anthropic/"
        self.valves = self.Valves(
            ANTHROPIC_API_KEY=os.getenv("ANTHROPIC_API_KEY", ""),
        )

    def pipes(self):
        # One entry per effort level so you can pick from the model dropdown.
        return [
            {"id": "claude-opus-4-7-low", "name": "claude-opus-4.7 (low)"},
            {"id": "claude-opus-4-7-medium", "name": "claude-opus-4.7 (medium)"},
            {"id": "claude-opus-4-7-high", "name": "claude-opus-4.7 (high)"},
            {"id": "claude-opus-4-7-xhigh", "name": "claude-opus-4.7 (xhigh)"},
            {"id": "claude-opus-4-7-max", "name": "claude-opus-4.7 (max)"},
        ]

    def pipe(self, body: dict) -> Union[str, Generator, Iterator]:
        if not self.valves.ANTHROPIC_API_KEY:
            return "Error: ANTHROPIC_API_KEY is not set in valves."

        # The model id arrives as "<pipe_id>.<model_id>", e.g.
        # "anthropic_opus_47.claude-opus-4-7-high". Strip the prefix.
        model_id = body["model"].split(".", 1)[-1]

        # Check "-xhigh" first so it isn't shadowed by the "-high" suffix.
        if model_id.endswith("-xhigh"):
            effort = "xhigh"
        elif model_id.endswith("-low"):
            effort = "low"
        elif model_id.endswith("-medium"):
            effort = "medium"
        elif model_id.endswith("-high"):
            effort = "high"
        elif model_id.endswith("-max"):
            effort = "max"
        else:
            effort = self.valves.EFFORT

        system_message, messages = pop_system_message(body["messages"])

        payload = {
            "model": "claude-opus-4-7",
            "messages": messages,
            "max_tokens": body.get("max_tokens", self.valves.MAX_TOKENS),
            "stream": body.get("stream", False),
            "thinking": {
                "type": "adaptive",
                "display": "summarized" if self.valves.DISPLAY_THINKING else "omitted",
            },
            "output_config": {"effort": effort},
        }
        if system_message:
            payload["system"] = str(system_message)

        headers = {
            "x-api-key": self.valves.ANTHROPIC_API_KEY,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        }

        url = "https://api.anthropic.com/v1/messages"

        if payload["stream"]:
            return self._stream(url, headers, payload)
        else:
            r = requests.post(url, headers=headers, json=payload, timeout=300)
            if r.status_code != 200:
                return (
                    f"Anthropic {r.status_code}: {r.text}\n\n"
                    f"Payload sent:\n{json.dumps(payload, indent=2)}"
                )
            data = r.json()
            out = []
            for block in data.get("content", []):
                if block.get("type") == "thinking" and block.get("thinking"):
                    out.append(f"<think>\n{block['thinking']}\n</think>\n\n")
                elif block.get("type") == "text":
                    out.append(block.get("text", ""))
            return "".join(out)

    def _stream(self, url, headers, payload):
        r = requests.post(url, headers=headers, json=payload, stream=True, timeout=300)
        try:
            if r.status_code != 200:
                # requests.Response has no .read(); .text consumes the
                # (error) body, which is fine since we bail out here.
                yield f"Anthropic {r.status_code}: {r.text}"
                return

            in_thinking = False
            for line in r.iter_lines():
                if not line or not line.startswith(b"data: "):
                    continue
                chunk = line[6:]
                if chunk == b"[DONE]":
                    break
                try:
                    evt = json.loads(chunk)
                except Exception:
                    continue

                etype = evt.get("type")
                if etype == "content_block_start":
                    if evt["content_block"]["type"] == "thinking":
                        in_thinking = True
                        yield "<think>\n"
                elif etype == "content_block_delta":
                    d = evt["delta"]
                    if d.get("type") == "thinking_delta":
                        yield d.get("thinking", "")
                    elif d.get("type") == "text_delta":
                        yield d.get("text", "")
                elif etype == "content_block_stop":
                    if in_thinking:
                        in_thinking = False
                        yield "\n</think>\n\n"
        except GeneratorExit:
            # Client disconnected (user pressed stop). Tear down the upstream
            # connection so Anthropic stops generating and we stop billing.
            r.close()
            raise
        finally:
            r.close()
Author
Owner

@zplizzi commented on GitHub (May 4, 2026):

Also, it seems that for 4.7+ the thinking returned to the user is just a "summary" of the real thinking tokens, which is sad :/

https://platform.claude.com/docs/en/build-with-claude/extended-thinking#summarized-thinking

Author
Owner

@Classic298 commented on GitHub (May 4, 2026):

I myself can't get Opus 4.7 to reason via the API; Anthropic seems to have removed the ability to choose when the model reasons. I can't get it to reason at all anymore, but this is not an Open WebUI issue.

Author
Owner

@zplizzi commented on GitHub (May 4, 2026):

Er, that's not true. The options are kinda awkward, but you definitely can enable and disable reasoning and set the "effort" level. The code snippet above does this and it works.

Author
Owner

@Classic298 commented on GitHub (May 4, 2026):

Can you get the model to RELIABLY ALWAYS reason??

Author
Owner

@zplizzi commented on GitHub (May 5, 2026):

I think it chooses whether reasoning is needed, but that's fine enough. It should just be possible to enable it, and to see the reasoning when the model uses it.


Author
Owner

@Podden commented on GitHub (May 5, 2026):

Opus 4.7 and Mythos hide thinking tokens by default for speed. You can control this by passing the display parameter in your thinking block:

    thinking: {
        type: "adaptive",
        display: "summarized"  // defaults to "omitted" now
    }

I've implemented it in my pipe (https://openwebui.com/posts/complex_anthropic_pipe_d51d4333) if you want to give it a go.
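In Python, the payload would look roughly like this (a sketch only, nothing is sent; the field names and the default-to-"omitted" behavior are as described above, not verified against the current docs):

```python
# Minimal payload sketch showing where the "display" flag sits in the
# thinking block of a native Messages API request.
payload = {
    "model": "claude-opus-4-7",
    "max_tokens": 16000,
    "messages": [{"role": "user", "content": "Why is the sky blue?"}],
    "thinking": {
        "type": "adaptive",
        "display": "summarized",  # reportedly defaults to "omitted" now
    },
}
```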


Reference: github-starred/open-webui#58946