[GH-ISSUE #9549] 502 Bad Gateway with stream=True using ollama and httpx on Windows (v0.5.7) #6228

Closed
opened 2026-04-12 17:38:28 -05:00 by GiteaMirror · 3 comments
Owner

Originally created by @lujixiang on GitHub (Mar 6, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/9549

What is the issue?

Description:
Server version: 0.5.7
Library versions: ollama 0.4.7, httpx
Environment: Windows
Problem: streaming requests (stream=True) return 502 Bad Gateway while non-streaming requests succeed; streaming via the requests library also succeeds.
Steps to reproduce: scripts and output provided below.

Relevant log output

import requests
import json

def test_non_stream():
    print("测试非流式请求...")
    url = "http://localhost:11434/api/generate"
    data = {
        "model": "qwen2.5:7b",
        "prompt": "Hello",
        "stream": False
    }
    response = requests.post(url, json=data)
    print(f"状态码: {response.status_code}")
    print(f"响应内容: {response.json()}")

def test_stream():
    print("\n测试流式请求...")
    url = "http://localhost:11434/api/generate"
    data = {
        "model": "qwen2.5:7b",
        "prompt": "Hello",
        "stream": True
    }
    response = requests.post(url, json=data, stream=True)
    print(f"状态码: {response.status_code}")
    full_response = ""
    for line in response.iter_lines():
        if line:
            chunk = json.loads(line.decode('utf-8'))
            print(f"原始响应: {line.decode('utf-8')}")
            if "response" in chunk:
                full_response += chunk["response"]
    print(f"完整回答: {full_response}")

if __name__ == "__main__":
    test_non_stream()
    test_stream()
python test_ollama.py
200 {'model': 'qwen2.5:7b', 'created_at': '2025-03-06T13:56:52.9099449Z', 'response': 'Hello! How can I assist you today? Feel free to ask me any questions or let me know if you need help with anything specific.', 'done': True, 'done_reason': 'stop', 'context': [151644, 8948, 198, 2610, 525, 1207, 16948, 11, 3465, 553, 54364, 14817, 13, 1446, 525, 264, 10950, 17847, 13, 151645, 198, 151644, 872, 198, 9707, 151645, 198, 151644, 77091, 198, 9707, 0, 2585, 646, 358, 7789, 498, 3351, 30, 31733, 1910, 311, 2548, 752, 894, 4755, 476, 1077, 752, 1414, 421, 498, 1184, 1492, 448, 4113, 3151, 13], 'total_duration': 20852036000, 'load_duration': 2624889800, 'prompt_eval_count': 30, 'prompt_eval_duration': 5905000000, 'eval_count': 29, 'eval_duration': 12319000000}
PS G:\ChineseMedicine\RAGQnASystem> python test_ollama.py
测试非流式请求...
状态码: 200
响应内容: {'model': 'qwen2.5:7b', 'created_at': '2025-03-06T14:00:50.5144825Z', 'response': 'Hello! How can I assist you today? Feel free to ask me any questions or let me know if you need help with anything.', 'done': True, 'done_reason': 'stop', 'context': [151644, 8948, 198, 2610, 525, 1207, 16948, 11, 3465, 553, 54364, 14817, 13, 1446, 525, 264, 10950, 17847, 13, 151645, 198, 151644, 872, 198, 9707, 151645, 198, 151644, 77091, 198, 9707, 0, 2585, 646, 358, 7789, 498, 3351, 30, 31733, 1910, 311, 2548, 752, 894, 4755, 476, 1077, 752, 1414, 421, 498, 1184, 1492, 448, 4113, 13], 'total_duration': 9805980600, 'load_duration': 19316100, 'prompt_eval_count': 30, 'prompt_eval_duration': 545000000, 'eval_count': 28, 'eval_duration': 9240000000}

测试流式请求...
状态码: 200
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:53.133863Z","response":"Hello","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:53.5364512Z","response":"!","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:53.9203841Z","response":" Nice","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:54.3023129Z","response":" to","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:54.4141542Z","response":" meet","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:54.5314663Z","response":" you","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:54.6494147Z","response":".","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:54.774694Z","response":" How","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:54.8839354Z","response":" can","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:55.3037153Z","response":" I","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:55.7045539Z","response":" assist","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:56.1183736Z","response":" you","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:56.5583607Z","response":" today","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:56.8107199Z","response":"?","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:57.2806849Z","response":" Whether","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:57.4403473Z","response":" you","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:57.8858919Z","response":" have","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:58.0407432Z","response":" questions","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:58.4702173Z","response":",","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:58.941108Z","response":" need","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:59.1471744Z","response":" information","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:59.3500194Z","response":",","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:00:59.818258Z","response":" or","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:01:00.3007422Z","response":" just","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:01:00.7110117Z","response":" want","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:01:00.8368074Z","response":" to","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:01:00.9453674Z","response":" chat","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:01:01.3350672Z","response":",","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:01:01.7314499Z","response":" feel","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:01:02.1333104Z","response":" free","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:01:02.6028868Z","response":" to","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:01:03.0893268Z","response":" let","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:01:03.6624977Z","response":" me","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:01:04.1405238Z","response":" know","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:01:04.5443316Z","response":".","done":false}
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T14:01:05.017529Z","response":"","done":true,"done_reason":"stop","context":[151644,8948,198,2610,525,1207,16948,11,3465,553,54364,14817,13,1446,525,264,10950,17847,13,151645,198,151644,872,198,9707,151645,198,151644,77091,198,9707,0,28859,311,3367,498,13,2585,646,358,7789,498,3351,30,13139,498,614,4755,11,1184,1995,11,476,1101,1366,311,6236,11,2666,1910,311,1077,752,1414,13],"total_duration":12472752600,"load_duration":16525600,"prompt_eval_count":30,"prompt_eval_duration":569000000,"eval_count":36,"eval_duration":11886000000}
完整回答: Hello! Nice to meet you. How can I assist you today? Whether you have questions, need information, or just want to chat, feel free to let me know.
Using the earlier script (with ollama and httpx):
import ollama
import httpx
import json

def test_ollama_stream():
    print("测试 ollama 库流式请求...")
    client = ollama.Client(host="http://localhost:11434")
    print("请求服务: http://localhost:11434/api/generate")
    print("请求参数: model='qwen2.5:7b', prompt='Hello', stream=True")
    
    try:
        full_response = ""
        for chunk in client.generate(model="qwen2.5:7b", prompt="Hello", stream=True):
            if "response" in chunk:
                full_response += chunk["response"]
                print(f"部分回答: {chunk['response']}")
            elif "error" in chunk:
                print(f"服务返回错误: {chunk['error']}")
                break
            if chunk.get("done", False):
                break
        print(f"完整回答: {full_response}")
    except ollama.ResponseError as e:
        print(f"服务错误: {e.status_code} - {e.error}")
        print(f"异常详情: {str(e)}")

def test_httpx_stream():
    print("\n测试 httpx 流式请求...")
    url = "http://localhost:11434/api/generate"
    data = {"model": "qwen2.5:7b", "prompt": "Hello", "stream": True}
    headers = {"Content-Type": "application/json", "User-Agent": "curl/8.1.2"}
    
    try:
        with httpx.stream("POST", url, json=data, headers=headers, timeout=30) as response:
            print(f"响应状态码: {response.status_code}")
            full_response = ""
            for line in response.iter_lines():
                if line:
                    chunk = json.loads(line)
                    print(f"原始响应: {line}")
                    if "response" in chunk:
                        full_response += chunk["response"]
                        print(f"部分回答: {chunk['response']}")
                    if chunk.get("done", False):
                        break
            print(f"完整回答: {full_response}")
    except httpx.HTTPStatusError as e:
        print(f"服务错误: {e.response.status_code} - {e.response.text}")
    except Exception as e:
        print(f"未知错误: {str(e)}")

if __name__ == "__main__":
    test_ollama_stream()
    test_httpx_stream()
python test_ollama.py
测试 ollama 库流式请求...
请求服务: http://localhost:11434/api/generate
请求参数: model='qwen2.5:7b', prompt='Hello', stream=True
服务错误: 502 - 
异常详情:  (status code: 502)

测试 httpx 流式请求...
响应状态码: 502
完整回答:


From the run results, executing python test_ollama.py with streaming enabled (stream=True) via both the ollama library and httpx returns 502 Bad Gateway with an empty response body. This contrasts with the earlier successful runs using the requests library. Detailed analysis and a final workaround follow.

Key points
- ollama library request: calling client.generate(stream=True) returns 502; e.error is empty ("服务错误: 502 - "), and the exception detail shows only the status code with no further information.
- httpx streaming request: status code 502 and no response data (the accumulated answer is empty). Response headers were not shown (the script may not print them).
- Compared with before: requests returned 200 for both stream=True and stream=False and generated answers, while streaming via ollama and httpx fails consistently.

Known facts
- The server is running: curl http://localhost:11434/ returns "Ollama is running".
- The model is available: ollama run qwen2.5:7b generates answers.
- API test: curl -X POST returns JSON.
- requests test: both streaming and non-streaming succeed.
- Versions: server 0.5.7, Python library 0.4.7.

Analysis
1. Why the current runs fail
- Streaming requests (stream=True) from the ollama library and httpx trigger a 502 on the server side; the server may not handle these clients' streaming request format or connection management correctly.
- Client differences: the streaming implementation in requests (stream=True plus iter_lines) is compatible with the server, while the implementations in httpx (httpx.stream) and the ollama library may not be.
2. Successes vs. failures
- Working: requests (streaming and non-streaming); curl (streaming and non-streaming).
- Failing: streaming via ollama and httpx returns 502.
- Conclusion: the problem lies in the streaming request path of ollama and httpx, i.e. a compatibility issue with server 0.5.7 in the Windows environment.
3. Possible root causes
- Server-side bug: 0.5.7 mishandles certain streaming requests on Windows.
- Client configuration: the default timeouts, headers, or connection management in httpx and ollama may trigger the server-side problem.
- Model loading: although requests succeeds, an unloaded model might affect other clients.

Workaround
1. Final solution
- Use requests instead of ollama and httpx: in this environment, requests is the only library that reliably supports both streaming and non-streaming requests.

OS

Windows

GPU

Nvidia

CPU

AMD

Ollama version

Server 0.5.7, Python library 0.4.7.

GiteaMirror added the bug label 2026-04-12 17:38:29 -05:00

@rick-github commented on GitHub (Mar 6, 2025):

Linux:

$ ./9549.py
测试 ollama 库流式请求...
请求服务: http://localhost:11434/api/generate
请求参数: model='qwen2.5:7b', prompt='Hello', stream=True
部分回答: Hello
部分回答: !
部分回答:  Nice
部分回答:  to
部分回答:  meet
部分回答:  you
部分回答: .
部分回答:  How
部分回答:  can
部分回答:  I
部分回答:  assist
部分回答:  you
部分回答:  today
部分回答: ?
部分回答:  Whether
部分回答:  you
部分回答:  have
部分回答:  questions
部分回答: ,
部分回答:  need
部分回答:  information
部分回答: ,
部分回答:  or
部分回答:  just
部分回答:  want
部分回答:  to
部分回答:  chat
部分回答: ,
部分回答:  feel
部分回答:  free
部分回答:  to
部分回答:  let
部分回答:  me
部分回答:  know
部分回答: .
部分回答: 
完整回答: Hello! Nice to meet you. How can I assist you today? Whether you have questions, need information, or just want to chat, feel free to let me know.

测试 httpx 流式请求...
响应状态码: 200
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.354411281Z","response":"Hello","done":false}
部分回答: Hello
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.366253525Z","response":"!","done":false}
部分回答: !
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.378196334Z","response":" How","done":false}
部分回答:  How
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.391555329Z","response":" can","done":false}
部分回答:  can
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.403485553Z","response":" I","done":false}
部分回答:  I
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.415332411Z","response":" assist","done":false}
部分回答:  assist
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.427167253Z","response":" you","done":false}
部分回答:  you
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.4391184Z","response":" today","done":false}
部分回答:  today
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.451074902Z","response":"?","done":false}
部分回答: ?
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.463017349Z","response":" Feel","done":false}
部分回答:  Feel
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.474858261Z","response":" free","done":false}
部分回答:  free
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.486594985Z","response":" to","done":false}
部分回答:  to
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.49831493Z","response":" ask","done":false}
部分回答:  ask
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.510140178Z","response":" me","done":false}
部分回答:  me
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.521873395Z","response":" any","done":false}
部分回答:  any
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.533659049Z","response":" questions","done":false}
部分回答:  questions
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.54544584Z","response":" or","done":false}
部分回答:  or
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.557086334Z","response":" let","done":false}
部分回答:  let
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.568812606Z","response":" me","done":false}
部分回答:  me
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.580735361Z","response":" know","done":false}
部分回答:  know
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.592486101Z","response":" if","done":false}
部分回答:  if
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.604325581Z","response":" you","done":false}
部分回答:  you
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.616067253Z","response":" need","done":false}
部分回答:  need
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.627863018Z","response":" help","done":false}
部分回答:  help
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.639650489Z","response":" with","done":false}
部分回答:  with
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.651583483Z","response":" anything","done":false}
部分回答:  anything
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.663281511Z","response":" specific","done":false}
部分回答:  specific
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.675183873Z","response":".","done":false}
部分回答: .
原始响应: {"model":"qwen2.5:7b","created_at":"2025-03-06T15:42:51.687032885Z","response":"","done":true,"done_reason":"stop","context":[151644,8948,198,2610,525,1207,16948,11,3465,553,54364,14817,13,1446,525,264,10950,17847,13,151645,198,151644,872,198,9707,151645,198,151644,77091,198,9707,0,2585,646,358,7789,498,3351,30,31733,1910,311,2548,752,894,4755,476,1077,752,1414,421,498,1184,1492,448,4113,3151,13],"total_duration":630280967,"load_duration":278832620,"prompt_eval_count":30,"prompt_eval_duration":6000000,"eval_count":29,"eval_duration":344000000}
部分回答: 
完整回答: Hello! How can I assist you today? Feel free to ask me any questions or let me know if you need help with anything specific.


@rick-github commented on GitHub (Mar 7, 2025):

Tried it on Win11 24H2 (26100.3194) and it worked fine.

ollama version is 0.5.7

Name: ollama
Version: 0.4.7
Summary: The official Python client for Ollama.
Home-page: https://ollama.com
Author: Ollama
Author-email: hello@ollama.com
License: MIT
Location: C:\Users\bill\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.13_qbz5n2kfra8p0\LocalCache\local-packages\Python313\site-packages
Requires: httpx, pydantic
Required-by:
---
Name: httpx
Version: 0.28.1
Summary: The next generation HTTP client.
Home-page: https://github.com/encode/httpx
Author:
Author-email: Tom Christie <tom@tomchristie.com>
License: BSD-3-Clause
Location: C:\Users\bill\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.13_qbz5n2kfra8p0\LocalCache\local-packages\Python313\site-packages
Requires: anyio, certifi, httpcore, idna
Required-by: ollama

@chuanxinwong commented on GitHub (Mar 9, 2025):

Check whether your environment variables include http_proxy=xxxx. If so, run set http_proxy= first; that variable matters. I've run into this situation several times before: some libraries depend on environment variables without saying so, and then fail in baffling ways.
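This proxy hypothesis can be tested from Python itself by dropping the proxy variables before creating the client. A minimal sketch (the variable list covers the casings commonly honored on Windows and Unix):

```python
import os

def clear_proxy_env():
    """Remove proxy-related environment variables for this process,
    so HTTP clients that honor them (httpx, and therefore the ollama
    Python client) connect to localhost directly."""
    for var in ("http_proxy", "https_proxy", "all_proxy",
                "HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY"):
        os.environ.pop(var, None)

clear_proxy_env()
# Alternatively, httpx can be told to ignore the environment entirely:
#   httpx.Client(trust_env=False)
```

Setting NO_PROXY=localhost,127.0.0.1 (or the lowercase no_proxy) is a less invasive alternative that keeps the proxy in place for external traffic.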


Reference: github-starred/ollama#6228