[GH-ISSUE #8213] do embedding request: Post \"http://127.0.0.1:57955/embedding\": read tcp 127.0.0.1:57957->127.0.0.1:57955: wsarecv: An existing connection was forcibly closed by the remote host. #5242

Closed
opened 2026-04-12 16:23:04 -05:00 by GiteaMirror · 4 comments
Owner

Originally created by @conflictpeng on GitHub (Dec 23, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/8213

What is the issue?

It seems this doesn't work when processing larger PDF files.

OS

Windows

GPU

Nvidia

CPU

Intel, AMD

Ollama version

0.5.1

GiteaMirror added the bug label 2026-04-12 16:23:04 -05:00
Author
Owner

@rick-github commented on GitHub (Dec 23, 2024):

Server logs with OLLAMA_DEBUG=1 set in the server environment will aid debugging.

Author
Owner

@conflictpeng commented on GitHub (Dec 23, 2024):

time=2024-12-23T10:40:49.882+08:00 level=DEBUG source=sched.go:357 msg="after processing request finished event" modelPath=C:\Users\Administrator\.ollama\models\blobs\sha256-92b37e50807d951e27ead73c059cf9c3b14941498e37dfde57271e19e6d411df refCount=0
time=2024-12-23T10:40:49.931+08:00 level=DEBUG source=sched.go:575 msg="evaluating already loaded" model=C:\Users\Administrator\.ollama\models\blobs\sha256-92b37e50807d951e27ead73c059cf9c3b14941498e37dfde57271e19e6d411df
time=2024-12-23T10:40:49.933+08:00 level=DEBUG source=runner.go:752 msg="embedding request" content=" \n 人员类别 主要安全职责 \n 动火人 1.作业前充分了解作业内容、时间、地点、要求,熟知作业中的危害 因素; 2.接受安全技术交底并逐项确认相关安全措施的落实情况; 3.有权拒绝不符合安全条件的动火作业; 4.动火作业结束后清理火种,切断电源、气源; 5.随身携带《动火 许可证》。 \n 监火人 1.应持培训合格证上岗;作业前检查《动火许可证》,确保作业证与 作业内容相符并在有效期内;对《动火 许可证》中的安全措施的落实 情况进行检查,发现制定的措施不当或落实不到位等情况,应立即制 止作业; 2.确认动火人持相应有效 资格证书上岗; 3.核查动火人佩戴的个体防护用品是否满足作业要求; 4.作业期间不得擅离现场或做与监护无关的事,确需离开现场时,应 中止作业; 5.发现动火部位与《动火许可证》不符或动火出现异常情况时,立即 停止动火; 6.动火作业完成后,检查作业现场,确认无安全隐患。 \n 动火作业负责人 动火单位负责人 1.负责办理《动火许可证》,并对动火作业现场全面负责; 2.在动火作业前详细了解作业内容和动火部位及周围情况,参与动火 方案的制定,并督促动火现场安全措施的落实,向作业人员进行安全 技术交底; 3.作业完成后,组织检查现场,确认无遗留火种后方可离开。 1.负责本单位一级《动火许可证》的审核,二级《动火许可证》的审 批; 2.对动火作业存在的风险进行分析、评估,组织制定动火作业方案及 安全措施; 3.对动火作业现场安全措施落实情况进行检查。 \n 动火作业审批人 1.动火作业的审批人是动火作业安全措施落实情况的最终确认人,对 自己的批准签字负责; 2.审查《作业证》的办理是否符合要求; 3.检查、确认《作业证》审批手续,对手续不完备的《作业证》应及 时制止动火作业。 \n "
time=2024-12-23T10:40:49.938+08:00 level=DEBUG source=cache.go:104 msg="loading cache slot" id=0 cache=33 prompt=702 used=0 remaining=702
ggml.c:13425: GGML_ASSERT(i01 >= 0 && i01 < ne01) failed
time=2024-12-23T10:40:50.268+08:00 level=INFO source=routes.go:507 msg="embedding generation failed: do embedding request: Post \"http://127.0.0.1:51132/embedding\": read tcp 127.0.0.1:51134->127.0.0.1:51132: wsarecv: An existing connection was forcibly closed by the remote host."
[GIN] 2024/12/23 - 10:40:50 | 500 | 340.1143ms | 192.168.8.103 | POST "/api/embeddings"
time=2024-12-23T10:40:50.268+08:00 level=DEBUG source=sched.go:407 msg="context for request finished"
time=2024-12-23T10:40:50.269+08:00 level=DEBUG source=sched.go:339 msg="runner with non-zero duration has gone idle, adding timer" modelPath=C:\Users\Administrator\.ollama\models\blobs\sha256-92b37e50807d951e27ead73c059cf9c3b14941498e37dfde57271e19e6d411df duration=5m0s

> Server logs with OLLAMA_DEBUG=1 set in the server environment will aid debugging.

The error occurs during embedding. The process started at:
Mon, 23 Dec 2024 10:31:05 GMT
Duration:
65.35 s
Progress messages:
Task has been received.
Page(1~13): OCR started
Page(1~13): OCR finished (18.06s)
Page(1~13): Layout analysis (0.87s)
Page(1~13): Table analysis (0.13s)
Page(1~13): Text merged (0.00s)
Page(1~13): Generate 15 chunks
Page(1~13): [ERROR]Generate embedding error:{}
Task has been received.
Page(13~25): OCR started
Page(13~25): OCR finished (9.81s)
Page(13~25): Layout analysis (0.85s)
Page(13~25): Table analysis (0.54s)
Page(13~25): Text merged (0.00s)
Page(13~25): Generate 18 chunks
Page(13~25): [ERROR]Generate embedding error:{}
Task has been received.
Page(25~36): OCR started
Page(25~36): OCR finished (11.08s)
Page(25~36): Layout analysis (0.85s)
Page(25~36): Table analysis (0.08s)
Page(25~36): Text merged (0.00s)
Page(25~36): Generate 5 chunks
Page(25~36): [ERROR]Generate embedding error:{}

It seems large files don't work.

Author
Owner

@rick-github commented on GitHub (Dec 23, 2024):

Looks like #7288. Your prompt is 702 tokens and bge-large:335m-en-v1.5-fp16 has a context window of 512 tokens. You can try setting the model's default context window to 512 tokens. Ideally, though, you should reduce the chunk size you are using, because truncating chunks to fit the window loses semantic information.

$ ollama show --modelfile bge-large:335m-en-v1.5-fp16 | sed -e 's/^FROM.*/FROM bge-large:335m-en-v1.5-fp16/' > Modelfile
$ echo "PARAMETER num_ctx 512" >> Modelfile
$ ollama create bge-large-512 -f Modelfile

Then have your client use "model":"bge-large-512" for the embedding calls. Alternatively, configure your client to send "options":{"num_ctx":512} with the embedding calls.
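The chunk-size advice above can be sketched as a rough splitter that keeps each embedding request's approximate token count under the model's 512-token window. This is an illustrative sketch, not code from the thread: the whitespace word count is only a crude proxy for real tokenizer output (CJK text like the log content above tokenizes much more densely than one token per word), so the `margin` headroom is an assumption.

```python
# Sketch only: keep each embedding chunk under the model's context window.
# Word count is a crude stand-in for token count; CJK-heavy text expands far
# more under real tokenizers, hence the deliberately conservative margin.

def chunk_words(text: str, max_tokens: int = 512, margin: float = 0.5):
    """Yield chunks whose approximate token count stays under max_tokens."""
    budget = max(1, int(max_tokens * margin))  # headroom for tokenizer expansion
    words = text.split()
    for i in range(0, len(words), budget):
        yield " ".join(words[i:i + budget])

# Each chunk would then be sent as its own embedding request, e.g. with a
# payload like {"model": "bge-large-512", "prompt": chunk} (or keeping the
# original model and adding "options": {"num_ctx": 512}).
```

With `max_tokens=512` and the default margin, each chunk carries at most 256 words, leaving room for tokenizers that emit more than one token per word.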

Author
Owner

@conflictpeng commented on GitHub (Dec 23, 2024):

> Looks like #7288. Your prompt is 702 tokens and bge-large:335m-en-v1.5-fp16 has a context window of 512 tokens. You can try setting the model's default context window to 512 tokens. Ideally you should change the chunk size you are using, because you are losing semantic information.
>
> $ ollama show --modelfile bge-large:335m-en-v1.5-fp16 | sed -e 's/^FROM.*/FROM bge-large:335m-en-v1.5-fp16/' > Modelfile
> $ echo "PARAMETER num_ctx 512" >> Modelfile
> $ ollama create bge-large-512
>
> Then have your client use "model":"bge-large-512" for the embedding calls. Alternatively, configure your client to send "options":{"num_ctx":512} with the embedding calls.

Thanks for the pointer. I switched to nomic-embed-text:latest, which has a much larger num_ctx, and now it works.

Reference: github-starred/ollama#5242