[GH-ISSUE #1382] litellm leaves defunct processes behind #62766

Closed
opened 2026-05-03 10:15:35 -05:00 by GiteaMirror · 4 comments

Originally created by @iplayfast on GitHub (Dec 4, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/1382

I'm not sure who's at fault here.
https://github.com/BerriAI/litellm/issues/992

litellm -m ollama/alfred
litellm -m ollama/mistral

Then run an autogen application that uses these models. The autogen application gets stuck, so you must Ctrl-C out of it. The ollama processes you started are now defunct.

On a Linux system, running ps aux | grep ollama shows something like:
ollama 1581 0.0 0.0 4365940 17680 ? Ssl Dec01 0:47 /usr/local/bin/ollama serve
chris 735058 0.1 0.0 2740828 16376 pts/6 Sl+ Dec03 1:46 ollama run starling-lm
chris 1237946 0.4 0.0 2814560 16280 pts/8 Sl+ 12:11 1:25 ollama run orca2:13b
chris 1290228 0.2 0.0 299108 123796 pts/9 Sl 17:14 0:04 /home/chris/anaconda3/envs/autogen/bin/python /home/chris/anaconda3/envs/autogen/bin/litellm -m ollama/alfred --port 9000
chris 1290229 0.2 0.0 371844 123444 pts/9 Sl 17:14 0:04 /home/chris/anaconda3/envs/autogen/bin/python /home/chris/anaconda3/envs/autogen/bin/litellm -m ollama/DeepSeek-Coder --port 9001
chris 1290230 0.2 0.0 224088 123660 pts/9 S 17:14 0:04 /home/chris/anaconda3/envs/autogen/bin/python /home/chris/anaconda3/envs/autogen/bin/litellm -m ollama/starling-lm --port 9002
chris 1290243 0.0 0.0 0 0 pts/9 Z 17:14 0:00 [ollama]
chris 1290244 0.0 0.0 0 0 pts/9 Z 17:14 0:00 [ollama]
chris 1290245 0.0 0.0 0 0 pts/9 Z 17:14 0:00 [ollama]
chris 1290540 0.4 0.1 1501464 155380 pts/12 Sl+ 17:18 0:05 /home/chris/anaconda3/envs/panel/bin/python3.11 /home/chris/anaconda3/envs/panel/bin/panel serve panel_autogenollama.py
ollama 1291438 1592 3.9 5711180 5139892 ? S<l 17:24 246:30 /tmp/ollama236051357/llama.cpp/gguf/build/cpu/bin/ollama-runner --model /usr/share/ollama/.ollama/models/blobs/sha256:92da6238854f2fa902d8b2ad79d548536af1d3ab06821f323bd5bbcea2013276 --ctx-size 2048 --batch-size 512 --n-gpu-layers 110 --embedding --port 54099
chris 1367452 0.0 0.0 9528 2400 pts/13 S+ 17:40 0:00 grep --color=auto ollama


@iplayfast commented on GitHub (Dec 5, 2023):

GitHub is hiding the <defunct> markers because of the angle brackets. Processes 1290243, 1290244, and 1290245 are zombie processes. This happens when a child process is killed but the parent process lives on and never reaps it. I'm not sure of the resolution, but I think the parent isn't spawning the child correctly, so it ends up as an independent process. You might be right about asking them; I have, but I'm linking it here as well since this involves the interplay between the processes.


@mxyng commented on GitHub (Dec 6, 2023):

I'm going to try and break down what I'm seeing:

PID 1581 serves the ollama API

ollama 1581 0.0 0.0 4365940 17680 ? Ssl Dec01 0:47 /usr/local/bin/ollama serve

PIDs 735058 and 1237946 run the ollama REPL for the models starling-lm and orca2:13b

chris 735058 0.1 0.0 2740828 16376 pts/6 Sl+ Dec03 1:46 ollama run starling-lm
chris 1237946 0.4 0.0 2814560 16280 pts/8 Sl+ 12:11 1:25 ollama run orca2:13b

PIDs 1290228, 1290229, and 1290230 are the LiteLLM proxy processes

chris 1290228 0.2 0.0 299108 123796 pts/9 Sl 17:14 0:04 /home/chris/anaconda3/envs/autogen/bin/python /home/chris/anaconda3/envs/autogen/bin/litellm -m ollama/alfred --port 9000
chris 1290229 0.2 0.0 371844 123444 pts/9 Sl 17:14 0:04 /home/chris/anaconda3/envs/autogen/bin/python /home/chris/anaconda3/envs/autogen/bin/litellm -m ollama/DeepSeek-Coder --port 9001
chris 1290230 0.2 0.0 224088 123660 pts/9 S 17:14 0:04 /home/chris/anaconda3/envs/autogen/bin/python /home/chris/anaconda3/envs/autogen/bin/litellm -m ollama/starling-lm --port 9002

PIDs 1290243, 1290244, and 1290245 are zombies

chris 1290243 0.0 0.0 0 0 pts/9 Z 17:14 0:00 [ollama]
chris 1290244 0.0 0.0 0 0 pts/9 Z 17:14 0:00 [ollama]
chris 1290245 0.0 0.0 0 0 pts/9 Z 17:14 0:00 [ollama]

Based on this code (https://github.com/BerriAI/litellm/blob/aefa4f36f9206b9533fd55ad573d9b6ab7057694/litellm/proxy/proxy_cli.py#L20-L29), LiteLLM tries to run ollama serve via subprocess.Popen, which fails because PID 1581 is already listening on port 11434. The linked code prints the error but makes no effort to reap the child process, which leaves the zombie.

Since the parent LiteLLM process doesn't exit on this error, the zombie process persists.

Here's a condensed example:

In one terminal, start a Python REPL.

import subprocess
# /bin/false exits immediately; without a wait()/communicate() call the
# exited child is never reaped and lingers as a zombie.
subprocess.Popen('false')

Without closing the previous terminal, open a new terminal and run ps auxf. You'll see something like this:

root        12  0.1  0.0  20940 17084 pts/0    S+   19:59   0:00 python
root        13  0.0  0.0      0     0 pts/0    Z+   19:59   0:00  \_ [false] <defunct>
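Back in the same REPL, the zombie can be reaped by collecting the child's exit status. A minimal sketch (binding the Popen object to a name, which the example above skips):

p = subprocess.Popen('false')
p.wait()  # or p.poll(); collecting the exit status removes the zombie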

Based on this, it looks like a LiteLLM bug.


@ghost commented on GitHub (Dec 7, 2023):

@mxyng what's a good fix we can implement on our end to prevent this from happening?


@mxyng commented on GitHub (Dec 7, 2023):

Subprocesses started with Popen should be cleaned up. This is usually done with subproc.communicate(), but that's a blocking call. In Python 3, I would normally suggest using asyncio to manage processes, but you can also use threading or multiprocessing.
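A minimal sketch of the threading approach (the spawn_and_reap helper is hypothetical, not LiteLLM's actual code, and it assumes an ollama binary on PATH):

import subprocess
import threading

def spawn_and_reap(args):
    # Start the child without blocking the caller.
    proc = subprocess.Popen(args)

    def reaper():
        proc.wait()  # collect the exit status so no zombie is left behind
        if proc.returncode != 0:
            print(f"{args[0]} exited with status {proc.returncode}")

    threading.Thread(target=reaper, daemon=True).start()
    return proc

# If another instance is already bound to port 11434, 'ollama serve' exits
# immediately, but the reaper thread still collects the dead child.
spawn_and_reap(["ollama", "serve"])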
