[GH-ISSUE #7958] Model request: HunyuanVideo text-to-video #5095

Closed
opened 2026-04-12 16:11:47 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @artem-zinnatullin on GitHub (Dec 5, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/7958

It's an "open-source", parameter-rich text-to-video model by Tencent:

HunyuanVideo represents the most parameter-rich and high-performance text-to-video model currently available in the open-source domain. With 13 billion parameters, it is capable of generating videos that exhibit high physical accuracy and scene consistency, thereby actualizing conceptual visions and fostering creative expression.

Links:

- Model website: https://aivideo.hunyuan.tencent.com/
- GitHub: https://github.com/Tencent/HunyuanVideo
- HuggingFace: https://huggingface.co/tencent/HunyuanVideo
- Paper: https://github.com/Tencent/HunyuanVideo/blob/main/assets/hunyuanvideo.pdf

Abstract:

We present HunyuanVideo, a novel open-source video foundation model that exhibits performance in video generation that is comparable to, if not superior to, leading closed-source models. HunyuanVideo features a comprehensive framework that integrates several key contributions, including data curation, image-video joint model training, and an efficient infrastructure designed to facilitate large-scale model training and inference. Additionally, through an effective strategy for scaling model architecture and dataset, we successfully trained a video generative model with over 13 billion parameters, making it the largest among all open-source models.

We conducted extensive experiments and implemented a series of targeted designs to ensure high visual quality, motion diversity, text-video alignment, and generation stability. According to professional human evaluation results, HunyuanVideo outperforms previous state-of-the-art models, including Runway Gen-3, Luma 1.6, and 3 top performing Chinese video generative models. By releasing the code and weights of the foundation model and its applications, we aim to bridge the gap between closed-source and open-source video foundation models. This initiative will empower everyone in the community to experiment with their ideas, fostering a more dynamic and vibrant video generation ecosystem.

This is a tracking issue for HunyuanVideo model support

GiteaMirror added the model label 2026-04-12 16:11:47 -05:00
Author
Owner

@rick-github commented on GitHub (Dec 5, 2024):

This type of model is not supported in ollama, and probably not any time soon. It's more appropriate for a project like [ComfyUI](https://github.com/comfyanonymous/ComfyUI/issues/3751).
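
For context on why a diffusion-oriented project is the better fit: HunyuanVideo is a diffusion video generator, so it is typically driven through a diffusion pipeline rather than an LLM runtime like ollama. A minimal sketch using the Hugging Face diffusers `HunyuanVideoPipeline` is shown below; it assumes diffusers >= 0.32 and the diffusers-format weights repo `hunyuanvideo-community/HunyuanVideo`, neither of which is mentioned in the original issue, and the prompt, resolution, and frame count are illustrative only.

```python
# Minimal text-to-video sketch with Hugging Face diffusers (not ollama).
# Assumes diffusers >= 0.32 (HunyuanVideoPipeline) and the diffusers-format
# weights repo "hunyuanvideo-community/HunyuanVideo".
import torch
from diffusers import HunyuanVideoPipeline, HunyuanVideoTransformer3DModel
from diffusers.utils import export_to_video

model_id = "hunyuanvideo-community/HunyuanVideo"  # assumed community conversion

# Load the 13B transformer in bf16 and the rest of the pipeline in fp16.
transformer = HunyuanVideoTransformer3DModel.from_pretrained(
    model_id, subfolder="transformer", torch_dtype=torch.bfloat16
)
pipe = HunyuanVideoPipeline.from_pretrained(
    model_id, transformer=transformer, torch_dtype=torch.float16
)
pipe.vae.enable_tiling()  # tile the VAE decode to reduce VRAM usage
pipe.to("cuda")

# Generate a short clip and write it to disk.
frames = pipe(
    prompt="A cat walks on the grass, realistic style.",
    height=320,
    width=512,
    num_frames=61,
    num_inference_steps=30,
).frames[0]
export_to_video(frames, "output.mp4", fps=15)
```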

Reference: github-starred/ollama#5095