[PR #10634] Add Huggingface model card README.md (YAML) to GGUF converter #39175

Open
opened 2026-04-22 23:49:36 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/10634
Author: @mrutkows
Created: 5/9/2025
Status: 🔄 Open

Base: main ← Head: convert-hf-metadata2


📝 Commits (1)

  • 825619c Add Huggingface model card README.md (YAML) to GGUF convert

📊 Changes

15 files changed (+469 additions, -3 deletions)

View changed files

📝 convert/convert.go (+23 -3)
➕ convert/convert_modelcard.go (+148 -0)
➕ convert/convert_modelcard_test.go (+96 -0)
➕ convert/testdata/modelcard/README md.Llama-4-Maverick-17B-128E-Instruct (+102 -0)
➕ convert/testdata/modelcard/README.md.OpenHermes-2-Mistral-7B (+23 -0)
➕ convert/testdata/modelcard/README.md.Phi-3.5-mini-instruct (+18 -0)
➕ convert/testdata/modelcard/README.md.Qwen2.5-7B-Instruct (+14 -0)
➕ convert/testdata/modelcard/README.md.bad_delim (+9 -0)
➕ convert/testdata/modelcard/README.md.empty_array (+6 -0)
➕ convert/testdata/modelcard/README.md.missing_delim (+5 -0)
➕ convert/testdata/modelcard/README.md.tiny-LlamaForCausalLM-3.2 (+9 -0)
📝 fs/ggml/ggml.go (+6 -0)
📝 go.mod (+1 -0)
📝 go.sum (+2 -0)
📝 parser/parser.go (+7 -0)

📄 Description

Hugging Face has standardized a set of YAML tags that describe such things as license information, model provenance, and organizational ownership. The GGUF format supports this metadata, and it can be added as general.xx entries in the KV header values (see https://huggingface.co/docs/hub/en/model-cards#editing-the-yaml-section-of-the-readmemd-file and https://github.com/ggml-org/llama.cpp/blob/master/gguf-py/gguf/constants.py respectively).

This PR automates the inclusion of that data as part of the model conversion (i.e., create) process. The information is stored at the top of each Hugging Face model's README.md (it is even templated when the model repo is created) and is fairly easy to extract, parse, and map.
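As an illustration of the extraction step, here is a minimal sketch that pulls the YAML block out from between the leading `---` delimiter lines of a README.md. It is not the code in this PR: the function name `extractFrontMatter` and the error `errNoFrontMatter` are hypothetical, and parsing the returned text into fields is left to a YAML library.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// errNoFrontMatter is a hypothetical sentinel error used only in this sketch.
var errNoFrontMatter = errors.New("modelcard: no YAML front matter found")

// isDelim reports whether a line is a front matter delimiter ("---"),
// ignoring trailing spaces, tabs, and carriage returns.
func isDelim(line string) bool {
	return strings.TrimRight(line, " \t\r") == "---"
}

// extractFrontMatter returns the text between a leading "---" line and the
// next "---" line. It returns errNoFrontMatter if the README does not start
// with a delimiter or if the block is never closed.
func extractFrontMatter(readme string) (string, error) {
	sc := bufio.NewScanner(strings.NewReader(readme))
	if !sc.Scan() || !isDelim(sc.Text()) {
		return "", errNoFrontMatter
	}
	var b strings.Builder
	for sc.Scan() {
		line := sc.Text()
		if isDelim(line) {
			return b.String(), nil
		}
		b.WriteString(line)
		b.WriteByte('\n')
	}
	return "", errNoFrontMatter // opening delimiter was never closed
}

func main() {
	readme := "---\nlicense: mit\ntags:\n- text-generation\n---\n# Model\n"
	fm, err := extractFrontMatter(readme)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(fm)
}
```

Returning an error for a missing or unclosed delimiter, rather than an empty string, lets a caller decide whether a malformed model card should abort the conversion or simply be skipped.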

Please note that only the information relevant to the model itself, out of the larger set HF supports, is brought over. This aligns with what has been done in other converters.
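The "only relevant fields" rule can be pictured as an allowlist from model card keys to GGUF `general.*` keys. The sketch below is an assumption for illustration, not the mapping table in this PR: `hfToGGUF` and `mapModelCard` are hypothetical names, the input is assumed to be already parsed into a Go map, and the handful of GGUF key names shown are ones defined in llama.cpp's `constants.py`.

```go
package main

import (
	"fmt"
	"sort"
)

// hfToGGUF is an illustrative allowlist; the PR's actual table may differ.
var hfToGGUF = map[string]string{
	"license":      "general.license",
	"license_name": "general.license.name",
	"license_link": "general.license.link",
	"tags":         "general.tags",
	"language":     "general.languages",
}

// mapModelCard copies allowlisted fields into GGUF-style keys. Keys that are
// not on the allowlist, empty strings, and empty lists are dropped. Values
// are passed through unchanged.
func mapModelCard(card map[string]any) map[string]any {
	kv := make(map[string]any)
	for hfKey, ggufKey := range hfToGGUF {
		v, ok := card[hfKey]
		if !ok {
			continue
		}
		switch t := v.(type) {
		case string:
			if t != "" {
				kv[ggufKey] = t
			}
		case []string:
			if len(t) > 0 {
				kv[ggufKey] = t
			}
		}
	}
	return kv
}

func main() {
	card := map[string]any{
		"license":      "apache-2.0",
		"tags":         []string{"chat"},
		"language":     []string{},     // empty list: skipped
		"library_name": "transformers", // not on this sketch's allowlist
	}
	kv := mapModelCard(card)

	// Sort keys so the output order is deterministic.
	keys := make([]string, 0, len(kv))
	for k := range kv {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s = %v\n", k, kv[k])
	}
}
```

Keeping the mapping in a single table makes it straightforward to review which model card fields end up in the GGUF header and to extend the list later.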

Testing

See the actual README.md YAML snippets for the following models, which provide a good cross-section of how the YAML schema is used:
- Llama-4-Maverick-17B-128E-Instruct
- OpenHermes-2-Mistral-7B
- Phi-3.5-mini-instruct
- Qwen2.5-7B-Instruct
- tiny-LlamaForCausalLM-3.2

There are also unit tests for incorrect YAML, etc.:
- bad_delim
- missing_delim
- empty_array

Post "create" (convert) testing of actual HF models:

  • Included new unit/functional tests for a variety of popular models with a good representation of different YAML metadata
  • Tested "create" API on the following models and independently ran the GGUFs on llama.cpp for additional validation:
    • granite-3.2-2b-instruct, granite-3.1-3b-a800m-instruct (with PRs waiting for review)
    • Phi-3.5-mini-instruct
    • Qwen2.5-7B-Instruct
    • tiny-LlamaForCausalLM-3.2

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-22 23:49:36 -05:00
Reference: github-starred/ollama#39175