[PR #12] Fix/upload #13

Open
opened 2026-04-10 16:03:45 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/KohakuBlueleaf/KohakuHub/pull/12
Author: @MigoXV
Created: 3/17/2026
Status: 🔄 Open

Base: main ← Head: fix/upload


📝 Commits (4)

  • 1282875 feat(api): enhance commit and file handling with new endpoints and optimizations
  • bc2a7a8 feat(commit): add commit date formatting for huggingface_hub compatibility
  • 81795ac fix(history): remove unused 'formatted' field in commit response
  • 41d5025 feat(huggingface): normalize commit hashes to 40 characters for compatibility

📊 Changes

9 files changed (+563 additions, -134 deletions)

📝 docs/api/huggingface-compatible.md (+7 -3)
📝 src/kohakuhub/README.md (+1 -0)
📝 src/kohakuhub/api/commit/routers/history.py (+18 -16)
📝 src/kohakuhub/api/files.py (+81 -4)
📝 src/kohakuhub/api/repo/routers/info.py (+4 -3)
📝 src/kohakuhub/api/repo/routers/tree.py (+176 -108)
📝 src/kohakuhub/api/repo/utils/hf.py (+30 -0)
➕ tests/conftest.py (+11 -0)
➕ tests/test_hf_commit_hash_compat.py (+235 -0)

📄 Description

This pull request introduces several improvements to HuggingFace Hub compatibility in KohakuHub, focusing on commit hash normalization, response formatting, and expanded API endpoints. The most important changes include normalizing commit hashes to 40-character hexadecimal values on HF-compatible surfaces, updating documentation and response headers, and adding new endpoints for path information. These updates ensure better interoperability with HuggingFace clients and improve error handling and metadata consistency.

HuggingFace Compatibility & Commit Hash Normalization:

  • All HF-compatible API responses now expose commit hashes as 40-character hexadecimal values, regardless of internal LakeFS commit ID length. This affects metadata fields, response headers, and file endpoints (format_hf_commit_hash used throughout). [1] [2] [3] [4] [5]
  • Documentation updated to clarify hash normalization and response formats, including explicit notes about 40-character commit hashes and header changes. [1] [2] [3] [4] [5]
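The normalization described above can be illustrated with a minimal sketch. The PR's `format_hf_commit_hash` is not shown here, so the function name, the truncation policy, and the zero-padding of short IDs below are all assumptions made for illustration only.

```python
import re

# 40 hex characters is the length of a Git SHA-1 digest.
HF_HASH_LENGTH = 40


def normalize_commit_hash_sketch(commit_id: str) -> str:
    """Hypothetical stand-in for format_hf_commit_hash; not the PR's code."""
    cleaned = commit_id.strip().lower()
    if not re.fullmatch(r"[0-9a-f]+", cleaned):
        raise ValueError(f"not a hexadecimal commit ID: {commit_id!r}")
    # Assumed policy: truncate longer IDs, right-pad shorter ones with zeros.
    return cleaned[:HF_HASH_LENGTH].ljust(HF_HASH_LENGTH, "0")
```

Note that a truncated hash cannot be mapped back to the internal ID without a lookup; how the PR resolves such hashes is not described in this summary.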

API Response & Metadata Improvements:

  • Revision metadata endpoints now return siblings and files fields with HuggingFace-compatible file lists, including LFS metadata. [1] [2]
  • Commit listing endpoints use improved date formatting and author metadata for better HF compatibility. [1] [2]
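As a rough illustration of the date formatting mentioned above, the sketch below converts a Unix timestamp into an ISO 8601 UTC string with millisecond precision and a trailing `Z`. Both the input type (Unix seconds) and the exact output format are assumptions; the PR's `_format_commit_date` may differ.

```python
from datetime import datetime, timezone


def format_commit_date_sketch(unix_seconds: int) -> str:
    """Hypothetical helper: Unix seconds -> 'YYYY-MM-DDTHH:MM:SS.mmmZ' (UTC)."""
    dt = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
    millis = dt.microsecond // 1000
    return dt.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis:03d}Z"


print(format_commit_date_sketch(0))  # 1970-01-01T00:00:00.000Z
```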

Error Handling & Path Info Endpoints:

  • Tree listing and path info endpoints now provide detailed HF-compatible error responses for non-existent paths, including new hf_entry_not_found handling. [1] [2]
  • Added new GET endpoint for /paths-info/{revision} to support query-string based path info requests, improving compatibility with HF clients. [1] [2]
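An entry-not-found response of the kind described above might look like the following sketch. It assumes that clients read the `X-Error-Code` and `X-Error-Message` response headers to classify the failure; the function name, message wording, and body shape are hypothetical and not taken from the PR's `hf_entry_not_found`.

```python
from typing import Dict, Tuple


def entry_not_found_sketch(
    repo_id: str, revision: str, path: str
) -> Tuple[int, Dict[str, str], Dict[str, str]]:
    """Hypothetical builder returning (status, headers, JSON body)."""
    message = f"Entry '{path}' not found in {repo_id} at revision '{revision}'"
    headers = {
        "X-Error-Code": "EntryNotFound",  # assumed error code value
        "X-Error-Message": message,
    }
    return 404, headers, {"error": message}
```

Returning an explicit 404 rather than an empty list lets a client distinguish a missing path from an empty directory.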

Codebase & Utility Enhancements:

  • Introduced utility functions for path normalization and existence checking, used across tree and path info endpoints for robust behavior. [1] [2]
  • Refactored commit history endpoint to return a simple list of commits and improved pagination handling. [1] [2]

These changes collectively improve HuggingFace compatibility, metadata accuracy, and error handling in KohakuHub.

This pull request introduces several improvements to the repository API, focusing on compatibility with HuggingFace Hub, enhanced file and path handling, and improved commit and revision metadata. The changes include new utility functions, refactoring of endpoints for better code reuse, and more robust error handling for path and entry lookups.

HuggingFace Compatibility & Metadata Improvements:

  • Added _format_commit_date utility to standardize commit timestamps for HuggingFace compatibility in history.py.
  • Refactored the revision endpoint to build HuggingFace-compatible siblings metadata, including LFS file handling, and updated the response to include both siblings and files. [1] [2] [3]
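To make the `siblings` shape concrete, here is a minimal sketch of one entry with optional LFS metadata. The field names (`rfilename`, `size`, `lfs.sha256`, `lfs.size`, `lfs.pointerSize`) are assumed from what HuggingFace clients commonly parse, and the pointer size is derived from the standard Git LFS pointer text; none of this is copied from the PR.

```python
from typing import Any, Dict, Optional


def build_sibling_sketch(
    path: str, size: int, lfs_sha256: Optional[str] = None
) -> Dict[str, Any]:
    """Hypothetical builder for one entry of a `siblings` list."""
    entry: Dict[str, Any] = {"rfilename": path, "size": size}
    if lfs_sha256 is not None:
        # Text of a Git LFS pointer file for this object.
        pointer = (
            "version https://git-lfs.github.com/spec/v1\n"
            f"oid sha256:{lfs_sha256}\n"
            f"size {size}\n"
        )
        entry["lfs"] = {
            "sha256": lfs_sha256,
            "size": size,
            "pointerSize": len(pointer.encode("utf-8")),
        }
    return entry
```

A regular file yields only `rfilename` and `size`; an LFS-tracked file additionally carries the `lfs` object.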

Path and Entry Handling Enhancements:

  • Introduced normalize_repo_path and path_exists_in_revision utilities to standardize path handling and check for file/directory existence at a given revision.
  • Improved error handling in the repo tree endpoint to return HuggingFace-style entry-not-found errors instead of empty lists for non-existent paths, and added path existence checks for empty results. [1] [2]
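The two utilities named above could behave roughly as in this sketch. The real `normalize_repo_path` and `path_exists_in_revision` presumably query the storage backend; here a plain set of file paths stands in for a revision, and the function names and rules (such as rejecting `..`) are assumptions.

```python
from typing import Set


def normalize_repo_path_sketch(path: str) -> str:
    """Hypothetical: collapse slashes, drop '.', and reject '..' segments."""
    parts = [p for p in path.replace("\\", "/").split("/") if p and p != "."]
    if ".." in parts:
        raise ValueError(f"path escapes the repository root: {path!r}")
    return "/".join(parts)


def path_exists_sketch(path: str, files_in_revision: Set[str]) -> bool:
    """Hypothetical: True if path is the root, a file, or a directory prefix."""
    normalized = normalize_repo_path_sketch(path)
    if normalized == "":
        return True  # the repository root always exists
    if normalized in files_in_revision:
        return True
    prefix = normalized + "/"
    return any(f.startswith(prefix) for f in files_in_revision)
```

With a check like this, a tree endpoint that gets an empty listing can tell a genuinely empty result apart from a path that does not exist and return the entry-not-found error in the latter case.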

API Endpoint Refactoring & Code Reuse:

  • Refactored the paths-info endpoints to use a shared implementation (get_paths_info_impl), supporting both POST and GET methods for better compatibility and maintainability. [1] [2] [3]
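The shared-implementation pattern can be sketched without any web framework. Below, a POST-style handler and a GET-style handler both delegate to one function; the `paths` parameter name, the handler names, and the dictionary-based lookup are assumptions for illustration and do not reproduce `get_paths_info_impl`.

```python
from typing import Any, Dict, List
from urllib.parse import parse_qs


def paths_info_impl_sketch(
    paths: List[str], file_sizes: Dict[str, int]
) -> List[Dict[str, Any]]:
    """Hypothetical shared logic; a dict stands in for the storage lookup."""
    return [
        {"type": "file", "path": p, "size": file_sizes[p]}
        for p in paths
        if p in file_sizes
    ]


def handle_post_sketch(form: Dict[str, List[str]], file_sizes: Dict[str, int]):
    # POST: paths arrive in the request body.
    return paths_info_impl_sketch(form.get("paths", []), file_sizes)


def handle_get_sketch(query_string: str, file_sizes: Dict[str, int]):
    # GET: paths arrive as repeated query parameters, e.g. ?paths=a&paths=b
    params = parse_qs(query_string)
    return paths_info_impl_sketch(params.get("paths", []), file_sizes)
```

Keeping the handlers as thin adapters means both methods return identical results for the same set of paths.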

Commit Listing Response Simplification:

  • Simplified the commit listing endpoint to return a flat list of commits instead of a paginated response object, and updated commit metadata formatting. [1] [2] [3]
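The change in response shape can be pictured with the sketch below, which flattens a paginated object into a plain list. The input keys (`commits`, `author`) and the output keys (`id`, `title`, `message`, `date`, `authors`) are hypothetical; the actual fields before and after the PR are not listed in this description.

```python
from typing import Any, Dict, List


def flatten_commits_sketch(paginated: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Hypothetical: turn {'commits': [...], ...} into a flat list of commits."""
    flat = []
    for commit in paginated.get("commits", []):
        # Assumed convention: first line is the title, the rest is the message.
        title, _, body = commit["message"].partition("\n")
        flat.append(
            {
                "id": commit["id"],
                "title": title,
                "message": body.strip(),
                "date": commit["date"],
                "authors": [{"user": commit["author"]}],
            }
        )
    return flat
```

A client that iterates directly over the JSON response would work with the flat list but not with a wrapper object, which is one plausible motivation for the simplification.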

Minor Improvements:

  • Added missing imports and fixed minor issues to support new functionality and maintain code consistency. [1] [2] [3]

These changes collectively improve HuggingFace compatibility, error handling, and code maintainability in the repository API.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-10 16:03:45 -05:00
Reference: github-starred/KohakuHub#13