KohakuHub: Self-Hosted HuggingFace Hub Alternative

⚠️ Work In Progress - Not Ready for Production

Join our community: https://discord.gg/xWYrkyvJ2s

KohakuHub is a minimal, self-hosted alternative to HuggingFace Hub that lets you host and version your own models, datasets, and other AI artifacts with full HuggingFace client compatibility.

What is KohakuHub?

KohakuHub provides a simple but functional solution for teams and individuals who want to:

  • Host their own AI models and datasets without relying on external services
  • Maintain version control with Git-like branching and commits via LakeFS
  • Scale storage independently using S3-compatible object storage
  • Keep existing workflows with full huggingface_hub Python client compatibility

Key Features

  • HuggingFace Compatible: Works seamlessly with existing huggingface_hub client code
  • S3-Compatible Storage: Use any S3-compatible backend (MinIO, Cloudflare R2, Wasabi, AWS S3, etc.)
  • Repository Management: Create, list, and delete model/dataset/space repositories
  • File Operations: Upload, download, copy, and delete files with automatic deduplication
  • Large File Support: Handles files of any size with Git LFS protocol
  • Version Control: Git-like branching and commit history via LakeFS
  • Authentication & Authorization: Complete user registration, session management, and API tokens
  • Organization Management: Full organization support with member roles (admin, super-admin, member)
  • Permission System: Namespace-based permissions for repositories and organizations
  • Web UI: Modern Vue 3 interface with file browsing, editing, and repository management
  • Code Highlighting: Syntax highlighting for code files with Monaco Editor integration
  • Markdown Support: Built-in markdown rendering for documentation
  • 🚧 CLI Tool: Basic functionality available (user/org management), more features in development

Architecture

KohakuHub combines four components:

  • LakeFS: Provides Git-like versioning for your data (branches, commits, diffs)
  • MinIO/S3: Object storage backend for actual file storage
  • PostgreSQL/SQLite: Lightweight metadata database for deduplication and indexing
  • FastAPI: HuggingFace-compatible API layer

See API.md for detailed API documentation and workflow diagrams.

Quick Start

Prerequisites

  • Docker and Docker Compose
  • Node.js and npm (or compatible package manager) for building the frontend
  • Python 3.10+ (for testing with huggingface_hub client)
  • Optional:
    • External S3 storage (by default, MinIO runs as part of the Docker Compose stack)
    • SMTP service (for optional email verification)

1. Clone the Repository

git clone https://github.com/KohakuBlueleaf/Kohaku-Hub.git
cd Kohaku-Hub

2. Configure Docker Compose

Before starting, review and customize docker/docker-compose.yml:

Important: Security Configuration

⚠️ Change Default Passwords! The default configuration uses weak credentials. For any serious deployment:

# MinIO credentials - CHANGE THESE!
environment:
  - MINIO_ROOT_USER=your_secure_username
  - MINIO_ROOT_PASSWORD=your_very_secure_password_here

# PostgreSQL credentials - CHANGE THESE!
environment:
  - POSTGRES_USER=hub
  - POSTGRES_PASSWORD=your_secure_db_password
  - POSTGRES_DB=hubdb

# LakeFS encryption key - CHANGE THIS!
environment:
  - LAKEFS_AUTH_ENCRYPT_SECRET_KEY=generate_a_long_random_key_here
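One way to produce replacement values is Python's standard `secrets` module. The following is a minimal sketch: the variable names mirror the compose snippet above, while the helper name `generate_credentials` and the chosen lengths are illustrative assumptions, not requirements documented by MinIO, PostgreSQL, or LakeFS.

```python
import secrets


def generate_credentials():
    """Return random values for the secrets shown in docker-compose.yml.

    The lengths are illustrative choices, not documented requirements.
    """
    return {
        "MINIO_ROOT_PASSWORD": secrets.token_urlsafe(24),
        "POSTGRES_PASSWORD": secrets.token_urlsafe(24),
        "LAKEFS_AUTH_ENCRYPT_SECRET_KEY": secrets.token_hex(32),
    }


for name, value in generate_credentials().items():
    # Print in KEY=value form so each line can be pasted into the compose file.
    print(f"{name}={value}")
```

Each run prints fresh values, so generate them once and store them somewhere safe before starting the stack.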

Port Configuration

The default setup exposes these ports:

  • 28080 - KohakuHub Web UI (main user/API interface)
  • 48888 - KohakuHub API (for clients like huggingface_hub)
  • 28000 - LakeFS Web UI + API
  • 29000 - MinIO Web Console
  • 29001 - MinIO S3 API
  • 25432 - PostgreSQL (optional, for external access)
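Once the stack is running, a quick TCP check can confirm which of these ports accept connections. This is a minimal sketch using only the standard library; the port numbers come from the list above, and the helper names `is_port_open` and `check_ports` are hypothetical. An open port only shows that something is listening, not that the service behind it is healthy.

```python
import socket

# Default host ports from the list above; adjust if docker-compose.yml was changed.
DEFAULT_PORTS = {
    "KohakuHub Web UI": 28080,
    "KohakuHub API": 48888,
    "LakeFS": 28000,
    "MinIO Console": 29000,
    "MinIO S3 API": 29001,
    "PostgreSQL": 25432,
}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ports(host="127.0.0.1"):
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in DEFAULT_PORTS.items()}
```

Calling `check_ports()` returns a dictionary such as `{"KohakuHub Web UI": True, ...}` that can be printed or asserted on in a deployment script.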

For production deployment, you should:

  1. Expose only port 28080 (or 443 with HTTPS) to users.
  2. Keep all other ports internal or behind a firewall; the Web UI proxies requests to the API.
  3. Use a reverse proxy (nginx/traefik) with HTTPS.

Public Endpoint Configuration

If deploying on a server, update these environment variables:

environment:
  # Replace with your actual domain/IP
  - KOHAKU_HUB_BASE_URL=https://your-domain.com
  - KOHAKU_HUB_S3_PUBLIC_ENDPOINT=https://s3.your-domain.com

The S3 public endpoint is used for generating download URLs. It should point to wherever your MinIO S3 API is accessible (port 29001 by default).

3. Start the Services

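# Set user/group ID for proper permissions
# (bash defines UID as a read-only variable; if the assignment is rejected, run `export UID` instead)
export UID=$(id -u)
export GID=$(id -g)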

# Build the frontend and start all services
./deploy.sh

# Or run the steps manually:
# npm install --prefix ./src/kohaku-hub-ui
# npm run build --prefix ./src/kohaku-hub-ui
# docker compose up -d --build

Services will start in this order:

  1. MinIO (S3 storage)
  2. PostgreSQL (metadata database)
  3. LakeFS (version control)
  4. KohakuHub API (backend application)
  5. KohakuHub Web UI (Nginx: frontend + reverse proxy to the API server)

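4. Verify Installation

Check that all services are running:

docker compose ps

Access the web interfaces (default ports):

  • KohakuHub Web UI: http://localhost:28080
  • KohakuHub API: http://localhost:48888
  • LakeFS Web UI: http://localhost:28000
  • MinIO Web Console: http://localhost:29000

5. Test with Python Client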

# Install the official HuggingFace client
pip install huggingface_hub

# Run the test script
python scripts/test.py

The test script will:

  • Create a test repository
  • Upload files (both small and large)
  • Download them back
  • Verify content integrity
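The same upload, download, and verify cycle can be sketched in a few lines. This is not the content of scripts/test.py; it is an illustrative helper (the names `sha256_of` and `round_trip_ok` are hypothetical) that accepts any object behaving like `huggingface_hub.HfApi` and compares SHA-256 checksums before and after the round trip.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Stream a file through SHA-256 so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def round_trip_ok(api, repo_id, local_path, repo_type="model"):
    """Upload local_path, download it again, and compare checksums.

    `api` is expected to behave like huggingface_hub.HfApi.
    """
    local_path = Path(local_path)
    api.upload_file(
        path_or_fileobj=str(local_path),
        path_in_repo=local_path.name,
        repo_id=repo_id,
        repo_type=repo_type,
    )
    with tempfile.TemporaryDirectory() as tmp:
        # local_dir keeps the download out of the shared cache directory.
        downloaded = api.hf_hub_download(
            repo_id=repo_id,
            filename=local_path.name,
            repo_type=repo_type,
            local_dir=tmp,
        )
        return sha256_of(downloaded) == sha256_of(local_path)
```

With a running instance, create the repository first, for example with `HfApi(endpoint="http://localhost:48888", token="your_api_token_here").create_repo("my-org/roundtrip-test", repo_type="model", exist_ok=True)`, then pass that `HfApi` object to `round_trip_ok`. The repository name here is only an example.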

kohub-cli Usage

KohakuHub includes a command-line tool, kohub-cli, that offers both a Python API for programmatic access and a CLI for interactive and scripted use.

Installation

The CLI is included in the source code. To install:

# Clone and install
git clone https://github.com/KohakuBlueleaf/KohakuHub.git
cd KohakuHub
pip install -r requirements.txt
pip install -e .

Python API

Use KohakuHub programmatically in your Python scripts:

from kohub_cli import KohubClient

# Initialize client
client = KohubClient(endpoint="http://localhost:8000")

# Login
client.login(username="alice", password="secret")

# Create a repository
client.create_repo("my-org/my-model", repo_type="model", private=False)

# List files
files = client.list_repo_tree("my-org/my-model", repo_type="model")

# Create an API token
token_info = client.create_token(name="my-laptop")
print(f"Token: {token_info['token']}")

See CLI.md for complete Python API documentation.

Command-Line Interface

kohub-cli supports both interactive and command-line modes.

Interactive Mode (Default)

Run without arguments to launch the interactive menu:

kohub-cli
# or explicitly
kohub-cli interactive

Command-Line Mode

Use specific commands for scripting and automation:

# Authentication
kohub-cli auth login
kohub-cli auth whoami
kohub-cli auth token create --name "my-laptop"
kohub-cli auth token list

# Repository management
kohub-cli repo create my-org/my-model --type model
kohub-cli repo list --type model --author my-org
kohub-cli repo info my-org/my-model --type model
kohub-cli repo files my-org/my-model --recursive

# Organization management
kohub-cli org create my-org --description "My organization"
kohub-cli org list
kohub-cli org member add my-org bob --role member

# Configuration
kohub-cli config set endpoint http://localhost:8000
kohub-cli config list

Global Options

All commands support these global options:

--endpoint URL          # Override endpoint (or use HF_ENDPOINT env var)
--token TOKEN           # Override token (or use HF_TOKEN env var)
--output {json,text}    # Output format (default: text)

Examples:

kohub-cli --endpoint http://localhost:8000 auth whoami
kohub-cli --output json repo list --type model
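For automation, the JSON output mode can be consumed from Python. The sketch below assumes that kohub-cli is on the PATH and that `--output json` writes a single JSON document to stdout; the shape of that document is not described here, so inspect it before relying on specific keys. The helper names `build_command` and `run_json` are hypothetical.

```python
import json
import subprocess


def build_command(args, endpoint=None, token=None, output="json"):
    """Build a kohub-cli argv list with global options before the subcommand."""
    cmd = ["kohub-cli"]
    if endpoint:
        cmd += ["--endpoint", endpoint]
    if token:
        cmd += ["--token", token]
    cmd += ["--output", output]
    return cmd + list(args)


def run_json(args, endpoint=None, token=None):
    """Run kohub-cli with --output json and parse its stdout.

    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    result = subprocess.run(
        build_command(args, endpoint=endpoint, token=token, output="json"),
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

For example, `run_json(["repo", "list", "--type", "model"])` runs the second command shown above. Command-line arguments can be visible to other users on the same machine, so setting the HF_TOKEN environment variable is usually preferable to passing `token`.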

Getting Help

# General help
kohub-cli --help

# Command-specific help
kohub-cli auth --help
kohub-cli repo create --help

Using KohakuHub

With Python Client

To interact with your private repositories, you need to provide your API token.

import os
from huggingface_hub import HfApi

# Point to your KohakuHub instance
os.environ["HF_ENDPOINT"] = "http://localhost:48888"
# Provide your API token
os.environ["HF_TOKEN"] = "your_api_token_here"

api = HfApi(
    endpoint=os.environ["HF_ENDPOINT"],
    token=os.environ["HF_TOKEN"]
)

# Create a repository
api.create_repo("my-org/my-model", repo_type="model")

# Upload files
api.upload_file(
    path_or_fileobj="model.safetensors",
    path_in_repo="model.safetensors",
    repo_id="my-org/my-model",
)

# Download files
file = api.hf_hub_download(
    repo_id="my-org/my-model",
    filename="model.safetensors",
)

With hfutils

hfutils: https://github.com/deepghs/hfutils

With hfutils, you can also upload a whole folder easily, and then use KohakuHub from any HuggingFace model loader such as transformers or diffusers.

export HF_ENDPOINT="https://huggingface.co/"
hfutils download -t model -r KBlueLeaf/EQ-SDXL-VAE -d . -o ./eq-sdxl
export HF_ENDPOINT="http://127.0.0.1:48888/"
export HF_TOKEN="your_api_token_here"
hfutils upload -t model -r KBlueLeaf/EQ-SDXL-VAE -d . -i ./eq-sdxl

Use KohakuHub in transformers and diffusers

Models hosted on KohakuHub can be loaded directly in transformers and diffusers:

import os

# Set these before importing diffusers so that huggingface_hub picks them up
os.environ["HF_ENDPOINT"] = "http://127.0.0.1:48888/"
os.environ["HF_TOKEN"] = "your_api_token_here"

from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained("KBlueLeaf/EQ-SDXL-VAE")
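The order of statements matters here: huggingface_hub reads HF_ENDPOINT when it is first imported, so changing the variable afterwards may have no effect. A small guard can make that mistake visible. This is a sketch only, and `point_at_kohakuhub` is a hypothetical helper name, not part of any library.

```python
import os
import sys


def point_at_kohakuhub(endpoint, token=""):
    """Set HF_ENDPOINT (and optionally HF_TOKEN) for this process.

    Returns False when huggingface_hub was already imported, in which
    case the new endpoint may be ignored by the already-loaded library.
    """
    already_imported = "huggingface_hub" in sys.modules
    os.environ["HF_ENDPOINT"] = endpoint
    if token:
        os.environ["HF_TOKEN"] = token
    return not already_imported
```

Call it at the top of a script, before importing transformers or diffusers, and treat a `False` return value as a signal to restart the interpreter with the variables set in advance.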

Accessing LakeFS Web UI

LakeFS credentials are automatically generated on first startup and stored in:

docker/hub-meta/hub-api/credentials.env
# or other path if you have modified docker-compose.yml

Use these credentials to log into the LakeFS web interface at http://localhost:28000 and browse your repositories.
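To read the generated credentials from a script, a small parser is enough. This sketch assumes that the file contains plain KEY=VALUE lines, which is the usual convention for .env files; the key names inside credentials.env are not listed here, so the example makes no assumption about them. `read_env_file` is a hypothetical helper name.

```python
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values
```

For example, `sorted(read_env_file("docker/hub-meta/hub-api/credentials.env"))` lists the available key names without printing the secret values.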

Configuration Options

For advanced configuration, you can create a config.toml file or use environment variables. See config-example.toml for all available options.

Key settings include:

  • LFS threshold: Files larger than this use Git LFS protocol (default: 10MB)
  • Database backend: SQLite (default) or PostgreSQL
  • Authentication: Enable email verification, set session expiry, etc.
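The LFS threshold rule can be illustrated as a simple size comparison. This is a sketch under two assumptions: that "10MB" means 10 * 1000 * 1000 bytes, and that the comparison is strict, following the wording "larger than". Check config-example.toml for the actual setting name and value; `uses_lfs` is a hypothetical helper, not KohakuHub code.

```python
# Assumed reading of "10MB"; the real default lives in config-example.toml.
DEFAULT_LFS_THRESHOLD_BYTES = 10 * 1000 * 1000


def uses_lfs(size_bytes, threshold_bytes=DEFAULT_LFS_THRESHOLD_BYTES):
    """Return True when a file of size_bytes would exceed the LFS threshold."""
    return size_bytes > threshold_bytes
```

Under these assumptions, a small config.json would be stored as a regular file, while a multi-gigabyte model.safetensors would go through the Git LFS protocol.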

Project Status & Roadmap

See TODO.md for detailed development status.

Current Status:

  • Core API (upload, download, version control)
    • Some path-related APIs may not be fully supported; please report any that matter to you.
  • HuggingFace client compatibility
  • Large file support (Git LFS)
  • Docker deployment with docker-compose
  • Authentication & Authorization
    • User registration with email verification (optional)
    • Session-based authentication with secure cookies
    • API token generation and management
    • Permission system for repositories and organizations
  • Organization Management
    • Create/delete organizations
    • Member management with roles (admin, super-admin, member)
    • Organization-based namespaces for repositories
  • Web User Interface
    • Vue 3 + Vite frontend with modern UI
    • Repository browsing and file viewing
    • Code editor with syntax highlighting
    • File upload/download interface
    • Markdown documentation rendering
    • User authentication pages (login/register)
    • Settings and organization management pages
  • 🚧 CLI for administration
    • User and organization management functional
    • Additional administrative features in development

Contributing

We welcome contributions! Especially:

🎨 Web Interface (High Priority!)

We're looking for frontend developers to help build a modern web UI. Preferred stack:

  • Vue 3 + Vite for the framework
  • Tailwind CSS for styling
  • Similar UX to HuggingFace Hub

If you're interested in leading the web UI development, please reach out on Discord!

Other Contributions

  • Bug reports and feature requests via GitHub Issues
  • Code improvements and bug fixes via Pull Requests
  • Documentation improvements
  • Testing and feedback

Join our community on Discord: https://discord.gg/xWYrkyvJ2s

License

Currently licensed under AGPL-3.0. The license may be updated to a more permissive option after initial development is complete.

Acknowledgments

  • HuggingFace for the amazing Hub platform and client library
  • LakeFS for Git-like data versioning
  • MinIO for S3-compatible object storage

Support & Community


Note: This project is in active development. APIs may change, and features may be incomplete. Not recommended for production use yet.
