KohakuHub: Self-Hosted HuggingFace Hub Alternative
⚠️ Work In Progress - Not Ready for Production
Join our community: https://discord.gg/xWYrkyvJ2s
KohakuHub is a minimal, self-hosted alternative to HuggingFace Hub that lets you host and version your own models, datasets, and other AI artifacts with full HuggingFace client compatibility.
What is KohakuHub?
KohakuHub provides a simple but functional solution for teams and individuals who want to:
- Host their own AI models and datasets without relying on external services
- Maintain version control with Git-like branching and commits via LakeFS
- Scale storage independently using S3-compatible object storage
- Keep existing workflows with full huggingface_hub Python client compatibility
Key Features
- ✅ HuggingFace Compatible: Works seamlessly with existing huggingface_hub client code
- ✅ S3-Compatible Storage: Use any S3-compatible backend (MinIO, Cloudflare R2, Wasabi, AWS S3, etc.)
- ✅ Repository Management: Create, list, and delete model/dataset/space repositories
- ✅ File Operations: Upload, download, copy, and delete files with automatic deduplication
- ✅ Large File Support: Handles files of any size via the Git LFS protocol
- ✅ Version Control: Git-like branching and commit history via LakeFS
- ✅ Authentication & Authorization: Secure user registration, session management, and API tokens
- ✅ Organization Management: Create organizations and manage member roles
- ✅ CLI Tool: kohub-cli for easy user and organization management
- 🚧 Web UI: Coming soon (contributions welcome!)
Architecture
KohakuHub combines four components:
- LakeFS: Provides Git-like versioning for your data (branches, commits, diffs)
- MinIO/S3: Object storage backend for actual file storage
- PostgreSQL/SQLite: Lightweight metadata database for deduplication and indexing
- FastAPI: HuggingFace-compatible API layer
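The deduplication role of the metadata database can be pictured as a content-addressed index. The sketch below is an illustration only and not KohakuHub's actual schema: `DedupIndex`, its `store` method, and the `objects/...` key layout are hypothetical, and SHA-256 is an assumed choice (it is the hash Git LFS uses to identify objects).

```python
import hashlib


class DedupIndex:
    """Hypothetical in-memory stand-in for a metadata table keyed by content hash."""

    def __init__(self):
        self._objects = {}  # sha256 hex digest -> storage key

    def store(self, data: bytes) -> tuple[str, bool]:
        """Return (storage_key, was_new); identical content is recorded only once."""
        digest = hashlib.sha256(data).hexdigest()
        if digest in self._objects:
            return self._objects[digest], False
        key = f"objects/{digest[:2]}/{digest}"  # assumed key layout
        self._objects[digest] = key
        return key, True


index = DedupIndex()
first_key, first_new = index.store(b"model weights")
second_key, second_new = index.store(b"model weights")
print(first_new, second_new, first_key == second_key)  # True False True
```

Uploading the same bytes twice yields the same storage key, so only one copy would need to reach object storage.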
See API.md for detailed API documentation and workflow diagrams.
Quick Start
Prerequisites
- Docker and Docker Compose
- Python 3.10+ (for testing with the huggingface_hub client)
- Optional:
  - S3 storage (MinIO is the default and runs with Docker Compose)
  - SMTP service (for email verification)
1. Clone the Repository
git clone https://github.com/KohakuBlueleaf/Kohaku-Hub.git
cd Kohaku-Hub
2. Configure Docker Compose
Before starting, review and customize docker/docker-compose.yml:
Important: Security Configuration
⚠️ Change Default Passwords! The default configuration uses weak credentials. For any serious deployment:
# MinIO credentials - CHANGE THESE!
environment:
- MINIO_ROOT_USER=your_secure_username
- MINIO_ROOT_PASSWORD=your_very_secure_password_here
# PostgreSQL credentials - CHANGE THESE!
environment:
- POSTGRES_USER=hub
- POSTGRES_PASSWORD=your_secure_db_password
- POSTGRES_DB=hubdb
# LakeFS encryption key - CHANGE THIS!
environment:
- LAKEFS_AUTH_ENCRYPT_SECRET_KEY=generate_a_long_random_key_here
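One way to produce replacement values is Python's standard `secrets` module. This is a minimal sketch, not part of the project: the helper name `make_secret` and the 32-byte length are assumptions, and any strong random generator would do as well.

```python
import secrets


def make_secret(num_bytes: int = 32) -> str:
    """Return a URL-safe random string built from num_bytes of OS randomness."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Variable names taken from the docker-compose snippet above.
    for name in (
        "MINIO_ROOT_PASSWORD",
        "POSTGRES_PASSWORD",
        "LAKEFS_AUTH_ENCRYPT_SECRET_KEY",
    ):
        print(f"{name}={make_secret()}")
```

Paste the printed values into docker/docker-compose.yml before the first start, and keep them out of version control.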
Port Configuration
The default setup exposes these ports:
- 48888 - KohakuHub API (main interface)
- 28000 - LakeFS Web UI + API
- 29000 - MinIO Web Console
- 29001 - MinIO S3 API
- 25432 - PostgreSQL (optional, for external access)
For production deployment, you should:
- Only expose port 48888 (KohakuHub API) to users
- Keep other ports internal or behind a firewall
- Use a reverse proxy (nginx/traefik) with HTTPS
Public Endpoint Configuration
If deploying on a server, update these environment variables:
environment:
# Replace with your actual domain/IP
- KOHAKU_HUB_BASE_URL=https://your-domain.com
- KOHAKU_HUB_S3_PUBLIC_ENDPOINT=https://s3.your-domain.com
The S3 public endpoint is used for generating download URLs. It should point to wherever your MinIO S3 API is accessible (port 29001 by default).
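Because these values become the base of generated URLs, a value without a scheme (for example `s3.your-domain.com`) is unlikely to work. The check below is a small sketch for validating the two settings before starting the stack; `is_absolute_http_url` is a hypothetical helper and KohakuHub may apply its own validation.

```python
from urllib.parse import urlsplit


def is_absolute_http_url(value: str) -> bool:
    """True if value has an http/https scheme and a host part."""
    parts = urlsplit(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


print(is_absolute_http_url("https://s3.your-domain.com"))  # True
print(is_absolute_http_url("s3.your-domain.com"))          # False
```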
3. Start the Services
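# Set user/group ID for proper permissions.
# In bash, UID is a read-only shell variable, so the assignment below fails there;
# run `export UID` (without an assignment) instead to pass it to Docker Compose.
export UID=$(id -u)
export GID=$(id -g)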
# Start all services
docker compose up -d --build
Services will start in this order:
- MinIO (S3 storage)
- PostgreSQL (metadata database)
- LakeFS (version control)
- KohakuHub API (main application)
4. Verify Installation
Check that all services are running:
docker compose ps
Access the web interfaces:
- KohakuHub API: http://localhost:48888/docs (API documentation)
- LakeFS Web UI: http://localhost:28000 (repository browser)
- MinIO Console: http://localhost:29000 (storage browser)
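The same check can be scripted. The sketch below only tests whether each default port accepts a TCP connection on localhost; it does not prove that the service behind it is healthy. The `port_open` helper and the `SERVICES` table are illustrative, with port numbers taken from the default configuration described earlier.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Default ports from docker/docker-compose.yml; adjust if you changed them.
SERVICES = {
    "KohakuHub API": 48888,
    "LakeFS": 28000,
    "MinIO Console": 29000,
    "MinIO S3 API": 29001,
}

if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "reachable" if port_open("localhost", port) else "NOT reachable"
        print(f"{name:15} :{port}  {state}")
```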
5. Test with Python Client
# Install the official HuggingFace client
pip install huggingface_hub
# Run the test script
python scripts/test.py
The test script will:
- Create a test repository
- Upload files (both small and large)
- Download them back
- Verify content integrity
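For orientation, the steps above can be sketched with the huggingface_hub client as follows. This is not the content of scripts/test.py: the `round_trip` and `sha256_hex` helpers, the file name `roundtrip.bin`, and the repository name in the example call are all made up for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest used to compare uploaded and downloaded content."""
    return hashlib.sha256(data).hexdigest()


def round_trip(repo_id: str, endpoint: str, token: str, payload: bytes) -> bool:
    """Upload payload to repo_id, download it again, and compare digests."""
    # Imported here so sha256_hex stays usable without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi(endpoint=endpoint, token=token)
    api.create_repo(repo_id, repo_type="model", exist_ok=True)
    api.upload_file(
        path_or_fileobj=payload,
        path_in_repo="roundtrip.bin",
        repo_id=repo_id,
    )
    local_path = api.hf_hub_download(repo_id=repo_id, filename="roundtrip.bin")
    with open(local_path, "rb") as fh:
        return sha256_hex(fh.read()) == sha256_hex(payload)


# Example call (needs a running instance and a valid API token):
# round_trip("my-org/roundtrip-test", "http://localhost:48888",
#            "your_api_token_here", b"x" * 1024)
```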
kohub-cli Usage
KohakuHub includes a command-line tool, kohub-cli, to simplify user and organization management.
Installation
The CLI is included in the source code. To run it, first install the dependencies:
pip install -r requirements.txt
pip install -e .
Then run the CLI:
kohub-cli
User Management
The CLI provides an interactive menu for user management.
1. Register a New User
- Run the CLI and select User Management -> Register.
- Follow the prompts to enter a username, email, and password.
2. Login
- Select User Management -> Login.
- Enter your username and password to create a session.
3. Generate an API Token
- After logging in, select User Management -> Create Token.
- Give the token a name (e.g., "my-laptop").
- The CLI will print a new API token. Save this token securely!
Organization Management
You can also manage organizations and members.
- Create Organization: Organization Management -> Create Organization
- Manage Members: Add, remove, or update member roles within an organization.
Using KohakuHub
With Python Client
To interact with your private repositories, you need to provide your API token.
import os
from huggingface_hub import HfApi
# Point to your KohakuHub instance
os.environ["HF_ENDPOINT"] = "http://localhost:48888"
# Provide your API token
os.environ["HF_TOKEN"] = "your_api_token_here"
api = HfApi(
    endpoint=os.environ["HF_ENDPOINT"],
    token=os.environ["HF_TOKEN"],
)
# Create a repository
api.create_repo("my-org/my-model", repo_type="model")
# Upload files
api.upload_file(
    path_or_fileobj="model.safetensors",
    path_in_repo="model.safetensors",
    repo_id="my-org/my-model",
)
# Download files
file = api.hf_hub_download(
    repo_id="my-org/my-model",
    filename="model.safetensors",
)
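The same client can also push a whole directory and list what ended up in the repository. The sketch below assumes your KohakuHub instance supports the endpoints behind `upload_folder` and `list_repo_files`, which this README does not confirm individually; `hub_settings` and `upload_directory` are hypothetical helper names.

```python
import os


def hub_settings(environ=os.environ) -> dict:
    """Collect endpoint and token from the environment, failing early if one is missing."""
    endpoint = environ.get("HF_ENDPOINT", "").rstrip("/")
    token = environ.get("HF_TOKEN", "")
    if not endpoint:
        raise ValueError("HF_ENDPOINT is not set")
    if not token:
        raise ValueError("HF_TOKEN is not set")
    return {"endpoint": endpoint, "token": token}


def upload_directory(local_dir: str, repo_id: str, settings: dict) -> list:
    """Upload every file under local_dir and return the repository's file list."""
    # Imported here so hub_settings can be used without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi(**settings)
    api.upload_folder(folder_path=local_dir, repo_id=repo_id, repo_type="model")
    return api.list_repo_files(repo_id, repo_type="model")


# Example call (needs a running instance and a valid API token):
# print(upload_directory("./my-model", "my-org/my-model", hub_settings()))
```

Reading the token from the environment keeps it out of scripts that may be committed or shared.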
With hfutils
hfutils: https://github.com/deepghs/hfutils
With hfutils you can also upload a whole folder easily and use KohakuHub from any HuggingFace model loader, such as transformers or diffusers.
# Download a model from the official HuggingFace Hub
export HF_ENDPOINT="https://huggingface.co/"
hfutils download -t model -r KBlueLeaf/EQ-SDXL-VAE -d . -o ./eq-sdxl
# Upload it to your KohakuHub instance
export HF_ENDPOINT="http://127.0.0.1:48888/"
export HF_TOKEN="your_api_token_here"
hfutils upload -t model -r KBlueLeaf/EQ-SDXL-VAE -d . -i ./eq-sdxl
Use KohakuHub in transformers and diffusers
You can load models hosted on KohakuHub directly in transformers and diffusers:
import os
# Set these before importing diffusers/transformers:
# huggingface_hub reads HF_ENDPOINT when it is first imported.
os.environ["HF_ENDPOINT"] = "http://127.0.0.1:48888/"
os.environ["HF_TOKEN"] = "your_api_token_here"
from diffusers import AutoencoderKL
vae = AutoencoderKL.from_pretrained("KBlueLeaf/EQ-SDXL-VAE")
Accessing LakeFS Web UI
LakeFS credentials are automatically generated on first startup and stored in:
docker/hub-meta/hub-api/credentials.env
# or another path if you have modified docker-compose.yml
Use these credentials to log into the LakeFS web interface at http://localhost:28000 and browse your repositories.
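If you prefer to read the file from a script, a small parser is enough. The sketch below assumes the file uses plain KEY=VALUE lines, as the .env extension suggests; the key names inside are not documented here, so it simply prints whatever it finds. `parse_env_text` is a hypothetical helper.

```python
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


if __name__ == "__main__":
    # Default location; change it if you modified docker-compose.yml.
    path = Path("docker/hub-meta/hub-api/credentials.env")
    if path.exists():
        for key, value in parse_env_text(path.read_text()).items():
            print(f"{key}={value}")
    else:
        print(f"{path} not found; has the stack been started at least once?")
```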
Configuration Options
For advanced configuration, you can create a config.toml file or use environment variables. See config-example.toml for all available options.
Key settings include:
- LFS threshold: Files larger than this use the Git LFS protocol (default: 10MB)
- Database backend: Choose between SQLite (default) and PostgreSQL
- Authentication: Enable email verification, set session expiry, etc.
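The LFS threshold rule amounts to a size comparison. The sketch below illustrates it under two assumptions that are not confirmed here: that "10MB" means 10 × 1024 × 1024 bytes, and that a file exactly at the threshold does not use LFS. `uses_lfs` is a hypothetical name, not a KohakuHub function.

```python
LFS_THRESHOLD_BYTES = 10 * 1024 * 1024  # assumption: "10MB" means 10 MiB


def uses_lfs(size_bytes: int, threshold: int = LFS_THRESHOLD_BYTES) -> bool:
    """Files strictly larger than the threshold take the Git LFS path."""
    return size_bytes > threshold


print(uses_lfs(4 * 1024))     # False: a small config file
print(uses_lfs(2 * 1024**3))  # True: a 2 GiB checkpoint
```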
Project Status & Roadmap
See TODO.md for detailed development status.
Current Status:
- ✅ Core API (upload, download, version control)
- ✅ HuggingFace client compatibility
- ✅ Large file support (Git LFS)
- ✅ Docker deployment
- ✅ Authentication & Authorization
- ✅ Organization Management
- ✅ CLI for administration
- 🚧 Web user interface
Contributing
We welcome contributions! Especially:
🎨 Web Interface (High Priority!)
We're looking for frontend developers to help build a modern web UI. Preferred stack:
- Vue 3 + Vite for the framework
- Tailwind CSS for styling
- Similar UX to HuggingFace Hub
If you're interested in leading the web UI development, please reach out on Discord!
Other Contributions
- Bug reports and feature requests via GitHub Issues
- Code improvements and bug fixes via Pull Requests
- Documentation improvements
- Testing and feedback
Join our community on Discord: https://discord.gg/xWYrkyvJ2s
License
Currently licensed under AGPL-3.0. The license may be updated to a more permissive option after initial development is complete.
Acknowledgments
- HuggingFace for the amazing Hub platform and client library
- LakeFS for Git-like data versioning
- MinIO for S3-compatible object storage
Support & Community
- Discord: https://discord.gg/xWYrkyvJ2s
- GitHub Issues: Bug reports and feature requests
- Discussions: Design discussions and questions
Note: This project is in active development. APIs may change, and features may be incomplete. Not recommended for production use yet.