KohakuHub Deployment Architecture
Setup Instructions
First Time Setup
Option 1: Interactive Generator (Recommended)
Use the interactive generator to create a customized docker-compose.yml:
# Run the generator
python scripts/generate_docker_compose.py
The generator will ask you to configure:
- PostgreSQL (built-in container or external database)
- LakeFS database backend (PostgreSQL or SQLite)
- S3 storage (built-in MinIO or external S3/R2)
- Security keys (auto-generated or custom)
See scripts/README.md for detailed usage.
Option 2: Manual Configuration
- Copy the configuration file:

  cp docker-compose.example.yml docker-compose.yml

- Edit docker-compose.yml:
- Change MinIO credentials (MINIO_ROOT_USER, MINIO_ROOT_PASSWORD)
- Change PostgreSQL password (POSTGRES_PASSWORD)
- Change LakeFS secret key (LAKEFS_AUTH_ENCRYPT_SECRET_KEY)
- Change session secret (KOHAKU_HUB_SESSION_SECRET)
- Update BASE_URL if deploying to a domain
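Each of the secrets above should be a long random string. A minimal sketch for generating them with Python's standard library — the variable names match the settings listed above; `secrets.token_urlsafe` is just one reasonable choice of generator:

```python
# Generate random values for the secrets listed above.
# Any sufficiently long random string works equally well.
import secrets


def generate_secret(n_bytes: int = 32) -> str:
    """Return a URL-safe random string derived from n_bytes of entropy."""
    return secrets.token_urlsafe(n_bytes)


for name in (
    "POSTGRES_PASSWORD",
    "LAKEFS_AUTH_ENCRYPT_SECRET_KEY",
    "KOHAKU_HUB_SESSION_SECRET",
):
    print(f"{name}={generate_secret()}")
```
Paste the printed values into the corresponding fields of docker-compose.yml.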
Build and Start
After configuration (either option):
npm install --prefix ./src/kohaku-hub-ui
npm run build --prefix ./src/kohaku-hub-ui
docker-compose up -d --build
Note: The repository only includes docker-compose.example.yml as a template. Your customized docker-compose.yml is excluded from git to prevent committing sensitive credentials.
Port Configuration
Production Deployment (Docker)
Exposed Port:
- 28080 - Main entry point (Web UI + API via nginx reverse proxy)
Internal Ports (not exposed to users):
- 48888 - Backend API server (proxied by nginx)
- 28000 - LakeFS UI (admin only)
- 29000 - MinIO Console (admin only)
- 29001 - MinIO S3 API (used by backend; must also be reachable by clients for presigned-URL downloads)
- 25432 - PostgreSQL (optional, for external access)
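To verify which of these ports are actually reachable from a given machine, a quick TCP check can help — the port list mirrors the table above; the helper itself is illustrative, not part of KohakuHub:

```python
# Quick TCP reachability check for the ports listed above.
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for port in (28080, 48888, 28000, 29000, 29001, 25432):
        state = "open" if port_open("localhost", port) else "closed"
        print(f"localhost:{port} is {state}")
```
In a correctly locked-down production deployment, only 28080 should be open from outside the Docker network.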
Nginx Reverse Proxy
Configuration: docker/nginx/default.conf
graph LR
subgraph "Nginx (Port 28080)"
direction TB
Router[Request Router]
Static[Static Files Handler]
Proxy[API Proxy]
end
Client[Client] -->|Request| Router
Router -->|"/", "/*.html", "/*.js"| Static
Router -->|"/api/*"| Proxy
Router -->|"/org/*"| Proxy
Router -->|"/{ns}/{repo}.git/*"| Proxy
Router -->|"/resolve/*"| Proxy
Static -->|Serve| Vue[Vue 3 Frontend]
Proxy -->|Forward| FastAPI["FastAPI:48888"]
Nginx routing rules:
- Serves frontend static files from /usr/share/nginx/html
- Proxies API requests to hub-api:48888:
  - /api/* → API endpoints
  - /org/* → Organization endpoints
  - /{namespace}/{name}.git/* → Git Smart HTTP protocol
  - /{type}s/{namespace}/{name}/resolve/* → File download endpoints
  - /admin/* → Admin portal (if enabled)
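The routing decision can be sketched as a small matcher. The patterns below are a simplification inferred from the rules above, not a copy of docker/nginx/default.conf; the generic `/{ns}/{repo}/resolve/*` pattern covers model repos, which use no type prefix:

```python
# Simplified model of the nginx routing rules: anything matching a
# proxy pattern goes to hub-api:48888, everything else is served as
# a static file. Patterns are illustrative, not the real config.
import re

PROXY_PATTERNS = [
    r"^/api/",                                           # API endpoints
    r"^/org/",                                           # Organization endpoints
    r"^/[^/]+/[^/]+\.git(/|$)",                          # Git Smart HTTP
    r"^/(models|datasets|spaces)/[^/]+/[^/]+/resolve/",  # Typed file downloads
    r"^/[^/]+/[^/]+/resolve/",                           # Model repos (no prefix)
    r"^/admin/",                                         # Admin portal
]


def route(path: str) -> str:
    """Return 'proxy' or 'static' for a request path."""
    for pattern in PROXY_PATTERNS:
        if re.match(pattern, path):
            return "proxy"
    return "static"
```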
Client Configuration
For HuggingFace Client:
import os
os.environ["HF_ENDPOINT"] = "http://localhost:28080" # Use nginx port
os.environ["HF_TOKEN"] = "your_token"
For kohub-cli:
export HF_ENDPOINT=http://localhost:28080
kohub-cli auth login
For Git Clone:
# Clone repository
git clone http://localhost:28080/namespace/repo.git
# With authentication (private repos)
git clone http://username:token@localhost:28080/namespace/repo.git
# Download large files
cd repo
git lfs install
git lfs pull
❌ WRONG:
os.environ["HF_ENDPOINT"] = "http://localhost:48888" # Don't use backend port directly
Architecture Diagram
graph TB
subgraph "External Access"
Client["Client<br/>(Browser, Git, Python SDK, CLI)"]
end
subgraph "Nginx Container (hub-ui)<br/>Port 28080"
Nginx["Nginx Reverse Proxy<br/>- Static files: Vue 3 frontend<br/>- Proxy: /api, /org, resolve"]
end
subgraph "FastAPI Container (hub-api)<br/>Port 48888 (internal)"
FastAPI["FastAPI Application<br/>- HF-compatible REST API<br/>- Git Smart HTTP<br/>- LFS protocol<br/>- Authentication"]
end
subgraph "Storage Layer"
LakeFS["LakeFS Container<br/>Port 28000 (admin)<br/>- Git-like versioning<br/>- Branch management<br/>- Commit history"]
MinIO["MinIO Container<br/>Port 29000 (console)<br/>Port 29001 (S3 API)<br/>- S3-compatible storage<br/>- Object storage"]
Postgres["PostgreSQL Container<br/>Port 25432 (optional)<br/>- User data<br/>- Metadata<br/>- Quotas"]
end
Client -->|HTTPS/HTTP| Nginx
Nginx -->|Static| Client
Nginx -->|Proxy API| FastAPI
FastAPI -->|REST API| LakeFS
FastAPI -->|SQL| Postgres
FastAPI -->|S3 API| MinIO
LakeFS -->|Store objects| MinIO
Port Mapping:
- 28080 - Public entry point (Nginx)
- 48888 - Internal FastAPI (not exposed)
- 28000 - LakeFS admin UI (optional, for admins)
- 29000 - MinIO console (optional, for admins)
- 29001 - MinIO S3 API (internal + public for downloads)
- 25432 - PostgreSQL (optional, for external access)
Development vs Production
Development
Frontend Dev Server (port 5173):
npm run dev --prefix ./src/kohaku-hub-ui
# Proxies /api → http://localhost:48888
Backend (port 48888):
# Single worker (development with hot reload)
uvicorn kohakuhub.main:app --reload --port 48888
# Multi-worker (production-like testing)
uvicorn kohakuhub.main:app --host 0.0.0.0 --port 48888 --workers 4
Client Access:
- Frontend: http://localhost:5173
- API: http://localhost:48888 (direct)
- Swagger Docs: http://localhost:48888/docs
Production (Docker)
All services via docker-compose:
./deploy.sh
Client Access:
- Everything: http://localhost:28080 (Web UI + API)
- Swagger Docs (dev): http://localhost:48888/docs (if port exposed)
Multi-Worker Deployment
KohakuHub supports horizontal scaling with multiple worker processes.
Database Architecture
Synchronous Database Operations:
- Uses Peewee ORM with synchronous operations
- db.atomic() transactions ensure consistency across workers
- No async database wrappers needed
- Safe for multi-worker deployments
Why Synchronous?
- PostgreSQL and SQLite handle concurrent connections internally
- Atomic transactions prevent race conditions
- Simpler code without async/await complexity
- Better compatibility with multi-worker setups
Future: Migration to peewee-async is planned for improved concurrency.
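The guarantee Peewee's db.atomic() provides is the same transaction pattern the standard library exposes. A self-contained sketch using sqlite3 — the quota table is a placeholder for illustration, not KohakuHub's actual schema:

```python
# Illustrates the atomic-transaction pattern described above using
# stdlib sqlite3; Peewee's db.atomic() gives the same guarantee.
# The quota table is a placeholder, not KohakuHub's real schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quota (username TEXT PRIMARY KEY, used INTEGER)")
conn.execute("INSERT INTO quota VALUES ('alice', 0)")
conn.commit()


def charge(conn: sqlite3.Connection, username: str, amount: int) -> None:
    """Atomically add `amount` to a user's quota usage."""
    # Using the connection as a context manager wraps the block in a
    # transaction: it commits on success and rolls back on exception,
    # so concurrent workers never observe a half-applied update.
    with conn:
        conn.execute(
            "UPDATE quota SET used = used + ? WHERE username = ?",
            (amount, username),
        )


charge(conn, "alice", 100)
```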
Running Multi-Worker
Development/Testing:
# 4 workers (recommended for testing)
uvicorn kohakuhub.main:app --host 0.0.0.0 --port 48888 --workers 4
# 8 workers (production-like load)
uvicorn kohakuhub.main:app --host 0.0.0.0 --port 48888 --workers 8
Docker Deployment:
# docker-compose.yml
services:
hub-api:
command: uvicorn kohakuhub.main:app --host 0.0.0.0 --port 48888 --workers 4
Worker Recommendations
| Deployment | Workers | CPU | Memory | Notes |
|---|---|---|---|---|
| Development | 1 | 2 cores | 2GB | Hot reload enabled |
| Small | 2-4 | 4 cores | 4GB | For <100 users |
| Medium | 4-8 | 8 cores | 8GB | For <1000 users |
| Large | 8-16 | 16+ cores | 16GB+ | For >1000 users |
Formula: Workers = (2 × CPU cores) + 1
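The formula can be computed directly. A small helper — the assumption here is that os.cpu_count() reflects the cores actually available to the process (in a container it may report the host's count instead):

```python
# Compute the recommended worker count from the formula above:
# workers = (2 x CPU cores) + 1.
import os
from typing import Optional


def recommended_workers(cpu_cores: Optional[int] = None) -> int:
    """Return (2 * cores) + 1, detecting cores when not given."""
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    return 2 * cores + 1


print(recommended_workers(4))  # → 9
```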
Benefits of Multi-Worker
- Horizontal Scaling: Handle more concurrent requests
- High Availability: Worker crashes don't affect others
- Better Resource Utilization: Leverage multiple CPU cores
- Load Distribution: Requests distributed across workers
Limitations
- Cannot use --reload with multiple workers
- Shared state must use the database or an external cache
- Log aggregation recommended for debugging
Security Best Practices
Production Deployment
- Only expose port 28080:

  # docker-compose.yml
  hub-ui:
    ports:
      - "28080:80"   # ONLY THIS PORT
  hub-api:
    # NO ports section - internal only

- Use HTTPS with a reverse proxy:

  # Production nginx config
  server {
      listen 443 ssl;
      server_name your-domain.com;
      ssl_certificate /etc/nginx/ssl/cert.pem;
      ssl_certificate_key /etc/nginx/ssl/key.pem;

      location / {
          proxy_pass http://hub-ui:80;
      }
  }

- Set BASE_URL to your domain:

  environment:
    - KOHAKU_HUB_BASE_URL=https://your-domain.com
    - KOHAKU_HUB_S3_PUBLIC_ENDPOINT=https://s3.your-domain.com
Common Mistakes
❌ Don't do this:
# Wrong - bypassing nginx
os.environ["HF_ENDPOINT"] = "http://localhost:48888"
✅ Do this:
# Correct - using nginx reverse proxy
os.environ["HF_ENDPOINT"] = "http://localhost:28080"
Data Flow Examples
Upload Flow (with LFS)
sequenceDiagram
participant User
participant Nginx
participant FastAPI
participant LakeFS
participant MinIO
User->>Nginx: POST /api/models/org/model/commit/main
Nginx->>FastAPI: Forward request
FastAPI->>FastAPI: Parse NDJSON (header + files + lfsFiles)
alt Small File (<5MB)
FastAPI->>LakeFS: Upload object (base64 decoded)
LakeFS->>MinIO: Store object
    else Large File (≥5MB)
Note over FastAPI,MinIO: File already uploaded via presigned URL
FastAPI->>LakeFS: Link physical address
end
FastAPI->>LakeFS: Commit with message
LakeFS-->>FastAPI: Commit ID
FastAPI-->>Nginx: 200 OK + commit URL
Nginx-->>User: Commit successful
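The NDJSON parsing step above can be sketched as follows. The HF commit payload is one JSON object per line, keyed as `header`, `file`, or `lfsFile`; the exact field layout below follows that convention but is an assumption about the wire format, not KohakuHub's actual parser:

```python
# Sketch of parsing a commit payload like the one in the diagram:
# one JSON object per line, keyed as header / file / lfsFile.
# Field names are assumptions, not KohakuHub's real code.
import base64
import json


def parse_commit_ndjson(payload: str):
    """Split an NDJSON commit payload into (header, files, lfs_files)."""
    header, files, lfs_files = None, [], []
    for line in payload.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        if record["key"] == "header":
            header = record["value"]
        elif record["key"] == "file":
            item = dict(record["value"])
            # Small files arrive base64-encoded inline.
            item["content"] = base64.b64decode(item["content"])
            files.append(item)
        elif record["key"] == "lfsFile":
            # Large files were already uploaded via presigned URL;
            # only metadata (path, oid, size) is linked here.
            lfs_files.append(record["value"])
    return header, files, lfs_files
```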
Download Flow (Direct S3)
sequenceDiagram
participant User
participant Nginx
participant FastAPI
participant LakeFS
participant MinIO
User->>Nginx: GET /org/model/resolve/main/model.safetensors
Nginx->>FastAPI: Forward request
FastAPI->>LakeFS: Stat object (get metadata)
LakeFS-->>FastAPI: Physical address + SHA256
FastAPI->>MinIO: Generate presigned URL (1 hour)
FastAPI-->>Nginx: 302 Redirect
Nginx-->>User: Redirect to presigned URL
User->>MinIO: Direct download
MinIO-->>User: File content
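The resolve path requested in step 1 follows the HF URL convention: model repos use no type prefix, while datasets and spaces are prefixed with /{type}s/. A small builder, inferred from the routing table above rather than taken from KohakuHub's code:

```python
# Build the resolve URL a client requests (step 1 of the diagram).
# Models use no type prefix; datasets/spaces use /{type}s/ — this
# mirrors the nginx routing table and the HF URL convention.
def resolve_url(endpoint: str, repo_type: str, namespace: str,
                name: str, revision: str, filename: str) -> str:
    """Return the resolve URL for a file in a repo."""
    prefix = "" if repo_type == "model" else f"{repo_type}s/"
    return f"{endpoint}/{prefix}{namespace}/{name}/resolve/{revision}/{filename}"


print(resolve_url("http://localhost:28080", "model",
                  "org", "model", "main", "model.safetensors"))
# → http://localhost:28080/org/model/resolve/main/model.safetensors
```
Requesting this URL returns a 302 redirect whose Location header is the presigned S3 URL; the file itself never passes through nginx or FastAPI.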
Why This Architecture?
- Single Entry Point: Users only need to know one port (28080)
- Security: Backend (48888) not exposed to internet
- SSL Termination: Nginx handles HTTPS
- Static File Serving: Nginx serves frontend efficiently
- Load Balancing: Can add multiple backend instances behind nginx
- Caching: Nginx can cache static assets
- Direct Downloads: Files downloaded directly from S3, not proxied
- Scalability: Each component can scale independently
Troubleshooting
"Connection refused to localhost:48888"
Problem: Client trying to connect directly to backend
Solution: Change HF_ENDPOINT to use port 28080:
export HF_ENDPOINT=http://localhost:28080
"CORS errors in browser"
Problem: Frontend trying to access wrong port
Solution: Ensure KOHAKU_HUB_BASE_URL is set correctly:
environment:
- KOHAKU_HUB_BASE_URL=http://localhost:28080
"API calls returning HTML instead of JSON"
Problem: Hitting nginx for a non-proxied path
Solution: Check that the nginx config proxies all API paths listed in the routing rules above