KohakuHub Deployment Architecture
Setup Instructions
First Time Setup
Option 1: Interactive Generator (Recommended)
Use the interactive generator to create a customized docker-compose.yml:
# Run the generator
python scripts/generate_docker_compose.py
The generator will ask you to configure:
- PostgreSQL (built-in container or external database)
- LakeFS database backend (PostgreSQL or SQLite)
- S3 storage (built-in MinIO or external S3/R2)
- Security keys (auto-generated or custom)
See scripts/README.md for detailed usage.
Option 2: Manual Configuration
- Copy the configuration file:

  cp docker-compose.example.yml docker-compose.yml

- Edit docker-compose.yml:
- Change MinIO credentials (MINIO_ROOT_USER, MINIO_ROOT_PASSWORD)
- Change PostgreSQL password (POSTGRES_PASSWORD)
- Change LakeFS secret key (LAKEFS_AUTH_ENCRYPT_SECRET_KEY)
- Change session secret (KOHAKU_HUB_SESSION_SECRET)
- Update BASE_URL if deploying to a domain
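Each of the secrets above should be a long random string. A minimal sketch for generating them with Python's standard library — the variable names match the settings listed above; `secrets.token_urlsafe` is just one reasonable choice of generator:

```python
# Generate random values for the secrets listed above.
# Any sufficiently long random string works equally well.
import secrets


def generate_secret(n_bytes: int = 32) -> str:
    """Return a URL-safe random string derived from n_bytes of entropy."""
    return secrets.token_urlsafe(n_bytes)


for name in (
    "POSTGRES_PASSWORD",
    "LAKEFS_AUTH_ENCRYPT_SECRET_KEY",
    "KOHAKU_HUB_SESSION_SECRET",
):
    print(f"{name}={generate_secret()}")
```
Paste the printed values into the corresponding fields of docker-compose.yml.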
Build and Start
After configuration (either option):
npm install --prefix ./src/kohaku-hub-ui
npm run build --prefix ./src/kohaku-hub-ui
docker-compose up -d --build
Note: The repository only includes docker-compose.example.yml as a template. Your customized docker-compose.yml is excluded from git to prevent committing sensitive credentials.
Port Configuration
Production Deployment (Docker)
Exposed Port:
- 28080 - Main entry point (Web UI + API via nginx reverse proxy)
Internal Ports (not exposed to users):
- 48888 - Backend API server (proxied by nginx)
- 28000 - LakeFS UI (admin only)
- 29000 - MinIO Console (admin only)
- 29001 - MinIO S3 API (used by backend; must also be reachable by clients for presigned-URL downloads)
- 25432 - PostgreSQL (optional, for external access)
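To verify which of these ports are actually reachable from a given machine, a quick TCP check can help — the port list mirrors the table above; the helper itself is illustrative, not part of KohakuHub:

```python
# Quick TCP reachability check for the ports listed above.
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for port in (28080, 48888, 28000, 29000, 29001, 25432):
        state = "open" if port_open("localhost", port) else "closed"
        print(f"localhost:{port} is {state}")
```
In a correctly locked-down production deployment, only 28080 should be open from outside the Docker network.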
Nginx Reverse Proxy
Configuration: docker/nginx/default.conf
graph LR
subgraph "Nginx (Port 28080)"
direction TB
Router[Request Router]
Static[Static Files Handler]
Proxy[API Proxy]
end
Client[Client] -->|Request| Router
Router -->|"/", "/*.html", "/*.js"| Static
Router -->|"/api/*"| Proxy
Router -->|"/org/*"| Proxy
Router -->|"/{ns}/{repo}.git/*"| Proxy
Router -->|"/resolve/*"| Proxy
Static -->|Serve| Vue[Vue 3 Frontend]
Proxy -->|Forward| FastAPI["FastAPI:48888"]
Nginx routing rules:
- Serves frontend static files from /usr/share/nginx/html
- Proxies API requests to hub-api:48888:
  - /api/* → API endpoints
  - /org/* → Organization endpoints
  - /{namespace}/{name}.git/* → Git Smart HTTP protocol
  - /{type}s/{namespace}/{name}/resolve/* → File download endpoints
  - /admin/* → Admin portal (if enabled)
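The routing decision can be sketched as a small matcher. The patterns below are a simplification inferred from the rules above, not a copy of docker/nginx/default.conf; the generic `/{ns}/{repo}/resolve/*` pattern covers model repos, which use no type prefix:

```python
# Simplified model of the nginx routing rules: anything matching a
# proxy pattern goes to hub-api:48888, everything else is served as
# a static file. Patterns are illustrative, not the real config.
import re

PROXY_PATTERNS = [
    r"^/api/",                                           # API endpoints
    r"^/org/",                                           # Organization endpoints
    r"^/[^/]+/[^/]+\.git(/|$)",                          # Git Smart HTTP
    r"^/(models|datasets|spaces)/[^/]+/[^/]+/resolve/",  # Typed file downloads
    r"^/[^/]+/[^/]+/resolve/",                           # Model repos (no prefix)
    r"^/admin/",                                         # Admin portal
]


def route(path: str) -> str:
    """Return 'proxy' or 'static' for a request path."""
    for pattern in PROXY_PATTERNS:
        if re.match(pattern, path):
            return "proxy"
    return "static"
```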
Client Configuration
For HuggingFace Client:
import os
os.environ["HF_ENDPOINT"] = "http://localhost:28080" # Use nginx port
os.environ["HF_TOKEN"] = "your_token"
For kohub-cli:
export HF_ENDPOINT=http://localhost:28080
kohub-cli auth login
For Git Clone:
# Clone repository
git clone http://localhost:28080/namespace/repo.git
# With authentication (private repos)
git clone http://username:token@localhost:28080/namespace/repo.git
# Download large files
cd repo
git lfs install
git lfs pull
❌ WRONG:
os.environ["HF_ENDPOINT"] = "http://localhost:48888" # Don't use backend port directly
Architecture Diagram
graph TB
subgraph "External Access"
Client["Client<br/>(Browser, Git, Python SDK, CLI)"]
end
subgraph "Nginx Container (hub-ui)<br/>Port 28080"
Nginx["Nginx Reverse Proxy<br/>- Static files: Vue 3 frontend<br/>- Proxy: /api, /org, resolve"]
end
subgraph "FastAPI Container (hub-api)<br/>Port 48888 (internal)"
FastAPI["FastAPI Application<br/>- HF-compatible REST API<br/>- Git Smart HTTP<br/>- LFS protocol<br/>- Authentication"]
end
subgraph "Storage Layer"
LakeFS["LakeFS Container<br/>Port 28000 (admin)<br/>- Git-like versioning<br/>- Branch management<br/>- Commit history"]
MinIO["MinIO Container<br/>Port 29000 (console)<br/>Port 29001 (S3 API)<br/>- S3-compatible storage<br/>- Object storage"]
Postgres["PostgreSQL Container<br/>Port 25432 (optional)<br/>- User data<br/>- Metadata<br/>- Quotas"]
end
Client -->|HTTPS/HTTP| Nginx
Nginx -->|Static| Client
Nginx -->|Proxy API| FastAPI
FastAPI -->|REST API| LakeFS
FastAPI -->|SQL| Postgres
FastAPI -->|S3 API| MinIO
LakeFS -->|Store objects| MinIO
Port Mapping:
- 28080 - Public entry point (Nginx)
- 48888 - Internal FastAPI (not exposed)
- 28000 - LakeFS admin UI (optional, for admins)
- 29000 - MinIO console (optional, for admins)
- 29001 - MinIO S3 API (internal + public for downloads)
- 25432 - PostgreSQL (optional, for external access)
Development vs Production
Development
Frontend Dev Server (port 5173):
npm run dev --prefix ./src/kohaku-hub-ui
# Proxies /api → http://localhost:48888
Backend (port 48888):
# Single worker (development with hot reload)
uvicorn kohakuhub.main:app --reload --port 48888
# Multi-worker (production-like testing)
uvicorn kohakuhub.main:app --host 0.0.0.0 --port 48888 --workers 4
Client Access:
- Frontend: http://localhost:5173
- API: http://localhost:48888 (direct)
- Swagger Docs: http://localhost:48888/docs
Production (Docker)
All services via docker-compose:
./deploy.sh
Client Access:
- Everything: http://localhost:28080 (Web UI + API)
- Swagger Docs (dev): http://localhost:48888/docs (if port exposed)
Multi-Worker Deployment
KohakuHub supports horizontal scaling with multiple worker processes.
Database Architecture
Synchronous Database Operations:
- Uses Peewee ORM with synchronous operations
- db.atomic() transactions ensure consistency across workers
- No async database wrappers needed
- Safe for multi-worker deployments
Why Synchronous?
- PostgreSQL and SQLite handle concurrent connections internally
- Atomic transactions prevent race conditions
- Simpler code without async/await complexity
- Better compatibility with multi-worker setups
Future: Migration to peewee-async is planned for improved concurrency.
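The guarantee Peewee's db.atomic() provides is the same transaction pattern the standard library exposes. A self-contained sketch using sqlite3 — the quota table is a placeholder for illustration, not KohakuHub's actual schema:

```python
# Illustrates the atomic-transaction pattern described above using
# stdlib sqlite3; Peewee's db.atomic() gives the same guarantee.
# The quota table is a placeholder, not KohakuHub's real schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quota (username TEXT PRIMARY KEY, used INTEGER)")
conn.execute("INSERT INTO quota VALUES ('alice', 0)")
conn.commit()


def charge(conn: sqlite3.Connection, username: str, amount: int) -> None:
    """Atomically add `amount` to a user's quota usage."""
    # Using the connection as a context manager wraps the block in a
    # transaction: it commits on success and rolls back on exception,
    # so concurrent workers never observe a half-applied update.
    with conn:
        conn.execute(
            "UPDATE quota SET used = used + ? WHERE username = ?",
            (amount, username),
        )


charge(conn, "alice", 100)
```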
Running Multi-Worker
Development/Testing:
# 4 workers (recommended for testing)
uvicorn kohakuhub.main:app --host 0.0.0.0 --port 48888 --workers 4
# 8 workers (production-like load)
uvicorn kohakuhub.main:app --host 0.0.0.0 --port 48888 --workers 8
Docker Deployment:
# docker-compose.yml
services:
hub-api:
command: uvicorn kohakuhub.main:app --host 0.0.0.0 --port 48888 --workers 4
Worker Recommendations
| Deployment | Workers | CPU | Memory | Notes |
|---|---|---|---|---|
| Development | 1 | 2 cores | 2GB | Hot reload enabled |
| Small | 2-4 | 4 cores | 4GB | For <100 users |
| Medium | 4-8 | 8 cores | 8GB | For <1000 users |
| Large | 8-16 | 16+ cores | 16GB+ | For >1000 users |
Formula: Workers = (2 × CPU cores) + 1
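The formula can be computed directly. A small helper — the assumption here is that os.cpu_count() reflects the cores actually available to the process (in a container it may report the host's count instead):

```python
# Compute the recommended worker count from the formula above:
# workers = (2 x CPU cores) + 1.
import os
from typing import Optional


def recommended_workers(cpu_cores: Optional[int] = None) -> int:
    """Return (2 * cores) + 1, detecting cores when not given."""
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    return 2 * cores + 1


print(recommended_workers(4))  # → 9
```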
Benefits of Multi-Worker
- Horizontal Scaling: Handle more concurrent requests
- High Availability: Worker crashes don't affect others
- Better Resource Utilization: Leverage multiple CPU cores
- Load Distribution: Requests distributed across workers
Limitations
- Cannot use --reload with multiple workers
- Shared state must use the database or an external cache
- Log aggregation recommended for debugging
Security Best Practices
Production Deployment
- Only expose port 28080:

  # docker-compose.yml
  hub-ui:
    ports:
      - "28080:80"   # ONLY THIS PORT
  hub-api:
    # NO ports section - internal only

- Use HTTPS with a reverse proxy:

  # Production nginx config
  server {
      listen 443 ssl;
      server_name your-domain.com;
      ssl_certificate /etc/nginx/ssl/cert.pem;
      ssl_certificate_key /etc/nginx/ssl/key.pem;

      location / {
          proxy_pass http://hub-ui:80;
      }
  }

- Set BASE_URL to your domain:

  environment:
    - KOHAKU_HUB_BASE_URL=https://your-domain.com
    - KOHAKU_HUB_S3_PUBLIC_ENDPOINT=https://s3.your-domain.com
Common Mistakes
❌ Don't do this:
# Wrong - bypassing nginx
os.environ["HF_ENDPOINT"] = "http://localhost:48888"
✅ Do this:
# Correct - using nginx reverse proxy
os.environ["HF_ENDPOINT"] = "http://localhost:28080"
Data Flow Examples
Upload Flow (with LFS)
sequenceDiagram
participant User
participant Nginx
participant FastAPI
participant LakeFS
participant MinIO
User->>Nginx: POST /api/models/org/model/commit/main
Nginx->>FastAPI: Forward request
FastAPI->>FastAPI: Parse NDJSON (header + files + lfsFiles)
alt Small File (<5MB)
FastAPI->>LakeFS: Upload object (base64 decoded)
LakeFS->>MinIO: Store object
    else Large File (≥5MB)
Note over FastAPI,MinIO: File already uploaded via presigned URL
FastAPI->>LakeFS: Link physical address
end
FastAPI->>LakeFS: Commit with message
LakeFS-->>FastAPI: Commit ID
FastAPI-->>Nginx: 200 OK + commit URL
Nginx-->>User: Commit successful
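The NDJSON parsing step above can be sketched as follows. The HF commit payload is one JSON object per line, keyed as `header`, `file`, or `lfsFile`; the exact field layout below follows that convention but is an assumption about the wire format, not KohakuHub's actual parser:

```python
# Sketch of parsing a commit payload like the one in the diagram:
# one JSON object per line, keyed as header / file / lfsFile.
# Field names are assumptions, not KohakuHub's real code.
import base64
import json


def parse_commit_ndjson(payload: str):
    """Split an NDJSON commit payload into (header, files, lfs_files)."""
    header, files, lfs_files = None, [], []
    for line in payload.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        if record["key"] == "header":
            header = record["value"]
        elif record["key"] == "file":
            item = dict(record["value"])
            # Small files arrive base64-encoded inline.
            item["content"] = base64.b64decode(item["content"])
            files.append(item)
        elif record["key"] == "lfsFile":
            # Large files were already uploaded via presigned URL;
            # only metadata (path, oid, size) is linked here.
            lfs_files.append(record["value"])
    return header, files, lfs_files
```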
Download Flow (Direct S3)
sequenceDiagram
participant User
participant Nginx
participant FastAPI
participant LakeFS
participant MinIO
User->>Nginx: GET /org/model/resolve/main/model.safetensors
Nginx->>FastAPI: Forward request
FastAPI->>LakeFS: Stat object (get metadata)
LakeFS-->>FastAPI: Physical address + SHA256
FastAPI->>MinIO: Generate presigned URL (1 hour)
FastAPI-->>Nginx: 302 Redirect
Nginx-->>User: Redirect to presigned URL
User->>MinIO: Direct download
MinIO-->>User: File content
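The resolve path requested in step 1 follows the HF URL convention: model repos use no type prefix, while datasets and spaces are prefixed with /{type}s/. A small builder, inferred from the routing table above rather than taken from KohakuHub's code:

```python
# Build the resolve URL a client requests (step 1 of the diagram).
# Models use no type prefix; datasets/spaces use /{type}s/ — this
# mirrors the nginx routing table and the HF URL convention.
def resolve_url(endpoint: str, repo_type: str, namespace: str,
                name: str, revision: str, filename: str) -> str:
    """Return the resolve URL for a file in a repo."""
    prefix = "" if repo_type == "model" else f"{repo_type}s/"
    return f"{endpoint}/{prefix}{namespace}/{name}/resolve/{revision}/{filename}"


print(resolve_url("http://localhost:28080", "model",
                  "org", "model", "main", "model.safetensors"))
# → http://localhost:28080/org/model/resolve/main/model.safetensors
```
Requesting this URL returns a 302 redirect whose Location header is the presigned S3 URL; the file itself never passes through nginx or FastAPI.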
Why This Architecture?
- Single Entry Point: Users only need to know one port (28080)
- Security: Backend (48888) not exposed to internet
- SSL Termination: Nginx handles HTTPS
- Static File Serving: Nginx serves frontend efficiently
- Load Balancing: Can add multiple backend instances behind nginx
- Caching: Nginx can cache static assets
- Direct Downloads: Files downloaded directly from S3, not proxied
- Scalability: Each component can scale independently
Troubleshooting
"Connection refused to localhost:48888"
Problem: Client trying to connect directly to backend
Solution: Change HF_ENDPOINT to use port 28080:
export HF_ENDPOINT=http://localhost:28080
"CORS errors in browser"
Problem: Frontend trying to access wrong port
Solution: Ensure KOHAKU_HUB_BASE_URL is set correctly:
environment:
- KOHAKU_HUB_BASE_URL=http://localhost:28080
"API calls returning HTML instead of JSON"
Problem: Hitting nginx for a non-proxied path
Solution: Check that the nginx config proxies all API paths listed in the routing rules above