# KohakuHub Deployment Architecture
## Setup Instructions
### First Time Setup
#### Option 1: Interactive Generator (Recommended)
Use the interactive generator to create a customized `docker-compose.yml`:
```bash
# Run the generator
python scripts/generate_docker_compose.py
```
The generator will ask you to configure:
- PostgreSQL (built-in container or external database)
- LakeFS database backend (PostgreSQL or SQLite)
- S3 storage (built-in MinIO or external S3/R2)
- Security keys (auto-generated or custom)

See [scripts/README.md](../scripts/README.md#docker-compose-generator) for detailed usage.
#### Option 2: Manual Configuration
1. **Copy configuration file:**
   ```bash
   cp docker-compose.example.yml docker-compose.yml
   ```
2. **Edit docker-compose.yml:**
   - Change MinIO credentials (MINIO_ROOT_USER, MINIO_ROOT_PASSWORD)
   - Change PostgreSQL password (POSTGRES_PASSWORD)
   - Change LakeFS secret key (LAKEFS_AUTH_ENCRYPT_SECRET_KEY)
   - Change session secret (KOHAKU_HUB_SESSION_SECRET)
   - Update BASE_URL if deploying to a domain
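
The credentials in step 2 should be long random strings rather than hand-typed values. A minimal sketch for generating them, assuming URL-safe random strings are acceptable for every variable; the variable names come from the checklist above, while the byte lengths and the `generate_secrets` helper are illustrative choices, not project requirements:

```python
import secrets

# Variable names come from the checklist above; the byte lengths are assumptions.
SECRET_SPECS = {
    "MINIO_ROOT_PASSWORD": 24,
    "POSTGRES_PASSWORD": 24,
    "LAKEFS_AUTH_ENCRYPT_SECRET_KEY": 32,
    "KOHAKU_HUB_SESSION_SECRET": 32,
}


def generate_secrets(specs=SECRET_SPECS):
    """Return a dict mapping each variable name to a random URL-safe secret."""
    return {name: secrets.token_urlsafe(nbytes) for name, nbytes in specs.items()}


if __name__ == "__main__":
    # Print NAME=value pairs to paste into docker-compose.yml by hand.
    for name, value in generate_secrets().items():
        print(f"{name}={value}")
```

Paste the printed values into the matching entries of `docker-compose.yml`; the script does not edit the file itself.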
#### Build and Start
After configuration (either option):
```bash
npm install --prefix ./src/kohaku-hub-ui
npm run build --prefix ./src/kohaku-hub-ui
docker-compose up -d --build
```
**Note:** The repository only includes `docker-compose.example.yml` as a template. Your customized `docker-compose.yml` is excluded from git to prevent committing sensitive credentials.
## Port Configuration
### Production Deployment (Docker)
**Exposed Port:**
- **28080** - Main entry point (Web UI + API via nginx reverse proxy)

**Internal Ports (not exposed to users):**
- 48888 - Backend API server (proxied by nginx)
- 28000 - LakeFS UI (admin only)
- 29000 - MinIO Console (admin only)
- 29001 - MinIO S3 API (used by backend; must also be reachable by clients when downloads use presigned URLs)
- 25432 - PostgreSQL (optional, for external access)
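
To confirm which of these ports are actually reachable from a given machine, a quick TCP probe can help. This is a rough sketch using only the standard library; the port list is copied from above, and the helper names `is_port_open` and `scan` are hypothetical:

```python
import socket

# Ports copied from the list above.
PORTS = {
    28080: "nginx (Web UI + API)",
    48888: "backend API",
    28000: "LakeFS UI",
    29000: "MinIO console",
    29001: "MinIO S3 API",
    25432: "PostgreSQL",
}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def scan(host="localhost", ports=PORTS):
    """Probe every port and return {port: reachable}."""
    return {port: is_port_open(host, port) for port in ports}


if __name__ == "__main__":
    for port, reachable in scan().items():
        state = "open" if reachable else "closed"
        print(f"{port:>5}  {state:<6}  {PORTS[port]}")
```

Run it from outside the Docker host to see what the deployment exposes to users; in a locked-down setup most entries other than 28080 should report `closed`.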
### Nginx Reverse Proxy
**Configuration:** `docker/nginx/default.conf`
```mermaid
graph LR
subgraph "Nginx (Port 28080)"
direction TB
Router[Request Router]
Static[Static Files Handler]
Proxy[API Proxy]
end
Client[Client] -->|Request| Router
Router -->|"/", "/*.html", "/*.js"| Static
Router -->|"/api/*"| Proxy
Router -->|"/org/*"| Proxy
Router -->|"/{ns}/{repo}.git/*"| Proxy
Router -->|"/resolve/*"| Proxy
Static -->|Serve| Vue[Vue 3 Frontend]
Proxy -->|Forward| FastAPI["FastAPI:48888"]
```
**Nginx routing rules:**
1. Serves frontend static files from `/usr/share/nginx/html`
2. Proxies API requests to `hub-api:48888`:
   - `/api/*` → API endpoints
   - `/org/*` → Organization endpoints
   - `/{namespace}/{name}.git/*` → Git Smart HTTP protocol
   - `/{type}s/{namespace}/{name}/resolve/*` → File download endpoints
   - `/admin/*` → Admin portal (if enabled)
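
The routing rules above can be approximated in a few lines, which is handy for predicting where a given URL will end up. This is only a sketch: the authoritative rules live in `docker/nginx/default.conf`, the regular expressions are my reading of the list, and expanding `{type}s` to `models`, `datasets`, and `spaces` is an assumption.

```python
import re

# Approximation of the documented routing rules; the real nginx config may differ.
PROXY_PATTERNS = [
    re.compile(r"^/api/"),                # API endpoints
    re.compile(r"^/org/"),                # organization endpoints
    re.compile(r"^/admin/"),              # admin portal (if enabled)
    re.compile(r"^/[^/]+/[^/]+\.git/"),   # Git Smart HTTP
    # File downloads; the set of repo types is an assumption.
    re.compile(r"^/(models|datasets|spaces)/[^/]+/[^/]+/resolve/"),
]


def route(path):
    """Return 'proxy' if the path would be forwarded to hub-api:48888, else 'static'."""
    if any(pattern.search(path) for pattern in PROXY_PATTERNS):
        return "proxy"
    return "static"


if __name__ == "__main__":
    for path in ("/", "/api/models", "/alice/demo.git/info/refs", "/assets/app.js"):
        print(f"{path:<30} -> {route(path)}")
```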
### Client Configuration
**For HuggingFace Client:**
```python
import os

# Set HF_ENDPOINT before importing huggingface_hub; the library reads it at import time
os.environ["HF_ENDPOINT"] = "http://localhost:28080" # Use nginx port
os.environ["HF_TOKEN"] = "your_token"
```
**For kohub-cli:**
```bash
export HF_ENDPOINT=http://localhost:28080
kohub-cli auth login
```
**For Git Clone:**
```bash
# Clone repository
git clone http://localhost:28080/namespace/repo.git
# With authentication (private repos)
git clone http://username:token@localhost:28080/namespace/repo.git
# Download large files
cd repo
git lfs install
git lfs pull
```
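
When the username or token contains characters such as `/` or `@`, the authenticated clone URL has to be percent-encoded or Git will misparse it. A small sketch that builds the URL safely; `clone_url` is a hypothetical helper, and the URL layout is the one shown in the commands above:

```python
from urllib.parse import quote, urlsplit, urlunsplit


def clone_url(base_url, namespace, repo, username=None, token=None):
    """Build a Git clone URL, percent-encoding credentials when they are given."""
    parts = urlsplit(base_url)
    netloc = parts.netloc
    if username and token:
        # safe="" also encodes "/" and "@", which would otherwise break the URL.
        netloc = f"{quote(username, safe='')}:{quote(token, safe='')}@{netloc}"
    path = f"{parts.path.rstrip('/')}/{quote(namespace, safe='')}/{quote(repo, safe='')}.git"
    return urlunsplit((parts.scheme, netloc, path, "", ""))


if __name__ == "__main__":
    print(clone_url("http://localhost:28080", "namespace", "repo"))
    print(clone_url("http://localhost:28080", "namespace", "repo", "username", "to/k@en"))
```

Keep in mind that credentials embedded in a URL end up in shell history and in `.git/config`; a Git credential helper avoids that.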
**❌ WRONG:**
```python
os.environ["HF_ENDPOINT"] = "http://localhost:48888" # Don't use backend port directly
```
## Architecture Diagram
```mermaid
graph TB
subgraph "External Access"
Client["Client<br/>(Browser, Git, Python SDK, CLI)"]
end
subgraph "Nginx Container (hub-ui)<br/>Port 28080"
Nginx["Nginx Reverse Proxy<br/>- Static files: Vue 3 frontend<br/>- Proxy: /api, /org, resolve"]
end
subgraph "FastAPI Container (hub-api)<br/>Port 48888 (internal)"
FastAPI["FastAPI Application<br/>- HF-compatible REST API<br/>- Git Smart HTTP<br/>- LFS protocol<br/>- Authentication"]
end
subgraph "Storage Layer"
LakeFS["LakeFS Container<br/>Port 28000 (admin)<br/>- Git-like versioning<br/>- Branch management<br/>- Commit history"]
MinIO["MinIO Container<br/>Port 29000 (console)<br/>Port 29001 (S3 API)<br/>- S3-compatible storage<br/>- Object storage"]
Postgres["PostgreSQL Container<br/>Port 25432 (optional)<br/>- User data<br/>- Metadata<br/>- Quotas"]
end
Client -->|HTTPS/HTTP| Nginx
Nginx -->|Static| Client
Nginx -->|Proxy API| FastAPI
FastAPI -->|REST API| LakeFS
FastAPI -->|SQL| Postgres
FastAPI -->|S3 API| MinIO
LakeFS -->|Store objects| MinIO
```
**Port Mapping:**
- **28080** - Public entry point (Nginx)
- **48888** - Internal FastAPI (not exposed)
- **28000** - LakeFS admin UI (optional, for admins)
- **29000** - MinIO console (optional, for admins)
- **29001** - MinIO S3 API (internal + public for downloads)
- **25432** - PostgreSQL (optional, for external access)
## Development vs Production
### Development
**Frontend Dev Server** (port 5173):
```bash
npm run dev --prefix ./src/kohaku-hub-ui
# Proxies /api → http://localhost:48888
```
**Backend** (port 48888):
```bash
# Single worker (development with hot reload)
uvicorn kohakuhub.main:app --reload --port 48888
# Multi-worker (production-like testing)
uvicorn kohakuhub.main:app --host 0.0.0.0 --port 48888 --workers 4
```
**Client Access:**
- Frontend: http://localhost:5173
- API: http://localhost:48888 (direct)
- Swagger Docs: http://localhost:48888/docs
### Production (Docker)
**All services via docker-compose:**
```bash
./deploy.sh
```
**Client Access:**
- **Everything:** http://localhost:28080 (Web UI + API)
- Swagger Docs (dev): http://localhost:48888/docs (if port exposed)
## Multi-Worker Deployment
KohakuHub can run multiple worker processes on one host, so requests are handled in parallel across CPU cores.
### Database Architecture
**Synchronous Database Operations:**
- Uses Peewee ORM with synchronous operations
- `db.atomic()` transactions ensure consistency across workers
- No async database wrappers needed
- Safe for multi-worker deployments

**Why Synchronous?**
- PostgreSQL handles concurrent connections natively; SQLite also accepts them but allows only one writer at a time, so it suits lighter write loads
- Atomic transactions prevent race conditions
- Simpler code without async/await complexity
- Better compatibility with multi-worker setups

**Future:** Migration to peewee-async is planned for improved concurrency.
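
To make the transactional guarantee concrete, here is a small illustration of an atomic read-modify-write shared by several concurrent writers. It deliberately uses the standard-library `sqlite3` module instead of Peewee so that it runs without extra dependencies, and it uses threads as stand-ins for worker processes. The `quota` table and both function names are hypothetical, not part of KohakuHub.

```python
import os
import sqlite3
import tempfile
import threading


def increment_quota(db_path, user, amount):
    """Atomically add `amount` to a user's counter inside one transaction."""
    # isolation_level=None disables implicit transactions so we control BEGIN/COMMIT.
    conn = sqlite3.connect(db_path, timeout=30, isolation_level=None)
    try:
        conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
        (used,) = conn.execute(
            "SELECT used FROM quota WHERE user = ?", (user,)
        ).fetchone()
        conn.execute("UPDATE quota SET used = ? WHERE user = ?", (used + amount, user))
        conn.execute("COMMIT")
    except Exception:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise
    finally:
        conn.close()


def run_demo(workers=4, increments=25):
    """Run concurrent writers against a temporary database and return the final count."""
    fd, db_path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        conn = sqlite3.connect(db_path)
        conn.execute("CREATE TABLE quota (user TEXT PRIMARY KEY, used INTEGER)")
        conn.execute("INSERT INTO quota VALUES ('alice', 0)")
        conn.commit()
        conn.close()

        def worker():
            for _ in range(increments):
                increment_quota(db_path, "alice", 1)

        threads = [threading.Thread(target=worker) for _ in range(workers)]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()

        conn = sqlite3.connect(db_path)
        (total,) = conn.execute("SELECT used FROM quota WHERE user = 'alice'").fetchone()
        conn.close()
        return total
    finally:
        os.remove(db_path)


if __name__ == "__main__":
    print(run_demo())
```

Because every read and write happens inside one transaction that holds the write lock, no increment is lost. In Peewee the equivalent scope is a `with db.atomic():` block; on PostgreSQL a read-modify-write like this would typically also need row locking or a single `UPDATE ... SET used = used + ?` statement.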
### Running Multi-Worker
**Development/Testing:**
```bash
# 4 workers (recommended for testing)
uvicorn kohakuhub.main:app --host 0.0.0.0 --port 48888 --workers 4
# 8 workers (production-like load)
uvicorn kohakuhub.main:app --host 0.0.0.0 --port 48888 --workers 8
```
**Docker Deployment:**
```yaml
# docker-compose.yml
services:
  hub-api:
    command: uvicorn kohakuhub.main:app --host 0.0.0.0 --port 48888 --workers 4
```
### Worker Recommendations
| Deployment | Workers | CPU | Memory | Notes |
|------------|---------|-----|--------|-------|
| Development | 1 | 2 cores | 2GB | Hot reload enabled |
| Small | 2-4 | 4 cores | 4GB | For <100 users |
| Medium | 4-8 | 8 cores | 8GB | For <1000 users |
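| Large | 8-16 | 16+ cores | 16GB+ | For >1000 users |

**Formula:** Workers = (2 × CPU cores) + 1 is a common upper bound (the rule of thumb given in Gunicorn's documentation). The table above recommends fewer workers than this bound; start there and raise the count only if CPU remains underused.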
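
The formula is easy to evaluate for the current machine. A minimal sketch, where `recommended_workers` is a hypothetical helper and `os.cpu_count()` stands in for "CPU cores" (it reports logical cores, which may overcount on hyper-threaded hosts):

```python
import os


def recommended_workers(cpu_cores=None):
    """Apply Workers = (2 x CPU cores) + 1; defaults to this machine's core count."""
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    if cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    return 2 * cores + 1


if __name__ == "__main__":
    workers = recommended_workers()
    print(f"uvicorn kohakuhub.main:app --host 0.0.0.0 --port 48888 --workers {workers}")
```

For a 4-core host this yields 9, which is above the 2-4 workers the table suggests for that size, so treat the result as a ceiling rather than a target.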
### Benefits of Multi-Worker
1. **Higher Concurrency:** Handle more concurrent requests
2. **Fault Isolation:** A crashed worker does not take down the others
3. **Better Resource Utilization:** Leverage multiple CPU cores
4. **Load Distribution:** Requests distributed across workers
### Limitations
- Cannot use `--reload` with multiple workers
- Shared state must use database or external cache
- Log aggregation recommended for debugging
## Security Best Practices
### Production Deployment
1. **Only expose port 28080**
```yaml
# docker-compose.yml
hub-ui:
  ports:
    - "28080:80" # ONLY THIS PORT
hub-api:
  # NO ports section - internal only
```
2. **Use HTTPS with reverse proxy**
```nginx
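# Production nginx config
server {
    listen 443 ssl;
    server_name your-domain.com;
    ssl_certificate /etc/nginx/ssl/cert.pem;
    ssl_certificate_key /etc/nginx/ssl/key.pem;
    location / {
        proxy_pass http://hub-ui:80;
        # Forward the original host and scheme to the upstream
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}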
```
3. **Set BASE_URL to your domain**
```yaml
environment:
- KOHAKU_HUB_BASE_URL=https://your-domain.com
- KOHAKU_HUB_S3_PUBLIC_ENDPOINT=https://s3.your-domain.com
```
## Common Mistakes
❌ **Don't do this:**
```python
import os

# Wrong - bypassing nginx
os.environ["HF_ENDPOINT"] = "http://localhost:48888"
```
✅ **Do this:**
```python
import os

# Correct - using nginx reverse proxy
os.environ["HF_ENDPOINT"] = "http://localhost:28080"
```
## Data Flow Examples
### Upload Flow (with LFS)
```mermaid
sequenceDiagram
participant User
participant Nginx
participant FastAPI
participant LakeFS
participant MinIO
User->>Nginx: POST /api/models/org/model/commit/main
Nginx->>FastAPI: Forward request
FastAPI->>FastAPI: Parse NDJSON (header + files + lfsFiles)
alt Small File (<5MB)
FastAPI->>LakeFS: Upload object (base64 decoded)
LakeFS->>MinIO: Store object
else Large File (>5MB)
Note over FastAPI,MinIO: File already uploaded via presigned URL
FastAPI->>LakeFS: Link physical address
end
FastAPI->>LakeFS: Commit with message
LakeFS-->>FastAPI: Commit ID
FastAPI-->>Nginx: 200 OK + commit URL
Nginx-->>User: Commit successful
```
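
The NDJSON body parsed in the diagram can be sketched as follows. The field names (`header`, `file`, `lfsFile`, `algo`, `oid`) follow the commit format used by the Hugging Face Hub client as I understand it; that KohakuHub accepts exactly this schema is an assumption based on its HF compatibility, and `build_commit_payload` is a hypothetical helper.

```python
import base64
import json

LFS_THRESHOLD = 5 * 1024 * 1024  # 5 MB cutoff taken from the diagram above


def build_commit_payload(summary, files, lfs_files=()):
    """Build an NDJSON commit body.

    files:     iterable of (path, bytes) for small files sent inline as base64.
    lfs_files: iterable of (path, sha256_hex, size) for files already uploaded.
    """
    lines = [{"key": "header", "value": {"summary": summary, "description": ""}}]
    for path, content in files:
        if len(content) >= LFS_THRESHOLD:
            raise ValueError(f"{path} is too large to inline; upload it via LFS")
        lines.append({
            "key": "file",
            "value": {
                "path": path,
                "content": base64.b64encode(content).decode("ascii"),
                "encoding": "base64",
            },
        })
    for path, oid, size in lfs_files:
        lines.append({
            "key": "lfsFile",
            "value": {"path": path, "algo": "sha256", "oid": oid, "size": size},
        })
    # One JSON document per line.
    return "\n".join(json.dumps(line) for line in lines) + "\n"


if __name__ == "__main__":
    print(build_commit_payload("Add README", [("README.md", b"# Demo\n")]))
```

In practice the `huggingface_hub` client builds this payload itself; the sketch is mainly useful for debugging requests by hand.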
### Download Flow (Direct S3)
```mermaid
sequenceDiagram
participant User
participant Nginx
participant FastAPI
participant LakeFS
participant MinIO
User->>Nginx: GET /org/model/resolve/main/model.safetensors
Nginx->>FastAPI: Forward request
FastAPI->>LakeFS: Stat object (get metadata)
LakeFS-->>FastAPI: Physical address + SHA256
FastAPI->>MinIO: Generate presigned URL (1 hour)
FastAPI-->>Nginx: 302 Redirect
Nginx-->>User: Redirect to presigned URL
User->>MinIO: Direct download
MinIO-->>User: File content
```
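
Because the redirect target is only valid for a limited time (1 hour in the diagram), a failed download is sometimes just an expired URL. The sketch below reads the expiry from a presigned URL, assuming the query-string form of AWS Signature Version 4 (`X-Amz-Date` and `X-Amz-Expires`); the helper names and the sample URL are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit


def presigned_expiry(url):
    """Return the UTC datetime at which a SigV4 presigned URL stops being valid."""
    query = parse_qs(urlsplit(url).query)
    signed_at = datetime.strptime(
        query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    lifetime = timedelta(seconds=int(query["X-Amz-Expires"][0]))
    return signed_at + lifetime


def is_expired(url, now=None):
    """Return True if the presigned URL has expired at `now` (defaults to the current time)."""
    now = now or datetime.now(timezone.utc)
    return now >= presigned_expiry(url)


if __name__ == "__main__":
    # Sample URL for illustration only; bucket, key, and signature are made up.
    sample = (
        "http://localhost:29001/bucket/object"
        "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
        "&X-Amz-Date=20251011T120000Z&X-Amz-Expires=3600&X-Amz-Signature=abc"
    )
    print(presigned_expiry(sample).isoformat(), is_expired(sample))
```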
## Why This Architecture?
1. **Single Entry Point:** Users only need to know one port (28080)
2. **Security:** Backend (48888) not exposed to internet
3. **SSL Termination:** Nginx handles HTTPS
4. **Static File Serving:** Nginx serves frontend efficiently
5. **Load Balancing:** Can add multiple backend instances behind nginx
6. **Caching:** Nginx can cache static assets
7. **Direct Downloads:** Files downloaded directly from S3, not proxied
8. **Scalability:** Each component can scale independently
## Troubleshooting
### "Connection refused to localhost:48888"
**Problem:** Client trying to connect directly to backend

**Solution:** Change `HF_ENDPOINT` to use port 28080:
```bash
export HF_ENDPOINT=http://localhost:28080
```
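
A script can catch this misconfiguration before any request is made. A minimal sketch, where `check_endpoint` is a hypothetical helper and the two port numbers are the ones used throughout this document:

```python
import os
from urllib.parse import urlsplit

BACKEND_PORT = 48888  # internal FastAPI port
PUBLIC_PORT = 28080   # nginx entry point


def check_endpoint(endpoint):
    """Return a warning string if the endpoint targets the backend port, else None."""
    parts = urlsplit(endpoint)
    if parts.port == BACKEND_PORT:
        suggestion = f"{parts.scheme}://{parts.hostname}:{PUBLIC_PORT}"
        return f"{endpoint} bypasses nginx; use {suggestion} instead"
    return None


if __name__ == "__main__":
    warning = check_endpoint(os.environ.get("HF_ENDPOINT", ""))
    print(warning or "HF_ENDPOINT does not point at the backend port")
```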
### "CORS errors in browser"
**Problem:** Frontend trying to access wrong port

**Solution:** Ensure `KOHAKU_HUB_BASE_URL` is set correctly:
```yaml
environment:
- KOHAKU_HUB_BASE_URL=http://localhost:28080
```
### "API calls returning HTML instead of JSON"
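**Problem:** The request path does not match any nginx proxy rule, so nginx answers with the frontend's HTML instead of forwarding to the backend

**Solution:** Check that the nginx config (`docker/nginx/default.conf`) proxies every API path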