Files
KohakuHub/docs/setup.md
Kohaku-Blueleaf ef35a514de Update documents
2025-10-18 12:01:44 +08:00

331 lines
7.8 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# KohakuHub Setup Guide
*Last Updated: January 2025*
## Quick Start
### 1. Clone Repository
```bash
git clone https://github.com/KohakuBlueleaf/KohakuHub.git
cd KohakuHub
```
### 2. Configure Docker Compose
Choose one of the following methods:
**Option A: Interactive Generator (Recommended)**
```bash
python scripts/generate_docker_compose.py
```
The generator will guide you through:
- PostgreSQL setup (built-in vs external)
- LakeFS database backend
- S3 storage (MinIO vs external)
- Security key generation
**Option B: Manual Configuration**
```bash
cp docker-compose.example.yml docker-compose.yml
# Edit docker-compose.yml and change all security settings
```
### 3. Build Frontend
```bash
npm install --prefix ./src/kohaku-hub-ui
npm install --prefix ./src/kohaku-hub-admin
npm run build --prefix ./src/kohaku-hub-ui
npm run build --prefix ./src/kohaku-hub-admin
```
### 4. Start Services
```bash
docker-compose up -d --build
```
### 5. Verify Installation
```bash
# Check all services are running
docker-compose ps
# View logs
docker-compose logs -f hub-api
```
### 6. Access KohakuHub
- **Web UI & API:** http://localhost:28080
- **Admin Portal:** http://localhost:28080/admin
- **API Docs (Swagger):** http://localhost:48888/docs
## Required Configuration Changes
**IMPORTANT:** You must change these security values before production deployment:
| Variable | Default | Change To | Why |
|----------|---------|-----------|-----|
| `MINIO_ROOT_USER` | minioadmin | your_username | S3 storage security |
| `MINIO_ROOT_PASSWORD` | minioadmin | strong_password | S3 storage security |
| `POSTGRES_PASSWORD` | hubpass | strong_password | Database security |
| `LAKEFS_AUTH_ENCRYPT_SECRET_KEY` | change_this | random_32_chars | LakeFS encryption |
| `KOHAKU_HUB_SESSION_SECRET` | change_this | random_string | Session security |
| `KOHAKU_HUB_ADMIN_SECRET_TOKEN` | change_this | random_string | Admin portal access |
**Generate Secure Values:**
```bash
# Generate 32-character hex key (for LAKEFS_AUTH_ENCRYPT_SECRET_KEY)
openssl rand -hex 32
# Generate 64-character random string (for SESSION_SECRET and ADMIN_TOKEN)
openssl rand -base64 48
```
## Optional Configuration
| Variable | Default | When to Change |
|----------|---------|----------------|
| `KOHAKU_HUB_BASE_URL` | http://localhost:28080 | Deploying to domain |
| `KOHAKU_HUB_S3_PUBLIC_ENDPOINT` | http://localhost:29001 | Using external S3 |
| `KOHAKU_HUB_LFS_THRESHOLD_BYTES` | 10000000 (10MB) | Adjust LFS threshold |
| `KOHAKU_HUB_REQUIRE_EMAIL_VERIFICATION` | false | Enable email verification |
| `KOHAKU_HUB_LFS_KEEP_VERSIONS` | 5 | Change version retention |
| `KOHAKU_HUB_LFS_AUTO_GC` | false | Enable auto garbage collection |
| `KOHAKU_HUB_ADMIN_ENABLED` | true | Disable admin portal |
## Post-Installation
### 1. Create First User
**Via Web UI:**
- Go to http://localhost:28080
- Click "Register"
- Create account
**Via CLI:**
```bash
pip install -e .
kohub-cli auth register
```
### 2. Get LakeFS Credentials
LakeFS credentials are auto-generated on first startup:
```bash
cat docker/hub-meta/hub-api/credentials.env
```
Use these to login to LakeFS UI at http://localhost:28000
### 3. Test with Python
```bash
pip install huggingface_hub
export HF_ENDPOINT=http://localhost:28080
export HF_TOKEN=your_token_from_ui
python scripts/test.py
```
## Troubleshooting
### Services Won't Start
**Check logs:**
```bash
docker-compose logs hub-api
docker-compose logs lakefs
docker-compose logs minio
```
**Common issues:**
- Port already in use (change ports in docker-compose.yml)
- Insufficient disk space
- Docker daemon not running
### Cannot Connect to API
**Verify nginx is running:**
```bash
docker-compose ps hub-ui
```
**Check nginx logs:**
```bash
docker-compose logs hub-ui
```
**Test directly:**
```bash
curl http://localhost:28080/api/version
```
### Cannot Access from External Network
If deploying on a server:
1. Update `KOHAKU_HUB_BASE_URL` to your domain
2. Update `KOHAKU_HUB_S3_PUBLIC_ENDPOINT` if using external S3
3. Add reverse proxy with HTTPS (nginx/traefik/caddy)
4. Only expose port 28080 (or 443 with HTTPS)
## Production Deployment
### 1. Use HTTPS
Add reverse proxy in front of port 28080:
```nginx
# Example nginx config
server {
listen 443 ssl http2;
server_name your-domain.com;
ssl_certificate /path/to/cert.pem;
ssl_certificate_key /path/to/key.pem;
location / {
proxy_pass http://localhost:28080;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
```
### 2. Security Checklist
- [ ] Changed all default passwords
- [ ] Set strong SESSION_SECRET
- [ ] Set strong LAKEFS_AUTH_ENCRYPT_SECRET_KEY
- [ ] Set strong ADMIN_SECRET_TOKEN
- [ ] Using HTTPS with valid certificate
- [ ] Only port 28080 exposed (or 443 for HTTPS)
- [ ] Firewall configured
- [ ] Regular backups configured
### 3. Backup Strategy
**Data to backup:**
- `hub-meta/` - Database, LakeFS metadata, credentials
- `hub-storage/` - MinIO object storage (or use S3)
- `docker-compose.yml` - Your configuration
**Backup command:**
```bash
tar -czf kohakuhub-backup-$(date +%Y%m%d).tar.gz hub-meta/ hub-storage/ docker-compose.yml
```
**Restore:**
```bash
tar -xzf kohakuhub-backup-YYYYMMDD.tar.gz
docker-compose up -d --build
```
## Multi-Worker Deployment
For production deployments, running multiple workers improves performance and availability.
### Database Architecture
KohakuHub uses **synchronous database operations** with Peewee ORM:
- `db.atomic()` transactions ensure data consistency
- Safe for concurrent access from multiple workers
- PostgreSQL and SQLite handle connection pooling internally
- No async database wrappers needed
### Running with Multiple Workers
**Development Testing:**
```bash
# 4 workers (recommended for testing)
uvicorn kohakuhub.main:app --host 0.0.0.0 --port 48888 --workers 4
# Test with load
ab -n 1000 -c 10 http://localhost:48888/health
```
**Docker Deployment:**
Edit your `docker-compose.yml`:
```yaml
services:
hub-api:
image: kohakuhub-api
command: uvicorn kohakuhub.main:app --host 0.0.0.0 --port 48888 --workers 4
environment:
- KOHAKU_HUB_BASE_URL=http://localhost:28080
# ... other env vars
```
### Worker Scaling Guide
| Deployment Size | Workers | CPU Cores | Memory | Concurrent Users |
|----------------|---------|-----------|--------|------------------|
| Development | 1 | 2 | 2GB | <10 |
| Small | 2-4 | 4 | 4GB | <100 |
| Medium | 4-8 | 8 | 8GB | <1000 |
| Large | 8-16 | 16+ | 16GB+ | >1000 |
**Recommended Formula:** Workers = (2 × CPU cores) + 1
### Benefits
- **Horizontal Scaling:** Handle more concurrent requests
- **High Availability:** One worker crash doesn't affect others
- **CPU Utilization:** Leverage multiple cores efficiently
- **Load Balancing:** Uvicorn distributes requests automatically
### Limitations
- Cannot use `--reload` flag with multiple workers
- In-memory caches are per-worker (use Redis for shared cache)
- Log output from all workers (use log aggregation)
## Updating
### Update KohakuHub
```bash
# Pull latest code
git pull
# Rebuild frontend
npm install --prefix ./src/kohaku-hub-ui
npm install --prefix ./src/kohaku-hub-admin
npm run build --prefix ./src/kohaku-hub-ui
npm run build --prefix ./src/kohaku-hub-admin
# Restart services
docker-compose down
docker-compose up -d --build
```
**Note:** Check CHANGELOG for breaking changes before updating.
## Uninstall
```bash
# Stop and remove containers
docker-compose down
# Remove data (WARNING: This deletes everything!)
rm -rf hub-meta/ hub-storage/
# Remove docker-compose config
rm docker-compose.yml
```
## Support
- **Discord:** https://discord.gg/xWYrkyvJ2s
- **GitHub Issues:** https://github.com/KohakuBlueleaf/KohakuHub/issues
- **Documentation:** See docs/ folder