# KohakuHub Setup Guide
*Last Updated: January 2025*
## Quick Start
```mermaid
graph LR
Start[Start] --> Clone[Clone Repository]
Clone --> Config[Configure
docker-compose.yml]
Config --> Build[Build Frontend]
Build --> Deploy[Start Docker]
Deploy --> Verify[Verify Installation]
Verify --> CreateUser[Create First User]
CreateUser --> Done[Ready!]
```
### 1. Clone Repository
```bash
git clone https://github.com/KohakuBlueleaf/KohakuHub.git
cd KohakuHub
```
### 2. Copy Configuration
```bash
cp docker-compose.example.yml docker-compose.yml
```
**Important:** The repository only includes `docker-compose.example.yml` as a template. You must copy it to `docker-compose.yml` and customize it for your deployment.
**Alternative:** Use the interactive generator:
```bash
python scripts/generate_docker_compose.py
```
The generator will guide you through:
- PostgreSQL setup (built-in vs external)
- LakeFS database backend
- S3 storage (MinIO vs external)
- Security key generation
### 2. Customize Configuration
**Edit `docker-compose.yml` and change these critical settings:**
#### ⚠️ Security (MUST CHANGE)
```yaml
# MinIO (Object Storage)
environment:
- MINIO_ROOT_USER=your_secure_username # Change from 'minioadmin'
- MINIO_ROOT_PASSWORD=your_secure_password # Change from 'minioadmin'
# PostgreSQL (Database)
environment:
- POSTGRES_PASSWORD=your_secure_db_password # Change from 'hubpass'
# LakeFS (Version Control)
environment:
- LAKEFS_AUTH_ENCRYPT_SECRET_KEY=generate_random_32_char_key_here # Change!
# KohakuHub API
environment:
- KOHAKU_HUB_SESSION_SECRET=generate_random_string_here # Change!
```
#### 🌐 Deployment URL (Optional)
If deploying to a server with a domain name:
```yaml
# KohakuHub API
environment:
- KOHAKU_HUB_BASE_URL=https://your-domain.com # Change from localhost
- KOHAKU_HUB_S3_PUBLIC_ENDPOINT=https://s3.your-domain.com # For downloads
```
### 3. Build Frontend
```bash
npm install --prefix ./src/kohaku-hub-ui
npm run build --prefix ./src/kohaku-hub-ui
```
### 4. Start Services
```bash
docker-compose up -d --build
```
### 5. Verify Installation
```bash
# Check all services are running
docker-compose ps
# View logs
docker-compose logs -f hub-api
```
### 6. Access KohakuHub
- **Web UI & API:** http://localhost:28080
- **API Docs:** http://localhost:48888/docs (optional, for development)
## Configuration Reference
```mermaid
graph TD
subgraph "Security Settings (MUST CHANGE)"
MinIO["MinIO Credentials
MINIO_ROOT_USER
MINIO_ROOT_PASSWORD"]
Postgres["PostgreSQL Password
POSTGRES_PASSWORD"]
LakeFS["LakeFS Encryption Key
LAKEFS_AUTH_ENCRYPT_SECRET_KEY"]
Session["Session Secret
KOHAKU_HUB_SESSION_SECRET"]
Admin["Admin Token
KOHAKU_HUB_ADMIN_SECRET_TOKEN"]
end
subgraph "Optional Settings"
BaseURL["Base URL
KOHAKU_HUB_BASE_URL"]
S3Public["S3 Public Endpoint
KOHAKU_HUB_S3_PUBLIC_ENDPOINT"]
LFSThreshold["LFS Threshold
KOHAKU_HUB_LFS_THRESHOLD_BYTES"]
Email["Email Verification
KOHAKU_HUB_REQUIRE_EMAIL_VERIFICATION"]
end
Deploy[Deploy] --> Security
Security --> Optional
Optional --> Production[Production Ready]
```
### Required Changes
| Variable | Default | Change To | Why |
|----------|---------|-----------|-----|
| `MINIO_ROOT_USER` | minioadmin | your_username | Security |
| `MINIO_ROOT_PASSWORD` | minioadmin | strong_password | Security |
| `POSTGRES_PASSWORD` | hubpass | strong_password | Security |
| `LAKEFS_AUTH_ENCRYPT_SECRET_KEY` | change_this | random_32_chars | Security |
| `KOHAKU_HUB_SESSION_SECRET` | change_this | random_string | Security |
| `KOHAKU_HUB_ADMIN_SECRET_TOKEN` | change_this | random_string | Admin portal access |
**Generate secure values:**
```bash
# Generate 32-character hex key
openssl rand -hex 32
# Generate 64-character random string
openssl rand -base64 48
```
### Optional Changes
| Variable | Default | When to Change |
|----------|---------|----------------|
| `KOHAKU_HUB_BASE_URL` | http://localhost:28080 | Deploying to domain |
| `KOHAKU_HUB_S3_PUBLIC_ENDPOINT` | http://localhost:29001 | Using external S3 |
| `KOHAKU_HUB_LFS_THRESHOLD_BYTES` | 5242880 (5MB) | Adjust LFS threshold |
| `KOHAKU_HUB_REQUIRE_EMAIL_VERIFICATION` | false | Enable email verification |
| `KOHAKU_HUB_LFS_KEEP_VERSIONS` | 5 | Change version retention |
| `KOHAKU_HUB_LFS_AUTO_GC` | false | Enable auto garbage collection |
| `KOHAKU_HUB_ADMIN_ENABLED` | true | Disable admin portal |
## Post-Installation
### 1. Create First User
**Via Web UI:**
- Go to http://localhost:28080
- Click "Register"
- Create account
**Via CLI:**
```bash
pip install -e .
kohub-cli auth register
```
### 2. Get LakeFS Credentials
LakeFS credentials are auto-generated on first startup:
```bash
cat docker/hub-meta/hub-api/credentials.env
```
Use these to login to LakeFS UI at http://localhost:28000
### 3. Test with Python
```bash
pip install huggingface_hub
export HF_ENDPOINT=http://localhost:28080
export HF_TOKEN=your_token_from_ui
python scripts/test.py
```
## Troubleshooting
### Services Won't Start
**Check logs:**
```bash
docker-compose logs hub-api
docker-compose logs lakefs
docker-compose logs minio
```
**Common issues:**
- Port already in use (change ports in docker-compose.yml)
- Insufficient disk space
- Docker daemon not running
### Cannot Connect to API
**Verify nginx is running:**
```bash
docker-compose ps hub-ui
```
**Check nginx logs:**
```bash
docker-compose logs hub-ui
```
**Test directly:**
```bash
curl http://localhost:28080/api/version
```
### Cannot Access from External Network
**If deploying on a server:**
1. Update `KOHAKU_HUB_BASE_URL` to your domain
2. Update `KOHAKU_HUB_S3_PUBLIC_ENDPOINT` if using external S3
3. Add reverse proxy with HTTPS (nginx/traefik/caddy)
4. Only expose port 28080 (or 443 with HTTPS)
## Production Deployment
### 1. Use HTTPS
Add reverse proxy in front of port 28080:
```nginx
# Example nginx config
server {
listen 443 ssl http2;
server_name your-domain.com;
ssl_certificate /path/to/cert.pem;
ssl_certificate_key /path/to/key.pem;
location / {
proxy_pass http://localhost:28080;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
}
```
### 2. Security Checklist
- [ ] Changed all default passwords
- [ ] Set strong SESSION_SECRET
- [ ] Set strong LAKEFS_AUTH_ENCRYPT_SECRET_KEY
- [ ] Using HTTPS with valid certificate
- [ ] Only port 28080 exposed (or 443 for HTTPS)
- [ ] Firewall configured
- [ ] Regular backups configured
### 3. Backup Strategy
**Data to backup:**
- `hub-meta/` - Database, LakeFS metadata, credentials
- `hub-storage/` - MinIO object storage (or use S3)
- `docker-compose.yml` - Your configuration
```bash
# Backup command
tar -czf kohakuhub-backup-$(date +%Y%m%d).tar.gz hub-meta/ hub-storage/ docker-compose.yml
```
## Updating
### Update KohakuHub
```bash
# Pull latest code
git pull
# Rebuild frontend
npm install --prefix ./src/kohaku-hub-ui
npm run build --prefix ./src/kohaku-hub-ui
# Restart services
docker-compose down
docker-compose up -d --build
```
**Note:** Check CHANGELOG for breaking changes before updating.
## Multi-Worker Deployment
For production deployments, running multiple workers improves performance and availability.
### Database Architecture
KohakuHub uses **synchronous database operations** with Peewee ORM:
- `db.atomic()` transactions ensure data consistency
- Safe for concurrent access from multiple workers
- PostgreSQL and SQLite handle connection pooling internally
- No async database wrappers needed
**Future:** Migration to peewee-async planned for better concurrency.
### Running with Multiple Workers
**Development Testing:**
```bash
# 4 workers (recommended for testing)
uvicorn kohakuhub.main:app --host 0.0.0.0 --port 48888 --workers 4
# Test with load
ab -n 1000 -c 10 http://localhost:48888/health
```
**Docker Deployment:**
Edit your `docker-compose.yml`:
```yaml
services:
hub-api:
image: kohakuhub-api
command: uvicorn kohakuhub.main:app --host 0.0.0.0 --port 48888 --workers 4
environment:
- KOHAKU_HUB_BASE_URL=http://localhost:28080
# ... other env vars
```
### Worker Scaling Guide
| Deployment Size | Workers | CPU Cores | Memory | Concurrent Users |
|----------------|---------|-----------|--------|------------------|
| Development | 1 | 2 | 2GB | <10 |
| Small | 2-4 | 4 | 4GB | <100 |
| Medium | 4-8 | 8 | 8GB | <1000 |
| Large | 8-16 | 16+ | 16GB+ | >1000 |
**Recommended Formula:** Workers = (2 × CPU cores) + 1
### Benefits
- **Horizontal Scaling:** Handle more concurrent requests
- **High Availability:** One worker crash doesn't affect others
- **CPU Utilization:** Leverage multiple cores efficiently
- **Load Balancing:** Uvicorn distributes requests automatically
### Limitations
- Cannot use `--reload` flag with multiple workers
- In-memory caches are per-worker (use Redis for shared cache)
- Log output from all workers (use log aggregation)
## Uninstall
```bash
# Stop and remove containers
docker-compose down
# Remove data (WARNING: This deletes everything!)
rm -rf hub-meta/ hub-storage/
# Remove docker-compose config
rm docker-compose.yml
```
## Support
- **Discord:** https://discord.gg/xWYrkyvJ2s
- **GitHub Issues:** https://github.com/KohakuBlueleaf/KohakuHub/issues
- **Documentation:** See docs/ folder