update documents

Kohaku-Blueleaf
2025-10-06 15:14:03 +08:00
parent 871a22a7a4
commit a299394dea
8 changed files with 724 additions and 463 deletions


@@ -1,297 +1,175 @@
# Contributing to KohakuHub
*Last Updated: October 2025*
Thank you for your interest in contributing to KohakuHub! We welcome contributions from the community.
Thank you for your interest in contributing to KohakuHub! We welcome contributions from the community and are excited to have you here.
## Quick Links
## Table of Contents
- [Getting Started](#getting-started)
- [Development Setup](#development-setup)
- [Project Structure](#project-structure)
- [How to Contribute](#how-to-contribute)
- [Code Style Guidelines](#code-style-guidelines)
- [Testing](#testing)
- [Pull Request Process](#pull-request-process)
- [Community](#community)
- **Discord:** https://discord.gg/xWYrkyvJ2s (Best for discussions)
- **GitHub Issues:** Bug reports and feature requests
- **Development Guide:** See [CLAUDE.md](./CLAUDE.md)
- **Roadmap:** See [Project Status](#project-status) below
## Getting Started
Before you begin:
- Read the [README.md](./README.md) to understand what KohakuHub does
- Check out our [TODO.md](./docs/TODO.md) to see what needs to be done
- Join our [Discord community](https://discord.gg/xWYrkyvJ2s) to discuss your ideas
## Development Setup
### Prerequisites
- **Python 3.10+**: Backend development
- **Node.js 18+**: Frontend development
- **Docker & Docker Compose**: For running the full stack
- **Git**: Version control
- Python 3.10+
- Node.js 18+
- Docker & Docker Compose
- Git
### Quick Setup
### Setup
1. **Clone the repository**
```bash
git clone https://github.com/KohakuBlueleaf/KohakuHub.git
cd KohakuHub
```
```bash
git clone https://github.com/KohakuBlueleaf/KohakuHub.git
cd KohakuHub
2. **Set up Python environment**
```bash
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
pip install -e .
```
# Backend
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install -e ".[dev]"
3. **Install frontend dependencies**
```bash
cd src/kohaku-hub-ui
npm install
```
4. **Start development environment**
```bash
# From project root
./deploy.sh
```
5. **Access the services**
- KohakuHub Web UI: http://localhost:28080
- KohakuHub API: http://localhost:48888
- API Documentation: http://localhost:48888/docs
## Project Structure
# Frontend
npm install --prefix ./src/kohaku-hub-ui
# Start with Docker
cp docker-compose.example.yml docker-compose.yml
./deploy.sh
```
KohakuHub/
├── src/
│ ├── kohakuhub/ # Backend (FastAPI)
│ │ ├── api/ # API endpoints
│ │ ├── auth/ # Authentication & authorization
│ │ ├── org/ # Organization management
│ │ ├── db.py # Database models
│ │ └── main.py # Application entry point
│ ├── kohub_cli/ # CLI tool
│ │ ├── cli.py # CLI commands
│ │ ├── client.py # Python API client
│ │ └── main.py # CLI entry point
│ └── kohaku-hub-ui/ # Frontend (Vue 3 + Vite)
│ ├── src/
│ │ ├── components/ # Reusable components
│ │ ├── pages/ # Page components
│ │ ├── stores/ # State management (Pinia)
│ │ └── utils/ # Utility functions
│ └── vite.config.js
├── docker/ # Docker compose files
├── docs/ # Documentation
│ ├── API.md # API documentation
│ ├── CLI.md # CLI documentation
│ └── TODO.md # Development roadmap
├── scripts/ # Utility scripts
└── README.md
```
**Access:** http://localhost:28080
## Code Style
### Backend (Python)
Follow [CLAUDE.md](./CLAUDE.md) principles:
- Modern Python (match-case, async/await, native types)
- Import order: builtin → 3rd party → our package (alphabetical)
- Use `db_async` wrappers for all DB operations
- Split large functions into smaller ones
### Frontend (Vue 3)
Follow [CLAUDE.md](./CLAUDE.md) principles:
- JavaScript only (no TypeScript), use JSDoc for types
- Split reusable components
- Implement dark/light mode together
- Mobile responsive
## How to Contribute
### Reporting Bugs
If you find a bug, please create an issue on GitHub with:
- **Clear title**: Describe the issue concisely
- **Steps to reproduce**: How can we recreate the bug?
- **Expected behavior**: What should happen?
- **Actual behavior**: What actually happens?
- **Environment**: OS, Python version, Docker version, etc.
- **Logs**: Include relevant error messages or logs
Create an issue with:
- Clear title
- Steps to reproduce
- Expected vs actual behavior
- Environment (OS, Python/Node version)
- Logs/error messages
### Suggesting Features
We welcome feature suggestions! Please:
- Check if the feature is already in [TODO.md](./docs/TODO.md)
- Open a GitHub issue or discuss on Discord
- Describe the use case and why it's valuable
- Propose how it might work
- Check [Project Status](#project-status) first
- Open GitHub issue or discuss on Discord
- Describe use case and value
- Propose implementation approach
### Contributing Code
1. **Pick an issue** or create one describing what you plan to work on
2. **Fork the repository** and create a new branch
3. **Make your changes** following our code style guidelines
4. **Test your changes** thoroughly
5. **Submit a pull request** with a clear description
1. Pick an issue or create one
2. Fork and create branch
3. Make changes following style guidelines
4. Test thoroughly
5. Submit pull request
## Code Style Guidelines
## Project Status
### Backend (Python)
*Last Updated: October 2025*
Follow the guidelines in [CLAUDE.md](./CLAUDE.md):
- **Modern Python**: Use match-case, async/await, type hints
- **Import order**: builtin → 3rd party → our package (alphabetical)
- **Type hints**: Use native types (`dict` not `Dict`)
- **Clean code**: Split large functions into smaller ones
- **Async operations**: Use dedicated threadpools (S3/LakeFS/DB)
- S3 operations → `run_in_s3_executor()`
- LakeFS operations → `run_in_lakefs_executor()`
- DB operations → `db_async` module wrappers
### ✅ Core Features (Complete)
```python
from typing import Optional
from fastapi import APIRouter, HTTPException
**API & Storage:**
- HuggingFace Hub API compatibility
- Git LFS protocol for large files
- File deduplication (SHA256)
- Repository management (create, delete, list, move/rename)
- Branch and tag management
- Commit history
- S3-compatible storage (MinIO, AWS S3, etc.)
- LakeFS versioning (branches, commits, diffs)
def create_repository(
repo_id: str,
repo_type: str,
private: bool = False
) -> dict:
"""Create a new repository.
**Authentication:**
- User registration with email verification (optional)
- Session-based auth + API tokens
- Organization management with role-based access
- Permission system (namespace-based)
Args:
repo_id: Full repository ID (namespace/name)
repo_type: Type of repository (model, dataset, space)
private: Whether the repository is private
**Web UI:**
- Vue 3 interface with dark/light mode
- Repository browsing and file viewer
- Code editor (Monaco) with syntax highlighting
- Markdown rendering
- Commit history viewer
- Settings pages (user, org, repo)
- Documentation viewer
Returns:
Dictionary with repository information
**CLI Tool:**
- Full-featured `kohub-cli` with interactive TUI
- Repository, organization, user management
- Branch/tag operations
- File upload/download
- Commit history viewing
- Health check
- Operation history tracking
- Shell autocomplete (bash/zsh/fish)
Raises:
HTTPException: If repository already exists
"""
pass
```
### 🚧 In Progress
### Frontend (Vue 3 + JavaScript)
- Rate limiting
- More granular permissions
- Repository transfer between namespaces
- Organization deletion
- Search functionality
Follow the guidelines in [CLAUDE.md](./CLAUDE.md):
- **JavaScript only** - No TypeScript, use JSDoc for type hints
- **Split reusable components** - One component per file
- **Dark/light mode** - Implement both at once
- **Mobile responsive** - Consider auto-break lines
- **Composition API**: Use `<script setup>` syntax
- **Styling**: UnoCSS utility classes
### 📋 Planned Features
```vue
<script setup>
import { ref, computed } from 'vue'
import { useRouter } from 'vue-router'
**Advanced Features:**
- Pull requests / merge requests
- Discussion/comments
- Repository stars/likes
- Download statistics
- Model/dataset card templates
- Automated model evaluation
- Multi-region CDN support
- Webhook system
/**
* @typedef {Object} Props
* @property {string} repoId - Repository ID
* @property {string} repoType - Repository type (model/dataset/space)
*/
**UI Improvements:**
- Branch/tag management UI
- Diff viewer for commits
- Image/media file preview
- Activity feed
const props = defineProps({
repoId: String,
repoType: String
})
const router = useRouter()
/** @type {import('vue').Ref<Array<Object>>} */
const files = ref([])
const isLoading = computed(() => files.value.length === 0)
</script>
<template>
<div class="container mx-auto p-4">
<h1 class="text-2xl font-bold">{{ repoId }}</h1>
<!-- Content -->
</div>
</template>
```
### CLI (Python)
- Follow backend Python guidelines
- Use **Click** for command-line interface
- Provide helpful error messages
- Support both interactive and non-interactive modes
## Testing
### Backend Testing
```bash
# Run all tests
pytest
# Run specific test file
pytest tests/test_api.py
# Run with coverage
pytest --cov=kohakuhub
```
### Frontend Testing
```bash
cd src/kohaku-hub-ui
# Run unit tests
npm run test
# Run E2E tests
npm run test:e2e
```
### Manual Testing
1. Start the development environment
2. Test your changes through the Web UI
3. Test API endpoints using the interactive docs at http://localhost:48888/docs
4. Test CLI commands: `kohub-cli [command]`
## Pull Request Process
1. **Update documentation**: If you add features, update relevant docs
2. **Add tests**: Include tests for new functionality
3. **Update CHANGELOG**: Add entry for your changes (if applicable)
4. **Ensure CI passes**: All automated checks must pass
5. **Request review**: Tag maintainers for review
6. **Address feedback**: Respond to review comments promptly
### Pull Request Template
```markdown
## Description
Brief description of changes
## Type of Change
- [ ] Bug fix
- [ ] New feature
- [ ] Breaking change
- [ ] Documentation update
## Testing
How did you test these changes?
## Checklist
- [ ] Code follows style guidelines
- [ ] Self-review completed
- [ ] Comments added for complex code
- [ ] Documentation updated
- [ ] Tests added/updated
- [ ] No new warnings generated
```
**Testing & Quality:**
- Unit tests for API endpoints
- Integration tests for HF client
- E2E tests for web UI
- Performance/load testing
## Development Areas
We're especially looking for help in these areas:
We're especially looking for help in:
### 🎨 Frontend Development (High Priority)
- Improving the Vue 3 UI/UX
- Adding missing pages (commit history, diff viewer, etc.)
### 🎨 Frontend (High Priority)
- Improving UI/UX
- Missing pages (branch/tag management, diff viewer)
- Mobile responsiveness
- Accessibility improvements
- Accessibility
### 🔧 Backend Features
### 🔧 Backend
- Additional HuggingFace API compatibility
- Performance optimizations
- Advanced repository features (branches, PRs)
- Advanced repository features
- Search functionality
### 📚 Documentation
@@ -303,30 +181,25 @@ We're especially looking for help in these areas:
### 🧪 Testing
- Unit test coverage
- Integration tests
- E2E test scenarios
- E2E scenarios
- Load testing
### 🔨 CLI Tools
- Additional administrative commands
- File upload/download features
- Batch operations
## Pull Request Process
1. Update documentation if adding features
2. Add tests for new functionality
3. Ensure code follows style guidelines
4. Request review from maintainers
5. Address feedback promptly
## Community
- **Discord**: https://discord.gg/xWYrkyvJ2s (Best for real-time discussion)
- **GitHub Issues**: Bug reports and feature requests
- **GitHub Discussions**: Design discussions and questions
## Questions?
Don't hesitate to ask! We're here to help:
- Join our Discord and ask in the #dev channel
- Open a GitHub Discussion
- Comment on related issues
- **Discord:** https://discord.gg/xWYrkyvJ2s
- **GitHub Issues:** https://github.com/KohakuBlueleaf/KohakuHub/issues
## License
By contributing to KohakuHub, you agree that your contributions will be licensed under the AGPL-3.0 license.
By contributing, you agree that your contributions will be licensed under AGPL-3.0.
---


@@ -30,15 +30,24 @@ Self-hosted HuggingFace Hub alternative with Git-like versioning for AI models a
git clone https://github.com/KohakuBlueleaf/KohakuHub.git
cd KohakuHub
# Build frontend and start services
# 1. Copy and customize configuration
cp docker-compose.example.yml docker-compose.yml
# 2. IMPORTANT: Edit docker-compose.yml
# - Change MINIO_ROOT_USER and MINIO_ROOT_PASSWORD
# - Change POSTGRES_PASSWORD
# - Change LAKEFS_AUTH_ENCRYPT_SECRET_KEY
# - Change KOHAKU_HUB_SESSION_SECRET
# 3. Build frontend and start services
npm install --prefix ./src/kohaku-hub-ui
npm run build --prefix ./src/kohaku-hub-ui
docker-compose up -d --build
```
**Access:**
- Web UI: http://localhost:28080
- API Docs: http://localhost:48888/docs
- Web UI & API: http://localhost:28080 (all traffic goes here)
- API Docs (Swagger): http://localhost:48888/docs (direct access for development)
- LakeFS UI: http://localhost:28000
- MinIO Console: http://localhost:29000
@@ -50,7 +59,7 @@ docker-compose up -d --build
import os
from huggingface_hub import HfApi
os.environ["HF_ENDPOINT"] = "http://localhost:48888"
os.environ["HF_ENDPOINT"] = "http://localhost:28080"
os.environ["HF_TOKEN"] = "your_token_here"
api = HfApi(endpoint=os.environ["HF_ENDPOINT"], token=os.environ["HF_TOKEN"])
@@ -73,7 +82,7 @@ api.hf_hub_download(repo_id="my-org/my-model", filename="model.safetensors")
```python
import os
os.environ["HF_ENDPOINT"] = "http://localhost:48888"
os.environ["HF_ENDPOINT"] = "http://localhost:28080"
os.environ["HF_TOKEN"] = "your_token_here"
from diffusers import AutoencoderKL
@@ -145,7 +154,8 @@ See [config-example.toml](./config-example.toml) for all options.
**Backend:**
```bash
pip install -e .
uvicorn kohakuhub.main:app --reload --port 48888
uvicorn kohakuhub.main:app --reload --port 48888 # Development only
# Note: In production, access via nginx on port 28080
```
**Frontend:**
@@ -164,11 +174,13 @@ See [CLAUDE.md](./CLAUDE.md) for development guidelines.
## Documentation
- [API.md](./docs/API.md) - API endpoints and workflows
- [CLI.md](./docs/CLI.md) - Command-line tool usage
- [TODO.md](./docs/TODO.md) - Development status and roadmap
- [docs/setup.md](./docs/setup.md) - Setup and installation guide
- [docs/deployment.md](./docs/deployment.md) - Deployment architecture
- [docs/ports.md](./docs/ports.md) - Port configuration reference
- [docs/API.md](./docs/API.md) - API endpoints and workflows
- [docs/CLI.md](./docs/CLI.md) - Command-line tool usage
- [CONTRIBUTING.md](./CONTRIBUTING.md) - Contributing guide & roadmap
- [CLAUDE.md](./CLAUDE.md) - Developer guide for Claude Code
- [CONTRIBUTING.md](./CONTRIBUTING.md) - Contribution guidelines
## Security Notes


@@ -1,5 +1,4 @@
# docker-compose.yml
version: "3.9"
services:
hub-ui:


@@ -81,7 +81,7 @@ from kohub_cli import KohubClient
# Initialize client
client = KohubClient(
endpoint="http://localhost:8000",
endpoint="http://localhost:28080",
token="your_token_here" # optional
)
@@ -255,7 +255,7 @@ client.update_user_settings(
```python
# Save configuration
client.save_config(
endpoint="http://localhost:8000",
endpoint="http://localhost:28080",
token="hf_..."
)
@@ -434,7 +434,7 @@ kohub-cli settings organization members my-org
```bash
# Set endpoint
kohub-cli config set endpoint http://localhost:8000
kohub-cli config set endpoint http://localhost:28080
# Set token
kohub-cli config set token hf_...
@@ -472,7 +472,7 @@ All commands support these global options:
Examples:
```bash
kohub-cli --endpoint http://localhost:8000 auth whoami
kohub-cli --endpoint http://localhost:28080 auth whoami
kohub-cli --output json repo list --type model
kohub-cli --token hf_xxxxx repo create my-model --type model
```
@@ -483,7 +483,7 @@ Located at `~/.kohub/config.json`:
```json
{
"endpoint": "http://localhost:8000",
"endpoint": "http://localhost:28080",
"token": "hf_...",
"default_repo_type": "model",
"interactive_mode_default": true
@@ -524,7 +524,7 @@ $ kohub-cli --output json repo info nonexistent/repo --type model
## Environment Variables
- `HF_ENDPOINT` - KohakuHub endpoint URL (default: `http://localhost:8000`)
- `HF_ENDPOINT` - KohakuHub endpoint URL (default: `http://localhost:28080`)
- `HF_TOKEN` - API token for authentication
- `KOHUB_CONFIG_DIR` - Config directory (default: `~/.kohub`)
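A minimal sketch of how a script might resolve the endpoint from the settings above. The lookup order (environment variable, then `config.json`, then the default) and the helper name `resolve_endpoint` are assumptions made for illustration; they are not confirmed to match what `kohub-cli` does internally.

```python
import json
import os
from pathlib import Path

DEFAULT_ENDPOINT = "http://localhost:28080"


def resolve_endpoint(env=None):
    """Return the endpoint from HF_ENDPOINT, the config file, or the default.

    The lookup order is an assumption, not kohub-cli's documented behavior.
    """
    env = os.environ if env is None else env
    if env.get("HF_ENDPOINT"):
        return env["HF_ENDPOINT"]
    # Fall back to <config dir>/config.json, honoring KOHUB_CONFIG_DIR.
    config_dir = Path(env.get("KOHUB_CONFIG_DIR", "~/.kohub")).expanduser()
    config_file = config_dir / "config.json"
    if config_file.is_file():
        config = json.loads(config_file.read_text(encoding="utf-8"))
        if config.get("endpoint"):
            return config["endpoint"]
    return DEFAULT_ENDPOINT
```

Passing a plain dictionary as `env` keeps the helper easy to exercise without touching the real environment.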


@@ -1,183 +0,0 @@
# TODO
*Last Updated: October 2025*
Kohaku-Hub is a fairly large project, and it is hard to say where it is best to start, but I will try to list all the known TODOs here with a brief priority note.
## Infrastructure
- [x] Basic Infrastructure
- [x] LakeFS + MinIO deployment
- [x] MinIO presigned URL
- [x] PostgreSQL database support
- [x] SQLite database support
- [x] Docker compose setup
- [x] Environment configuration system
## API Layer
- [x] Core File Operations
- [x] Upload small files (not LFS)
- [x] Upload large files (LFS)
- [x] Download with S3 presigned URLs
- [x] File deletion
- [x] File copy operations
- [x] Content deduplication
- [x] Repository Management
- [x] Create repository
- [x] Delete repository
- [x] List repositories
- [x] Get repository info
- [x] Tree list (recursive & non-recursive)
- [x] Paths-info endpoint
- [x] Move/Rename repository
- [x] Update repository settings (private, gated)
- [ ] Repository transfer between namespaces (different from move)
- [x] Authentication & Authorization
- [x] User registration
- [x] User login/logout
- [x] Email verification (optional)
- [x] Session management
- [x] API token generation
- [x] API token revocation
- [x] Permission system (namespace-based)
- [x] Repository access control (read/write/delete)
- [ ] More granular permissions
- [ ] Rate limiting
- [x] Organization Management
- [x] Create organization
- [x] Get organization details
- [x] Add/remove members
- [x] Update member roles
- [x] List organization members
- [x] List user organizations
- [x] Organization settings/metadata (description, etc.)
- [ ] Organization deletion
- [x] Version Control Features
- [x] Repository branches (create/delete)
- [x] Repository tags (create/delete)
- [x] Commit history API
- [ ] Advanced Features
- [ ] Pull requests / merge requests
- [ ] Discussion/comments
- [ ] Repository stars/likes
- [ ] Download statistics
- [ ] Search functionality
- [ ] Repository metadata tags/categories
## Web UI (Vue 3 + Vite)
- [x] Core Pages
- [x] Home/landing page
- [x] User registration page
- [x] User login page
- [x] User settings page
- [x] About/docs pages
- [x] Repository Features
- [x] Repository list
- [x] Repository creation
- [x] Repository info page
- [x] File browser/tree view
- [x] File viewer with code highlighting
- [x] File editor (Monaco Editor)
- [x] File uploader
- [x] Markdown renderer
- [x] Commit history view
- [x] Repository settings page
- [ ] Branch management UI
- [ ] Tag management UI
- [ ] Diff viewer
- [ ] Repository deletion UI confirmation
- [x] User/Organization UI
- [x] User profile view
- [x] Organization pages
- [x] Organization settings page
- [ ] Organization member management UI (add/remove members)
- [ ] Organization creation UI
- [ ] Additional Features
- [x] Theme support (dark/light)
- [ ] Search interface
- [ ] Notifications
- [ ] Activity feed
- [ ] File preview for images/media
## CLI Tool
- [x] User Management
- [x] User registration
- [x] User login/logout
- [x] Token creation/listing/deletion
- [x] Get current user info (whoami)
- [x] Update user settings
- [x] Organization Management
- [x] Create organization
- [x] Get organization info
- [x] List user organizations
- [x] Add/remove members
- [x] Update member roles
- [x] List organization members
- [x] Update organization settings
- [x] Repository Management
- [x] Create/delete repositories via CLI
- [x] List repositories
- [x] Get repository info
- [x] List repository files
- [x] Update repository settings
- [x] Move/rename repositories
- [x] Create/delete branches
- [x] Create/delete tags
- [ ] Upload/download files (use hfutils for now)
- [x] Configuration Management
- [x] Set/get configuration
- [x] List all configuration
- [x] Clear configuration
- [x] Interactive TUI Mode
- [x] Menu-based interface
- [ ] Administrative Features
- [ ] User administration (create/delete users)
- [ ] System statistics
- [ ] Backup/restore utilities
- [ ] LFS garbage collection
## Documentation
- [x] API.md (comprehensive API documentation)
- [x] CLI.md (CLI design and usage documentation)
- [x] README.md (getting started guide)
- [x] TODO.md (this file)
- [x] CONTRIBUTING.md (contributing guidelines)
- [x] config-example.toml
- [x] Documentation pages in Web UI
- [x] API documentation viewer
- [x] CLI documentation viewer
- [x] Roadmap viewer
- [x] Contributing guide viewer
- [ ] Deployment guides
- [ ] Production deployment best practices
- [ ] Scaling guide
- [ ] Migration guide from other hubs
- [ ] Backup/restore procedures
- [ ] Developer documentation
- [ ] Architecture overview document
- [ ] Database schema documentation
- [ ] API client development guide
## Testing & Quality
- [ ] Unit tests
- [ ] API endpoint tests
- [ ] Database model tests
- [ ] Authentication/authorization tests
- [ ] Integration tests
- [ ] E2E workflow tests
- [ ] HuggingFace client compatibility tests
- [ ] Performance testing
- [ ] Load testing
- [ ] Large file handling
- [ ] Security auditing
- [ ] Authentication security review
- [ ] SQL injection prevention
- [ ] XSS prevention in UI
## Future Enhancements
- [ ] Multi-region/CDN support
- [ ] Webhook system
- [ ] Model card/dataset card templates
- [ ] Integration with CI/CD pipelines
- [ ] Automated model evaluation
- [ ] Dataset versioning with lineage tracking
- [ ] Model registry with deployment tracking

docs/deployment.md (new file, 223 lines)

@@ -0,0 +1,223 @@
# KohakuHub Deployment Architecture
## Setup Instructions
### First Time Setup
1. **Copy configuration file:**
```bash
cp docker-compose.example.yml docker-compose.yml
```
2. **Edit docker-compose.yml:**
- Change MinIO credentials (MINIO_ROOT_USER, MINIO_ROOT_PASSWORD)
- Change PostgreSQL password (POSTGRES_PASSWORD)
- Change LakeFS secret key (LAKEFS_AUTH_ENCRYPT_SECRET_KEY)
- Change session secret (KOHAKU_HUB_SESSION_SECRET)
- Update BASE_URL if deploying to a domain
3. **Build and start:**
```bash
npm install --prefix ./src/kohaku-hub-ui
npm run build --prefix ./src/kohaku-hub-ui
docker-compose up -d --build
```
**Note:** The repository only includes `docker-compose.example.yml` as a template. Your customized `docker-compose.yml` is excluded from git to prevent committing sensitive credentials.
## Port Configuration
### Production Deployment (Docker)
**Exposed Port:**
- **28080** - Main entry point (Web UI + API via nginx reverse proxy)
**Internal Ports (not exposed to users):**
- 48888 - Backend API server (proxied by nginx)
- 28000 - LakeFS UI (admin only)
- 29000 - MinIO Console (admin only)
- 29001 - MinIO S3 API (used by backend)
- 25432 - PostgreSQL (optional, for external access)
### Nginx Reverse Proxy
**Configuration:** `docker/nginx/default.conf`
Nginx on port 28080:
1. Serves frontend static files from `/usr/share/nginx/html`
2. Proxies API requests to backend:48888:
- `/api/*` → `http://hub-api:48888/api/*`
- `/org/*` → `http://hub-api:48888/org/*`
- `/{type}s/{namespace}/{name}/resolve/*` → `http://hub-api:48888/{type}s/{namespace}/{name}/resolve/*`
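The routing rules above can be sketched as a small path classifier. This is only an illustration: the expansion of `{type}s` to `models`, `datasets`, and `spaces` is an assumption based on the repository types named elsewhere in these docs, the function name `route` is hypothetical, and the real `location` matching in `docker/nginx/default.conf` may differ.

```python
import re

# Path prefixes that the list above says nginx forwards to hub-api:48888.
# The repo-type names are assumed; check docker/nginx/default.conf for the real rules.
_PROXIED = re.compile(
    r"^/(?:api|org)/"
    r"|^/(?:models|datasets|spaces)/[^/]+/[^/]+/resolve/"
)


def route(path):
    """Return "backend" for proxied paths and "static" for everything else."""
    return "backend" if _PROXIED.match(path) else "static"
```

A classifier like this can help when debugging why a request returned the frontend's HTML instead of an API response.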
### Client Configuration
**For HuggingFace Client:**
```python
import os
os.environ["HF_ENDPOINT"] = "http://localhost:28080" # Use nginx port
os.environ["HF_TOKEN"] = "your_token"
```
**For kohub-cli:**
```bash
export HF_ENDPOINT=http://localhost:28080
kohub-cli auth login
```
**❌ WRONG:**
```python
import os
os.environ["HF_ENDPOINT"] = "http://localhost:48888" # Don't use backend port directly
```
## Architecture Diagram
```
┌─────────────────────────────────────────────────────────┐
│ Client Access │
│ (HuggingFace Hub, kohub-cli, Web) │
└────────────────────┬────────────────────────────────────┘
│ Port 28080
┌───────────────────────┐
│ Nginx (hub-ui) │
│ - Serves frontend │
│ - Reverse proxy API │
└───────┬───────────────┘
┌───────┴───────────────┐
│ │
Static Files /api, /org, resolve
(Vue 3 app) │
│ Internal: hub-api:48888
┌────────────────────────┐
│ FastAPI (hub-api) │
│ - HF-compatible API │
└──┬─────────────┬───────┘
│ │
┌────────┴────┐ ┌────┴────────┐
│ LakeFS │ │ MinIO │
│ (version) │ │ (storage) │
└─────────────┘ └─────────────┘
```
## Development vs Production
### Development
**Frontend Dev Server** (port 5173):
```bash
npm run dev --prefix ./src/kohaku-hub-ui
# Proxies /api → http://localhost:48888
```
**Backend** (port 48888):
```bash
uvicorn kohakuhub.main:app --reload --port 48888
```
**Client Access:**
- Frontend: http://localhost:5173
- API: http://localhost:48888 (direct)
- Swagger Docs: http://localhost:48888/docs
### Production (Docker)
**All services via docker-compose:**
```bash
./deploy.sh
```
**Client Access:**
- **Everything:** http://localhost:28080 (Web UI + API)
- Swagger Docs (dev): http://localhost:48888/docs (if port exposed)
## Security Best Practices
### Production Deployment
1. **Only expose port 28080**
```yaml
# docker-compose.yml
hub-ui:
ports:
- "28080:80" # ONLY THIS PORT
hub-api:
# NO ports section - internal only
```
2. **Use HTTPS with reverse proxy**
```nginx
# Production nginx config
server {
listen 443 ssl;
server_name your-domain.com;
ssl_certificate /etc/nginx/ssl/cert.pem;
ssl_certificate_key /etc/nginx/ssl/key.pem;
location / {
proxy_pass http://hub-ui:80;
}
}
```
3. **Set BASE_URL to your domain**
```yaml
environment:
- KOHAKU_HUB_BASE_URL=https://your-domain.com
- KOHAKU_HUB_S3_PUBLIC_ENDPOINT=https://s3.your-domain.com
```
## Common Mistakes
❌ **Don't do this:**
```python
# Wrong - bypassing nginx
import os
os.environ["HF_ENDPOINT"] = "http://localhost:48888"
```
✅ **Do this:**
```python
# Correct - using nginx reverse proxy
import os
os.environ["HF_ENDPOINT"] = "http://localhost:28080"
```
## Why This Architecture?
1. **Single Entry Point:** Users only need to know one port (28080)
2. **Security:** Backend (48888) not exposed to internet
3. **SSL Termination:** Nginx handles HTTPS
4. **Static File Serving:** Nginx serves frontend efficiently
5. **Load Balancing:** Can add multiple backend instances behind nginx
6. **Caching:** Nginx can cache static assets
## Troubleshooting
### "Connection refused to localhost:48888"
**Problem:** Client trying to connect directly to backend
**Solution:** Change `HF_ENDPOINT` to use port 28080:
```bash
export HF_ENDPOINT=http://localhost:28080
```
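As a rough sketch, a client script could catch this mistake before making any request. The helper name `endpoint_warning` is hypothetical; only the port numbers come from this document.

```python
from urllib.parse import urlsplit

BACKEND_PORT = 48888  # internal FastAPI port, per this document


def endpoint_warning(endpoint):
    """Return a warning string if the endpoint targets the backend port, else None."""
    port = urlsplit(endpoint).port
    if port == BACKEND_PORT:
        return f"{endpoint} bypasses nginx; use port 28080 instead"
    return None
```

Calling `endpoint_warning(os.environ["HF_ENDPOINT"])` at startup would surface the problem with a readable message rather than a connection error.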
### "CORS errors in browser"
**Problem:** Frontend trying to access wrong port
**Solution:** Ensure `KOHAKU_HUB_BASE_URL` is set correctly:
```yaml
environment:
- KOHAKU_HUB_BASE_URL=http://localhost:28080
```
### "API calls returning HTML instead of JSON"
**Problem:** Hitting nginx for a non-proxied path
**Solution:** Check that the nginx config proxies all API paths to the backend

docs/ports.md (new file, 73 lines)

@@ -0,0 +1,73 @@
# KohakuHub Port Configuration
## Quick Reference
### For Users (Production)
**Use this port for everything:**
- **28080** - Web UI + API (nginx reverse proxy)
**Don't use:**
- ~~48888~~ - Backend API (internal only)
### For Developers
**Development:**
- **5173** - Frontend dev server (npm run dev)
- **48888** - Backend API (uvicorn)
- **48888/docs** - Swagger API documentation
**Production:**
- **28080** - All traffic (nginx → frontend + backend)
### For Admins
**Service Management:**
- **28000** - LakeFS Web UI
- **29000** - MinIO Console
- **29001** - MinIO S3 API
- **25432** - PostgreSQL (if exposed)
## Configuration Examples
### Python Client
```python
import os
os.environ["HF_ENDPOINT"] = "http://localhost:28080" # ✓ Correct
os.environ["HF_ENDPOINT"] = "http://localhost:48888" # ✗ Wrong
```
### CLI
```bash
export HF_ENDPOINT=http://localhost:28080 # ✓ Correct
kohub-cli auth login
```
### Docker Compose
```yaml
# Production - Only expose port 28080
hub-ui:
ports:
- "28080:80" # ✓ Only this
hub-api:
# NO ports exposed # ✓ Internal only
```
## Why Nginx Reverse Proxy?
1. **Single Entry Point** - One port for users to remember
2. **Security** - Backend not directly accessible
3. **SSL Termination** - Nginx handles HTTPS
4. **Static Files** - Nginx serves frontend efficiently
5. **API Proxying** - `/api`, `/org`, resolve → backend:48888
6. **Scalability** - Can add multiple backend instances
## Port Mapping
```
Client Request → Port 28080 (Nginx)
├→ / (static files) → Frontend
├→ /api/* → backend:48888
├→ /org/* → backend:48888
└→ /{type}s/{ns}/{name}/resolve/* → backend:48888
```

docs/setup.md (new file, 264 lines)

@@ -0,0 +1,264 @@
# KohakuHub Setup Guide
## Quick Start
### 1. Clone Repository
```bash
git clone https://github.com/KohakuBlueleaf/KohakuHub.git
cd KohakuHub
```
### 2. Copy and Customize Configuration
```bash
cp docker-compose.example.yml docker-compose.yml
```
**Important:** The repository only includes `docker-compose.example.yml` as a template. You must copy it to `docker-compose.yml` and customize it for your deployment.
**Edit `docker-compose.yml` and change these critical settings:**
#### ⚠️ Security (MUST CHANGE)
```yaml
# MinIO (Object Storage)
environment:
- MINIO_ROOT_USER=your_secure_username # Change from 'minioadmin'
- MINIO_ROOT_PASSWORD=your_secure_password # Change from 'minioadmin'
# PostgreSQL (Database)
environment:
- POSTGRES_PASSWORD=your_secure_db_password # Change from 'hubpass'
# LakeFS (Version Control)
environment:
- LAKEFS_AUTH_ENCRYPT_SECRET_KEY=generate_random_32_char_key_here # Change!
# KohakuHub API
environment:
- KOHAKU_HUB_SESSION_SECRET=generate_random_string_here # Change!
```
#### 🌐 Deployment URL (Optional)
If deploying to a server with a domain name:
```yaml
# KohakuHub API
environment:
- KOHAKU_HUB_BASE_URL=https://your-domain.com # Change from localhost
- KOHAKU_HUB_S3_PUBLIC_ENDPOINT=https://s3.your-domain.com # For downloads
```
### 3. Build Frontend
```bash
npm install --prefix ./src/kohaku-hub-ui
npm run build --prefix ./src/kohaku-hub-ui
```
### 4. Start Services
```bash
docker-compose up -d --build
```
### 5. Verify Installation
```bash
# Check all services are running
docker-compose ps
# View logs
docker-compose logs -f hub-api
```
### 6. Access KohakuHub
- **Web UI & API:** http://localhost:28080
- **API Docs:** http://localhost:48888/docs (optional, for development)
## Configuration Reference
### Required Changes
| Variable | Default | Change To | Why |
|----------|---------|-----------|-----|
| `MINIO_ROOT_USER` | minioadmin | your_username | Security |
| `MINIO_ROOT_PASSWORD` | minioadmin | strong_password | Security |
| `POSTGRES_PASSWORD` | hubpass | strong_password | Security |
| `LAKEFS_AUTH_ENCRYPT_SECRET_KEY` | change_this | random_32_chars | Security |
| `KOHAKU_HUB_SESSION_SECRET` | change_this | random_string | Security |
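One possible way to produce the random values is Python's standard `secrets` module. This is a sketch, not a project-provided tool: the function name `make_secrets` is hypothetical, and the 32-character length for the LakeFS key simply follows the `random_32_chars` placeholder in the table.

```python
import secrets


def make_secrets():
    """Generate random values for the settings in the table above."""
    return {
        "MINIO_ROOT_PASSWORD": secrets.token_urlsafe(24),
        "POSTGRES_PASSWORD": secrets.token_urlsafe(24),
        # token_hex(16) yields exactly 32 hexadecimal characters.
        "LAKEFS_AUTH_ENCRYPT_SECRET_KEY": secrets.token_hex(16),
        "KOHAKU_HUB_SESSION_SECRET": secrets.token_urlsafe(32),
    }


if __name__ == "__main__":
    # Print KEY=value lines to paste into docker-compose.yml by hand.
    for name, value in make_secrets().items():
        print(f"{name}={value}")
```

The output is meant to be copied into `docker-compose.yml` manually; nothing here edits the file.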
### Optional Changes
| Variable | Default | When to Change |
|----------|---------|----------------|
| `KOHAKU_HUB_BASE_URL` | http://localhost:28080 | Deploying to domain |
| `KOHAKU_HUB_S3_PUBLIC_ENDPOINT` | http://localhost:29001 | Using external S3 |
| `KOHAKU_HUB_LFS_THRESHOLD_BYTES` | 10000000 (10MB) | Adjust LFS threshold |
| `KOHAKU_HUB_REQUIRE_EMAIL_VERIFICATION` | false | Enable email verification |
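To illustrate what the LFS threshold controls, here is a small sketch of a size check. The comparison direction (files at or above the threshold use LFS) and the helper name `uses_lfs` are assumptions; KohakuHub's actual boundary condition may differ.

```python
import os

DEFAULT_LFS_THRESHOLD = 10_000_000  # bytes, the default listed in the table


def uses_lfs(size_bytes, env=None):
    """Return True when a file of this size would be treated as an LFS file.

    Assumption: files at or above the threshold use LFS; the real boundary
    condition in KohakuHub may differ.
    """
    env = os.environ if env is None else env
    threshold = int(env.get("KOHAKU_HUB_LFS_THRESHOLD_BYTES", DEFAULT_LFS_THRESHOLD))
    return size_bytes >= threshold
```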
## Post-Installation
### 1. Create First User
**Via Web UI:**
- Go to http://localhost:28080
- Click "Register"
- Create account
**Via CLI:**
```bash
pip install -e .
kohub-cli auth register
```
### 2. Get LakeFS Credentials
LakeFS credentials are auto-generated on first startup:
```bash
cat docker/hub-meta/hub-api/credentials.env
```
Use these to log in to the LakeFS UI at http://localhost:28000

### 3. Test with Python
```bash
pip install huggingface_hub
export HF_ENDPOINT=http://localhost:28080
export HF_TOKEN=your_token_from_ui
python scripts/test.py
```
## Troubleshooting
### Services Won't Start
**Check logs:**
```bash
docker-compose logs hub-api
docker-compose logs lakefs
docker-compose logs minio
```
**Common issues:**
- Port already in use (change ports in docker-compose.yml)
- Insufficient disk space
- Docker daemon not running
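For the "port already in use" case, a quick check can be sketched with the standard `socket` module. The port list comes from this guide and `docs/ports.md`; the helper name `port_in_use` is hypothetical.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Ports taken from this guide and docs/ports.md.
    for port in (28080, 48888, 28000, 29000, 29001, 25432):
        print(port, "in use" if port_in_use(port) else "free")
```

Run it before `docker-compose up` to see which ports are already taken, or afterwards to confirm the services are listening.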
### Cannot Connect to API
**Verify nginx is running:**
```bash
docker-compose ps hub-ui
```
**Check nginx logs:**
```bash
docker-compose logs hub-ui
```
**Test directly:**
```bash
curl http://localhost:28080/api/version
```
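The same check can be scripted with the standard library. This sketch assumes only the `/api/version` path shown in the `curl` command above; the function name `api_reachable` and the injectable `opener` parameter are illustrative choices.

```python
import urllib.request


def api_reachable(base_url, opener=urllib.request.urlopen, timeout=5):
    """Return True if GET {base_url}/api/version answers with HTTP 200."""
    url = base_url.rstrip("/") + "/api/version"
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError and HTTPError are OSError subclasses, so connection
        # failures and HTTP error statuses both land here.
        return False
```

For example, `api_reachable("http://localhost:28080")` exercises the nginx route to the backend.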
### Cannot Access from External Network
**If deploying on a server:**
1. Update `KOHAKU_HUB_BASE_URL` to your domain
2. Update `KOHAKU_HUB_S3_PUBLIC_ENDPOINT` if using external S3
3. Add reverse proxy with HTTPS (nginx/traefik/caddy)
4. Only expose port 28080 (or 443 with HTTPS)
## Production Deployment
### 1. Use HTTPS
Add reverse proxy in front of port 28080:
```nginx
# Example nginx config
server {
listen 443 ssl http2;
server_name your-domain.com;
ssl_certificate /path/to/cert.pem;
ssl_certificate_key /path/to/key.pem;
location / {
proxy_pass http://localhost:28080;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
}
```
### 2. Security Checklist
- [ ] Changed all default passwords
- [ ] Set strong KOHAKU_HUB_SESSION_SECRET
- [ ] Set strong LAKEFS_AUTH_ENCRYPT_SECRET_KEY
- [ ] Using HTTPS with valid certificate
- [ ] Only port 28080 exposed (or 443 for HTTPS)
- [ ] Firewall configured
- [ ] Regular backups configured
### 3. Backup Strategy
**Data to backup:**
- `hub-meta/` - Database, LakeFS metadata, credentials
- `hub-storage/` - MinIO object storage (or use S3)
- `docker-compose.yml` - Your configuration
```bash
# Backup command
tar -czf kohakuhub-backup-$(date +%Y%m%d).tar.gz hub-meta/ hub-storage/ docker-compose.yml
```
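A cross-platform equivalent of the backup command could look like the sketch below, using Python's `tarfile` module. The function name `make_backup` is hypothetical, and unlike the `tar` command it silently skips paths that do not exist, which is a design choice of this sketch.

```python
import tarfile
from datetime import date
from pathlib import Path


def make_backup(root=".", names=("hub-meta", "hub-storage", "docker-compose.yml")):
    """Write kohakuhub-backup-YYYYMMDD.tar.gz in root and return its path."""
    root = Path(root)
    archive = root / f"kohakuhub-backup-{date.today():%Y%m%d}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for name in names:
            path = root / name
            if path.exists():  # skip entries missing from this deployment
                tar.add(path, arcname=name)
    return archive
```

Stopping the services before archiving is likely safer for database consistency, though this guide does not state a requirement either way.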
## Updating
### Update KohakuHub
```bash
# Pull latest code
git pull
# Rebuild frontend
npm install --prefix ./src/kohaku-hub-ui
npm run build --prefix ./src/kohaku-hub-ui
# Restart services
docker-compose down
docker-compose up -d --build
```
**Note:** Check CHANGELOG for breaking changes before updating.
## Uninstall
```bash
# Stop and remove containers
docker-compose down
# Remove data (WARNING: This deletes everything!)
rm -rf hub-meta/ hub-storage/
# Remove docker-compose config
rm docker-compose.yml
```
## Support
- **Discord:** https://discord.gg/xWYrkyvJ2s
- **GitHub Issues:** https://github.com/KohakuBlueleaf/KohakuHub/issues
- **Documentation:** See docs/ folder