diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 99f8681..75feaea 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,297 +1,175 @@ # Contributing to KohakuHub -*Last Updated: October 2025* +Thank you for your interest in contributing to KohakuHub! We welcome contributions from the community. -Thank you for your interest in contributing to KohakuHub! We welcome contributions from the community and are excited to have you here. +## Quick Links -## Table of Contents - -- [Getting Started](#getting-started) -- [Development Setup](#development-setup) -- [Project Structure](#project-structure) -- [How to Contribute](#how-to-contribute) -- [Code Style Guidelines](#code-style-guidelines) -- [Testing](#testing) -- [Pull Request Process](#pull-request-process) -- [Community](#community) +- **Discord:** https://discord.gg/xWYrkyvJ2s (Best for discussions) +- **GitHub Issues:** Bug reports and feature requests +- **Development Guide:** See [CLAUDE.md](./CLAUDE.md) +- **Roadmap:** See [Project Status](#project-status) below ## Getting Started -Before you begin: -- Read the [README.md](./README.md) to understand what KohakuHub does -- Check out our [TODO.md](./docs/TODO.md) to see what needs to be done -- Join our [Discord community](https://discord.gg/xWYrkyvJ2s) to discuss your ideas - -## Development Setup - ### Prerequisites -- **Python 3.10+**: Backend development -- **Node.js 18+**: Frontend development -- **Docker & Docker Compose**: For running the full stack -- **Git**: Version control +- Python 3.10+ +- Node.js 18+ +- Docker & Docker Compose +- Git -### Quick Setup +### Setup -1. **Clone the repository** - ```bash - git clone https://github.com/KohakuBlueleaf/KohakuHub.git - cd KohakuHub - ``` +```bash +git clone https://github.com/KohakuBlueleaf/KohakuHub.git +cd KohakuHub -2. **Set up Python environment** - ```bash - python -m venv .venv - source .venv/bin/activate # On Windows: .venv\Scripts\activate - pip install -r requirements.txt - pip install -e . 
- ``` +# Backend +python -m venv .venv +source .venv/bin/activate # On Windows: .venv\Scripts\activate +pip install -e ".[dev]" -3. **Install frontend dependencies** - ```bash - cd src/kohaku-hub-ui - npm install - ``` - -4. **Start development environment** - ```bash - # From project root - ./deploy.sh - ``` - -5. **Access the services** - - KohakuHub Web UI: http://localhost:28080 - - KohakuHub API: http://localhost:48888 - - API Documentation: http://localhost:48888/docs - -## Project Structure +# Frontend +npm install --prefix ./src/kohaku-hub-ui +# Start with Docker +cp docker-compose.example.yml docker-compose.yml +./deploy.sh ``` -KohakuHub/ -├── src/ -│ ├── kohakuhub/ # Backend (FastAPI) -│ │ ├── api/ # API endpoints -│ │ ├── auth/ # Authentication & authorization -│ │ ├── org/ # Organization management -│ │ ├── db.py # Database models -│ │ └── main.py # Application entry point -│ ├── kohub_cli/ # CLI tool -│ │ ├── cli.py # CLI commands -│ │ ├── client.py # Python API client -│ │ └── main.py # CLI entry point -│ └── kohaku-hub-ui/ # Frontend (Vue 3 + Vite) -│ ├── src/ -│ │ ├── components/ # Reusable components -│ │ ├── pages/ # Page components -│ │ ├── stores/ # State management (Pinia) -│ │ └── utils/ # Utility functions -│ └── vite.config.js -├── docker/ # Docker compose files -├── docs/ # Documentation -│ ├── API.md # API documentation -│ ├── CLI.md # CLI documentation -│ └── TODO.md # Development roadmap -├── scripts/ # Utility scripts -└── README.md -``` + +**Access:** http://localhost:28080 + +## Code Style + +### Backend (Python) + +Follow [CLAUDE.md](./CLAUDE.md) principles: +- Modern Python (match-case, async/await, native types) +- Import order: builtin → 3rd party → our package (alphabetical) +- Use `db_async` wrappers for all DB operations +- Split large functions into smaller ones + +### Frontend (Vue 3) + +Follow [CLAUDE.md](./CLAUDE.md) principles: +- JavaScript only (no TypeScript), use JSDoc for types +- Split reusable components +- 
Implement dark/light mode together +- Mobile responsive ## How to Contribute ### Reporting Bugs -If you find a bug, please create an issue on GitHub with: -- **Clear title**: Describe the issue concisely -- **Steps to reproduce**: How can we recreate the bug? -- **Expected behavior**: What should happen? -- **Actual behavior**: What actually happens? -- **Environment**: OS, Python version, Docker version, etc. -- **Logs**: Include relevant error messages or logs +Create an issue with: +- Clear title +- Steps to reproduce +- Expected vs actual behavior +- Environment (OS, Python/Node version) +- Logs/error messages ### Suggesting Features -We welcome feature suggestions! Please: -- Check if the feature is already in [TODO.md](./docs/TODO.md) -- Open a GitHub issue or discuss on Discord -- Describe the use case and why it's valuable -- Propose how it might work +- Check [Project Status](#project-status) first +- Open GitHub issue or discuss on Discord +- Describe use case and value +- Propose implementation approach ### Contributing Code -1. **Pick an issue** or create one describing what you plan to work on -2. **Fork the repository** and create a new branch -3. **Make your changes** following our code style guidelines -4. **Test your changes** thoroughly -5. **Submit a pull request** with a clear description +1. Pick an issue or create one +2. Fork and create branch +3. Make changes following style guidelines +4. Test thoroughly +5. 
Submit pull request -## Code Style Guidelines +## Project Status -### Backend (Python) +*Last Updated: January 2025* -Follow the guidelines in [CLAUDE.md](./CLAUDE.md): -- **Modern Python**: Use match-case, async/await, type hints -- **Import order**: builtin → 3rd party → our package (alphabetical) -- **Type hints**: Use native types (`dict` not `Dict`) -- **Clean code**: Split large functions into smaller ones -- **Async operations**: Use dedicated threadpools (S3/LakeFS/DB) - - S3 operations → `run_in_s3_executor()` - - LakeFS operations → `run_in_lakefs_executor()` - - DB operations → `db_async` module wrappers +### ✅ Core Features (Complete) -```python -from typing import Optional -from fastapi import APIRouter, HTTPException +**API & Storage:** +- HuggingFace Hub API compatibility +- Git LFS protocol for large files +- File deduplication (SHA256) +- Repository management (create, delete, list, move/rename) +- Branch and tag management +- Commit history +- S3-compatible storage (MinIO, AWS S3, etc.) +- LakeFS versioning (branches, commits, diffs) -def create_repository( - repo_id: str, - repo_type: str, - private: bool = False -) -> dict: - """Create a new repository. 
+**Authentication:** +- User registration with email verification (optional) +- Session-based auth + API tokens +- Organization management with role-based access +- Permission system (namespace-based) - Args: - repo_id: Full repository ID (namespace/name) - repo_type: Type of repository (model, dataset, space) - private: Whether the repository is private +**Web UI:** +- Vue 3 interface with dark/light mode +- Repository browsing and file viewer +- Code editor (Monaco) with syntax highlighting +- Markdown rendering +- Commit history viewer +- Settings pages (user, org, repo) +- Documentation viewer - Returns: - Dictionary with repository information +**CLI Tool:** +- Full-featured `kohub-cli` with interactive TUI +- Repository, organization, user management +- Branch/tag operations +- File upload/download +- Commit history viewing +- Health check +- Operation history tracking +- Shell autocomplete (bash/zsh/fish) - Raises: - HTTPException: If repository already exists - """ - pass -``` +### 🚧 In Progress -### Frontend (Vue 3 + JavaScript) +- Rate limiting +- More granular permissions +- Repository transfer between namespaces +- Organization deletion +- Search functionality -Follow the guidelines in [CLAUDE.md](./CLAUDE.md): -- **JavaScript only** - No TypeScript, use JSDoc for type hints -- **Split reusable components** - One component per file -- **Dark/light mode** - Implement both at once -- **Mobile responsive** - Consider auto-break lines -- **Composition API**: Use ` - - -``` - -### CLI (Python) - -- Follow backend Python guidelines -- Use **Click** for command-line interface -- Provide helpful error messages -- Support both interactive and non-interactive modes - -## Testing - -### Backend Testing - -```bash -# Run all tests -pytest - -# Run specific test file -pytest tests/test_api.py - -# Run with coverage -pytest --cov=kohakuhub -``` - -### Frontend Testing - -```bash -cd src/kohaku-hub-ui - -# Run unit tests -npm run test - -# Run E2E tests -npm run 
test:e2e -``` - -### Manual Testing - -1. Start the development environment -2. Test your changes through the Web UI -3. Test API endpoints using the interactive docs at http://localhost:48888/docs -4. Test CLI commands: `kohub-cli [command]` - -## Pull Request Process - -1. **Update documentation**: If you add features, update relevant docs -2. **Add tests**: Include tests for new functionality -3. **Update CHANGELOG**: Add entry for your changes (if applicable) -4. **Ensure CI passes**: All automated checks must pass -5. **Request review**: Tag maintainers for review -6. **Address feedback**: Respond to review comments promptly - -### Pull Request Template - -```markdown -## Description -Brief description of changes - -## Type of Change -- [ ] Bug fix -- [ ] New feature -- [ ] Breaking change -- [ ] Documentation update - -## Testing -How did you test these changes? - -## Checklist -- [ ] Code follows style guidelines -- [ ] Self-review completed -- [ ] Comments added for complex code -- [ ] Documentation updated -- [ ] Tests added/updated -- [ ] No new warnings generated -``` +**Testing & Quality:** +- Unit tests for API endpoints +- Integration tests for HF client +- E2E tests for web UI +- Performance/load testing ## Development Areas -We're especially looking for help in these areas: +We're especially looking for help in: -### 🎨 Frontend Development (High Priority) -- Improving the Vue 3 UI/UX -- Adding missing pages (commit history, diff viewer, etc.) 
+### 🎨 Frontend (High Priority) +- Improving UI/UX +- Missing pages (branch/tag management, diff viewer) - Mobile responsiveness -- Accessibility improvements +- Accessibility -### 🔧 Backend Features +### 🔧 Backend - Additional HuggingFace API compatibility - Performance optimizations -- Advanced repository features (branches, PRs) +- Advanced repository features - Search functionality ### 📚 Documentation @@ -303,30 +181,25 @@ We're especially looking for help in these areas: ### 🧪 Testing - Unit test coverage - Integration tests -- E2E test scenarios +- E2E scenarios - Load testing -### 🔨 CLI Tools -- Additional administrative commands -- File upload/download features -- Batch operations +## Pull Request Process + +1. Update documentation if adding features +2. Add tests for new functionality +3. Ensure code follows style guidelines +4. Request review from maintainers +5. Address feedback promptly ## Community -- **Discord**: https://discord.gg/xWYrkyvJ2s (Best for real-time discussion) -- **GitHub Issues**: Bug reports and feature requests -- **GitHub Discussions**: Design discussions and questions - -## Questions? - -Don't hesitate to ask! We're here to help: -- Join our Discord and ask in the #dev channel -- Open a GitHub Discussion -- Comment on related issues +- **Discord:** https://discord.gg/xWYrkyvJ2s +- **GitHub Issues:** https://github.com/KohakuBlueleaf/KohakuHub/issues ## License -By contributing to KohakuHub, you agree that your contributions will be licensed under the AGPL-3.0 license. +By contributing, you agree that your contributions will be licensed under AGPL-3.0. --- diff --git a/README.md b/README.md index 9180f8f..a300549 100644 --- a/README.md +++ b/README.md @@ -30,15 +30,24 @@ Self-hosted HuggingFace Hub alternative with Git-like versioning for AI models a git clone https://github.com/KohakuBlueleaf/KohakuHub.git cd KohakuHub -# Build frontend and start services +# 1. 
Copy and customize configuration +cp docker-compose.example.yml docker-compose.yml + +# 2. IMPORTANT: Edit docker-compose.yml +# - Change MINIO_ROOT_USER and MINIO_ROOT_PASSWORD +# - Change POSTGRES_PASSWORD +# - Change LAKEFS_AUTH_ENCRYPT_SECRET_KEY +# - Change KOHAKU_HUB_SESSION_SECRET + +# 3. Build frontend and start services npm install --prefix ./src/kohaku-hub-ui npm run build --prefix ./src/kohaku-hub-ui docker-compose up -d --build ``` **Access:** -- Web UI: http://localhost:28080 -- API Docs: http://localhost:48888/docs +- Web UI & API: http://localhost:28080 (all traffic goes here) +- API Docs (Swagger): http://localhost:48888/docs (direct access for development) - LakeFS UI: http://localhost:28000 - MinIO Console: http://localhost:29000 @@ -50,7 +59,7 @@ docker-compose up -d --build import os from huggingface_hub import HfApi -os.environ["HF_ENDPOINT"] = "http://localhost:48888" +os.environ["HF_ENDPOINT"] = "http://localhost:28080" os.environ["HF_TOKEN"] = "your_token_here" api = HfApi(endpoint=os.environ["HF_ENDPOINT"], token=os.environ["HF_TOKEN"]) @@ -73,7 +82,7 @@ api.hf_hub_download(repo_id="my-org/my-model", filename="model.safetensors") ```python import os -os.environ["HF_ENDPOINT"] = "http://localhost:48888" +os.environ["HF_ENDPOINT"] = "http://localhost:28080" os.environ["HF_TOKEN"] = "your_token_here" from diffusers import AutoencoderKL @@ -145,7 +154,8 @@ See [config-example.toml](./config-example.toml) for all options. **Backend:** ```bash pip install -e . -uvicorn kohakuhub.main:app --reload --port 48888 +uvicorn kohakuhub.main:app --reload --port 48888 # Development only +# Note: In production, access via nginx on port 28080 ``` **Frontend:** @@ -164,11 +174,13 @@ See [CLAUDE.md](./CLAUDE.md) for development guidelines. 
## Documentation -- [API.md](./docs/API.md) - API endpoints and workflows -- [CLI.md](./docs/CLI.md) - Command-line tool usage -- [TODO.md](./docs/TODO.md) - Development status and roadmap +- [docs/setup.md](./docs/setup.md) - Setup and installation guide +- [docs/deployment.md](./docs/deployment.md) - Deployment architecture +- [docs/ports.md](./docs/ports.md) - Port configuration reference +- [docs/API.md](./docs/API.md) - API endpoints and workflows +- [docs/CLI.md](./docs/CLI.md) - Command-line tool usage +- [CONTRIBUTING.md](./CONTRIBUTING.md) - Contributing guide & roadmap - [CLAUDE.md](./CLAUDE.md) - Developer guide for Claude Code -- [CONTRIBUTING.md](./CONTRIBUTING.md) - Contribution guidelines ## Security Notes diff --git a/docker-compose.example.yml b/docker-compose.example.yml index 9b9c098..b20cf16 100644 --- a/docker-compose.example.yml +++ b/docker-compose.example.yml @@ -1,5 +1,4 @@ # docker-compose.yml -version: "3.9" services: hub-ui: diff --git a/docs/CLI.md b/docs/CLI.md index f19f171..d09717b 100644 --- a/docs/CLI.md +++ b/docs/CLI.md @@ -81,7 +81,7 @@ from kohub_cli import KohubClient # Initialize client client = KohubClient( - endpoint="http://localhost:8000", + endpoint="http://localhost:28080", token="your_token_here" # optional ) @@ -255,7 +255,7 @@ client.update_user_settings( ```python # Save configuration client.save_config( - endpoint="http://localhost:8000", + endpoint="http://localhost:28080", token="hf_..." ) @@ -434,7 +434,7 @@ kohub-cli settings organization members my-org ```bash # Set endpoint -kohub-cli config set endpoint http://localhost:8000 +kohub-cli config set endpoint http://localhost:28080 # Set token kohub-cli config set token hf_... 
@@ -472,7 +472,7 @@ All commands support these global options: Examples: ```bash -kohub-cli --endpoint http://localhost:8000 auth whoami +kohub-cli --endpoint http://localhost:28080 auth whoami kohub-cli --output json repo list --type model kohub-cli --token hf_xxxxx repo create my-model --type model ``` @@ -483,7 +483,7 @@ Located at `~/.kohub/config.json`: ```json { - "endpoint": "http://localhost:8000", + "endpoint": "http://localhost:28080", "token": "hf_...", "default_repo_type": "model", "interactive_mode_default": true @@ -524,7 +524,7 @@ $ kohub-cli --output json repo info nonexistent/repo --type model ## Environment Variables -- `HF_ENDPOINT` - KohakuHub endpoint URL (default: `http://localhost:8000`) +- `HF_ENDPOINT` - KohakuHub endpoint URL (default: `http://localhost:28080`) - `HF_TOKEN` - API token for authentication - `KOHUB_CONFIG_DIR` - Config directory (default: `~/.kohub`) diff --git a/docs/TODO.md b/docs/TODO.md deleted file mode 100644 index 2372f6d..0000000 --- a/docs/TODO.md +++ /dev/null @@ -1,183 +0,0 @@ -# TODO - -*Last Updated: October 2025* - -Kohaku-Hub is a pretty large project and really hard to say where to start is better, but I will try to list all the known TODOs here with brief priority note - -## Infrastructure -- [x] Basic Infra Structure - - [x] LakeFS + MinIO deployment - - [x] MinIO presigned URL - - [x] PostgreSQL database support - - [x] SQLite database support - - [x] Docker compose setup - - [x] Environment configuration system - -## API Layer -- [x] Core File Operations - - [x] Upload small files (not LFS) - - [x] Upload large files (LFS) - - [x] Download with S3 presigned URLs - - [x] File deletion - - [x] File copy operations - - [x] Content deduplication -- [x] Repository Management - - [x] Create repository - - [x] Delete repository - - [x] List repositories - - [x] Get repository info - - [x] Tree list (recursive & non-recursive) - - [x] Paths-info endpoint - - [x] Move/Rename repository - - [x] Update repository 
settings (private, gated) - - [ ] Repository transfer between namespaces (different from move) -- [x] Authentication & Authorization - - [x] User registration - - [x] User login/logout - - [x] Email verification (optional) - - [x] Session management - - [x] API token generation - - [x] API token revocation - - [x] Permission system (namespace-based) - - [x] Repository access control (read/write/delete) - - [ ] More granular permissions - - [ ] Rate limiting -- [x] Organization Management - - [x] Create organization - - [x] Get organization details - - [x] Add/remove members - - [x] Update member roles - - [x] List organization members - - [x] List user organizations - - [x] Organization settings/metadata (description, etc.) - - [ ] Organization deletion -- [x] Version Control Features - - [x] Repository branches (create/delete) - - [x] Repository tags (create/delete) - - [x] Commit history API -- [ ] Advanced Features - - [ ] Pull requests / merge requests - - [ ] Discussion/comments - - [ ] Repository stars/likes - - [ ] Download statistics - - [ ] Search functionality - - [ ] Repository metadata tags/categories - -## Web UI (Vue 3 + Vite) -- [x] Core Pages - - [x] Home/landing page - - [x] User registration page - - [x] User login page - - [x] User settings page - - [x] About/docs pages -- [x] Repository Features - - [x] Repository list - - [x] Repository creation - - [x] Repository info page - - [x] File browser/tree view - - [x] File viewer with code highlighting - - [x] File editor (Monaco Editor) - - [x] File uploader - - [x] Markdown renderer - - [x] Commit history view - - [x] Repository settings page - - [ ] Branch management UI - - [ ] Tag management UI - - [ ] Diff viewer - - [ ] Repository deletion UI confirmation -- [x] User/Organization UI - - [x] User profile view - - [x] Organization pages - - [x] Organization settings page - - [ ] Organization member management UI (add/remove members) - - [ ] Organization creation UI -- [ ] Additional Features - - 
[x] Theme support (dark/light) - - [ ] Search interface - - [ ] Notifications - - [ ] Activity feed - - [ ] File preview for images/media - -## CLI Tool -- [x] User Management - - [x] User registration - - [x] User login/logout - - [x] Token creation/listing/deletion - - [x] Get current user info (whoami) - - [x] Update user settings -- [x] Organization Management - - [x] Create organization - - [x] Get organization info - - [x] List user organizations - - [x] Add/remove members - - [x] Update member roles - - [x] List organization members - - [x] Update organization settings -- [x] Repository Management - - [x] Create/delete repositories via CLI - - [x] List repositories - - [x] Get repository info - - [x] List repository files - - [x] Update repository settings - - [x] Move/rename repositories - - [x] Create/delete branches - - [x] Create/delete tags - - [ ] Upload/download files (use hfutils for now) -- [x] Configuration Management - - [x] Set/get configuration - - [x] List all configuration - - [x] Clear configuration -- [x] Interactive TUI Mode - - [x] Menu-based interface -- [ ] Administrative Features - - [ ] User administration (create/delete users) - - [ ] System statistics - - [ ] Backup/restore utilities - - [ ] LFS garbage collection - -## Documentation -- [x] API.md (comprehensive API documentation) -- [x] CLI.md (CLI design and usage documentation) -- [x] README.md (getting started guide) -- [x] TODO.md (this file) -- [x] CONTRIBUTING.md (contributing guidelines) -- [x] config-example.toml -- [x] Documentation pages in Web UI - - [x] API documentation viewer - - [x] CLI documentation viewer - - [x] Roadmap viewer - - [x] Contributing guide viewer -- [ ] Deployment guides - - [ ] Production deployment best practices - - [ ] Scaling guide - - [ ] Migration guide from other hubs - - [ ] Backup/restore procedures -- [ ] Developer documentation - - [ ] Architecture overview document - - [ ] Database schema documentation - - [ ] API client development guide 
- -## Testing & Quality -- [ ] Unit tests - - [ ] API endpoint tests - - [ ] Database model tests - - [ ] Authentication/authorization tests -- [ ] Integration tests - - [ ] E2E workflow tests - - [ ] HuggingFace client compatibility tests -- [ ] Performance testing - - [ ] Load testing - - [ ] Large file handling -- [ ] Security auditing - - [ ] Authentication security review - - [ ] SQL injection prevention - - [ ] XSS prevention in UI - -## Future Enhancements -- [ ] Multi-region/CDN support -- [ ] Webhook system -- [ ] Model card/dataset card templates -- [ ] Integration with CI/CD pipelines -- [ ] Automated model evaluation -- [ ] Dataset versioning with lineage tracking -- [ ] Model registry with deployment tracking \ No newline at end of file diff --git a/docs/deployment.md b/docs/deployment.md new file mode 100644 index 0000000..3aac49e --- /dev/null +++ b/docs/deployment.md @@ -0,0 +1,223 @@ +# KohakuHub Deployment Architecture + +## Setup Instructions + +### First Time Setup + +1. **Copy configuration file:** + ```bash + cp docker-compose.example.yml docker-compose.yml + ``` + +2. **Edit docker-compose.yml:** + - Change MinIO credentials (MINIO_ROOT_USER, MINIO_ROOT_PASSWORD) + - Change PostgreSQL password (POSTGRES_PASSWORD) + - Change LakeFS secret key (LAKEFS_AUTH_ENCRYPT_SECRET_KEY) + - Change session secret (KOHAKU_HUB_SESSION_SECRET) + - Update BASE_URL if deploying to a domain + +3. **Build and start:** + ```bash + npm install --prefix ./src/kohaku-hub-ui + npm run build --prefix ./src/kohaku-hub-ui + docker-compose up -d --build + ``` + +**Note:** The repository only includes `docker-compose.example.yml` as a template. Your customized `docker-compose.yml` is excluded from git to prevent committing sensitive credentials. 
+ +## Port Configuration + +### Production Deployment (Docker) + +**Exposed Port:** +- **28080** - Main entry point (Web UI + API via nginx reverse proxy) + +**Internal Ports (not exposed to users):** +- 48888 - Backend API server (proxied by nginx) +- 28000 - LakeFS UI (admin only) +- 29000 - MinIO Console (admin only) +- 29001 - MinIO S3 API (used by backend) +- 25432 - PostgreSQL (optional, for external access) + +### Nginx Reverse Proxy + +**Configuration:** `docker/nginx/default.conf` + +Nginx on port 28080: +1. Serves frontend static files from `/usr/share/nginx/html` +2. Proxies API requests to backend:48888: + - `/api/*` → `http://hub-api:48888/api/*` + - `/org/*` → `http://hub-api:48888/org/*` + - `/{type}s/{namespace}/{name}/resolve/*` → `http://hub-api:48888/{type}s/{namespace}/{name}/resolve/*` + +### Client Configuration + +**For HuggingFace Client:** +```python +import os +os.environ["HF_ENDPOINT"] = "http://localhost:28080" # Use nginx port +os.environ["HF_TOKEN"] = "your_token" +``` + +**For kohub-cli:** +```bash +export HF_ENDPOINT=http://localhost:28080 +kohub-cli auth login +``` + +**❌ WRONG:** +```python +os.environ["HF_ENDPOINT"] = "http://localhost:48888" # Don't use backend port directly +``` + +## Architecture Diagram + +``` +┌─────────────────────────────────────────────────────────┐ +│ Client Access │ +│ (HuggingFace Hub, kohub-cli, Web) │ +└────────────────────┬────────────────────────────────────┘ + │ + │ Port 28080 + ▼ + ┌───────────────────────┐ + │ Nginx (hub-ui) │ + │ - Serves frontend │ + │ - Reverse proxy API │ + └───────┬───────────────┘ + │ + ┌───────┴───────────────┐ + │ │ + Static Files /api, /org, resolve + (Vue 3 app) │ + │ Internal: hub-api:48888 + ▼ + ┌────────────────────────┐ + │ FastAPI (hub-api) │ + │ - HF-compatible API │ + └──┬─────────────┬───────┘ + │ │ + ┌────────┴────┐ ┌────┴────────┐ + │ LakeFS │ │ MinIO │ + │ (version) │ │ (storage) │ + └─────────────┘ └─────────────┘ +``` + +## Development vs Production + +### 
Development + +**Frontend Dev Server** (port 5173): +```bash +npm run dev --prefix ./src/kohaku-hub-ui +# Proxies /api → http://localhost:48888 +``` + +**Backend** (port 48888): +```bash +uvicorn kohakuhub.main:app --reload --port 48888 +``` + +**Client Access:** +- Frontend: http://localhost:5173 +- API: http://localhost:48888 (direct) +- Swagger Docs: http://localhost:48888/docs + +### Production (Docker) + +**All services via docker-compose:** +```bash +./deploy.sh +``` + +**Client Access:** +- **Everything:** http://localhost:28080 (Web UI + API) +- Swagger Docs (dev): http://localhost:48888/docs (if port exposed) + +## Security Best Practices + +### Production Deployment + +1. **Only expose port 28080** + ```yaml + # docker-compose.yml + hub-ui: + ports: + - "28080:80" # ONLY THIS PORT + + hub-api: + # NO ports section - internal only + ``` + +2. **Use HTTPS with reverse proxy** + ```nginx + # Production nginx config + server { + listen 443 ssl; + server_name your-domain.com; + + ssl_certificate /etc/nginx/ssl/cert.pem; + ssl_certificate_key /etc/nginx/ssl/key.pem; + + location / { + proxy_pass http://hub-ui:80; + } + } + ``` + +3. **Set BASE_URL to your domain** + ```yaml + environment: + - KOHAKU_HUB_BASE_URL=https://your-domain.com + - KOHAKU_HUB_S3_PUBLIC_ENDPOINT=https://s3.your-domain.com + ``` + +## Common Mistakes + +❌ **Don't do this:** +```python +# Wrong - bypassing nginx +os.environ["HF_ENDPOINT"] = "http://localhost:48888" +``` + +✅ **Do this:** +```python +# Correct - using nginx reverse proxy +os.environ["HF_ENDPOINT"] = "http://localhost:28080" +``` + +## Why This Architecture? + +1. **Single Entry Point:** Users only need to know one port (28080) +2. **Security:** Backend (48888) not exposed to internet +3. **SSL Termination:** Nginx handles HTTPS +4. **Static File Serving:** Nginx serves frontend efficiently +5. **Load Balancing:** Can add multiple backend instances behind nginx +6. 
**Caching:** Nginx can cache static assets + +## Troubleshooting + +### "Connection refused to localhost:48888" + +**Problem:** Client trying to connect directly to backend + +**Solution:** Change `HF_ENDPOINT` to use port 28080: +```bash +export HF_ENDPOINT=http://localhost:28080 +``` + +### "CORS errors in browser" + +**Problem:** Frontend trying to access wrong port + +**Solution:** Ensure `KOHAKU_HUB_BASE_URL` is set correctly: +```yaml +environment: + - KOHAKU_HUB_BASE_URL=http://localhost:28080 +``` + +### "API calls returning HTML instead of JSON" + +**Problem:** Hitting nginx for a non-proxied path + +**Solution:** Check nginx config ensures all API paths are proxied diff --git a/docs/ports.md b/docs/ports.md new file mode 100644 index 0000000..6305452 --- /dev/null +++ b/docs/ports.md @@ -0,0 +1,73 @@ +# KohakuHub Port Configuration + +## Quick Reference + +### For Users (Production) + +**Use this port for everything:** +- **28080** - Web UI + API (nginx reverse proxy) + +**Don't use:** +- ~~48888~~ - Backend API (internal only) + +### For Developers + +**Development:** +- **5173** - Frontend dev server (npm run dev) +- **48888** - Backend API (uvicorn) +- **48888/docs** - Swagger API documentation + +**Production:** +- **28080** - All traffic (nginx → frontend + backend) + +### For Admins + +**Service Management:** +- **28000** - LakeFS Web UI +- **29000** - MinIO Console +- **29001** - MinIO S3 API +- **25432** - PostgreSQL (if exposed) + +## Configuration Examples + +### Python Client +```python +os.environ["HF_ENDPOINT"] = "http://localhost:28080" # ✓ Correct +os.environ["HF_ENDPOINT"] = "http://localhost:48888" # ✗ Wrong +``` + +### CLI +```bash +export HF_ENDPOINT=http://localhost:28080 # ✓ Correct +kohub-cli auth login +``` + +### Docker Compose +```yaml +# Production - Only expose port 28080 +hub-ui: + ports: + - "28080:80" # ✓ Only this + +hub-api: + # NO ports exposed # ✓ Internal only +``` + +## Why Nginx Reverse Proxy? + +1. 
**Single Entry Point** - One port for users to remember +2. **Security** - Backend not directly accessible +3. **SSL Termination** - Nginx handles HTTPS +4. **Static Files** - Nginx serves frontend efficiently +5. **API Proxying** - `/api`, `/org`, resolve → backend:48888 +6. **Scalability** - Can add multiple backend instances + +## Port Mapping + +``` +Client Request → Port 28080 (Nginx) + ├→ / (static files) → Frontend + ├→ /api/* → backend:48888 + ├→ /org/* → backend:48888 + └→ /{type}s/{ns}/{name}/resolve/* → backend:48888 +``` diff --git a/docs/setup.md b/docs/setup.md new file mode 100644 index 0000000..511aa86 --- /dev/null +++ b/docs/setup.md @@ -0,0 +1,264 @@ +# KohakuHub Setup Guide + +## Quick Start + +### 1. Clone Repository + +```bash +git clone https://github.com/KohakuBlueleaf/KohakuHub.git +cd KohakuHub +``` + +### 2. Copy Configuration + +```bash +cp docker-compose.example.yml docker-compose.yml +``` + +**Important:** The repository only includes `docker-compose.example.yml` as a template. You must copy it to `docker-compose.yml` and customize it for your deployment. + +### 3. Customize Configuration + +**Edit `docker-compose.yml` and change these critical settings:** + +#### ⚠️ Security (MUST CHANGE) + +```yaml +# MinIO (Object Storage) +environment: + - MINIO_ROOT_USER=your_secure_username # Change from 'minioadmin' + - MINIO_ROOT_PASSWORD=your_secure_password # Change from 'minioadmin' + +# PostgreSQL (Database) +environment: + - POSTGRES_PASSWORD=your_secure_db_password # Change from 'hubpass' + +# LakeFS (Version Control) +environment: + - LAKEFS_AUTH_ENCRYPT_SECRET_KEY=generate_random_32_char_key_here # Change! + +# KohakuHub API +environment: + - KOHAKU_HUB_SESSION_SECRET=generate_random_string_here # Change! 
+``` + +#### 🌐 Deployment URL (Optional) + +If deploying to a server with a domain name: + +```yaml +# KohakuHub API +environment: + - KOHAKU_HUB_BASE_URL=https://your-domain.com # Change from localhost + - KOHAKU_HUB_S3_PUBLIC_ENDPOINT=https://s3.your-domain.com # For downloads +``` + +### 4. Build Frontend + +```bash +npm install --prefix ./src/kohaku-hub-ui +npm run build --prefix ./src/kohaku-hub-ui +``` + +### 5. Start Services + +```bash +docker-compose up -d --build +``` + +### 6. Verify Installation + +```bash +# Check all services are running +docker-compose ps + +# View logs +docker-compose logs -f hub-api +``` + +### 7. Access KohakuHub + +- **Web UI & API:** http://localhost:28080 +- **API Docs:** http://localhost:48888/docs (optional, for development) + +## Configuration Reference + +### Required Changes + +| Variable | Default | Change To | Why | +|----------|---------|-----------|-----| +| `MINIO_ROOT_USER` | minioadmin | your_username | Security | +| `MINIO_ROOT_PASSWORD` | minioadmin | strong_password | Security | +| `POSTGRES_PASSWORD` | hubpass | strong_password | Security | +| `LAKEFS_AUTH_ENCRYPT_SECRET_KEY` | change_this | random_32_chars | Security | +| `KOHAKU_HUB_SESSION_SECRET` | change_this | random_string | Security | + +### Optional Changes + +| Variable | Default | When to Change | +|----------|---------|----------------| +| `KOHAKU_HUB_BASE_URL` | http://localhost:28080 | Deploying to domain | +| `KOHAKU_HUB_S3_PUBLIC_ENDPOINT` | http://localhost:29001 | Using external S3 | +| `KOHAKU_HUB_LFS_THRESHOLD_BYTES` | 10000000 (10MB) | Adjust LFS threshold | +| `KOHAKU_HUB_REQUIRE_EMAIL_VERIFICATION` | false | Enable email verification | + +## Post-Installation + +### 1. Create First User + +**Via Web UI:** +- Go to http://localhost:28080 +- Click "Register" +- Create account + +**Via CLI:** +```bash +pip install -e . +kohub-cli auth register +``` + +### 2. 
Get LakeFS Credentials + +LakeFS credentials are auto-generated on first startup: + +```bash +cat docker/hub-meta/hub-api/credentials.env +``` + +Use these to log in to the LakeFS UI at http://localhost:28000 + +### 3. Test with Python + +```bash +pip install huggingface_hub + +export HF_ENDPOINT=http://localhost:28080 +export HF_TOKEN=your_token_from_ui + +python scripts/test.py +``` + +## Troubleshooting + +### Services Won't Start + +**Check logs:** +```bash +docker-compose logs hub-api +docker-compose logs lakefs +docker-compose logs minio +``` + +**Common issues:** +- Port already in use (change ports in docker-compose.yml) +- Insufficient disk space +- Docker daemon not running + +### Cannot Connect to API + +**Verify nginx is running:** +```bash +docker-compose ps hub-ui +``` + +**Check nginx logs:** +```bash +docker-compose logs hub-ui +``` + +**Test directly:** +```bash +curl http://localhost:28080/api/version +``` + +### Cannot Access from External Network + +**If deploying on a server:** + +1. Update `KOHAKU_HUB_BASE_URL` to your domain +2. Update `KOHAKU_HUB_S3_PUBLIC_ENDPOINT` if using external S3 +3. Add a reverse proxy with HTTPS (nginx/traefik/caddy) +4. Only expose port 28080 (or 443 with HTTPS) + +## Production Deployment + +### 1. Use HTTPS + +Add a reverse proxy in front of port 28080: + +```nginx +# Example nginx config +server { + listen 443 ssl http2; + server_name your-domain.com; + + ssl_certificate /path/to/cert.pem; + ssl_certificate_key /path/to/key.pem; + + location / { + proxy_pass http://localhost:28080; + proxy_set_header Host $host; + proxy_set_header X-Real-IP $remote_addr; + } +} +``` + +### 2. Security Checklist + +- [ ] Changed all default passwords +- [ ] Set strong SESSION_SECRET +- [ ] Set strong LAKEFS_AUTH_ENCRYPT_SECRET_KEY +- [ ] Using HTTPS with valid certificate +- [ ] Only port 28080 exposed (or 443 for HTTPS) +- [ ] Firewall configured +- [ ] Regular backups configured + +### 3. 
Backup Strategy + +**Data to back up:** +- `hub-meta/` - Database, LakeFS metadata, credentials +- `hub-storage/` - MinIO object storage (or use S3) +- `docker-compose.yml` - Your configuration + +```bash +# Backup command +tar -czf kohakuhub-backup-$(date +%Y%m%d).tar.gz hub-meta/ hub-storage/ docker-compose.yml +``` + +## Updating + +### Update KohakuHub + +```bash +# Pull latest code +git pull + +# Rebuild frontend +npm install --prefix ./src/kohaku-hub-ui +npm run build --prefix ./src/kohaku-hub-ui + +# Restart services +docker-compose down +docker-compose up -d --build +``` + +**Note:** Check CHANGELOG for breaking changes before updating. + +## Uninstall + +```bash +# Stop and remove containers +docker-compose down + +# Remove data (WARNING: This deletes everything!) +rm -rf hub-meta/ hub-storage/ + +# Remove docker-compose config +rm docker-compose.yml +``` + +## Support + +- **Discord:** https://discord.gg/xWYrkyvJ2s +- **GitHub Issues:** https://github.com/KohakuBlueleaf/KohakuHub/issues +- **Documentation:** See docs/ folder
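
The setup guide added in this diff asks for five credentials in `docker-compose.yml` to be changed before first start. One way to produce replacement values is sketched below. This is only an illustration, not part of the KohakuHub repository: it assumes Python 3.10+ and the standard-library `secrets` module, the helper name `generate_secrets` and the `kohaku-` username prefix are hypothetical, and the value lengths are illustrative choices (the 32-character LakeFS key simply mirrors the guide's `random_32_chars` placeholder).

```python
import secrets


def generate_secrets() -> dict[str, str]:
    """Return fresh random values for the settings the setup guide marks as MUST CHANGE.

    Hypothetical helper: the names are the environment variables from the guide,
    the lengths are assumptions made for this sketch.
    """
    return {
        # Short random suffix so the username is not a well-known default
        "MINIO_ROOT_USER": "kohaku-" + secrets.token_hex(4),
        # 24 random bytes, URL-safe base64 encoded
        "MINIO_ROOT_PASSWORD": secrets.token_urlsafe(24),
        "POSTGRES_PASSWORD": secrets.token_urlsafe(24),
        # 16 random bytes rendered as hex give exactly 32 characters
        "LAKEFS_AUTH_ENCRYPT_SECRET_KEY": secrets.token_hex(16),
        "KOHAKU_HUB_SESSION_SECRET": secrets.token_urlsafe(32),
    }


if __name__ == "__main__":
    # Print NAME=value pairs that can be pasted into docker-compose.yml
    for name, value in generate_secrets().items():
        print(f"{name}={value}")
```

Each run prints new values, so they should be copied into `docker-compose.yml` once and kept with the backup of that file rather than regenerated later.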