Kohaku-Blueleaf 1a7100a586 minor fixes
2025-10-11 13:03:58 +08:00

Admin Portal Guide

Complete guide to KohakuHub's administration interface

Last Updated: January 2025
Access: http://your-hub.com/admin


Admin Portal Architecture

graph LR
    subgraph "Admin Access"
        Browser[Browser] -->|X-Admin-Token| Portal[Admin Portal UI]
    end

    subgraph "Admin API"
        Portal -->|REST API| AdminAPI[Admin Endpoints]
    end

    subgraph "Data Sources"
        AdminAPI -->|Queries| DB[PostgreSQL/SQLite]
        AdminAPI -->|List Objects| S3[MinIO/S3]
        AdminAPI -->|Repository Info| LakeFS[LakeFS]
    end

Table of Contents

  1. Overview
  2. Authentication
  3. Dashboard
  4. User Management
  5. Repository Management
  6. Commit History Viewer
  7. S3 Storage Browser
  8. Quota Management
  9. API Reference
  10. Security Best Practices

Overview

The Admin Portal provides a centralized interface for managing your KohakuHub instance. It offers:

  • User Management - Create, view, and delete users
  • Repository Browser - View all repositories with statistics
  • Commit History - Track commits across all repositories
  • Storage Browser - Browse S3 buckets and objects
  • Quota Management - Set and monitor storage quotas
  • Statistics Dashboard - Real-time insights into usage

Access URL:

http://your-hub.com/admin

Authentication

Admin Token

The admin portal requires a secret token configured in your environment:

Configuration:

# docker-compose.yml
environment:
  KOHAKU_HUB_ADMIN_ENABLED: "true"
  KOHAKU_HUB_ADMIN_SECRET_TOKEN: "your-secret-token-here"  # CHANGE THIS!

Security:

  • ⚠️ NEVER use default token "change-me-in-production" in production
  • Generate strong random token: openssl rand -hex 32
  • Store securely (environment variable, secrets manager)
  • Rotate regularly
  • Use HTTPS in production
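
For provisioning scripts, a token of the same shape as `openssl rand -hex 32` can also be produced in Python. This is a minimal sketch using the standard-library `secrets` module; the function name `generate_admin_token` is illustrative and not part of KohakuHub.

```python
import secrets


def generate_admin_token(num_bytes: int = 32) -> str:
    """Return a cryptographically random hex string (2 hex chars per byte)."""
    return secrets.token_hex(num_bytes)


# 32 random bytes give a 64-character token, comparable to `openssl rand -hex 32`.
token = generate_admin_token()
```

Store the resulting value in your environment or secrets manager rather than printing it into shared logs.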

Login

  1. Navigate to /admin
  2. Enter your admin secret token
  3. Token is stored in browser session (not localStorage for security)
  4. Auto-logout on browser close

Example:

# Generate secure token
openssl rand -hex 32
# Output: a1b2c3d4e5f6...

# Add to docker-compose.yml
KOHAKU_HUB_ADMIN_SECRET_TOKEN: "a1b2c3d4e5f6..."

# Restart
docker-compose up -d

Dashboard

Overview Statistics

The dashboard shows real-time statistics from your database:

User Stats:

  • Total users
  • Active users
  • Email verified users
  • Inactive users

Repository Stats:

  • Total repositories
  • Private vs public repositories
  • Breakdown by type (models, datasets, spaces)

Commit Stats:

  • Total commits
  • Top contributors (by commit count)

Storage Stats:

  • Total storage used
  • Private vs public storage
  • LFS object count and size

Quick Actions:

  • Navigate to user management
  • Browse repositories
  • View commits
  • Inspect S3 storage
  • Manage quotas

User Management

List Users

Features:

  • View all users with pagination
  • Sort by ID, username, storage usage
  • Filter and search
  • Storage quota visualization

Columns:

  • ID, Username, Email
  • Private storage (used/quota)
  • Public storage (used/quota)
  • Total storage
  • Email verification status
  • Active status
  • Created date

Create User

Fields:

  • Username (required, unique)
  • Email (required, unique)
  • Password (required)
  • Email verified (checkbox)
  • Private quota (bytes, optional = unlimited)
  • Public quota (bytes, optional = unlimited)

Example:

Username: alice
Email: alice@example.com
Password: ********
Email Verified: ✓
Private Quota: 10737418240  (10 GB)
Public Quota: 53687091200   (50 GB)

View User Details

Click "View" to see:

  • User ID, username, email
  • Verification and active status
  • Storage quotas (private, public)
  • Storage used (private, public)
  • Created date

Actions:

  • Manage Quota (navigate to quota page)

Delete User

Normal Delete:

  • Deletes user account
  • Deletes all sessions and tokens
  • Deletes organization memberships
  • Keeps repositories (must delete separately)

Force Delete:

  • Deletes everything above
  • Also deletes all owned repositories
  • ⚠️ Cannot be undone!

Workflow:

  1. Click "Delete" → Confirmation dialog
  2. If user owns repos → Shows repo list
  3. Choose: Cancel or Force Delete
  4. Confirm force delete → All data deleted

Toggle Email Verification

Use case: Manually verify users when email verification is disabled or has failed.

Action: Click "Verify" or "Unverify" button → Instant update


Repository Management

List Repositories

Filters:

  • Repository type (model/dataset/space)
  • Namespace (user or organization)

Columns:

  • ID
  • Type (color-coded badge)
  • Full repository ID (namespace/name)
  • Privacy status (Private/Public badge)
  • Owner username
  • Created date

Actions:

  • View Details → Opens detailed dialog

Repository Details

Information:

  • ID, Type, Full ID
  • Namespace, Name
  • Owner username
  • Privacy status
  • Created date
  • File count (from database)
  • Commit count (from database)
  • Total size (sum of all files)

Actions:

  • View in Main App → Opens repository in main UI

API Endpoints

GET /admin/api/repositories
  Query: repo_type, namespace, limit, offset

GET /admin/api/repositories/{type}/{namespace}/{name}
  Returns: Detailed repo info with stats

Commit History Viewer

Overview

View all commits across all repositories in your instance.

Filters:

  • Repository ID (e.g., "org/repo-name")
  • Author username

Columns:

  • Commit ID (first 8 chars)
  • Repository (type badge + full ID)
  • Branch
  • Author
  • Message (truncated, hover for full)
  • Created date

Sorting:

  • Sort by ID, created date, username, repository

Pagination:

  • Page size: 10, 20, 50, 100
  • Navigate through pages

Use Cases

  • Track user activity
  • Find specific commits
  • Monitor repository changes
  • Debug commit issues
  • Audit trail

API Endpoint

GET /admin/api/commits
  Query: repo_full_id, username, limit, offset
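
The `limit` and `offset` parameters above suggest plain offset pagination. The sketch below shows one way to walk every page; it assumes each page comes back as a list of items, which this guide does not confirm, so `iter_paginated` and `fake_fetch` are hypothetical names and the fetcher stands in for a real HTTP call.

```python
from typing import Callable, Iterator, List


def iter_paginated(
    fetch_page: Callable[[int, int], List[dict]],
    limit: int = 20,
) -> Iterator[dict]:
    """Yield items page by page until a short (or empty) page is returned."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            break  # a short page means there is nothing left to fetch
        offset += limit


# Stand-in for GET /admin/api/commits?limit=<limit>&offset=<offset>
fake_commits = [{"id": i} for i in range(45)]


def fake_fetch(limit: int, offset: int) -> List[dict]:
    return fake_commits[offset : offset + limit]


all_commits = list(iter_paginated(fake_fetch, limit=20))
```

If the real response wraps the items in an object, unwrap it inside the fetcher so the loop itself stays unchanged.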

S3 Storage Browser

Bucket List

Overview:

  • View all S3 buckets
  • Total size and object count
  • Visual progress bars
  • Creation dates

Metrics:

  • Bucket name
  • Total size (formatted: KB, MB, GB)
  • Object count
  • Creation date
  • Progress bar (relative to 100GB)

Actions:

  • Click bucket → Browse contents

Object Browser

Features:

  • List objects in selected bucket
  • Filter by prefix (e.g., "lfs/", "models/")
  • Pagination (up to 1000 objects)

Columns:

  • Key (full S3 path)
  • Size
  • Storage class (STANDARD, etc.)
  • Last modified date

Prefix Filtering:

Enter prefix: lfs/
→ Shows only objects starting with "lfs/"

Enter prefix: hf-model-org-repo/
→ Shows objects for specific repository

API Endpoints

GET /admin/api/storage/buckets
  Returns: All buckets with sizes

GET /admin/api/storage/objects/{bucket}
  Query: prefix, limit
  Returns: Objects in bucket
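
When calling the object listing from a script, the prefix needs URL-encoding because it usually contains `/`. A minimal sketch that only builds the URL, using the standard library; `build_objects_url` is a hypothetical helper, and the `hub-storage` bucket name is taken from the storage cleanup scenario in this guide.

```python
from urllib.parse import quote, urlencode


def build_objects_url(base_url: str, bucket: str, prefix: str = "", limit: int = 1000) -> str:
    """Build the object-listing URL for one bucket, URL-encoding bucket and prefix."""
    query = {"limit": limit}
    if prefix:
        query["prefix"] = prefix
    path = f"/admin/api/storage/objects/{quote(bucket, safe='')}"
    return f"{base_url.rstrip('/')}{path}?{urlencode(query)}"


url = build_objects_url("http://localhost:48888", "hub-storage", prefix="lfs/")
```

Send the request with the `X-Admin-Token` header, as with every other admin endpoint.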

Quota Management

View Quota

Per-user or per-organization:

  • Private quota (limit)
  • Private used
  • Public quota (limit)
  • Public used
  • Total usage
  • Usage percentages
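
How the portal derives the usage percentages is not spelled out here. The sketch below shows one plausible calculation, treating a null quota as unlimited (so no percentage applies); `usage_percent` is a hypothetical helper, and the handling of a zero quota is an assumption.

```python
from typing import Optional


def usage_percent(used_bytes: int, quota_bytes: Optional[int]) -> Optional[float]:
    """Return usage as a percentage of the quota, or None when the quota is unlimited."""
    if quota_bytes is None:
        return None  # null quota means unlimited
    if quota_bytes == 0:
        # Assumption: a zero quota counts as full as soon as anything is stored.
        return 100.0 if used_bytes > 0 else 0.0
    return round(100.0 * used_bytes / quota_bytes, 2)


half_full = usage_percent(5368709120, 10737418240)  # 5 GiB used of a 10 GiB quota
```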

Set Quota

Fields:

  • Private quota bytes (null = unlimited)
  • Public quota bytes (null = unlimited)

Examples:

10 GB = 10737418240 bytes
50 GB = 53687091200 bytes
Unlimited = (empty/null)
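
The byte values above use binary units (1 GB here means 1024³ bytes). A small sketch for converting in scripts; the helper names are illustrative.

```python
GIB = 1024 ** 3  # bytes per "GB" as used in this guide


def gib_to_bytes(gib: float) -> int:
    """Convert a size in binary gigabytes to bytes."""
    return int(gib * GIB)


def bytes_to_gib(num_bytes: int) -> float:
    """Convert bytes back to binary gigabytes."""
    return num_bytes / GIB


ten_gb = gib_to_bytes(10)
fifty_gb = gib_to_bytes(50)
```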

Recalculate Storage

Purpose: Re-scan all files and update storage usage.

When to use:

  • Database out of sync
  • After manual S3 operations
  • Quota shows incorrect values

Process:

  1. Scans all LakeFS objects for namespace
  2. Sums file sizes
  3. Updates User/Organization table

API Endpoints

GET /admin/api/quota/{namespace}?is_org=false
  Returns: Quota information

PUT /admin/api/quota/{namespace}
  Body: {private_quota_bytes, public_quota_bytes}
  Returns: Updated quota

POST /admin/api/quota/{namespace}/recalculate
  Returns: Recalculated usage
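
As a rough sketch, the PUT call can be prepared with the standard library as follows. The body fields match the ones listed above; `build_set_quota_request` is a hypothetical helper, and whether PUT also accepts the `is_org` query parameter is not documented here, so it is left out.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import quote


def build_set_quota_request(
    base_url: str,
    token: str,
    namespace: str,
    private_quota_bytes: Optional[int],
    public_quota_bytes: Optional[int],
) -> urllib.request.Request:
    """Prepare (but do not send) a PUT request that sets both quotas."""
    body = json.dumps(
        {
            "private_quota_bytes": private_quota_bytes,  # None serializes to null (unlimited)
            "public_quota_bytes": public_quota_bytes,
        }
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/admin/api/quota/{quote(namespace, safe='')}",
        data=body,
        method="PUT",
        headers={"X-Admin-Token": token, "Content-Type": "application/json"},
    )


request = build_set_quota_request(
    "http://localhost:48888", "your-secret-token", "alice", 10737418240, None
)
# To send it against a running instance: urllib.request.urlopen(request, timeout=30)
```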

API Reference

Authentication

All admin API endpoints require X-Admin-Token header:

curl -H "X-Admin-Token: your-secret-token" \
  http://localhost:48888/admin/api/stats
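
The same call can be made from Python. This sketch uses only the standard library so it runs without extra dependencies; `build_admin_get` and `fetch_json` are hypothetical helper names.

```python
import json
import urllib.request
from typing import Any, Optional
from urllib.parse import urlencode


def build_admin_get(
    base_url: str, token: str, path: str, params: Optional[dict] = None
) -> urllib.request.Request:
    """Prepare a GET request carrying the X-Admin-Token header."""
    url = f"{base_url.rstrip('/')}{path}"
    if params:
        url = f"{url}?{urlencode(params)}"
    return urllib.request.Request(url, headers={"X-Admin-Token": token}, method="GET")


def fetch_json(request: urllib.request.Request, timeout: float = 30.0) -> Any:
    """Send the request and decode the JSON body (HTTPError is raised on 4xx/5xx)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = build_admin_get("http://localhost:48888", "your-secret-token", "/admin/api/stats")
# stats = fetch_json(request)  # run against a live instance
```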

Endpoints Overview

User Management:

GET    /admin/api/users                    # List users
GET    /admin/api/users/{username}         # Get user info
POST   /admin/api/users                    # Create user
DELETE /admin/api/users/{username}         # Delete user
PATCH  /admin/api/users/{username}/email-verification  # Set verification

Repository Management:

GET /admin/api/repositories                # List repositories
GET /admin/api/repositories/{type}/{namespace}/{name}  # Get details

Commit History:

GET /admin/api/commits                     # List commits

Storage:

GET /admin/api/storage/buckets             # List buckets
GET /admin/api/storage/objects/{bucket}    # List objects

Statistics:

GET /admin/api/stats                       # Basic stats
GET /admin/api/stats/detailed              # Detailed stats
GET /admin/api/stats/timeseries?days=30    # Time-series data
GET /admin/api/stats/top-repos?by=commits  # Top repositories

Quota:

GET  /admin/api/quota/{namespace}          # Get quota
PUT  /admin/api/quota/{namespace}          # Set quota
POST /admin/api/quota/{namespace}/recalculate  # Recalculate

Response Formats

User Info:

{
  "id": 1,
  "username": "alice",
  "email": "alice@example.com",
  "email_verified": true,
  "is_active": true,
  "private_quota_bytes": 10737418240,
  "public_quota_bytes": 53687091200,
  "private_used_bytes": 1234567,
  "public_used_bytes": 9876543,
  "created_at": "2025-01-01T00:00:00.000000Z"
}

Repository Info:

{
  "id": 42,
  "repo_type": "model",
  "namespace": "org",
  "name": "my-model",
  "full_id": "org/my-model",
  "private": false,
  "owner_id": 1,
  "owner_username": "alice",
  "created_at": "2025-01-01T00:00:00.000000Z",
  "file_count": 15,
  "commit_count": 8,
  "total_size": 12345678
}

Detailed Stats:

{
  "users": {
    "total": 100,
    "active": 95,
    "verified": 80,
    "inactive": 5
  },
  "organizations": {
    "total": 10
  },
  "repositories": {
    "total": 250,
    "private": 100,
    "public": 150,
    "by_type": {
      "model": 180,
      "dataset": 60,
      "space": 10
    }
  },
  "commits": {
    "total": 1500,
    "top_contributors": [
      {"username": "alice", "commit_count": 150},
      {"username": "bob", "commit_count": 120}
    ]
  },
  "lfs": {
    "total_objects": 500,
    "total_size": 107374182400
  },
  "storage": {
    "private_used": 10737418240,
    "public_used": 53687091200,
    "total_used": 64424509440
  }
}
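
A response in this shape is easy to reduce to a few health indicators. The sketch below works on a trimmed copy of the sample above; `summarize_stats` and the indicators it picks are illustrative choices, not something the API provides.

```python
def summarize_stats(stats: dict) -> dict:
    """Derive a few ratios and a consistency check from the detailed stats."""
    users = stats["users"]
    repos = stats["repositories"]
    storage = stats["storage"]
    return {
        "verified_ratio": users["verified"] / users["total"] if users["total"] else 0.0,
        "private_repo_ratio": repos["private"] / repos["total"] if repos["total"] else 0.0,
        "largest_repo_type": max(repos["by_type"], key=repos["by_type"].get),
        # Private and public usage should add up to the reported total.
        "storage_consistent": storage["private_used"] + storage["public_used"]
        == storage["total_used"],
    }


sample = {
    "users": {"total": 100, "active": 95, "verified": 80, "inactive": 5},
    "repositories": {
        "total": 250,
        "private": 100,
        "public": 150,
        "by_type": {"model": 180, "dataset": 60, "space": 10},
    },
    "storage": {
        "private_used": 10737418240,
        "public_used": 53687091200,
        "total_used": 64424509440,
    },
}

summary = summarize_stats(sample)
```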

Security Best Practices

Token Management

DO:

  • Generate cryptographically random tokens
  • Use environment variables (never hardcode)
  • Rotate tokens regularly (monthly)
  • Use HTTPS in production
  • Restrict admin portal access (firewall, VPN)

DON'T:

  • Use default token in production
  • Commit tokens to git
  • Share tokens via insecure channels
  • Use same token across environments
  • Store tokens in browser localStorage

Token Rotation

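# 1. Generate new token
NEW_TOKEN=$(openssl rand -hex 32)
echo "$NEW_TOKEN"

# 2. Update docker-compose.yml with the printed value
#    KOHAKU_HUB_ADMIN_SECRET_TOKEN: "<new token>"

# 3. Restart services
docker-compose up -d

# 4. Log in to the admin portal again with the new token
#    (browser sessions and scripts still holding the old token must be updated)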

Network Security

Production Deployment:

# Restrict admin portal to specific IPs
location /admin {
    allow 192.168.1.0/24;  # Internal network
    allow 10.0.0.0/8;      # VPN
    deny all;

    # ... rest of config
}
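
Before deploying an allow list like this, it can help to check which client addresses it would admit. The sketch below mirrors the two `allow` ranges with Python's `ipaddress` module; it is only a planning aid, not what nginx itself runs, and `is_allowed` is a hypothetical helper.

```python
import ipaddress

# The same ranges as the `allow` directives above; everything else is denied.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.168.1.0/24"),  # Internal network
    ipaddress.ip_network("10.0.0.0/8"),      # VPN
]


def is_allowed(client_ip: str) -> bool:
    """Return True when the address falls inside one of the allowed networks."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in ALLOWED_NETWORKS)
```

For example, `is_allowed("192.168.1.42")` is true, while `is_allowed("192.168.2.1")` is false because it lies outside the /24 range.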

Alternative: Basic Auth Layer

location /admin/api/ {
    auth_basic "Admin Area";
    auth_basic_user_file /etc/nginx/.htpasswd;

    # The admin API still validates the X-Admin-Token header behind this layer
    proxy_pass http://hub-api:48888;
}

Audit Logging

Admin operations are logged with [ADMIN] prefix:

[WARNING] [ADMIN] [07:05:55] Admin deleted user: testuser (deleted 5 repositories)
[INFO] [ADMIN] [07:06:12] Admin set quota for user alice: private=10737418240, public=53687091200
[WARNING] [ADMIN] [07:06:45] Admin deleted repository: model:org/test-model

Monitor logs:

docker logs khub-hub-api 2>&1 | grep "\[ADMIN\]"
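
For anything beyond grep, the lines can be parsed into fields. This sketch assumes the exact layout of the three sample lines above (level, `[ADMIN]`, time, message); `parse_admin_line` is a hypothetical helper, and the real log format may carry extra fields.

```python
import re
from typing import Optional

ADMIN_LINE = re.compile(
    r"^\[(?P<level>[A-Z]+)\] \[ADMIN\] \[(?P<time>\d{2}:\d{2}:\d{2})\] (?P<message>.*)$"
)


def parse_admin_line(line: str) -> Optional[dict]:
    """Split one admin log line into level, time and message; None if it does not match."""
    match = ADMIN_LINE.match(line.strip())
    return match.groupdict() if match else None


SAMPLE_LINES = [
    "[WARNING] [ADMIN] [07:05:55] Admin deleted user: testuser (deleted 5 repositories)",
    "[INFO] [ADMIN] [07:06:12] Admin set quota for user alice: private=10737418240, public=53687091200",
    "[WARNING] [ADMIN] [07:06:45] Admin deleted repository: model:org/test-model",
]

parsed = [parse_admin_line(line) for line in SAMPLE_LINES]
warnings = [entry for entry in parsed if entry and entry["level"] == "WARNING"]
```

Filtering on `level` this way makes it straightforward to alert only on destructive operations.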

Use Cases

Scenario 1: New User Onboarding

1. Dashboard → Quick Actions → "Manage Users"
2. Click "Create User"
3. Fill form:
   - Username: newuser
   - Email: newuser@company.com
   - Password: (generate secure password)
   - Email Verified: ✓
   - Quotas: 10GB private, 50GB public
4. Click "Create User"
5. Share credentials with user

Scenario 2: Storage Cleanup

1. Dashboard → "Browse Storage"
2. Click on "hub-storage" bucket
3. Filter by prefix: "lfs/"
4. Review large objects
5. Identify unused LFS objects
6. (Manually delete via CLI/API if needed)

Scenario 3: User Investigation

1. Dashboard → "View Commits"
2. Filter by username: "suspicious-user"
3. Review commit activity
4. Click repository links to inspect content
5. If needed: Go to Users → Delete user (with force)

Scenario 4: Quota Enforcement

1. Dashboard → "Manage Quotas"
2. Select namespace (user or org)
3. View current usage
4. Set new limits if exceeded
5. Click "Recalculate" to verify
6. Monitor dashboard for compliance

Troubleshooting

Can't Login

Problem: Invalid admin token
Solution: Check that KOHAKU_HUB_ADMIN_SECRET_TOKEN in docker-compose.yml matches your input

Problem: "Admin API is disabled"
Solution: Set KOHAKU_HUB_ADMIN_ENABLED=true in environment


Statistics Not Updating

Problem: Stale data
Solution: Click "Refresh Stats" button on dashboard


Storage Size Incorrect

Problem: Database out of sync with S3
Solution: Use "Recalculate" button in Quota Management


Can't Delete User

Problem: User owns repositories
Solution: Either delete repos first, or use "Force Delete" option


Advanced Features

Time-Series Statistics

API:

curl -H "X-Admin-Token: your-token" \
  "http://localhost:48888/admin/api/stats/timeseries?days=30"

Returns:

{
  "repositories_by_day": {
    "2025-01-01": {"model": 5, "dataset": 2, "space": 0},
    "2025-01-02": {"model": 3, "dataset": 1, "space": 1}
  },
  "commits_by_day": {
    "2025-01-01": 15,
    "2025-01-02": 20
  },
  "users_by_day": {
    "2025-01-01": 2,
    "2025-01-02": 1
  }
}

Use case: Build custom dashboards with charts
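
Before charting, the per-day series usually needs to be rolled up. A minimal sketch over the sample response above; `summarize_timeseries` is a hypothetical helper, and reading `users_by_day` as a per-day count to be summed is an assumption.

```python
from collections import Counter


def summarize_timeseries(data: dict) -> dict:
    """Roll the per-day series up into totals for the whole window."""
    repo_totals: Counter = Counter()
    for counts in data["repositories_by_day"].values():
        repo_totals.update(counts)  # adds the per-type counts for that day
    return {
        "repositories_by_type": dict(repo_totals),
        "commits_total": sum(data["commits_by_day"].values()),
        "users_total": sum(data["users_by_day"].values()),
    }


sample = {
    "repositories_by_day": {
        "2025-01-01": {"model": 5, "dataset": 2, "space": 0},
        "2025-01-02": {"model": 3, "dataset": 1, "space": 1},
    },
    "commits_by_day": {"2025-01-01": 15, "2025-01-02": 20},
    "users_by_day": {"2025-01-01": 2, "2025-01-02": 1},
}

totals = summarize_timeseries(sample)
```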

Top Repositories

By Commits:

curl -H "X-Admin-Token: your-token" \
  "http://localhost:48888/admin/api/stats/top-repos?by=commits&limit=10"

By Size:

curl -H "X-Admin-Token: your-token" \
  "http://localhost:48888/admin/api/stats/top-repos?by=size&limit=10"

Integration with CI/CD

Automated User Creation

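import os
import secrets

import requests

# Read the admin token from the CI environment instead of hardcoding it
admin_token = os.environ["KOHAKU_HUB_ADMIN_SECRET_TOKEN"]
base_url = "http://hub.example.com"

# Generate a random password; store it in your secrets manager afterwards
password = secrets.token_urlsafe(24)

# Create user via API
response = requests.post(
    f"{base_url}/admin/api/users",
    headers={"X-Admin-Token": admin_token},
    json={
        "username": "ci-bot",
        "email": "ci@company.com",
        "password": password,
        "email_verified": True,
        "private_quota_bytes": 107374182400,  # 100 GB
        "public_quota_bytes": None,  # Unlimited
    },
    timeout=30,
)
response.raise_for_status()  # Stop on 4xx/5xx instead of parsing an error body

user = response.json()
print(f"Created user: {user['username']} (ID: {user['id']})")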

Monitoring Script

import requests

admin_token = "your-admin-token"
capacity_bytes = 100 * 1024**3  # Assumed total capacity (100 GB); adjust for your deployment

# Get statistics
response = requests.get(
    "http://hub.example.com/admin/api/stats/detailed",
    headers={"X-Admin-Token": admin_token},
    timeout=30,
)
response.raise_for_status()  # Fail loudly on a bad token or disabled admin API

stats = response.json()

# Alert if storage exceeds 80% of the assumed capacity
total_used = stats["storage"]["total_used"]
if total_used > 0.8 * capacity_bytes:
    print("WARNING: Storage usage high!")

# Alert if too many inactive users
inactive = stats["users"]["inactive"]
if inactive > 10:
    print(f"WARNING: {inactive} inactive users")

Performance Considerations

Database Queries

Admin operations run synchronous queries in the DB thread pool:

  • User listings: O(n) where n = total users
  • Repository stats: Aggregation queries
  • Commit history: Indexed by repo_full_id and username

Optimization:

  • Limit page size (default: 20, max: 100)
  • Use filters to reduce result sets
  • Statistics are computed on-demand (cache in frontend if needed)

S3 Bucket Scanning

Warning: Scanning large buckets is slow!

# For bucket with 100,000 objects:
# - Scan time: 30-60 seconds
# - Uses pagination (1000 objects per request)

Recommendation:

  • Limit to specific prefixes when possible
  • Don't scan too frequently
  • Consider caching results for large buckets
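
One way to follow the caching recommendation is a small time-based cache in front of the scan. This is a generic sketch, not something KohakuHub ships; `TTLCache` is a hypothetical class, and the clock is injectable so the expiry logic can be tested without waiting.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Cache the result of a slow call for a fixed number of seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        """Return the cached value for key, recomputing it once the TTL has elapsed."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = compute()
        self._entries[key] = (now, value)
        return value
```

A script could wrap its bucket-listing call as `cache.get_or_compute("hub-storage", scan_bucket)`, where `scan_bucket` is whatever function performs the slow request.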

Comparison: Admin Portal vs CLI

| Feature | Admin Portal | kohub-cli | Best For |
|---|---|---|---|
| User management | GUI | Commands | GUI: Quick actions; CLI: Automation |
| Repository browser | Full | ⚠️ Limited | Portal: Overview; CLI: Specific repos |
| Commit history | Full | No | Portal only |
| Storage browser | Full | No | Portal only |
| Quota management | Full | ⚠️ API only | Portal: Visual; CLI: Scripting |
| Statistics | Dashboard | No | Portal only |
| Automation | Manual | Scripts | Portal: Manual; CLI: Automation |

Recommendation: Use portal for exploration/monitoring, CLI for automation.


Frequently Asked Questions

Q: Can I disable the admin portal?
A: Yes, set KOHAKU_HUB_ADMIN_ENABLED=false

Q: Is the admin token different from user tokens?
A: Yes, the admin token is system-wide. User tokens are per-user.

Q: Can I create multiple admin users?
A: No, the admin portal uses a single shared secret token. Per-user admin access would require implementing a role system.

Q: Does deleting a user delete their repositories?
A: No (unless force delete). Repositories can be transferred to another user.

Q: Can I access admin API without the portal UI?
A: Yes, use curl/Python with X-Admin-Token header.

Q: Is audit logging enabled by default?
A: Yes, all admin operations are logged with [ADMIN] prefix.


Version: 1.0
Status: Production Ready