mock data api for kohakuboard

2026-04-28 09:57:43 -05:00 · 2025-10-25 18:46:06 +08:00
parent b45be04146
commit 09a0cd0c54
13 changed files with 1328 additions and 0 deletions
--- a/src/kohakuboard/LICENSE
+++ b/src/kohakuboard/LICENSE
@@ -0,0 +1,205 @@
+
+# Kohaku Software License 1.0
+
+**Published by KohakuBlueLeaf**
+
+## Purpose
+
+The **Kohaku Software License** aims to provide maximum freedom for users to work with the Software while protecting contributors from liability and ensuring the freedom of end users. It incorporates commercial usage restrictions to balance open access with sustainable development.
+
+## Definitions
+
+- **Software**: Refers to the source code, compiled binaries, libraries, modules, documentation, configuration files, and any other materials provided under this License.
+
+- **Source Code**: The preferred form for making modifications to the Software, including all source files, build scripts, configuration files, and documentation necessary to understand, compile, and modify the Software.
+
+- **Derivative Work**: Any software based on or derived from the original Software, including but not limited to:
+  - Modified versions of the Software
+  - Software that incorporates any portion of the Software
+  - Software that links to, imports, or otherwise depends on the Software in a manner that creates a combined work
+
+  For a Derivative Work to qualify under this license, it must include the complete Source Code necessary to build, use, and modify the Derivative Work.
+
+- **Modify**: To alter, adapt, translate, or otherwise change the Software, or to create Derivative Works.
+
+- **Service Provider**: An entity that uses the Software to offer services to **End Users**, thereby making the **End Users** the recipients of the service.
+
+- **End User**: Any individual or entity that uses the Software directly or uses services provided by a **Service Provider** that utilizes the Software.
+
+- **Non-Commercial Purpose**: Uses that do not involve direct or indirect monetary compensation arising from the use of the Software, including personal use, academic research, experimentation, testing, or non-commercial organizational use.
+
+- **Commercial Usage**: Any use of the Software where:
+  - The Software is used to provide services or products to customers, clients, or users (internal or external) for monetary compensation, or
+  - The Software is incorporated into commercial products or services, or
+  - The Software is used as part of internal company systems that help internal teams execute their business operations in a for-profit organization, or
+  - The organization using the Software generates revenue from activities directly or indirectly involving the Software
+
+- **Total Revenue**:
+  - For Service Providers: The total revenue generated from services utilizing the Software
+  - For product vendors: The total revenue from products incorporating the Software
+  - For internal business systems: The total revenue of the organization using the Software for business operations
+
+## License Grant
+
+### 1. General Permissions
+
+Subject to compliance with this License, KohakuBlueLeaf grants you a non-exclusive, worldwide, non-transferable, non-sublicensable, revocable, royalty-free, and limited license to access, use, modify, create Derivative Works, and distribute the Software for **Non-Commercial Purposes** and **Commercial Usage** under certain conditions.
+
+### 2. Categories of Use
+
+#### a. Direct Users
+
+Individuals or entities that use the Software directly for their personal, academic, or non-commercial purposes without operating in a commercial capacity.
+
+#### b. Service Providers and Commercial Entities
+
+Entities that use the Software to offer services or products to **End Users**, or that use the Software for internal business operations in a for-profit organization.
+
+### 3. Source Code Availability
+
+When using or distributing the Software or any Derivative Works, you must:
+
+- Make the complete Source Code available to recipients
+- Ensure the Source Code is in a form that allows recipients to build, modify, and use the Software
+- Include all necessary build scripts, configuration files, and dependencies information
+
+### 4. Derivative Works
+
+Any Derivative Works created must be published under the **Kohaku Software License**. The minimal requirement includes:
+
+- Complete Source Code of the Derivative Work
+- Build and installation instructions
+- Clear indication of what has been modified from the original Software
+
+**Additional Requirements for Combined Works:**
+
+- If the Derivative Work combines multiple software components or libraries, all such components that form a combined work must be published under this License or a compatible license.
+- You must provide clear documentation on how the components interact and how to build the combined work.
+- **Note**: You are not obligated to release proprietary business logic or workflows that use the Software through standard APIs or interfaces without creating Derivative Works.
+
+## Restrictions
+
+### 1. Commercial Usage
+
+- **Definition**: **Commercial Usage** is defined as any use where:
+  - The Software is used to provide services or products to customers, clients, or users (internal or external) for monetary compensation
+  - The Software is incorporated into commercial products or services
+  - The Software is used as part of internal company systems that help internal teams execute their business operations in a for-profit organization
+  - The organization using the Software generates revenue from activities directly or indirectly involving the Software
+
+- **Conditions for Requiring a Commercial License**: Commercial Usage without a commercial license is prohibited **if either** of the following conditions is met:
+  - **Total Revenue** attributable to or associated with the Software exceeds $25,000 USD per year, OR
+  - **Usage Duration** exceeds 3 months
+
+- **Revenue Threshold and Usage Duration**:
+  - **Trial Period**: Entities are allowed to engage in **Commercial Usage** without a commercial license for a trial period of **up to 3 months**, provided their **Total Revenue** remains below or equal to $25,000 USD per year.
+  - **Revenue Limit**: Entities with **Total Revenue** attributable to or associated with the Software below or equal to $25,000 USD per year are permitted to continue **Commercial Usage** without a commercial license, provided the **Usage Duration** does not exceed 3 months.
+  - **Exceeding Either Threshold**: If an entity's **Total Revenue** exceeds $25,000 USD per year OR the **Commercial Usage** period exceeds 3 months, the entity must request a commercial license from the author.
+
+- **Requesting a Commercial License**: Entities that need to engage in **Commercial Usage** exceeding either threshold must contact the author at kohaku@kblueleaf.net to request a commercial license. The author may grant such licenses at their sole discretion, potentially subject to fees, royalties, or revenue-sharing agreements.
+
+### 2. Prohibited Uses
+
+You may not use the Software for:
+
+- Military purposes or weapons development
+- Surveillance systems or mass monitoring
+- Biometric identification or tracking systems
+- Any activity that infringes on third-party rights
+- Any use violating applicable laws, including privacy and security regulations
+- Generating or distributing malware, exploits, or other malicious software
+
+You may not:
+
+- Alter or remove copyright and proprietary notices
+- Circumvent or remove any security or usage restrictions
+- Impose additional terms that conflict with this License
+- Distribute the Software to prohibited individuals, entities, or countries as defined by applicable export laws
+
+### 3. Distribution Requirements
+
+When distributing the Software or any Derivative Works, you must:
+
+- Include a copy of this License with the distribution
+- Include the complete Source Code or provide clear instructions on how to obtain it
+- **Attribution Notice**: Prominently display the following notice:
+
+  ```
+  This Software is licensed under the Kohaku Software License by KohakuBlueLeaf.
+  Copyright 2025 KohakuBlueLeaf.
+
+  IN NO EVENT SHALL KohakuBlueLeaf BE LIABLE FOR ANY CLAIM, DAMAGES, OR OTHER
+  LIABILITY ARISING FROM THE USE OF THIS SOFTWARE.
+  ```
+
+- **For Derivative Works**:
+  - Include a statement clearly indicating that you have modified the original Software
+  - Document the nature of modifications made
+  - Ensure all Source Code is available under this License
+
+- **No Misrepresentation**: Do not misrepresent or imply that Derivative Works are official versions or have been endorsed by the original author unless authorized in writing.
+
+- **Service Provider Requirements**:
+  - **Service Providers** must provide **End Users** with clear notice that the service utilizes Software licensed under the Kohaku Software License
+  - Include a reference to the original Software and this License in service documentation, terms of service, or user interface (e.g., "About" page, footer)
+
+## No Harm and No Liability
+
+### 1. No Harm
+
+You agree that no contributor's conduct in creating the Software has caused you harm. To the extent permitted by law, you waive the right to pursue any legal claims against contributors related to the creation of the Software.
+
+### 2. No Liability
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES, OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT, OR OTHERWISE, ARISING FROM, OUT OF, OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+
+## Patent Grant
+
+Each contributor grants you a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable patent license to make, use, offer to sell, sell, import, and otherwise transfer the Software, where such license applies only to those patent claims licensable by such contributor that are necessarily infringed by their contribution(s) alone or by combination of their contribution(s) with the Software.
+
+## Interpretation of Ambiguous Terms
+
+In the event of any ambiguity or uncertainty in the interpretation of the terms of this License, the Licensee has the right to interpret the ambiguous descriptions in a manner that aligns with the intended purpose of this License, which is to promote open access while protecting sustainable development through commercial licensing.
+
+## Acceptance and Compliance
+
+By using, modifying, or distributing the Software, you agree to comply with all terms of this License. Non-compliance may result in the automatic termination of your rights under this License.
+
+## Termination
+
+Your rights under this License terminate automatically upon any breach of its terms. Upon termination, you must:
+
+- Cease all use, modification, and distribution of the Software and Derivative Works
+- Destroy all copies of the Software in your possession or control
+- If you are a Service Provider, cease providing services that utilize the Software
+
+Sections regarding No Liability, Indemnification, and General Provisions survive termination.
+
+## Indemnification
+
+You agree to indemnify, defend, and hold harmless KohakuBlueLeaf and its affiliates, contributors, and licensors from and against any claims, damages, losses, liabilities, costs, and expenses (including reasonable attorneys' fees) arising from:
+
+- Your use of the Software
+- Your violation of this License
+- Your violation of any rights of another party
+- Your distribution of the Software or Derivative Works
+
+## General Provisions
+
+- **Governing Law**: This License is governed by the laws of Taiwan, without regard to conflict of law principles.
+
+- **Severability**: If any provision of this License is held to be unenforceable or invalid, that provision shall be modified to the minimum extent necessary to make it enforceable, and the remaining provisions shall remain in full force and effect.
+
+- **Entire Agreement**: This License constitutes the entire agreement between you and KohakuBlueLeaf regarding the Software and supersedes all prior agreements and understandings.
+
+- **No Waiver**: The failure of KohakuBlueLeaf to enforce any provision of this License shall not constitute a waiver of that provision or any other provision.
+
+- **Assignment**: You may not assign or transfer your rights or obligations under this License without prior written consent from KohakuBlueLeaf.
+
+## Revisions
+
+KohakuBlueLeaf may publish revised versions of the Kohaku Software License from time to time. Each version will be given a distinguishing version number. You may choose to use the Software under the terms of the version of the License under which you originally received the Software, or under the terms of any subsequent version published by KohakuBlueLeaf.
+
+## Contact
+
+For commercial licensing inquiries, please contact: kohaku@kblueleaf.net
--- a/src/kohakuboard/README.md
+++ b/src/kohakuboard/README.md
@@ -0,0 +1,46 @@
+# KohakuBoard - Backend
+
+Minimal experiment tracking backend with high-performance data serving.
+
+## Features
+
+- FastAPI-based REST API
+- Sparse metric logging support
+- Multiple data types: scalars, media, tables, histograms
+- Step-indexed data structure
+- Mock data generation for testing
+
+## License
+
+**Kohaku Software License 1.0**
+
+This is a premium feature of KohakuHub with commercial usage restrictions.
+
+- ✅ Free for non-commercial use
+- ⚠️ Commercial trial: up to 3 months and up to $25k revenue/year
+- ⚠️ Beyond either limit, a commercial license is required
+
+Contact: kohaku@kblueleaf.net
+
+## Installation
+
+```bash
+pip install -e .
+```
+
+## Development
+
+```bash
+uvicorn kohakuboard.main:app --reload --port 48889
+```
+
+API docs: http://localhost:48889/docs
+
+## API Endpoints
+
+- `GET /api/experiments` - List experiments
+- `GET /api/experiments/{id}/summary` - Get experiment summary
+- `GET /api/experiments/{id}/scalars/{name}` - Get scalar metric data
+- `GET /api/experiments/{id}/media/{name}` - Get media log
+- `GET /api/experiments/{id}/tables/{name}` - Get table log
+- `GET /api/experiments/{id}/histograms/{name}` - Get histogram log
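
Review note: the endpoint list in this README maps onto a small client helper. The sketch below is only an illustration, assuming the development server from the README (`localhost:48889`) and the `/api` prefix shown in the endpoint list. `build_url` and `fetch_json` are hypothetical names and are not part of this commit.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: the dev server from the README, with routes mounted under /api.
BASE_URL = "http://localhost:48889/api"

# Data kinds that take a name segment, per the README endpoint list.
NAMED_KINDS = {"scalars", "media", "tables", "histograms"}


def build_url(experiment_id, kind=None, name=None, base_url=BASE_URL):
    """Build the URL for one experiment endpoint (hypothetical helper)."""
    url = f"{base_url}/experiments/{quote(experiment_id, safe='')}"
    if kind is None:
        return url  # experiment details
    if kind == "summary":
        return f"{url}/summary"
    if kind not in NAMED_KINDS:
        raise ValueError(f"unknown data kind: {kind!r}")
    if name is None:
        raise ValueError(f"{kind!r} requires a name")
    return f"{url}/{kind}/{quote(name, safe='')}"


def fetch_json(url, timeout=10.0):
    """GET a URL and decode the JSON body; needs the server to be running."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

With the server running, `fetch_json(build_url("exp-001", "summary"))` would return the summary payload, whose `available_data` keys could then be passed back to `build_url` as `kind`.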
--- a/src/kohakuboard/__init__.py
+++ b/src/kohakuboard/__init__.py
@@ -0,0 +1,3 @@
+"""KohakuBoard - ML Experiment Tracking System"""
+
+__version__ = "0.1.0"
--- a/src/kohakuboard/api/__init__.py
+++ b/src/kohakuboard/api/__init__.py
@@ -0,0 +1 @@
+"""API module for KohakuBoard"""
--- a/src/kohakuboard/api/routers/__init__.py
+++ b/src/kohakuboard/api/routers/__init__.py
@@ -0,0 +1 @@
+"""API routers for KohakuBoard"""
--- a/src/kohakuboard/api/routers/experiments.py
+++ b/src/kohakuboard/api/routers/experiments.py
@@ -0,0 +1,321 @@
+"""Experiments API endpoints"""
+
+import random
+
+from fastapi import APIRouter, HTTPException, Query
+from pydantic import BaseModel
+from typing import List, Optional
+
+from kohakuboard.api.utils.mock_data import (
+    generate_experiment,
+    generate_metrics_data,
+    generate_sparse_metrics_data,
+    generate_histogram_data,
+    generate_table_data,
+)
+from kohakuboard.config import cfg
+from kohakuboard.logger import logger_api
+
+router = APIRouter()
+
+# Mock experiment storage with large datasets for testing WebGL performance
+MOCK_EXPERIMENTS = {
+    "exp-001": generate_experiment(
+        "exp-001", "ResNet50 Training (1K steps)", steps=1000, status="completed"
+    ),
+    "exp-002": generate_experiment(
+        "exp-002", "BERT Fine-tuning (10K steps)", steps=10000, status="running"
+    ),
+    "exp-003": generate_experiment(
+        "exp-003", "ViT Pretraining (50K steps)", steps=50000, status="completed"
+    ),
+    "exp-004": generate_experiment(
+        "exp-004", "GPT-2 Training (100K steps)", steps=100000, status="completed"
+    ),
+    "exp-005": generate_experiment(
+        "exp-005", "Stable Diffusion (25K steps)", steps=25000, status="stopped"
+    ),
+}
+
+
+class MetricsQuery(BaseModel):
+    """Query parameters for metrics"""
+
+    metric_names: Optional[List[str]] = None
+    start_step: Optional[int] = None
+    end_step: Optional[int] = None
+
+
+@router.get("/experiments")
+async def list_experiments():
+    """List all experiments"""
+    logger_api.info("Fetching experiments list")
+    return list(MOCK_EXPERIMENTS.values())
+
+
+@router.get("/experiments/{experiment_id}")
+async def get_experiment(experiment_id: str):
+    """Get experiment details"""
+    logger_api.info(f"Fetching experiment: {experiment_id}")
+
+    if experiment_id not in MOCK_EXPERIMENTS:
+        raise HTTPException(status_code=404, detail="Experiment not found")
+
+    return MOCK_EXPERIMENTS[experiment_id]
+
+
+@router.get("/experiments/{experiment_id}/metrics")
+async def get_metrics(
+    experiment_id: str,
+    metric_names: Optional[str] = Query(
+        None, description="Comma-separated metric names"
+    ),
+    start_step: Optional[int] = Query(None, description="Start step"),
+    end_step: Optional[int] = Query(None, description="End step"),
+    steps: Optional[int] = Query(None, description="Number of steps", le=cfg.mock.max_steps),
+):
+    """Get metrics data for an experiment"""
+    logger_api.info(f"Fetching metrics for experiment: {experiment_id}")
+
+    if experiment_id not in MOCK_EXPERIMENTS:
+        raise HTTPException(status_code=404, detail="Experiment not found")
+
+    if steps is None:
+        steps = MOCK_EXPERIMENTS[experiment_id]["total_steps"]
+
+    # Parse metric names
+    metrics = None
+    if metric_names:
+        metrics = [m.strip() for m in metric_names.split(",")]
+
+    # Generate metrics data
+    metrics_data = generate_metrics_data(steps=steps, metrics=metrics)
+
+    # Filter by step range if provided
+    if start_step is not None or end_step is not None:
+        for metric in metrics_data:
+            start = start_step if start_step is not None else 0
+            end = end_step if end_step is not None else len(metric["x"])
+            metric["x"] = metric["x"][start:end]
+            metric["y"] = metric["y"][start:end]
+
+    return {"experiment_id": experiment_id, "metrics": metrics_data}
+
+
+@router.get("/experiments/{experiment_id}/summary")
+async def get_experiment_summary(experiment_id: str):
+    """Get experiment summary with available data"""
+    logger_api.info(f"Fetching summary for experiment: {experiment_id}")
+
+    if experiment_id not in MOCK_EXPERIMENTS:
+        raise HTTPException(status_code=404, detail="Experiment not found")
+
+    sample = generate_sparse_metrics_data(total_events=100)
+
+    return {
+        "experiment_id": experiment_id,
+        "experiment_info": MOCK_EXPERIMENTS[experiment_id],
+        "total_steps": MOCK_EXPERIMENTS[experiment_id]["total_steps"],
+        "available_data": {
+            "scalars": [k for k in sample.keys() if k != "time"],
+            "media": ["generated_images", "model_predictions", "attention_maps"],
+            "tables": ["validation_results", "layer_stats", "confusion_matrix"],
+            "histograms": ["gradients", "weights", "activations"],
+        },
+    }
+
+
+@router.get("/experiments/{experiment_id}/scalars/{metric_name}")
+async def get_scalar_metric(experiment_id: str, metric_name: str):
+    """Get scalar metric as step-value pairs"""
+    logger_api.info(f"Fetching scalar '{metric_name}' for experiment: {experiment_id}")
+
+    if experiment_id not in MOCK_EXPERIMENTS:
+        raise HTTPException(status_code=404, detail="Experiment not found")
+
+    total_steps = MOCK_EXPERIMENTS[experiment_id]["total_steps"]
+    full_data = generate_sparse_metrics_data(total_events=total_steps)
+
+    if metric_name not in full_data:
+        raise HTTPException(status_code=404, detail=f"Metric '{metric_name}' not found")
+
+    # Return as step-value pairs (filter out None values)
+    data = []
+    for i, value in enumerate(full_data[metric_name]):
+        if value is not None:
+            data.append({"step": i, "value": value})
+
+    return {"experiment_id": experiment_id, "metric_name": metric_name, "data": data}
+
+
+@router.get("/experiments/{experiment_id}/media/{media_name}")
+async def get_media_log(experiment_id: str, media_name: str):
+    """Get media log entries"""
+    logger_api.info(f"Fetching media '{media_name}' for experiment: {experiment_id}")
+
+    if experiment_id not in MOCK_EXPERIMENTS:
+        raise HTTPException(status_code=404, detail="Experiment not found")
+
+    # Mock media data with real placeholder URLs
+    media_entries = []
+    total_steps = MOCK_EXPERIMENTS[experiment_id]["total_steps"]
+    log_every = 1000  # Log media every 1000 steps
+
+    for step in range(0, total_steps, log_every):
+        media_entries.append(
+            {
+                "step": step,
+                "type": "image",
+                "url": f"https://picsum.photos/seed/{experiment_id}-{media_name}-{step}/512/512",
+                "caption": f"{media_name} at step {step}",
+            }
+        )
+
+    return {
+        "experiment_id": experiment_id,
+        "media_name": media_name,
+        "data": media_entries,
+    }
+
+
+@router.get("/experiments/{experiment_id}/tables/{table_name}")
+async def get_table_log(experiment_id: str, table_name: str):
+    """Get table log entries"""
+    logger_api.info(f"Fetching table '{table_name}' for experiment: {experiment_id}")
+
+    if experiment_id not in MOCK_EXPERIMENTS:
+        raise HTTPException(status_code=404, detail="Experiment not found")
+
+    total_steps = MOCK_EXPERIMENTS[experiment_id]["total_steps"]
+    log_every = 5000
+
+    table_entries = []
+    for step in range(0, total_steps, log_every):
+        step_num = step // log_every
+        table_entries.append(
+            {
+                "step": step,
+                "columns": [
+                    "Sample",
+                    "Image",
+                    "Precision",
+                    "Recall",
+                    "F1-Score",
+                    "Support",
+                ],
+                "column_types": [
+                    "text",
+                    "image",
+                    "number",
+                    "number",
+                    "number",
+                    "number",
+                ],
+                "rows": [
+                    [
+                        "Cat",
+                        f"https://picsum.photos/seed/{experiment_id}-cat-{step}/64/64",
+                        0.85 + random.random() * 0.1,
+                        0.80 + random.random() * 0.1,
+                        0.82 + random.random() * 0.1,
+                        120,
+                    ],
+                    [
+                        "Dog",
+                        f"https://picsum.photos/seed/{experiment_id}-dog-{step}/64/64",
+                        0.88 + random.random() * 0.1,
+                        0.85 + random.random() * 0.1,
+                        0.86 + random.random() * 0.1,
+                        150,
+                    ],
+                    [
+                        "Bird",
+                        f"https://picsum.photos/seed/{experiment_id}-bird-{step}/64/64",
+                        0.75 + random.random() * 0.1,
+                        0.70 + random.random() * 0.1,
+                        0.72 + random.random() * 0.1,
+                        80,
+                    ],
+                ],
+            }
+        )
+
+    return {
+        "experiment_id": experiment_id,
+        "table_name": table_name,
+        "data": table_entries,
+    }
+
+
+@router.get("/experiments/{experiment_id}/histograms/{histogram_name}")
+async def get_histogram_log(experiment_id: str, histogram_name: str):
+    """Get histogram log entries"""
+    logger_api.info(
+        f"Fetching histogram '{histogram_name}' for experiment: {experiment_id}"
+    )
+
+    if experiment_id not in MOCK_EXPERIMENTS:
+        raise HTTPException(status_code=404, detail="Experiment not found")
+
+    total_steps = MOCK_EXPERIMENTS[experiment_id]["total_steps"]
+    log_every = 2000
+
+    histogram_entries = []
+    for step in range(0, total_steps, log_every):
+        histogram_entries.append(
+            {
+                "step": step,
+                "bins": 50,
+                "values": [
+                    random.gauss(0, 1 - step / total_steps) for _ in range(10000)
+                ],
+            }
+        )
+
+    return {
+        "experiment_id": experiment_id,
+        "histogram_name": histogram_name,
+        "data": histogram_entries,
+    }
+
+
+@router.get("/experiments/{experiment_id}/histograms/{histogram_name}")
+async def get_histogram(
+    experiment_id: str,
+    histogram_name: str,
+    num_values: int = Query(10000, description="Number of data points", le=1000000),
+    distribution: str = Query(
+        "normal", description="Distribution type (normal, uniform, exponential)"
+    ),
+):
+    """Get histogram data (unreachable: same path as get_histogram_log)"""
+    logger_api.info(
+        f"Fetching histogram '{histogram_name}' for experiment: {experiment_id}"
+    )
+
+    if experiment_id not in MOCK_EXPERIMENTS:
+        raise HTTPException(status_code=404, detail="Experiment not found")
+
+    histogram_data = generate_histogram_data(
+        num_values=num_values, distribution=distribution
+    )
+
+    return histogram_data
+
+
+@router.get("/experiments/{experiment_id}/tables/{table_name}")
+async def get_table(
+    experiment_id: str,
+    table_name: str,
+    num_rows: int = Query(100, description="Number of rows", le=10000),
+    num_cols: int = Query(6, description="Number of columns", le=50),
+):
+    """Get table data (unreachable: same path as get_table_log)"""
+    logger_api.info(f"Fetching table '{table_name}' for experiment: {experiment_id}")
+
+    if experiment_id not in MOCK_EXPERIMENTS:
+        raise HTTPException(status_code=404, detail="Experiment not found")
+
+    table_data = generate_table_data(num_rows=num_rows, num_cols=num_cols)
+
+    return table_data
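
Review note: two pieces of logic in `experiments.py` are easy to check in isolation: the step-range filter in `get_metrics` and the `None` filtering in `get_scalar_metric`. The sketch below restates them as standalone helpers. The names `slice_step_range` and `to_step_value_pairs` are hypothetical and not part of this commit. It uses explicit `is None` checks so that a bound of `0` is treated as a real bound and not as "not provided".

```python
from typing import Any, Dict, List, Optional, Sequence, Tuple


def slice_step_range(
    x: Sequence[Any],
    y: Sequence[Any],
    start_step: Optional[int] = None,
    end_step: Optional[int] = None,
) -> Tuple[Sequence[Any], Sequence[Any]]:
    """Slice parallel x/y series by index; None means 'no bound on this side'."""
    start = 0 if start_step is None else start_step
    end = len(x) if end_step is None else end_step
    return x[start:end], y[start:end]


def to_step_value_pairs(values: Sequence[Optional[float]]) -> List[Dict[str, Any]]:
    """Turn a sparse series into step-value pairs, skipping unlogged (None) steps."""
    return [
        {"step": step, "value": value}
        for step, value in enumerate(values)
        if value is not None
    ]


if __name__ == "__main__":
    xs, ys = slice_step_range([0, 1, 2, 3], [0.9, 0.7, 0.5, 0.4], start_step=1, end_step=3)
    print(xs, ys)
    print(to_step_value_pairs([0.9, None, 0.5]))
```

As in the endpoint, the bounds are list indices, so this only matches step numbers when the series has one entry per step starting at 0.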
--- a/src/kohakuboard/api/routers/mock.py
+++ b/src/kohakuboard/api/routers/mock.py
@@ -0,0 +1,119 @@
+"""Mock data generation API endpoints"""
+
+from fastapi import APIRouter, HTTPException
+from pydantic import BaseModel, Field
+from typing import List, Optional
+
+from kohakuboard.api.utils.mock_data import (
+    generate_metrics_data,
+    generate_histogram_data,
+    generate_scatter_data,
+    generate_table_data,
+)
+from kohakuboard.config import cfg
+from kohakuboard.logger import logger_mock
+
+router = APIRouter()
+
+
+class MockMetricsConfig(BaseModel):
+    """Configuration for mock metrics generation"""
+
+    steps: int = Field(
+        default=100000, le=cfg.mock.max_steps, description="Number of steps"
+    )
+    metrics: Optional[List[str]] = Field(default=None, description="Metric names")
+
+
+class MockHistogramConfig(BaseModel):
+    """Configuration for mock histogram generation"""
+
+    num_values: int = Field(
+        default=10000, le=1000000, description="Number of data points"
+    )
+    distribution: str = Field(default="normal", description="Distribution type")
+    mean: float = Field(default=0.0, description="Mean value")
+    std: float = Field(default=1.0, description="Standard deviation")
+
+
+class MockScatterConfig(BaseModel):
+    """Configuration for mock scatter plot generation"""
+
+    num_points: int = Field(default=1000, le=100000, description="Number of points")
+    correlation: float = Field(
+        default=0.7, ge=-1.0, le=1.0, description="Correlation coefficient"
+    )
+
+
+class MockTableConfig(BaseModel):
+    """Configuration for mock table generation"""
+
+    num_rows: int = Field(default=100, le=10000, description="Number of rows")
+    num_cols: int = Field(default=6, le=50, description="Number of columns")
+
+
+@router.post("/mock/metrics")
+async def generate_mock_metrics(config: MockMetricsConfig):
+    """Generate mock metrics data"""
+    logger_mock.info(
+        f"Generating mock metrics: steps={config.steps}, metrics={config.metrics}"
+    )
+
+    try:
+        data = generate_metrics_data(steps=config.steps, metrics=config.metrics)
+        return {"metrics": data}
+    except Exception as e:
+        logger_mock.error(f"Failed to generate mock metrics: {e}")
+        raise HTTPException(status_code=500, detail=str(e))
+
+
+@router.post("/mock/histogram")
+async def generate_mock_histogram(config: MockHistogramConfig):
+    """Generate mock histogram data"""
+    logger_mock.info(
+        f"Generating mock histogram: num_values={config.num_values}, distribution={config.distribution}"
+    )
+
+    try:
+        data = generate_histogram_data(
+            num_values=config.num_values,
+            distribution=config.distribution,
+            mean=config.mean,
+            std=config.std,
+        )
+        return data
+    except Exception as e:
+        logger_mock.error(f"Failed to generate mock histogram: {e}")
+        raise HTTPException(status_code=500, detail=str(e))
+
+
+@router.post("/mock/scatter")
+async def generate_mock_scatter(config: MockScatterConfig):
+    """Generate mock scatter plot data"""
+    logger_mock.info(
+        f"Generating mock scatter: num_points={config.num_points}, correlation={config.correlation}"
+    )
+
+    try:
+        data = generate_scatter_data(
+            num_points=config.num_points, correlation=config.correlation
+        )
+        return {"scatter": data}
+    except Exception as e:
+        logger_mock.error(f"Failed to generate mock scatter: {e}")
+        raise HTTPException(status_code=500, detail=str(e))
+
+
+@router.post("/mock/table")
+async def generate_mock_table(config: MockTableConfig):
+    """Generate mock table data"""
+    logger_mock.info(
+        f"Generating mock table: rows={config.num_rows}, cols={config.num_cols}"
+    )
+
+    try:
+        data = generate_table_data(num_rows=config.num_rows, num_cols=config.num_cols)
+        return data
+    except Exception as e:
+        logger_mock.error(f"Failed to generate mock table: {e}")
+        raise HTTPException(status_code=500, detail=str(e))
--- a/src/kohakuboard/api/routers/runs.py
+++ b/src/kohakuboard/api/routers/runs.py
@@ -0,0 +1,126 @@
+"""Experiment runs API with better data structure"""
+
+from fastapi import APIRouter, HTTPException, Query
+from typing import List, Optional
+from kohakuboard.api.utils.mock_data import generate_sparse_metrics_data
+from kohakuboard.logger import logger_api
+
+router = APIRouter()
+
+# Mock runs data
+MOCK_RUNS = {
+    "run-001": {
+        "id": "run-001",
+        "name": "ResNet50 Training",
+        "status": "running",
+        "created_at": "2025-01-15T10:00:00Z",
+    }
+}
+
+
+@router.get("/runs/{run_id}/summary")
+async def get_run_summary(run_id: str):
+    """Get run summary with all available metrics"""
+    logger_api.info(f"Fetching summary for run: {run_id}")
+
+    if run_id not in MOCK_RUNS:
+        raise HTTPException(status_code=404, detail="Run not found")
+
+    # Generate sample data to get metric names
+    sample_data = generate_sparse_metrics_data(total_events=100)
+
+    return {
+        "run_id": run_id,
+        "run_info": MOCK_RUNS[run_id],
+        "total_events": 100000,
+        "available_metrics": {
+            "scalars": [k for k in sample_data.keys() if k != "time"],
+            "images": ["generated_samples", "confusion_matrix"],
+            "tables": ["validation_results", "layer_stats"],
+        },
+    }
+
+
+@router.get("/runs/{run_id}/scalars/{metric_name}")
+async def get_scalar_values(
+    run_id: str,
+    metric_name: str,
+    start_event: Optional[int] = Query(None),
+    end_event: Optional[int] = Query(None),
+):
+    """Get scalar values for a specific metric"""
+    logger_api.info(f"Fetching scalar '{metric_name}' for run: {run_id}")
+
+    if run_id not in MOCK_RUNS:
+        raise HTTPException(status_code=404, detail="Run not found")
+
+    # Generate full dataset
+    full_data = generate_sparse_metrics_data(total_events=100000)
+
+    if metric_name == "time" or metric_name not in full_data:
+        raise HTTPException(status_code=404, detail=f"Metric '{metric_name}' not found")
+
+    start = 0 if start_event is None else start_event
+    end = len(full_data["time"]) if end_event is None else end_event
+
+    return {
+        "run_id": run_id,
+        "metric_name": metric_name,
+        "time": full_data["time"][start:end],
+        "values": full_data[metric_name][start:end],
+    }
+
+
+@router.get("/runs/{run_id}/images/{image_name}")
+async def get_image_log(run_id: str, image_name: str, limit: int = Query(100, ge=1, le=1000)):
+    """Get image log entries"""
+    logger_api.info(f"Fetching images '{image_name}' for run: {run_id}")
+
+    if run_id not in MOCK_RUNS:
+        raise HTTPException(status_code=404, detail="Run not found")
+
+    # Mock image data (at most 10 entries, regardless of limit)
+    images = []
+    for i in range(min(limit, 10)):
+        images.append(
+            {
+                "step": i * 1000,
+                "url": f"https://via.placeholder.com/256x256?text=Step+{i * 1000}",
+                "caption": f"Generated sample at step {i * 1000}",
+            }
+        )
+
+    return {"run_id": run_id, "image_name": image_name, "images": images}
+
+
+@router.get("/runs/{run_id}/tables/{table_name}")
+async def get_table_log(run_id: str, table_name: str):
+    """Get table log with optional image columns"""
+    logger_api.info(f"Fetching table '{table_name}' for run: {run_id}")
+
+    if run_id not in MOCK_RUNS:
+        raise HTTPException(status_code=404, detail="Run not found")
+
+    # Mock table with images
+    columns = ["ID", "Name", "Score", "Image", "Status"]
+    column_types = ["number", "text", "number", "image", "text"]
+    rows = []
+
+    for i in range(20):
+        rows.append(
+            [
+                i + 1,
+                f"Sample_{i + 1}",
+                round(0.5 + (i / 20) * 0.5, 3),
+                f"https://via.placeholder.com/64x64?text={i + 1}",
+                "Pass" if i % 3 == 0 else "Fail",
+            ]
+        )
+
+    return {
+        "run_id": run_id,
+        "table_name": table_name,
+        "columns": columns,
+        "column_types": column_types,
+        "rows": rows,
+    }
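Note on the scalars endpoint in runs.py above: `time` and `values` come back as equal-length lists, and `values` holds None for every event at which the metric was not logged. A minimal client-side sketch for compacting such a response before plotting follows. The helper name `compact_series` and the sample payload are hypothetical and not part of this commit.

```python
from typing import List, Optional, Tuple


def compact_series(
    time: List[int], values: List[Optional[float]]
) -> Tuple[List[int], List[float]]:
    """Drop events at which the metric was not logged (value is None)."""
    pairs = [(t, v) for t, v in zip(time, values) if v is not None]
    xs = [t for t, _ in pairs]
    ys = [v for _, v in pairs]
    return xs, ys


# Hypothetical payload shaped like the endpoint response: a metric logged every 2 events
payload = {"time": [0, 1, 2, 3, 4], "values": [2.8, None, 2.7, None, 2.6]}
xs, ys = compact_series(payload["time"], payload["values"])
print(xs, ys)
```

Keeping the None placeholders on the server side lets every metric share one `time` axis; the compaction only matters for consumers that cannot skip gaps themselves.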
--- a/src/kohakuboard/api/utils/__init__.py
+++ b/src/kohakuboard/api/utils/__init__.py
@@ -0,0 +1 @@
+"""API utilities for KohakuBoard"""
--- a/src/kohakuboard/api/utils/mock_data.py
+++ b/src/kohakuboard/api/utils/mock_data.py
@@ -0,0 +1,351 @@
+"""Mock data generation utilities"""
+
+import random
+import math
+from datetime import datetime, timedelta, timezone
+from typing import List, Dict, Any
+
+from kohakuboard.config import cfg
+
+
+def generate_time_series_data(
+    steps: int | None = None,
+    start_value: float = 1.0,
+    trend: str = "decreasing",
+    noise_level: float | None = None,
+    smoothness: float = 0.95,
+) -> List[float]:
+    """
+    Generate realistic time series data
+
+    Args:
+        steps: Number of data points (default: cfg.mock.default_steps)
+        start_value: Starting value
+        trend: 'increasing', 'decreasing', 'stable', or 'oscillating'
+        noise_level: Gaussian noise std as a fraction of start_value
+        smoothness: Exponential smoothing factor (0.0 = no smoothing, near 1.0 = heavy)
+
+    Returns:
+        List of values
+    """
+    if steps is None:
+        steps = cfg.mock.default_steps
+    if noise_level is None:
+        noise_level = cfg.mock.default_noise_level
+
+    values = []
+    current_value = start_value
+    smoothed_value = start_value
+
+    for step in range(steps):
+        # Calculate trend component
+        progress = step / max(steps - 1, 1)
+
+        match trend:
+            case "decreasing":
+                trend_value = start_value * math.exp(-3 * progress)
+            case "increasing":
+                trend_value = start_value * (1 + 2 * progress)
+            case "stable":
+                trend_value = start_value
+            case "oscillating":
+                trend_value = start_value * (1 + 0.3 * math.sin(10 * progress))
+            case _:
+                trend_value = start_value
+
+        # Add noise
+        noise = random.gauss(0, noise_level * start_value)
+
+        # Combine and smooth
+        current_value = trend_value + noise
+        smoothed_value = smoothness * smoothed_value + (1 - smoothness) * current_value
+
+        values.append(smoothed_value)
+
+    return values
+
+
+def generate_sparse_metrics_data(
+    total_events: int = 1000, metrics_config: List[Dict[str, Any]] | None = None
+) -> Dict[str, List[Any]]:
+    """
+    Generate sparse multi-metric logging data
+
+    Args:
+        total_events: Total number of logging events
+        metrics_config: List of metric configurations with logging frequency
+
+    Returns:
+        Dict with "time" (event indices) and one list per metric, None where not logged
+    """
+    if metrics_config is None:
+        metrics_config = [
+            {
+                "name": "train_loss",
+                "log_every": 1,
+                "type": "loss",
+                "start": 2.5,
+                "noise": 0.08,
+            },
+            {
+                "name": "train_accuracy",
+                "log_every": 1,
+                "type": "accuracy",
+                "start": 0.3,
+                "noise": 0.015,
+            },
+            {
+                "name": "val_loss",
+                "log_every": 10,
+                "type": "loss",
+                "start": 2.8,
+                "noise": 0.12,
+            },
+            {
+                "name": "val_accuracy",
+                "log_every": 10,
+                "type": "accuracy",
+                "start": 0.25,
+                "noise": 0.02,
+            },
+            {
+                "name": "learning_rate",
+                "log_every": 5,
+                "type": "lr",
+                "start": 0.001,
+                "noise": 0,
+            },
+            {"name": "step", "log_every": 1, "type": "step"},
+        ]
+
+    result = {"time": list(range(total_events))}
+
+    for config in metrics_config:
+        metric_name = config["name"]
+        log_every = config["log_every"]
+        metric_type = config.get("type", "default")
+        start_val = config.get("start", 1.0)
+        noise_level = config.get("noise", 0.1)
+
+        values = []
+        value_index = 0
+
+        for i in range(total_events):
+            if i % log_every == 0:
+                if metric_type == "loss":
+                    base_value = start_val * (0.95 ** (value_index / 10))
+                    value = base_value + random.gauss(0, noise_level)
+                elif metric_type == "accuracy":
+                    progress = value_index / (total_events / log_every)
+                    base_value = min(0.99, start_val + progress * 0.65)
+                    value = base_value + random.gauss(0, noise_level)
+                elif metric_type == "lr":
+                    value = start_val * (0.99 ** (value_index / 10))
+                elif metric_type == "step":
+                    value = i
+                else:
+                    value = random.random()
+
+                values.append(value)
+                value_index += 1
+            else:
+                values.append(None)
+
+        result[metric_name] = values
+
+    return result
+
+
+def generate_metrics_data(
+    steps: int | None = None, metrics: List[str] | None = None
+) -> List[Dict[str, Any]]:
+    """
+    Generate mock metrics data for line plots
+
+    Args:
+        steps: Number of steps
+        metrics: List of metric names
+
+    Returns:
+        List of metric series
+    """
+    if steps is None:
+        steps = cfg.mock.default_steps
+    if metrics is None:
+        metrics = ["train_loss", "val_loss", "train_accuracy", "val_accuracy"]
+
+    result = []
+
+    for metric_name in metrics:
+        x_values = list(range(steps))
+
+        # Configure based on metric type
+        if "loss" in metric_name:
+            y_values = generate_time_series_data(
+                steps=steps,
+                start_value=2.5 if "train" in metric_name else 2.8,
+                trend="decreasing",
+                noise_level=0.05,
+                smoothness=0.95,
+            )
+        elif "accuracy" in metric_name:
+            y_values = generate_time_series_data(
+                steps=steps,
+                start_value=0.3 if "train" in metric_name else 0.25,
+                trend="increasing",
+                noise_level=0.02,
+                smoothness=0.97,
+            )
+        else:
+            y_values = generate_time_series_data(
+                steps=steps,
+                start_value=1.0,
+                trend="stable",
+                noise_level=0.1,
+                smoothness=0.9,
+            )
+
+        result.append({"name": metric_name, "x": x_values, "y": y_values})
+
+    return result
+
+
+def generate_histogram_data(
+    num_values: int = 10000,
+    distribution: str = "normal",
+    mean: float = 0.0,
+    std: float = 1.0,
+) -> Dict[str, Any]:
+    """
+    Generate histogram data
+
+    Args:
+        num_values: Number of data points
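+        distribution: 'normal', 'uniform', or 'exponential' (anything else falls back to 'normal')
+        mean: Mean for 'normal'; center of the range for 'uniform'; unused for 'exponential'
+        std: Std for 'normal'; half-width for 'uniform'; mean (1 / rate) for 'exponential'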
+
+    Returns:
+        Histogram data dict
+    """
+    match distribution:
+        case "normal":
+            values = [random.gauss(mean, std) for _ in range(num_values)]
+        case "uniform":
+            values = [random.uniform(mean - std, mean + std) for _ in range(num_values)]
+        case "exponential":
+            values = [random.expovariate(1 / std) for _ in range(num_values)]
+        case _:
+            values = [random.gauss(mean, std) for _ in range(num_values)]
+
+    return {
+        "values": values,
+        "bins": 50,
+        "name": f"{distribution.capitalize()} Distribution",
+    }
+
+
+def generate_scatter_data(
+    num_points: int = 1000, correlation: float = 0.7
+) -> List[Dict[str, Any]]:
+    """
+    Generate scatter plot data
+
+    Args:
+        num_points: Number of data points
+        correlation: Correlation between x and y (-1.0 to 1.0)
+
+    Returns:
+        List of scatter series
+    """
+    x_values = [random.gauss(0, 1) for _ in range(num_points)]
+    y_values = [
+        correlation * x + math.sqrt(1 - correlation**2) * random.gauss(0, 1)
+        for x in x_values
+    ]
+
+    # Generate color values based on distance from origin
+    colors = [math.sqrt(x**2 + y**2) for x, y in zip(x_values, y_values)]
+
+    return [{"name": "Data Points", "x": x_values, "y": y_values, "color": colors}]
+
+
+def generate_table_data(num_rows: int = 100, num_cols: int = 6) -> Dict[str, Any]:
+    """
+    Generate table data
+
+    Args:
+        num_rows: Number of rows
+        num_cols: Number of columns
+
+    Returns:
+        Table data dict
+    """
+    columns = [f"Column_{i+1}" for i in range(min(num_cols, 6))]  # rows hold at most 6 values
+    rows = []
+
+    for i in range(num_rows):
+        row = [
+            i + 1,  # ID column
+            f"Item_{i+1}",  # Name column
+            round(random.uniform(0, 100), 2),  # Value 1
+            round(random.uniform(0, 1), 4),  # Value 2
+            random.choice(["A", "B", "C", "D"]),  # Category
+            round(random.uniform(0, 10), 1),  # Value 3
+        ]
+        rows.append(row[:num_cols])
+
+    return {"columns": columns, "rows": rows}
+
+
+def generate_experiment(
+    experiment_id: str, name: str, steps: int | None = None, status: str = "completed"
+) -> Dict[str, Any]:
+    """
+    Generate a complete experiment with all data
+
+    Args:
+        experiment_id: Experiment ID
+        name: Experiment name
+        steps: Number of training steps
+        status: Experiment status
+
+    Returns:
+        Complete experiment data
+    """
+    if steps is None:
+        steps = cfg.mock.default_steps
+
+    created_at = datetime.now(timezone.utc) - timedelta(hours=random.randint(1, 168))
+    duration_seconds = random.randint(600, 14400)  # 10 min to 4 hours
+
+    return {
+        "id": experiment_id,
+        "name": name,
+        "description": "Mock experiment for testing KohakuBoard visualization",
+        "status": status,
+        "total_steps": steps,
+        "duration": format_duration(duration_seconds),
+        "created_at": created_at.isoformat(),
+        "updated_at": (created_at + timedelta(seconds=duration_seconds)).isoformat(),
+        "config": {
+            "learning_rate": round(random.uniform(1e-5, 1e-2), 6),
+            "batch_size": random.choice([16, 32, 64, 128]),
+            "optimizer": random.choice(["Adam", "SGD", "AdamW"]),
+            "model": random.choice(["ResNet50", "ViT-B/16", "BERT-base"]),
+        },
+    }
+
+
+def format_duration(seconds: int) -> str:
+    """Format duration in human-readable format"""
+    hours = seconds // 3600
+    minutes = (seconds % 3600) // 60
+    secs = seconds % 60
+
+    if hours > 0:
+        return f"{hours}h {minutes}m"
+    elif minutes > 0:
+        return f"{minutes}m {secs}s"
+    else:
+        return f"{secs}s"
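Note on `generate_time_series_data` in mock_data.py above: its "combine and smooth" step is an exponential moving average, `smoothed = smoothness * smoothed + (1 - smoothness) * current`. The sketch below isolates that recurrence so its effect is easier to see. The function name `exponential_smooth` and its `initial` parameter are hypothetical and introduced only for illustration.

```python
from typing import List


def exponential_smooth(
    values: List[float], smoothness: float, initial: float
) -> List[float]:
    """Apply the same recurrence as the generator's smoothing step."""
    smoothed = initial
    out = []
    for v in values:
        # Keep `smoothness` of the running value, blend in the rest from the new sample
        smoothed = smoothness * smoothed + (1 - smoothness) * v
        out.append(smoothed)
    return out


# Starting from 1.0 and feeding zeros halves the value at each step
print(exponential_smooth([0.0, 0.0, 0.0], smoothness=0.5, initial=1.0))
# → [0.5, 0.25, 0.125]
```

With the defaults used in this file (smoothness 0.95 to 0.97), the output trails the underlying trend by a lag on the order of 1 / (1 - smoothness) steps, which is why the generated curves start near `start_value` and only gradually follow the trend.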
--- a/src/kohakuboard/config.py
+++ b/src/kohakuboard/config.py
@@ -0,0 +1,58 @@
+"""Configuration for KohakuBoard"""
+
+import os
+from dataclasses import dataclass
+
+
+@dataclass
+class AppConfig:
+    """Application configuration"""
+
+    host: str = "0.0.0.0"
+    port: int = 48889
+    api_base: str = "/api"
+    cors_origins: list[str] | None = None
+
+    def __post_init__(self):
+        if self.cors_origins is None:
+            self.cors_origins = ["http://localhost:5175", "http://localhost:28080"]
+
+
+@dataclass
+class MockDataConfig:
+    """Mock data generation configuration"""
+
+    default_steps: int = 1000
+    default_metrics_count: int = 4
+    default_noise_level: float = 0.1
+    max_steps: int = 100000
+    max_metrics: int = 50
+
+
+@dataclass
+class Config:
+    """Main configuration"""
+
+    app: AppConfig
+    mock: MockDataConfig
+
+    @classmethod
+    def from_env(cls):
+        """Load configuration from environment variables"""
+        return cls(
+            app=AppConfig(
+                host=os.getenv("KOHAKU_BOARD_HOST", "0.0.0.0"),
+                port=int(os.getenv("KOHAKU_BOARD_PORT", "48889")),
+                api_base=os.getenv("KOHAKU_BOARD_API_BASE", "/api"),
+            ),
+            mock=MockDataConfig(
+                default_steps=int(os.getenv("KOHAKU_BOARD_DEFAULT_STEPS", "1000")),
+                default_metrics_count=int(
+                    os.getenv("KOHAKU_BOARD_DEFAULT_METRICS", "4")
+                ),
+                default_noise_level=float(os.getenv("KOHAKU_BOARD_NOISE_LEVEL", "0.1")),
+            ),
+        )
+
+
+cfg = Config.from_env()
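Note on `AppConfig` in config.py above: the `None` default plus `__post_init__` is one way to avoid a shared mutable default for `cors_origins`. An alternative, sketched here under the assumption that the same two origins are wanted, is `dataclasses.field(default_factory=...)`. The class name `AppConfigSketch` is hypothetical and this is not the commit's implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AppConfigSketch:  # hypothetical stand-in for AppConfig
    host: str = "0.0.0.0"
    port: int = 48889
    # default_factory runs once per instance, so each config gets its own list
    cors_origins: List[str] = field(
        default_factory=lambda: ["http://localhost:5175", "http://localhost:28080"]
    )


a = AppConfigSketch()
b = AppConfigSketch()
a.cors_origins.append("http://example.test")  # does not leak into b
print(b.cors_origins)
```

Either approach gives each instance an independent list; the factory form keeps the annotation a plain list type instead of an optional one.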
--- a/src/kohakuboard/logger.py
+++ b/src/kohakuboard/logger.py
@@ -0,0 +1,42 @@
+"""Logging configuration for KohakuBoard"""
+
+import logging
+import sys
+
+
+class ColoredFormatter(logging.Formatter):
+    """Colored log formatter"""
+
+    COLORS = {
+        "DEBUG": "\033[0;36m",  # Cyan
+        "INFO": "\033[0;32m",  # Green
+        "WARNING": "\033[0;33m",  # Yellow
+        "ERROR": "\033[0;31m",  # Red
+        "CRITICAL": "\033[1;31m",  # Bold Red
+    }
+    RESET = "\033[0m"
+
+    def format(self, record):
+        log_color = self.COLORS.get(record.levelname, self.RESET)
+        record.levelname = f"{log_color}{record.levelname}{self.RESET}"
+        record.name = f"\033[0;35m[{record.name}]{self.RESET}"
+        return super().format(record)
+
+
+def get_logger(name: str) -> logging.Logger:
+    """Get a colored logger instance"""
+    logger = logging.getLogger(name)
+
+    if not logger.handlers:
+        handler = logging.StreamHandler(sys.stdout)
+        formatter = ColoredFormatter("%(name)s %(levelname)s: %(message)s")
+        handler.setFormatter(formatter)
+        logger.addHandler(handler)
+        logger.setLevel(logging.INFO)
+
+    return logger
+
+
+# Pre-created loggers
+logger_api = get_logger("API")
+logger_mock = get_logger("MOCK")
--- a/src/kohakuboard/main.py
+++ b/src/kohakuboard/main.py
@@ -0,0 +1,54 @@
+"""Main FastAPI application for KohakuBoard"""
+
+from fastapi import FastAPI
+from fastapi.middleware.cors import CORSMiddleware
+
+from kohakuboard.api.routers import experiments, mock
+from kohakuboard.config import cfg
+from kohakuboard.logger import logger_api
+
+app = FastAPI(
+    title="KohakuBoard API",
+    description="ML Experiment Tracking API",
+    version="0.1.0",
+    docs_url=f"{cfg.app.api_base}/docs",
+    openapi_url=f"{cfg.app.api_base}/openapi.json",
+)
+
+# CORS middleware
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=cfg.app.cors_origins,
+    allow_credentials=True,
+    allow_methods=["*"],
+    allow_headers=["*"],
+)
+
+# Register routers
+app.include_router(experiments.router, prefix=cfg.app.api_base, tags=["experiments"])
+app.include_router(mock.router, prefix=cfg.app.api_base, tags=["mock"])
+
+
+@app.get("/")
+async def root():
+    """Root endpoint"""
+    return {
+        "name": "KohakuBoard API",
+        "version": "0.1.0",
+        "docs": f"{cfg.app.api_base}/docs",
+    }
+
+
+@app.get("/health")
+async def health():
+    """Health check endpoint"""
+    return {"status": "healthy"}
+
+
+if __name__ == "__main__":
+    import uvicorn
+
+    logger_api.info(f"Starting KohakuBoard API on {cfg.app.host}:{cfg.app.port}")
+    uvicorn.run(
+        "kohakuboard.main:app", host=cfg.app.host, port=cfg.app.port, reload=True
+    )