# core.solver.ScalingModel { #mlsysim.core.solver.ScalingModel }

```python
core.solver.ScalingModel()
```

Analyzes the 'scaling physics' of model training (Chinchilla scaling laws). This solver determines the optimal model size (P) and dataset size (D) for a given compute budget (C), following the compute-optimal training regime where D ≈ 20P.

Literature Sources:

1. Hoffmann et al. (2022), "Training Compute-Optimal Large Language Models."
2. Kaplan et al. (2020), "Scaling Laws for Neural Language Models."
3. McCandlish et al. (2018), "An Empirical Model of Large-Batch Training."

## Methods

| Name | Description |
| --- | --- |
| [solve](#mlsysim.core.solver.ScalingModel.solve) | Solves for compute-optimal model and dataset parameters. |

### solve { #mlsysim.core.solver.ScalingModel.solve }

```python
core.solver.ScalingModel.solve(compute_budget, target_model_size=None)
```

Solves for compute-optimal model and dataset parameters.

#### Parameters {.doc-section .doc-section-parameters}

| Name | Type | Description | Default |
|-------------------|----------|---------------------------------------------------------------------------------------------------|------------|
| compute_budget | Quantity | Total training compute budget (e.g., in TFLOPs or H100-GPU-days). | _required_ |
| target_model_size | Quantity | If provided, calculates the tokens required to train this specific model size rather than the compute-optimal one. | `None` |

#### Returns {.doc-section .doc-section-returns}

| Name | Type | Description |
|--------|------------------|-------------------------------------------------------------------|
|        | Dict\[str, Any\] | Optimal parameters, token count, and training duration estimates. |
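The closed form behind the D ≈ 20P regime follows from the standard approximation C ≈ 6·P·D for dense-transformer training FLOPs: substituting D = 20P gives C = 120·P², so P = √(C/120) and D = 20P. The sketch below is a minimal standalone illustration of that arithmetic, not the mlsysim implementation; the C ≈ 6·P·D approximation and both function names are assumptions introduced here.

```python
# Standalone sketch of the Chinchilla compute-optimal arithmetic
# (illustrative only; not the mlsysim implementation). Assumes the
# common approximation C ≈ 6 * P * D training FLOPs and D ≈ 20 * P.
import math


def chinchilla_optimal(compute_budget_flops: float) -> dict:
    """Solve C = 6 * P * D with D = 20 * P for P and D."""
    # Substituting D = 20P gives C = 120 * P**2, hence:
    params = math.sqrt(compute_budget_flops / 120.0)
    tokens = 20.0 * params
    return {"params": params, "tokens": tokens}


def tokens_for_model(compute_budget_flops: float, params: float) -> float:
    """Required tokens when the model size is fixed (from C = 6 * P * D)."""
    return compute_budget_flops / (6.0 * params)


# Example: a 1e24-FLOP training budget.
opt = chinchilla_optimal(1e24)
print(f"optimal params ≈ {opt['params']:.3e}")  # ≈ 9.13e10 (~91B)
print(f"optimal tokens ≈ {opt['tokens']:.3e}")  # ≈ 1.83e12 (~1.8T)
```

For C = 1e24 FLOPs this yields roughly 91B parameters and 1.8T tokens, consistent with the 20-tokens-per-parameter ratio.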
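A call to `solve` might then look like the sketch below. Only the class path and the method signature come from this page; the construction of the `Quantity` arguments and the keys of the returned dict are not specified here, so a plain FLOP count and illustrative comments stand in for them.

```python
# Hypothetical usage sketch; `Quantity` construction is not documented on
# this page, so a raw FLOP count stands in for it below.
from mlsysim.core.solver import ScalingModel

solver = ScalingModel()

# Compute-optimal regime: let the solver pick P and D for the budget.
result = solver.solve(compute_budget=1e24)  # ~1e24 training FLOPs

# Fixed model size: how many tokens does a 70B-parameter model need?
result_70b = solver.solve(compute_budget=1e24, target_model_size=70e9)
```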