# core.solver.CheckpointModel { #mlsysim.core.solver.CheckpointModel }

```python
core.solver.CheckpointModel()
```

Analyzes the storage constraints and I/O burst penalties of saving model states.

Training massive models requires saving hundreds of gigabytes (weights + optimizer states) to persistent storage. This solver calculates the time spent blocked on checkpoint I/O, which reduces the cluster's Model FLOPs Utilization (MFU).

Literature Source:

1. Eisenman et al. (2022), "Check-N-Run: A Checkpointing System for Training Deep Learning Recommendation Models."

## Methods

| Name | Description |
| --- | --- |
| [solve](#mlsysim.core.solver.CheckpointModel.solve) | Solves for checkpoint size, write time, and resulting MFU penalty. |

### solve { #mlsysim.core.solver.CheckpointModel.solve }

```python
core.solver.CheckpointModel.solve(
    model,
    hardware,
    optimizer='adam',
    checkpoint_interval_hours=4.0,
)
```

Solves for checkpoint size, write time, and resulting MFU penalty.

#### Parameters {.doc-section .doc-section-parameters}

| Name | Type | Description | Default |
|------|------|-------------|---------|
| model | Workload | The model architecture. | _required_ |
| hardware | HardwareNode | The target hardware, which supplies the storage bandwidth. | _required_ |
| optimizer | str | Optimizer type (`'adam'` or `'sgd'`); determines the bytes stored per parameter. | `'adam'` |
| checkpoint_interval_hours | float | Hours between checkpoints. | `4.0` |

#### Returns {.doc-section .doc-section-returns}

| Name | Type | Description |
|------|------|-------------|
| | CheckpointResult | Checkpoint size (GB), write time, storage bottleneck flag, and MFU penalty percentage. |
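
The arithmetic behind checkpoint size, write time, and MFU penalty can be sketched in a few lines. This is a minimal standalone illustration, not the `mlsysim` implementation: the function `estimate_checkpoint`, the `CheckpointEstimate` dataclass, the bytes-per-parameter values, and the assumption that each checkpoint write is fully synchronous (training stalls for the whole write) are all hypothetical choices made for the example.

```python
from dataclasses import dataclass

# Assumed bytes stored per parameter (illustrative, not taken from mlsysim):
#   adam: fp16 weights (2) + fp32 master weights (4) + momentum (4) + variance (4)
#   sgd:  fp16 weights (2) + fp32 master weights (4), no momentum buffer
BYTES_PER_PARAM = {"adam": 14.0, "sgd": 6.0}


@dataclass
class CheckpointEstimate:
    size_gb: float          # checkpoint size in gigabytes (1 GB = 1e9 bytes)
    write_time_s: float     # seconds spent writing one checkpoint
    mfu_penalty_pct: float  # share of wall-clock time blocked on checkpoint I/O


def estimate_checkpoint(
    num_params: float,
    storage_bw_gbs: float,
    optimizer: str = "adam",
    checkpoint_interval_hours: float = 4.0,
) -> CheckpointEstimate:
    """Estimate checkpoint cost, assuming a blocking (synchronous) write."""
    if optimizer not in BYTES_PER_PARAM:
        raise ValueError(f"unsupported optimizer: {optimizer!r}")
    if storage_bw_gbs <= 0 or checkpoint_interval_hours <= 0:
        raise ValueError("bandwidth and interval must be positive")

    size_gb = num_params * BYTES_PER_PARAM[optimizer] / 1e9
    write_time_s = size_gb / storage_bw_gbs

    # Each cycle is: train for the interval, then stall for one write.
    interval_s = checkpoint_interval_hours * 3600.0
    blocked_fraction = write_time_s / (interval_s + write_time_s)

    return CheckpointEstimate(size_gb, write_time_s, 100.0 * blocked_fraction)


if __name__ == "__main__":
    # Hypothetical 70B-parameter model writing to 10 GB/s of storage.
    print(estimate_checkpoint(num_params=70e9, storage_bw_gbs=10.0))
```

Under these assumptions, a 70B-parameter model with Adam produces a checkpoint of about 980 GB, which takes about 98 s to write at 10 GB/s. With a 4-hour interval that stall is well under 1% of wall-clock time; shortening the interval or lowering the bandwidth raises the penalty proportionally.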