Caching & Reproducibility¶

ExpOps provides intelligent multi-level caching and reproducibility guarantees. The system supports both step-level and process-level caching for maximum flexibility.

Caching Levels¶

Step-Level Caching¶

Each pipeline step can be cached independently:

Cache key: Based on step inputs, configuration hash, and function code hash
Cache lookup: Automatic before step execution
Granularity: Individual steps within a process
Use case: When you want to skip specific steps that haven't changed, even if other steps in the process have

Example: If a data preprocessing step hasn't changed, it can be skipped even if the training step needs to run.

Process-Level Caching¶

Entire processes (containing multiple steps) can be cached as a single unit:

Cache key: Based on process inputs, configuration hash, and process function code hash
Cache lookup: Automatic before process execution
Granularity: Entire process as a single unit
Use case: When you want to skip an entire process if all its inputs and configuration are unchanged

Example: If a complete training pipeline hasn't changed, the entire process can be skipped, avoiding execution of all its constituent steps.

How They Work Together¶

Step-level caching is checked first when executing individual steps
Process-level caching is checked when starting a process execution
If a process is cached, all its steps are skipped
If a process isn't cached but some steps are, only the uncached steps execute
Both levels use the same cache backends and KV stores

Cache Backends¶

Google Cloud Storage (GCS)¶

Remote backend for shared caching: - Cross-machine sharing - Persistent storage - Requires GCP credentials

Configuration¶

Cache settings in configs/project_config.yaml:

model:
  parameters:
    cache:
      backend: local  # or gcs

Reproducibility¶

Random Seed Management¶

ExpOps manages random seeds for: - NumPy random number generation - Python's random module - ML framework random states (sklearn, PyTorch, TensorFlow)

Configuration¶

Seed settings in configs/project_config.yaml:

reproducibility:
  seed: 42

Cache Invalidation¶

Caches are invalidated when:

Step-level: Step code changes (detected via function hash), step inputs change, or step configuration changes
Process-level: Process code changes (detected via function hash), process inputs change, or process configuration changes

Benefits¶

Multi-level caching and reproducibility provide:

Faster iterations: Skip unchanged steps or entire processes
Flexible granularity: Choose step-level for fine-grained control or process-level for coarse-grained optimization
Reproducible results: Same inputs produce same outputs
Cost savings: Avoid redundant computation at both step and process levels