Experiment Management

It is common to create a new directory for each group of experiments. These groups are usually organized around a specific publication goal or project.

Hydra configuration options are taken from the following sources:

  1. A local config.yaml taken from the directory in which the command is run.
  2. Configuration group overrides from any subdirectory of the current working directory that is named after a configuration group.
  3. Overrides from a special configuration group, usually called experiment.

For an example of experiment management, have a look at experiments/cifar10.

It has the following directory structure:

├── augmentation
│   └── cifar_augment.yaml
├── config.yaml
├── datasets
│   └── ...
├── experiment
│   ├── sweep_lr.yaml
│   └── sweep_models.yaml
├── scripts
│   └── train_slurm.sh
└── trained_models
    ├── sweep_lr
    │   ├── 0.0001
    │   ├── ...
    └── sweep_models
        ├── multirun.yaml
        └── resnet18

The main configuration is found in config.yaml.

defaults:
    - base_config
    - override dataset: cifar10          # Dataset configuration name
    - override features: identity        # Feature extractor configuration name (use identity for vision datasets)
    - override model: timm_resnet18      #timm_mobilenetv3_small_100      # Neural network name (for now timm_resnet50 or timm_efficientnet_lite1)
    - override scheduler: 1cycle         # learning rate scheduler config name
    - override optimizer: sgd            # Optimizer config name
    - override normalizer: null          # Feature normalizer (used for quantized neural networks)
    - override module: image_classifier  # Lightning module config for the training loop (image classifier for image classification tasks)
    - override augmentation: cifar_augment
    - _self_

monitor:
  metric: val_f1_micro
  direction: maximize

module:
  batch_size: 64

trainer:
  max_epochs: 50

scheduler:
  max_lr: 0.1

The configuration is composed from the standard presets in base_config. The defaults are then adjusted by selecting different config group presets with the - override <groupname>: <presetname> syntax. Most of the presets are taken from the package-wide configuration in hannah/conf. The configuration for data augmentation is defined in augmentation/cifar_augment.yaml:

  pipeline: null
      p: 0.5
      size: [32,32]
      padding: 4

This file specifies the data augmentation presets shared among the experiments in this group.
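The way - override <groupname>: <presetname> entries select presets can be pictured with a small, simplified sketch. This is not Hydra's actual implementation: the function apply_overrides and the base preset names are hypothetical and only model the idea that each config group resolves to exactly one preset.

```python
# Simplified model of a defaults list: every config group maps to one preset.
# The base preset names are hypothetical placeholders, not hannah's real defaults.
base_choices = {
    "dataset": "base_dataset",
    "model": "base_model",
    "optimizer": "base_optimizer",
}

# Entries of the form "- override <groupname>: <presetname>" from config.yaml.
overrides = [
    ("dataset", "cifar10"),
    ("model", "timm_resnet18"),
    ("optimizer", "sgd"),
]


def apply_overrides(choices, overrides):
    """Return a new group -> preset mapping with the overrides applied."""
    result = dict(choices)
    for group, preset in overrides:
        if group not in result:
            # An override only makes sense for a group that already has a default.
            raise KeyError(f"cannot override unknown group: {group}")
        result[group] = preset
    return result


selected = apply_overrides(base_choices, overrides)
for group, preset in selected.items():
    print(f"{group}: {preset}")
```

The base mapping is left unchanged, which mirrors the fact that the package-wide presets are not modified by a local config.yaml.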

The experiments are then defined in, for example, experiment/sweep_lr.yaml and experiment/sweep_models.yaml. These experiments can be started with: hannah-train +experiment=<experiment_name>. Please note the + in front of the command-line argument. It is needed because experiment is not part of the default configuration hierarchy.

As an example, have a look at experiment/sweep_lr.yaml:

# @package _global_
experiment_id: sweep_lr

hydra:
  mode: MULTIRUN
  sweep:
    subdir: lr=${scheduler.max_lr}
  sweeper:
    params:
      scheduler.max_lr: 0.0001,0.001,0.01,0.1

Experiments must start with # @package _global_. This means that overrides defined here change the global configuration and not some values under experiment. The next line sets the experiment_id, which is used to identify the experiment and to set a separate subfolder for the output artifacts (normally trained_models/<experiment_id>).
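The effect of the # @package _global_ header can be illustrated with plain dictionaries. The sketch below is a simplification under stated assumptions: place_package is a hypothetical helper, the base values are made up, and only single-level package names are handled.

```python
def place_package(config, content, package):
    """Merge `content` into a copy of `config` at the location named by `package`.

    "_global_" stands for the root of the configuration. Any other value is
    treated as a single top-level key, which is a simplification.
    """
    merged = dict(config)
    if package == "_global_":
        merged.update(content)
    else:
        node = dict(merged.get(package, {}))
        node.update(content)
        merged[package] = node
    return merged


# Hypothetical base configuration values, for illustration only.
base = {"experiment_id": "default", "scheduler": {"max_lr": 0.1}}
experiment_file = {"experiment_id": "sweep_lr"}

with_global = place_package(base, experiment_file, "_global_")
without_global = place_package(base, experiment_file, "experiment")

print(with_global["experiment_id"])     # sweep_lr
print(without_global["experiment_id"])  # default
print(without_global["experiment"])     # {'experiment_id': 'sweep_lr'}
```

Without the header, the values would end up nested under experiment and the top-level experiment_id would keep its previous value.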

The final part of the configuration sets up a parameter sweep. In this case we configure a sweep over the max_lr parameter of the 1cycle learning rate scheduler, and configure the output directory to contain a separate subdirectory (lr=0.0001, lr=0.001, and so on) for each setting of the max_lr parameter.

The final outputs can then be found in trained_models/sweep_lr/lr=0.0001/ and so on.
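As a rough illustration of how the comma-separated sweep values map to output directories, the following sketch expands the values by hand. It does not call Hydra: expand_sweep is a hypothetical helper, and the template string merely stands in for the lr=${scheduler.max_lr} interpolation. The values are kept as strings so the directory names match the text above.

```python
from pathlib import PurePosixPath

experiment_id = "sweep_lr"
sweep_values = "0.0001,0.001,0.01,0.1"  # values listed for scheduler.max_lr
subdir_template = "lr={max_lr}"         # stands in for lr=${scheduler.max_lr}


def expand_sweep(values, template, root):
    """Return one (max_lr, output_dir) pair per comma-separated sweep value."""
    jobs = []
    for value in values.split(","):
        value = value.strip()
        jobs.append((value, root / template.format(max_lr=value)))
    return jobs


root = PurePosixPath("trained_models") / experiment_id
jobs = expand_sweep(sweep_values, subdir_template, root)
for max_lr, output_dir in jobs:
    print(f"scheduler.max_lr={max_lr} -> {output_dir}")
```

Each of the four values yields one training run with its own output directory below trained_models/sweep_lr.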