Hyperparameter tuning¶
In this notebook we will go a bit further in the training of a Core/Readout model, once again using data from Hoefling et al., 2024: "A chromatic feature detector in the retina signals visual context changes".
We will see how to search the hyperparameter space using Hydra while making minimal changes to the configuration files. This feature is only available through the CLI version of open_retina.
The recommended sweeper is the Optuna library (https://optuna.org), which is already integrated into the pipeline. To learn more about the keywords that can be used, see https://hydra.cc/docs/plugins/optuna_sweeper/.
Hyperparameter Search Configuration¶
The hyperparameter search can be fully defined in the main config file of your experiment.
Let's compare the configs `hoefling_2024_core_readout_low_res` and `hoefling_2024_core_readout_low_res_hyperparams_search`:
1. Added the Optuna sweeper overrides to the `defaults` list. Feel free to use a different sampler:

   - `override hydra/sweeper: optuna`
   - `override hydra/sweeper/sampler: tpe`
2. Defined the objective target for optimization. It has to be one of the columns of the validation object:

   ```yaml
   objective_target: val_correlation  # Must be a computed validation metric
   ```
3. Configured the Hydra sweeper section:

   ```yaml
   hydra:
     run:
       dir: ${paths.log_dir}
     sweeper:
       sampler:
         # Configurable sampler parameters
         seed: 42
       direction: maximize
       study_name: ${exp_name}
       storage: null
       # Control over trials and parallel jobs
       n_trials: 20
       n_jobs: 1
       params:
         # Parameter optimization using Optuna keywords (choice, interval)
         # Example optimizes hidden channels and core spatial regularization
         model.hidden_channels: choice([8, 8, 8, 8], [16, 16, 16, 16], [32, 32, 32, 32])
         model.core_gamma_input: interval(1e-5, 1e-2)
   ```
4. Optional: enabled the MLflow logger:

   ```yaml
   logger:
     - tensorboard
     - csv
     - mlflow
   ```
You should be able to adapt these four steps to your particular needs.
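Beyond `choice` and `interval`, Hydra's Optuna sweeper also accepts `range` for integer sweeps and a `log` tag for log-uniform sampling. A minimal sketch (the parameter names `trainer.max_epochs` and `model.learning_rate` are illustrative, not taken from the actual config):

```yaml
params:
  # Integer sweep: range(start, stop, step)
  trainer.max_epochs: range(10, 50, 10)
  # Log-uniform float sweep via the log tag
  model.learning_rate: tag(log, interval(1e-5, 1e-2))
```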
Launching the search¶
We will use the same command as usual, adding the `--multirun` option.
```bash
openretina train --config-name "hoefling_2024_core_readout_low_res_hyperparams_search" --multirun
```
If the `--multirun` option is not passed, the sweeper section will be ignored and a single training session will be launched with the default parameters.
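You can also add or adjust sweep parameters directly on the command line without touching the config; Hydra parses the same `choice`/`interval` syntax there. A sketch sweeping `model.core_gamma_input` over a different, illustrative interval (quotes keep the shell from interpreting the parentheses):

```bash
openretina train --config-name "hoefling_2024_core_readout_low_res_hyperparams_search" --multirun "model.core_gamma_input=interval(1e-4, 1e-1)"
```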
Viewing and Analyzing Results with MLflow¶
This tutorial demonstrates how to visualize and analyze your experiment results using MLflow's web interface. While we use MLflow in this example, you can adapt these concepts to other logging tools like TensorBoard or CSV loggers. It is also relatively easy to save additional artifacts to TensorBoard or MLflow using PyTorch Lightning callbacks.
Starting the MLflow Server¶
Launch the MLflow UI by running the following command:
```bash
mlflow server --backend-store-uri ./openretina_assets/mlflow --host 0.0.0.0 --port 5000
```
Important: Make sure to specify the correct `--backend-store-uri` path where your MLflow data is stored.
Navigating the MLflow Interface¶
1. Access the UI: open your browser and navigate to http://localhost:5000
2. Locate your experiment: look for your experiment name in the left sidebar
3. Customize the results view:
   - Click on "Columns" to add metrics to the table view
   - Add your target metric (e.g., `val_correlation`)
   - Sort runs by clicking on any column header
   - Use filters to narrow down results
Comparing Multiple Runs¶
To analyze and compare different runs:
- Select multiple runs by checking the boxes next to them
- Click the "Compare" button at the top
- Access various comparison views:
- Metrics plots: Visualize how metrics changed across runs
- Parameter comparisons: See how different hyperparameters affected performance
- Model artifacts: Review saved models and configurations
Programmatic Access¶
For batch analysis or automation, you can retrieve the best run programmatically:
```python
import mlflow

# Connect to your MLflow tracking server
mlflow.set_tracking_uri("./openretina_assets/mlflow")

# Get the best run from an experiment
experiment_id = mlflow.get_experiment_by_name("your_experiment_name").experiment_id
best_run = mlflow.search_runs(
    experiment_ids=[experiment_id],
    order_by=["metrics.val_correlation DESC"],
    max_results=1,
).iloc[0]

# Load the best model (openretina models are PyTorch modules,
# so we use the pytorch flavor rather than sklearn)
best_model = mlflow.pytorch.load_model(f"runs:/{best_run.run_id}/model")
```
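The row returned by `search_runs` also exposes every logged hyperparameter and metric as `params.*` and `metrics.*` columns, which is handy for inspecting the winning configuration. A small sketch building on `best_run` from the snippet above:

```python
# Hyperparameters and metrics appear as "params.*" / "metrics.*" columns
for col in best_run.index:
    if col.startswith("params."):
        print(col.removeprefix("params."), "=", best_run[col])
print("Best val_correlation:", best_run["metrics.val_correlation"])
```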
What's Stored in MLflow by default in open_retina¶
Each run automatically saves:
- Model weights and artifacts
- Training configuration and hyperparameters
- Metrics logged during training
- Environment details and dependencies
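To check what a particular run stored, you can list its artifacts programmatically; a sketch using MLflow's client API, reusing `best_run` from the example above:

```python
from mlflow.tracking import MlflowClient

client = MlflowClient(tracking_uri="./openretina_assets/mlflow")
# Each entry is a FileInfo whose .path is relative to the run's artifact root
for artifact in client.list_artifacts(best_run.run_id):
    print(artifact.path)
```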
Viewing and Analyzing Results with TensorBoard¶
If you'd prefer to use TensorBoard, you still can. Here is how to access the UI.
Starting the TensorBoard Server¶
Launch the TensorBoard UI by running the following command:
```bash
tensorboard --logdir ./openretina_assets/runs/<your_experiment_name> --host 0.0.0.0 --port 6000
```
Note: some browsers refuse to load pages served on port 6000 (it is reserved for X11); if that happens, pick another port such as TensorBoard's default, 6006.