Mix and match models and datasets
One strength of OpenRetina is that model definitions and dataset loaders are decoupled through shared interfaces. In practice, this means you can often reuse a model family across datasets by changing configs rather than rewriting training code.
Concrete example: base core-readout across datasets
The base_core_readout model is used in multiple dataset-specific top-level configs, including:
Both compose:
- dataset-specific data_io/*
- dataset-specific dataloader/*
- shared model: base_core_readout
This is the core "mix-and-match" pattern: keep the model family, change dataset wiring.
Practical recipe
- Pick a top-level config close to your target data format.
- Keep model: base_core_readout (or another compatible model group).
- Swap data_io and dataloader groups to match the dataset.
- Verify input shape and required metadata in the top-level model: overrides.
Minimum config requirements (what you always need)
When you mix and match, these fields should always be explicitly defined in your top-level config:
- exp_name: run identifier used in logs and output directories.
- paths.cache_dir: where downloaded files are cached.
- paths.data_dir: local or remote data location consumed by data_io loaders. If remote, the download target will be in paths.cache_dir.
- model.in_shape: expected stimulus shape as (channels, time, height, width).
- dataset/model defaults in defaults: data_io, dataloader, model, trainer, logger, and callbacks.
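The required fields listed above can be checked before launching a run. The following is a minimal sketch, not part of OpenRetina: it works on a plain nested dict standing in for the resolved config, and the helper names (lookup, missing_fields, in_shape_is_valid) and all example values are hypothetical.

```python
from __future__ import annotations

from typing import Any, Mapping

# Dotted paths that the text lists as always required in a top-level config.
REQUIRED_FIELDS = ("exp_name", "paths.cache_dir", "paths.data_dir", "model.in_shape")


def lookup(cfg: Mapping[str, Any], dotted_path: str) -> Any:
    """Return the value at a dotted path, or None if any segment is missing."""
    node: Any = cfg
    for key in dotted_path.split("."):
        if not isinstance(node, Mapping) or key not in node:
            return None
        node = node[key]
    return node


def missing_fields(cfg: Mapping[str, Any]) -> list[str]:
    """List required fields that are absent or set to None."""
    return [path for path in REQUIRED_FIELDS if lookup(cfg, path) is None]


def in_shape_is_valid(in_shape: Any) -> bool:
    """True if in_shape is four positive ints: (channels, time, height, width)."""
    return (
        isinstance(in_shape, (tuple, list))
        and len(in_shape) == 4
        and all(isinstance(dim, int) and dim > 0 for dim in in_shape)
    )


# Hypothetical resolved config; every value here is a placeholder.
example_cfg = {
    "exp_name": "my_experiment",
    "paths": {"cache_dir": "/tmp/cache", "data_dir": "/tmp/data"},
    "model": {"in_shape": (2, 50, 16, 18)},
}

print(missing_fields(example_cfg))  # prints []
print(in_shape_is_valid(lookup(example_cfg, "model.in_shape")))  # prints True
```

A check like this only confirms that the fields exist and that in_shape has four dimensions; it does not confirm that the values match the dataset.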
In practice, you should also confirm:
- check_stimuli_responses_match: set to true unless you are debugging loader mismatches.
- trainer.max_epochs and precision/device settings are realistic for your hardware.
Important: n_neurons_dict is required by the model, but it is injected automatically at train time from dataset metadata (compute_data_info). You do not need to set it manually in top-level configs.
Compatibility caveats
Not every model can run on every dataset without adaptation. For example, the package contains single-cell models that rely on white-noise-derived receptive field parameters, which may require metadata that natural-movie-only datasets do not provide.
How to validate compatibility quickly
- Run one short training/evaluation job with reduced epochs/batch size.
- Check that dataloaders expose expected keys (train, validation, test, or custom test names).
- Confirm model input/output shapes and neuron counts are resolved correctly.
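The dataloader key check above can be sketched in a few lines. This is an illustration under stated assumptions rather than OpenRetina code: it assumes the dataloaders are exposed as a mapping keyed by split name, and that custom test names, when used, take the place of the default test key. The function name missing_splits is hypothetical.

```python
from __future__ import annotations

from typing import Any, Mapping, Sequence


def missing_splits(
    dataloaders: Mapping[str, Any],
    custom_test_names: Sequence[str] = (),
) -> list[str]:
    """Return the expected split names that are absent from the mapping.

    Assumption: if custom test names are given, they replace the default
    "test" key instead of being required alongside it.
    """
    test_names = list(custom_test_names) or ["test"]
    expected = ["train", "validation", *test_names]
    return [name for name in expected if name not in dataloaders]


# Placeholder objects stand in for real dataloaders in this sketch.
loaders = {"train": object(), "validation": object(), "test": object()}
print(missing_splits(loaders))  # prints []
```

An empty result means every expected split is present; any names returned point to the dataloader group that needs adjusting.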
See also: