Training and Evaluation
Here we show how to launch a training loop manually and how to evaluate a trained model. The examples assume you have already prepared matching dataloaders, as described in the Data I/O guide, and instantiated a model such as ExampleCoreReadout from the model overview.
Training Loop
The dataloaders returned by openretina.data_io.base_dataloader.multiple_movies_dataloaders are organised by session. Utilities in openretina.data_io.cyclers let you iterate across sessions to present data to PyTorch Lightning as a single stream.
from lightning import Trainer
from openretina.data_io.cyclers import LongCycler, ShortCycler
train_loader = LongCycler(dataloaders["train"], shuffle=True)
val_loader = ShortCycler(dataloaders["validation"])
trainer = Trainer(max_epochs=50, gradient_clip_val=0.5)
trainer.fit(model, train_loader, val_loader)
Lightning handles the boilerplate for checkpoints, mixed precision, logging, and distributed execution. Loss functions, optimisers, and regularisers live inside the model (see the “Extending or Customising Models” section in the core + readout page), so the training loop only needs the model and the loaders.
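For example, several of these features can be switched on directly through Trainer arguments. The values below are illustrative, not recommended defaults; adjust them to your hardware and experiment:
from lightning import Trainer

trainer = Trainer(
    max_epochs=50,
    gradient_clip_val=0.5,
    precision="16-mixed",  # mixed-precision training
    accelerator="auto",    # pick GPU/CPU automatically
    devices="auto",
)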
Tips:
- Use LongCycler for splits where you want balanced sampling across sessions; use ShortCycler when one pass over each session is sufficient (e.g. validation or test).
- Pass callbacks to Trainer when you need extra functionality such as early stopping, checkpointing, or learning-rate monitoring (see the sketch after this list).
- To monitor metrics, enable Lightning loggers such as TensorBoard or CSV by providing logger=... to Trainer.
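As a minimal sketch of the last two tips, callbacks and a logger can be passed straight to the Trainer. The monitored metric name "val_loss" is an assumption; replace it with a key your module actually logs:
from lightning import Trainer
from lightning.pytorch.callbacks import EarlyStopping, LearningRateMonitor, ModelCheckpoint
from lightning.pytorch.loggers import CSVLogger

# "val_loss" is an assumed metric name; adjust it to your model's logged keys.
callbacks = [
    EarlyStopping(monitor="val_loss", patience=10, mode="min"),
    ModelCheckpoint(monitor="val_loss", save_top_k=1, mode="min"),
    LearningRateMonitor(logging_interval="epoch"),
]
trainer = Trainer(
    max_epochs=50,
    gradient_clip_val=0.5,
    callbacks=callbacks,
    logger=CSVLogger(save_dir="logs"),
)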
Refer to the Lightning documentation for the full list of trainer arguments.
Evaluating a Model
After training, switch to the test split and let Lightning handle evaluation:
from openretina.data_io.cyclers import ShortCycler
test_loader = ShortCycler(dataloaders["test"])
trainer.test(model, test_loader)
During trainer.test, the model runs in evaluation mode (model.eval()) and metrics defined in the module are logged. By default BaseCoreReadout uses the Poisson loss for optimisation and reports the time-averaged Pearson correlation as an evaluation metric. You can customise these via the model constructor.
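For intuition, the time-averaged (per-neuron) Pearson correlation can be computed along the following lines. This is an illustrative numpy sketch, not openretina's internal implementation, and assumes predictions and responses are arrays of shape (time, neurons):
import numpy as np

def per_neuron_correlation(predictions: np.ndarray, responses: np.ndarray) -> np.ndarray:
    # Centre both arrays over time, then correlate each neuron's trace.
    pred_c = predictions - predictions.mean(axis=0, keepdims=True)
    resp_c = responses - responses.mean(axis=0, keepdims=True)
    denom = np.sqrt((pred_c ** 2).sum(axis=0) * (resp_c ** 2).sum(axis=0)) + 1e-8
    return (pred_c * resp_c).sum(axis=0) / denom  # shape: (neurons,)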
Additional Metrics and Oracles
Beyond the losses embedded in each model, openretina provides richer evaluation utilities under openretina.eval:
- Leave-One-Out (Jackknife) Oracle: estimates an upper bound on achievable correlation by averaging all but one repeat of the same stimulus before comparison [Lurz et al., 2022].
- Fraction of Explainable Variance Explained (FEVE): measures the proportion of stimulus-driven variance captured by the model [Cadena et al., 2019].
These helpers accept the same stimulus/response dictionaries used for training and can be run after trainer.test to generate publication-ready scores.
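To illustrate the idea behind the leave-one-out oracle (the exact API in openretina.eval may differ), here is a numpy sketch that reuses the per_neuron_correlation helper from above and assumes responses of shape (repeats, time, neurons) for a single repeated stimulus:
import numpy as np

def jackknife_oracle_correlation(responses: np.ndarray) -> np.ndarray:
    # responses: (repeats, time, neurons) for one repeated stimulus.
    n_repeats = responses.shape[0]
    corrs = []
    for i in range(n_repeats):
        held_out = responses[i]                                # (time, neurons)
        others = np.delete(responses, i, axis=0).mean(axis=0)  # leave-one-out mean
        corrs.append(per_neuron_correlation(others, held_out))
    # Average over repeats to obtain one oracle correlation per neuron.
    return np.stack(corrs).mean(axis=0)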
Putting It Together
Whether you call Lightning manually or rely on the openretina train command, the workflow always follows the same pattern:
- Prepare data dictionaries and dataloaders.
- Instantiate a model with the correct input shape and neuron counts.
- Train with Trainer.fit(...), using cyclers to merge the session-wise loaders.
- Evaluate with Trainer.test(...) and, optionally, metrics from openretina.eval.
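Put together as code, the manual workflow looks roughly like this. It is a sketch: the dataloader and model construction are placeholders standing in for the steps covered in the Data I/O guide and the model overview:
from lightning import Trainer
from openretina.data_io.cyclers import LongCycler, ShortCycler

# 1. Prepare data dictionaries and dataloaders (see the Data I/O guide).
dataloaders = ...  # e.g. built with multiple_movies_dataloaders(...)
# 2. Instantiate a model with matching input shape and neuron counts (see the model overview).
model = ...        # e.g. an ExampleCoreReadout instance

# 3. Train, merging the session-wise loaders with cyclers.
trainer = Trainer(max_epochs=50, gradient_clip_val=0.5)
trainer.fit(model, LongCycler(dataloaders["train"], shuffle=True), ShortCycler(dataloaders["validation"]))

# 4. Evaluate on the held-out test split.
trainer.test(model, ShortCycler(dataloaders["test"]))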
Choose this manual approach when you need fine-grained control or are prototyping new trainer settings; otherwise, the openretina train command provides the same functionality with additional convenience features.