Base Data Classes
Core data containers used across all datasets.
Data Containers
MoviesTrainTestSplit
dataclass

```python
MoviesTrainTestSplit(
    train: Float[ndarray, "channels train_time height width"],
    test_dict: dict = dict(),
    test: InitVar[
        Float[ndarray, "channels test_time height width"] | None
    ] = None,
    stim_id: Optional[str] = None,
    random_sequences: Optional[ndarray] = None,
    norm_mean: Optional[float] = None,
    norm_std: Optional[float] = None,
)
```
Container for stimulus movies used during training and evaluation.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| `train` | Continuous movie shown during training. **TYPE:** `Float[ndarray, 'channels train_time height width']` |
| `test_dict` | Named dictionary of frozen test stimuli. For legacy single-test datasets pass `test` instead. **TYPE:** `dict` |
| `test` | Convenience field to pass a single frozen movie. **TYPE:** `Float[ndarray, 'channels test_time height width'] \| None` |
| `stim_id` | Optional identifier (e.g. `"natural"`) to keep responses/movies aligned. **TYPE:** `Optional[str]` |
| `random_sequences` | Optional clip permutations (Höfling 2024 format). **TYPE:** `Optional[ndarray]` |
| `norm_mean` | Normalization mean applied to both train and test movies. **TYPE:** `Optional[float]` |
| `norm_std` | Normalization standard deviation applied to both train and test movies. **TYPE:** `Optional[float]` |
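As a rough sketch of the layout this container expects (all shapes, the `"natural"` stim id, and the plain-dict stand-in are illustrative assumptions, not the real dataclass):

```python
import numpy as np

# Illustrative shapes: 1 channel, 600 train / 150 test frames, 18x16 pixels.
train = np.random.rand(1, 600, 18, 16).astype(np.float32)
test = np.random.rand(1, 150, 18, 16).astype(np.float32)

# The fields MoviesTrainTestSplit carries, shown here as a plain dict:
movies = {
    "train": train,
    "test_dict": {"test": test},        # named frozen test stimuli
    "stim_id": "natural",               # keeps responses/movies aligned
    "norm_mean": float(train.mean()),   # statistics from the train movie
    "norm_std": float(train.std()),
}
assert movies["train"].shape == (1, 600, 18, 16)
```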
ResponsesTrainTestSplit
dataclass

```python
ResponsesTrainTestSplit(
    train: Float[ndarray, "neurons train_time"],
    test_dict: dict = dict(),
    test: InitVar[
        Float[ndarray, "neurons test_time"] | None
    ] = None,
    test_by_trial: Float[
        ndarray, "trials neurons test_time"
    ] | None = None,
    test_by_trial_dict: dict = dict(),
    stim_id: str | None = None,
    session_kwargs: dict[str, Any] = dict(),
)
```
Container for neural responses paired with MoviesTrainTestSplit.
Supports multiple test stimuli via test_dict and per-trial traces via
test_by_trial_dict. For single-test datasets you may provide test and
optionally test_by_trial; both will be lifted into the matching dictionaries.
get_test_by_trial
```python
get_test_by_trial(
    name: str = "test",
) -> Float[ndarray, "trials neurons test_time"] | None
```
Return the per-trial responses for a specific stimulus.
| PARAMETER | DESCRIPTION |
|---|---|
| `name` | Key inside `test_by_trial_dict` identifying the stimulus. **TYPE:** `str` **DEFAULT:** `'test'` |

| RETURNS | DESCRIPTION |
|---|---|
| `Float[ndarray, 'trials neurons test_time'] \| None` | Array of shape (trials, neurons, time) if available, otherwise `None`. |
Source code in openretina/data_io/base.py
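In terms of behavior, the lookup reduces to a dictionary `.get` on `test_by_trial_dict` (a sketch of the assumed semantics, not the library source; shapes are illustrative):

```python
import numpy as np

# Per-trial responses keyed by stimulus name: (trials, neurons, test_time).
test_by_trial_dict = {"test": np.zeros((5, 42, 150))}

def get_test_by_trial(name: str = "test"):
    # Returns the per-trial array if present, otherwise None.
    return test_by_trial_dict.get(name)

assert get_test_by_trial("test").shape == (5, 42, 150)
assert get_test_by_trial("missing") is None
```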
DatasetStatistics
dataclass

```python
DatasetStatistics(
    unique_train_frames: int,
    unique_val_frames: int,
    unique_train_val_frames: int,
    unique_test_frames: dict[str, int],
    unique_train_transitions: int,
    unique_val_transitions: int,
    unique_test_transitions: dict[str, int],
    n_sessions: int,
)
```
Statistics about unique frames and transitions across sessions, computed from dataloaders.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| `unique_train_frames` | Number of unique training frames seen across all sessions. **TYPE:** `int` |
| `unique_val_frames` | Number of unique validation frames seen across all sessions. **TYPE:** `int` |
| `unique_train_val_frames` | Union of unique train and val frames (deduplicated). **TYPE:** `int` |
| `unique_test_frames` | Dict mapping test split name to unique frame count. **TYPE:** `dict[str, int]` |
| `unique_train_transitions` | Number of unique consecutive-frame transitions in training. **TYPE:** `int` |
| `unique_val_transitions` | Number of unique consecutive-frame transitions in validation. **TYPE:** `int` |
| `unique_test_transitions` | Dict mapping test split name to unique transition count. **TYPE:** `dict[str, int]` |
| `n_sessions` | Total number of sessions. **TYPE:** `int` |
empty
classmethod

```python
empty() -> DatasetStatistics
```
Create an empty DatasetStatistics instance (all counts zero).
Source code in openretina/data_io/base.py
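A stand-in sketch of what `empty()` produces (field names are taken from the signature above; the all-zero defaults follow the documented behavior, and the class name is a stand-in):

```python
from dataclasses import dataclass, field

# Stand-in for DatasetStatistics: every count zero, empty dicts for the
# per-test-split fields.
@dataclass
class StatsSketch:
    unique_train_frames: int = 0
    unique_val_frames: int = 0
    unique_train_val_frames: int = 0
    unique_test_frames: dict[str, int] = field(default_factory=dict)
    unique_train_transitions: int = 0
    unique_val_transitions: int = 0
    unique_test_transitions: dict[str, int] = field(default_factory=dict)
    n_sessions: int = 0

    @classmethod
    def empty(cls) -> "StatsSketch":
        # All counts zero, as in DatasetStatistics.empty().
        return cls()

stats = StatsSketch.empty()
assert stats.n_sessions == 0 and stats.unique_test_frames == {}
```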
Helper Functions
normalize_train_test_movies
```python
normalize_train_test_movies(
    train: Float[ndarray, "channels train_time height width"],
    test: Float[ndarray, "channels test_time height width"],
) -> tuple[
    Float[ndarray, "channels train_time height width"],
    Float[ndarray, "channels test_time height width"],
    dict[str, float | None],
]
```

Z-score normalization of the train and test movies using the mean and standard deviation of the train movie.

Parameters:
- train: train movie with shape (channels, time, height, width)
- test: test movie with shape (channels, time, height, width)

Returns:
- train_video_preproc: normalized train movie
- test_video_preproc: normalized test movie
- norm_stats: dictionary containing the mean and standard deviation of the train movie

Note: the function casts the inputs to torch tensors to compute the mean and standard deviation of large inputs more efficiently.
Source code in openretina/data_io/base.py
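The normalization itself is plain z-scoring with train-movie statistics. A numpy sketch (the library computes the statistics on torch tensors; function and variable names here are illustrative):

```python
import numpy as np

def normalize_train_test(train: np.ndarray, test: np.ndarray):
    # Statistics come from the train movie only and are applied to both splits.
    mean = float(train.mean())
    std = float(train.std())
    norm_stats = {"norm_mean": mean, "norm_std": std}
    return (train - mean) / std, (test - mean) / std, norm_stats

train = np.random.rand(1, 600, 18, 16)
test = np.random.rand(1, 150, 18, 16)
train_n, test_n, stats = normalize_train_test(train, test)

# The normalized train movie has (approximately) zero mean and unit std.
assert abs(train_n.mean()) < 1e-6 and abs(train_n.std() - 1.0) < 1e-6
```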
compute_data_info
```python
compute_data_info(
    neuron_data_dictionary: dict[str, ResponsesTrainTestSplit],
    movies_dictionary: dict[str, MoviesTrainTestSplit]
    | MoviesTrainTestSplit,
    partial_data_info: dict[str, Any] | None = None,
) -> dict[str, Any]
```

Computes information about the data used to train a model, including the number of neurons, the shape of the movies, and the normalization statistics. This information should be fed to and saved with the models.

Parameters:
- neuron_data_dictionary: dictionary of responses for each session
- movies_dictionary: dictionary of movies for each session
- partial_data_info: dictionary of partial data info from the config, to be merged with the computed data info

Returns:
- data_info: dictionary of data info useful for downstream tasks, including the number of neurons, the shape of the movies, the movie normalization statistics, and any extra session kwargs related to the data, together with the partial data info passed in the training config
Source code in openretina/data_io/base.py
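The bookkeeping can be sketched with plain dicts (key names, shapes, and the merge order of the partial info are all assumptions here, not the library's actual output schema):

```python
import numpy as np

def compute_data_info_sketch(responses, movies, partial_data_info=None):
    # Per-session neuron counts from the response arrays (neurons, time),
    # plus shared movie shape and normalization statistics.
    data_info = {
        "n_neurons": {name: r["train"].shape[0] for name, r in responses.items()},
        "movie_shape": movies["train"].shape,
        "norm_mean": movies.get("norm_mean"),
        "norm_std": movies.get("norm_std"),
    }
    # Assumed merge order: config-supplied partial info overrides computed keys.
    data_info.update(partial_data_info or {})
    return data_info

responses = {"session_1": {"train": np.zeros((42, 600))}}
movies = {"train": np.zeros((1, 600, 18, 16)), "norm_mean": 0.0, "norm_std": 1.0}
info = compute_data_info_sketch(responses, movies, {"extra": True})
assert info["n_neurons"]["session_1"] == 42 and info["extra"] is True
```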