Data Cyclers

Adapted from sinzlab/neuralpredictors/training/cyclers.py

LongCycler

LongCycler(
    loaders: dict[str, DataLoader], shuffle: bool = True
)

Bases: IterableDataset

Cycles through a dictionary of data loaders until the largest loader is exhausted. In practice, it takes one batch from each loader per iteration; this is necessary when the data loaders are of unequal size. Note: iterable datasets such as this one can yield duplicate data when used with multiprocessing.

Source code in openretina/data_io/cyclers.py
def __init__(self, loaders: dict[str, DataLoader], shuffle: bool = True):
    self.loaders = loaders
    self.max_batches = max(len(loader) for loader in self.loaders.values())
    self.shuffle = shuffle
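The long-cycling behavior can be sketched as follows, using plain Python lists in place of DataLoader objects. `long_cycle` and `no_cache_cycle` are hypothetical stand-ins for the class's iteration logic, not the actual implementation:

```python
def no_cache_cycle(iterable):
    # Re-iterate from scratch on exhaustion instead of caching items.
    iterator = iter(iterable)
    while True:
        try:
            yield next(iterator)
        except StopIteration:
            iterator = iter(iterable)


def long_cycle(loaders: dict):
    # Yield (name, batch) pairs until the largest loader is exhausted;
    # shorter loaders wrap around and repeat their batches.
    max_batches = max(len(loader) for loader in loaders.values())
    cyclers = {name: no_cache_cycle(loader) for name, loader in loaders.items()}
    for _ in range(max_batches):
        for name, cycler in cyclers.items():
            yield name, next(cycler)


batches = list(long_cycle({"a": [1, 2, 3], "b": [10]}))
# "b" repeats its single batch while "a" is drained:
# [("a", 1), ("b", 10), ("a", 2), ("b", 10), ("a", 3), ("b", 10)]
```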

ShortCycler

ShortCycler(loaders: dict[str, DataLoader])

Bases: IterableDataset

Cycles through the elements of each dataloader without repeating any element.

Source code in openretina/data_io/cyclers.py
def __init__(self, loaders: dict[str, DataLoader]):
    self.loaders = loaders
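By contrast, ShortCycler's behavior amounts to chaining the loaders: every element is yielded exactly once, tagged with its loader's key. A minimal sketch, again with lists standing in for DataLoaders (`short_cycle` is a hypothetical stand-in, not the actual implementation):

```python
def short_cycle(loaders: dict):
    # Exhaust each loader once, in order, tagging batches with the loader key.
    for name, loader in loaders.items():
        for batch in loader:
            yield name, batch


batches = list(short_cycle({"a": [1, 2], "b": [3]}))
# [("a", 1), ("a", 2), ("b", 3)]
```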

cycle

cycle(iterable)

itertools.cycle without caching. See: https://github.com/pytorch/pytorch/issues/23900

Source code in openretina/data_io/cyclers.py
def cycle(iterable):
    """
    itertools.cycle without caching.
    See: https://github.com/pytorch/pytorch/issues/23900
    """
    iterator = iter(iterable)
    while True:
        try:
            yield next(iterator)
        except StopIteration:
            # Restart from a fresh iterator instead of replaying cached
            # items, so e.g. shuffling re-applies on each pass.
            iterator = iter(iterable)
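To see the difference from itertools.cycle: the standard version caches the first pass and replays it, whereas this version calls iter() again on exhaustion, so a reshufflable iterable (e.g. a DataLoader with shuffle=True) produces a fresh order on every pass. A small self-contained demonstration; note that, like itertools.cycle with a non-empty input, it yields forever, and on an empty iterable this variant loops without yielding:

```python
from itertools import islice


def cycle(iterable):
    # itertools.cycle without caching: restart iteration on exhaustion.
    iterator = iter(iterable)
    while True:
        try:
            yield next(iterator)
        except StopIteration:
            iterator = iter(iterable)


# Take seven items, wrapping past the end of a 3-element list.
first_seven = list(islice(cycle([1, 2, 3]), 7))
# [1, 2, 3, 1, 2, 3, 1]
```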