Datasets

class BaseDataset[source]

Bases: Dataset, ABC

Base class for all datasets.

__init__(root, split='train', transform=None)[source]
Parameters:
  • root (str | Path) – Path to dataset root

  • split (str) – Dataset split (‘train’, ‘val’, ‘test’)

  • transform (Any | None) – Optional transform to apply to data
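
A minimal sketch of subclassing BaseDataset, assuming that concrete datasets implement __len__ and __getitem__ (inherited from torch.utils.data.Dataset) and that the constructor stores transform on the instance. The import path datasets and the .pt file layout are illustrative only, not part of this API.

    from pathlib import Path

    import torch

    from datasets import BaseDataset  # hypothetical import path


    class TensorFolderDataset(BaseDataset):
        """Toy subclass that indexes .pt tensors stored under <root>/<split>/."""

        def __init__(self, root, split="train", transform=None):
            super().__init__(root, split=split, transform=transform)
            self.files = sorted(Path(root, split).glob("*.pt"))

        def __len__(self):
            return len(self.files)

        def __getitem__(self, idx):
            sample = torch.load(self.files[idx])
            if self.transform is not None:  # assumes BaseDataset keeps `transform`
                sample = self.transform(sample)
            return sample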

class DetectionDataset[source]

Bases: BaseDataset

Base class for object detection datasets.

__init__(root, split='train', transform=None, min_size=800, max_size=1333)[source]
Parameters:
  • root (str) – Path to dataset root

  • split (str) – Dataset split (‘train’, ‘val’, ‘test’)

  • transform (Any | None) – Optional transform to apply to data

  • min_size (int) – Minimum image size used when resizing

  • max_size (int) – Maximum image size used when resizing

collate_fn(batch)[source]

Custom collate function for detection datasets.

Parameters:
  • batch (List[Dict[str, Any]]) – List of dataset samples to collate into a batch

Return type:

Tuple[List[Tensor], List[Dict[str, Any]]]
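
For orientation, the sketch below shows what a detection collate function with this signature typically does: images in a batch can have different spatial sizes and different numbers of boxes, so both images and targets are kept as lists rather than stacked into single tensors. The "image" key and the target layout are assumptions; this is not the library's actual implementation.

    from typing import Any, Dict, List, Tuple

    from torch import Tensor


    def detection_collate(batch: List[Dict[str, Any]]) -> Tuple[List[Tensor], List[Dict[str, Any]]]:
        # Keep per-sample tensors in lists; detection images are not stacked
        # because their sizes (and number of boxes) can differ across samples.
        images = [sample["image"] for sample in batch]  # assumes an "image" key
        targets = [{k: v for k, v in sample.items() if k != "image"} for sample in batch]
        return images, targets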

class SegmentationDataset[source]

Bases: BaseDataset

Generic dataset for semantic segmentation tasks.

__init__(root, images_dir, masks_dir, num_classes, split='train', transform=None, file_extension='jpg')[source]
Parameters:
  • root (str) – Root directory path

  • images_dir (str) – Directory name containing images relative to root

  • masks_dir (str) – Directory name containing masks relative to root

  • num_classes (int) – Number of classes (including background)

  • split (str) – Dataset split (‘train’, ‘val’, or ‘test’)

  • transform (Callable | None) – Optional transform to be applied

  • file_extension (str) – Image file extension to look for
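
A usage sketch, assuming a directory layout like <root>/images/*.jpg with matching masks under <root>/masks/ and that indexing yields an (image, mask) pair. Only the documented constructor arguments are used; the paths and class count are placeholders.

    from datasets import SegmentationDataset  # hypothetical import path

    dataset = SegmentationDataset(
        root="/data/roads",
        images_dir="images",
        masks_dir="masks",
        num_classes=2,            # background + one foreground class
        split="train",
        transform=None,
        file_extension="jpg",
    )

    image, mask = dataset[0]      # assumes samples are (image, mask) pairs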

class AnomalyDataset[source]

Bases: BaseDataset

Base class for anomaly detection datasets.

__init__(root, split='train', transform=None, mask_transform=None, normal_class=None)[source]
Parameters:
  • root (str) – Path to dataset root

  • split (str) – Dataset split (‘train’, ‘val’, ‘test’)

  • transform (Any | None) – Optional transform to apply to data

  • mask_transform (Any | None) – Optional transform to apply to anomaly masks

  • normal_class (int | None) – Index of the class treated as normal

get_normal_samples()[source]

Get all normal samples for training reconstruction-based models.

Return type:

Tuple[Tensor, …]
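
A sketch of feeding get_normal_samples() into a reconstruction model's training loop, assuming a concrete AnomalyDataset subclass and that the returned tensors share a common shape so they can be stacked; the MVTec-style path is a placeholder.

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    from datasets import AnomalyDataset  # hypothetical path; use a concrete subclass in practice

    train_set = AnomalyDataset(root="/data/mvtec/bottle", split="train", normal_class=0)
    normal = train_set.get_normal_samples()                  # Tuple[Tensor, ...]
    loader = DataLoader(TensorDataset(torch.stack(normal)),  # assumes tensors share a shape
                        batch_size=32, shuffle=True)

    for (images,) in loader:
        # images: a (B, C, H, W) batch of normal samples for an autoencoder, etc.
        break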

class COCODataset[source]

Bases: DetectionDataset

COCO Dataset for object detection.

__init__(root, split='train', transform=None, year='2017')[source]
Parameters:
  • root (str) – Path to dataset root

  • split (str) – Dataset split (‘train’, ‘val’, ‘test’)

  • transform (Any | None) – Optional transform to apply to data

  • year (str) – COCO dataset release year (e.g. ‘2017’)
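
An end-to-end sketch wiring COCODataset into a DataLoader with the collate_fn inherited from DetectionDataset. The /data/coco path, the usual COCO on-disk layout (train2017/ images plus annotations/ JSON), and the exact contents of each target dict are assumptions, not something this page documents.

    from torch.utils.data import DataLoader

    from datasets import COCODataset  # hypothetical import path

    train_ds = COCODataset(root="/data/coco", split="train", year="2017")
    train_loader = DataLoader(
        train_ds,
        batch_size=2,
        shuffle=True,
        num_workers=4,
        collate_fn=train_ds.collate_fn,   # custom collate from DetectionDataset
    )

    for images, targets in train_loader:
        # images: List[Tensor]; targets: List[Dict[str, Any]] (boxes, labels, ...)
        break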
