Data Contracts & File Utilities
The src/data/ and src/contracts/ packages manage the clinical and raw data formats throughout the pipeline.
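Contracts (src/contracts/)
Data contracts are defined as Python dataclasses. They give the JSON exchanged between the FastAPI routers and the background worker processes a fixed, typed structure. Plain dataclasses declare field types but do not validate them at runtime, so type safety depends on static analysis or explicit validation where payloads are parsed.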
| Contract | Purpose |
|---|---|
| Cell | Individual cell properties (bounding box, confidence, predicted class). |
| Slide | Holds metadata about a digital slide/field of view. |
| ClinicalSummary | Aggregated report details, e.g., cell counts, abnormal thresholds. |
| SlideReport | The root payload sent into the ClinicalReportGenerator. Includes Model Info, Disclaimer, and the summary. |
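
As an illustration of how such contracts can be written, the following is a minimal sketch using only the standard library. The field names (`total_cells`, `abnormal_cells`, `abnormal_threshold`, `model_info`, `disclaimer`) and the `from_json`/`to_json` helpers are assumptions made for this example, not the project's actual schema. Because dataclasses do not check types on their own, the sketch adds an explicit check in `__post_init__`.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ClinicalSummary:
    # Hypothetical fields; the real contract may differ.
    total_cells: int
    abnormal_cells: int
    abnormal_threshold: float

    def __post_init__(self) -> None:
        # Explicit runtime validation, since dataclasses only declare types.
        if not isinstance(self.total_cells, int) or not isinstance(self.abnormal_cells, int):
            raise TypeError("cell counts must be integers")
        if not 0 <= self.abnormal_cells <= self.total_cells:
            raise ValueError("abnormal_cells must be between 0 and total_cells")


@dataclass
class SlideReport:
    # Hypothetical root payload: model info, disclaimer, and the summary.
    model_info: str
    disclaimer: str
    summary: ClinicalSummary

    @classmethod
    def from_json(cls, raw: str) -> "SlideReport":
        """Rebuild a report from a JSON string, validating the nested summary."""
        data = json.loads(raw)
        return cls(
            model_info=data["model_info"],
            disclaimer=data["disclaimer"],
            summary=ClinicalSummary(**data["summary"]),
        )

    def to_json(self) -> str:
        """Serialize the report, including the nested summary, to JSON."""
        return json.dumps(asdict(self))
```

A report built this way survives a round trip through JSON unchanged, while a payload with inconsistent counts fails when it is parsed instead of later in report generation.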
Datasets (src/data/)
Modules dealing with transformations, data loading, and dataset splits.
- dataset.py and sipakmed.py: Wrappers around torch.utils.data.Dataset specifically tailored for parsing SIPaKMeD.
- transforms.py: Preprocessing logic (Albumentations, normalization, resizing).
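
The source does not say how dataset splits are computed. As one possible approach, the sketch below performs a seeded, stratified train/validation split over a list of class labels, so that each class keeps roughly the same proportion in both subsets. The function name `stratified_split` and its parameters are hypothetical and do not come from the project.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[str], val_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Return (train_indices, val_indices), preserving per-class proportions."""
    # Group sample indices by their class label.
    by_class: Dict[str, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # local RNG keeps the split reproducible
    train: List[int] = []
    val: List[int] = []
    for label in sorted(by_class):  # sorted for a deterministic class order
        indices = by_class[label]
        rng.shuffle(indices)
        n_val = round(len(indices) * val_fraction)
        val.extend(indices[:n_val])
        train.extend(indices[n_val:])
    return sorted(train), sorted(val)
```

The returned index lists can then be used to select samples from a map-style dataset; fixing the seed makes the same split reproducible across runs.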