base_protocol
TemporalScope/src/temporalscope/partition/base_protocol.py.
This module defines the TemporalPartitionerProtocol, a protocol for all temporal partitioning methods. Each partitioning method must implement the required methods to comply with this protocol. Currently, only single-target workflows (scalar targets) are supported. Multi-target workflows (sequence or tensor targets) are planned for future releases. This protocol is designed to be flexible enough to accommodate both modes when multi-target support is added.
Partitioning for modern XAI Time-Series Pipelines:
Partitioning is foundational to modern time-series workflows. It ensures computational efficiency, robust validation, and interpretable insights. Key use cases include:
| Aspect | Details |
|---|---|
| Temporal Explainability | Facilitates feature importance analyses by segmenting data for localized SHAP/WindowSHAP metrics. |
| Robust Evaluation | Respects temporal ordering in train-test splits, critical for time-series generalization. |
| Scalability and Efficiency | Supports sliding windows, expanding windows, and fixed partitions with lazy-loading and backend compatibility for large-scale datasets. |
| Workflow Flexibility | Designed for both single-target and multi-target modes (only single-target is currently supported), enabling DataFrame operations and deep learning pipelines through flexible partitioning methods. |
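As a rough illustration of the "Robust Evaluation" row, the sketch below generates expanding-window train/test row ranges in which each test block strictly follows its training block. The function name `expanding_window_splits` and its parameters `initial_train` and `test_size` are hypothetical and are not part of TemporalScope.

```python
from typing import Dict, Iterator, Tuple


def expanding_window_splits(
    n_rows: int, initial_train: int, test_size: int
) -> Iterator[Dict[str, Tuple[int, int]]]:
    """Yield (start, end) row ranges where every test block follows its training block."""
    if initial_train < 1 or test_size < 1:
        raise ValueError("initial_train and test_size must be positive")
    train_end = initial_train
    while train_end + test_size <= n_rows:
        yield {"train": (0, train_end), "test": (train_end, train_end + test_size)}
        train_end += test_size  # the training window grows; it never looks ahead


for split in expanding_window_splits(n_rows=10, initial_train=4, test_size=2):
    print(split)
```

Because the ranges are half-open and the test range always starts where the training range ends, no future row can leak into training.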
Core Functionality:
The protocol defines four mandatory methods, ensuring a strict and consistent lifecycle across all partitioning implementations. Each method has a clear purpose and aligns with the goals of efficient partitioning:
| Method | Description |
|---|---|
| setup | Prepares and validates input data, ensuring compatibility with the chosen workflow (e.g., backend conversions, deduplication, parameter checks). |
| fit | Generates partition indices (row ranges) for datasets, supporting sliding windows, fixed-length, or expanding partitions. |
| transform | Applies the partition indices to retrieve specific data slices, ensuring memory-efficient operation using lazy evaluation techniques. |
| fit_transform | Combines fit and transform for eager workflows, directly producing partitioned data slices. |
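The four-method lifecycle can be sketched with `typing.Protocol`. This is a minimal sketch, not the library's source: the method signatures follow those listed on this page, while the class name `PartitionerProtocolSketch`, the `runtime_checkable` decorator, and the two toy classes are assumptions made for illustration.

```python
from typing import Any, Dict, Iterator, Protocol, runtime_checkable


@runtime_checkable  # assumption: enables isinstance() checks on method presence
class PartitionerProtocolSketch(Protocol):
    """Lifecycle sketch: setup -> fit -> transform (or fit_transform)."""

    def setup(self) -> None: ...

    def fit(self) -> Iterator[Dict[str, Any]]: ...

    def transform(self) -> Iterator[Dict[str, Any]]: ...

    def fit_transform(self) -> Iterator[Dict[str, Any]]: ...


class NoOpPartitioner:
    """Hypothetical implementation that yields nothing; it only shows structural typing."""

    def setup(self) -> None:
        return None

    def fit(self) -> Iterator[Dict[str, Any]]:
        return iter(())

    def transform(self) -> Iterator[Dict[str, Any]]:
        return iter(())

    def fit_transform(self) -> Iterator[Dict[str, Any]]:
        return iter(())


class Incomplete:
    """Hypothetical class missing three of the four methods."""

    def fit(self) -> Iterator[Dict[str, Any]]:
        return iter(())


print(isinstance(NoOpPartitioner(), PartitionerProtocolSketch))
print(isinstance(Incomplete(), PartitionerProtocolSketch))
```

No inheritance is needed: a class satisfies the protocol simply by defining all four methods, which is what lets different partitioning strategies share one lifecycle.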
Workflow Modes:
The protocol is designed around two primary modes, of which only single-target is currently supported:
- Single-Target (DataFrame-Centric):
  - Operations focus on Narwhals-backed DataFrames (Pandas, Polars, or Modin).
  - Slices are returned in DataFrame formats, preserving metadata.
- Multi-Target (Tensor/Dataset-Centric):
  - Designed for deep learning workflows (e.g., PyTorch, TensorFlow).
  - Handles transformations from DataFrame to tensor or dataset formats.
  - Ensures compatibility with sequence or tensor-target models.
Future Plans:
The protocol is designed for extensibility, ensuring advanced workflows like multi-modal models, cross-frequency partitioning, or custom padding strategies can be integrated seamlessly.
See Also
- Nayebi, A., Tipirneni, S., Reddy, C. K., et al. (2024). WindowSHAP: An efficient framework for explaining time-series classifiers based on Shapley values. Journal of Biomedical Informatics. DOI:10.1016/j.jbi.2023.104438.
- Gu, X., See, K. W., Wang, Y., et al. (2021). The sliding window and SHAP theory—an improved system with a long short-term memory network model for state of charge prediction in electric vehicles. Energies, 14(12), 3692. DOI:10.3390/en14123692.
- Van Ness, M., Shen, H., Wang, H., et al. (2023). Cross-Frequency Time Series Meta-Forecasting. arXiv preprint arXiv:2302.02077.
| CLASS | DESCRIPTION |
|---|---|
| TemporalPartitionerProtocol | Protocol for temporal partitioning methods. |
TemporalPartitionerProtocol
Protocol for temporal partitioning methods.
This protocol defines the lifecycle for partitioning workflows, supporting both single-target (dataframe-centric) and multi-target (tensor/dataset-centric) use cases.
| METHOD | DESCRIPTION |
|---|---|
| fit | Compute partition indices for slicing. |
| fit_transform | Combine fit and transform for eager execution. |
| setup | Prepare and validate input data for partitioning. |
| transform | Retrieve data slices using computed indices. |
fit
fit() -> Iterator[Dict[str, Any]]
Compute partition indices for slicing.
This method generates partition indices based on partitioning parameters
such as num_partitions, window_size, and stride. It utilizes a lazy
generator pattern to ensure memory efficiency, especially for large datasets.
| RETURNS | DESCRIPTION |
|---|---|
| Iterator[Dict[str, Any]] | Generator yielding partition indices structured as dictionaries. |

Notes
This method does not perform slicing; it only computes and returns indices.
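The lazy index generation described for fit can be sketched as a plain generator over row ranges. This is an illustrative sketch under stated assumptions: the function `sliding_window_indices`, its `n_rows` argument, and the dictionary keys `"partition"` and `"full"` are hypothetical; only `window_size` and `stride` are parameter names taken from this page.

```python
from typing import Any, Dict, Iterator


def sliding_window_indices(
    n_rows: int, window_size: int, stride: int
) -> Iterator[Dict[str, Any]]:
    """Lazily yield half-open (start, end) row ranges; no data is sliced here."""
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    # Only full windows are produced; a trailing partial window is skipped.
    for i, start in enumerate(range(0, n_rows - window_size + 1, stride)):
        yield {"partition": i, "full": (start, start + window_size)}


indices = sliding_window_indices(n_rows=10, window_size=4, stride=3)
print(list(indices))
```

Because the function is a generator, memory use stays constant no matter how many windows the dataset produces; each index dictionary exists only while it is being consumed.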
fit_transform
fit_transform() -> Iterator[Dict[str, Any]]
Combine fit and transform for eager execution.
This method computes partition indices and retrieves data slices in a single step. It is ideal for workflows requiring immediate access to partitioned data without intermediate steps.
| RETURNS | DESCRIPTION |
|---|---|
| Iterator[Dict[str, Any]] | Generator yielding dictionaries containing partitioned data slices. |
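One way to picture fit_transform is as the composition of the two earlier steps. The sketch below uses free functions over a plain list rather than the argument-free protocol methods, and the dictionary keys `"partition"`, `"rows"`, and `"data"` are hypothetical; only `num_partitions` is a parameter name taken from this page.

```python
from typing import Any, Dict, Iterator, Sequence


def fit(n_rows: int, num_partitions: int) -> Iterator[Dict[str, Any]]:
    """Yield contiguous, near-equal row ranges (index computation only)."""
    base, extra = divmod(n_rows, num_partitions)
    start = 0
    for i in range(num_partitions):
        end = start + base + (1 if i < extra else 0)  # spread the remainder
        yield {"partition": i, "rows": (start, end)}
        start = end


def transform(
    data: Sequence[Any], indices: Iterator[Dict[str, Any]]
) -> Iterator[Dict[str, Any]]:
    """Slice the data for each index dictionary as it is requested."""
    for part in indices:
        start, end = part["rows"]
        yield {"partition": part["partition"], "data": list(data[start:end])}


def fit_transform(data: Sequence[Any], num_partitions: int) -> Iterator[Dict[str, Any]]:
    """Compute indices and retrieve slices in a single call."""
    return transform(data, fit(len(data), num_partitions))


for part in fit_transform(list(range(7)), num_partitions=3):
    print(part)
```

The caller never handles the intermediate indices, which is the convenience this method offers over calling fit and transform separately.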
setup
setup() -> None
Prepare and validate input data for partitioning.
This method performs preprocessing and ensures the data is compatible with the specific workflow.
Example tasks include:
- Sorting and deduplication for DataFrame workflows.
- Conversion to tensors or datasets for multi-target workflows.
- Validation of partitioning parameters (e.g., num_partitions, stride).
This step ensures consistency across partitioning methods and minimizes runtime errors in subsequent stages.
Notes
This method should be idempotent and isolated. While optional for end-users, implementations must ensure it is executed internally before partitioning begins.
| RETURNS | DESCRIPTION |
|---|---|
| None | |

| RAISES | DESCRIPTION |
|---|---|
| ValueError | If any required input or parameter is invalid. |
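The setup tasks listed above (parameter validation, deduplication, sorting) and the idempotence requirement can be sketched as follows. The class `SetupSketch`, its `rows` attribute of `(timestamp, value)` pairs, and the "first occurrence wins" deduplication rule are assumptions for illustration, not TemporalScope behavior.

```python
from typing import Dict, List, Tuple


class SetupSketch:
    """Hypothetical partitioner fragment; only the setup step is shown."""

    def __init__(self, rows: List[Tuple[int, float]], num_partitions: int, stride: int):
        self.rows = rows  # (timestamp, value) pairs, possibly unsorted with duplicates
        self.num_partitions = num_partitions
        self.stride = stride

    def setup(self) -> None:
        # Validate parameters first so bad input fails before any work is done.
        if not self.rows:
            raise ValueError("rows must not be empty")
        if self.num_partitions < 1 or self.stride < 1:
            raise ValueError("num_partitions and stride must be >= 1")
        # Deduplicate on timestamp (first occurrence wins), then sort by time.
        unique: Dict[int, float] = {}
        for timestamp, value in self.rows:
            unique.setdefault(timestamp, value)
        self.rows = sorted(unique.items())


sketch = SetupSketch([(3, 0.3), (1, 0.1), (3, 9.9), (2, 0.2)], num_partitions=2, stride=1)
sketch.setup()
once = list(sketch.rows)
sketch.setup()  # a second call leaves the state unchanged (idempotent)
print(once == sketch.rows, sketch.rows)
```

Running setup on already clean data is a no-op, so an implementation can safely call it internally at the start of fit even if the user has called it already.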
transform
transform() -> Iterator[Dict[str, Any]]
Retrieve data slices using computed indices.
This method slices the data based on indices generated by fit. It ensures
memory efficiency through lazy evaluation and supports various output formats
depending on the workflow mode (e.g., DataFrame slices, tensors, or datasets).
| RETURNS | DESCRIPTION |
|---|---|
| Iterator[Dict[str, Any]] | Generator yielding dictionaries containing partitioned data slices. |
| RAISES | DESCRIPTION |
|---|---|
| ValueError | If |
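The lazy evaluation that transform relies on can be demonstrated with a generator that records when each slice is actually materialized. This is a sketch over a plain list: the free function `transform`, its `log` argument, and the dictionary keys `"rows"` and `"data"` are hypothetical and exist only to make the laziness observable.

```python
from typing import Any, Dict, Iterable, Iterator, List, Sequence, Tuple


def transform(
    data: Sequence[Any],
    indices: Iterable[Tuple[int, int]],
    log: List[Tuple[int, int]],
) -> Iterator[Dict[str, Any]]:
    """Yield one slice per (start, end) range, slicing only when requested."""
    for start, end in indices:
        log.append((start, end))  # record the moment a slice is materialized
        yield {"rows": (start, end), "data": data[start:end]}


data = list(range(10))
log: List[Tuple[int, int]] = []
slices = transform(data, [(0, 4), (3, 7), (6, 10)], log)

print(log)  # nothing has been sliced yet
first = next(slices)
print(first["data"], log)  # only the first window has been materialized
```

Creating the generator does no work; each slice is cut only when the consumer asks for it, so at most one window needs to be held in memory at a time.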