functional
TemporalScope/src/temporalscope/partition/padding/functional.py
This module provides backend-agnostic utility functions for padding time-series DataFrames.
All functions in this module follow a functional design pattern using Narwhals' API for
deferred and optimized execution. These utilities are designed to operate on SupportedTemporalDataFrame
objects, ensuring flexibility for end-users to integrate custom parallelization and optimization strategies.
These functions are suitable for use in partitioning workflows or standalone operations, with an emphasis
on numerical data compatibility. Users are expected to preprocess their DataFrames (e.g., ensuring numerical columns
and handling any specific column semantics like time_col or target_col) before applying these padding utilities.
Engineering Design:
- Functional Design:
  - Stateless functions that return padded DataFrames without modifying inputs.
  - Compatible with distributed or batch processing frameworks for scalability.
- Validation:
  - All input DataFrames must conform to SupportedTemporalDataFrame standards, as defined in core_utils.py.
  - Explicit checks ensure all columns are numeric and free of null or NaN values.
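The validation rule above (numeric columns, no null or NaN values) can be illustrated with a minimal plain-Python sketch. It models a DataFrame as a dict of column lists and uses a hypothetical helper name, `validate_numeric_columns`; it is not the module's actual check, which operates on SupportedTemporalDataFrame objects.

```python
import math
from numbers import Real


def validate_numeric_columns(columns: dict) -> None:
    """Raise ValueError if any column holds a null, NaN, or non-numeric value.

    Hypothetical helper: `columns` maps column names to lists of values and
    stands in for a real DataFrame in this sketch.
    """
    for name, values in columns.items():
        for value in values:
            # None stands in for a null entry in this plain-Python model.
            if value is None:
                raise ValueError(f"Column '{name}' contains a null value")
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, Real):
                raise ValueError(
                    f"Column '{name}' contains a non-numeric value: {value!r}"
                )
            if math.isnan(value):
                raise ValueError(f"Column '{name}' contains NaN")


# Passes silently: every value is a real number.
validate_numeric_columns({"feature_1": [10, 20], "target": [50.0, 60.5]})
```

Running the check once before padding keeps the padding function itself free of per-call type logic.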
Examples:
import pandas as pd
from temporalscope.partition.padding.functional import mean_fill_pad
df = pd.DataFrame({"feature_1": [10, 20], "feature_2": [30, 40], "target": [50, 60]})
padded_df = mean_fill_pad(df, target_len=5, padding="post")
print(padded_df)
Notes
This module draws inspiration from industry-standard patterns, including:
- TensorFlow's TimeseriesGenerator for its emphasis on preprocessing flexibility.
- PyTorch's Dataset API for its focus on functional design and data transformations.
- FastAI's modular TSDataLoaders for encouraging separation of concerns in time-series workflows.
Refer to the API documentation for further details on usage patterns and constraints.
DataFrame Evaluation Modes:
| Mode | Key Characteristics | Type Handling |
|---|---|---|
| Eager | Immediate execution; direct computation; memory-bound ops | Use schema for types; get Narwhals types directly; Narwhals ops supported |
| Lazy | Deferred execution; optimized planning; large-scale data | Must use native dtype; schema not supported; native type ops required |
Critical Rules:
- Never mix eager and lazy operations.
- Use Narwhals operations consistently; note that Dask requires special handling for concatenation.
- Convert to native format only when required.
- Keep the same mode across concatenations, using backend-specific methods when needed (e.g. dask.dataframe.concat).
See Also
- Dwarampudi, M. and Reddy, N.V., 2019. Effects of padding on LSTMs and CNNs. arXiv preprint arXiv:1903.07288.
- Lafabregue, B., Weber, J., et al., 2022. End-to-end deep representation learning for time series clustering: a comparative study. Data Mining and Knowledge Discovery.
| FUNCTION | DESCRIPTION |
|---|---|
| mean_fill_pad | Pad a DataFrame to target length by filling with column means. |
mean_fill_pad
mean_fill_pad(
df: FrameT, target_len: int, padding: str = "post"
) -> FrameT
Pad a DataFrame to target length by filling with column means.
A simple padding function that extends a DataFrame to a target length by adding rows filled with each column's mean value. Handles both eager and lazy evaluation.
| PARAMETER | DESCRIPTION |
|---|---|
| df | DataFrame to pad. TYPE: FrameT |
| target_len | Desired length after padding. TYPE: int |
| padding | Where to add padding ('pre' or 'post'). TYPE: str, DEFAULT: "post" |
| RETURNS | DESCRIPTION |
|---|---|
| FrameT | Padded DataFrame |
| RAISES | DESCRIPTION |
|---|---|
| ValueError | If target_len <= current length or the padding direction is invalid |
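The behavior documented above (append or prepend rows holding each column's mean, and raise ValueError on a bad length or direction) can be sketched in plain Python. This is an illustration only: the name `mean_fill_pad_sketch` and the dict-of-lists data model are assumptions, and the real function works on Narwhals-compatible frames in both eager and lazy modes.

```python
from statistics import fmean


def mean_fill_pad_sketch(columns: dict, target_len: int, padding: str = "post") -> dict:
    """Return a new dict of columns padded to target_len with column means.

    Hypothetical stand-in for mean_fill_pad; `columns` maps names to
    equal-length, non-empty lists of numbers. The input is not modified.
    """
    if padding not in ("pre", "post"):
        raise ValueError(f"padding must be 'pre' or 'post', got {padding!r}")

    lengths = {len(values) for values in columns.values()}
    if len(lengths) != 1:
        raise ValueError("all columns must share one length")
    current_len = lengths.pop()
    if current_len == 0:
        raise ValueError("cannot compute a mean for empty columns")
    if target_len <= current_len:
        raise ValueError(
            f"target_len ({target_len}) must exceed current length ({current_len})"
        )

    pad_rows = target_len - current_len
    padded = {}
    for name, values in columns.items():
        fill = [fmean(values)] * pad_rows  # one mean value per padding row
        # 'pre' puts the fill rows first; 'post' puts them last.
        padded[name] = fill + list(values) if padding == "pre" else list(values) + fill
    return padded


data = {"feature_1": [10, 20], "target": [50, 60]}
print(mean_fill_pad_sketch(data, target_len=4))
# {'feature_1': [10, 20, 15.0, 15.0], 'target': [50, 60, 55.0, 55.0]}
```

Building a new dict rather than extending the input lists mirrors the stateless design described for this module.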
Source code in src/temporalscope/partition/single_target/padding/functional.py