TemporalScope Tutorial: TimeFrame and Backend-Agnostic Data Loading
TimeFrame Modes
The TimeFrame class supports two key modes for handling temporal data:
Implicit & Static Time Series (Default Mode):
- Time column is treated as a feature for static modeling
- Supports mixed-frequency workflows
- No strict temporal ordering enforced
- Use when: Building ML models where time is just another feature
- Example: enforce_temporal_uniqueness=False (default)
Strict Time Series:
- Enforces strict temporal ordering and uniqueness
- Suitable for forecasting tasks
- Can validate by groups using id_col
- Use when: Building forecasting models requiring temporal integrity
- Example: enforce_temporal_uniqueness=True
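To make the difference between the two modes concrete, here is a minimal pandas sketch of the kind of check that strict mode implies. It is not TemporalScope's implementation: the helper name `has_unique_timestamps` is hypothetical, and the sketch only assumes that "temporal uniqueness" means no repeated timestamp, either across the whole table or within each `id_col` group.

```python
import pandas as pd


def has_unique_timestamps(df, time_col, id_col=None):
    """Return True if no timestamp repeats (within each group when id_col is given).

    Hypothetical helper for illustration; not part of TemporalScope.
    """
    keys = [time_col] if id_col is None else [id_col, time_col]
    return not df.duplicated(subset=keys).any()


# Two entities that both have an observation at time 1
df = pd.DataFrame({"id": [1, 1, 2, 2], "time": [1, 2, 1, 3]})

print(has_unique_timestamps(df, "time"))               # False: time 1 appears twice overall
print(has_unique_timestamps(df, "time", id_col="id"))  # True: unique within each id
```

In the default mode no such check would be needed, since the time column is treated as an ordinary feature.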
Engineering Design Overview
The TimeFrame class uses Narwhals for backend-agnostic DataFrame operations and is designed with several key assumptions:
Preprocessed Data Assumption:
- TemporalScope assumes users provide clean, preprocessed data
- As with TensorFlow and GluonTS, preprocessing should be handled before the data reaches TemporalScope
Time Column Constraints:
- time_col must be a numeric index or timestamp
- Critical for operations like sliding window partitioning and temporal XAI
Numeric Features Requirement:
- All features (except time_col) must be numeric
- Ensures compatibility with ML models and XAI techniques
Universal Model Assumption:
- Models operate on entire dataset without hidden groupings
- Enables seamless integration with SHAP, Boruta-SHAP, and LIME
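Because TemporalScope expects these constraints to hold before a TimeFrame is created, it can help to check them up front. The sketch below does so with plain pandas. The helper `check_timeframe_inputs` is hypothetical and only mirrors the assumptions listed above; it does not reproduce the library's own validation.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_numeric_dtype


def check_timeframe_inputs(df, time_col):
    """Collect violations of the stated assumptions (hypothetical helper)."""
    problems = []
    time = df[time_col]
    # Time column constraint: numeric index or timestamp
    if not (is_numeric_dtype(time) or is_datetime64_any_dtype(time)):
        problems.append(f"{time_col!r} must be a numeric index or timestamp")
    # Numeric features requirement: every other column must be numeric
    for col in df.columns:
        if col != time_col and not is_numeric_dtype(df[col]):
            problems.append(f"feature {col!r} is not numeric")
    return problems


clean = pd.DataFrame(
    {"ds": pd.to_datetime(["2024-01-01", "2024-01-02"]), "x": [0.1, 0.2], "y": [1, 2]}
)
dirty = clean.assign(label=["a", "b"])  # adds a string column

print(check_timeframe_inputs(clean, "ds"))  # []
print(check_timeframe_inputs(dirty, "ds"))  # ["feature 'label' is not numeric"]
```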
Backend Support
TemporalScope leverages Narwhals for backend-agnostic operations, supporting:
Production Environment:
- pandas: Core DataFrame library (default)
- narwhals: Backend-agnostic operations
Test Environment (via hatch):
- modin: Parallelized Pandas operations
- pyarrow: Apache Arrow-based processing
- polars: High-performance Rust implementation
- dask: Distributed computing framework
This separation ensures lightweight production deployments while maintaining robust testing across backends.
import pandas as pd
import narwhals as nw
from temporalscope.core.temporal_data_loader import TimeFrame
from temporalscope.datasets.datasets import DatasetLoader
# Load example data
loader = DatasetLoader("macrodata")
data = loader.load_data()
# Create TimeFrame (default mode: time as static feature)
tf = TimeFrame(data, time_col="ds", target_col="realgdp")
# Display configuration
print("TimeFrame Configuration:")
print(f"Mode: {tf.mode}")
print(f"Sort Order: {'Ascending' if tf.ascending else 'Descending'}")
# Preview data
print("\nData Preview:")
print(tf.df.head())
======================================================================
Loading dataset: 'macrodata'
======================================================================
DataFrame shape: (203, 13)
Target column: realgdp
======================================================================
TimeFrame Configuration:
Mode: single_target
Sort Order: Ascending
Data Preview:
    realgdp  realcons  realinv  realgovt  realdpi    cpi     m1  tbilrate  \
0  2710.349    1707.4  286.898   470.045   1886.9  28.98  139.7      2.82
1  2778.801    1733.7  310.859   481.301   1919.7  29.15  141.7      3.08
2  2775.488    1751.8  289.226   491.260   1916.4  29.35  140.5      3.82
3  2785.204    1753.7  299.356   484.052   1931.3  29.37  140.0      4.33
4  2847.699    1770.5  331.722   462.199   1955.5  29.54  139.6      3.50

   unemp      pop  infl  realint         ds
0    5.8  177.146  0.00     0.00 1959-01-01
1    5.1  177.830  2.34     0.74 1959-04-01
2    5.3  178.657  2.74     1.09 1959-07-01
3    5.6  179.386  0.27     4.06 1959-10-01
4    5.2  180.007  2.31     1.19 1960-01-01
Example: Group-Level Temporal Uniqueness
TimeFrame supports validation of temporal uniqueness at the group level, essential for multi-entity time series applications:
# Create sample multi-entity data
df = pd.DataFrame(
    {
        "id": [1, 1, 2, 2],
        "time": [1, 2, 1, 3],  # Note: Different groups can share timestamps
        "feature": [0.1, 0.2, 0.3, 0.4],
        "target": [10, 20, 30, 40],
    }
)
# Create TimeFrame with group-level temporal validation
tf = TimeFrame(df, time_col="time", target_col="target", enforce_temporal_uniqueness=True, id_col="id")
print("Data with valid temporal uniqueness within groups:")
print(tf.df)
Data with valid temporal uniqueness within groups:
   id  time  feature  target
0   1     1      0.1      10
2   2     1      0.3      30
1   1     2      0.2      20
3   2     3      0.4      40
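For contrast, the following sketch builds data that would not satisfy group-level uniqueness and locates the offending rows with plain pandas. The data is made up for illustration. How TimeFrame itself reports such a violation is not covered in this tutorial, so the sketch stops at identifying the rows.

```python
import pandas as pd

bad = pd.DataFrame(
    {
        "id": [1, 1, 2, 2],
        "time": [1, 1, 1, 3],  # entity 1 has timestamp 1 twice
        "target": [10, 20, 30, 40],
    }
)

# keep=False marks every member of a duplicated (id, time) pair,
# not just the second occurrence
violations = bad[bad.duplicated(subset=["id", "time"], keep=False)]
print(violations)
```

Entity 2 also has an observation at time 1, but it is not flagged because the duplicate check is keyed on the (id, time) pair.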
Example: TimeFrame Metadata
TimeFrame includes a metadata container for extensibility and future ML framework integrations:
# Store custom metadata
tf.metadata["model_config"] = {
    "type": "LSTM",
    "framework": "PyTorch",
    "hyperparameters": {"hidden_size": 64, "num_layers": 2},
}
print("TimeFrame Metadata:")
print(tf.metadata)
TimeFrame Metadata:
{'model_config': {'type': 'LSTM', 'framework': 'PyTorch', 'hyperparameters': {'hidden_size': 64, 'num_layers': 2}}}
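The printed output suggests that the metadata container behaves like a plain dictionary. Under that assumption, and as long as the stored values are JSON-compatible, the configuration can be persisted alongside experiment results using only the standard library. The sketch below uses a standalone dict in place of `tf.metadata`, so it runs without TemporalScope installed.

```python
import json

# Stand-in for tf.metadata (assumed to be dict-like)
metadata = {
    "model_config": {
        "type": "LSTM",
        "framework": "PyTorch",
        "hyperparameters": {"hidden_size": 64, "num_layers": 2},
    }
}

serialized = json.dumps(metadata, sort_keys=True)  # dict -> JSON string
restored = json.loads(serialized)                  # JSON string -> dict

print(restored == metadata)  # True: the round trip preserves the configuration
```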
Info
This tutorial was auto-generated from the TemporalScope repository.
If you would like to suggest enhancements or report issues, please submit a Pull Request following the contribution guidelines.
Source notebook: 1_load_data_timeframe.ipynb
Disclaimer & Copyright
THIS SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES, OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT, OR OTHERWISE, ARISING FROM, OUT OF, OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
THIS SOFTWARE IS INTENDED FOR ACADEMIC AND INFORMATIONAL PURPOSES ONLY. IT SHOULD NOT BE USED IN PRODUCTION ENVIRONMENTS OR FOR CRITICAL DECISION-MAKING WITHOUT PROPER VALIDATION. ANY USE OF THIS SOFTWARE IS AT THE USER'S OWN RISK.
© 2024 Philip Ndikum