Quick Start 🚀
You can use TemporalScope with the following steps:
- Import TemporalScope: Start by importing the package.
- Select Backend (Optional): TemporalScope defaults to using Pandas as the backend. However, you can specify other backends like Dask, Modin, or CuDF.
- Load Data: Load your time series data into the TimeSeriesData class, specifying the time_col and optionally the id_col.
- Apply a Feature Importance Method: TemporalScope defaults to using a Random Forest model from scikit-learn if no model is specified. You can either:
  - A. Use a pre-trained model: Pass a pre-trained model to the method.
  - B. Train a Random Forest model within the method: TemporalScope handles model training and application automatically.
- Analyze and Visualize Results: Interpret the results to understand how feature importance evolves over time or across different phases.
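To make the last step concrete, here is a minimal, standard-library-only sketch of what "feature importance across phases" can look like. It does not use the TemporalScope API: `pearson` and `importance_by_phase` are hypothetical helpers, the data is synthetic, and absolute Pearson correlation is used as a simple stand-in for a model-based importance score such as the one a Random Forest would give.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def importance_by_phase(rows, feature_names, target_name, n_phases):
    """Split time-ordered rows into contiguous phases and score each feature per phase."""
    phase_len = len(rows) // n_phases
    scores = []
    for p in range(n_phases):
        start = p * phase_len
        # The last phase absorbs any leftover rows.
        end = len(rows) if p == n_phases - 1 else start + phase_len
        phase = rows[start:end]
        target = [r[target_name] for r in phase]
        scores.append(
            {name: abs(pearson([r[name] for r in phase], target)) for name in feature_names}
        )
    return scores


# Synthetic series: the target follows feature "a" in the first half
# and feature "b" in the second half.
rng = random.Random(0)
rows = []
for t in range(200):
    a, b = rng.gauss(0, 1), rng.gauss(0, 1)
    driver = a if t < 100 else b
    rows.append({"t": t, "a": a, "b": b, "y": 2.0 * driver + rng.gauss(0, 0.1)})

phase_scores = importance_by_phase(rows, ["a", "b"], "y", n_phases=2)
for i, phase in enumerate(phase_scores):
    print(f"phase {i}: " + ", ".join(f"{k}={v:.2f}" for k, v in phase.items()))
```

In this toy setup the score for "a" dominates in the first phase and the score for "b" dominates in the second, which is the kind of shift the analysis step is meant to surface.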
The example below puts this into code using the default model and an academic dataset. We'll use the U.S. macroeconomic dataset (macrodata) from statsmodels as a simple example since it's well-known and accessible.
import polars as pl
import pandas as pd
from statsmodels.datasets import macrodata
from temporalscope.core.temporal_data_loader import TimeFrame
from temporalscope.partitioning.naive_partitioner import NaivePartitioner
from temporalscope.core.temporal_model_trainer import TemporalModelTrainer
# 1. Load the dataset using Pandas (or convert to Polars)
macro_df = macrodata.load_pandas().data
macro_df['time'] = pd.date_range(
    start='1959-01-01',
    periods=len(macro_df),
    freq='Q'
)
# Convert the Pandas DataFrame to a Polars DataFrame
macro_df_polars = pl.from_pandas(macro_df)
# 2. Initialize the TimeFrame object with the data
economic_tf = TimeFrame(
    df=macro_df_polars,
    time_col='time',
    target_col='realgdp',
    backend='pl',  # Using Polars as the backend
)
# 3. Apply Partitioning Strategy (like sklearn's train_test_split)
partitioner = NaivePartitioner(economic_tf)
partitioned_data = partitioner.apply() # Returns a list of partitioned dataframes
# 4. Train and evaluate the model using the partitioned data
model_trainer = TemporalModelTrainer(
    partitioned_data=partitioned_data,  # Directly passing the partitioned data
    model=None,  # Use the default model (LightGBM)
    model_params={
        'objective': 'regression',
        'boosting_type': 'gbdt',
        'metric': 'rmse',
        'verbosity': -1
    }
)
# 5. Execute the training and evaluate
results = model_trainer.train_and_evaluate()
# Output predictions and metrics
for partition_name, predictions in results.items():
    print(f"Predictions for {partition_name}:")
    print(predictions[:5])  # Display first 5 predictions
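
Step 3 above compares partitioning to scikit-learn's train_test_split. For time series the split is usually chronological rather than shuffled, so that the model is never evaluated on data older than what it was trained on. The sketch below illustrates that idea with the standard library only. It is not how NaivePartitioner is implemented: `chronological_split` is a hypothetical helper and the records are synthetic placeholders.

```python
from datetime import date


def chronological_split(records, time_key, train_fraction=0.8):
    """Sort records by time and cut once, so every training row precedes every test row."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    ordered = sorted(records, key=lambda r: r[time_key])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Twenty synthetic quarterly observations starting in 1959 (placeholder values).
records = [
    {"time": date(1959 + i // 4, 3 * (i % 4) + 1, 1), "value": float(i)}
    for i in range(20)
]

# Input order does not matter, because the helper sorts by time first.
train, test = chronological_split(list(reversed(records)), "time", train_fraction=0.75)

print(len(train), len(test))
print(train[-1]["time"], "<", test[0]["time"])
```

A single cut is the simplest choice; rolling or expanding windows follow the same rule of keeping the training data strictly earlier than the evaluation data.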