Time Series Normalization
- class apyxl.TimeSeriesNormalizer(freq_trend: str, max_evals: int = 15)
TimeSeriesNormalizer normalizes a target time series based on external features using XGBoost regression.
- Parameters:
freq_trend (str) – Frequency string for resampling the time series trend.
max_evals (int, optional) – Maximum number of evaluations for the XGBoost model. Defaults to 15.
- Attributes:
freq_trend (pd.tseries.offsets.DateOffset) – Frequency for trend calculation.
xgb (XGBRegressorWrapper) – XGBoost regressor wrapper for normalization.
- classmethod apply_shift(trend: Series, y: Series, shift='auto') → Series
Applies the specified shift to the trend.
- Parameters:
trend (pd.Series) – The normalized trend series.
y (pd.Series) – The original target time series.
shift (float or str, optional) – Value to shift the trend by. Can be a float/int or a string such as 'mean'. Defaults to 'auto' (per the method signature).
- Returns:
Shifted trend series.
- Return type:
pd.Series
- Raises:
ValueError – If the shift value is invalid.
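The shift options described above can be illustrated with a small pandas sketch. This is not the apyxl implementation: the helper `shift_trend` is hypothetical, and reading 'mean' as "move the trend so that its mean matches the mean of y" is an assumption. The 'auto' value shown in the signature is not described in this reference, so the sketch leaves it out.

```python
import pandas as pd


def shift_trend(trend: pd.Series, y: pd.Series, shift=None) -> pd.Series:
    """Hypothetical stand-in for apply_shift; NOT the apyxl implementation."""
    if shift is None:
        # No shift requested: return the trend unchanged.
        return trend
    if isinstance(shift, (int, float)):
        # Numeric shift: add the constant to every point of the trend.
        return trend + shift
    if shift == "mean":
        # ASSUMPTION: 'mean' aligns the mean of the trend with the mean of y.
        return trend + (y.mean() - trend.mean())
    # Anything else is rejected, mirroring the documented ValueError.
    raise ValueError(f"Invalid shift value: {shift!r}")


idx = pd.date_range("2024-01-01", periods=4, freq="D")
trend = pd.Series([-1.0, 0.0, 0.0, 1.0], index=idx)
y = pd.Series([9.0, 10.0, 10.0, 11.0], index=idx)

shifted_const = shift_trend(trend, y, shift=2.5)    # constant offset
shifted_mean = shift_trend(trend, y, shift="mean")  # mean-aligned with y
```

Under these assumptions, a numeric shift moves the whole trend by a constant, while 'mean' puts the trend back on the scale of the original series.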
- normalize(X: DataFrame, y: Series = None, target: str = None, shift: float = None, return_shap_values=False) → Series
Normalizes the target time series using external features.
- Parameters:
X (pd.DataFrame) – DataFrame containing external features.
y (pd.Series, optional) – Target time series. If None, the target column in X will be used.
target (str, optional) – Name of the target column in X. Required if y is None.
shift (float or str, optional) – Value to shift the trend by. Can be a float/int or ‘mean’. Defaults to None.
return_shap_values (bool) – If True, return the SHAP values of the fitted model in addition to the computed trend.
- Returns:
Normalized time series.
- Return type:
pd.Series
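A call to normalize might look like the sketch below. It is illustrative only: the synthetic dataset, the column names (`temperature`, `wind`, `y`), the helper names `make_example_data` and `run_normalization`, and the choice of `"W"` (weekly) as the frequency string are all assumptions, not values taken from the library. The apyxl import sits inside the function so that the data helper runs without the package installed.

```python
import numpy as np
import pandas as pd


def make_example_data(n_days: int = 120, seed: int = 0) -> pd.DataFrame:
    """Build a synthetic daily dataset: two external features plus a target column 'y'."""
    rng = np.random.default_rng(seed)
    index = pd.date_range("2023-01-01", periods=n_days, freq="D")
    temperature = 15 + 10 * np.sin(2 * np.pi * np.arange(n_days) / 365) + rng.normal(0, 1, n_days)
    wind = rng.gamma(2.0, 2.0, n_days)
    # Target = slow underlying drift + feature-driven variation + noise.
    drift = np.linspace(0.0, 5.0, n_days)
    y = drift + 0.8 * temperature - 0.5 * wind + rng.normal(0, 0.5, n_days)
    return pd.DataFrame({"temperature": temperature, "wind": wind, "y": y}, index=index)


def run_normalization(data: pd.DataFrame) -> pd.Series:
    """Call the documented API; requires apyxl to be installed."""
    from apyxl import TimeSeriesNormalizer  # lazy import, see lead-in

    # ASSUMPTION: "W" (a pandas weekly offset alias) is an accepted freq_trend value.
    normalizer = TimeSeriesNormalizer(freq_trend="W", max_evals=15)
    # y is not passed, so the target is read from the 'y' column of `data`.
    return normalizer.normalize(data, target="y", shift="mean")


data = make_example_data()
```

With apyxl installed, `run_normalization(data)` should return the normalized series as a `pd.Series`. Passing the target by column name keeps the features and the target in one DataFrame with a shared index.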
- preprocess_data(X: DataFrame, y: Series = None, target: str = None)
Preprocesses the input data by selecting the target column and checking the index.
- Parameters:
X (pd.DataFrame) – DataFrame containing external features.
y (pd.Series, optional) – Target time series. If None, the target column in X will be used.
target (str, optional) – Name of the target column in X. Required if y is None.
- Returns:
Processed X and y.
- Return type:
Tuple[pd.DataFrame, pd.Series]
- Raises:
AssertionError – If neither y nor target is provided.
ValueError – If the target column in X contains NaN values.
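The checks documented for preprocess_data can be sketched in plain pandas. The function `split_features_target` below is hypothetical, not the library's code. Two details are assumptions: that "checking the index" means requiring a `DatetimeIndex`, and that the target column is removed from X once it has been extracted.

```python
import pandas as pd


def split_features_target(X: pd.DataFrame, y: pd.Series = None, target: str = None):
    """Hypothetical sketch of the documented checks; NOT the apyxl implementation."""
    # Documented: AssertionError if neither y nor target is provided.
    assert y is not None or target is not None, "Provide either y or target."
    if y is None:
        # Documented: ValueError if the target column in X contains NaN values.
        if X[target].isna().any():
            raise ValueError(f"Target column {target!r} contains NaN values.")
        y = X[target]
        # ASSUMPTION: the target column is dropped from the feature matrix.
        X = X.drop(columns=[target])
    # ASSUMPTION: the index check requires a DatetimeIndex.
    if not isinstance(X.index, pd.DatetimeIndex):
        raise ValueError("X must be indexed by a DatetimeIndex.")
    return X, y


idx = pd.date_range("2024-01-01", periods=3, freq="D")
frame = pd.DataFrame({"temp": [1.0, 2.0, 3.0], "load": [10.0, 11.0, 12.0]}, index=idx)

# Target given by column name: 'load' is split out of the feature frame.
features, target_series = split_features_target(frame, target="load")
```

Validating the target before fitting surfaces missing values early, instead of letting them propagate into the regression step.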