Time Series Normalization

class apyxl.TimeSeriesNormalizer(freq_trend: str, max_evals: int = 15)

TimeSeriesNormalizer normalizes a target time series based on external features using XGBoost regression.

Parameters:
  • freq_trend (str) – Frequency string for resampling the time series trend.

  • max_evals (int, optional) – Maximum number of evaluations for the XGBoost model. Defaults to 15.

Attributes:
  • freq_trend (pd.tseries.offsets.DateOffset) – Frequency for trend calculation.

  • xgb (XGBRegressorWrapper) – XGBoost regressor wrapper for normalization.

classmethod apply_shift(trend: Series, y: Series, shift='auto') → Series

Applies the specified shift to the trend.

Parameters:
  • trend (pd.Series) – The normalized trend series.

  • y (pd.Series) – The original target time series.

  • shift (float or str, optional) – Value to shift the trend by. Can be a float/int or ‘mean’. Defaults to ‘auto’.

Returns:

Shifted trend series.

Return type:

pd.Series

Raises:

ValueError – If the shift value is invalid.
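As a rough illustration of the shift options described above, the following standalone sketch mimics the documented contract with plain pandas. It is not the library's implementation: shift_trend_sketch is a hypothetical helper, and the semantics assumed here are guesses that the reference text does not confirm (a numeric shift is added to every point of the trend, while ‘mean’ recentres the trend so that its mean matches the mean of y). The ‘auto’ default from the signature is left out because its behaviour is not described.

```python
import numbers

import pandas as pd


def shift_trend_sketch(trend: pd.Series, y: pd.Series, shift=None) -> pd.Series:
    """Hypothetical stand-in for apply_shift; the semantics are assumptions."""
    if shift is None:
        # Assumption: no shift requested, so the trend is returned unchanged.
        return trend.copy()
    if isinstance(shift, str):
        if shift == "mean":
            # Assumption: move the trend so that its mean equals the mean of y.
            return trend + (y.mean() - trend.mean())
        raise ValueError(f"Unsupported shift value: {shift!r}")
    if isinstance(shift, numbers.Real) and not isinstance(shift, bool):
        # Assumption: a numeric shift is a constant offset added to the trend.
        return trend + shift
    raise ValueError(f"Unsupported shift value: {shift!r}")


index = pd.date_range("2024-01-01", periods=4, freq="D")
trend = pd.Series([1.0, 2.0, 3.0, 4.0], index=index)
y = pd.Series([11.0, 12.0, 13.0, 14.0], index=index)

shifted_by_constant = shift_trend_sketch(trend, y, shift=2.5)
shifted_to_mean = shift_trend_sketch(trend, y, shift="mean")
```

Under these assumptions, the constant shift moves every point by 2.5, and the ‘mean’ shift leaves the shape of the trend untouched while aligning its level with y.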

normalize(X: DataFrame, y: Series = None, target: str = None, shift: float = None, return_shap_values=False) → Series

Normalizes the target time series using external features.

Parameters:
  • X (pd.DataFrame) – DataFrame containing external features.

  • y (pd.Series, optional) – Target time series. If None, the target column in X will be used.

  • target (str, optional) – Name of the target column in X. Required if y is None.

  • shift (float or str, optional) – Value to shift the trend by. Can be a float/int or ‘mean’. Defaults to None.

  • return_shap_values (bool) – If True, return the SHAP values of the fitted model in addition to the computed trend. Defaults to False.

Returns:

Normalized time series. If return_shap_values is True, the SHAP values of the fitted model are returned as well.

Return type:

pd.Series
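A minimal usage sketch for the call documented above, using synthetic data. The column names (temperature, wind, load), the helper names make_demo_frame and run_normalization, and the choice of "W" as the freq_trend string are illustrative assumptions, not values taken from the library. The apyxl import sits inside the function so the data-building part runs even where the package is not installed.

```python
import numpy as np
import pandas as pd


def make_demo_frame(n_days: int = 120, seed: int = 0) -> pd.DataFrame:
    """Build a synthetic daily frame with two features and a target column."""
    rng = np.random.default_rng(seed)
    index = pd.date_range("2024-01-01", periods=n_days, freq="D")
    temperature = rng.normal(15.0, 5.0, n_days)
    wind = rng.gamma(2.0, 2.0, n_days)
    noise = rng.normal(0.0, 1.0, n_days)
    # Synthetic target: a slow upward drift plus feature effects and noise.
    load = 0.05 * np.arange(n_days) + 0.8 * temperature - 0.3 * wind + noise
    return pd.DataFrame(
        {"temperature": temperature, "wind": wind, "load": load}, index=index
    )


def run_normalization(frame: pd.DataFrame) -> pd.Series:
    """Call the documented API; requires apyxl to be installed."""
    from apyxl import TimeSeriesNormalizer  # imported lazily on purpose

    # Assumption: "W" (weekly) is an accepted frequency string for freq_trend.
    normalizer = TimeSeriesNormalizer(freq_trend="W", max_evals=5)
    # The target is taken from the "load" column of the frame, as documented
    # for the case where y is None.
    return normalizer.normalize(frame, target="load", shift="mean")


demo = make_demo_frame()
# trend = run_normalization(demo)  # uncomment once apyxl is available
```

Passing target instead of y keeps features and target in a single frame, which is the case the parameter description covers with "If None, the target column in X will be used".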

preprocess_data(X: DataFrame, y: Series = None, target: str = None)

Preprocesses the input data by selecting the target column and checking the index.

Parameters:
  • X (pd.DataFrame) – DataFrame containing external features.

  • y (pd.Series, optional) – Target time series. If None, the target column in X will be used.

  • target (str, optional) – Name of the target column in X. Required if y is None.

Returns:

Processed X and y.

Return type:

Tuple[pd.DataFrame, pd.Series]

Raises:
  • AssertionError – If neither y nor target is provided.

  • ValueError – If the target column in X contains NaN values.
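The contract above can be sketched as a standalone function. This is a hypothetical stand-in (preprocess_sketch), not the library's code. Two details are assumptions: that the target column is removed from X once it has been extracted, and that "checking the index" means requiring a DatetimeIndex.

```python
from typing import Optional, Tuple

import pandas as pd


def preprocess_sketch(
    X: pd.DataFrame, y: Optional[pd.Series] = None, target: Optional[str] = None
) -> Tuple[pd.DataFrame, pd.Series]:
    """Hypothetical stand-in mirroring the documented preprocess_data contract."""
    # Documented: an AssertionError is raised if neither y nor target is given.
    assert y is not None or target is not None, "Provide either y or target."
    if y is None:
        # Documented: a ValueError is raised if the target column has NaN values.
        if X[target].isna().any():
            raise ValueError(f"Target column {target!r} contains NaN values.")
        y = X[target]
        # Assumption: the remaining columns are kept as features.
        X = X.drop(columns=[target])
    # Assumption: the index check requires a DatetimeIndex.
    if not isinstance(X.index, pd.DatetimeIndex):
        raise ValueError("X must be indexed by a DatetimeIndex.")
    return X, y


index = pd.date_range("2024-01-01", periods=3, freq="D")
frame = pd.DataFrame(
    {"temperature": [10.0, 12.0, 11.0], "load": [1.0, 2.0, 3.0]}, index=index
)
features, target_series = preprocess_sketch(frame, target="load")
```

In this sketch, features holds only the temperature column and target_series holds the load values.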