Core classes

The wrappers for classification and regression.

class apyxl.XGBClassifierWrapper(scoring='matthews', greater_is_better=True, params_space=None, max_evals=15, cv=5, feature_perturbation='tree_path_dependent', device='cpu', verbose=False, random_state=None)

bar(X=None, shap_values=None, max_display=10, order=shap.Explanation.abs, output=0, title=None, show=True, **kwargs)

Create a bar plot of SHAP values.

Parameters:
  • X (pd.DataFrame, optional) – The feature matrix. Default is None.

  • shap_values (shap.Explanation, optional) – Precomputed SHAP values explanation. Default is None.

  • max_display (int, optional) – Maximum number of features to display in the bar plot. Default is 10.

  • order (callable, optional) – Function to order the features. Default is shap.Explanation.abs.

  • output (int, optional) – The output class for which to plot SHAP values (useful for multiclass classification). Default is 0.

  • title (str, optional) – Title for the plot. Default is None.

  • show (bool, optional) – Whether to display the plot. Default is True.

  • **kwargs – Additional keyword arguments for the SHAP bar plot.

beeswarm(X=None, shap_values=None, max_display=None, order=shap.Explanation.abs.mean(0), output=0, title=None, show=True, **kwargs)

Create a beeswarm plot of SHAP values.

Parameters:
  • X (pd.DataFrame, optional) – The feature matrix for which SHAP values are calculated. Default is None.

  • shap_values (shap.Explanation, optional) – Precomputed SHAP values explanation. Default is None.

  • max_display (int, optional) – Maximum number of features to display in the beeswarm plot. Default is None.

  • order (callable, optional) – Function to order the features. Default is shap.Explanation.abs.mean(0).

  • output (int, optional) – The output class for which to plot SHAP values (useful for multiclass classification). Default is 0.

  • title (str, optional) – Title for the plot. Default is None.

  • show (bool, optional) – Whether to display the plot. Default is True.

  • **kwargs – Additional keyword arguments for the SHAP beeswarm plot.
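The default order, shap.Explanation.abs.mean(0), ranks features by their mean absolute SHAP value across rows. The sketch below reproduces that ranking in plain Python. The helper rank_features and the sample numbers are hypothetical, and treating max_display=None as "keep every feature" is an assumption made for this sketch, not documented wrapper behaviour.

```python
def rank_features(shap_matrix, feature_names, max_display=None):
    """Rank features by mean absolute SHAP value (rows = samples)."""
    n_rows = len(shap_matrix)
    mean_abs = [
        sum(abs(row[j]) for row in shap_matrix) / n_rows
        for j in range(len(feature_names))
    ]
    # Largest mean |SHAP| first, mirroring an abs().mean(0) ordering.
    ranked = sorted(zip(feature_names, mean_abs), key=lambda pair: pair[1], reverse=True)
    # Assumption for this sketch: None means "do not truncate".
    return ranked if max_display is None else ranked[:max_display]


# Hypothetical SHAP values for two samples and three features.
shap_matrix = [
    [0.1, -0.4, 0.00],
    [-0.3, 0.2, 0.05],
]
ranking = rank_features(shap_matrix, ["a", "b", "c"], max_display=2)
```

Because the absolute value is taken before averaging, a feature with large contributions of opposite sign still ranks highly.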

compute_shap_values(X) → Explanation

Get SHAP values for a given dataset.

Parameters:

X (pd.DataFrame) – The feature matrix.

Returns:

SHAP values explanation.

Return type:

shap.Explanation
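As background on what the returned values mean, SHAP values split a prediction into per-feature contributions that sum to the prediction minus a base value. The brute-force sketch below computes exact Shapley values for a toy function against a single baseline row. It only illustrates the additive property: the names exact_shapley and toy_model are hypothetical, and the wrapper's tree-based explainer (see feature_perturbation in the constructor) is expected to use a different, far more efficient algorithm.

```python
from itertools import combinations
from math import factorial


def exact_shapley(predict, x, baseline):
    """Exact Shapley values of `predict` at `x` relative to one baseline row."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their value from x, the rest from baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):  # size of the coalition that excludes feature i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    # One additive term and one interaction term.
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = exact_shapley(toy_model, x, baseline)
base_value = toy_model(baseline)
```

The interaction term z[1] * z[2] is shared equally between the two features involved, which is why their contributions match in this example.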

create_objective(X, y)

Create an objective function for hyperparameter optimization.

Parameters:
  • X (pd.DataFrame) – The feature matrix.

  • y (array-like) – The target values.

Returns:

Objective function for hyperparameter optimization.

Return type:

callable
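The general shape of such an objective can be sketched as a closure that takes a dictionary of hyperparameters, scores them with k-fold cross-validation, and returns a value to minimise. Everything below is an illustrative assumption rather than apyxl's implementation: make_objective, shrunken_mean and mse are hypothetical names, the folds are simple interleaved splits, and the sign flip for greater_is_better is one common convention.

```python
from statistics import mean


def make_objective(X, y, fit_predict, score, cv=5, greater_is_better=True):
    """Build objective(params) -> loss, averaged over `cv` interleaved folds."""
    n = len(X)
    folds = [list(range(i, n, cv)) for i in range(cv)]

    def objective(params):
        scores = []
        for held_out in folds:
            held = set(held_out)
            train = [i for i in range(n) if i not in held]
            preds = fit_predict(
                [X[i] for i in train],
                [y[i] for i in train],
                [X[i] for i in held_out],
                **params,
            )
            scores.append(score([y[i] for i in held_out], preds))
        avg = mean(scores)
        # Minimisers need a loss, so a "higher is better" score is negated.
        return -avg if greater_is_better else avg

    return objective


def shrunken_mean(X_train, y_train, X_test, shrink=1.0):
    # Toy "model": predict a shrunken training mean for every test row.
    centre = shrink * mean(y_train)
    return [centre for _ in X_test]


def mse(y_true, y_pred):
    return mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


X = [[i] for i in range(10)]
y = [float(i + 1) for i in range(10)]
objective = make_objective(X, y, shrunken_mean, mse, cv=5, greater_is_better=False)
loss_full = objective({"shrink": 1.0})
loss_zero = objective({"shrink": 0.0})
```

A search routine would call the objective repeatedly with candidate settings (the constructor's max_evals suggests a fixed evaluation budget) and keep the one with the lowest loss.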

decision(X=None, shap_values=None, output=0, title=None, show=True, **kwargs)

Create a decision plot of SHAP values.

Parameters:
  • X (pd.DataFrame, optional) – The feature matrix. Default is None.

  • shap_values (shap.Explanation, optional) – Precomputed SHAP values explanation. Default is None.

  • output (int, optional) – The output class for which to plot SHAP values (useful for multiclass classification). Default is 0.

  • title (str, optional) – Title for the plot. Default is None.

  • show (bool, optional) – Whether to display the plot. Default is True.

  • **kwargs – Additional keyword arguments for the SHAP decision plot.
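A decision plot is usually read as a path that starts at the base value and adds one feature's SHAP value at a time until it reaches the model output. The sketch below computes such a path for a single row. decision_path is a hypothetical helper, and ordering features from smallest to largest absolute contribution is an assumption chosen for illustration.

```python
from itertools import accumulate


def decision_path(base_value, shap_row):
    """Return the feature order and the cumulative path for one sample."""
    # Assumption: smallest |SHAP| first, so the biggest steps come last.
    order = sorted(range(len(shap_row)), key=lambda i: abs(shap_row[i]))
    path = list(accumulate((shap_row[i] for i in order), initial=base_value))
    return order, path


# Hypothetical base value and SHAP values for three features.
order, path = decision_path(0.5, [0.2, -0.05, 0.1])
```

The last entry of path equals the base value plus the sum of all SHAP values, which is the quantity the plot's end point represents.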

fit(X: DataFrame, y=None, target=None, frac=None, n=None, **params)

Fit the model with optional hyperparameters.

Parameters:
  • X (pd.DataFrame) – The feature matrix.

  • y (pd.Series, optional) – The target values. If not provided, the target column must be specified in ‘target’. Default is None.

  • target (str, optional) – The name of the target column in X. Required if ‘y’ is not provided.

  • frac (float, optional) – Fraction of data to use for fitting. Default is None.

  • n (int, optional) – Number of samples to use for fitting. Default is None.

  • **params – Optional hyperparameters to set for the model.

Returns:

Returns self for method chaining.

Return type:

self

Raises:

AssertionError – If neither ‘y’ nor ‘target’ is provided, or if ‘y’ is None and ‘target’ is not found in X’s columns.
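A possible end-to-end use of fit with the target argument is sketched below, based only on the signatures documented on this page. The function name fit_and_explain and the column name passed to it are hypothetical, and dropping the target column before calling predict and compute_shap_values is an assumption about how the feature matrix should be prepared.

```python
def fit_and_explain(df, target):
    """Fit a classifier on `df` and return the model, predictions and SHAP values."""
    # Imported lazily so this sketch can be loaded without apyxl installed.
    from apyxl import XGBClassifierWrapper

    model = XGBClassifierWrapper(max_evals=15, cv=5, random_state=0)
    # `target` names the label column inside `df`, so `y` is omitted.
    model.fit(df, target=target)

    # Assumption: later calls expect the features only, without the label.
    features = df.drop(columns=[target])
    predictions = model.predict(features)
    shap_values = model.compute_shap_values(features)
    return model, predictions, shap_values
```

Passing a separate y instead would look like model.fit(X, y), in which case target can be left unset.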

force(X=None, shap_values=None, output=0, title=None, show=True, **kwargs)

Create a force plot of SHAP values.

Parameters:
  • X (pd.DataFrame, optional) – The feature matrix. Default is None.

  • shap_values (shap.Explanation, optional) – Precomputed SHAP values explanation. Default is None.

  • output (int, optional) – The output class for which to plot SHAP values (useful for multiclass classification). Default is 0.

  • title (str, optional) – Title for the plot. Default is None.

  • show (bool, optional) – Whether to display the plot. Default is True.

  • **kwargs – Additional keyword arguments for the SHAP force plot.

predict(X) → Series

Make predictions using the fitted model.

Parameters:

X (pd.DataFrame) – The feature matrix for prediction.

Returns:

Predicted values.

Return type:

pd.Series

scatter(X=None, shap_values=None, feature=0, interaction_feature='auto', output=0, title=None, show=True, **kwargs)

Create a dependence plot for a specific feature.

Parameters:
  • X (pd.DataFrame, optional) – The feature matrix. Default is None.

  • shap_values (shap.Explanation, optional) – Precomputed SHAP values explanation. Default is None.

  • feature (int or str, optional) – Index or name of the feature to create the dependence plot for. Default is 0.

  • interaction_feature (int, str, or 'auto', optional) – The feature to color the plot by. If ‘auto’, the method will automatically select a feature to color the plot based on interactions with the chosen ‘feature’. If an integer or string is provided, it specifies the index or name of the interaction feature. Default is ‘auto’.

  • output (int, optional) – The output class for which to plot SHAP values (useful for multiclass classification). Default is 0.

  • title (str, optional) – Title for the plot. Default is None.

  • show (bool, optional) – Whether to display the plot. Default is True.

  • **kwargs – Additional keyword arguments for the SHAP scatter plot.

Notes

  • If interaction_feature='auto', the plot will automatically determine the interaction feature to use for coloring.

  • If interaction_feature is specified (as an index or name), that feature will be used for coloring the scatter plot.

set_scoring(scoring, greater_is_better=False)

Set the scoring metric.

Parameters:
  • scoring (str or callable) – The scoring metric used for evaluation.

  • greater_is_better (bool, optional) – Whether a higher score indicates a better model. Default is False.

Returns:

Returns self for method chaining.

Return type:

self
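The classifier's constructor default, scoring='matthews', presumably refers to the Matthews correlation coefficient, a score where higher is better. A plain-Python sketch for binary labels follows. The function name and the choice to return 0.0 when the denominator is zero are assumptions of this sketch, not a description of how apyxl computes the metric.

```python
from math import sqrt


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels encoded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumption: report 0.0 when a row or column of the confusion matrix is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0


perfect = matthews_corrcoef([1, 1, 0, 0], [1, 1, 0, 0])
inverted = matthews_corrcoef([1, 1, 0, 0], [0, 0, 1, 1])
```

Note that set_scoring defaults to greater_is_better=False while the constructor defaults to True, so it is safer to pass greater_is_better explicitly when switching to a metric where higher is better.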

class apyxl.XGBRegressorWrapper(scoring='r2', greater_is_better=True, params_space=None, max_evals=15, cv=5, feature_perturbation='tree_path_dependent', device='cpu', verbose=False, random_state=None)

bar(X=None, shap_values=None, max_display=10, order=shap.Explanation.abs, output=0, title=None, show=True, **kwargs)

Create a bar plot of SHAP values.

Parameters:
  • X (pd.DataFrame, optional) – The feature matrix. Default is None.

  • shap_values (shap.Explanation, optional) – Precomputed SHAP values explanation. Default is None.

  • max_display (int, optional) – Maximum number of features to display in the bar plot. Default is 10.

  • order (callable, optional) – Function to order the features. Default is shap.Explanation.abs.

  • output (int, optional) – The output class for which to plot SHAP values (useful for multiclass classification). Default is 0.

  • title (str, optional) – Title for the plot. Default is None.

  • show (bool, optional) – Whether to display the plot. Default is True.

  • **kwargs – Additional keyword arguments for the SHAP bar plot.

beeswarm(X=None, shap_values=None, max_display=None, order=shap.Explanation.abs.mean(0), output=0, title=None, show=True, **kwargs)

Create a beeswarm plot of SHAP values.

Parameters:
  • X (pd.DataFrame, optional) – The feature matrix for which SHAP values are calculated. Default is None.

  • shap_values (shap.Explanation, optional) – Precomputed SHAP values explanation. Default is None.

  • max_display (int, optional) – Maximum number of features to display in the beeswarm plot. Default is None.

  • order (callable, optional) – Function to order the features. Default is shap.Explanation.abs.mean(0).

  • output (int, optional) – The output class for which to plot SHAP values (useful for multiclass classification). Default is 0.

  • title (str, optional) – Title for the plot. Default is None.

  • show (bool, optional) – Whether to display the plot. Default is True.

  • **kwargs – Additional keyword arguments for the SHAP beeswarm plot.

compute_shap_values(X) → Explanation

Get SHAP values for a given dataset.

Parameters:

X (pd.DataFrame) – The feature matrix.

Returns:

SHAP values explanation.

Return type:

shap.Explanation

create_objective(X, y)

Create an objective function for hyperparameter optimization.

Parameters:
  • X (pd.DataFrame) – The feature matrix.

  • y (array-like) – The target values.

Returns:

Objective function for hyperparameter optimization.

Return type:

callable

decision(X=None, shap_values=None, output=0, title=None, show=True, **kwargs)

Create a decision plot of SHAP values.

Parameters:
  • X (pd.DataFrame, optional) – The feature matrix. Default is None.

  • shap_values (shap.Explanation, optional) – Precomputed SHAP values explanation. Default is None.

  • output (int, optional) – The output class for which to plot SHAP values (useful for multiclass classification). Default is 0.

  • title (str, optional) – Title for the plot. Default is None.

  • show (bool, optional) – Whether to display the plot. Default is True.

  • **kwargs – Additional keyword arguments for the SHAP decision plot.

fit(X: DataFrame, y=None, target=None, frac=None, n=None, **params)

Fit the model with optional hyperparameters.

Parameters:
  • X (pd.DataFrame) – The feature matrix.

  • y (pd.Series, optional) – The target values. If not provided, the target column must be specified in ‘target’. Default is None.

  • target (str, optional) – The name of the target column in X. Required if ‘y’ is not provided.

  • frac (float, optional) – Fraction of data to use for fitting. Default is None.

  • n (int, optional) – Number of samples to use for fitting. Default is None.

  • **params – Optional hyperparameters to set for the model.

Returns:

Returns self for method chaining.

Return type:

self

Raises:

AssertionError – If neither ‘y’ nor ‘target’ is provided, or if ‘y’ is None and ‘target’ is not found in X’s columns.
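Because both set_scoring and fit return self, the calls can be chained. The sketch below shows one way this might look for the regressor, using only the signatures documented on this page. fit_regressor is a hypothetical name, and passing the features and target as separate X and y arguments is just one of the two documented options.

```python
def fit_regressor(X, y):
    """Configure, fit and apply a regressor in one chained expression."""
    # Imported lazily so this sketch can be loaded without apyxl installed.
    from apyxl import XGBRegressorWrapper

    model = (
        XGBRegressorWrapper(random_state=0)
        # Stated explicitly because set_scoring defaults to greater_is_better=False.
        .set_scoring("r2", greater_is_better=True)
        .fit(X, y)
    )
    return model, model.predict(X)
```

The chain works only because each step returns the wrapper itself; predict ends the chain since it returns a pd.Series.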

force(X=None, shap_values=None, output=0, title=None, show=True, **kwargs)

Create a force plot of SHAP values.

Parameters:
  • X (pd.DataFrame, optional) – The feature matrix. Default is None.

  • shap_values (shap.Explanation, optional) – Precomputed SHAP values explanation. Default is None.

  • output (int, optional) – The output class for which to plot SHAP values (useful for multiclass classification). Default is 0.

  • title (str, optional) – Title for the plot. Default is None.

  • show (bool, optional) – Whether to display the plot. Default is True.

  • **kwargs – Additional keyword arguments for the SHAP force plot.

predict(X) → Series

Make predictions using the fitted model.

Parameters:

X (pd.DataFrame) – The feature matrix for prediction.

Returns:

Predicted values.

Return type:

pd.Series

scatter(X=None, shap_values=None, feature=0, interaction_feature='auto', output=0, title=None, show=True, **kwargs)

Create a dependence plot for a specific feature.

Parameters:
  • X (pd.DataFrame, optional) – The feature matrix. Default is None.

  • shap_values (shap.Explanation, optional) – Precomputed SHAP values explanation. Default is None.

  • feature (int or str, optional) – Index or name of the feature to create the dependence plot for. Default is 0.

  • interaction_feature (int, str, or 'auto', optional) – The feature to color the plot by. If ‘auto’, the method will automatically select a feature to color the plot based on interactions with the chosen ‘feature’. If an integer or string is provided, it specifies the index or name of the interaction feature. Default is ‘auto’.

  • output (int, optional) – The output class for which to plot SHAP values (useful for multiclass classification). Default is 0.

  • title (str, optional) – Title for the plot. Default is None.

  • show (bool, optional) – Whether to display the plot. Default is True.

  • **kwargs – Additional keyword arguments for the SHAP scatter plot.

Notes

  • If interaction_feature='auto', the plot will automatically determine the interaction feature to use for coloring.

  • If interaction_feature is specified (as an index or name), that feature will be used for coloring the scatter plot.

set_scoring(scoring, greater_is_better=False)

Set the scoring metric.

Parameters:
  • scoring (str or callable) – The scoring metric used for evaluation.

  • greater_is_better (bool, optional) – Whether a higher score indicates a better model. Default is False.

Returns:

Returns self for method chaining.

Return type:

self
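The regressor's constructor default, scoring='r2', presumably refers to the coefficient of determination, where higher is better and 1.0 is a perfect fit. A plain-Python sketch follows. The function name and the decision to raise on a constant target are choices made for this sketch, not a description of apyxl's scorer.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        # Assumption: treat a constant target as an error rather than guess a value.
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


perfect = r2_score([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = r2_score([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

Predicting the mean of the targets yields a score of zero, and worse-than-mean predictions yield negative values, so the metric should be paired with greater_is_better=True.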