Warning
Pygbm’s API and default values are likely to change in future versions, without any deprecation cycle.
Gradient Boosting Estimators
Gradient Boosting decision trees for classification and regression.

class pygbm.gradient_boosting.GradientBoostingClassifier(loss='auto', learning_rate=0.1, max_iter=100, max_leaf_nodes=31, max_depth=None, min_samples_leaf=20, l2_regularization=0.0, max_bins=256, scoring='neg_log_loss', validation_split=0.1, n_iter_no_change=5, tol=1e-07, verbose=0, random_state=None)
Scikit-learn compatible Gradient Boosting Tree for classification.
Parameters:  loss ({'auto', 'binary_crossentropy', 'categorical_crossentropy'}, optional(default='auto')) – The loss function to use in the boosting process. ‘binary_crossentropy’ (also known as logistic loss) is used for binary classification and generalizes to ‘categorical_crossentropy’ for multiclass classification. ‘auto’ will automatically choose either loss depending on the nature of the problem.
 learning_rate (float, optional(default=0.1)) – The learning rate, also known as shrinkage. This is used as a multiplicative factor for the leaf values. Use 1 for no shrinkage.
 max_iter (int, optional(default=100)) – The maximum number of iterations of the boosting process, i.e. the maximum number of trees for binary classification. For multiclass classification, n_classes trees per iteration are built.
 max_leaf_nodes (int or None, optional(default=31)) – The maximum number of leaves for each tree. If None, there is no maximum limit.
 max_depth (int or None, optional(default=None)) – The maximum depth of each tree. The depth of a tree is the number of nodes to go from the root to the deepest leaf.
 min_samples_leaf (int, optional(default=20)) – The minimum number of samples per leaf.
 l2_regularization (float, optional(default=0)) – The L2 regularization parameter. Use 0 for no regularization.
 max_bins (int, optional(default=256)) – The maximum number of bins to use. Before training, each feature of the input array X is binned into at most max_bins bins, which allows for a much faster training stage. Features with a small number of unique values may use fewer than max_bins bins. Must be no larger than 256.
 scoring (str or callable or None, optional(default='neg_log_loss')) – Scoring parameter to use for early stopping (see sklearn.metrics for available options). If None, no early stopping is done.
 validation_split (int or float or None, optional(default=0.1)) – Proportion (or absolute size) of training data to set aside as validation data for early stopping. If None, early stopping is done on the whole training data.
 n_iter_no_change (int, optional(default=5)) – Used to determine when to “early stop”. The fitting process is stopped when none of the last n_iter_no_change scores are better than the (n_iter_no_change - 1)-th-to-last one, up to some tolerance.
 tol (float or None, optional(default=1e-7)) – The absolute tolerance to use when comparing scores. The higher the tolerance, the more likely we are to early stop: higher tolerance means that it will be harder for subsequent iterations to be considered an improvement upon the reference score.
 verbose (int, optional(default=0)) – The verbosity level. If not zero, print some information about the fitting process.
 random_state (int, np.random.RandomState instance or None, optional(default=None)) – Pseudo-random number generator to control the subsampling in the binning process, and the train/validation data split if early stopping is enabled. See the scikit-learn glossary.
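
The early-stopping rule described by n_iter_no_change and tol can be sketched in plain Python. This is an illustration only, not pygbm's code: the function name should_stop is hypothetical, and it assumes that higher scores are better and that the reference score is the one recorded immediately before the last n_iter_no_change scores.

```python
def should_stop(scores, n_iter_no_change=5, tol=1e-7):
    """Return True when none of the recent scores beats the reference score.

    Assumptions (not confirmed by the source): higher scores are better, and
    the reference is the score just before the last n_iter_no_change ones.
    """
    reference_position = n_iter_no_change + 1
    if len(scores) < reference_position:
        return False  # not enough history to compare yet
    tol = 0.0 if tol is None else tol
    # A recent score must exceed the reference by more than tol to count.
    reference = scores[-reference_position] + tol
    recent = scores[-n_iter_no_change:]
    return not any(score > reference for score in recent)


# Hypothetical validation scores (e.g. negative log loss per iteration).
improving = [-0.9, -0.7, -0.6, -0.5, -0.45, -0.4, -0.38]
plateau = [-0.9, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5]

print(should_stop(improving))           # still improving: keep training
print(should_stop(plateau))             # no recent improvement: stop
print(should_stop(improving, tol=1.0))  # a large tolerance stops earlier
```

Raising tol raises the bar that recent scores must clear, which is why a higher tolerance makes early stopping more likely.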
Examples
>>> from sklearn.datasets import load_iris
>>> from pygbm import GradientBoostingClassifier
>>> X, y = load_iris(return_X_y=True)
>>> clf = GradientBoostingClassifier().fit(X, y)
>>> clf.score(X, y)
0.97...

fit(X, y)
Fit the gradient boosting model.
Parameters:  X (array-like, shape=(n_samples, n_features)) – The input samples.
 y (array-like, shape=(n_samples,)) – Target values.
Returns: self
Return type: object

get_params(deep=True)
Get parameters for this estimator.
Parameters: deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: mapping of string to any

predict(X)
Predict classes for X.
Parameters: X (array-like, shape=(n_samples, n_features)) – The input samples. If X.dtype == np.uint8, the data is assumed to be pre-binned.
Returns: y – The predicted classes.
Return type: array, shape (n_samples,)

predict_proba(X)
Predict class probabilities for X.
Parameters: X (array-like, shape=(n_samples, n_features)) – The input samples. If X.dtype == np.uint8, the data is assumed to be pre-binned.
Returns: p – The class probabilities of the input samples.
Return type: array, shape (n_samples, n_classes)
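
As a rough illustration of how the outputs of predict_proba and predict relate, the sketch below picks the most probable class for each row. It assumes that the predicted class is the one with the highest probability, which the text above does not state explicitly; the helper name classes_from_probabilities and the sample values are hypothetical.

```python
def classes_from_probabilities(proba, classes):
    """Pick, for each row, the class label with the largest probability."""
    predicted = []
    for row in proba:
        best_index = max(range(len(row)), key=lambda i: row[i])
        predicted.append(classes[best_index])
    return predicted


# Hypothetical probabilities with shape (n_samples, n_classes).
classes = ["setosa", "versicolor", "virginica"]
proba = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
]
labels = classes_from_probabilities(proba, classes)
print(labels)  # ['setosa', 'virginica']
```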

score(X, y, sample_weight=None)
Returns the mean accuracy on the given test data and labels.
In multilabel classification, this is the subset accuracy, which is a harsh metric since it requires that the entire label set of each sample be correctly predicted.
Parameters:  X (array-like, shape = (n_samples, n_features)) – Test samples.
 y (array-like, shape = (n_samples) or (n_samples, n_outputs)) – True labels for X.
 sample_weight (array-like, shape = [n_samples], optional) – Sample weights.
Returns: score – Mean accuracy of self.predict(X) w.r.t. y.
Return type: float
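
The mean accuracy returned by score can be reproduced by hand, which helps when checking results. The sketch below is a plain-Python illustration for single-label targets; the function name mean_accuracy and the sample data are hypothetical, and sample weights are assumed to act as per-sample multipliers.

```python
def mean_accuracy(y_true, y_pred, sample_weight=None):
    """Weighted fraction of samples whose predicted label matches the true one."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)
    correct = sum(
        weight
        for truth, pred, weight in zip(y_true, y_pred, sample_weight)
        if truth == pred
    )
    return correct / sum(sample_weight)


y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]
print(mean_accuracy(y_true, y_pred))  # 3 of 4 correct
print(mean_accuracy(y_true, y_pred, sample_weight=[1, 1, 2, 1]))  # the miss counts double
```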

set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it is possible to update each component of a nested object.
Returns: self
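
The <component>__<parameter> naming convention can be illustrated with a small stand-alone sketch. This is not scikit-learn's implementation: the helper split_nested_params and the component name clf are hypothetical, and the code only shows how a double underscore separates the component from the parameter it owns.

```python
def split_nested_params(params):
    """Group keyword arguments by component, using '__' as the separator."""
    own, nested = {}, {}
    for key, value in params.items():
        component, separator, sub_key = key.partition("__")
        if separator:
            # 'clf__max_iter' targets parameter 'max_iter' of component 'clf'.
            nested.setdefault(component, {})[sub_key] = value
        else:
            own[key] = value
    return own, nested


own, nested = split_nested_params(
    {"verbose": 1, "clf__learning_rate": 0.05, "clf__max_iter": 200}
)
print(own)     # {'verbose': 1}
print(nested)  # {'clf': {'learning_rate': 0.05, 'max_iter': 200}}
```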

class pygbm.gradient_boosting.GradientBoostingRegressor(loss='least_squares', learning_rate=0.1, max_iter=100, max_leaf_nodes=31, max_depth=None, min_samples_leaf=20, l2_regularization=0.0, max_bins=256, scoring='neg_mean_squared_error', validation_split=0.1, n_iter_no_change=5, tol=1e-07, verbose=0, random_state=None)
Scikit-learn compatible Gradient Boosting Tree for regression.
Parameters:  loss ({'least_squares'}, optional(default='least_squares')) – The loss function to use in the boosting process.
 learning_rate (float, optional(default=0.1)) – The learning rate, also known as shrinkage. This is used as a multiplicative factor for the leaf values. Use 1 for no shrinkage.
 max_iter (int, optional(default=100)) – The maximum number of iterations of the boosting process, i.e. the maximum number of trees.
 max_leaf_nodes (int or None, optional(default=31)) – The maximum number of leaves for each tree. If None, there is no maximum limit.
 max_depth (int or None, optional(default=None)) – The maximum depth of each tree. The depth of a tree is the number of nodes to go from the root to the deepest leaf.
 min_samples_leaf (int, optional(default=20)) – The minimum number of samples per leaf.
 l2_regularization (float, optional(default=0)) – The L2 regularization parameter. Use 0 for no regularization.
 max_bins (int, optional(default=256)) – The maximum number of bins to use. Before training, each feature of the input array X is binned into at most max_bins bins, which allows for a much faster training stage. Features with a small number of unique values may use fewer than max_bins bins. Must be no larger than 256.
 scoring (str or callable or None, optional(default='neg_mean_squared_error')) – Scoring parameter to use for early stopping (see sklearn.metrics for available options). If None, no early stopping is done.
 validation_split (int or float or None, optional(default=0.1)) – Proportion (or absolute size) of training data to set aside as validation data for early stopping. If None, early stopping is done on the whole training data.
 n_iter_no_change (int, optional(default=5)) – Used to determine when to “early stop”. The fitting process is stopped when none of the last n_iter_no_change scores are better than the (n_iter_no_change - 1)-th-to-last one, up to some tolerance.
 tol (float or None, optional(default=1e-7)) – The absolute tolerance to use when comparing scores. The higher the tolerance, the more likely we are to early stop: higher tolerance means that it will be harder for subsequent iterations to be considered an improvement upon the reference score.
 verbose (int, optional (default=0)) – The verbosity level. If not zero, print some information about the fitting process.
 random_state (int, np.random.RandomState instance or None, optional(default=None)) – Pseudo-random number generator to control the subsampling in the binning process, and the train/validation data split if early stopping is enabled. See the scikit-learn glossary.
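
The binning step controlled by max_bins can be sketched in plain Python. This is a simplified illustration under stated assumptions, not pygbm's code: the helpers find_thresholds and bin_feature are hypothetical, features with few unique values are assumed to get one bin per value, and other features are assumed to be cut at evenly spaced quantiles.

```python
from bisect import bisect_right


def find_thresholds(values, max_bins=256):
    """Return sorted cut points that split a feature into at most max_bins bins."""
    distinct = sorted(set(values))
    if len(distinct) <= max_bins:
        # Few unique values: one bin per value, cut at the midpoints.
        return [(low + high) / 2 for low, high in zip(distinct, distinct[1:])]
    # Many unique values: cut at evenly spaced quantiles of the sorted data.
    ordered = sorted(values)
    n = len(ordered)
    cuts = [ordered[(i * n) // max_bins] for i in range(1, max_bins)]
    return sorted(set(cuts))


def bin_feature(values, thresholds):
    """Map each raw value to the index of the bin it falls into."""
    return [bisect_right(thresholds, value) for value in values]


small = [0.5, 1.5, 0.5, 3.0]  # three unique values, so only three bins
small_binned = bin_feature(small, find_thresholds(small))
print(small_binned)  # [0, 1, 0, 2]

large = list(range(1000))  # many unique values, capped at max_bins bins
large_binned = bin_feature(large, find_thresholds(large, max_bins=4))
print(sorted(set(large_binned)))  # [0, 1, 2, 3]
```

Because max_bins cannot exceed 256, every bin index fits in an unsigned 8-bit integer.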
Examples
>>> # load_boston was removed in scikit-learn 1.2; this example requires an older release
>>> from sklearn.datasets import load_boston
>>> from pygbm import GradientBoostingRegressor
>>> X, y = load_boston(return_X_y=True)
>>> est = GradientBoostingRegressor().fit(X, y)
>>> est.score(X, y)
0.92...

fit(X, y)
Fit the gradient boosting model.
Parameters:  X (array-like, shape=(n_samples, n_features)) – The input samples.
 y (array-like, shape=(n_samples,)) – Target values.
Returns: self
Return type: object
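
The effect of learning_rate on a fitted model can be illustrated with a toy calculation. The sketch below assumes an additive model in which each tree contributes its leaf value scaled by the learning rate; the function boosted_prediction, the baseline and the leaf values are hypothetical and not taken from pygbm.

```python
def boosted_prediction(baseline, leaf_values, learning_rate=0.1):
    """Add the shrunken leaf value of each tree to a baseline prediction."""
    prediction = baseline
    for value in leaf_values:
        # Shrinkage: each tree only contributes a fraction of its leaf value.
        prediction += learning_rate * value
    return prediction


# Hypothetical leaf values reached by one sample in three successive trees.
leaf_values = [4.0, 2.0, 1.0]
shrunk = boosted_prediction(10.0, leaf_values, learning_rate=0.1)
unshrunk = boosted_prediction(10.0, leaf_values, learning_rate=1.0)
print(shrunk, unshrunk)
```

With learning_rate=1 the leaf values are added unchanged, which corresponds to no shrinkage.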

get_params(deep=True)
Get parameters for this estimator.
Parameters: deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: mapping of string to any

predict(X)
Predict values for X.
Parameters: X (array-like, shape=(n_samples, n_features)) – The input samples. If X.dtype == np.uint8, the data is assumed to be pre-binned.
Returns: y – The predicted values.
Return type: array, shape (n_samples,)

score(X, y, sample_weight=None)
Returns the coefficient of determination R^2 of the prediction.
The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.
Parameters:  X (array-like, shape = (n_samples, n_features)) – Test samples. For some estimators this may be a precomputed kernel matrix instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
 y (array-like, shape = (n_samples) or (n_samples, n_outputs)) – True values for X.
 sample_weight (array-like, shape = [n_samples], optional) – Sample weights.
Returns: score – R^2 of self.predict(X) w.r.t. y.
Return type: float
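
The definition of R^2 given above translates directly into a few lines of plain Python. The sketch below is an unweighted illustration only; the function name r2 and the sample values are hypothetical.

```python
def r2(y_true, y_pred):
    """Coefficient of determination, computed as 1 - u/v."""
    mean = sum(y_true) / len(y_true)
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    v = sum((t - mean) ** 2 for t in y_true)               # total sum of squares
    return 1 - u / v


y_true = [1.0, 2.0, 3.0, 4.0]
print(r2(y_true, [1.0, 2.0, 3.0, 4.0]))  # 1.0: perfect predictions
print(r2(y_true, [2.5, 2.5, 2.5, 2.5]))  # 0.0: always predicting the mean
print(r2(y_true, [4.0, 3.0, 2.0, 1.0]))  # -3.0: worse than the constant model
```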

set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it is possible to update each component of a nested object.
Returns: self