
Spline Flow Regression


Normalizing flows transform a simple distribution into a complex data distribution through a series of invertible transformations.

Figure: a normalizing flow transforms a simple base density into a complex data density through a chain of invertible maps (image source: https://tikz.net/janosh/normalizing-flow.png).

The key steps involved in the operation of normalizing flows are as follows (from left to right):

  • Start with a simple, easy-to-sample distribution, usually a Gaussian, which serves as the "base" distribution
  • Apply a series of invertible transformations to map the samples from the base distribution to the desired complex data distribution
  • Each transformation in the flow must be invertible, with a forward pass (mapping samples from the base distribution to the complex distribution) and an inverse pass (mapping samples from the complex distribution back to the base distribution)
  • The flow ensures that the probability density function (PDF) of the complex distribution can be analytically calculated using the determinant of the Jacobian matrix resulting from the transformations
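
Concretely, this is the standard change-of-variables identity: for an invertible transformation $T$ with $x = T(z)$ and base density $p_Z$,

$$p_X(x) = p_Z\left(T^{-1}(x)\right)\,\left|\det J_{T^{-1}}(x)\right|,$$

where $J_{T^{-1}}$ denotes the Jacobian of the inverse transformation.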

By stacking multiple transformations in a sequence, normalizing flows can model complex and multi-modal distributions while providing the ability to compute the likelihood of the data and perform efficient sampling in both directions (from base to complex and vice versa). However, it is important to note that since XGBoostLSS is based on a one vs. all estimation strategy, where a separate tree is grown for each parameter, estimating many parameters for a large dataset can become computationally expensive. For more details, we refer to our related paper Alexander März and Thomas Kneib (2022): Distributional Gradient Boosting Machines.
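
For a flow composed of $K$ transformations, $x = (f_K \circ \cdots \circ f_1)(z_0)$ with intermediate values $z_k = f_k(z_{k-1})$, the log-likelihood accumulates one Jacobian term per step:

$$\log p_X(x) = \log p_Z(z_0) - \sum_{k=1}^{K} \log\left|\det J_{f_k}(z_{k-1})\right|.$$

Each $f_k$ can be simple, yet the composition is highly flexible while the likelihood remains tractable.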

Imports¶

In [1]:
from xgboostlss.model import *                     # the wildcard imports also make xgb, np, pd and torch available below
from xgboostlss.distributions.SplineFlow import *
from xgboostlss.distributions.flow_utils import NormalizingFlowClass
from xgboostlss.datasets.data_loader import load_simulated_gaussian_data
from scipy.stats import norm

import multiprocessing
import plotnine
from plotnine import *
plotnine.options.figure_size = (20, 10)

Data

In [2]:
# The data is simulated as a Gaussian, where x is the only true feature and all others are noise variables:
#   loc   = 10
#   scale = 1 + 4 * ((0.3 < x) & (x < 0.5)) + 2 * (x > 0.7)

train, test = load_simulated_gaussian_data()
n_cpu = multiprocessing.cpu_count()

X_train, y_train = train.filter(regex="x"), train["y"].values
X_test, y_test = test.filter(regex="x"), test["y"].values

dtrain = xgb.DMatrix(X_train, label=y_train, nthread=n_cpu)
dtest = xgb.DMatrix(X_test, nthread=n_cpu)

Select Normalizing Flow

In the following, we specify a list of candidate normalizing flows. The function flow_select returns the negative log-likelihood of each specification; the normalizing flow with the lowest negative log-likelihood is selected. The function also plots the density of the target variable together with the fitted density of the most suitable of the specified normalizing flows. Note, however, that choosing the best-performing flow based solely on training data may lead to overfitting, since normalizing flows carry a higher risk of overfitting than parametric distributions. When using normalizing flows, it is crucial to select specifications that strike a balance between model complexity and generalization ability; a simple stability check is sketched below, after the selection results.

In [8]:
# See ?SplineFlow for an overview.
bound = np.max([np.abs(y_train.min()), y_train.max()])
target_support = "real"

candidate_flows = [

    SplineFlow(target_support=target_support, count_bins=2,  bound=bound, order="linear"),
    SplineFlow(target_support=target_support, count_bins=4,  bound=bound, order="linear"),
    SplineFlow(target_support=target_support, count_bins=6,  bound=bound, order="linear"),
    SplineFlow(target_support=target_support, count_bins=8,  bound=bound, order="linear"),
    SplineFlow(target_support=target_support, count_bins=12, bound=bound, order="linear"),
    SplineFlow(target_support=target_support, count_bins=16, bound=bound, order="linear"),
    SplineFlow(target_support=target_support, count_bins=20, bound=bound, order="linear"),

    SplineFlow(target_support=target_support, count_bins=2,  bound=bound, order="quadratic"),
    SplineFlow(target_support=target_support, count_bins=4,  bound=bound, order="quadratic"),
    SplineFlow(target_support=target_support, count_bins=6,  bound=bound, order="quadratic"),
    SplineFlow(target_support=target_support, count_bins=8,  bound=bound, order="quadratic"),
    SplineFlow(target_support=target_support, count_bins=12, bound=bound, order="quadratic"),
    SplineFlow(target_support=target_support, count_bins=16, bound=bound, order="quadratic"),
    SplineFlow(target_support=target_support, count_bins=20, bound=bound, order="quadratic"),
    
] 

flow_nll = NormalizingFlowClass().flow_select(target=y_train, candidate_flows=candidate_flows, max_iter=50, plot=True, figure_size=(12, 5))
flow_nll
Fitting of candidate normalizing flows completed: 100%|██████████| 14/14 [00:51<00:00, 3.70s/it]
Out[8]:
rank  nll  NormFlow
1 16595.917006 SplineFlow(count_bins: 20, order: linear)
2 16608.693807 SplineFlow(count_bins: 12, order: quadratic)
3 16622.862265 SplineFlow(count_bins: 16, order: quadratic)
4 16640.156074 SplineFlow(count_bins: 6, order: linear)
5 16640.611035 SplineFlow(count_bins: 16, order: linear)
6 16649.404709 SplineFlow(count_bins: 8, order: linear)
7 16651.375456 SplineFlow(count_bins: 8, order: quadratic)
8 16653.378393 SplineFlow(count_bins: 6, order: quadratic)
9 16674.331780 SplineFlow(count_bins: 12, order: linear)
10 16822.629927 SplineFlow(count_bins: 4, order: quadratic)
11 16902.398862 SplineFlow(count_bins: 20, order: quadratic)
12 17538.588405 SplineFlow(count_bins: 4, order: linear)
13 17692.968508 SplineFlow(count_bins: 2, order: linear)
14 17737.569055 SplineFlow(count_bins: 2, order: quadratic)
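
As a rough guard against the overfitting caveat above, one can check whether the ranking is stable across random halves of the target. The following is only a sketch: the splitting logic is ours, it reuses candidate_flows from the cell above, and it assumes flow_select accepts plot=False.

rng = np.random.default_rng(123)
idx = rng.permutation(len(y_train))
half1, half2 = y_train[idx[: len(idx) // 2]], y_train[idx[len(idx) // 2 :]]

# If the same specifications rank highly on both halves, the selection is
# less likely to be an artifact of fitting noise in the training data.
nll_half1 = NormalizingFlowClass().flow_select(target=half1, candidate_flows=candidate_flows, max_iter=50, plot=False)
nll_half2 = NormalizingFlowClass().flow_select(target=half2, candidate_flows=candidate_flows, max_iter=50, plot=False)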

Normalizing Flow Specification

Even though SplineFlow(count_bins: 20, order: linear) shows the best fit to the data, we choose a more parsimonious specification (recall that a separate tree is grown for each parameter):

  • for count_bins=20, we need to estimate 3*count_bins + (count_bins-1) = 79 parameters
  • for count_bins=8, we need to estimate 3*count_bins + (count_bins-1) = 31 parameters
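
The arithmetic is easy to verify (a minimal sketch; the helper name is ours, the formula is the one stated above):

def n_spline_params(count_bins: int) -> int:
    # 3 * count_bins bin parameters plus (count_bins - 1) derivative parameters,
    # per the formula stated above
    return 3 * count_bins + (count_bins - 1)

print(n_spline_params(20))  # 79
print(n_spline_params(8))   # 31
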
In [4]:
# Specifies Spline-Flow. See ?SplineFlow for an overview.
bound = np.max([np.abs(y_train.min()), y_train.max()])

xgblss = XGBoostLSS(
    SplineFlow(target_support="real",     # Specifies the support of the target. Options are "real", "positive", "positive_integer" or "unit_interval"
               count_bins=8,              # The number of segments comprising the spline.
               bound=bound,               # By adjusting the value, you can control the size of the bounding box and consequently control the range of inputs that the spline transform operates on.
               order="linear",            # The order of the spline. Options are "linear" or "quadratic".
               stabilization="None",      # Options are "None", "MAD" or "L2".
               loss_fn="nll"              # Loss function. Options are "nll" (negative log-likelihood) or "crps"(continuous ranked probability score).
            )    
)

Hyper-Parameter Optimization

Any XGBoost hyperparameter can be tuned; the parameter dictionary needs to be structured as follows:

- Float/Int sample_type
    - {"param_name": ["sample_type", {"low": low, "high": high, "log": log}]}
        - sample_type: str, Type of sampling, e.g., "float" or "int"
        - low: int or float, Lower endpoint of the range of suggested values
        - high: int or float, Upper endpoint of the range of suggested values
        - log: bool, Flag to sample the value from the log domain or not
    - Example: {"eta": ["float", {"low": 1e-5, "high": 1, "log": True}]}

- Categorical sample_type
    - {"param_name": ["sample_type", ["choice1", "choice2", "choice3", "..."]]}
        - sample_type: str, here "categorical"
        - choice1, choice2, choice3, ...: str, Possible choices for the parameter
    - Example: {"booster": ["categorical", ["gbtree", "dart"]]}

- For parameters without a tunable choice (needed, e.g., if tree_method = "gpu_hist" and gpu_id must be specified)
    - {"param_name": ["none", [value]]}
        - param_name: str, Name of the parameter
        - value: Value of the parameter
    - Example: {"gpu_id": ["none", [0]]}

Depending on which parameters are optimized, some of them may end up unused: e.g., when {"booster": ["categorical", ["gbtree", "gblinear"]]} and {"max_depth": ["int", {"low": 1, "high": 10, "log": False}]} are specified, max_depth is ignored whenever gblinear is sampled, since gblinear has no such argument.

In [5]:
param_dict = {
    "eta":              ["float", {"low": 1e-5,   "high": 1,     "log": True}],
    "max_depth":        ["int",   {"low": 1,      "high": 10,    "log": False}],
    "gamma":            ["float", {"low": 1e-8,   "high": 40,    "log": True}],
    "subsample":        ["float", {"low": 0.2,    "high": 1.0,   "log": False}],
    "colsample_bytree": ["float", {"low": 0.2,    "high": 1.0,   "log": False}],
    "min_child_weight": ["float", {"low": 1e-8,   "high": 500,   "log": True}],
    "booster":          ["categorical", ["gbtree"]]
}

np.random.seed(123)
opt_param = xgblss.hyper_opt(param_dict,
                             dtrain,
                             num_boost_round=100,        # Number of boosting iterations.
                             nfold=5,                    # Number of cv-folds.
                             early_stopping_rounds=20,   # Number of early-stopping rounds
                             max_minutes=1000,           # Time budget in minutes, i.e., stop study after the given number of minutes.
                             n_trials=100,               # The number of trials. If this argument is set to None, there is no limitation on the number of trials.
                             silence=False,              # Controls the verbosity of each trial, i.e., the user can silence its output.
                             seed=123,                   # Seed used to generate cv-folds.
                             hp_seed=None                # Seed for random number generator used in the Bayesian hyperparameter search.
                            )
param_dict = { "eta": ["float", {"low": 1e-5, "high": 1, "log": True}], "max_depth": ["int", {"low": 1, "high": 10, "log": False}], "gamma": ["float", {"low": 1e-8, "high": 40, "log": True}], "subsample": ["float", {"low": 0.2, "high": 1.0, "log": False}], "colsample_bytree": ["float", {"low": 0.2, "high": 1.0, "log": False}], "min_child_weight": ["float", {"low": 1e-8, "high": 500, "log": True}], "booster": ["categorical", ["gbtree"]] } np.random.seed(123) opt_param = xgblss.hyper_opt(param_dict, dtrain, num_boost_round=100, # Number of boosting iterations. nfold=5, # Number of cv-folds. early_stopping_rounds=20, # Number of early-stopping rounds max_minutes=1000, # Time budget in minutes, i.e., stop study after the given number of minutes. n_trials=100, # The number of trials. If this argument is set to None, there is no limitation on the number of trials. silence=False, # Controls the verbosity of the trail, i.e., user can silence the outputs of the trail. seed=123, # Seed used to generate cv-folds. hp_seed=None # Seed for random number generator used in the Bayesian hyperparameter search. )
[I 2023-07-20 15:21:25,613] A new study created in memory with name: XGBoostLSS Hyper-Parameter Optimization
  0%|          | 0/100 [00:00<?, ?it/s]
[I 2023-07-20 15:22:30,946] Trial 0 finished with value: 6255.344043 and parameters: {'eta': 0.03692566115300059, 'max_depth': 4, 'gamma': 7.887574463976319e-08, 'subsample': 0.9144017805889704, 'colsample_bytree': 0.29974692606946474, 'min_child_weight': 2.1274056292372507e-07, 'booster': 'gbtree'}. Best is trial 0 with value: 6255.344043.
[I 2023-07-20 15:22:44,079] Trial 1 finished with value: 16614.029491999998 and parameters: {'eta': 0.0017154637091307261, 'max_depth': 7, 'gamma': 8.23346450464663e-07, 'subsample': 0.5893614292128422, 'colsample_bytree': 0.22046879379684717, 'min_child_weight': 1.0217730512063771e-06, 'booster': 'gbtree'}. Best is trial 0 with value: 6255.344043.
[I 2023-07-20 15:22:56,891] Trial 2 finished with value: 3504.441992 and parameters: {'eta': 2.7751871589547672e-05, 'max_depth': 6, 'gamma': 3.02777318294296e-07, 'subsample': 0.26524245101484456, 'colsample_bytree': 0.27746635000321973, 'min_child_weight': 6.426034557726815e-05, 'booster': 'gbtree'}. Best is trial 2 with value: 3504.441992.
[I 2023-07-20 15:24:03,745] Trial 3 finished with value: 3070.370508 and parameters: {'eta': 0.08292199572889684, 'max_depth': 5, 'gamma': 1.5182628894900233e-06, 'subsample': 0.9403786460943477, 'colsample_bytree': 0.34051787348973905, 'min_child_weight': 0.2522861634643369, 'booster': 'gbtree'}. Best is trial 3 with value: 3070.370508.
[I 2023-07-20 15:25:04,564] Trial 4 finished with value: 3330.1626466000002 and parameters: {'eta': 0.0001509832299785431, 'max_depth': 7, 'gamma': 2.756531681117656e-06, 'subsample': 0.45360605857266495, 'colsample_bytree': 0.7204182240441368, 'min_child_weight': 30.680516520620287, 'booster': 'gbtree'}. Best is trial 3 with value: 3070.370508.
[I 2023-07-20 15:25:18,187] Trial 5 finished with value: 379086.15 and parameters: {'eta': 0.830518425703511, 'max_depth': 3, 'gamma': 0.012299985715140278, 'subsample': 0.9711870084155301, 'colsample_bytree': 0.24611711490622082, 'min_child_weight': 0.0005629880692600968, 'booster': 'gbtree'}. Best is trial 3 with value: 3070.370508.
[I 2023-07-20 15:25:31,891] Trial 6 finished with value: 3498.6242188000006 and parameters: {'eta': 0.0003611895880336918, 'max_depth': 7, 'gamma': 0.03786187454142862, 'subsample': 0.5124745199562726, 'colsample_bytree': 0.3551315906394943, 'min_child_weight': 0.00940156145963303, 'booster': 'gbtree'}. Best is trial 3 with value: 3070.370508.
[I 2023-07-20 15:25:46,531] Trial 7 finished with value: 5983.7352538000005 and parameters: {'eta': 4.496946572176012e-05, 'max_depth': 7, 'gamma': 0.0047194293358602145, 'subsample': 0.9344902260740249, 'colsample_bytree': 0.948864562594103, 'min_child_weight': 0.0005306028753515352, 'booster': 'gbtree'}. Best is trial 3 with value: 3070.370508.
[I 2023-07-20 15:26:02,781] Trial 8 finished with value: 111347.4359374 and parameters: {'eta': 0.003469024418224711, 'max_depth': 9, 'gamma': 2.3721946640811852e-07, 'subsample': 0.971286805911628, 'colsample_bytree': 0.6956479802596798, 'min_child_weight': 1.745492550550714e-08, 'booster': 'gbtree'}. Best is trial 3 with value: 3070.370508.
[I 2023-07-20 15:26:17,780] Trial 9 finished with value: 3242.7606932 and parameters: {'eta': 0.24801010294899173, 'max_depth': 1, 'gamma': 0.607096851031229, 'subsample': 0.31815420752411927, 'colsample_bytree': 0.8599030202986522, 'min_child_weight': 0.03278105788018955, 'booster': 'gbtree'}. Best is trial 3 with value: 3070.370508.
[I 2023-07-20 15:26:30,849] Trial 10 finished with value: 3332.3609862 and parameters: {'eta': 0.032439573574320306, 'max_depth': 10, 'gamma': 6.856157934141567e-05, 'subsample': 0.753463330634383, 'colsample_bytree': 0.4473388149594849, 'min_child_weight': 264.48982201499797, 'booster': 'gbtree'}. Best is trial 3 with value: 3070.370508.
[I 2023-07-20 15:26:43,690] Trial 11 finished with value: 3259.6298828000004 and parameters: {'eta': 0.5844375923601339, 'max_depth': 1, 'gamma': 14.125852834419698, 'subsample': 0.24445640866266827, 'colsample_bytree': 0.5042856805135316, 'min_child_weight': 0.16785137409425205, 'booster': 'gbtree'}. Best is trial 3 with value: 3070.370508.
[I 2023-07-20 15:27:03,532] Trial 12 finished with value: 3222.1685058 and parameters: {'eta': 0.10743432797751593, 'max_depth': 1, 'gamma': 1.336930452029465e-08, 'subsample': 0.7794937123701792, 'colsample_bytree': 0.9870833686249794, 'min_child_weight': 0.535865544343811, 'booster': 'gbtree'}. Best is trial 3 with value: 3070.370508.
[I 2023-07-20 15:27:29,687] Trial 13 finished with value: 3039.6305176 and parameters: {'eta': 0.1313966761766938, 'max_depth': 3, 'gamma': 1.2885824737841825e-08, 'subsample': 0.7811960478381322, 'colsample_bytree': 0.9910806136654325, 'min_child_weight': 0.6798696113137718, 'booster': 'gbtree'}. Best is trial 13 with value: 3039.6305176.
[I 2023-07-20 15:28:34,025] Trial 14 finished with value: 3091.93291 and parameters: {'eta': 0.02486308401269763, 'max_depth': 4, 'gamma': 2.4223562057848944e-05, 'subsample': 0.7794301516088603, 'colsample_bytree': 0.596378998525593, 'min_child_weight': 4.149494288988884, 'booster': 'gbtree'}. Best is trial 13 with value: 3039.6305176.
[I 2023-07-20 15:29:26,465] Trial 15 finished with value: 3062.4055174 and parameters: {'eta': 0.12661561026668358, 'max_depth': 3, 'gamma': 1.2654967651703994e-08, 'subsample': 0.8592553385029919, 'colsample_bytree': 0.4007346733446969, 'min_child_weight': 0.5854364054560578, 'booster': 'gbtree'}. Best is trial 13 with value: 3039.6305176.
[I 2023-07-20 15:29:40,670] Trial 16 pruned. Trial was pruned at iteration 22.
[I 2023-07-20 15:29:53,687] Trial 17 finished with value: 3328.545508 and parameters: {'eta': 0.13980158809194015, 'max_depth': 3, 'gamma': 1.5215938995345315e-08, 'subsample': 0.8490492736409923, 'colsample_bytree': 0.45639549413029157, 'min_child_weight': 275.8648934425574, 'booster': 'gbtree'}. Best is trial 13 with value: 3039.6305176.
[I 2023-07-20 15:30:06,776] Trial 18 pruned. Trial was pruned at iteration 20.
[I 2023-07-20 15:30:24,686] Trial 19 finished with value: 3056.6981444 and parameters: {'eta': 0.6544509882244208, 'max_depth': 5, 'gamma': 0.00021163507207378848, 'subsample': 0.869732698237011, 'colsample_bytree': 0.900974180793565, 'min_child_weight': 2.7817029183422135, 'booster': 'gbtree'}. Best is trial 13 with value: 3039.6305176.
[I 2023-07-20 15:30:40,061] Trial 20 finished with value: 3058.3840821999997 and parameters: {'eta': 0.7634529963570388, 'max_depth': 5, 'gamma': 0.0003869011067576662, 'subsample': 0.8535815300763139, 'colsample_bytree': 0.9049612798090937, 'min_child_weight': 7.127603761119032, 'booster': 'gbtree'}. Best is trial 13 with value: 3039.6305176.
[I 2023-07-20 15:30:56,014] Trial 21 finished with value: 3044.4860838 and parameters: {'eta': 0.8750118182115282, 'max_depth': 5, 'gamma': 0.00025296042041120204, 'subsample': 0.8638909186170046, 'colsample_bytree': 0.9261561699617321, 'min_child_weight': 13.307993350012621, 'booster': 'gbtree'}. Best is trial 13 with value: 3039.6305176.
[I 2023-07-20 15:31:11,403] Trial 22 finished with value: 3041.409082 and parameters: {'eta': 0.8678687545966929, 'max_depth': 6, 'gamma': 0.0003034358805344321, 'subsample': 0.9994750305856172, 'colsample_bytree': 0.9364531239253467, 'min_child_weight': 56.353446165160015, 'booster': 'gbtree'}. Best is trial 13 with value: 3039.6305176.
[I 2023-07-20 15:31:28,256] Trial 23 finished with value: 3045.8405272 and parameters: {'eta': 0.3091862911564756, 'max_depth': 6, 'gamma': 0.0008102771849499481, 'subsample': 0.992327967436072, 'colsample_bytree': 0.9538912185657742, 'min_child_weight': 56.324078030880365, 'booster': 'gbtree'}. Best is trial 13 with value: 3039.6305176.
[I 2023-07-20 15:31:40,479] Trial 24 finished with value: 3337.1875489999998 and parameters: {'eta': 0.2648285825517022, 'max_depth': 8, 'gamma': 4.72524211489433e-06, 'subsample': 0.8070969086041984, 'colsample_bytree': 0.9883810337379536, 'min_child_weight': 422.58538326701074, 'booster': 'gbtree'}. Best is trial 13 with value: 3039.6305176.
[I 2023-07-20 15:32:23,836] Trial 25 finished with value: 3049.2964354 and parameters: {'eta': 0.06430418453639147, 'max_depth': 4, 'gamma': 4.88444011961088e-05, 'subsample': 0.999397426898831, 'colsample_bytree': 0.8029295391032494, 'min_child_weight': 38.633122585967484, 'booster': 'gbtree'}. Best is trial 13 with value: 3039.6305176.
[I 2023-07-20 15:32:38,187] Trial 26 finished with value: 3054.6845702 and parameters: {'eta': 0.8614115024975222, 'max_depth': 6, 'gamma': 8.28703977696704e-06, 'subsample': 0.9044185260330364, 'colsample_bytree': 0.8032378041387337, 'min_child_weight': 22.700459833842594, 'booster': 'gbtree'}. Best is trial 13 with value: 3039.6305176.
[I 2023-07-20 15:33:01,784] Trial 27 finished with value: 3068.4505369999997 and parameters: {'eta': 0.29355992688035293, 'max_depth': 2, 'gamma': 0.00014032641956213097, 'subsample': 0.8180607845945178, 'colsample_bytree': 0.9958248821107969, 'min_child_weight': 0.056030271157428396, 'booster': 'gbtree'}. Best is trial 13 with value: 3039.6305176.
[I 2023-07-20 15:33:16,971] Trial 28 finished with value: 3042.01167 and parameters: {'eta': 0.9489949657070117, 'max_depth': 4, 'gamma': 0.0011531286846959436, 'subsample': 0.8974919764569149, 'colsample_bytree': 0.9200095124201007, 'min_child_weight': 1.2080542972828012, 'booster': 'gbtree'}. Best is trial 13 with value: 3039.6305176.
[I 2023-07-20 15:33:32,518] Trial 29 pruned. Trial was pruned at iteration 21.
[I 2023-07-20 15:33:46,149] Trial 30 pruned. Trial was pruned at iteration 21.
[I 2023-07-20 15:33:59,322] Trial 31 finished with value: 3043.5458494 and parameters: {'eta': 0.9957731824745992, 'max_depth': 5, 'gamma': 0.0012004496934510645, 'subsample': 0.8822741327587181, 'colsample_bytree': 0.9453127201123616, 'min_child_weight': 10.535252278624212, 'booster': 'gbtree'}. Best is trial 13 with value: 3039.6305176.
[I 2023-07-20 15:34:16,372] Trial 32 finished with value: 3037.6239746 and parameters: {'eta': 0.3759370865943818, 'max_depth': 4, 'gamma': 0.002280606215198254, 'subsample': 0.9310445013233254, 'colsample_bytree': 0.9993191393162827, 'min_child_weight': 1.351438052883453, 'booster': 'gbtree'}. Best is trial 32 with value: 3037.6239746.
[I 2023-07-20 15:34:33,533] Trial 33 finished with value: 3040.1080566 and parameters: {'eta': 0.38934599890918614, 'max_depth': 4, 'gamma': 8.296441273857323e-07, 'subsample': 0.9515796871722452, 'colsample_bytree': 0.9997532130387258, 'min_child_weight': 0.43235469432563967, 'booster': 'gbtree'}. Best is trial 32 with value: 3037.6239746.
[I 2023-07-20 15:34:55,933] Trial 34 finished with value: 3039.5586914 and parameters: {'eta': 0.18115338358393857, 'max_depth': 4, 'gamma': 6.166305075098885e-07, 'subsample': 0.945716328699108, 'colsample_bytree': 0.9865423390725786, 'min_child_weight': 0.19544651220436385, 'booster': 'gbtree'}. Best is trial 32 with value: 3037.6239746.
[I 2023-07-20 15:35:48,666] Trial 35 finished with value: 3038.745459 and parameters: {'eta': 0.04745485570053381, 'max_depth': 4, 'gamma': 5.975889221470206e-07, 'subsample': 0.9302528610335031, 'colsample_bytree': 0.9977485806465173, 'min_child_weight': 0.3298767479525565, 'booster': 'gbtree'}. Best is trial 32 with value: 3037.6239746.
[I 2023-07-20 15:36:03,076] Trial 36 pruned. Trial was pruned at iteration 20.
[I 2023-07-20 15:36:38,232] Trial 37 finished with value: 3072.348535 and parameters: {'eta': 0.1676023650818225, 'max_depth': 2, 'gamma': 1.3456281350922575e-06, 'subsample': 0.7312226777053694, 'colsample_bytree': 0.96403972726024, 'min_child_weight': 0.15659509459600862, 'booster': 'gbtree'}. Best is trial 32 with value: 3037.6239746.
[I 2023-07-20 15:36:51,718] Trial 38 pruned. Trial was pruned at iteration 20.
[I 2023-07-20 15:37:04,904] Trial 39 pruned. Trial was pruned at iteration 20.
[I 2023-07-20 15:37:35,193] Trial 40 finished with value: 3044.7080080000005 and parameters: {'eta': 0.16675746943436404, 'max_depth': 5, 'gamma': 4.394096661445257e-07, 'subsample': 0.9523079200049536, 'colsample_bytree': 0.7652455247039238, 'min_child_weight': 0.16742962289315963, 'booster': 'gbtree'}. Best is trial 32 with value: 3037.6239746.
[I 2023-07-20 15:37:53,559] Trial 41 finished with value: 3042.2968752 and parameters: {'eta': 0.4462381749855381, 'max_depth': 4, 'gamma': 1.7146100810737851e-06, 'subsample': 0.9622380196093626, 'colsample_bytree': 0.9974175750245168, 'min_child_weight': 0.383713251217408, 'booster': 'gbtree'}. Best is trial 32 with value: 3037.6239746.
[I 2023-07-20 15:38:11,673] Trial 42 finished with value: 3044.6321288 and parameters: {'eta': 0.3778012420288047, 'max_depth': 3, 'gamma': 4.663173144169329e-07, 'subsample': 0.923687326741991, 'colsample_bytree': 0.9979073897811195, 'min_child_weight': 2.2641120684946663, 'booster': 'gbtree'}. Best is trial 32 with value: 3037.6239746.
[I 2023-07-20 15:38:36,702] Trial 43 finished with value: 3036.498877 and parameters: {'eta': 0.17388192674987513, 'max_depth': 4, 'gamma': 6.309051431253172e-08, 'subsample': 0.9566651333215987, 'colsample_bytree': 0.9607007825565661, 'min_child_weight': 0.7675989664161897, 'booster': 'gbtree'}. Best is trial 43 with value: 3036.498877.
[I 2023-07-20 15:39:12,740] Trial 44 finished with value: 3040.2740722 and parameters: {'eta': 0.09206794072353378, 'max_depth': 4, 'gamma': 5.6489289726626155e-08, 'subsample': 0.8954928691334318, 'colsample_bytree': 0.8784043381395148, 'min_child_weight': 1.1433654917851803, 'booster': 'gbtree'}. Best is trial 43 with value: 3036.498877.
[I 2023-07-20 15:39:26,547] Trial 45 pruned. Trial was pruned at iteration 21.
[I 2023-07-20 15:39:58,144] Trial 46 finished with value: 3036.6901368000003 and parameters: {'eta': 0.1120307552445573, 'max_depth': 5, 'gamma': 3.7580746615978925e-08, 'subsample': 0.9789698093894728, 'colsample_bytree': 0.9644330946571384, 'min_child_weight': 0.2561065047964661, 'booster': 'gbtree'}. Best is trial 43 with value: 3036.498877.
[I 2023-07-20 15:40:12,361] Trial 47 finished with value: 317827.7375 and parameters: {'eta': 0.043706750220849286, 'max_depth': 6, 'gamma': 3.913006896332813e-08, 'subsample': 0.9690969467883341, 'colsample_bytree': 0.9072431124346024, 'min_child_weight': 0.001461751094675627, 'booster': 'gbtree'}. Best is trial 43 with value: 3036.498877.
[I 2023-07-20 15:40:25,665] Trial 48 pruned. Trial was pruned at iteration 20.
[I 2023-07-20 15:40:48,546] Trial 49 finished with value: 3042.2581054 and parameters: {'eta': 0.1931007050198132, 'max_depth': 7, 'gamma': 7.094687769779725e-07, 'subsample': 0.9992028340293483, 'colsample_bytree': 0.8590260387503983, 'min_child_weight': 0.21836876632154886, 'booster': 'gbtree'}. Best is trial 43 with value: 3036.498877.
[I 2023-07-20 15:41:04,123] Trial 50 pruned. Trial was pruned at iteration 20.
[I 2023-07-20 15:41:33,545] Trial 51 finished with value: 3037.9859862000003 and parameters: {'eta': 0.11608170004318419, 'max_depth': 3, 'gamma': 1.1356448351699668e-08, 'subsample': 0.9616677986985251, 'colsample_bytree': 0.964418076021131, 'min_child_weight': 0.31618597066744286, 'booster': 'gbtree'}. Best is trial 43 with value: 3036.498877.
[I 2023-07-20 15:42:09,265] Trial 52 finished with value: 3037.8550293999997 and parameters: {'eta': 0.11507600724744876, 'max_depth': 4, 'gamma': 2.199638902174226e-08, 'subsample': 0.9695617768158347, 'colsample_bytree': 0.9121099339048496, 'min_child_weight': 0.07206419964863138, 'booster': 'gbtree'}. Best is trial 43 with value: 3036.498877.
[I 2023-07-20 15:42:45,858] Trial 53 finished with value: 3042.282959 and parameters: {'eta': 0.09578923298517399, 'max_depth': 3, 'gamma': 2.3214886659640414e-08, 'subsample': 0.9704644240551171, 'colsample_bytree': 0.8861859676783096, 'min_child_weight': 0.08976940730880904, 'booster': 'gbtree'}. Best is trial 43 with value: 3036.498877.
[I 2023-07-20 15:43:02,356] Trial 54 finished with value: 3046.4572264 and parameters: {'eta': 0.4863009899191616, 'max_depth': 5, 'gamma': 2.933694891028008e-08, 'subsample': 0.9219233858526725, 'colsample_bytree': 0.918018970862546, 'min_child_weight': 0.57477971522028, 'booster': 'gbtree'}. Best is trial 43 with value: 3036.498877.
[I 2023-07-20 15:43:15,596] Trial 55 pruned. Trial was pruned at iteration 20.
[I 2023-07-20 15:43:29,306] Trial 56 pruned. Trial was pruned at iteration 20.
[I 2023-07-20 15:43:46,722] Trial 57 finished with value: 3046.938037 and parameters: {'eta': 0.48143764278934603, 'max_depth': 10, 'gamma': 3.306952790316785e-08, 'subsample': 0.8797070161002929, 'colsample_bytree': 0.9688852153874727, 'min_child_weight': 5.035373364625581, 'booster': 'gbtree'}. Best is trial 43 with value: 3036.498877.
[I 2023-07-20 15:44:07,802] Trial 58 finished with value: 3040.500879 and parameters: {'eta': 0.23122542961969572, 'max_depth': 5, 'gamma': 1.074776154118335e-08, 'subsample': 0.9791431780517371, 'colsample_bytree': 0.8373987685965062, 'min_child_weight': 1.3021503329254518, 'booster': 'gbtree'}. Best is trial 43 with value: 3036.498877.
[I 2023-07-20 15:44:44,348] Trial 59 finished with value: 3038.3109863999994 and parameters: {'eta': 0.12288738314201289, 'max_depth': 6, 'gamma': 1.0261606944828545e-07, 'subsample': 0.9313884266044806, 'colsample_bytree': 0.8747675594194785, 'min_child_weight': 0.4314497637179585, 'booster': 'gbtree'}. Best is trial 43 with value: 3036.498877.
[I 2023-07-20 15:45:16,683] Trial 60 finished with value: 3034.6042479999996 and parameters: {'eta': 0.131715522890815, 'max_depth': 8, 'gamma': 8.595351520871723e-08, 'subsample': 0.8467450878672628, 'colsample_bytree': 0.8921093053178858, 'min_child_weight': 5.529194250117682, 'booster': 'gbtree'}. Best is trial 60 with value: 3034.6042479999996.
[I 2023-07-20 15:45:55,507] Trial 61 finished with value: 3033.7231936 and parameters: {'eta': 0.12878869661672784, 'max_depth': 9, 'gamma': 1.1134506962884295e-07, 'subsample': 0.979008136534911, 'colsample_bytree': 0.8774303224098216, 'min_child_weight': 2.6230072911913602, 'booster': 'gbtree'}. Best is trial 61 with value: 3033.7231936.
[I 2023-07-20 15:46:16,992] Trial 62 finished with value: 3034.4493164 and parameters: {'eta': 0.23951274541935214, 'max_depth': 9, 'gamma': 2.4451144212121893e-08, 'subsample': 0.9989539360739352, 'colsample_bytree': 0.8999224216276656, 'min_child_weight': 7.489966946988934, 'booster': 'gbtree'}. Best is trial 61 with value: 3033.7231936.
[I 2023-07-20 15:46:36,107] Trial 63 finished with value: 3034.496631 and parameters: {'eta': 0.2906285342066211, 'max_depth': 9, 'gamma': 2.3320028783573044e-07, 'subsample': 0.998545496085736, 'colsample_bytree': 0.900876478262415, 'min_child_weight': 11.882340482031697, 'booster': 'gbtree'}. Best is trial 61 with value: 3033.7231936.
[I 2023-07-20 15:46:56,817] Trial 64 finished with value: 3035.3720216 and parameters: {'eta': 0.2820047348161208, 'max_depth': 9, 'gamma': 2.184155692311592e-07, 'subsample': 0.996343223272909, 'colsample_bytree': 0.8381661712662352, 'min_child_weight': 18.587468085803913, 'booster': 'gbtree'}. Best is trial 61 with value: 3033.7231936.
[I 2023-07-20 15:47:12,480] Trial 65 finished with value: 3047.7063965999996 and parameters: {'eta': 0.5798351672744028, 'max_depth': 9, 'gamma': 2.378427771818312e-07, 'subsample': 0.994049449129419, 'colsample_bytree': 0.8375069604498476, 'min_child_weight': 19.32650734634661, 'booster': 'gbtree'}. Best is trial 61 with value: 3033.7231936.
[I 2023-07-20 15:47:35,243] Trial 66 finished with value: 3034.7572264 and parameters: {'eta': 0.2498196015404798, 'max_depth': 9, 'gamma': 9.388235442649892e-08, 'subsample': 0.9892227950699081, 'colsample_bytree': 0.7821132557310545, 'min_child_weight': 5.695495978260237, 'booster': 'gbtree'}. Best is trial 61 with value: 3033.7231936.
[I 2023-07-20 15:47:47,709] Trial 67 pruned. Trial was pruned at iteration 20.
[I 2023-07-20 15:48:07,900] Trial 68 finished with value: 3034.8570802 and parameters: {'eta': 0.2405533104908695, 'max_depth': 9, 'gamma': 7.837410655297795e-08, 'subsample': 0.9015583204840619, 'colsample_bytree': 0.8511509480054339, 'min_child_weight': 7.834999223021249, 'booster': 'gbtree'}. Best is trial 61 with value: 3033.7231936.
[I 2023-07-20 15:48:28,141] Trial 69 finished with value: 3040.0000976 and parameters: {'eta': 0.2956399924280415, 'max_depth': 8, 'gamma': 9.88751541731488e-08, 'subsample': 0.9022556763576, 'colsample_bytree': 0.7719874184705332, 'min_child_weight': 8.25004890534742, 'booster': 'gbtree'}. Best is trial 61 with value: 3033.7231936.
[I 2023-07-20 15:48:43,765] Trial 70 pruned. Trial was pruned at iteration 24.
[I 2023-07-20 15:49:07,476] Trial 71 finished with value: 3032.0960452 and parameters: {'eta': 0.2265464552295039, 'max_depth': 9, 'gamma': 6.650095266635949e-08, 'subsample': 0.9538124529289038, 'colsample_bytree': 0.8659924086826483, 'min_child_weight': 5.605164868653055, 'booster': 'gbtree'}. Best is trial 71 with value: 3032.0960452.
[I 2023-07-20 15:49:27,374] Trial 72 finished with value: 3038.8958494000003 and parameters: {'eta': 0.22902007353151901, 'max_depth': 10, 'gamma': 1.1515183939981256e-07, 'subsample': 0.901785004503644, 'colsample_bytree': 0.8588347755027981, 'min_child_weight': 30.385739895404505, 'booster': 'gbtree'}. Best is trial 71 with value: 3032.0960452.
[I 2023-07-20 15:49:45,395] Trial 73 finished with value: 3044.8246092 and parameters: {'eta': 0.35260837690076746, 'max_depth': 9, 'gamma': 1.1106991827582779e-06, 'subsample': 0.9398513983516278, 'colsample_bytree': 0.8895583767246401, 'min_child_weight': 5.98719906547585, 'booster': 'gbtree'}. Best is trial 71 with value: 3032.0960452.
[I 2023-07-20 15:49:58,678] Trial 74 pruned. Trial was pruned at iteration 20.
[I 2023-07-20 15:50:14,877] Trial 75 finished with value: 3043.1111816 and parameters: {'eta': 0.6183195923598164, 'max_depth': 10, 'gamma': 2.732262054750231e-07, 'subsample': 0.8822019431362266, 'colsample_bytree': 0.8220732908570513, 'min_child_weight': 11.748022411721493, 'booster': 'gbtree'}. Best is trial 71 with value: 3032.0960452.
[I 2023-07-20 15:50:40,403] Trial 76 finished with value: 3037.9492189999996 and parameters: {'eta': 0.23855793927016783, 'max_depth': 9, 'gamma': 3.992529445417728e-08, 'subsample': 0.951313051091195, 'colsample_bytree': 0.8950261415738519, 'min_child_weight': 3.476577441804782, 'booster': 'gbtree'}. Best is trial 71 with value: 3032.0960452.
[I 2023-07-20 15:50:53,113] Trial 77 pruned. Trial was pruned at iteration 20.
[I 2023-07-20 15:51:05,514] Trial 78 pruned. Trial was pruned at iteration 20.
[I 2023-07-20 15:51:24,012] Trial 79 finished with value: 3040.228418 and parameters: {'eta': 0.33335306038090784, 'max_depth': 9, 'gamma': 3.4761652743022766e-07, 'subsample': 0.9453491076334737, 'colsample_bytree': 0.7488837947306157, 'min_child_weight': 31.895005763179608, 'booster': 'gbtree'}. Best is trial 71 with value: 3032.0960452.
[I 2023-07-20 15:51:57,058] Trial 80 finished with value: 3031.0040528 and parameters: {'eta': 0.14277821773718294, 'max_depth': 10, 'gamma': 8.088782592976787e-07, 'subsample': 0.9977974732766702, 'colsample_bytree': 0.809525144421714, 'min_child_weight': 7.514977546259929, 'booster': 'gbtree'}. Best is trial 80 with value: 3031.0040528.
[I 2023-07-20 15:52:34,886] Trial 81 finished with value: 3036.6441894 and parameters: {'eta': 0.14688542814891956, 'max_depth': 10, 'gamma': 7.038891673481538e-08, 'subsample': 0.9928503505941781, 'colsample_bytree': 0.8085460732808113, 'min_child_weight': 7.665893011318919, 'booster': 'gbtree'}. Best is trial 80 with value: 3031.0040528.
[I 2023-07-20 15:52:51,975] Trial 82 finished with value: 3050.27583 and parameters: {'eta': 0.4379462106578011, 'max_depth': 9, 'gamma': 6.000344883817363e-07, 'subsample': 0.9971102122143154, 'colsample_bytree': 0.8273389607051485, 'min_child_weight': 23.10361905636443, 'booster': 'gbtree'}. Best is trial 80 with value: 3031.0040528.
[I 2023-07-20 15:53:19,354] Trial 83 finished with value: 3036.3837402 and parameters: {'eta': 0.22417761641274248, 'max_depth': 10, 'gamma': 1.2867623022100379e-06, 'subsample': 0.9577715079481326, 'colsample_bytree': 0.8747482876335879, 'min_child_weight': 3.382772126945124, 'booster': 'gbtree'}. Best is trial 80 with value: 3031.0040528.
[I 2023-07-20 15:53:59,165] Trial 84 finished with value: 3031.5499022 and parameters: {'eta': 0.07777175813083295, 'max_depth': 9, 'gamma': 1.7185510203212536e-07, 'subsample': 0.9348510407500364, 'colsample_bytree': 0.9329988917156696, 'min_child_weight': 13.528526200805464, 'booster': 'gbtree'}. Best is trial 80 with value: 3031.0040528.
[I 2023-07-20 15:54:12,033] Trial 85 pruned. Trial was pruned at iteration 20.
[I 2023-07-20 15:54:37,014] Trial 86 finished with value: 3031.0340822 and parameters: {'eta': 0.16516782634436789, 'max_depth': 10, 'gamma': 1.7234931690405196e-08, 'subsample': 0.9342372920085978, 'colsample_bytree': 0.909095200137177, 'min_child_weight': 9.871149485232598, 'booster': 'gbtree'}. Best is trial 80 with value: 3031.0040528.
[I 2023-07-20 15:55:07,366] Trial 87 finished with value: 3036.349414 and parameters: {'eta': 0.13339295383590416, 'max_depth': 10, 'gamma': 2.3445324150229605e-08, 'subsample': 0.9325053962950647, 'colsample_bytree': 0.9360339455632799, 'min_child_weight': 2.085188759176219, 'booster': 'gbtree'}. Best is trial 80 with value: 3031.0040528.
[I 2023-07-20 15:55:32,779] Trial 88 finished with value: 3034.3476564 and parameters: {'eta': 0.17185470709215206, 'max_depth': 10, 'gamma': 1.5232308495682677e-08, 'subsample': 0.9554709654420762, 'colsample_bytree': 0.9012439729356523, 'min_child_weight': 13.647558912144003, 'booster': 'gbtree'}. Best is trial 80 with value: 3031.0040528.
[I 2023-07-20 15:55:46,546] Trial 89 pruned. Trial was pruned at iteration 20.
[I 2023-07-20 15:55:59,914] Trial 90 pruned. Trial was pruned at iteration 20.
[I 2023-07-20 15:56:29,402] Trial 91 finished with value: 3037.1376950000003 and parameters: {'eta': 0.1801027949508445, 'max_depth': 9, 'gamma': 1.4836461200734805e-07, 'subsample': 0.977567003551009, 'colsample_bytree': 0.8711392022063783, 'min_child_weight': 4.944097482180133, 'booster': 'gbtree'}. Best is trial 80 with value: 3031.0040528.
[I 2023-07-20 15:56:53,676] Trial 92 finished with value: 3043.7844238000002 and parameters: {'eta': 0.1464197811613084, 'max_depth': 10, 'gamma': 1.969266532609399e-08, 'subsample': 0.9768054797862559, 'colsample_bytree': 0.8959052698638602, 'min_child_weight': 38.86530610694329, 'booster': 'gbtree'}. Best is trial 80 with value: 3031.0040528.
[I 2023-07-20 15:57:14,050] Trial 93 finished with value: 3045.8321287999997 and parameters: {'eta': 0.35648020407239445, 'max_depth': 9, 'gamma': 5.1544279890441144e-08, 'subsample': 0.9410417761633629, 'colsample_bytree': 0.9321853495782388, 'min_child_weight': 1.9636483287430881, 'booster': 'gbtree'}. Best is trial 80 with value: 3031.0040528.
[I 2023-07-20 15:57:40,873] Trial 94 finished with value: 3037.1115236 and parameters: {'eta': 0.19774996151749544, 'max_depth': 8, 'gamma': 3.7637323222116544e-07, 'subsample': 0.9599639239639111, 'colsample_bytree': 0.8708399930343282, 'min_child_weight': 4.99726808337303, 'booster': 'gbtree'}. Best is trial 80 with value: 3031.0040528.
[I 2023-07-20 15:57:55,309] Trial 95 pruned. Trial was pruned at iteration 20.
[I 2023-07-20 15:58:30,377] Trial 96 finished with value: 3037.8553222 and parameters: {'eta': 0.12748413134085249, 'max_depth': 9, 'gamma': 1.5262247871791894e-07, 'subsample': 0.9791566175763764, 'colsample_bytree': 0.8853711291798442, 'min_child_weight': 21.67053566526402, 'booster': 'gbtree'}. Best is trial 80 with value: 3031.0040528.
[I 2023-07-20 15:58:49,951] Trial 97 finished with value: 3043.8745604000005 and parameters: {'eta': 0.4864981430946289, 'max_depth': 7, 'gamma': 7.539816084962301e-08, 'subsample': 0.8933603644872811, 'colsample_bytree': 0.922160705746178, 'min_child_weight': 2.6831925998799897, 'booster': 'gbtree'}. Best is trial 80 with value: 3031.0040528.
[I 2023-07-20 15:59:05,830] Trial 98 pruned. Trial was pruned at iteration 21.
[I 2023-07-20 15:59:40,344] Trial 99 finished with value: 3040.4043948 and parameters: {'eta': 0.1875688877271209, 'max_depth': 9, 'gamma': 1.0222902239978172e-07, 'subsample': 0.9427128387025056, 'colsample_bytree': 0.8100148150348059, 'min_child_weight': 0.8118029486547931, 'booster': 'gbtree'}. Best is trial 80 with value: 3031.0040528.

Hyper-Parameter Optimization successfully finished.
  Number of finished trials:  100
  Best trial:
    Value: 3031.0040528
    Params: 
    eta: 0.14277821773718294
    max_depth: 10
    gamma: 8.088782592976787e-07
    subsample: 0.9977974732766702
    colsample_bytree: 0.809525144421714
    min_child_weight: 7.514977546259929
    booster: gbtree
    opt_rounds: 30

Model Training

In [6]:
np.random.seed(123)

opt_params = opt_param.copy()
n_rounds = opt_params["opt_rounds"]
del opt_params["opt_rounds"]

# Train Model with optimized hyperparameters
xgblss.train(opt_params,
             dtrain,
             num_boost_round=n_rounds
             )

Prediction

In [7]:
# Set seed for reproducibility
torch.manual_seed(123)

# Number of samples to draw from predicted distribution
n_samples = 10000
# Quantiles to calculate from predicted distribution
quant_sel = [0.05, 0.95] 

# Sample from predicted distribution
pred_samples = xgblss.predict(dtest,
                              pred_type="samples",
                              n_samples=n_samples,
                              seed=123)

# Calculate quantiles from predicted distribution
pred_quantiles = xgblss.predict(dtest,
                                pred_type="quantiles",
                                n_samples=n_samples,
                                quantiles=quant_sel)

# Returns predicted parameters
pred_params = xgblss.predict(dtest,
                             pred_type="parameters")
In [8]:
pred_samples.head()
Out[8]:
y_sample0 y_sample1 y_sample2 y_sample3 y_sample4 y_sample5 y_sample6 y_sample7 y_sample8 y_sample9 ... y_sample9990 y_sample9991 y_sample9992 y_sample9993 y_sample9994 y_sample9995 y_sample9996 y_sample9997 y_sample9998 y_sample9999
0 12.476486 11.232409 4.531297 11.363811 4.781734 13.938910 11.681598 14.658313 9.974003 8.736984 ... 13.760765 9.338540 8.213298 11.582684 9.699373 15.296181 5.445426 10.907387 11.825034 6.373481
1 16.861736 8.414603 10.939034 13.475651 8.340402 10.276856 12.489843 10.608356 12.365025 9.093086 ... 10.898591 2.581861 11.356403 13.315522 6.913608 15.685884 9.282003 9.755927 3.866492 10.277873
2 12.442025 9.501931 9.579384 11.448649 15.371767 11.944936 11.200554 9.318138 9.043898 10.726871 ... 10.183985 11.545186 8.312314 10.584079 8.789487 10.342226 9.892250 8.029637 10.462024 10.342827
3 11.241399 8.752618 13.266323 3.948052 16.326685 16.667992 8.063462 17.408077 6.992835 10.044690 ... -0.004205 8.638702 17.228178 7.539776 3.437411 8.712561 13.743514 13.827578 13.600298 8.999898
4 13.759394 8.122649 6.028914 4.139555 5.615580 10.576174 6.294752 15.667155 10.416535 16.968143 ... 9.751814 7.578747 8.704581 12.697542 8.643000 7.352739 14.508101 10.728389 7.371967 9.274317

5 rows × 10000 columns

In [9]:
pred_quantiles.head()
Out[9]:
quant_0.05 quant_0.95
0 5.147843 15.415341
1 5.348317 15.360555
2 7.733128 12.995715
3 0.741641 18.253794
4 3.127059 16.718515
In [10]:
pred_params.head()
Out[10]:
param_1 param_2 param_3 param_4 param_5 param_6 param_7 param_8 param_9 param_10 ... param_22 param_23 param_24 param_25 param_26 param_27 param_28 param_29 param_30 param_31
0 -0.280689 -0.156779 -0.461753 -0.831503 0.182882 -0.557738 -2.383831 1.173182 0.97675 0.290846 ... 0.511808 0.286395 0.945506 0.805746 0.692724 0.634053 -0.886691 4.017344 0.745495 1.069772
1 -0.280689 -0.156779 -0.461753 -0.831503 0.182882 -0.557738 -2.383831 1.173182 0.97675 0.290846 ... 0.480639 0.286395 0.945506 0.805746 0.692724 0.634053 -0.977325 3.833302 0.745495 1.069772
2 -0.280689 -0.156779 -0.461753 -0.831503 0.182882 -0.557738 -2.383831 1.173182 0.97675 0.290846 ... 0.359091 0.286395 0.945506 0.805746 0.692724 0.634053 -2.301708 3.418404 0.745495 1.069772
3 -0.280689 -0.156779 -0.461753 -0.831503 0.182882 -0.557738 -2.383831 1.173182 0.97675 0.290846 ... 0.315179 0.286395 0.945506 0.805746 0.692724 0.634053 0.802579 1.476677 0.745495 1.069772
4 -0.280689 -0.156779 -0.461753 -0.831503 0.182882 -0.557738 -2.383831 1.173182 0.97675 0.290846 ... 0.210731 0.286395 0.945506 0.805746 0.692724 0.634053 0.025325 3.645396 0.745495 1.069772

5 rows × 31 columns

SHAP Interpretability

Note that in contrast to parametric distributions, the parameters of the Spline-Flow do not have a direct interpretation.

In [11]:
# Partial Dependence Plot of how x acts on param_21
xgblss.plot(X_test,
            parameter="param_21",
            feature="x_true",
            plot_type="Partial_Dependence")
In [12]:
# Feature Importance of param_21
xgblss.plot(X_test,
            parameter="param_21",
            plot_type="Feature_Importance")

Plot of Actual vs. Predicted Quantiles

In [13]:
np.random.seed(123)

###
# Actual Quantiles
###
q1 = norm.ppf(quant_sel[0], loc = 10, scale = 1 + 4*((0.3 < test["x_true"].values) & (test["x_true"].values < 0.5)) + 2*(test["x_true"].values > 0.7))
q2 = norm.ppf(quant_sel[1], loc = 10, scale = 1 + 4*((0.3 < test["x_true"].values) & (test["x_true"].values < 0.5)) + 2*(test["x_true"].values > 0.7))
test["quant"] = np.where(test["y"].values < q1, 0, np.where(test["y"].values < q2, 1, 2))
test["alpha"] = np.where(test["y"].values <= q1, 1, np.where(test["y"].values >= q2, 1, 0))
df_quantiles = test[test["alpha"] == 1]

# Lower Bound
yl = list(set(q1))
yl.sort()
yl = [yl[2],yl[0],yl[2],yl[1],yl[1]]
sfunl = pd.DataFrame({"x_true":[0, 0.3, 0.5, 0.7, 1], "y":yl})

# Upper Bound
yu = list(set(q2))
yu.sort()
yu = [yu[0],yu[2],yu[0],yu[1],yu[1]]
sfunu = pd.DataFrame({"x_true":[0, 0.3, 0.5, 0.7, 1], "y":yu})

###
# Predicted Quantiles
###
test["lb"] = pred_quantiles.iloc[:,0]
test["ub"] = pred_quantiles.iloc[:,1]

###
# Plot
###
(ggplot(test,
        aes("x_true",
            "y")) + 
 geom_point(alpha = 0.2, color = "black", size = 2) + 
 theme_bw(base_size=15) +
 theme(legend_position="none",
       plot_title = element_text(hjust = 0.5)) +
 labs(title = "XGBoostLSS Regression - Simulated Data Example",
      x="x")  + 
 geom_line(aes("x_true",
               "ub"),
           size = 1,
           color = "blue", 
           alpha = 0.7) + 
 geom_line(aes("x_true",
               "lb"),
           size = 1,
           color = "blue", 
           alpha = 0.7) + 
 geom_point(df_quantiles,
            aes("x_true",
                "y"), 
            color = "red", 
            alpha = 0.7,
            size = 2) + 
 geom_step(sfunl,
           aes("x_true",
               "y"), 
           size = 1, 
           linetype = "dashed")  + 
 geom_step(sfunu,
           aes("x_true",
               "y"), 
           size = 1, 
           linetype = "dashed") 
)
Out[13]:
<Figure Size: (2000 x 1000)>

True vs. Predicted Distributional Parameters

In the following figure, we compare the true parameters of the Gaussian with those predicted by XGBoostLSS. The figure shows that the estimated parameters closely match the true ones (recall that the location parameter $\mu=10$ is simulated as constant).

In [14]:
dist_params = ["loc", "scale"]

# Calculate parameters from samples 
sample_params = pd.DataFrame.from_dict(
    {
        "loc": pred_samples.mean(axis=1),
        "scale": pred_samples.std(axis=1),
        "x_true": X_test["x_true"].values        
    }
)

# Data with predicted values
plot_df_predt = pd.melt(sample_params[["x_true"] + dist_params],
                        id_vars="x_true",
                        value_vars=dist_params)
plot_df_predt["type"] = "PREDICT"

# Data with actual values
plot_df_actual = pd.melt(test[["x_true"] + dist_params],
                         id_vars="x_true",
                         value_vars=dist_params)
plot_df_actual["type"] = "TRUE"

# Combine data for plotting
plot_df = pd.concat([plot_df_predt, plot_df_actual])
plot_df["variable"] = plot_df.variable.str.upper()
plot_df["type"] = pd.Categorical(plot_df["type"], categories = ["PREDICT", "TRUE"])

# Plot
(ggplot(plot_df,
        aes(x="x_true",
            y="value",
            color="type")) +
 geom_line(size=1.1) + 
 facet_wrap("variable",
            scales="free") + 
 labs(title="Parameters of univariate Gaussian predicted with XGBoostLSS",
      x="",
      y="") + 
 theme_bw(base_size=15) + 
 theme(legend_position="bottom",
       plot_title = element_text(hjust = 0.5),
       legend_title = element_blank())
)
dist_params = ["loc", "scale"] # Calculate parameters from samples sample_params = pd.DataFrame.from_dict( { "loc": pred_samples.mean(axis=1), "scale": pred_samples.std(axis=1), "x_true": X_test["x_true"].values } ) # Data with predicted values plot_df_predt = pd.melt(sample_params[["x_true"] + dist_params], id_vars="x_true", value_vars=dist_params) plot_df_predt["type"] = "PREDICT" # Data with actual values plot_df_actual = pd.melt(test[["x_true"] + dist_params], id_vars="x_true", value_vars=dist_params) plot_df_actual["type"] = "TRUE" # Combine data for plotting plot_df = pd.concat([plot_df_predt, plot_df_actual]) plot_df["variable"] = plot_df.variable.str.upper() plot_df["type"] = pd.Categorical(plot_df["type"], categories = ["PREDICT", "TRUE"]) # Plot (ggplot(plot_df, aes(x="x_true", y="value", color="type")) + geom_line(size=1.1) + facet_wrap("variable", scales="free") + labs(title="Parameters of univariate Gaussian predicted with XGBoostLSS", x="", y="") + theme_bw(base_size=15) + theme(legend_position="bottom", plot_title = element_text(hjust = 0.5), legend_title = element_blank()) )
Out[14]:
<Figure Size: (2000 x 1000)>

Density Plots

In [15]:
Copied!
pred_df = pd.melt(pred_samples.iloc[:,0:5])
actual_df = pd.DataFrame.from_dict({"variable": "ACTUAL", "value": y_test.reshape(-1,)})
plot_df = pd.concat([pred_df, actual_df])

(
    ggplot(plot_df, 
           aes(x="value",
               color="variable",
               fill="variable")) +  
    geom_density(alpha=0.4) + 
    facet_wrap("variable",
               ncol=2) + 
    theme_bw(base_size=15) + 
    theme(plot_title = element_text(hjust = 0.5)) +
    theme(legend_position="none")
)
Out[15]:
<Figure Size: (2000 x 1000)>

Actual vs. Predicted

Since we predict the entire conditional distribution, we can overlay the point predictions with predicted densities, from which we can also derive quantiles of interest.

In [16]:
Copied!
y_pred = []

n_examples = 8
q_sel = [0.05, 0.95]
samples_arr = pred_samples.values.reshape(-1,n_samples)

for i in range(n_examples):    
    y_samples = pd.DataFrame(samples_arr[i,:].reshape(-1,1), columns=["PREDICT_DENSITY"])
    y_samples["PREDICT_POINT"] = y_samples["PREDICT_DENSITY"].mean()
    y_samples["PREDICT_Q05"] = y_samples["PREDICT_DENSITY"].quantile(q=q_sel[0])
    y_samples["PREDICT_Q95"] = y_samples["PREDICT_DENSITY"].quantile(q=q_sel[1])
    y_samples["ACTUAL"] = y_test[i]
    y_samples["obs"]= f"Obervation {i+1}"
    y_pred.append(y_samples)
    
pred_df = pd.melt(pd.concat(y_pred, axis=0), id_vars="obs")
pred_df["obs"] = pd.Categorical(pred_df["obs"], categories=[f"Obervation {i+1}" for i in range(n_examples)])
df_actual, df_pred_dens, df_pred_point, df_q05, df_q95 = [x for _, x in pred_df.groupby("variable")]

plot_pred = (
    ggplot(pred_df,
           aes(color="variable")) + 
    stat_density(df_pred_dens,
                 aes(x="value"),
                 size=1.1) + 
    geom_point(df_pred_point,
               aes(x="value",
                   y=0),
               size=1.4) + 
    geom_point(df_actual,
               aes(x="value",
                   y=0),
               size=1.4) + 
    geom_vline(df_q05, 
               aes(xintercept="value",
                   fill="variable",
                   color="variable"),
               linetype="dashed",
               size=1.1) + 
    geom_vline(df_q95, 
               aes(xintercept="value",
                   fill="variable",
                   color="variable"),
               linetype="dashed",
               size=1.1) + 
    facet_wrap("obs",
               scales="free",
               ncol=4) + 
    labs(title="Predicted vs. Actual \n",
         x = "") + 
    theme_bw(base_size=15) +
    theme(plot_title = element_text(hjust = 0.5)) +
    scale_fill_brewer(type="qual", palette="Dark2") + 
    theme(legend_position="bottom",
          legend_title = element_blank()
         )
)

print(plot_pred)
