This chapter will teach you how to make your XGBoost models as performant as possible. You'll learn about the variety of parameters that can be adjusted to alter the behavior of XGBoost and how to tune them efficiently so that you can supercharge the performance of your models.

Why tune your model?

import pandas as pd
import numpy as np
import warnings

pd.set_option('display.expand_frame_repr', False)

warnings.filterwarnings("ignore", category=DeprecationWarning)
warnings.filterwarnings("ignore", category=FutureWarning)
housing_data = pd.read_csv('datasets/ames_housing_trimmed_processed.csv')
housing_data.describe()
MSSubClass LotFrontage LotArea OverallQual OverallCond YearBuilt Remodeled GrLivArea BsmtFullBath BsmtHalfBath ... HouseStyle_1.5Unf HouseStyle_1Story HouseStyle_2.5Fin HouseStyle_2.5Unf HouseStyle_2Story HouseStyle_SFoyer HouseStyle_SLvl PavedDrive_P PavedDrive_Y SalePrice
count 1460.000000 1460.000000 1460.000000 1460.000000 1460.000000 1460.000000 1460.000000 1460.000000 1460.000000 1460.000000 ... 1460.000000 1460.000000 1460.000000 1460.000000 1460.000000 1460.000000 1460.000000 1460.000000 1460.000000 1460.000000
mean 56.897260 57.623288 10516.828082 6.099315 5.575342 1971.267808 0.476712 1515.463699 0.425342 0.057534 ... 0.009589 0.497260 0.005479 0.007534 0.304795 0.025342 0.044521 0.020548 0.917808 180921.195890
std 42.300571 34.664304 9981.264932 1.382997 1.112799 30.202904 0.499629 525.480383 0.518911 0.238753 ... 0.097486 0.500164 0.073846 0.086502 0.460478 0.157217 0.206319 0.141914 0.274751 79442.502883
min 20.000000 0.000000 1300.000000 1.000000 1.000000 1872.000000 0.000000 334.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 34900.000000
25% 20.000000 42.000000 7553.500000 5.000000 5.000000 1954.000000 0.000000 1129.500000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000 129975.000000
50% 50.000000 63.000000 9478.500000 6.000000 5.000000 1973.000000 0.000000 1464.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000 163000.000000
75% 70.000000 79.000000 11601.500000 7.000000 6.000000 2000.000000 1.000000 1776.750000 1.000000 0.000000 ... 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 214000.000000
max 190.000000 313.000000 215245.000000 10.000000 9.000000 2010.000000 1.000000 5642.000000 3.000000 2.000000 ... 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 755000.000000

8 rows × 57 columns

""" Untuned model example """
import xgboost as xgb
import pandas as pd
import numpy as np


X,y = housing_data.iloc[:, :-1], housing_data.iloc[:, -1]

housing_dmatrix = xgb.DMatrix(data=X, label=y)

untuned_params = {"objective": "reg:squarederror"}

# run 4 fold cross validation on untuned model params
untuned_cv_results_rmse = xgb.cv(dtrain=housing_dmatrix, params=untuned_params,nfold=4,metrics="rmse", as_pandas=True, seed=123)

# .iloc[-1] extracts the last value as a plain float (indexing a RangeIndex Series with [-1] raises a KeyError)
untuned_rmse = untuned_cv_results_rmse["test-rmse-mean"].iloc[-1]
print("Untuned model RMSE:", untuned_rmse)
""" Tuned model example """
# data was loaded and prepared in the above cell

tuned_params = {"objective": "reg:squarederror", 'colsample_bytree': 0.3,'learning_rate': 0.1, 'max_depth': 5}

tuned_cv_results_rmse = xgb.cv(dtrain=housing_dmatrix, params=tuned_params,nfold=4,num_boost_round=200, metrics="rmse", as_pandas=True, seed=123)

# again extract the final value as a plain float so the two results can be compared directly
tuned_rmse = tuned_cv_results_rmse["test-rmse-mean"].iloc[-1]
print("Tuned model RMSE:", tuned_cv_results_rmse["test-rmse-mean"].tail(1))
 
Tuned model RMSE: 199    29965.411196
Name: test-rmse-mean, dtype: float64
# both results are now plain floats, so their difference is a single number
print("Improvement over the untuned model:", untuned_rmse - tuned_rmse)

That is a difference of almost 1,400 in RMSE in this run, gained from nothing more than a lower learning rate, a slightly shallower tree depth, column subsampling, and more boosting rounds.
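Once cross-validation has pointed you at a parameter set you like, you still need a fitted model to predict with. Below is a minimal sketch (not part of the original notebook) that reuses the tuned_params and housing_dmatrix defined above; the final_booster and preds names are just for illustration.

""" Training a final tuned model (sketch) """
# Train one booster on the full DMatrix with the tuned parameters
final_booster = xgb.train(params=tuned_params,
                          dtrain=housing_dmatrix,
                          num_boost_round=200)

# Prediction also goes through a DMatrix; the training features are reused here purely for illustration
preds = final_booster.predict(xgb.DMatrix(data=X))
print(preds[:5])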

Tuning the number of boosting rounds

Let's start with parameter tuning by seeing how the number of boosting rounds (number of trees you build) impacts the out-of-sample performance of your XGBoost model. You'll use xgb.cv() inside a for loop and build one model per num_boost_round parameter.

Here, you'll continue working with the Ames housing dataset. The features are available in the array X, and the target vector is contained in y.

  • Instructions:
    • Create a DMatrix called housing_dmatrix from X and y.
    • Create a parameter dictionary called params, passing in the appropriate "objective" (the exercise asks for "reg:linear", which newer XGBoost releases have renamed to "reg:squarederror") and "max_depth" (set it to 3).
    • Iterate over num_rounds inside a for loop and perform 3-fold cross-validation. In each iteration of the loop, pass in the current number of boosting rounds (curr_num_rounds) to xgb.cv() as the argument to num_boost_round.
    • Append the final boosting round RMSE for each cross-validated XGBoost model to the final_rmse_per_round list.
    • num_rounds and final_rmse_per_round have been zipped and converted into a DataFrame so you can easily see how the model performs with each boosting round. Hit 'Submit Answer' to see the results!
import warnings; warnings.filterwarnings("ignore")  # silence the pandas warnings
 
# Create the DMatrix: housing_dmatrix
housing_dmatrix = xgb.DMatrix(data= X, label = y )

# Create the parameter dictionary for each tree: params 
# params = {"objective":"reg:linear", "max_depth":3}
params = {"objective":"reg:squarederror", "max_depth":3}

# Create list of number of boosting rounds
num_rounds = [5, 10, 15, 20, 50, 75, 100, 200]

# Empty list to store final round rmse per XGBoost model
final_rmse_per_round = []

# Iterate over num_rounds and build one model per num_boost_round parameter
for curr_num_rounds in num_rounds:

    # Perform cross-validation: cv_results
    cv_results = xgb.cv(dtrain=housing_dmatrix, params=params, nfold=3, num_boost_round=curr_num_rounds, metrics="rmse", as_pandas=True, seed=123)
    
    # Append final round RMSE
    final_rmse_per_round.append(cv_results["test-rmse-mean"].tail().values[-1])

# Print the resultant DataFrame
num_rounds_rmses = list(zip(num_rounds, final_rmse_per_round))
print(pd.DataFrame(num_rounds_rmses,columns=["num_boosting_rounds","rmse"]))
   num_boosting_rounds          rmse
0                    5  50903.298177
1                   10  34774.192709
2                   15  32895.097656
3                   20  32019.971354
4                   50  30943.686198
5                   75  30579.746094
6                  100  30680.307292
7                  200  30691.264974

Automated boosting round selection using early_stopping

Now, instead of attempting to cherry pick the best possible number of boosting rounds, you can very easily have XGBoost automatically select the number of boosting rounds for you within xgb.cv(). This is done using a technique called early stopping.

Early stopping works by testing the XGBoost model after every boosting round against a hold-out dataset and stopping the creation of additional boosting rounds (thereby finishing training of the model early) if the hold-out metric ("rmse" in our case) does not improve for a given number of rounds. Here you will use the early_stopping_rounds parameter in xgb.cv() with a large possible number of boosting rounds (50). Bear in mind that if the holdout metric keeps improving all the way up to num_boost_round, early stopping does not occur.

Here, the DMatrix and parameter dictionary have been created for you. Your task is to use cross-validation with early stopping. Go for it!

  • Instructions
    • Perform 3-fold cross-validation with early stopping and "rmse" as your metric. Use 10 early stopping rounds and 50 boosting rounds. Specify a seed of 123 and make sure the output is a pandas DataFrame. Remember to specify the other parameters such as dtrain, params, and metrics.
    • Print cv_results.
housing_dmatrix = xgb.DMatrix(data=X, label=y)

# Create the parameter dictionary for each tree: params
params = {"objective":"reg:squarederror", "max_depth":4}

# Perform cross-validation with early stopping: cv_results
cv_results = xgb.cv(dtrain=housing_dmatrix,
                    params = params,
                    nfold=3,
                    num_boost_round=50,
                    metrics='rmse',
                    as_pandas = True,
                    early_stopping_rounds=10,
                    seed = 123)

# Print cv_results
print(cv_results["test-rmse-mean"].tail())
# print(cv_results)
45    30758.543732
46    30729.971937
47    30732.663173
48    30712.241251
49    30720.853939
Name: test-rmse-mean, dtype: float64
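Because as_pandas=True returns one row per boosting round that was kept, you can tell whether early stopping actually triggered just by looking at the length of the result. A small sketch, reusing the cv_results from the cell above:

# If early stopping had triggered, fewer than 50 rows would have been kept;
# here all 50 are present, so early stopping never kicked in during this run.
print("Boosting rounds kept:", len(cv_results))
print("Best test RMSE:", cv_results["test-rmse-mean"].min())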

Overview of XGBoost's hyperparameters

Tunable parameters in XGBoost

  • Common tree tunable parameters
    • eta (learning_rate): learning rate; how strongly each new tree's contribution is shrunk
    • gamma: min loss reduction to create new tree split
    • lambda: L2 reg on leaf weights
    • alpha: L1 reg on leaf weights
    • max_depth: max depth per tree
    • subsample: % samples used per tree
    • colsample_bytree: % features used per tree


  • Linear tunable parameters
    • lambda: L2 reg on weights
    • alpha: L1 reg on weights
    • lambda_bias: L2 reg term on bias
    • You can also tune the number of estimators (boosting rounds) used for both base learner types! A sketch of both parameter dictionaries follows this list.
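To make the list above concrete, here is a minimal sketch of what the two parameter dictionaries might look like; the specific values are arbitrary placeholders, not recommendations.

""" Tunable parameters: example dictionaries (sketch) """
# Tree base learner (the default, booster="gbtree")
tree_params = {"objective": "reg:squarederror",
               "booster": "gbtree",
               "eta": 0.1,               # learning rate
               "gamma": 0.1,             # min loss reduction to make a further split
               "lambda": 1,              # L2 regularization on leaf weights
               "alpha": 0,               # L1 regularization on leaf weights
               "max_depth": 5,           # max depth per tree
               "subsample": 0.8,         # fraction of rows used per tree
               "colsample_bytree": 0.8}  # fraction of features used per tree

# Linear base learner (booster="gblinear") only exposes the regularization terms
linear_params = {"objective": "reg:squarederror",
                 "booster": "gblinear",
                 "lambda": 1,            # L2 regularization on weights
                 "alpha": 0}             # L1 regularization on weights

# The number of boosting rounds is not part of params: it is passed separately as
# num_boost_round to xgb.cv()/xgb.train(), or as n_estimators in the sklearn wrapper.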

Tuning eta

It's time to practice tuning other XGBoost hyperparameters in earnest and observing their effect on model performance! You'll begin by tuning the "eta", also known as the learning rate.

The learning rate in XGBoost is a parameter that can range between 0 and 1. It scales how much each new tree is allowed to correct the current model: with a small "eta" every boosting round makes only a small, conservative step (so more rounds are usually needed), while a large "eta" lets each round make big corrections.

  • Instructions

    • Create a list called eta_vals to store the following "eta" values: 0.001, 0.01, and 0.1.
    • Iterate over your eta_vals list using a for loop.
    • In each iteration of the for loop, set the "eta" key of params to be equal to curr_val. Then, perform 3-fold cross-validation with early stopping (5 rounds), 10 boosting rounds, a metric of "rmse", and a seed of 123. Ensure the output is a DataFrame.
    • Append the final round RMSE to the best_rmse list.
housing_dmatrix = xgb.DMatrix(data=X, label=y)

# Create the parameter dictionary for each tree (boosting round)
params = {"objective":"reg:squarederror", "max_depth":3}

# Create list of eta values and empty list to store final round rmse per xgboost model
eta_vals = [0.001, 0.01, 0.1]
best_rmse = []

# Systematically vary the eta 
for curr_val in eta_vals:

    params["eta"] = curr_val
    
    # Perform cross-validation: cv_results
    cv_results = xgb.cv(dtrain = housing_dmatrix,
                        params = params,
                        nfold = 3, 
                        num_boost_round=10,
                        early_stopping_rounds = 5,
                        metrics ="rmse",
                        as_pandas =True,
                        seed = 123)
    
    # Append the final round rmse to best_rmse
    best_rmse.append(cv_results["test-rmse-mean"].tail().values[-1])

# Print the resultant DataFrame
print(pd.DataFrame(list(zip(eta_vals, best_rmse)), columns=["eta","best_rmse"]))
     eta      best_rmse
0  0.001  195736.402543
1  0.010  179932.183986
2  0.100   79759.411808
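The very large RMSE values for the two smallest learning rates are mostly an artifact of the tiny round budget: with only 10 boosting rounds, an eta of 0.001 or 0.01 has barely started fitting. A sketch (not run in the original notebook) of how you might give a small eta a fair chance by pairing it with many more rounds and early stopping:

""" Small eta with a larger round budget (sketch) """
params_low_eta = {"objective": "reg:squarederror", "max_depth": 3, "eta": 0.01}

cv_low_eta = xgb.cv(dtrain=housing_dmatrix, params=params_low_eta, nfold=3,
                    num_boost_round=500, early_stopping_rounds=10,
                    metrics="rmse", as_pandas=True, seed=123)
print(cv_low_eta["test-rmse-mean"].tail(1))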

Tuning max_depth

In this exercise, your job is to tune max_depth, which is the parameter that dictates the maximum depth that each tree in a boosting round can grow to. Smaller values will lead to shallower trees, and larger values to deeper trees.

  • Instructions
    • Create a list called max_depths to store the following "max_depth" values: 2, 5, 10, and 20.
    • Iterate over your max_depths list using a for loop.
    • Systematically vary "max_depth" in each iteration of the for loop and perform 2-fold cross-validation with early stopping (5 rounds), 10 boosting rounds, a metric of "rmse", and a seed of 123. Ensure the output is a pandas DataFrame, and append the final round RMSE to best_rmse.
housing_dmatrix = xgb.DMatrix(data=X,label=y)

# Create the parameter dictionary
params = {"objective":"reg:squarederror"}

# Create list of max_depth values
max_depths = [2, 5, 10, 20]
best_rmse = []

# Systematically vary the max_depth
for curr_val in max_depths:

    params["max_depth"] = curr_val
    
    # Perform cross-validation
    cv_results = xgb.cv(dtrain= housing_dmatrix,
                        params = params,
                        nfold = 2,
                        num_boost_round =10,
                        early_stopping_rounds=5,
                        metrics="rmse",
                        as_pandas=True,
                        seed=123)
    
    
    # Append the final round rmse to best_rmse
    best_rmse.append(cv_results["test-rmse-mean"].tail().values[-1])

# Print the resultant DataFrame
print(pd.DataFrame(list(zip(max_depths, best_rmse)),columns=["max_depth","best_rmse"]))
   max_depth     best_rmse
0          2  37957.469464
1          5  35596.599504
2         10  36065.547345
3         20  36739.576068

Tuning colsample_bytree

Now, it's time to tune "colsample_bytree". You've already seen something similar if you've worked with scikit-learn's RandomForestClassifier or RandomForestRegressor, where it is called max_features. Both control how many features a tree gets to look at, but they act at different points: scikit-learn's max_features limits the features considered at each split, whereas XGBoost's colsample_bytree samples the fraction of features once for each tree that is built. In xgboost, colsample_bytree must be specified as a float between 0 and 1.

  • Instructions
    • Create a list called colsample_bytree_vals to store the values 0.1, 0.5, 0.8, and 1.
    • Systematically vary "colsample_bytree" and perform cross-validation, exactly as you did with max_depth and eta previously.
housing_dmatrix = xgb.DMatrix(data=X,label=y)

# Create the parameter dictionary
params={"objective":"reg:squarederror","max_depth":3}

# Create list of hyperparameter values: colsample_bytree_vals
colsample_bytree_vals = [0.1, 0.5, 0.8, 1]
best_rmse = []

# Systematically vary the hyperparameter value 
for curr_val in colsample_bytree_vals:

    params["colsample_bytree"] = curr_val
    
    # Perform cross-validation
    cv_results = xgb.cv(dtrain=housing_dmatrix, 
                        params=params,
                        nfold=2,
                        num_boost_round=10, 
                        early_stopping_rounds=5,
                        metrics="rmse",
                        as_pandas=True,
                        seed=123)
    
    # Append the final round rmse to best_rmse
    best_rmse.append(cv_results["test-rmse-mean"].tail().values[-1])

# Print the resultant DataFrame
print(pd.DataFrame(list(zip(colsample_bytree_vals, best_rmse)), columns=["colsample_bytree","best_rmse"]))
   colsample_bytree     best_rmse
0               0.1  40918.116895
1               0.5  35813.904168
2               0.8  35995.678734
3               1.0  35836.044343
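In recent XGBoost releases, colsample_bytree is one of a small family of column-subsampling parameters: colsample_bylevel resamples at every new depth level and colsample_bynode resamples at every split, the latter being the closest analogue of scikit-learn's per-split max_features. A minimal sketch (placeholder values, not recommendations):

""" Column-subsampling parameters (sketch) """
params_colsample = {"objective": "reg:squarederror",
                    "max_depth": 3,
                    "colsample_bytree": 0.8,   # fraction of columns sampled once per tree
                    "colsample_bylevel": 1.0,  # resampled at every new depth level
                    "colsample_bynode": 1.0}   # resampled at every split

The three ratios work cumulatively, so setting all of them below 1 compounds the subsampling.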
"""Grid search: example"""
import pandas as pd
import xgboost as xgb
import numpy as np
from sklearn.model_selection import GridSearchCV


housing_data = pd.read_csv("datasets/ames_housing_trimmed_processed.csv")
X, y = housing_data[housing_data.columns.tolist()[:-1]], housing_data[housing_data.columns.tolist()[-1]]

housing_dmatrix = xgb.DMatrix(data=X,label=y)
gbm_param_grid = {  'learning_rate': [0.01, 0.05, 0.1, 0.5, 0.9],
                    'n_estimators': [200],
                    # 'subsample': [0.3, 0.5, 0.9]}
                    'subsample':  np.arange(0.05,1.05,.05)}
gbm = xgb.XGBRegressor()

grid_mse = GridSearchCV(estimator=gbm,param_grid=gbm_param_grid,scoring='neg_mean_squared_error', cv=4, verbose=1)

grid_mse.fit(X, y)
print("Best parameters found: ",grid_mse.best_params_)
print("Lowest RMSE found: ", np.sqrt(np.abs(grid_mse.best_score_)))


""" 2m41.3s
Fitting 4 folds for each of 100 candidates, totalling 400 fits
Best parameters found:  {'learning_rate': 0.1, 'n_estimators': 200, 'subsample': 0.45}
Lowest RMSE found:  28528.32863427011
"""
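The "100 candidates" in the log is simply the size of the grid, and 4-fold cross-validation multiplies the number of fits by four. A quick sanity check on the numbers, reusing gbm_param_grid from above:

# 5 learning rates x 1 n_estimators value x 20 subsample values = 100 candidates
n_candidates = (len(gbm_param_grid['learning_rate'])
                * len(gbm_param_grid['n_estimators'])
                * len(gbm_param_grid['subsample']))
print(n_candidates, "candidates x 4 folds =", n_candidates * 4, "fits")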
"""Random search: example"""
import pandas as pd
import xgboost as xgb
import numpy as np
from sklearn.model_selection import RandomizedSearchCV

housing_data = pd.read_csv("datasets/ames_housing_trimmed_processed.csv")
X,y = housing_data[housing_data.columns.tolist()[:-1]], housing_data[housing_data.columns.tolist()[-1]]

housing_dmatrix = xgb.DMatrix(data=X,label=y)

gbm_param_grid = {  'learning_rate': np.arange(0.05,1.05,.05),
                    'n_estimators': [200],
                    'subsample': np.arange(0.05,1.05,.05)}
gbm = xgb.XGBRegressor()

# try with 25 random combinations
randomized_mse = RandomizedSearchCV(estimator=gbm, param_distributions=gbm_param_grid, n_iter=25, scoring='neg_mean_squared_error', cv=4, verbose=1)

randomized_mse.fit(X, y)
print("Best parameters found: ",randomized_mse.best_params_)
print("Lowest RMSE found: ", np.sqrt(np.abs(randomized_mse.best_score_)))
Fitting 4 folds for each of 25 candidates, totalling 100 fits
Best parameters found:  {'subsample': 0.6500000000000001, 'n_estimators': 200, 'learning_rate': 0.05}
Lowest RMSE found:  28875.728215978015
np.arange(0.05,1.05,.05)
array([0.05, 0.1 , 0.15, 0.2 , 0.25, 0.3 , 0.35, 0.4 , 0.45, 0.5 , 0.55,
       0.6 , 0.65, 0.7 , 0.75, 0.8 , 0.85, 0.9 , 0.95, 1.  ])
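For randomized search you are not limited to discrete lists like the np.arange() grid above: param_distributions also accepts scipy.stats distributions, which are sampled on the fly. A minimal sketch, assuming scipy is installed (the names gbm_param_dist and randomized_mse_dist are just for illustration):

""" Random search with continuous distributions (sketch) """
from scipy.stats import uniform

gbm_param_dist = {'learning_rate': uniform(0.01, 0.5),   # samples uniformly from [0.01, 0.51)
                  'subsample': uniform(0.3, 0.7),        # samples uniformly from [0.3, 1.0)
                  'n_estimators': [200]}

randomized_mse_dist = RandomizedSearchCV(estimator=xgb.XGBRegressor(),
                                         param_distributions=gbm_param_dist,
                                         n_iter=25, scoring='neg_mean_squared_error',
                                         cv=4, verbose=1, random_state=123)
# randomized_mse_dist.fit(X, y)  # not run here: 25 candidates x 4 folds = 100 fits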

Grid search with XGBoost

Now that you've learned how to tune parameters individually with XGBoost, let's take your parameter tuning to the next level by using scikit-learn's grid search and randomized search capabilities with internal cross-validation, via the GridSearchCV and RandomizedSearchCV classes. You will use these to search across multiple parameters simultaneously: exhaustively over every combination in the grid search case, and over a random sample of combinations in the randomized case. Let's get to work, starting with GridSearchCV!

  • Instructions
    • Create a parameter grid called gbm_param_grid that contains a list of "colsample_bytree" values (0.3, 0.7), a list with a single value for "n_estimators" (50), and a list of 2 "max_depth" (2, 5) values.
    • Instantiate an XGBRegressor object called gbm.
    • Create a GridSearchCV object called grid_mse, passing in: the parameter grid to param_grid, the XGBRegressor to estimator, "neg_mean_squared_error" to scoring, and 4 to cv. Also specify verbose=1 so you can better understand the output.
    • Fit the GridSearchCV object to X and y.
    • Print the best parameter values and lowest RMSE, using the .best_params_ and .best_score_ attributes, respectively, of grid_mse.
gbm_param_grid = {
    'colsample_bytree': [0.3, 0.7],
    'n_estimators': [50],
    'max_depth': [2, 5]
}

# Instantiate the regressor: gbm
gbm = xgb.XGBRegressor()

# Perform grid search: grid_mse
grid_mse = GridSearchCV(estimator= gbm,
                        param_grid=gbm_param_grid,
                        scoring= 'neg_mean_squared_error',
                        cv =4,
                        verbose = 1)

# Fit grid_mse to the data
grid_mse.fit(X,y)

# Print the best parameters and lowest RMSE
print("Best parameters found: ", grid_mse.best_params_)
print("Lowest RMSE found: ", np.sqrt(np.abs(grid_mse.best_score_)))
Fitting 4 folds for each of 4 candidates, totalling 16 fits
Best parameters found:  {'colsample_bytree': 0.3, 'max_depth': 5, 'n_estimators': 50}
Lowest RMSE found:  28986.18703093561
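Because refit=True is the default, GridSearchCV retrains the winning parameter combination on all of the data once the search is finished, so the fitted object can be used for prediction straight away. A short sketch (the best_model and preds names are just for illustration):

# The winning XGBRegressor, already refit on the full X, y
best_model = grid_mse.best_estimator_
preds = best_model.predict(X)   # the training features are reused here purely for illustration
print(preds[:5])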

Random search with XGBoost

Often, GridSearchCV can be really time consuming, so in practice, you may want to use RandomizedSearchCV instead, as you will do in this exercise. The good news is you only have to make a few modifications to your GridSearchCV code to do RandomizedSearchCV. The key difference is you have to specify a param_distributions parameter instead of a param_grid parameter.

  • Instructions
    • Create a parameter grid called gbm_param_grid that contains a list with a single value for 'n_estimators' (25), and a list of 'max_depth' values between 2 and 11 (use range(2, 12) for this).
    • Create a RandomizedSearchCV object called randomized_mse, passing in: the parameter grid to param_distributions, the XGBRegressor to estimator, "neg_mean_squared_error" to scoring, 5 to n_iter, and 4 to cv. Also specify verbose=1 so you can better understand the output.
    • Fit the RandomizedSearchCV object to X and y.
gbm_param_grid = {
    'n_estimators': [25],
    'max_depth': range(2,12)
}

# Instantiate the regressor: gbm
gbm = xgb.XGBRegressor(n_estimators=10)  # the grid's 'n_estimators' value (25) overrides this during the search

# Perform random search: randomized_mse
randomized_mse = RandomizedSearchCV(estimator =gbm,
                            param_distributions = gbm_param_grid,
                            scoring='neg_mean_squared_error',
                            cv =4, 
                            n_iter = 5,
                            verbose =1 )


# Fit randomized_mse to the data
randomized_mse.fit(X,y)

# Print the best parameters and lowest RMSE
print("Best parameters found: ", randomized_mse.best_params_)
print("Lowest RMSE found: ", np.sqrt(np.abs(randomized_mse.best_score_)))
Fitting 4 folds for each of 5 candidates, totalling 20 fits
Best parameters found:  {'n_estimators': 25, 'max_depth': 4}
Lowest RMSE found:  29998.4522530019
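If you want to see how every sampled candidate fared rather than just the winner, the search object keeps a full record in its cv_results_ attribute, which converts cleanly to a DataFrame. A quick sketch (the results_df name is just for illustration):

# One row per sampled candidate, with the mean cross-validated score and its rank
results_df = pd.DataFrame(randomized_mse.cv_results_)
print(results_df[["params", "mean_test_score", "rank_test_score"]])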