Questions tagged [xgboost]
For questions related to the eXtreme Gradient Boosting algorithm.
701 questions
44 votes, 2 answers
LightGBM vs XGBoost
I'm trying to understand which is better (more accurate, especially in classification problems). I've been searching for articles comparing LightGBM and XGBoost but found only…

Sergey Nizhevyasov (553)
16 votes, 1 answer
How does XGBoost decide the default direction for missing values?
According to Algorithm 3 of https://arxiv.org/pdf/1603.02754v3.pdf, an optimum default direction is determined and the missing values will go in that direction. However, perhaps I have misunderstood/missed the explanation from the…

mathnoob (183)
15 votes, 2 answers
In XGBoost, would we evaluate results with a Precision-Recall curve vs ROC?
I am using XGBoost for payment fraud detection. The objective is binary classification, and the data is very unbalanced: one out of every 3-4k transactions is fraud.
I would expect the best way to evaluate the results is a Precision-Recall (PR)…

davidjhp (435)
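On the evaluation question above: with heavy class imbalance, ROC AUC can look healthy while precision is poor, which is why the PR curve is usually preferred at fraud-like base rates. A small illustration with synthetic labels and a made-up weak scoring rule (assuming scikit-learn):

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

rng = np.random.default_rng(42)
# Roughly 1 positive per 1000 cases, scored by noise plus a weak signal.
y = (rng.random(20000) < 0.001).astype(int)
scores = y * 0.2 + rng.random(20000)

roc = roc_auc_score(y, scores)
ap = average_precision_score(y, scores)  # area under the PR curve
print(f"ROC AUC = {roc:.3f}, PR AUC (average precision) = {ap:.3f}")
# With a rare positive class the ROC AUC can look respectable while
# average precision is far lower, because precision collapses once
# thousands of negatives outscore the handful of positives.
```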
10 votes, 3 answers
XGBoost - Choice made by model
I am using XGBoost to predict a two-class target variable on insurance claims. I have a model (trained with cross-validation, hyperparameter tuning, etc.) that I run on another dataset.
My question is:
is there a way to know why a given claim has…

Fabrice BOUCHAREL (439)
6 votes, 3 answers
Can I specify the root node splitting feature in XGBoost?
Just what the title says. Suppose I know the feature that I want to be used for splitting for the root node in a tree model in XGBoost; is there a way for me to tell XGBoost to split on this feature at the root node?

bgav (61)
6 votes, 2 answers
Ordinal classification with XGBoost
I am working on a problem where the dependent variable consists of ordered classes, such as bad, good, very good.
How could I declare this problem in XGBoost instead of normal classification or regression?
Thanks

mommomonthewind (161)
6 votes, 1 answer
XGBoost results are not invariant under monotone predictor transformations?
It is widely believed that tree-based methods are invariant under monotone transformations of the predictors. But recently I've read a paper (https://arxiv.org/pdf/1611.04561.pdf, referred to below as the arXiv paper) that says whether it's…

hooyeh (61)
5 votes, 1 answer
What does the limit of xgboost max_depth=1 represent?
In my mind, this means that each tree just takes one feature, and produces a step function based upon it.
In the limit of n_estimators being very large and max_depth=1, does XGBoost become a linear model?
On my dataset, a grid search found max_depth…

cjm2671 (284)
5 votes, 1 answer
Difference between XGBRegressor and XGBClassifier
I'm trying to understand the difference between xgboost.XGBRegressor and xgboost.sklearn.XGBClassifier.
Can someone explain the difference in a concise manner?
Because when I fit both classifiers with the exact same data, I get pretty different…

aerin (907)
5 votes, 1 answer
What does it mean to "warm-start" XGBoost?
In the project I am currently working on (predicting whether or not someone will click on some item from the mailing list that I send), data about users is extracted each day and the models are retrained from scratch. Since users from yesterday don't…

JohnnyQ (201)
4 votes, 1 answer
Handling unbalanced datasets with XGBoost
Suppose you want to model (predict) a rare disease, and you use the parameter scale_pos_weight as a hyperparameter in XGBoost. For example, if I have 20 times more positive cases, can I then use scale_pos_weight = 0.05, even though, for example, the…

Aniel Kali (41)
4 votes, 1 answer
XGBoost equations (for dummies)
I am having a hard time trying to understand the MSE loss function given in the Introduction to Boosted Trees (beware! My maths skills are the equivalent of a very sparse matrix):
$\text{obj}^{(t)} = \sum_{i=1}^n (y_i - …$

Pierpaolo Calanna (63)
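The truncated formula in the excerpt above is the squared-error objective from the Introduction to Boosted Trees tutorial. Written out, the step-t objective and its expansion (dropping terms that do not depend on $f_t$ into the constant) are:

```latex
\begin{aligned}
\text{obj}^{(t)} &= \sum_{i=1}^n \left( y_i - \left( \hat{y}_i^{(t-1)} + f_t(x_i) \right) \right)^2 + \sum_{i=1}^t \Omega(f_i) \\
&= \sum_{i=1}^n \left[ 2 \left( \hat{y}_i^{(t-1)} - y_i \right) f_t(x_i) + f_t(x_i)^2 \right] + \Omega(f_t) + \text{constant}
\end{aligned}
```

The square expands as $(y-\hat{y})^2 - 2(y-\hat{y})f_t + f_t^2$; the first term is fixed at step $t$ and goes into the constant, leaving a linear-plus-quadratic function of the new tree $f_t$.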
3 votes, 1 answer
What makes LightGBM run faster than XGBoost?
I am curious which implementation differences allow LightGBM to be faster than XGBoost, sometimes by orders of magnitude.

Victor Luu (243)
3 votes, 1 answer
Explaining XGBoost functioning to non-technical people
I have been tasked with explaining the principle of the XGBoost algorithm to non-technical people (think 1-2 slides in a PowerPoint presentation to upper management).
I am currently working from the original papers: here for the paper specific to…

Lucas Morin (2,196)
3 votes, 1 answer
Is XGBoost better with numeric predictors?
I have a categorical feature that I one-hot encoded and used in my XGBoost model, but it consistently underperforms as a predictor compared to the other predictors.
Then I created a new variable that contains the same kind of information that the…

conv3d (221)