Questions tagged [xgboost]

For questions related to the eXtreme Gradient Boosting algorithm.

701 questions
44 votes · 2 answers

LightGBM vs XGBoost

I'm trying to understand which is better (more accurate, especially on classification problems). I've been searching for articles comparing LightGBM and XGBoost but found only…
Sergey Nizhevyasov · 553
16 votes · 1 answer

How does XGBoost learn which direction to send missing values?

From Algorithm 3 of https://arxiv.org/pdf/1603.02754v3.pdf, an optimal default direction is determined and the missing values go in that direction. However, perhaps I have misunderstood/missed the explanation from the…
mathnoob · 183
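A minimal pure-Python sketch of the idea the question is about (this is an illustration of Algorithm 3's "sparsity-aware split", not XGBoost's actual code): for each candidate split, the gain is computed twice, once routing the missing-value rows left and once right, and the direction with the higher gain becomes the learned default. The gain formula below uses the paper's structure score G²/(H + λ); all data values are made up.

```python
# Illustrative sketch of the sparsity-aware split from Algorithm 3 of the
# XGBoost paper: try sending the missing-value rows left, then right, and
# keep whichever default direction yields the higher gain.

lam = 1.0  # L2 regularisation term (lambda)

def score(g_sum, h_sum):
    # Structure score of a leaf: G^2 / (H + lambda)
    return g_sum * g_sum / (h_sum + lam)

def best_default_direction(rows, threshold):
    """rows: list of (feature_value_or_None, gradient, hessian) tuples."""
    g_all = sum(g for _, g, _ in rows)
    h_all = sum(h for _, _, h in rows)
    # Statistics for non-missing rows that fall left of the threshold.
    g_left = sum(g for x, g, _ in rows if x is not None and x < threshold)
    h_left = sum(h for x, _, h in rows if x is not None and x < threshold)
    # Statistics for the missing-value rows.
    g_miss = sum(g for x, g, _ in rows if x is None)
    h_miss = sum(h for x, _, h in rows if x is None)

    def gain(gl, hl):
        gr, hr = g_all - gl, h_all - hl
        return score(gl, hl) + score(gr, hr) - score(g_all, h_all)

    gain_left = gain(g_left + g_miss, h_left + h_miss)   # missing -> left
    gain_right = gain(g_left, h_left)                    # missing -> right
    if gain_left >= gain_right:
        return "left", gain_left
    return "right", gain_right
```

Rows whose gradients resemble one side of the split pull the default direction toward that side, which is why the learned direction depends on the training data rather than being a fixed convention.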
15 votes · 2 answers

In XGBoost, would we evaluate results with a Precision-Recall curve vs ROC?

I am using XGBoost for payment fraud detection. The objective is binary classification, and the data is very unbalanced. One out of every 3-4k transactions is fraud. I would expect the best way to evaluate the results is a Precision-Recall (PR)…
davidjhp · 435
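A small pure-Python sketch of what a PR curve measures and why it is informative here (labels and scores below are made up, and in practice `sklearn.metrics.precision_recall_curve` does this for you): sweeping the decision threshold from high to low and recording precision/recall at each step. Unlike ROC, precision is computed against the predicted positives, so it degrades visibly when the positive class is rare.

```python
# Minimal sketch: precision/recall pairs at each score threshold for an
# imbalanced toy problem (the labels and scores are made up).

def pr_points(y_true, scores):
    """Return (precision, recall) pairs, sweeping the threshold high to low."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    total_pos = sum(y_true)
    tp = fp = 0
    points = []
    for i in order:
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        points.append((tp / (tp + fp), tp / total_pos))
    return points

y = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]               # 2 positives out of 10
s = [0.9, 0.8, 0.1, 0.2, 0.1, 0.3, 0.1, 0.2, 0.7, 0.4]
points = pr_points(y, s)
```

At severe imbalance (one fraud per 3-4k transactions), the false-positive count barely moves the ROC curve's x-axis but directly drags precision down, which is the usual argument for PR curves in fraud detection.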
10 votes · 3 answers

XGBoost - Choice made by the model

I am using XGBoost to predict a two-class target variable on insurance claims. I have a model (trained with cross-validation, hyperparameter tuning, etc.) that I run on another dataset. My question is: is there a way to know why a given claim has…
6 votes · 3 answers

Can I specify the root node splitting feature in XGBoost?

Just what the title says. Suppose I know the feature that I want to be used for splitting for the root node in a tree model in XGBoost; is there a way for me to tell XGBoost to split on this feature at the root node?
bgav · 61
6 votes · 2 answers

Ordinal classification with xgboost

I am working on a problem where the dependent variable consists of ordered classes, such as bad, good, very good. How can I declare this problem in xgboost instead of treating it as ordinary classification or regression? Thanks
mommomonthewind · 161
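As far as I know, xgboost has no built-in ordinal objective, so a common workaround is the cumulative-binary encoding of Frank & Hall (2001): turn K ordered classes into K-1 binary targets ("is y above class k?") and fit one binary model per target. A pure-Python sketch of just the encoding/decoding step (the class names are illustrative; the actual fitting would use e.g. one `XGBClassifier` per target):

```python
# Sketch of the cumulative-binary encoding often used to fit ordinal
# targets with binary learners. Class names are illustrative.

ORDER = ["bad", "good", "very good"]
RANK = {c: i for i, c in enumerate(ORDER)}

def cumulative_targets(labels):
    """For K ordered classes, build K-1 binary targets: t_k = 1 iff y > class k."""
    return [[1 if RANK[y] > k else 0 for y in labels]
            for k in range(len(ORDER) - 1)]

def decode(probs_gt):
    """probs_gt[k] = P(y > class k); the predicted rank is how many of
    those probabilities exceed 0.5."""
    return ORDER[sum(p > 0.5 for p in probs_gt)]

targets = cumulative_targets(["bad", "good", "very good", "good"])
```

Each inner list in `targets` is then the label vector for one binary model; at prediction time the K-1 probabilities are decoded back into an ordered class, which preserves the ordering information a plain multiclass objective would throw away.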
6 votes · 1 answer

XGBoost results are not invariant under monotone predictor transformations?

Many believe that tree-based methods are invariant under monotone transformations of the predictors. But I recently read a paper (https://arxiv.org/pdf/1611.04561.pdf, referred to as the arXiv paper later) that says whether it's…
hooyeh · 61
5 votes · 1 answer

What does the limit of xgboost max_depth=1 represent?

In my mind, this means that each tree uses just one feature and produces a step function based on it. In the limit of very large n_estimators with max_depth=1, does xgboost become a linear model? On my dataset, a grid search found max_depth…
cjm2671 · 284
5 votes · 1 answer

Difference between XGBRegressor and XGBClassifier

I'm trying to understand the difference between xgboost.XGBRegressor and xgboost.sklearn.XGBClassifier. Can someone explain the difference in a concise manner? Because when I fit both models with the exact same data, I get pretty different…
aerin · 907
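A sketch of the core reason the two wrappers disagree on the same data: XGBRegressor's default objective is squared error and its prediction is the raw model output, while XGBClassifier's default is logistic loss, so the raw margin is pushed through a sigmoid and thresholded into a class label. The margin value below is a made-up number, purely to illustrate the transformation:

```python
import math

# The regressor returns the raw additive model output directly; the
# classifier maps that margin through a sigmoid (predict_proba) and
# thresholds it at 0.5 (predict). Margin value is illustrative.

def sigmoid(m):
    return 1.0 / (1.0 + math.exp(-m))

raw_margin = 0.8            # what a regressor-style prediction looks like
prob = sigmoid(raw_margin)  # positive-class probability, ~0.69
label = int(prob >= 0.5)    # the class label the classifier reports
```

So fitting both on the same data gives different numbers by design: one is minimizing squared error on the raw output, the other logistic loss on the sigmoid-transformed output.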
5 votes · 1 answer

What does it mean to "warm-start" XGBoost?

In the project I am currently working on (predicting whether or not someone will click on some item from the mailing list that I send), data about users is extracted each day and the models are retrained from scratch. Since users from yesterday don't…
JohnnyQ · 201
4 votes · 1 answer

Handling unbalanced datasets with XGBoost

Suppose you want to model (predict) a rare disease, and you use the parameter scale_pos_weight as a hyperparameter in XGBoost. For example, if I have 20 times more positive cases, can I then use scale_pos_weight = 0.05, even though for example the…
Aniel Kali · 41
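For reference, the xgboost documentation's usual rule of thumb is scale_pos_weight = (number of negative cases) / (number of positive cases). With 20 times more positives than negatives that ratio is indeed 0.05, i.e. the parameter can also down-weight the positive class when it is the majority. A tiny sketch with illustrative counts:

```python
# scale_pos_weight is conventionally set to n_negative / n_positive.
# Counts below are illustrative: 20x more positive than negative cases.

y = [1] * 2000 + [0] * 100
n_pos = sum(1 for label in y if label == 1)
n_neg = len(y) - n_pos
scale_pos_weight = n_neg / n_pos   # 100 / 2000 = 0.05
```

The computed value would then be passed straight to the model, e.g. `XGBClassifier(scale_pos_weight=scale_pos_weight)`, which scales the gradient of every positive example by that factor.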
4 votes · 1 answer

XGBoost equations (for dummies)

I am having a hard time trying to understand the MSE loss function given in the Introduction to Boosted Trees (beware! My maths skills are the equivalent of a very sparse matrix): $ \begin{split}\text{obj}^{(t)} & = \sum_{i=1}^n (y_i -…
3 votes · 1 answer

What makes LightGBM run faster than XGBoost?

I am curious which implementation differences allow LightGBM to run faster than XGBoost, sometimes by orders of magnitude.
Victor Luu · 243
3 votes · 1 answer

Explaining XGBoost functioning to non-technical people

I have been tasked with explaining the principle of the XGBoost algorithm to non-technical people (think 1-2 slides in a PowerPoint presentation to upper management). I am currently working with the original papers: here for the paper specific to…
Lucas Morin · 2,196
3 votes · 1 answer

Is XGBoost better with numeric predictors?

I have a categorical feature that I one-hot encoded and used in my XGBoost model, but it consistently underperforms as a predictor compared to the other predictors. Then I created a new variable that contains the same kind of information that the…
conv3d · 221