Questions tagged [xgboost]

For questions related to the eXtreme Gradient Boosting algorithm.

701 questions
44 votes · 2 answers

LightGBM vs XGBoost

I'm trying to understand which is better (more accurate, especially on classification problems). I've been searching for articles comparing LightGBM and XGBoost but found only…
Sergey Nizhevyasov · 553
16 votes · 1 answer

How does XGBoost learn which direction to send missing values?

From Algorithm 3 of https://arxiv.org/pdf/1603.02754v3.pdf, an optimal default direction is determined and the missing values go in that direction. However, perhaps I have misunderstood/missed the explanation from the…
mathnoob · 183
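A minimal pure-Python sketch of the idea the question is about (this is an illustration of Algorithm 3's "sparsity-aware split", not XGBoost's actual code): for each candidate split, the gain is computed twice, once routing the missing-value rows left and once right, and the direction with the higher gain becomes the learned default. The gain formula below uses the paper's structure score G²/(H + λ); all data values are made up.

```python
# Illustrative sketch of the sparsity-aware split from Algorithm 3 of the
# XGBoost paper: try sending the missing-value rows left, then right, and
# keep whichever default direction yields the higher gain.

lam = 1.0  # L2 regularisation term (lambda)

def score(g_sum, h_sum):
    # Structure score of a leaf: G^2 / (H + lambda)
    return g_sum * g_sum / (h_sum + lam)

def best_default_direction(rows, threshold):
    """rows: list of (feature_value_or_None, gradient, hessian) tuples."""
    g_all = sum(g for _, g, _ in rows)
    h_all = sum(h for _, _, h in rows)
    # Statistics for non-missing rows that fall left of the threshold.
    g_left = sum(g for x, g, _ in rows if x is not None and x < threshold)
    h_left = sum(h for x, _, h in rows if x is not None and x < threshold)
    # Statistics for the missing-value rows.
    g_miss = sum(g for x, g, _ in rows if x is None)
    h_miss = sum(h for x, _, h in rows if x is None)

    def gain(gl, hl):
        gr, hr = g_all - gl, h_all - hl
        return score(gl, hl) + score(gr, hr) - score(g_all, h_all)

    gain_left = gain(g_left + g_miss, h_left + h_miss)   # missing -> left
    gain_right = gain(g_left, h_left)                    # missing -> right
    if gain_left >= gain_right:
        return "left", gain_left
    return "right", gain_right
```

Rows whose gradients resemble one side of the split pull the default direction toward that side, which is why the learned direction depends on the training data rather than being a fixed convention.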
15 votes · 2 answers

In XGBoost, would we evaluate results with a Precision-Recall curve vs ROC?

I am using XGBoost for payment fraud detection. The objective is binary classification, and the data is very unbalanced. One out of every 3-4k transactions is fraud. I would expect the best way to evaluate the results is a Precision-Recall (PR)…
davidjhp · 435
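A small pure-Python sketch of what a PR curve measures and why it is informative here (labels and scores below are made up, and in practice `sklearn.metrics.precision_recall_curve` does this for you): sweeping the decision threshold from high to low and recording precision/recall at each step. Unlike ROC, precision is computed against the predicted positives, so it degrades visibly when the positive class is rare.

```python
# Minimal sketch: precision/recall pairs at each score threshold for an
# imbalanced toy problem (the labels and scores are made up).

def pr_points(y_true, scores):
    """Return (precision, recall) pairs, sweeping the threshold high to low."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    total_pos = sum(y_true)
    tp = fp = 0
    points = []
    for i in order:
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        points.append((tp / (tp + fp), tp / total_pos))
    return points

y = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]               # 2 positives out of 10
s = [0.9, 0.8, 0.1, 0.2, 0.1, 0.3, 0.1, 0.2, 0.7, 0.4]
points = pr_points(y, s)
```

At severe imbalance (one fraud per 3-4k transactions), the false-positive count barely moves the ROC curve's x-axis but directly drags precision down, which is the usual argument for PR curves in fraud detection.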
10 votes · 3 answers

XGBoost - Choice made by the model

I am using XGBoost to predict a two-class target variable on insurance claims. I have a model (trained with cross-validation, hyperparameter tuning, etc.) that I run on another dataset. My question is: is there a way to know why a given claim has…
6 votes · 3 answers

Can I specify the root node splitting feature in XGBoost?

Just what the title says. Suppose I know the feature that I want to be used for splitting for the root node in a tree model in XGBoost; is there a way for me to tell XGBoost to split on this feature at the root node?
bgav · 61
6 votes · 2 answers

Ordinal classification with xgboost

I am working on a problem where the dependent variable consists of ordered classes, such as bad, good, very good. How can I declare this problem in xgboost instead of treating it as ordinary classification or regression? Thanks
mommomonthewind · 161
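As far as I know, xgboost has no built-in ordinal objective, so a common workaround is the cumulative-binary encoding of Frank & Hall (2001): turn K ordered classes into K-1 binary targets ("is y above class k?") and fit one binary model per target. A pure-Python sketch of just the encoding/decoding step (the class names are illustrative; the actual fitting would use e.g. one `XGBClassifier` per target):

```python
# Sketch of the cumulative-binary encoding often used to fit ordinal
# targets with binary learners. Class names are illustrative.

ORDER = ["bad", "good", "very good"]
RANK = {c: i for i, c in enumerate(ORDER)}

def cumulative_targets(labels):
    """For K ordered classes, build K-1 binary targets: t_k = 1 iff y > class k."""
    return [[1 if RANK[y] > k else 0 for y in labels]
            for k in range(len(ORDER) - 1)]

def decode(probs_gt):
    """probs_gt[k] = P(y > class k); the predicted rank is how many of
    those probabilities exceed 0.5."""
    return ORDER[sum(p > 0.5 for p in probs_gt)]

targets = cumulative_targets(["bad", "good", "very good", "good"])
```

Each inner list in `targets` is then the label vector for one binary model; at prediction time the K-1 probabilities are decoded back into an ordered class, which preserves the ordering information a plain multiclass objective would throw away.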
6 votes · 1 answer

XGBoost results are not invariant under monotone predictor transformations?

Many believe that tree-based methods are invariant under monotone transformations of the predictors. But I recently read a paper (https://arxiv.org/pdf/1611.04561.pdf, referred to as the arXiv paper later) that says whether it's…
hooyeh · 61
5 votes · 1 answer

What does the limit of xgboost max_depth=1 represent?

In my mind, this means that each tree uses just one feature and produces a step function based on it. In the limit of very large n_estimators with max_depth=1, does xgboost become a linear model? On my dataset, a grid search found max_depth…
cjm2671 · 284
5 votes · 1 answer

Difference between XGBRegressor and XGBClassifier

I'm trying to understand the difference between xgboost.XGBRegressor and xgboost.sklearn.XGBClassifier. Can someone explain the difference in a concise manner? Because when I fit both models with the exact same data, I get pretty different…
aerin · 907
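A sketch of the core reason the two wrappers disagree on the same data: XGBRegressor's default objective is squared error and its prediction is the raw model output, while XGBClassifier's default is logistic loss, so the raw margin is pushed through a sigmoid and thresholded into a class label. The margin value below is a made-up number, purely to illustrate the transformation:

```python
import math

# The regressor returns the raw additive model output directly; the
# classifier maps that margin through a sigmoid (predict_proba) and
# thresholds it at 0.5 (predict). Margin value is illustrative.

def sigmoid(m):
    return 1.0 / (1.0 + math.exp(-m))

raw_margin = 0.8            # what a regressor-style prediction looks like
prob = sigmoid(raw_margin)  # positive-class probability, ~0.69
label = int(prob >= 0.5)    # the class label the classifier reports
```

So fitting both on the same data gives different numbers by design: one is minimizing squared error on the raw output, the other logistic loss on the sigmoid-transformed output.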
5 votes · 1 answer

What does it mean to "warm-start" XGBoost?

In the project I am currently working on (predicting whether or not someone will click on some item from the mailing list that I send), data about users is extracted each day and the models are retrained from scratch. Since users from yesterday don't…
JohnnyQ · 201
4 votes · 1 answer

Handling unbalanced datasets with XGBoost

Suppose you want to model (predict) a rare disease, and you use the parameter scale_pos_weight as a hyperparameter in XGBoost. For example, if I have 20 times more positive cases, can I then use scale_pos_weight = 0.05, even though for example the…
Aniel Kali · 41
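For reference, the xgboost documentation's usual rule of thumb is scale_pos_weight = (number of negative cases) / (number of positive cases). With 20 times more positives than negatives that ratio is indeed 0.05, i.e. the parameter can also down-weight the positive class when it is the majority. A tiny sketch with illustrative counts:

```python
# scale_pos_weight is conventionally set to n_negative / n_positive.
# Counts below are illustrative: 20x more positive than negative cases.

y = [1] * 2000 + [0] * 100
n_pos = sum(1 for label in y if label == 1)
n_neg = len(y) - n_pos
scale_pos_weight = n_neg / n_pos   # 100 / 2000 = 0.05
```

The computed value would then be passed straight to the model, e.g. `XGBClassifier(scale_pos_weight=scale_pos_weight)`, which scales the gradient of every positive example by that factor.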
4 votes · 1 answer

XGBoost equations (for dummies)

I am having a hard time trying to understand the MSE loss function given in the Introduction to Boosted Trees (beware! My maths skills are the equivalent of a very sparse matrix): $ \begin{split}\text{obj}^{(t)} & = \sum_{i=1}^n (y_i -…
3 votes · 1 answer

What makes LightGBM run faster than XGBoost?

I am curious which implementation differences allow LightGBM to run faster than XGBoost, sometimes by orders of magnitude.
Victor Luu · 243
3 votes · 1 answer

Explaining XGBoost functioning to non-technical people

I have been tasked with explaining the principle of the XGBoost algorithm to non-technical people (think 1-2 slides in a PowerPoint presentation to upper management). I am currently working with the original papers: here for the paper specific to…
Lucas Morin · 2,196
3 votes · 1 answer

Is XGBoost better with numeric predictors?

I have a categorical feature that I one-hot encoded and used in my XGBoost model, but it consistently underperforms as a predictor compared to the other predictors. Then I created a new variable that contains the same kind of information that the…
conv3d · 221