Most Popular
1500 questions
49
votes
5 answers
Opening a 20GB file for analysis with pandas
I am currently trying to open a file with pandas and Python for machine learning purposes; it would be ideal for me to have it all in a DataFrame. The file is 18GB and my RAM is 32GB, but I keep getting memory errors.
From your experience…

Hari Prasad
49
votes
7 answers
What is the difference between model hyperparameters and model parameters?
I have noticed that such terms as model hyperparameter and model parameter have been used interchangeably on the web without prior clarification. I think this is incorrect and needs explanation. Consider a machine learning model, an SVM/NN/NB based…

minerals
48
votes
3 answers
What exactly is bootstrapping in reinforcement learning?
Apparently, in reinforcement learning, the temporal-difference (TD) method is a bootstrapping method, whereas Monte Carlo methods are not.
What exactly is bootstrapping in RL? What is a bootstrapping method in RL?
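To make the distinction in the question concrete, here is a minimal sketch of the two tabular value updates. The function names, the dictionary-based value table, and the `alpha`/`gamma` defaults are assumptions chosen for illustration, not something taken from the question. The point to notice is that the TD(0) target reuses the current estimate of the next state's value, while the Monte Carlo target uses only the return that was actually observed.

```python
def monte_carlo_update(value, state, observed_return, alpha=0.1):
    """Move V(state) toward the full return observed after visiting it.

    No other value estimate appears in the target, so this does not bootstrap.
    """
    value[state] += alpha * (observed_return - value[state])


def td0_update(value, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Move V(state) toward reward + gamma * V(next_state).

    The target contains the current estimate V(next_state), i.e. an estimate
    is updated from another estimate. That is what "bootstrapping" refers to.
    """
    target = reward + gamma * value[next_state]
    value[state] += alpha * (target - value[state])


# Hypothetical two-state value table, for illustration only.
td_values = {"A": 0.0, "B": 1.0}
td0_update(td_values, "A", reward=0.0, next_state="B")

mc_values = {"A": 0.0}
monte_carlo_update(mc_values, "A", observed_return=2.0)
```

In the TD call, the value of "A" changes even though the reward is zero, purely because of the existing estimate for "B".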
user10640
48
votes
4 answers
Why is ReLU used as an activation function?
Activation functions introduce non-linearity into the linear output w * x + b of a neural network.
I can understand this intuitively for activation functions like sigmoid.
I understand the advantages of ReLU,…
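As a small illustration of the setup described in the excerpt, the sketch below applies an activation function to the linear output w * x + b of a single neuron. The helper names (`relu`, `sigmoid`, `neuron`) and the example numbers are assumptions made for this sketch.

```python
import math


def relu(z):
    """ReLU: pass positive inputs through unchanged, clamp negatives to zero."""
    return max(0.0, z)


def sigmoid(z):
    """Logistic sigmoid: squash any real input into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(x, w, b, activation):
    """Apply an activation function to the linear output w * x + b."""
    return activation(w * x + b)


positive_case = neuron(2.0, w=1.5, b=-1.0, activation=relu)   # linear output is 2.0
negative_case = neuron(-2.0, w=1.5, b=-1.0, activation=relu)  # linear output is -4.0
```

ReLU is linear on each side of zero, but the kink at zero makes the overall function non-linear.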

Bunny Rabbit
48
votes
3 answers
What is Ground Truth?
In the context of Machine Learning, I have seen the term Ground Truth used a lot. I have searched a lot and found the following definition in Wikipedia:
In machine learning, the term "ground truth" refers to the accuracy of the training set's…

Green Falcon
48
votes
3 answers
What does the notation mAP@[.5:.95] mean?
For detection, a common way to determine if one object proposal was right is Intersection over Union (IoU, IU). This takes the set $A$ of proposed object pixels and the set of true object pixels $B$ and calculates:
$$IoU(A, B) = \frac{A \cap B}{A…
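
The formula in this excerpt is cut off, so the sketch below relies on the commonly used definition of IoU, the size of the intersection divided by the size of the union, rather than on the truncated text. Representing pixels as sets of (row, column) tuples and the function name `iou` are assumptions made for illustration.

```python
def iou(proposed, truth):
    """Intersection over Union of two pixel sets, as a float in [0, 1]."""
    union = proposed | truth
    if not union:
        # Both sets are empty; returning 0.0 here is a convention of this sketch.
        return 0.0
    return len(proposed & truth) / len(union)


# Two hypothetical 2x2 regions that overlap in one column of pixels.
A = {(0, 0), (0, 1), (1, 0), (1, 1)}
B = {(0, 1), (1, 1), (0, 2), (1, 2)}
score = iou(A, B)  # 2 shared pixels out of 6 distinct pixels
```

A proposal would then typically be counted as correct when the score exceeds a chosen threshold such as 0.5.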

Martin Thoma
47
votes
2 answers
How does the validation_split parameter of Keras' fit function work?
The validation_split parameter of the Keras Sequential model's fit function is documented as follows at https://keras.io/models/sequential/ :
validation_split: Float between 0 and 1. Fraction of the training data
to be used as validation data. The model will set…
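
As far as I know, the Keras documentation goes on to say that the validation data is taken from the last samples of the provided data, before any shuffling. The sketch below mimics that behaviour on a plain Python list. The helper name `split_like_validation_split` is hypothetical and the index arithmetic is an assumption about the rounding, not Keras' actual code.

```python
def split_like_validation_split(samples, validation_split):
    """Hold out the LAST fraction of samples, without shuffling first."""
    if not 0.0 < validation_split < 1.0:
        raise ValueError("validation_split must be strictly between 0 and 1")
    # Assumed rounding: truncate the training size to an integer.
    split_at = int(len(samples) * (1.0 - validation_split))
    return samples[:split_at], samples[split_at:]


data = list(range(10))
train_part, val_part = split_like_validation_split(data, validation_split=0.2)
```

One practical consequence: if the data is ordered (for example, sorted by class), the held-out tail may not be representative unless the data is shuffled beforehand.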

rnso
47
votes
5 answers
Does gradient descent always converge to an optimum?
I am wondering whether there is any scenario in which gradient descent does not converge to a minimum.
I am aware that gradient descent is not always guaranteed to converge to a global optimum. I am also aware that it might diverge from an optimum…
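
One scenario of the kind the excerpt mentions, divergence caused by a step size that is too large, can be reproduced on the simplest possible objective. The sketch below assumes f(x) = x^2, whose gradient is 2x; the function name and the learning rates are illustrative choices.

```python
def gradient_descent_on_square(x0, learning_rate, steps):
    """Run plain gradient descent on f(x) = x**2 and return the final x."""
    x = x0
    for _ in range(steps):
        gradient = 2.0 * x           # derivative of x**2
        x = x - learning_rate * gradient
    return x


# Each step multiplies x by (1 - 2 * learning_rate).
converged = gradient_descent_on_square(1.0, learning_rate=0.1, steps=50)  # factor 0.8
diverged = gradient_descent_on_square(1.0, learning_rate=1.1, steps=50)   # factor -1.2
```

With the larger step size the iterate overshoots the minimum at 0 by more than it started with on every step, so its magnitude grows instead of shrinking.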

wit221
46
votes
6 answers
Calculating KL Divergence in Python
I am rather new to this and can't say I have a complete understanding of the theoretical concepts behind this. I am trying to calculate the KL Divergence between several lists of points in Python. I am using this to try and do this. The problem that…
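
For reference, a minimal sketch of the discrete KL divergence in plain Python follows. It assumes each list is already a probability distribution over the same outcomes (non-negative entries summing to 1), which may not match the "lists of points" in the question; raw samples would first have to be turned into such distributions, for example by histogramming. The function name is an illustrative choice.

```python
import math


def kl_divergence(p, q):
    """Return KL(P || Q) = sum_i p_i * log(p_i / q_i), in nats."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue            # by convention, 0 * log(0 / q) contributes 0
        if q_i == 0.0:
            return math.inf     # P puts mass where Q has none
        total += p_i * math.log(p_i / q_i)
    return total


p = [0.5, 0.5]
q = [0.9, 0.1]
forward = kl_divergence(p, q)
backward = kl_divergence(q, p)
```

Note that the divergence is not symmetric, so `forward` and `backward` generally differ.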

Nanda
46
votes
3 answers
What does from_logits=True do in the SparseCategoricalCrossentropy loss function?
The documentation mentions that y_pred must be in the range [-inf, inf] when from_logits=True. I don't understand what this means, since probabilities need to be in the range 0 to 1! Can someone please explain…
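
A rough way to picture the flag: logits are unnormalized scores that can be any real number, and a softmax turns them into probabilities before the cross-entropy is taken. The plain-Python sketch below illustrates that idea. The function `sparse_ce` is a hypothetical stand-in written for this example, not the Keras implementation, which handles batching and numerical details differently.

```python
import math


def softmax(logits):
    """Map arbitrary real-valued scores to probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_ce(true_class, y_pred, from_logits=False):
    """Cross-entropy for one sample whose label is an integer class index."""
    probs = softmax(y_pred) if from_logits else y_pred
    return -math.log(probs[true_class])


logits = [2.0, 1.0, 0.1]                 # any real numbers are allowed here
probs = softmax(logits)                  # each value lies in (0, 1)
loss_from_logits = sparse_ce(0, logits, from_logits=True)
loss_from_probs = sparse_ce(0, probs, from_logits=False)
```

Both calls give the same loss; the flag only says whether the softmax has already been applied to the model output.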

Nagendra Prasad
46
votes
12 answers
Data Science in C (or C++)
I'm an R language programmer. I'm also in the group of people who are considered Data Scientists but who come from academic disciplines other than CS.
This works out well in my role as a Data Scientist; however, by starting my career in R and only…

Hack-R
46
votes
9 answers
How much of data wrangling is a data scientist's job?
I'm currently working as a data scientist at a large company (my first job as a DS, so this question may be a result of my lack of experience). They have a huge backlog of really important data science projects that would have a great positive…

Victor Valente
46
votes
4 answers
Early stopping on validation loss or on accuracy?
I am currently training a neural network and I cannot decide which to use as my early stopping criterion: validation loss or a metric like accuracy/F1 score/AUC/whatever calculated on the validation set.
In my research, I came upon articles…

qmeeus
46
votes
10 answers
When is precision more important than recall?
Can anyone give me some examples where precision is important and some examples where recall is important?
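
For readers who want the two quantities side by side, here is a minimal sketch computing both from confusion-matrix counts. The function names and the counts are invented for illustration.

```python
def precision(true_positives, false_positives):
    """Of everything the model flagged as positive, the fraction that was right."""
    return true_positives / (true_positives + false_positives)


def recall(true_positives, false_negatives):
    """Of everything that was actually positive, the fraction the model found."""
    return true_positives / (true_positives + false_negatives)


# Hypothetical classifier: 8 correct alerts, 2 false alarms, 8 missed cases.
p = precision(true_positives=8, false_positives=2)
r = recall(true_positives=8, false_negatives=8)
```

Here the model is rarely wrong when it raises an alert but misses half of the real cases, which is acceptable only when false alarms cost more than misses.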

Rajat
46
votes
2 answers
Merging two different models in Keras
I am trying to merge two Keras models into a single model and I am unable to accomplish this.
For example in the attached Figure, I would like to fetch the middle layer $A2$ of dimension 8, and use this as input to the layer $B1$ (of dimension 8…

Rkz