Most Popular
1500 questions
289 votes, 8 answers
Micro Average vs Macro average Performance in a Multiclass classification setting
I am trying out a multiclass classification setting with 3 classes. The class distribution is skewed with most of the data falling in 1 of the 3 classes. (class labels being 1,2,3, with 67.28% of the data falling in class label 1, 11.99% data in…

SHASHANK GUPTA
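To make the distinction in the question above concrete, here is a minimal sketch of micro- versus macro-averaged recall on a skewed three-class problem. The labels are made-up toy data (not the asker's dataset), and `recall_averages` is a hypothetical helper name, not a library function.

```python
from collections import Counter

def recall_averages(y_true, y_pred):
    """Return (micro, macro) recall for single-label multiclass predictions."""
    labels = sorted(set(y_true))
    support = Counter(y_true)  # number of true samples per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class
    # Micro: pool the counts over all classes, then divide once.
    micro = sum(hits.values()) / len(y_true)
    # Macro: compute recall per class, then take the unweighted mean.
    per_class = [hits[c] / support[c] for c in labels]
    macro = sum(per_class) / len(labels)
    return micro, macro

# Toy labels (assumption): class 1 dominates, classes 2 and 3 are rare.
y_true = [1] * 6 + [2] * 2 + [3] * 2
y_pred = [1] * 6 + [1, 2] + [1, 3]
micro, macro = recall_averages(y_true, y_pred)
print(f"micro={micro:.3f} macro={macro:.3f}")
```

In this toy case the micro average (0.8) is pulled up by the majority class, while the macro average (about 0.667) weights the two rare classes equally with it.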
283 votes, 12 answers
What are deconvolutional layers?
I recently read Fully Convolutional Networks for Semantic Segmentation by Jonathan Long, Evan Shelhamer, Trevor Darrell. I don't understand what "deconvolutional layers" do / how they work.
The relevant part is
3.3. Upsampling is backwards strided…

Martin Thoma
263 votes, 10 answers
How to set class weights for imbalanced classes in Keras?
I know that Keras supports this through the class_weight dictionary parameter at fitting time, but I couldn't find any example. Would somebody be so kind as to provide one?
By the way, in this case the appropriate practice is simply to weight up the…

Hendrik
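As a rough illustration of what such a class-weight dictionary can look like, the sketch below computes inverse-frequency weights from a toy label list. The heuristic shown (n_samples / (n_classes * class_count)) is one common choice, not necessarily what the asker intended; `balanced_class_weights` and the data are hypothetical.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Map each class to n_samples / (n_classes * count), so rare classes weigh more."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Toy integer labels (assumption): class 0 is six times as frequent as class 2.
y_train = [0] * 6 + [1] * 3 + [2] * 1
class_weight = balanced_class_weights(y_train)
print(class_weight)

# With a compiled Keras model (assumed to exist as `model`), the dictionary
# would be passed at fitting time:
#   model.fit(X_train, y_train, class_weight=class_weight)
```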
240 votes, 10 answers
What's the difference between fit and fit_transform in scikit-learn models?
I do not understand the difference between the fit and fit_transform methods in scikit-learn. Can anybody explain simply why we might need to transform data?
What does it mean to fit a model on training data and then transform the test data? Does it…

Kaggle
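The fit / transform / fit_transform convention asked about above can be sketched with a toy standardizer. `MeanStdScaler` is a hypothetical stand-in written for illustration only; it mimics the method names of scikit-learn transformers but is not scikit-learn code.

```python
from statistics import mean, pstdev

class MeanStdScaler:
    """Toy transformer following the fit/transform naming convention."""

    def fit(self, values):
        # Learn parameters from the data; nothing is transformed yet.
        self.mean_ = mean(values)
        self.std_ = pstdev(values) or 1.0  # avoid dividing by zero
        return self

    def transform(self, values):
        # Apply the parameters learned in fit() to any data.
        return [(v - self.mean_) / self.std_ for v in values]

    def fit_transform(self, values):
        # Shorthand for fit() followed by transform() on the same data.
        return self.fit(values).transform(values)

train = [1.0, 2.0, 3.0]
test = [4.0]
scaler = MeanStdScaler()
train_scaled = scaler.fit_transform(train)  # parameters come from train
test_scaled = scaler.transform(test)        # reuse them; do not refit on test
```

The point of the split is that the test data is scaled with the training mean and standard deviation, so no information from the test set leaks into the learned parameters.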
202 votes, 13 answers
K-Means clustering for mixed numeric and categorical data
My data set contains a number of numeric attributes and one categorical.
Say, NumericAttr1, NumericAttr2, ..., NumericAttrN, CategoricalAttr,
where CategoricalAttr takes one of three possible values: CategoricalAttrValue1, CategoricalAttrValue2 or…

IgorS
202 votes, 35 answers
Publicly Available Datasets
One of the common problems in data science is gathering data from various sources in a somewhat cleaned (semi-structured) format and combining metrics from various sources for making a higher level analysis. Looking at other people's efforts,…

Amir Ali Akbari
198 votes, 5 answers
What is the "dying ReLU" problem in neural networks?
Referring to the Stanford course notes on Convolutional Neural Networks for Visual Recognition, a paragraph says:
"Unfortunately, ReLU units can be fragile during training and can
"die". For example, a large gradient flowing through a ReLU…

tejaskhot
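One way to picture a "dead" ReLU unit is a single neuron whose bias has been pushed far into the negative range, for example by an oversized update. The numbers below are hypothetical and chosen only to illustrate the effect: once the pre-activation is negative for every input, both the output and the gradient with respect to the weight are zero, so gradient descent can no longer move the unit.

```python
def relu(z):
    return max(0.0, z)

def relu_grad(z):
    # Derivative of ReLU with respect to its input (0 is used at z == 0).
    return 1.0 if z > 0 else 0.0

inputs = [0.5, 1.0, 2.0]   # toy inputs (assumption: inputs stay in a small range)
w, b = 1.0, -100.0         # hypothetical parameters after an oversized update

pre_activations = [w * x + b for x in inputs]
outputs = [relu(z) for z in pre_activations]
# Gradient of the output with respect to w is relu'(z) * x.
grads_w = [relu_grad(z) * x for z, x in zip(pre_activations, inputs)]

print(outputs)   # every output is zero
print(grads_w)   # every gradient is zero, so w receives no update
```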
198 votes, 16 answers
Train/Test/Validation Set Splitting in Sklearn
How could I randomly split a data matrix and the corresponding label vector into X_train, X_test, X_val, y_train, y_test, and y_val with scikit-learn?
As far as I know, sklearn.model_selection.train_test_split is only capable of splitting into two not…

Hendrik
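A three-way split can be sketched without any library by shuffling the row indices once and slicing them into three parts. This is a standard-library illustration, not scikit-learn's implementation; `train_val_test_split` and its parameters are hypothetical names. With scikit-learn itself, a common approach is to call `train_test_split` twice (once to hold out the test set, once more to carve a validation set out of the remainder).

```python
import random

def train_val_test_split(X, y, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle indices once, then slice into test, validation, and training parts."""
    if len(X) != len(y):
        raise ValueError("X and y must have the same length")
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_test = int(len(X) * test_frac)
    n_val = int(len(X) * val_frac)
    test_idx = indices[:n_test]
    val_idx = indices[n_test:n_test + n_val]
    train_idx = indices[n_test + n_val:]

    def pick(data, idx):
        return [data[i] for i in idx]

    return (pick(X, train_idx), pick(X, val_idx), pick(X, test_idx),
            pick(y, train_idx), pick(y, val_idx), pick(y, test_idx))

# Toy data (assumption): 10 rows, label is the parity of the first feature.
X = [[i, i * 2] for i in range(10)]
y = [i % 2 for i in range(10)]
X_train, X_val, X_test, y_train, y_val, y_test = train_val_test_split(X, y)
```

Because the same index lists are used for `X` and `y`, each row stays paired with its label in all three subsets.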
198 votes, 6 answers
How to draw Deep learning network architecture diagrams?
I have built my model. Now I want to draw the network architecture diagram for my research paper. Example is shown below:

Muhammad Ali
193 votes, 1 answer
Difference between isna() and isnull() in pandas
I have been using pandas for quite some time, but I don't understand the difference between isna() and isnull() or, more importantly, which one to use when identifying missing values in a DataFrame.
What is the basic underlying difference…

Vaibhav Thakur
180 votes, 6 answers
When to use GRU over LSTM?
The key difference between a GRU and an LSTM is that a GRU has two gates (reset and update gates) whereas an LSTM has three gates (namely input, output and forget gates).
Why do we make use of GRU when we clearly have more control over the network…

Sayali Sonawane
178 votes, 21 answers
How do you visualize neural network architectures?
When writing a paper or giving a presentation about neural networks, one usually visualizes the network's architecture.
What are good / simple ways to visualize common architectures automatically?

Martin Thoma
172 votes, 4 answers
When to use One Hot Encoding vs LabelEncoder vs DictVectorizer?
I have been building models with categorical data for a while now, and in this situation I basically default to using scikit-learn's LabelEncoder to transform this data prior to building a model.
I understand the difference between OHE,…

anthr
151 votes, 6 answers
The cross-entropy error function in neural networks
In the MNIST For ML Beginners tutorial, cross-entropy is defined as
$$H_{y'} (y) := - \sum_{i} y_{i}' \log (y_i)$$
$y_i$ is the predicted probability value for class $i$ and $y_i'$ is the true probability for that class.
Question 1
Isn't it a problem that…

Martin Thoma
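The cross-entropy formula quoted in the question above, $H_{y'}(y) = -\sum_i y_i' \log(y_i)$, can be evaluated directly for a small example. The probability vectors below are made up for illustration, and `cross_entropy` is a hypothetical helper name. Terms with $y_i' = 0$ are skipped, following the usual convention that $0 \cdot \log 0 = 0$.

```python
import math

def cross_entropy(y_true, y_pred):
    """Compute -sum_i y_true[i] * log(y_pred[i]) using the natural logarithm.

    Terms where y_true[i] == 0 are skipped; a predicted probability of 0
    for a class with y_true[i] > 0 would raise ValueError from math.log.
    """
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)

y_true = [0.0, 1.0, 0.0]   # one-hot true distribution (toy values)
y_pred = [0.1, 0.7, 0.2]   # predicted class probabilities (toy values)
loss = cross_entropy(y_true, y_pred)
print(round(loss, 4))
```

With a one-hot target, only the predicted probability of the true class contributes, so the loss here reduces to $-\log(0.7)$.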
150 votes, 17 answers
Best python library for neural networks
I'm using Neural Networks to solve different Machine learning problems. I'm using Python and pybrain but this library is almost discontinued. Are there other good alternatives in Python?

marcodena