Most Popular
1500 questions
8
votes
1 answer
What are the differences between SVC, NuSVC, and LinearSVC?
What are the differences between SVC, NuSVC, and LinearSVC?
Please shed some light.

Taylor
8
votes
4 answers
Can the vanishing gradient problem be solved by multiplying the input of tanh with a coefficient?
To my understanding, the vanishing gradient problem occurs during neural network training when the gradient of each activation function is less than 1, so that when corrections are back-propagated through many layers, the product of these gradients…

zephyr
8
votes
1 answer
How to extract features and classify alert emails coming from monitoring tools into proper category?
My company provides managed services to many of its clients. Our customers typically use the following monitoring tools to monitor their servers/webapps:
OpsView
Nagios
Pingdom
Custom shell scripts
Whenever any issue is found, an alert mail comes to…

Kartikeya Sinha
8
votes
2 answers
Model for Differing Number of Rows per Observation
I'm looking to build a response model (click or no click) on marketing data in which each person is shown a varying number of offers. I don't want to model which offer they click, but whether they click any of the offers presented to them. My issue is how to deal…

Zachary
8
votes
2 answers
Covariance as inner product
Why is covariance considered an inner product if there is no projection of one vector onto another?
Right now I perceive this as just a multiplication of the $x$ segment of a vector ($x_i - \bar{x}$) and the $y$ segment ($y_i - \bar{y}$) of the same vector in…

user641597
8
votes
2 answers
What are the disadvantages of having a left skewed distribution?
I'm currently working on a classification problem and I have a numerical column that is left-skewed. I've read many posts where people recommend a log transformation or Box-Cox transformation to fix the left skewness.
So I was wondering…

Jeeth
8
votes
1 answer
Is it OK to try to find the best PCA k parameter as we do with other hyperparameters?
Principal Component Analysis (PCA) is used to reduce n-dimensional data to k-dimensional data to speed things up in machine learning. After PCA is applied, one can check how much of the variance of the original dataset remains in the resulting…

J. Doe
8
votes
2 answers
ValueError: could not convert string to float: '���'
I have a (2M, 23) dimensional numpy array X. It has a dtype of

cappy0704
8
votes
2 answers
What activation function should I use for a specific regression problem?
Which is better for regression problems: a neural net with tanh/sigmoid and exp-like activations, or one with ReLU and linear activations? The standard is to use ReLU, but it's a brute-force solution that requires a certain net size, and I would like to avoid creating a…

quester
8
votes
3 answers
Fuzzy name and nickname match
I have a dataset with the following structure:
full_name,nickname,match
Christian Douglas,Chris,1,
Jhon Stevens,Charlie,0,
David Jr Simpson,Junior,1
Anastasia Williams,Stacie,1
Lara Williams,Ana,0
John Williams,Willy,1
where each predictor row…

David Masip
8
votes
3 answers
Why do I get an OOM error although my model is not that large?
I am a newbie in GPU-based training and deep learning models. I am running cDCGAN (Conditional DCGAN) in TensorFlow on my 2 Nvidia GTX 1080 GPUs. My data set consists of around 320,000 images of size 64*64 and 2,350 class labels. If I set my batch…

Ammar Ul Hassan
8
votes
2 answers
dataframe.columns.difference() use
I am trying to understand how dataframe.columns.difference() works but couldn't find a satisfactory explanation of it. Can anyone explain in detail how this method works?

Parth S.
8
votes
1 answer
Micro-F1 and Macro-F1 are equal in binary classification and I don't know why
I have a binary classification problem in which the test set contains an equal number of samples from both classes (the test counts for class 0 and class 1 are equal). Since we know that the number of samples from every class is equal, I use the median on the…

user137927
8
votes
1 answer
Gaussian Mixture Models as a classifier?
I'm learning the GMM clustering algorithm. I don't understand how it can be used as a classifier. Here are my thoughts:
1) GMM is an unsupervised ML algorithm. At least that's how sklearn categorizes it.
2) Unsupervised methods can cluster data, but…

F.S.
8
votes
2 answers
Date Extraction in Python
I would like to extract all date information from a given document. Essentially, I guess this can be done with a lot of regexes:
2019-02-20
20.02.2019 ("German format")
02/2019 ("February 2019")
"tomorrow" (datetime.timedelta(days=1))
"yesterday"…

Martin Thoma