Highest Voted Questions - Data Science Stack Exchange

8

votes

2 answers

How should I use BERT embeddings for clustering (as opposed to fine-tuning a BERT model for a supervised task)?

First of all, I want to say that I am asking this question because I am interested in using BERT embeddings as document features for clustering. I am using the Transformers library from Hugging Face. I was thinking of averaging all of the Word…
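The averaging idea mentioned in this excerpt can be sketched without any model at all. The snippet below is a minimal illustration, assuming the token vectors and the attention mask have already been obtained from some encoder; `mean_pool`, `toy_tokens`, and `toy_mask` are hypothetical names and values, not part of the Transformers API.

```python
from typing import List


def mean_pool(token_vectors: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    if len(token_vectors) != len(attention_mask):
        raise ValueError("token_vectors and attention_mask must have the same length")
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep == 1]
    if not kept:
        raise ValueError("no unmasked tokens to average")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Hypothetical toy input: three 2-dimensional token vectors, the last one is padding.
toy_tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
toy_mask = [1, 1, 0]

doc_vector = mean_pool(toy_tokens, toy_mask)
print(doc_vector)  # [2.0, 3.0]
```

The resulting fixed-length vector is the kind of document feature that could then be handed to a clustering algorithm; whether plain averaging works well for a given corpus is something to check empirically.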

asked Aug 21 '20 at 02:00

fractalnature

805
6
19

8

votes

4 answers

Understanding how convolutional layers work

After working with a CNN using Keras and the MNIST dataset for the well-known handwritten digit recognition problem, I came up with some questions about how convolutional layers work. I can understand what the convolution process is. My first…

asked Aug 18 '20 at 11:48

Karampistis Dimitrios

93
1
4

8

votes

4 answers

Does reinforcement learning require the help of other learning algorithms?

Can't reinforcement learning be used without the help of other learning algorithms, such as SVMs or MLPs trained with backpropagation? I consulted two papers (Paper 1 and Paper 2); both used other machine learning methods in the inner loop.

asked Sep 07 '15 at 08:29

girl101

1,161
2
11
26

8

votes

3 answers

Are there any machine learning techniques to identify points on plots/images?

I have data for each vehicle's lateral position over time and lane number, as shown in these 3 plots in the image and the sample data below.

> a
  Frame.ID   xcoord Lane
1      452 27.39400    3
2      453 27.38331    3
3      454 27.42999    3
4 …

asked Sep 06 '15 at 02:04

umair durrani

344
2
8

8

votes

2 answers

Can a linear regression model without polynomial features overfit?

I've read in some articles on the internet that linear regression can overfit. However, is that possible when we are not using polynomial features? We are just plotting a line through the data points when we have one feature or a plane when we have…
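One situation in which a straight line can still fit noise is when the model has as many parameters as there are training points. The sketch below is an illustration under made-up assumptions: the data values are hypothetical, the true relation is taken to be y = x, and `fit_line` and `mse` are helper names introduced here.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mse(xs, ys, slope, intercept):
    """Mean squared error of the fitted line on the given points."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: the true relation is y = x, but the two training
# targets are noisy. Two points, two parameters: the line passes through both.
train_x, train_y = [0.0, 1.0], [0.3, 0.7]
test_x, test_y = [2.0, 3.0], [2.0, 3.0]  # noise-free points from y = x

slope, intercept = fit_line(train_x, train_y)
train_mse = mse(train_x, train_y, slope, intercept)
test_mse = mse(test_x, test_y, slope, intercept)
print(f"train MSE = {train_mse:.4f}, test MSE = {test_mse:.4f}")
```

The training error is essentially zero while the error on the held-out points is large, which is the usual signature of overfitting. With many more samples than parameters, the same model would be far less prone to this.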

asked Aug 08 '20 at 20:21

Tim von Känel

361
1
10

8

votes

4 answers

Job title similarity

I'm trying to define a metric between job titles in the IT field. For this I need some metric between words of job titles that do not appear together in the same job title, e.g. a metric between the words senior, primary, lead, head, vp, director,…

asked Jul 21 '14 at 09:00

Mher

181
5

8

votes

1 answer

Does anybody know what this type of visualisation is called?

I think this is a pretty cool way to visualise changes in values, but I can't find any name for this type of visualisation. Source: https://www.economist.com/graphic-detail/2020/07/28/americans-are-getting-more-nervous-about-what-they-say-in-public

visualization

asked Jul 29 '20 at 17:53

K G

183
3

8

votes

3 answers

Should you use random state or random seed in machine learning models?

I'm starting to study machine learning. In all the examples I saw, the person who created the ML model used a random state or a random seed to fix the randomness of the process. But, in real life, when you're trying to apply a machine learning model…
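What a fixed seed does can be shown with the standard library alone. This is a minimal sketch, assuming a shuffle stands in for whatever random step a model uses (data splitting, weight initialisation, and so on); `shuffled_indices` is a hypothetical helper.

```python
import random


def shuffled_indices(n, seed=None):
    """Return a shuffled list of 0..n-1 using a private, optionally seeded generator."""
    rng = random.Random(seed)  # seed=None draws entropy from the OS
    indices = list(range(n))
    rng.shuffle(indices)
    return indices


# Same seed, same order: the "random" step becomes reproducible.
run_a = shuffled_indices(10, seed=42)
run_b = shuffled_indices(10, seed=42)
print(run_a == run_b)  # True

# No seed: the order will generally differ from run to run.
run_c = shuffled_indices(10)
print(run_c)
```

Seeding makes an experiment repeatable; it does not make the result any less dependent on the particular seed chosen, which is why results are often reported over several seeds.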

asked Jul 22 '20 at 02:43

Caldass_

167
1
7

8

votes

3 answers

Modality of data

Can anyone please explain in clear words what is generally meant by "modality of data"? I know what modality means with respect to distributions.

asked Jul 05 '20 at 11:56

Julia

81
1
2

8

votes

3 answers

BERT Transformer: Why does BERT use the [CLS] token for classification instead of averaging over all tokens?

I am doing experiments on the BERT architecture and found that most fine-tuning tasks take the final hidden layer as the text representation and later pass it to other models for further downstream tasks. BERT's last layer looks like this…

asked Jul 02 '20 at 21:25

Aaditya ura

415
5
16

8

votes

2 answers

Do I need validation data if my train and test accuracy/loss is consistent?

I am trying to understand the purpose of a 3rd split in the form of a validation dataset. I am not necessarily talking about cross-validation here. In the scenario below, it would appear that the model is overfit to the training dataset. Train…

asked Jun 16 '20 at 01:18

Kermit

529
5
17

8

votes

2 answers

Is overfitting okay if test accuracy is high enough?

I am trying to build a binary classifier. I have tried deep neural networks with various structures and parameters, and I was not able to get anything better than: Train set accuracy: 0.70102, Test set accuracy: 0.70001. Then I tried…

asked May 23 '20 at 04:54

skrrrt

304
2
13

8

votes

2 answers

Why do scikit-learn and statsmodels provide different coefficients of determination?

First of all, I know there is a similar question; however, I didn't find it very helpful. My issue concerns simple linear regression and the resulting R-squared. I found that results can be quite different if I use statsmodels and…

asked May 19 '20 at 08:03

Luckasino

183
1
4

8

votes

1 answer

Which ML approach to choose for the game AI when rewards are delayed?

Question: Which Machine Learning approach should I choose for the AI of my computer game, where the actions of the AI do not lead to immediate rewards, but delayed rewards instead? About me: I am a complete beginner in the area of machine learning.…

asked May 17 '20 at 11:43

Logende

61
4

8

votes

1 answer

Keras Early Stopping: Monitor 'loss' or 'val_loss'?

I often use "early stopping" when I train neural nets, e.g. in Keras:

    from keras.callbacks import EarlyStopping
    # Define early stopping as callback
    early_stopping = EarlyStopping(monitor='loss', patience=5, mode='auto',…
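The difference between monitoring the training loss and a validation loss can be seen in a simplified, framework-free sketch of patience-based stopping. This is not Keras's actual implementation; `stopping_epoch` is a hypothetical helper and both loss curves are made-up numbers.

```python
def stopping_epoch(metric_values, patience):
    """Return the epoch index at which training would stop, or None if it never does.

    Stops once the monitored value has failed to improve (decrease)
    for `patience` consecutive epochs.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, value in enumerate(metric_values):
        if value < best:
            best = value
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Hypothetical curves: training loss keeps falling, validation loss turns upward.
train_loss = [1.0, 0.8, 0.6, 0.5, 0.4, 0.3]
val_loss = [1.0, 0.8, 0.7, 0.75, 0.8, 0.9]

print(stopping_epoch(train_loss, patience=2))  # None: never triggers
print(stopping_epoch(val_loss, patience=2))    # 4: two epochs without improvement
```

On curves like these, a rule that watches the training loss never fires, while the same rule on the validation loss halts shortly after the model stops generalising better.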

asked May 07 '20 at 11:37

Peter

7,446
5
19
49
