Most Popular

1500 questions
5 votes • 1 answer

Is the LSTM component a neuron or a layer?

Given the standard illustrative feed-forward neural net model, with the dots as neurons and the lines as neuron-to-neuron connections, what part is the (unfolded) LSTM cell (see picture)? Is it a neuron (a dot) or a layer?
MScott • 445 • 4 • 13
5 votes • 1 answer

How powerful are OpenAI's Gym and Universe in the board-game area?

I'm a big fan of computer board games and would like to make Python chess/go/shogi/mancala programs. Having heard of reinforcement learning, I decided to look at OpenAI Gym. But first of all, I would like to know: is it possible using OpenAI…
Taissa • 63 • 4
5 votes • 2 answers

What is "Computational Linguistics"?

It's not clear to me whether or not someone whose work aims to improve an NLP system may be called a "Computational Linguist" even when she/he doesn't modify the algorithm directly by coding. Let's consider the following activities: Annotation for…
franz1 • 173 • 4
5 votes • 2 answers

What are examples of approaches to dimensionality reduction of feature vectors?

Given a pre-trained CNN model, I extract feature vectors of images in the reference and query datasets, each with several thousand elements. I would like to apply some dimensionality-reduction techniques to reduce the feature vector dimension to speed up cosine…
doplano • 299 • 3 • 10
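As a hedged aside, one standard technique the question is asking about is principal component analysis (PCA), computed here via SVD. All array sizes, variable names, and random data below are illustrative stand-ins for the CNN features described in the excerpt:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 1000 reference feature vectors of dimension 2048,
# as might come from a pre-trained CNN's penultimate layer.
feats = rng.standard_normal((1000, 2048))

# PCA via SVD on mean-centred data: keep the top-k principal components.
k = 128
mean = feats.mean(axis=0)
centred = feats - mean
_, _, vt = np.linalg.svd(centred, full_matrices=False)
components = vt[:k]                      # (k, 2048) projection matrix

reduced = centred @ components.T         # (1000, 128) compressed features

# Query vectors must be projected with the SAME mean and components.
query = rng.standard_normal((1, 2048))
query_reduced = (query - mean) @ components.T

print(reduced.shape, query_reduced.shape)  # (1000, 128) (1, 128)
```

Cosine similarity over the 128-dimensional vectors is then roughly 16x cheaper per comparison than over the raw 2048-dimensional features.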
5 votes • 1 answer

Which deep learning models are suitable for image-to-image mapping?

I am working on a problem in which I need to train a neural network to map one or more input images to one or more output images (1 channel per image). Below I report some examples of input & output. In this case I report 1 input and 1 output image,…
5 votes • 1 answer

Autoencoder produces repeated artifacts after convergence

As an experiment, I have tried using an autoencoder to encode height data from the Alps; however, the decoded image is very pixelated after training for several hours, as shown in the image below. This repeating pattern is larger than the final kernel…
Yadeses • 231 • 2 • 5
5 votes • 1 answer

Why is a softmax used rather than dividing each activation by the sum?

Just wondering why a softmax is typically used in practice on the outputs of most neural nets, rather than just summing the activations and dividing each activation by the sum. I know it's roughly the same thing, but what is the mathematical reasoning…
user8714896 • 797 • 1 • 6 • 24
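As a hedged sketch of the distinction this question draws: dividing raw activations by their sum breaks down for negative activations (and for sums near zero), while softmax exponentiates first so every output is a valid probability. The values below are made up for illustration:

```python
import numpy as np

def naive_normalise(z):
    # Dividing raw activations by their sum can yield "probabilities"
    # outside [0, 1] when some activations are negative.
    return z / z.sum()

def softmax(z):
    # Exponentiation maps every activation to a positive number, so the
    # result is always a valid distribution. Subtracting the max is the
    # standard numerical-stability trick (it cancels in the ratio).
    e = np.exp(z - z.max())
    return e / e.sum()

z = np.array([2.0, -1.0, 0.5])
print(naive_normalise(z))  # contains a negative "probability"
print(softmax(z))          # all entries in (0, 1), summing to 1
```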
5 votes • 1 answer

Why do we average gradients and not loss in distributed training?

I'm running some distributed trainings in Tensorflow with Horovod. It runs training separately on multiple workers, each of which uses the same weights and does forward pass on unique data. Computed gradients are averaged within the communicator…
pSoLT • 161 • 2
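A hedged aside on the question's premise: by linearity of differentiation, averaging per-worker gradients is the same as taking the gradient of the averaged loss. A toy scalar model (names and targets here are illustrative, not Horovod code) makes the identity concrete:

```python
import numpy as np

# Toy model: scalar weight w, per-worker loss L_i(w) = (w - t_i)^2
# with gradient dL_i/dw = 2 * (w - t_i). The targets t_i stand in for
# each worker's unique data shard.
w = 0.5
targets = np.array([1.0, 2.0, -3.0, 0.25])

per_worker_grads = 2.0 * (w - targets)
avg_of_grads = per_worker_grads.mean()

# Gradient of the averaged loss, computed analytically:
# d/dw mean_i (w - t_i)^2 = 2 * (w - mean(t_i))
grad_of_avg_loss = 2.0 * (w - targets.mean())

# By linearity of differentiation the two coincide.
print(np.isclose(avg_of_grads, grad_of_avg_loss))  # True
```

Averaging gradients rather than losses avoids shipping each worker's loss graph around: only the already-computed gradient tensors need to be communicated.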
5 votes • 1 answer

Is running more epochs really a direct cause of overfitting?

I've seen some comments in online articles/tutorials or Stack Overflow questions which suggest that increasing the number of epochs can result in overfitting. But my intuition tells me that there should be no direct relationship at all between the…
Alexander Soare • 1,339 • 2 • 11 • 27
5 votes • 1 answer

What is a "batch" in batch normalization?

I'm working on an example of CNN with the MNIST hand-written numbers dataset. Currently I've got convolution -> pool -> dense -> dense, and for the optimiser I'm using Mini-Batch Gradient Descent with a batch size of 32. Now this concept of batch…
Alexander Soare • 1,339 • 2 • 11 • 27
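As a hedged sketch answering the terminology question: at train time, the "batch" in batch normalization is the same mini-batch the optimizer uses, and the normalization statistics are computed per feature across that mini-batch. Sizes below mirror the batch size of 32 from the excerpt; everything else is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# One mini-batch of 32 examples with 64 features each.
x = rng.standard_normal((32, 64))

eps = 1e-5
# Train-time batch norm: mean and variance are computed PER FEATURE
# across the 32 examples of the current mini-batch.
mu = x.mean(axis=0)
var = x.var(axis=0)
x_hat = (x - mu) / np.sqrt(var + eps)

# Learnable scale and shift (gamma, beta) restore expressiveness.
gamma, beta = np.ones(64), np.zeros(64)
y = gamma * x_hat + beta

print(y.mean(axis=0).max())  # ~0 for every feature
```

At inference time there is no mini-batch to average over, so frameworks substitute running estimates of `mu` and `var` accumulated during training.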
5 votes • 0 answers

Training and inference for highly-context-sensitive information

What is the best way to train / do inference when the context matters highly as to what the inferred result should be? For example in the image below all people are standing upright, but because of the perspective of the camera, their location…
g491 • 101 • 2
5 votes • 1 answer

Are neurons in layer $l$ only affected by neurons in the previous layer?

Are artificial neurons in layer $l$ only affected by those in layer $l-1$ (providing inputs) or are they also affected by neurons in layer $l$ (and maybe by neurons in other layers)?
George White • 194 • 1 • 9
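A hedged numerical aside on the question: in a plain feed-forward network, layer $l$ depends only on layer $l-1$, via $a_l = f(W_l a_{l-1} + b_l)$; recurrent layers are the usual exception, where a layer also receives its own activations from the previous time step. All sizes and names below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

a_prev = rng.standard_normal(4)   # activations of layer l-1
W = rng.standard_normal((3, 4))   # weights into layer l
b = np.zeros(3)

# Feed-forward: layer l is a function of layer l-1 only.
a_l = np.tanh(W @ a_prev + b)

# Recurrent exception: a hypothetical weight matrix U feeds the layer's
# own previous-time-step activations back into itself.
U = rng.standard_normal((3, 3))
a_l_next = np.tanh(W @ a_prev + U @ a_l + b)

print(a_l.shape, a_l_next.shape)  # (3,) (3,)
```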
5 votes • 1 answer

How can we prove that an auto-associator network will continue to perform if we zero the diagonal elements of a weight matrix?

How can we prove that an auto-associator network will continue to perform if we zero the diagonal elements of a weight matrix that has been determined by the Hebb rule? In other words, suppose that the weight matrix is determined from $W = PP^T-…
estamos • 157 • 1 • 12
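A numerical sketch (not a proof) of the claim: each diagonal element $W_{ii}$ only feeds neuron $i$ back to itself with a pattern-independent weight, so zeroing the diagonal shrinks every neuron's self-excitation equally and leaves the recall signs unchanged when the stored patterns are sufficiently uncorrelated. The patterns below are hypothetical random bipolar vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

# Store m bipolar patterns of dimension n with the Hebb rule W = P P^T.
n, m = 64, 3
P = rng.choice([-1.0, 1.0], size=(n, m))   # columns are stored patterns
W = P @ P.T

p = P[:, 0]
recall_full = np.sign(W @ p)               # recall with full W

# Zero the diagonal: for bipolar patterns every W_ii equals m, so this
# subtracts the same constant self-feedback from every neuron.
W0 = W - np.diag(np.diag(W))
recall_zeroed = np.sign(W0 @ p)            # recall is unchanged

print(np.array_equal(recall_full, p), np.array_equal(recall_zeroed, p))
```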
5 votes • 1 answer

When exactly is a model considered over-parameterized?

When exactly is a model considered over-parameterized? There is some recent research in deep learning on the role of over-parameterization in generalization, so it would be nice to know exactly what counts as such. A…
Phúc Lê • 161 • 5
5 votes • 2 answers

What is Statistical relational learning?

I have gone through the Wikipedia explanation of SRL, but it only confused me more: Statistical relational learning (SRL) is a subdiscipline of artificial intelligence and machine learning that is concerned with domain models that exhibit both…
Dawny33 • 1,371 • 13 • 29