Highest Voted Questions - Artificial Intelligence Stack Exchange

22

votes

2 answers

How to define states in reinforcement learning?

I am studying reinforcement learning and its variants. I am starting to understand how the algorithms work and how they apply to an MDP. What I don't understand is the process of defining the states of the MDP. In most examples…

asked Aug 30 '18 at 23:45

Andy

323
1
2
6

22

votes

1 answer

How does the (decoder-only) transformer architecture work?

How does the (decoder-only) transformer architecture, which is used in impressive models such as GPT-4, work?

asked Apr 23 '23 at 19:28

Robin van Hoorn

2,366
1
10
33

22

votes

2 answers

Why would you implement the position-wise feed-forward network of the transformer with convolution layers?

The Transformer model introduced in "Attention is all you need" by Vaswani et al. incorporates a so-called position-wise feed-forward network (FFN): In addition to attention sub-layers, each of the layers in our encoder and decoder contains a…

asked Sep 18 '19 at 23:45

Eli Korvigo

321
1
2
6

22

votes

1 answer

Has the Lovelace Test 2.0 been successfully used in an academic setting?

In October 2014, Dr. Mark Riedl published an approach to testing AI intelligence, called the "Lovelace Test 2.0", after being inspired by the original Lovelace Test (published in 2001). Mark believed that the original Lovelace Test would be…

asked Aug 07 '16 at 18:17

Left SE On 10_6_19

1,660
9
23

22

votes

3 answers

Why doesn't Q-learning converge when using function approximation?

The tabular Q-learning algorithm is guaranteed to find the optimal $Q$ function, $Q^*$, provided the following conditions (the Robbins-Monro conditions) regarding the learning rate are satisfied: $\sum_{t} \alpha_t(s, a) = \infty$, $\sum_{t}…

asked Apr 05 '19 at 18:23

nbro

40,472
12
105
192

21

votes

5 answers

Why does Batch Normalization work?

Adding BatchNorm layers improves training time and makes the whole deep model more stable. That's an experimental fact that is widely used in machine learning practice. My question is: why does it work? The original (2015) paper motivated the…

asked Apr 10 '21 at 23:05

Kostya

2,515
10
24

21

votes

3 answers

Is a dystopian surveillance state computationally possible?

This isn't really a conspiracy theory question. It is more of an inquiry into global computational power and data storage logistics. Most recording instruments, such as cameras and microphones, are typically voluntary opt-in devices, in that,…

asked Feb 28 '20 at 08:34

Harrison Tran

319
2
6

21

votes

2 answers

What is the difference between First-Visit Monte-Carlo and Every-Visit Monte-Carlo Policy Evaluation?

I came across these two algorithms, but I cannot understand the difference between them, either in terms of implementation or intuitively. So, what difference does the second point in both slides refer to?

asked Feb 22 '19 at 09:28

user9947

20

votes

1 answer

Why do you not see dropout layers on reinforcement learning examples?

I've been looking at reinforcement learning, and specifically playing around with creating my own environments to use with the OpenAI Gym AI. I am using agents from the stable_baselines project to test with it. One thing I've noticed in virtually…

asked Oct 07 '18 at 09:55

Matt Hamilton

333
2
5

20

votes

4 answers

Why do we need floats for using neural networks?

Is it possible to make a neural network that uses only integers by scaling the input and output of each function to [-INT_MAX, INT_MAX]? Are there any drawbacks?

asked Jul 22 '18 at 14:12

elimohl

311
1
2
5

20

votes

3 answers

How are Artificial Neural Networks and the Biological Neural Networks similar and different?

I've heard multiple times that "Neural Networks are the best approximation we have to model the human brain", and I think it is commonly known that Neural Networks are modelled after our brain. I strongly suspect that this model has been simplified,…

asked Apr 08 '18 at 21:34

Andreas Storvik Strauman

491
3
15

20

votes

3 answers

How can we process the data from both the true distribution and the generator?

I'm struggling to understand the GAN loss function as provided in Understanding Generative Adversarial Networks (a blog post written by Daniel Seita). In the standard cross-entropy loss, we have an output that has been run through a sigmoid function…

asked Jun 13 '17 at 10:50

tryingtolearn

385
1
2
10

20

votes

2 answers

How do neural networks play chess?

I have been spending a few days trying to wrap my head around how and why neural networks are used to play chess. Although I know very little about how the game of chess works, I can understand the following idea. Theoretically, we could make a…

asked Mar 18 '22 at 15:14

stats_noob

329
1
11

20

votes

2 answers

Why does GPT-2 Exclude the Transformer Encoder?

After looking into transformers, BERT, and GPT-2, from what I understand, GPT-2 essentially uses only the decoder part of the original transformer architecture and uses masked self-attention that can only look at prior tokens. Why does GPT-2 not…

asked Mar 27 '21 at 19:55

Athena Wisdom

351
1
2
5

20

votes

2 answers

What is the "Hello World" problem of Reinforcement Learning?

As we all know, "Hello World" is usually the first program that any programmer learns/implements in any language/framework. As Aurélien Géron mentioned in his book, MNIST is often called the Hello World of Machine Learning. Is there any "Hello…

asked Sep 13 '20 at 12:57

Arpit-Gole

394
2
9
