Most Popular

1500 questions
5 votes, 1 answer

What does the notation $\mathcal{N}(z; \mu, \sigma)$ stand for in statistics?

I know that the notation $\mathcal{N}(\mu, \sigma)$ stands for a normal distribution. But I'm reading the book "An Introduction to Variational Autoencoders" and in it, there is this notation: $$\mathcal{N}(z; 0, I)$$ What does it mean? picture of…
Peyman
5 votes, 1 answer

How does the Ornstein-Uhlenbeck process work, and how is it used in DDPG?

In section 3 of the paper "Continuous control with deep reinforcement learning", the authors write: As detailed in the supplementary materials we used an Ornstein-Uhlenbeck process (Uhlenbeck & Ornstein, 1930) to generate temporally correlated…
dani
5 votes, 1 answer

Why is the mean used to compute the expectation in the GAN loss?

From Goodfellow et al. (2014), we have the adversarial loss: $$\min_G \, \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)} \, [\log D(x)] + \mathbb{E}_{z \sim p_z(z)} \, [\log (1 - D(G(z)))].$$ In practice, the expectation is…
5 votes, 1 answer

Can you convert an MDP problem to a Contextual Multi-Arm Bandits problem?

I'm trying to get a better understanding of Multi-Arm Bandits, Contextual Multi-Arm Bandits and Markov Decision Processes. Basically, Multi-Arm Bandits is a special case of Contextual Multi-Arm Bandits where there is no state (features/context). And…
peidaqi
5 votes, 2 answers

Why are policy iteration and value iteration studied as separate algorithms?

In Sutton and Barto's book about reinforcement learning, policy iteration and value iteration are presented as separate/different algorithms. This is very confusing because policy iteration includes an update/change of value and value iteration…
User007
5 votes, 2 answers

How can we prevent AGI from doing drugs?

I recently read some introductions to AI alignment, AIXI and decision theory. As far as I understand, one of the main problems in AI alignment is how to define a utility function well, without causing something like the paperclip apocalypse. Then…
user3584499
5 votes, 1 answer

Why does TD Learning require Markovian domains?

One of my friends and I were discussing the differences between Dynamic Programming, Monte-Carlo, and Temporal Difference (TD) Learning as policy evaluation methods - and we agreed on the fact that Dynamic Programming requires the Markov assumption…
stoic-santiago
5 votes, 1 answer

How can I find a specific word in an audio file?

I'm trying to train and use a neural network to detect a specific word in an audio file. The input of the neural network is an audio clip 2-3 seconds long, and the neural network must determine whether the input audio (the voice of a person)…
Ali.kavari76
5 votes, 1 answer

What is eager learning and lazy learning?

What is the difference between eager learning and lazy learning? How does eager learning or lazy learning help me build a neural network system? And how can I use it for any target function?
mogoja
5 votes, 1 answer

Why do DQNs tend to forget?

Why do DQNs tend to forget? Is it because when you feed highly correlated samples, your model (function approximation) doesn't give a general solution? For example: I use level 1 experiences, my model $p$ is fitted to learn how to play that…
Chukwudi
5 votes, 2 answers

Could an AI be sentient?

In theory, could an AI become sentient, as in learning and becoming self-aware, all from its source code?
5 votes, 3 answers

Why is symbolic AI not as popular as ANNs, yet was used by IBM's Deep Blue?

Everybody is implementing and using DNNs with, for example, TensorFlow or PyTorch. I thought IBM's Deep Blue was an ANN-based AI system, but this article says that IBM's Deep Blue was symbolic AI. Are there any special features in symbolic AI that…
Dan D.
5 votes, 1 answer

Why do we need a target network in deep Q-learning?

I already know deep RL, but to understand it more deeply I want to know why we need two networks in deep RL. What does the target network do? I know there is a lot of mathematics behind this, but I want to understand deep Q-learning thoroughly, because I am about to make…
dato nefaridze
5 votes, 1 answer

What is a "closed expression" in the context of logic?

I was reading about logic systems and the following phrase appeared: "any closed expression that is not derivable inside the same system". What is a "closed expression" in this context? What does "closed expression that is not derivable" mean?
Ale
5 votes, 2 answers

What is a trap function in the context of a genetic algorithm?

What is a trap function in the context of a genetic algorithm? How is it related to the concepts of local and global optima?