Questions tagged [deep-learning]

An area of Machine Learning research concerned with learning hierarchical representations of data, mainly using deep neural networks (i.e. networks with two or more hidden layers), but also with certain kinds of Probabilistic Graphical Models.

What is Deep Learning?

Deep Learning is an area of Machine Learning which attempts to learn complex functions by using special architectures composed of many layers (hence the term "deep").

Deep architectures allow more complex tasks to be learned, not only because the additional layers can perform more transformations, but because the greater depth lets a hierarchical organization of functionality emerge: early layers learn simple features that later layers combine into increasingly abstract representations.

Deep Learning was introduced into machine learning research with the intention of moving machine learning closer to artificial intelligence. A significant impact of deep learning lies in feature learning: it mitigates much of the manual feature-engineering effort required by shallower, non-deep approaches.
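
As a minimal illustration of what "composed of many layers" means in practice, here is a hedged sketch of a small fully-connected network in PyTorch (the layer sizes are arbitrary, chosen only for illustration):

    import torch.nn as nn

    # A "deep" network: several stacked hidden layers, each transforming
    # the representation produced by the layer before it.
    model = nn.Sequential(
        nn.Linear(784, 256), nn.ReLU(),   # first hidden layer
        nn.Linear(256, 128), nn.ReLU(),   # second hidden layer
        nn.Linear(128, 64), nn.ReLU(),    # third hidden layer
        nn.Linear(64, 10),                # output layer, e.g. 10 classes
    )

Each Linear/ReLU pair is one learned transformation; stacking several of them is what the "two or more hidden layers" in the tag description refers to.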


New to Deep Learning?

There are a variety of resources, including books and tutorials/workshops, for those looking to learn more about Deep Learning.

A popular introductory tutorial is the SciPy 2020 Conference Tutorial.

Some popular introductory books:


Resources

Papers

Books

Videos

Stack Exchange Sites

Other Stack Exchange sites with a Deep Learning tag:

4871 questions
13 votes · 2 answers

Sort numbers using only 2 hidden layers

I'm reading the cornerstone paper Sequence to Sequence Learning with Neural Networks by Ilya Sutskever and Quoc Le. On the first page, it briefly mentions that: A surprising example of the power of DNNs is their ability to sort N N-bit numbers…
aerin (907)

13 votes · 2 answers

Deep learning for non-image, non-NLP tasks?

So far there are many interesting applications of deep learning in computer vision and natural language processing. What about other, more traditional fields? For example, I have traditional socio-demographic variables plus maybe a lot of lab…
spore234 (603)

10 votes · 2 answers

ReLU has 0 gradient by definition, so why isn't the vanishing gradient a problem for x < 0?

By definition, ReLU is max(0, f(x)). Its gradient is then defined as 1 if x > 0 and 0 if x < 0. Wouldn't this mean the gradient always vanishes (is 0) when x < 0? Then why do we say ReLU doesn't suffer from the vanishing gradient problem?
Edamame (2,745)
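
A quick way to reproduce the behavior the question describes, as a small PyTorch sketch (the input values are arbitrary):

    import torch

    x = torch.tensor([-2.0, -0.5, 0.5, 2.0], requires_grad=True)
    torch.relu(x).sum().backward()
    print(x.grad)  # tensor([0., 0., 1., 1.]): zero for x < 0, one for x > 0

The usual answer is that "vanishing gradients" refers to gradients shrinking multiplicatively across many layers (as with saturating sigmoids); on its active side a ReLU passes gradients through with slope exactly 1, while a unit that outputs 0 for all inputs is the separate "dying ReLU" problem.
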
9 votes · 3 answers

How to use Cross Entropy loss in PyTorch for binary prediction?

In the PyTorch docs, it says for cross entropy loss: input has to be a Tensor of size (minibatch, C). Does this mean that for binary (0,1) prediction, the input must be converted into an (N,2) tensor where the second dimension is equal to (1-p)? So…
AAC (509)
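
For reference, a minimal sketch of the two standard options in PyTorch (shapes and values are illustrative):

    import torch
    import torch.nn as nn

    targets = torch.tensor([0, 1, 1, 0])       # binary labels, shape (N,)

    # Option 1: treat it as 2-class classification. CrossEntropyLoss takes
    # raw logits of shape (N, 2) and applies the softmax itself.
    loss_ce = nn.CrossEntropyLoss()(torch.randn(4, 2), targets)

    # Option 2: one logit per example with BCEWithLogitsLoss, which
    # applies the sigmoid internally; no (p, 1-p) pair is needed.
    loss_bce = nn.BCEWithLogitsLoss()(torch.randn(4), targets.float())

With option 1 there is no need to construct the (1-p) column by hand; the second logit plays that role.
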
7 votes · 1 answer

What is the relationship between "landmark detection" and "landmark localization"?

I am reading the paper "Grand Challenge of 106-Point Facial Landmark Localization". In the context of face recognition, "Landmark Detection" is to detect a face by matching landmarks on a face, while "Landmark Localization" is to predict the coordinates of…
whnlp (171)

7 votes · 1 answer

What is missing from the following Curriculum Learning implementation in a Deep Neural Net?

First of all, we have a classification task, so we use the typical softmax cross entropy to classify. The current implementation of curriculum learning is as follows. First we train our best version of the neural net. At the last epoch we get all of the…
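
For context, a typical curriculum-learning loop trains on the easiest examples first and grows the training set toward the hardest ones. A hedged, self-contained sketch (the data are random stand-ins, and scoring difficulty by the current model's per-example loss is one common choice, not necessarily the questioner's):

    import torch
    import torch.nn as nn

    X, y = torch.randn(200, 10), torch.randint(0, 2, (200,))  # toy data
    model = nn.Linear(10, 2)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss(reduction="none")

    # Rank examples from easy to hard by per-example loss.
    with torch.no_grad():
        order = loss_fn(model(X), y).argsort()

    # Train on a growing easy-to-hard prefix of the data.
    for fraction in (0.25, 0.5, 0.75, 1.0):
        idx = order[: int(fraction * len(order))]
        opt.zero_grad()
        loss_fn(model(X[idx]), y[idx]).mean().backward()
        opt.step()
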
7 votes · 2 answers

Question about the simple example for batch normalization given in the "Deep Learning" book

In the section about batch normalization of the Deep Learning book by Ian Goodfellow (chapter link) there is the following text: As an example, suppose we have a deep neural network that has only one unit per layer and does not use an activation function…
amit (181)
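
The example in the book is a deep linear chain, ŷ = x·w₁·w₂·…·w_l, with one unit per layer and no activations. A tiny NumPy sketch of why such a chain is awkward to train (the weight values are arbitrary):

    import numpy as np

    x = 1.0
    w = np.full(10, 1.5)       # 10 layers, each weight a bit above 1
    print(x * np.prod(w))      # ~57.7: output grows like 1.5**10

    w = np.full(10, 0.5)       # each weight a bit below 1
    print(x * np.prod(w))      # ~0.001: output collapses instead

Because the output is a product over every layer's weight, the effect of a simultaneous gradient update to all layers is highly non-linear, which is the pathology the book uses to motivate batch normalization.
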
6 votes · 3 answers

Intuition behind the number of output neurons for a neural network

I am reading Michael Nielsen's book on deep learning. In the first chapter, he gives the classic example of classifying 10 handwritten digits, and uses it to explain the intuition behind choosing the number of output neurons. Initially, before…
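
For reference, the architecture that chapter discusses, as a short sketch (Nielsen's example is a 784-30-10 network; the prediction is read off the most active of the 10 output neurons):

    import torch
    import torch.nn as nn

    net = nn.Sequential(nn.Linear(784, 30), nn.Sigmoid(), nn.Linear(30, 10))
    image = torch.rand(1, 784)            # a flattened 28x28 digit image
    digit = net(image).argmax(dim=1)      # index of the most active output
    print(digit)                          # predicted class, 0-9

The book's point is that one neuron per class (10 outputs) trains more easily than a 4-neuron binary encoding, even though 4 bits would suffice to represent 10 classes.
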
6 votes · 5 answers

Can we train a neural network to tell if an object is present or not in an image?

I am new to machine learning and working on object detection, but I am not interested in the location of the object in the image. Is it possible to train such a neural network, and if yes, how? (I just want a list of objects present in…
Abstractgears (61)
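
What the question asks for is usually framed as (multi-label) image classification rather than detection: the network outputs one presence score per class and no locations. A hedged sketch (the tiny model, class count, and 0.5 threshold are illustrative):

    import torch
    import torch.nn as nn

    num_classes = 20
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, num_classes))

    image = torch.rand(1, 3, 64, 64)
    probs = torch.sigmoid(model(image))     # independent per-class scores
    present = (probs > 0.5).nonzero()       # classes deemed "present"

Training such a model typically pairs the raw logits with BCEWithLogitsLoss and a multi-hot target vector.
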
5 votes · 3 answers

Do I need to buy an NVIDIA graphics card to run deep learning algorithms?

I am new to deep learning. I am running a MacBook Pro with Yosemite (upgraded from Snow Leopard). I don't have a CUDA-enabled GPU, and running the code on the CPU is extremely slow. I heard that I can buy some instances on AWS, but it seems that they…
Lilianna (153)

4 votes · 1 answer

What does smooth/soft probability mean?

I was recently reading the Knowledge Distillation paper and encountered the term smooth probabilities. The term was used for the case when the logits are divided by a temperature. Neural networks typically produce class probabilities by using a …
Hossein (565)
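
In the distillation paper the softened ("smooth") probabilities come from a temperature-scaled softmax, $p_i = \frac{\exp(z_i/T)}{\sum_j \exp(z_j/T)}$. A small sketch of the effect (the logits and T = 5 are arbitrary):

    import torch

    logits = torch.tensor([4.0, 1.0, 0.0])
    print(torch.softmax(logits, dim=0))        # sharp: ~[0.94, 0.05, 0.02]
    print(torch.softmax(logits / 5.0, dim=0))  # soft:  ~[0.50, 0.27, 0.22]

Raising T spreads probability mass over the non-target classes, exposing the relative similarities the student network is trained to match.
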
3 votes · 0 answers

Why is a model's training faster on Windows than on Ubuntu?

I'm training an object detection model with the TensorFlow Object Detection API. On Windows 10 it looks around 3-4 times faster than on Ubuntu 18.04, and I don't know why. I'm using the same batch size, the same PC, and the same dataset. What could be the problem here…

3 votes · 1 answer

What does this formula in Glorot & Bengio mean?

In this paper, on page 5, we find the formula $$Var(z^i)=Var(x)\prod_{i'=0}^{i-1}n_{i'}Var(W^{i'})$$ I am really struggling to understand what is meant by this formula. I think at least some of the following are true: We're dealing with a linear…
Jack M (265)
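
The formula can be checked numerically for a linear network with independent zero-mean weights, which is the setting the paper analyzes. A sketch (the width, depth, and Var(W) are arbitrary choices; here $n\,Var(W)=1$, so the variance should stay at 1):

    import numpy as np

    rng = np.random.default_rng(0)
    n, var_w, depth = 100, 0.01, 3
    z = rng.normal(size=(n, 20000))     # many samples of the input x

    for _ in range(depth):              # z^{i} = W^{i-1} z^{i-1}
        W = rng.normal(scale=np.sqrt(var_w), size=(n, n))
        z = W @ z

    # Predicted: Var(x) * (n * Var(W)) ** depth = 1 * (100 * 0.01)**3 = 1
    print(z.var())                      # ~1.0
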
3 votes · 1 answer

Why is IoU said to be non-differentiable?

I have been trying to find an answer online, but I couldn't really find one. If anyone could help me, I would appreciate it.
StrickBan (39)
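
One way to see the problem concretely: for non-overlapping boxes the IoU is identically zero, so its gradient with respect to the predicted coordinates carries no signal, and the max/min operations make it only piecewise smooth elsewhere. A sketch with axis-aligned boxes (the coordinates are arbitrary):

    import torch

    def iou(a, b):
        # Boxes as (x1, y1, x2, y2); intersection via max/min.
        ix = (torch.min(a[2], b[2]) - torch.max(a[0], b[0])).clamp(min=0)
        iy = (torch.min(a[3], b[3]) - torch.max(a[1], b[1])).clamp(min=0)
        inter = ix * iy
        union = (a[2] - a[0]) * (a[3] - a[1]) \
              + (b[2] - b[0]) * (b[3] - b[1]) - inter
        return inter / union

    pred = torch.tensor([5.0, 5.0, 6.0, 6.0], requires_grad=True)
    target = torch.tensor([0.0, 0.0, 1.0, 1.0])   # no overlap with pred
    iou(pred, target).backward()
    print(pred.grad)                              # all zeros: no signal

This flat zero-IoU region is part of what motivated differentiable surrogates such as the GIoU loss.
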
3 votes · 1 answer

Why do we want the variance of the layers to remain the same throughout a deep network?

I've been reading the literature on vanishing/exploding gradients and specifically how they connect to weight initialization. An idea I've come across a few times, which seems very important in this area, is that we want the variance to remain the…
Jack M (265)
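
The usual argument, in the notation of the Glorot & Bengio question above and assuming equal layer widths $n$ and weight variance $Var(W)$: after $d$ linear layers,

$$Var(z^d) = Var(x)\left(n\,Var(W)\right)^d,$$

so the activations' variance shrinks geometrically when $n\,Var(W) < 1$ and blows up when $n\,Var(W) > 1$ (and the backward pass behaves the same way). Keeping the per-layer variance constant, $n\,Var(W) = 1$, is the condition that avoids both regimes, and is where Glorot-style initializations come from.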