Series: Optimization Intro to Optimization in Deep Learning: Busting the Myth About Batch Normalization Batch Normalisation does NOT reduce internal covariate shift. This post looks into why internal covariate shift was thought to be a problem and whether batch normalisation actually addresses it.
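For reference, here is a minimal sketch (not taken from the article) of what a batch normalisation layer computes at training time, assuming NumPy; the function name and the `eps` default are illustrative choices, not the article's code.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # Normalise each feature over the mini-batch, then rescale and shift.
    mean = x.mean(axis=0)                      # per-feature mean over the batch
    var = x.var(axis=0)                        # per-feature variance over the batch
    x_hat = (x - mean) / np.sqrt(var + eps)    # zero mean, unit variance
    return gamma * x_hat + beta                # learnable scale and shift

# Toy usage: a batch of 4 samples with 3 features each.
x = np.random.randn(4, 3)
gamma, beta = np.ones(3), np.zeros(3)
out = batch_norm(x, gamma, beta)
print(out.mean(axis=0), out.std(axis=0))       # ~0 mean, ~1 std per feature
```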
Series: Optimization Intro to Optimization in Deep Learning: Vanishing Gradients and Choosing the Right Activation Function A look into how various activation functions like ReLU, PReLU, RReLU and ELU are used to address the vanishing gradient problem, and how to choose one among them for your network.
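As a quick taste of the activations the article compares, here is a hedged NumPy sketch of ReLU, PReLU and ELU; the `alpha` values below are common defaults chosen for illustration, and in a real network PReLU's slope is a learned parameter.

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)                    # zero gradient for x < 0 ("dying ReLU")

def prelu(x, alpha=0.25):
    return np.where(x > 0, x, alpha * x)       # small negative slope keeps gradients alive

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1))  # smooth, saturates to -alpha

x = np.linspace(-3, 3, 7)
print(relu(x))
print(prelu(x))
print(elu(x))
```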
Series: Optimization Intro to optimization in deep learning: Momentum, RMSProp and Adam In this post, we take a look at a problem that plagues the training of neural networks, pathological curvature, and at how Momentum, RMSProp and Adam help navigate it.
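To give a flavour of where the series ends up, here is a minimal sketch of a single Adam update, which combines a momentum term with RMSProp-style scaling; this is a standard textbook formulation in NumPy, not code from the article, and the hyperparameter defaults are the usual ones.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad         # momentum: moving average of gradients
    v = beta2 * v + (1 - beta2) * grad**2      # RMSProp: moving average of squared gradients
    m_hat = m / (1 - beta1**t)                 # bias correction for zero-initialised m
    v_hat = v / (1 - beta2**t)                 # bias correction for zero-initialised v
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Toy usage: minimise f(w) = w^2 (gradient 2w) starting from w = 5.
w, m, v = 5.0, 0.0, 0.0
for t in range(1, 1001):
    w, m, v = adam_step(w, 2 * w, m, v, t, lr=0.05)
print(w)  # close to 0
```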
Series: Optimization Intro to optimization in deep learning: Gradient Descent An in-depth explanation of Gradient Descent, and how to avoid the problems of local minima and saddle points.
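For context on where the series starts, here is a minimal sketch of plain gradient descent on a one-dimensional loss, assuming NumPy-style Python; the function name, learning rate and step count are illustrative, not the article's code.

```python
def gradient_descent(grad_fn, w0, lr=0.1, steps=100):
    # Repeatedly step against the gradient of the loss.
    w = w0
    for _ in range(steps):
        w = w - lr * grad_fn(w)
    return w

# Toy usage: f(w) = (w - 3)^2 has gradient 2*(w - 3) and its minimum at w = 3.
w_star = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
print(w_star)  # ~3.0
```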