ReLU derivative in backpropagation

Question

I am about to implement backpropagation for a neural network that uses ReLU. In a previous project, I did it for a network that used the sigmoid activation function, but now I'm a bit confused, since ReLU doesn't have a derivative.

Here's an image showing how weight5 contributes to the total error. In this example, ∂out/∂net = a*(1 - a) if I use the sigmoid function.

What should I write instead of "a*(1 - a)" to make the backpropagation work?

It depends on the exact ReLU expression; there are several ReLU variants that can be used. In any case, it's just the derivative of the ReLU function with respect to its argument, which you can compute by hand or with a tool such as Wolfram Alpha. - zegkljan

malioboro · Accepted Answer · 2017-02-05T03:53:12

"since ReLU doesn't have a derivative."

No, ReLU does have a derivative. I assume you are using the ReLU function f(x) = max(0, x), which means f(x) = 0 if x <= 0 and f(x) = x otherwise. In the first case, when x < 0, the derivative of f(x) with respect to x is f'(x) = 0. In the second case, when x > 0, it is f'(x) = 1. At exactly x = 0 the derivative is not defined; in practice it is set to 0 (or sometimes 1) by convention.
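
As a rough sketch of how this replaces a*(1 - a) in the chain rule from the question, the plain-Python snippet below may help. The numeric values, the squared-error loss, and all function and variable names are assumptions made for illustration, not taken from the original post. It uses 0 as the derivative at x = 0, which is a common convention rather than the only possible choice.

```python
def relu(x):
    """ReLU activation: max(0, x)."""
    return max(0.0, x)


def relu_derivative_from_output(a):
    """Derivative of ReLU, written in terms of its output a = relu(net).

    a > 0 exactly when net > 0, so the output alone is enough to decide.
    At net == 0 the derivative is undefined; 0 is used here by convention.
    """
    return 1.0 if a > 0 else 0.0


def sigmoid_derivative_from_output(a):
    """The sigmoid counterpart mentioned in the question: a * (1 - a)."""
    return a * (1.0 - a)


# Hypothetical values for a single output neuron (assumed for illustration).
net = 0.7       # weighted input to the neuron
target = 1.0    # desired output
h = 0.5         # activation of the hidden neuron that the weight multiplies

out = relu(net)

# Chain rule: dE/dw = dE/dout * dout/dnet * dnet/dw
d_error_d_out = out - target                    # assumes E = 0.5 * (target - out)**2
d_out_d_net = relu_derivative_from_output(out)  # this term replaces a * (1 - a)
d_net_d_w = h

grad_w = d_error_d_out * d_out_d_net * d_net_d_w
print(grad_w)
```

Only the middle factor changes when switching from sigmoid to ReLU; the rest of the backpropagation step stays the same.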

