In the TensorFlow Python API, the default value for the activation kwarg of tf.layers.dense is None, and the documentation says:
activation: Activation function to use. If you don't specify anything, no activation is applied (ie. "linear" activation: a(x) = x).
Why not just use the identity function as the default value when defining the function, like this:
def dense(..., activation=lambda x: x, ...):
    pass
This way there would be no inconsistency between the documentation and the code to worry about.
Is this (using None to represent a default function) just a coding style, or is there some caveat to using a function as the default value of a keyword argument?
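For reference, this is how the None default behaves in a typical TF 1.x style call (the variable names and shapes below are made up for illustration):

import tensorflow as tf  # TF 1.x style API; tf.layers.dense is deprecated in TF 2.x

x = tf.placeholder(tf.float32, shape=[None, 4])

# No activation specified: the layer output is purely linear, a(x) = x.
logits = tf.layers.dense(x, units=3)

# Passing an explicit identity has the same effect:
logits_identity = tf.layers.dense(x, units=3, activation=lambda v: v)

# With a nonlinearity:
hidden = tf.layers.dense(x, units=16, activation=tf.nn.relu)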
It's not there to avoid unnecessary function calls, since an identity function is still created and called even when None is passed to activation. Besides, since this happens at graph construction time, there is no point in an optimization like this, assuming it is indeed an optimization.
Correction: As pointed out by @y-luo, the tf implementation doesn't actually create an identity function, but the tf.keras implementation does.
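As an illustration of that difference, here is a minimal sketch (not TensorFlow's actual source) of the two patterns; kernel_apply is a made-up placeholder for whatever computes the pre-activation output:

def linear(x):
    # Identity activation, analogous in spirit to tf.keras.activations.linear.
    return x

def dense_none_check(inputs, kernel_apply, activation=None):
    # Pattern where None means "skip the call": no identity function is ever created.
    outputs = kernel_apply(inputs)
    if activation is not None:
        outputs = activation(outputs)
    return outputs

def dense_resolved(inputs, kernel_apply, activation=None):
    # Pattern where None is resolved to an identity function, which is then always called.
    if activation is None:
        activation = linear
    outputs = kernel_apply(inputs)
    return activation(outputs)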
It is good practice to use None as the default argument and use something like if activation is None: activation = ... in the function body. Default arguments are evaluated when the function is defined (not each time the function is called), so all function calls get the same instance of the default argument. This leads to unexpected behavior when using mutable objects, for example lists. It may be unnecessary in the case of a lambda argument, but it is still good practice. - pschill

But for activation, that function can only accept one argument, so we don't have to worry about a mutable argument in the function identified by activation; thus, no matter whether you "call-it-when-None-encountered" or "use-it-as-the-default-value-then-call-it", they always refer to the same function, ... - Incömplete

I'll just go with None and forget about it. Thanks :) - Incömplete
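To make the mutable-default pitfall from pschill's comment concrete, here is a small self-contained example (the function names are made up for illustration):

def append_shared(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # so every call without an explicit bucket mutates the same list.
    bucket.append(item)
    return bucket

def append_fresh(item, bucket=None):
    # Conventional None-sentinel pattern: build a fresh list per call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

print(append_shared(1))  # [1]
print(append_shared(2))  # [1, 2]  <- state leaks between calls
print(append_fresh(1))   # [1]
print(append_fresh(2))   # [2]

# A default like `lambda x: x` is also evaluated only once, but since it is
# never mutated, sharing that single instance across calls is harmless.
def apply_activation(x, activation=lambda x: x):
    return activation(x)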