Ultimate Guide to Activation Functions
A great place to find and learn about activation functions is Wikipedia; however, over the years the table of activation functions has fluctuated wildly, with functions added and removed time and time again. You can view a list of historical changes to this particular Wikipedia page here. The ‘table of activation functions’ was first introduced in November 2015 by the user Laughinthestocks, and at the time of writing there have been 391 changes to the page since then. For this article, I wrote an algorithm to mine every unique activation function out of the history of that Wikipedia page as of the 22nd of April 2022, so that they can all be listed in one comprehensive document here. I have also added links to appropriate research papers where none were given; in cases where no specific research paper could be located, a relevant paper of interest is provided in its place.
Typically one would use tanh for a feed-forward neural network (FNN) and ReLU for a convolutional neural network (CNN).
If we included the identity activation function, this list would contain 42 activation functions; then again, with the bipolar sigmoid included you could say it is indeed 42. I’ve not read ‘The Hitchhiker’s Guide to the Galaxy’. Seriously.
Where possible, the derivative is given with respect to f(𝑥); where that is not practical, it is given with respect to 𝑥 instead.
Binary step



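A rough C sketch, using the usual convention of thresholding at zero:
float binary_step(float x) { return x < 0.f ? 0.f : 1.f; }  /* derivative is 0 everywhere except x = 0, where it is undefined */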
Logistic, sigmoid, or soft step



There is also the bipolar sigmoid: (1.f - expf(-x)) / (1.f + expf(-x)). Maybe Wolfram can help you with that derivative.
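A minimal C sketch of the logistic sigmoid, assuming <math.h> is included:
float sigmoidf(float x) { return 1.f / (1.f + expf(-x)); }  /* derivative w.r.t. f(x): f * (1 - f) */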
ElliotSig or Softsign



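A C sketch, assuming <math.h> for fabsf:
float softsignf(float x) { return x / (1.f + fabsf(x)); }  /* derivative w.r.t. x: 1 / (1 + |x|)^2 */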
Hyperbolic tangent (tanh)



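In C this is just tanhf from <math.h>; the derivative in terms of f(x) is 1 - f(x)^2:
float tanh_act(float x) { return tanhf(x); }  /* derivative w.r.t. f(x): 1 - f * f */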
Arctangent / Arctan / atan



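A C sketch, assuming <math.h> for atanf:
float arctan_act(float x) { return atanf(x); }  /* derivative w.r.t. x: 1 / (1 + x * x) */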
Softplus



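A naive C sketch (not hardened against overflow for large x), assuming <math.h>:
float softplusf(float x) { return logf(1.f + expf(x)); }  /* derivative w.r.t. x: the logistic sigmoid, 1 / (1 + expf(-x)) */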
Rectified linear unit (ReLU) (ReLU6)



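C sketches of both; ReLU6 is just ReLU with the output capped at 6:
float reluf(float x)  { return x > 0.f ? x : 0.f; }  /* derivative: 1 for x > 0, 0 for x < 0 */
float relu6f(float x) { float r = x > 0.f ? x : 0.f; return r < 6.f ? r : 6.f; }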
Exponential linear unit (ELU)



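A C sketch with alpha as a parameter (alpha = 1.0f is the common default), assuming <math.h>:
float eluf(float x, float alpha) { return x > 0.f ? x : alpha * (expf(x) - 1.f); }  /* derivative for x < 0: f(x) + alpha */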
Gaussian Error Linear Unit (GELU)



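A C sketch of the exact form x · Φ(x) using erff from <math.h>; many libraries use a tanh approximation instead:
float geluf(float x) { return 0.5f * x * (1.f + erff(x * 0.70710678f)); }  /* 0.70710678f is approximately 1/sqrt(2) */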
Scaled exponential linear unit (SELU)



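A C sketch using the (rounded) constants from the SELU paper, assuming <math.h>:
float seluf(float x) {
    const float lambda = 1.05070098f, alpha = 1.67326324f;
    return lambda * (x > 0.f ? x : alpha * (expf(x) - 1.f));
}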
Mish



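Mish is x · tanh(softplus(x)); a naive C sketch, assuming <math.h>:
float mishf(float x) { return x * tanhf(logf(1.f + expf(x))); }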
Leaky rectified linear unit (Leaky ReLU)



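A C sketch with the conventional 0.01 slope on the negative side:
float leaky_reluf(float x) { return x > 0.f ? x : 0.01f * x; }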
Parametric rectified linear unit (PReLU)



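Structurally identical to Leaky ReLU, except the negative-side slope a is learned rather than fixed; a C sketch:
float preluf(float x, float a) { return x > 0.f ? x : a * x; }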
Parametric Exponential Linear Unit (PELU)



S-shaped rectified linear activation unit (SReLU)



Bipolar rectified linear unit (BReLU)



Randomized leaky rectified linear unit (RReLU)



Sigmoid linear unit (SiLU) or Swish



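SiLU is x multiplied by the logistic sigmoid of x (Swish generalises this with a beta inside the sigmoid); a C sketch, assuming <math.h>:
float siluf(float x) { return x / (1.f + expf(-x)); }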
Gaussian



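A C sketch of the Gaussian activation exp(-x^2), assuming <math.h>:
float gaussianf(float x) { return expf(-x * x); }  /* derivative w.r.t. x: -2 * x * f(x) */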
Growing Cosine Unit (GCU)



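The GCU paper defines it as x · cos(x); a C sketch, assuming <math.h>:
float gcuf(float x) { return x * cosf(x); }  /* derivative w.r.t. x: cosf(x) - x * sinf(x) */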
Shifted Quadratic Unit (SQU)



Non-Monotonic Cubic Unit (NCU)



Shifted Sinc Unit (SSU)


No derivative supplied, refer to Wolfram or the paper.
Decaying Sine Unit (DSU)


No derivative supplied, refer to Wolfram or the paper.
Phish

No derivative supplied, refer to Wolfram or the paper.
SQ-RBF


Inverse square root unit (ISRU)



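A C sketch with alpha as the tunable parameter, assuming <math.h>:
float isruf(float x, float alpha) { return x / sqrtf(1.f + alpha * x * x); }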
Inverse square root linear unit (ISRLU)



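A C sketch; identity for non-negative inputs, ISRU below zero:
float isrluf(float x, float alpha) { return x >= 0.f ? x : x / sqrtf(1.f + alpha * x * x); }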
Square nonlinearity (SQNL)



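A C sketch of the piecewise definition as given in the Wikipedia table (saturating at ±1 beyond |x| = 2):
float sqnlf(float x) {
    if (x > 2.f)   return 1.f;
    if (x >= 0.f)  return x - x * x / 4.f;
    if (x >= -2.f) return x + x * x / 4.f;
    return -1.f;
}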
Sigmoid shrinkage



“Squashing functions” (benchmark)




Maxout


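Maxout operates on a group of linear units rather than a single pre-activation; a C sketch assuming the caller has already computed the k affine projections into z:
float maxoutf(const float *z, int k) {
    float m = z[0];
    for (int i = 1; i < k; ++i) if (z[i] > m) m = z[i];
    return m;
}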
Bent Identity



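A C sketch of (sqrt(x^2 + 1) - 1) / 2 + x, assuming <math.h>:
float bent_identityf(float x) { return (sqrtf(x * x + 1.f) - 1.f) / 2.f + x; }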
Sinusoid



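A C sketch, assuming <math.h>:
float sinusoidf(float x) { return sinf(x); }  /* derivative w.r.t. x: cosf(x) */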
Sinc (taming the waves)



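A C sketch; the removable singularity at zero is handled explicitly:
float sinc_act(float x) { return x == 0.f ? 1.f : sinf(x) / x; }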
ArSinH



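C99 already ships asinhf in <math.h>; it is equivalent to logf(x + sqrtf(x * x + 1.f)):
float arsinh_act(float x) { return asinhf(x); }  /* derivative w.r.t. x: 1 / sqrtf(x * x + 1.f) */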
Soft Clipping (goldilocks)



Piecewise Linear Unit (PLU)



Adaptive piecewise linear (APL)



Inverse Cubic


Soft Exponential



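A C sketch of the three-case definition as I read it from the paper, with a as the shape parameter interpolating between logarithmic, linear, and exponential behaviour, assuming <math.h>:
float soft_exponentialf(float a, float x) {
    if (a < 0.f) return -logf(1.f - a * (x + a)) / a;
    if (a > 0.f) return (expf(a * x) - 1.f) / a + a;
    return x;  /* a == 0: identity */
}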
(42?) LeCun hyperbolic tangent

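This is the scaled tanh recommended in LeCun's 'Efficient BackProp', 1.7159 · tanh(2x/3); a C sketch, assuming <math.h>:
float lecun_tanhf(float x) { return 1.7159f * tanhf(2.f * x / 3.f); }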
Further Reading
Comparison of new activation functions in neural network for forecasting financial time series (Logit & Probit)
Effectiveness of Scaled Exponentially-Regularized Linear Units (SERLUs)