Every node in a neural network model needs a transfer function, and there is a variety of these functions to choose from. How do we choose the best one, and by what criteria? Is there a rule for selecting one, or does it depend on the experiment? Is there a widely cited article or journal paper that supports your answer?

Deep learning (DL) is an emerging branch of supervised learning methods in machine learning [8]. For example, in a neural network, the activation function of a given neuron is applied in the deep layers to extract abstractions from voluminous data. This is similar to a hierarchical structure where deep learning models are applied [16]. DL is a very promising and evolving research area. It has many highly developed algorithmic models, such as convolutional neural networks (CNNs), deep Boltzmann machines (DBMs), deep belief networks (DBNs), deep representation, recursive auto-encoders, and restricted Boltzmann machines (RBMs) [17].
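To make the role of the transfer (activation) function concrete, here is a minimal sketch in plain Python of a single neuron whose activation can be swapped. The three functions shown (sigmoid, tanh, ReLU) are common choices picked for illustration, not a recommendation from the text above, and the helper names `neuron` and `ACTIVATIONS` are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, output in (0, 1); written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def tanh(x: float) -> float:
    """Hyperbolic tangent, output in (-1, 1)."""
    return math.tanh(x)


def relu(x: float) -> float:
    """Rectified linear unit: passes positive inputs, zeroes the rest."""
    return max(0.0, x)


# Hypothetical lookup table so the activation can be treated as a hyperparameter.
ACTIVATIONS = {"sigmoid": sigmoid, "tanh": tanh, "relu": relu}


def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


if __name__ == "__main__":
    # Same neuron, same inputs, different activation functions.
    for name, fn in ACTIVATIONS.items():
        print(name, neuron([1.0, 2.0], [0.5, -0.25], 0.1, fn))
```

Treating the activation as a swappable argument like this is one way to compare candidates empirically on a validation set, which is how the choice is often made in practice when no single rule applies.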

