Abstract – In this study, we present deep neural networks with a set of node-wise varying activation functions. The selected activation functions affect the feature-learning ability of each node, such that nodes with smaller indices become increasingly sensitive during training. As a result, the features learned by the nodes are sorted by node index in order of importance, with more sensitive nodes corresponding to more important features. The proposed networks therefore learn not only the input features but also their importance. Nodes of lower importance can be pruned to reduce network complexity, and the pruned networks can be retrained without incurring performance losses. We validated the feature-sorting property of the proposed method using shallow networks, deep networks, and deep networks transferred from existing networks.
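
To make the idea concrete, the following is a minimal NumPy sketch of a hidden layer whose activation varies by node, followed by index-based pruning. It is an illustration under stated assumptions, not the method of this study: the abstract does not specify the form of the activation functions, so the per-node gain schedule `node_gains`, its `decay` parameter, the use of `tanh`, and the helper `prune_hidden` are all hypothetical choices. Training and retraining are omitted.

```python
import numpy as np

def node_gains(n_nodes, decay=0.5):
    """Hypothetical per-node gains: node 0 gets the largest gain,
    and the gain shrinks as the node index grows (an assumption)."""
    return 1.0 / (1.0 + decay * np.arange(n_nodes))

def nodewise_tanh(z, gains):
    """Apply tanh with a different slope for each node (last axis)."""
    return np.tanh(z * gains)

def prune_hidden(W_in, b, W_out, gains, keep):
    """Keep only the first `keep` hidden nodes, i.e. the lowest indices."""
    return W_in[:, :keep], b[:keep], W_out[:keep, :], gains[:keep]

# Toy single-hidden-layer network with random, untrained weights.
rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 4, 8, 2
W_in = rng.normal(size=(n_in, n_hidden))
b = np.zeros(n_hidden)
W_out = rng.normal(size=(n_hidden, n_out))
gains = node_gains(n_hidden)

# Forward pass through the full network.
x = rng.normal(size=(5, n_in))
h = nodewise_tanh(x @ W_in + b, gains)
y = h @ W_out

# Forward pass after dropping the four highest-index hidden nodes.
W_in_p, b_p, W_out_p, gains_p = prune_hidden(W_in, b, W_out, gains, keep=4)
h_p = nodewise_tanh(x @ W_in_p + b_p, gains_p)
y_p = h_p @ W_out_p
```

Because pruning here is a simple slice over the hidden dimension, the retained nodes compute exactly the same activations as before. Only the contribution of the removed nodes to the output is lost, which is the part that retraining would be expected to compensate for.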