AdaPID: An Adaptive PID Optimizer for Training Deep Neural Networks
Boxi Weng, Jian Sun, Gang Wang, Alireza Sadeghi
SPS
Deep neural networks (DNNs) have well-documented merits in learning nonlinear functions in high-dimensional spaces. Stochastic gradient descent (SGD)-type optimization algorithms are the "workhorse" for training DNNs. Nonetheless, such algorithms often suffer from slow convergence, sizable fluctuations, and abundant local solutions, to name a few. In this context, the present paper draws ideas from adaptive control and develops an adaptive proportional-integral-derivative (AdaPID) solver for fast, stable, and effective training of DNNs. AdaPID relies on second-order moment estimates of gradients to adaptively adjust the PID coefficients. Numerical tests corroborate the merits of AdaPID on several tasks, including image generation using generative adversarial networks (GANs) and image classification using convolutional neural networks (CNNs) as well as long short-term memory (LSTM) networks.
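To make the idea concrete, the following is a minimal sketch of a PID-style optimizer whose step is rescaled by an Adam-like second-moment estimate of the gradient. The function name, hyperparameters, and the exact form of the integral and derivative terms are illustrative assumptions, not the authors' published algorithm: the proportional term is the raw gradient, the integral term is an exponential moving average of past gradients (as in momentum), and the derivative term is the difference between consecutive gradients.

```python
import numpy as np

def adapid_step(theta, grad, state, lr=0.01, kp=1.0, ki=0.5, kd=0.1,
                beta1=0.9, beta2=0.99, eps=1e-8):
    """One step of a hypothetical AdaPID-style update (a sketch, not the
    paper's exact algorithm). P, I, and D terms act on the gradient signal;
    a second-moment estimate of the gradient adapts the effective step size."""
    state["t"] += 1
    # Integral term: exponential moving average of gradients (momentum-like).
    state["i"] = beta1 * state["i"] + (1 - beta1) * grad
    # Derivative term: difference between consecutive gradients.
    d = grad - state["g_prev"]
    state["g_prev"] = grad.copy()
    # Second-order moment estimate of the gradient, with bias correction,
    # used here to adaptively rescale the PID step (Adam-style).
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad**2
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    pid = kp * grad + ki * state["i"] + kd * d
    return theta - lr * pid / (np.sqrt(v_hat) + eps)

# Usage: minimize f(theta) = ||theta||^2 / 2, whose gradient is theta itself.
theta = np.array([5.0, -3.0])
state = {"t": 0,
         "i": np.zeros_like(theta),
         "g_prev": np.zeros_like(theta),
         "v": np.zeros_like(theta)}
for _ in range(1000):
    theta = adapid_step(theta, theta.copy(), state)
```

Viewing the optimizer as a feedback controller, the gradient plays the role of the error signal: the proportional term reacts to the current gradient, the integral term smooths over its history, and the derivative term anticipates its change, which is what damps oscillations.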