Title: On the influence of stochastic rounding bias in implementing gradient descent with applications in low-precision training
Speaker(s): Mrs Lu Xia, Eindhoven University of Technology
Date and time: 18 Jul 2023, 14:00 (Europe/Rome)
Lecture series: Seminar on Numerical Analysis
Venue: Dipartimento di Matematica (Aula Magna)
You can access the full event here: https://events.dm.unipi.it/e/201
Abstract --------
In the context of low-precision computation for the training of neural networks with the gradient descent method (GD), the occurrence of deterministic rounding errors often leads to stagnation or adversely affects the convergence of the optimizers. The employment of unbiased stochastic rounding (SR) may partially capture gradient updates that are lower than the minimum rounding precision, with a certain probability. We provide a theoretical elucidation for the stagnation observed in GD when training neural networks with low-precision computation. We analyze the impact of floating-point roundoff errors on the convergence behavior of GD with a particular focus on convex problems. Two biased stochastic rounding methods, signed-SR$_\varepsilon$ and SR$_\varepsilon$, are proposed, which have been demonstrated to eliminate the stagnation of GD and to result in significantly faster convergence than SR in low-precision floating-point computation. We validate our theoretical analysis by training a binary logistic regression model on the CIFAR-10 database and a 4-layer fully-connected neural network model on the MNIST database, utilizing a 16-bit floating-point representation and various rounding techniques. The experiments demonstrate that signed-SR$_\varepsilon$ and SR$_\varepsilon$ may achieve higher classification accuracy than rounding to the nearest (RN) and SR, with the same number of training epochs. It is shown that a faster convergence may be obtained by the new rounding methods with 16-bit floating-point representation than by RN with 32-bit floating-point representation.
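The stagnation described in the abstract can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the speaker's setup: it uses a uniform fixed-point grid instead of a floating-point format, the quadratic f(x) = x^2 / 2 instead of a neural network, and hypothetical helper names (`round_nearest`, `round_stochastic`, `gradient_descent`). It covers only RN and unbiased SR; the biased variants signed-SR$_\varepsilon$ and SR$_\varepsilon$ are not defined in the abstract and are not implemented here.

```python
import math
import random


def round_nearest(x, step):
    """Deterministic rounding of x to the nearest multiple of step (RN)."""
    return round(x / step) * step


def round_stochastic(x, step, rng):
    """Unbiased stochastic rounding (SR) onto multiples of step.

    x is rounded up with probability equal to its fractional position
    between the two neighbouring grid points, so E[result] == x.
    """
    scaled = x / step
    lower = math.floor(scaled)
    frac = scaled - lower
    return (lower + (1 if rng.random() < frac else 0)) * step


def gradient_descent(x0, lr, n_iter, rounder):
    """Minimise f(x) = x^2 / 2 (gradient = x), rounding every iterate."""
    x = x0
    for _ in range(n_iter):
        x = rounder(x - lr * x)
    return x


rng = random.Random(0)   # fixed seed, chosen only for reproducibility
step = 2.0 ** -4         # assumed grid spacing (0.0625)
lr = 0.01                # update lr * x is below step / 2 at x0 = 1.0

# With RN the update 0.01 is rounded away at every step: x never moves.
x_rn = gradient_descent(1.0, lr, 2000, lambda v: round_nearest(v, step))

# With SR each step moves down one grid point with small probability.
x_sr = gradient_descent(1.0, lr, 2000, lambda v: round_stochastic(v, step, rng))

print("RN:", x_rn)
print("SR:", x_sr)
```

In this sketch the RN iterate stays at its starting value because every update is smaller than half the grid spacing, while the SR iterate drifts toward the minimizer at 0, since small updates are kept with a probability proportional to their size.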