Loss Scaling and Step Size in Deep Learning Optimization

dc.contributor.advisor: Bhatia, Sanjiv
dc.contributor.author: Alosily, Nora
dc.date.accessioned: 2023-05-24T06:32:51Z
dc.date.available: 2023-05-24T06:32:51Z
dc.date.issued: 2023-05
dc.description.abstract: Deep learning training consumes ever-increasing time and resources, owing to model complexity, the number of updates needed to reach good results, and both the amount and the dimensionality of the data. In this dissertation, we focus on making training more efficient by addressing the step size so as to reduce the per-parameter computation in each update. We achieve this in two new ways: we use loss scaling as a proxy for the learning rate, and we use learnable layer-wise optimizers. Although our work is perhaps not the first to point out the equivalence of loss scaling and the learning rate in deep learning optimization, it is the first to leverage this relationship for more efficient training. We apply it not only to simple gradient descent but also extend it to other adaptive algorithms. Finally, we use metalearning to shed light on several relevant aspects, including learnable losses and optimizers. In this regard, we develop a novel learnable optimizer and use it to obtain an adaptive rescaling factor and learning rate, resulting in a significant reduction in the memory required during training.
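As a rough illustration of the equivalence the abstract alludes to (a minimal sketch, not the dissertation's own derivation), consider plain SGD: scaling the loss by a constant factor scales every gradient by that factor, so the update matches SGD on the unscaled loss with a proportionally larger learning rate. The symbols below (parameters \theta, loss L, learning rate \eta, loss-scaling factor k) are illustrative notation, not taken from the dissertation.

% Sketch of loss scaling acting as a proxy for the learning rate in plain SGD.
\[
  \nabla_{\theta}\bigl(k\,L(\theta)\bigr) = k\,\nabla_{\theta} L(\theta)
  \quad\Longrightarrow\quad
  \theta_{t+1} = \theta_t - \eta\,\nabla_{\theta}\bigl(k\,L(\theta_t)\bigr)
               = \theta_t - (k\eta)\,\nabla_{\theta} L(\theta_t),
\]

so training with loss scale $k$ and learning rate $\eta$ takes the same step as training with the unscaled loss and learning rate $k\eta$.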
dc.format.extent: 102
dc.identifier.uri: https://hdl.handle.net/20.500.14154/68144
dc.language.iso: en_US
dc.subject: deep learning optimization
dc.subject: metalearning
dc.subject: meta learning
dc.subject: loss scaling
dc.subject: efficient training
dc.title: Loss Scaling and Step Size in Deep Learning Optimization
dc.type: Thesis
sdl.degree.department: Computer Science
sdl.degree.discipline: Computer Science
sdl.degree.grantor: University of Missouri-St. Louis
sdl.degree.name: Doctor of Philosophy in Mathematical and Computational Sciences
