VarAdam
The VarAdam class implements an adapted version of the Adam optimizer with the variable learning-rate ("warmup") schedule introduced in [1]. It wraps an underlying PyTorch optimizer (torch.optim.Adam by default) and adjusts its learning rate as training progresses.
- class context_builder.optimizer.VarAdam(model, factor=1, warmup=4000, optimizer=<class 'torch.optim.adam.Adam'>, lr=0, betas=(0.9, 0.98), eps=1e-09)[source]
Adam optimizer with variable learning rate.
- VarAdam.__init__(model, factor=1, warmup=4000, optimizer=<class 'torch.optim.adam.Adam'>, lr=0, betas=(0.9, 0.98), eps=1e-09)[source]
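The schedule from [1] warms the learning rate up linearly for `warmup` steps and then decays it proportionally to the inverse square root of the step number. A minimal sketch of that schedule, assuming `d_model` is the model's hidden size (presumably read from the `model` argument) and using the default `factor=1`, `warmup=4000`:

```python
def rate(step, d_model=512, factor=1, warmup=4000):
    """Learning rate at `step` (step >= 1): linear warmup for the
    first `warmup` steps, then inverse-square-root decay."""
    return factor * d_model ** -0.5 * min(step ** -0.5,
                                          step * warmup ** -1.5)
```

The two branches of the `min` cross exactly at `step == warmup`, so the rate peaks there and decreases on either side.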
Update
The following functions advance the optimizer by a given number of steps, updating the learning rate accordingly.
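The update functions themselves are not listed here. As a hypothetical sketch of how such a step-based wrapper might work (all names below are assumptions, not VarAdam's actual API; in real use the wrapped optimizer would be a `torch.optim.Adam` instance):

```python
class NoamWrapper:
    """Hypothetical optimizer wrapper applying the variable
    learning-rate schedule of [1]; a sketch, not VarAdam itself."""

    def __init__(self, optimizer, d_model=512, factor=1, warmup=4000):
        self.optimizer = optimizer  # e.g. a torch.optim.Adam instance
        self.d_model = d_model
        self.factor = factor
        self.warmup = warmup
        self._step = 0

    def rate(self, step):
        # Linear warmup followed by inverse-square-root decay.
        return (self.factor * self.d_model ** -0.5
                * min(step ** -0.5, step * self.warmup ** -1.5))

    def step(self):
        # Advance the step counter, write the current learning rate
        # into every parameter group, then delegate to the wrapped
        # optimizer's own step().
        self._step += 1
        lr = self.rate(self._step)
        for group in self.optimizer.param_groups:
            group["lr"] = lr
        self.optimizer.step()
```

Calling `step()` repeatedly raises the rate linearly until the step counter reaches `warmup`, after which it decays.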
Reference
[1] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention Is All You Need. In Advances in Neural Information Processing Systems (NIPS), 2017.