Language model: replace the optimizer and LR-decay scheduler
Replace the original "homebrew" optimizer and LR-decay schedule with PyTorch's SGD and ReduceLROnPlateau. SGD with momentum=0 and weight_decay=0, combined with ReduceLROnPlateau with patience=0 and factor=0.5, gives the same behavior as the original PyTorch example. Using a standard optimizer and LR-decay scheduler gives us the flexibility to experiment with both during training.
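A minimal sketch of the wiring described above, assuming the standard PyTorch APIs; the model, data, starting learning rate, and epoch count here are illustrative placeholders, not the actual language-model training script:

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim.lr_scheduler import ReduceLROnPlateau

# Hypothetical stand-ins for the language model, data, and loss.
model = nn.Linear(10, 2)
criterion = nn.CrossEntropyLoss()
inputs = torch.randn(64, 10)
targets = torch.randint(0, 2, (64,))

# SGD with momentum=0 and weight_decay=0 reduces to the plain
# "param -= lr * grad" update of the homebrew optimizer.
optimizer = optim.SGD(model.parameters(), lr=20.0,
                      momentum=0.0, weight_decay=0.0)

# ReduceLROnPlateau with patience=0 and factor=0.5 halves the learning
# rate as soon as the monitored validation loss stops improving.
scheduler = ReduceLROnPlateau(optimizer, mode='min',
                              factor=0.5, patience=0)

for epoch in range(5):
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()

    # In the real script this would be the held-out validation loss.
    val_loss = loss.item()
    scheduler.step(val_loss)
    print(f"epoch {epoch}: lr={optimizer.param_groups[0]['lr']:.3f}")
```

Because both pieces are now standard library components, swapping in a different optimizer (e.g. Adam) or scheduler later is a one-line change rather than a rewrite of the training loop.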