    Language model: replace the optimizer and LR-decay scheduler · a9b28923
    Neta Zmora authored
    Replace the original "homebrew" optimizer and LR-decay schedule with
    PyTorch's SGD and ReduceLROnPlateau.
    SGD with momentum=0 and weight_decay=0, and ReduceLROnPlateau with
    patience=0 and factor=0.5 will give the same behavior as in the
    original PyTorch example.
    
    Having a standard optimizer and LR-decay schedule gives us the
    flexibility to experiment with these during the training process.
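The swap described above can be sketched as follows. This is a minimal illustration, not the actual `main.py`: the tiny `Linear` model and the loss values are placeholders, and the initial learning rate of 20.0 is assumed from the original PyTorch word-language-model example. With `momentum=0` and `weight_decay=0`, `SGD` reduces to the plain gradient step of the original code, and `ReduceLROnPlateau(patience=0, factor=0.5)` halves the learning rate whenever validation loss fails to improve:

```python
import torch

# Placeholder model standing in for the language model in main.py.
model = torch.nn.Linear(10, 10)

# SGD with momentum=0 and weight_decay=0 is a plain gradient-descent
# update, matching the original "homebrew" optimizer.
optimizer = torch.optim.SGD(model.parameters(), lr=20.0,
                            momentum=0.0, weight_decay=0.0)

# patience=0 and factor=0.5: halve the LR on any epoch whose validation
# loss does not improve on the best seen so far.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode='min', factor=0.5, patience=0)

# Simulated per-epoch validation losses: two improvements, then two
# non-improving epochs, each of which triggers a halving (20 -> 10 -> 5).
for val_loss in [1.0, 0.9, 0.95, 0.97]:
    scheduler.step(val_loss)

print(optimizer.param_groups[0]['lr'])
```

Because both pieces are standard PyTorch objects, swapping in momentum, weight decay, or a different scheduler later is a one-line change.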
main.py 15.59 KiB