QAT: Better handling of optimizer and of creation of fp32 weights copy (#399)
* Create float copy such that the actual tensor being learned stays the same
* This way the optimizer doesn't have to be re-created, just need to add parameter groups if algo requires it (e.g. PACT)
* This also means we don't care about pre-existing parameter groups, as opposed to the previous implementation which ASSUMED a single existing group
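
The idea in the bullets above can be sketched in PyTorch as follows. This is a minimal illustration and not necessarily the code in this commit: the helper name `keep_param_as_float_copy`, the `float_` prefix, and the `clip_val` parameter are hypothetical names chosen for the example. The only library calls it relies on are `Module.register_parameter`, `Module.register_buffer`, and `Optimizer.add_param_group`.

```python
import torch
import torch.nn as nn


def keep_param_as_float_copy(module, name, prefix="float_"):
    """Re-register an existing Parameter under a new name (hypothetical helper).

    The Parameter object itself is reused, so any optimizer that already
    tracks it keeps working. The original name is then freed and reused
    for a buffer that would hold the quantized values.
    """
    param = getattr(module, name)
    # Same tensor object, new attribute name: this becomes the fp32 copy.
    module.register_parameter(prefix + name, param)
    # Remove the old attribute so the name can be reused as a buffer.
    delattr(module, name)
    # Placeholder for the quantized weights (not learned, so a buffer).
    module.register_buffer(name, torch.zeros_like(param.detach()))
    return getattr(module, prefix + name)


model = nn.Linear(4, 2)
# The optimizer is created BEFORE the model is prepared for QAT.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

orig_weight = model.weight
fp_weight = keep_param_as_float_copy(model, "weight")

# The optimizer still references the tensor being learned.
still_tracked = any(
    p is fp_weight for group in optimizer.param_groups for p in group["params"]
)

# A PACT-like algorithm adds a learned clipping value; it only needs a new
# parameter group, regardless of how many groups already exist.
clip_val = nn.Parameter(torch.tensor(6.0))
model.register_parameter("clip_val", clip_val)
optimizer.add_param_group({"params": [clip_val], "weight_decay": 0.0})
```

Because the learned tensor keeps its identity, existing optimizer state and any number of pre-existing parameter groups are left untouched; only the new group is appended.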