Guy Jacob authored

Fixed two long-standing bugs in knowledge distillation:
* Distillation loss needs to be scaled by T^2 (#122)
* Use tensor.clone instead of new_tensor when caching student logits (#234)

Updated example results and uploaded the script to generate them
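The sketch below illustrates the two fixes in a generic PyTorch distillation loss; it is not Distiller's actual implementation, and the function name, T, and alpha values are illustrative assumptions.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    # Hypothetical sketch of a combined distillation + hard-label loss.
    # Soft-target loss between temperature-softened student and teacher outputs.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    )
    # Fix for #122: dividing the logits by T shrinks the soft-target gradients
    # by 1/T^2, so the soft-target loss is multiplied by T^2 to restore their scale.
    soft_loss = soft_loss * (T * T)

    # Hard-label loss on the unscaled student logits.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss

# Fix for #234: when caching the student logits for later use in the loss,
# clone() keeps the cached tensor in the autograd graph, while new_tensor()
# copies the data into a detached tensor, blocking gradient flow:
#   cached = student_logits.clone()       # stays in the graph
#   cached = student_logits.new_tensor(student_logits)  # detached (the old bug)
```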