    Knowledge distillation fixes (#503) · 32a7e4bf
    Guy Jacob authored
    Fixed two long-standing bugs in knowledge distillation:
     * Distillation loss needs to be scaled by T^2, where T is the softmax temperature (#122)
     * Use tensor.clone instead of new_tensor when caching student logits; new_tensor returns a copy detached from the autograd graph (#234)
    Updated example results and uploaded the script to generate them
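
The effect of the T^2 scaling can be illustrated with a small pure-Python sketch. This is not the code from this commit: the function names, the example logits, and the use of finite differences instead of autograd are all assumptions made for illustration. It computes the KL divergence between temperature-softened teacher and student distributions and compares the gradient norm with respect to the student logits, with and without the T^2 factor.

```python
import math

def softmax(logits, T=1.0):
    """Softmax over logits divided by temperature T."""
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, T, scale_by_t2=True):
    """KL(teacher || student) on softened distributions, optionally times T^2."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * T * T if scale_by_t2 else kl

def grad_norm(student_logits, teacher_logits, T, scale_by_t2, eps=1e-6):
    """L2 norm of the loss gradient w.r.t. the student logits (central differences)."""
    sq = 0.0
    for i in range(len(student_logits)):
        up = list(student_logits)
        down = list(student_logits)
        up[i] += eps
        down[i] -= eps
        g = (distillation_loss(up, teacher_logits, T, scale_by_t2)
             - distillation_loss(down, teacher_logits, T, scale_by_t2)) / (2 * eps)
        sq += g * g
    return math.sqrt(sq)

# Hypothetical logits for a 3-class problem.
student = [1.0, 0.5, -0.5]
teacher = [2.0, 0.0, -1.0]

for T in (1.0, 4.0, 16.0):
    unscaled = grad_norm(student, teacher, T, scale_by_t2=False)
    scaled = grad_norm(student, teacher, T, scale_by_t2=True)
    print(f"T={T:5.1f}  unscaled={unscaled:.6f}  scaled={scaled:.6f}")
```

Without the factor, the gradient of the soft-target term shrinks roughly as 1/T^2 as the temperature grows, so its weight relative to the hard-label loss would change whenever T is changed. Multiplying by T^2 keeps the two terms on a comparable scale across temperatures.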