Commit be97de23 authored by Neta Zmora

Revert "ModelSummary: adapt sparsity accounting to correctly account for "weight tying""

This reverts commit ecade1b2.
This simply does not work, so I am reverting it until we find a correct solution.
For example, in the language model the encoder and decoder weights are tied and share the
same memory, yet I can't see how to determine that they are the same parameter.
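To illustrate the problem, here is a minimal sketch (added for clarity, not part of the commit; the toy model and its module names, e.g. TinyTiedLM, encoder, decoder, are hypothetical). In a weight-tied model, state_dict() reports the shared tensor under two different names, so name-based deduplication never recognizes the two entries as one parameter, even though they are backed by the same memory.

    # Sketch (not part of the commit): a tiny weight-tied model with hypothetical
    # module names, showing why name-based deduplication cannot detect tying.
    import torch.nn as nn

    class TinyTiedLM(nn.Module):
        def __init__(self, vocab_size=10, emb_size=4):
            super().__init__()
            self.encoder = nn.Embedding(vocab_size, emb_size)
            self.decoder = nn.Linear(emb_size, vocab_size, bias=False)
            # Weight tying: the decoder reuses the encoder's weight tensor.
            self.decoder.weight = self.encoder.weight

    model = TinyTiedLM()
    sd = model.state_dict()

    # Two distinct parameter names are reported...
    print(sorted(sd.keys()))  # ['decoder.weight', 'encoder.weight']
    # ...yet both entries share the same storage, so summing numel() over all
    # state_dict() entries counts the tied tensor twice.
    print(sd['encoder.weight'].data_ptr() == sd['decoder.weight'].data_ptr())  # True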
parent 8de6223e
@@ -96,17 +96,11 @@ def weights_sparsity_summary(model, return_total_sparsity=False, param_dims=[2,4
     pd.set_option('precision', 2)
     params_size = 0
     sparse_params_size = 0
-    # In language models, we might use use "weight tying", which means that the same
-    # weights tensor is used in several different places. If tying is used, we'd like
-    # to log the tensor information, but exclude it from the total sparsity calculation.
-    seen_params = []
     for name, param in model.state_dict().items():
         if (param.dim() in param_dims) and any(type in name for type in ['weight', 'bias']):
             _density = distiller.density(param)
-            if name not in seen_params:
-                params_size += torch.numel(param)
-                sparse_params_size += param.numel() * _density
-                seen_params.append(name)
+            params_size += torch.numel(param)
+            sparse_params_size += param.numel() * _density
             df.loc[len(df.index)] = ([
                 name,
                 distiller.size_to_str(param.size()),
...