Christos Christodoulopoulos authored
Nominalizations
NomBank dictionary statistics
- The following counts come from NOMLEX-plus-clean-1.0. They are listed per nominalization class first, then grouped per type.
Nominalization class | Count |
---|---|
NOM | 3934 |
NOMLIKE | 1244 |
NOMING | 359 |
ABLE-NOM | 18 |
NOMADJ | 503 |
NOMADJLIKE | 142 |
PARTITIVE | 509 |
ATTRIBUTE | 417 |
RELATIONAL | 331 |
WORK-OF-ART | 188 |
ABILITY | 112 |
ENVIRONMENT | 91 |
GROUP | 84 |
HALLMARK | 38 |
JOB | 28 |
VERSION | 21 |
TYPE | 17 |
EVENT | 12 |
SHARE | 12 |
ISSUE | 11 |
CRISSCROSS | 7 |
FIELD | 6 |
Total | 8084 |
Nominalization type | Count | Percent |
---|---|---|
Verbal | 5555 | 68.7 |
Adjectival | 645 | 8.0 |
Other | 1884 | 23.3 |
Total | 8084 | |
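The per-type totals can be reproduced from the per-class counts. The sketch below assumes that the Verbal type covers NOM, NOMLIKE, NOMING and ABLE-NOM, and that the Adjectival type covers NOMADJ and NOMADJLIKE. That mapping is an assumption inferred from the fact that it matches the totals in the table; the names `CLASS_COUNTS`, `group_by_type` and `percentages` are illustrative only.

```python
# Class counts copied from the per-class table (NOMLEX-plus-clean-1.0).
CLASS_COUNTS = {
    "NOM": 3934, "NOMLIKE": 1244, "NOMING": 359, "ABLE-NOM": 18,
    "NOMADJ": 503, "NOMADJLIKE": 142,
    "PARTITIVE": 509, "ATTRIBUTE": 417, "RELATIONAL": 331,
    "WORK-OF-ART": 188, "ABILITY": 112, "ENVIRONMENT": 91, "GROUP": 84,
    "HALLMARK": 38, "JOB": 28, "VERSION": 21, "TYPE": 17, "EVENT": 12,
    "SHARE": 12, "ISSUE": 11, "CRISSCROSS": 7, "FIELD": 6,
}

# Assumed class-to-type mapping (not stated explicitly in the source).
VERBAL = {"NOM", "NOMLIKE", "NOMING", "ABLE-NOM"}
ADJECTIVAL = {"NOMADJ", "NOMADJLIKE"}


def group_by_type(counts):
    """Sum class counts into Verbal / Adjectival / Other buckets."""
    totals = {"Verbal": 0, "Adjectival": 0, "Other": 0}
    for cls, n in counts.items():
        if cls in VERBAL:
            totals["Verbal"] += n
        elif cls in ADJECTIVAL:
            totals["Adjectival"] += n
        else:
            totals["Other"] += n
    return totals


def percentages(totals):
    """Express each bucket as a percentage of the grand total, to one decimal."""
    grand = sum(totals.values())
    return {k: round(100.0 * v / grand, 1) for k, v in totals.items()}


if __name__ == "__main__":
    totals = group_by_type(CLASS_COUNTS)
    print(totals, percentages(totals))
```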
NomBank training data statistics
- Statistics for distribution of NOMLEX classes over the training data
Type | Number of occurrences |
---|---|
NOM | 50612 |
NOMLIKE | 22583 |
PARTITIVE | 11841 |
ATTRIBUTE | 10456 |
RELATIONAL | 7527 |
ABILITY | 5904 |
WORK_OF_ART | 3852 |
GROUP | 3633 |
NOMADJ | 3280 |
NOMING | 3252 |
SHARE | 3192 |
ENVIRONMENT | 3108 |
NOMADJLIKE | 2700 |
JOB | 1502 |
ISSUE | 1031 |
VERSION | 725 |
CRISSCROSS | 346 |
FIELD | 330 |
HALLMARK | 299 |
EVENT | 172 |
ABLE_NOM | 60 |
UNKNOWN_CLASS | 1 |
- Statistics for distribution of NOMLEX classes for predicates that have at least one of the four Deverbal classes
Type | Number of occurrences |
---|---|
NOM | 50612 |
NOMLIKE | 22583 |
ATTRIBUTE | 4805 |
PARTITIVE | 3998 |
NOMING | 3252 |
SHARE | 2646 |
ABILITY | 2583 |
WORK_OF_ART | 2134 |
RELATIONAL | 2090 |
NOMADJLIKE | 1489 |
ENVIRONMENT | 1094 |
GROUP | 1078 |
JOB | 933 |
NOMADJ | 650 |
ISSUE | 592 |
VERSION | 543 |
TYPE | 328 |
FIELD | 251 |
CRISSCROSS | 212 |
HALLMARK | 164 |
ABLE_NOM | 60 |
Version 3.0.2
For timing the systems, the experiments were run on a machine with two 6-core Intel Xeon E5645 processors (12 MB cache, 2.40 GHz clock speed). ILP inference was done using Gurobi v4. The beam search does make use of the processor's multiple cores.
Verb
Memory and time considerations
- Memory: At least 5.5 GB main memory
- Time for inference:
  - ILP inference on 2400 sentences took 197467 ms
  - Beam search on 2400 sentences took 163934 ms
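To compare the two inference methods per sentence, the totals above can be divided by the sentence count. This is a small arithmetic sketch that uses the 2400-sentence figure quoted in the timing lines; the helper name `ms_per_sentence` is illustrative.

```python
def ms_per_sentence(total_ms, n_sentences):
    """Average wall-clock time per sentence, in milliseconds."""
    return total_ms / n_sentences


if __name__ == "__main__":
    # Verb SRL timings quoted above.
    for name, total_ms in [("ILP inference", 197467), ("Beam search", 163934)]:
        print(f"{name}: {ms_per_sentence(total_ms, 2400):.2f} ms/sentence")
```

With these numbers, ILP inference averages roughly 82 ms per sentence and beam search roughly 68 ms per sentence for the verb system.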
Performance: ILP inference
- Number of Sentences : 2416
- Number of Propositions : 5267
- Percentage of perfect props : 50.56
Label | corr. | excess | missed | prec. | rec. | F1 |
---|---|---|---|---|---|---|
Overall | 10610 | 3312 | 3467 | 76.21 | 75.37 | 75.79 |
A0 | 3086 | 530 | 477 | 85.34 | 86.61 | 85.97 |
A1 | 3792 | 960 | 1135 | 79.80 | 76.96 | 78.36 |
A2 | 686 | 316 | 424 | 68.46 | 61.80 | 64.96 |
A3 | 94 | 53 | 79 | 63.95 | 54.34 | 58.75 |
A4 | 74 | 30 | 28 | 71.15 | 72.55 | 71.84 |
A5 | 3 | 4 | 2 | 42.86 | 60.00 | 50.00 |
AM | 0 | 4 | 0 | 0.00 | 0.00 | 0.00 |
AM-ADV | 264 | 260 | 242 | 50.38 | 52.17 | 51.26 |
AM-CAU | 33 | 42 | 40 | 44.00 | 45.21 | 44.59 |
AM-DIR | 36 | 39 | 49 | 48.00 | 42.35 | 45.00 |
AM-DIS | 246 | 106 | 74 | 69.89 | 76.88 | 73.21 |
AM-EXT | 14 | 12 | 18 | 53.85 | 43.75 | 48.28 |
AM-LOC | 192 | 178 | 171 | 51.89 | 52.89 | 52.39 |
AM-MNR | 192 | 181 | 152 | 51.47 | 55.81 | 53.56 |
AM-MOD | 524 | 49 | 27 | 91.45 | 95.10 | 93.24 |
AM-NEG | 207 | 29 | 23 | 87.71 | 90.00 | 88.84 |
AM-PNC | 44 | 70 | 71 | 38.60 | 38.26 | 38.43 |
AM-PRD | 1 | 6 | 4 | 14.29 | 20.00 | 16.67 |
AM-REC | 0 | 1 | 2 | 0.00 | 0.00 | 0.00 |
AM-TMP | 789 | 327 | 298 | 70.70 | 72.59 | 71.63 |
R-A0 | 182 | 47 | 42 | 79.48 | 81.25 | 80.35 |
R-A1 | 108 | 53 | 48 | 67.08 | 69.23 | 68.14 |
R-A2 | 6 | 4 | 10 | 60.00 | 37.50 | 46.15 |
R-A3 | 0 | 1 | 1 | 0.00 | 0.00 | 0.00 |
R-A4 | 0 | 0 | 1 | 0.00 | 0.00 | 0.00 |
R-AM-ADV | 0 | 0 | 2 | 0.00 | 0.00 | 0.00 |
R-AM-CAU | 1 | 0 | 3 | 100.00 | 25.00 | 40.00 |
R-AM-EXT | 0 | 0 | 1 | 0.00 | 0.00 | 0.00 |
R-AM-LOC | 8 | 1 | 13 | 88.89 | 38.10 | 53.33 |
R-AM-MNR | 2 | 2 | 4 | 50.00 | 33.33 | 40.00 |
R-AM-TMP | 26 | 7 | 26 | 78.79 | 50.00 | 61.18 |
V | 5259 | 8 | 8 | 99.85 | 99.85 | 99.85 |
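The prec., rec. and F1 columns in these tables are consistent with the usual definitions over the three count columns: precision is corr. / (corr. + excess), recall is corr. / (corr. + missed), and F1 is their harmonic mean. The sketch below assumes those definitions (the source does not spell them out) and treats empty denominators as 0, as the all-zero rows suggest; the function name `prf` is illustrative.

```python
def prf(corr, excess, missed):
    """Return (precision, recall, F1) as percentages from argument counts.

    corr   : predicted arguments that match the gold standard
    excess : predicted arguments with no gold match (false positives)
    missed : gold arguments that were not predicted (false negatives)
    """
    predicted = corr + excess
    gold = corr + missed
    prec = 100.0 * corr / predicted if predicted else 0.0
    rec = 100.0 * corr / gold if gold else 0.0
    f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
    return prec, rec, f1


if __name__ == "__main__":
    # "Overall" row of the verb ILP inference table.
    p, r, f = prf(10610, 3312, 3467)
    print(f"{p:.2f} {r:.2f} {f:.2f}")
```

Applied to the Overall row of the verb ILP results (10610, 3312, 3467), this gives 76.21, 75.37 and 75.79, matching the reported values.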
Performance: Beam search
- Number of Sentences : 2416
- Number of Propositions : 5267
- Percentage of perfect props : 50.39
Label | corr. | excess | missed | prec. | rec. | F1 |
---|---|---|---|---|---|---|
Overall | 10420 | 3009 | 3657 | 77.59 | 74.02 | 75.77 |
A0 | 3045 | 490 | 518 | 86.14 | 85.46 | 85.80 |
A1 | 3702 | 903 | 1225 | 80.39 | 75.14 | 77.68 |
A2 | 664 | 289 | 446 | 69.67 | 59.82 | 64.37 |
A3 | 91 | 45 | 82 | 66.91 | 52.60 | 58.90 |
A4 | 75 | 27 | 27 | 73.53 | 73.53 | 73.53 |
A5 | 3 | 2 | 2 | 60.00 | 60.00 | 60.00 |
AM | 0 | 1 | 0 | 0.00 | 0.00 | 0.00 |
AM-ADV | 263 | 244 | 243 | 51.87 | 51.98 | 51.92 |
AM-CAU | 33 | 36 | 40 | 47.83 | 45.21 | 46.48 |
AM-DIR | 35 | 31 | 50 | 53.03 | 41.18 | 46.36 |
AM-DIS | 244 | 93 | 76 | 72.40 | 76.25 | 74.28 |
AM-EXT | 14 | 13 | 18 | 51.85 | 43.75 | 47.46 |
AM-LOC | 187 | 160 | 176 | 53.89 | 51.52 | 52.68 |
AM-MNR | 189 | 153 | 155 | 55.26 | 54.94 | 55.10 |
AM-MOD | 518 | 47 | 33 | 91.68 | 94.01 | 92.83 |
AM-NEG | 207 | 31 | 23 | 86.97 | 90.00 | 88.46 |
AM-PNC | 47 | 62 | 68 | 43.12 | 40.87 | 41.96 |
AM-PRD | 1 | 5 | 4 | 16.67 | 20.00 | 18.18 |
AM-REC | 0 | 1 | 2 | 0.00 | 0.00 | 0.00 |
AM-TMP | 777 | 286 | 310 | 73.10 | 71.48 | 72.28 |
R-A0 | 178 | 27 | 46 | 86.83 | 79.46 | 82.98 |
R-A1 | 105 | 42 | 51 | 71.43 | 67.31 | 69.31 |
R-A2 | 5 | 6 | 11 | 45.45 | 31.25 | 37.04 |
R-A3 | 0 | 1 | 1 | 0.00 | 0.00 | 0.00 |
R-A4 | 0 | 0 | 1 | 0.00 | 0.00 | 0.00 |
R-AM-ADV | 0 | 0 | 2 | 0.00 | 0.00 | 0.00 |
R-AM-CAU | 2 | 0 | 2 | 100.00 | 50.00 | 66.67 |
R-AM-EXT | 0 | 0 | 1 | 0.00 | 0.00 | 0.00 |
R-AM-LOC | 8 | 2 | 13 | 80.00 | 38.10 | 51.61 |
R-AM-MNR | 1 | 3 | 5 | 25.00 | 16.67 | 20.00 |
R-AM-TMP | 26 | 9 | 26 | 74.29 | 50.00 | 59.77 |
V | 5258 | 10 | 9 | 99.81 | 99.83 | 99.82 |
Nominalizations
Memory and time considerations
- Memory: At least 4 GB main memory
- Time for inference:
  - ILP inference on 2400 sentences took 78835 ms
  - Beam search on 2400 sentences took 81746 ms
Performance: ILP inference
- Number of Sentences : 2416
- Number of Propositions : 3793
- Percentage of perfect props : 40.55
Label | corr. | excess | missed | prec. | rec. | F1 |
---|---|---|---|---|---|---|
Overall | 4646 | 1632 | 2981 | 74.00 | 60.92 | 66.82 |
A0 | 1188 | 304 | 589 | 79.62 | 66.85 | 72.68 |
A1 | 1629 | 513 | 997 | 76.05 | 62.03 | 68.33 |
A2 | 598 | 191 | 408 | 75.79 | 59.44 | 66.63 |
A3 | 138 | 39 | 86 | 77.97 | 61.61 | 68.83 |
A4 | 5 | 9 | 12 | 35.71 | 29.41 | 32.26 |
A5 | 1 | 0 | 0 | 100.00 | 100.00 | 100.00 |
A8 | 2 | 2 | 3 | 50.00 | 40.00 | 44.44 |
A9 | 0 | 0 | 2 | 0.00 | 0.00 | 0.00 |
AM-ADV | 4 | 7 | 17 | 36.36 | 19.05 | 25.00 |
AM-CAU | 0 | 1 | 0 | 0.00 | 0.00 | 0.00 |
AM-DIR | 1 | 2 | 1 | 33.33 | 50.00 | 40.00 |
AM-DIS | 0 | 0 | 2 | 0.00 | 0.00 | 0.00 |
AM-EXT | 20 | 14 | 14 | 58.82 | 58.82 | 58.82 |
AM-LOC | 102 | 68 | 97 | 60.00 | 51.26 | 55.28 |
AM-MNR | 216 | 98 | 125 | 68.79 | 63.34 | 65.95 |
AM-NEG | 18 | 4 | 11 | 81.82 | 62.07 | 70.59 |
AM-PNC | 3 | 0 | 8 | 100.00 | 27.27 | 42.86 |
AM-TMP | 279 | 78 | 123 | 78.15 | 69.40 | 73.52 |
R-A0 | 4 | 3 | 27 | 57.14 | 12.90 | 21.05 |
R-A1 | 4 | 3 | 12 | 57.14 | 25.00 | 34.78 |
R-A2 | 0 | 1 | 7 | 0.00 | 0.00 | 0.00 |
R-A3 | 0 | 0 | 2 | 0.00 | 0.00 | 0.00 |
R-A4 | 0 | 0 | 1 | 0.00 | 0.00 | 0.00 |
SUP | 434 | 295 | 437 | 59.53 | 49.83 | 54.25 |
V | 2513 | 195 | 148 | 92.80 | 94.44 | 93.61 |
Performance: Beam search
- Number of Sentences : 2416
- Number of Propositions : 3793
- Percentage of perfect props : 39.52
Label | corr. | excess | missed | prec. | rec. | F1 |
---|---|---|---|---|---|---|
Overall | 4566 | 1633 | 3061 | 73.66 | 59.87 | 66.05 |
A0 | 1181 | 360 | 596 | 76.64 | 66.46 | 71.19 |
A1 | 1585 | 499 | 1041 | 76.06 | 60.36 | 67.30 |
A2 | 590 | 190 | 416 | 75.64 | 58.65 | 66.07 |
A3 | 136 | 35 | 88 | 79.53 | 60.71 | 68.86 |
A4 | 4 | 10 | 13 | 28.57 | 23.53 | 25.81 |
A5 | 0 | 0 | 1 | 0.00 | 0.00 | 0.00 |
A8 | 2 | 2 | 3 | 50.00 | 40.00 | 44.44 |
A9 | 0 | 1 | 2 | 0.00 | 0.00 | 0.00 |
AM-ADV | 4 | 7 | 17 | 36.36 | 19.05 | 25.00 |
AM-CAU | 0 | 1 | 0 | 0.00 | 0.00 | 0.00 |
AM-DIR | 1 | 1 | 1 | 50.00 | 50.00 | 50.00 |
AM-DIS | 0 | 0 | 2 | 0.00 | 0.00 | 0.00 |
AM-EXT | 20 | 10 | 14 | 66.67 | 58.82 | 62.50 |
AM-LOC | 100 | 66 | 99 | 60.24 | 50.25 | 54.79 |
AM-MNR | 213 | 97 | 128 | 68.71 | 62.46 | 65.44 |
AM-NEG | 17 | 2 | 12 | 89.47 | 58.62 | 70.83 |
AM-PNC | 3 | 0 | 8 | 100.00 | 27.27 | 42.86 |
AM-TMP | 274 | 71 | 128 | 79.42 | 68.16 | 73.36 |
R-A0 | 4 | 3 | 27 | 57.14 | 12.90 | 21.05 |
R-A1 | 4 | 1 | 12 | 80.00 | 25.00 | 38.10 |
R-A2 | 0 | 0 | 7 | 0.00 | 0.00 | 0.00 |
R-A3 | 0 | 0 | 2 | 0.00 | 0.00 | 0.00 |
R-A4 | 0 | 0 | 1 | 0.00 | 0.00 | 0.00 |
SUP | 428 | 277 | 443 | 60.71 | 49.14 | 54.31 |
V | 2523 | 205 | 138 | 92.49 | 94.81 | 93.64 |