
Nominalizations

Nombank dictionary statistics

  • The following data comes from NOMLEX-plus-clean-1.0.
Nominalization Class Count
NOM 3934
NOMLIKE 1244
NOMING 359
ABLE-NOM 18
NOMADJ 503
NOMADJLIKE 142
PARTITIVE 509
ATTRIBUTE 417
RELATIONAL 331
WORK-OF-ART 188
ABILITY 112
ENVIRONMENT 91
GROUP 84
HALLMARK 38
JOB 28
VERSION 21
TYPE 17
EVENT 12
SHARE 12
ISSUE 11
CRISSCROSS 7
FIELD 6
Total 8084
  • Grouping these per type, we get
Nominalization type Count Percent
Verbal 5555 68.7
Adjectival 645 8.0
Other 1884 23.3
Total 8084
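
The per-type totals can be reproduced from the per-class counts. The sketch below does this in Python, assuming that "Verbal" covers NOM, NOMLIKE, NOMING and ABLE-NOM and that "Adjectival" covers NOMADJ and NOMADJLIKE. The report does not spell this grouping out; it is simply the one that matches the published totals. The names `CLASS_COUNTS`, `group_counts` and `percentages` are illustrative and not part of any NOMLEX tooling.

```python
# Per-class counts copied from the dictionary table in this report.
CLASS_COUNTS = {
    "NOM": 3934, "NOMLIKE": 1244, "NOMING": 359, "ABLE-NOM": 18,
    "NOMADJ": 503, "NOMADJLIKE": 142, "PARTITIVE": 509, "ATTRIBUTE": 417,
    "RELATIONAL": 331, "WORK-OF-ART": 188, "ABILITY": 112, "ENVIRONMENT": 91,
    "GROUP": 84, "HALLMARK": 38, "JOB": 28, "VERSION": 21, "TYPE": 17,
    "EVENT": 12, "SHARE": 12, "ISSUE": 11, "CRISSCROSS": 7, "FIELD": 6,
}

# Assumed grouping (inferred from the totals, not stated in the report).
VERBAL = {"NOM", "NOMLIKE", "NOMING", "ABLE-NOM"}
ADJECTIVAL = {"NOMADJ", "NOMADJLIKE"}


def group_counts(counts):
    """Sum class counts into the Verbal / Adjectival / Other groups."""
    groups = {"Verbal": 0, "Adjectival": 0, "Other": 0}
    for cls, n in counts.items():
        if cls in VERBAL:
            groups["Verbal"] += n
        elif cls in ADJECTIVAL:
            groups["Adjectival"] += n
        else:
            groups["Other"] += n
    return groups


def percentages(groups):
    """Express each group as a percentage of the total, to one decimal."""
    total = sum(groups.values())
    return {name: round(100.0 * n / total, 1) for name, n in groups.items()}


if __name__ == "__main__":
    groups = group_counts(CLASS_COUNTS)
    shares = percentages(groups)
    for name in groups:
        print(name, groups[name], shares[name])
```

Under this grouping the sums come out to 5555, 645 and 1884 entries, which agrees with the table.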

Nombank training data statistics

  • Statistics for distribution of NOMLEX classes over the training data
Type Number of occurrences
NOM 50612
NOMLIKE 22583
PARTITIVE 11841
ATTRIBUTE 10456
RELATIONAL 7527
ABILITY 5904
WORK_OF_ART 3852
GROUP 3633
NOMADJ 3280
NOMING 3252
SHARE 3192
ENVIRONMENT 3108
NOMADJLIKE 2700
JOB 1502
ISSUE 1031
VERSION 725
CRISSCROSS 346
FIELD 330
HALLMARK 299
EVENT 172
ABLE_NOM 60
UNKNOWN_CLASS 1
  • Statistics for distribution of NOMLEX classes over the training data, restricted to predicates that belong to at least one of the four deverbal classes
Type Number of occurrences
NOM 50612
NOMLIKE 22583
ATTRIBUTE 4805
PARTITIVE 3998
NOMING 3252
SHARE 2646
ABILITY 2583
WORK_OF_ART 2134
RELATIONAL 2090
NOMADJLIKE 1489
ENVIRONMENT 1094
GROUP 1078
JOB 933
NOMADJ 650
ISSUE 592
VERSION 543
TYPE 328
FIELD 251
CRISSCROSS 212
HALLMARK 164
ABLE_NOM 60

Version 3.0.2

To time the systems, the experiments were run on a machine with two 6-core Intel Xeon E5645 processors (12 MB cache, 2.40 GHz clock speed). ILP inference was done using Gurobi v4. The beam search does make use of the multiple cores.

Verb

Memory and time considerations

  • Memory: At least 5.5 GB main memory
  • Time for inference:
    • ILP inference on 2400 sentences took 197467 ms
    • Beam search on 2400 sentences took 163934 ms
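
To compare the two inference methods on a per-sentence basis, the totals above can be divided by the number of sentences. This is a minimal sketch; `ms_per_sentence` and `VERB_TIMINGS_MS` are hypothetical names, and the figure of 2400 sentences is taken from the timing bullets as written.

```python
def ms_per_sentence(total_ms: float, n_sentences: int) -> float:
    """Average wall-clock time per sentence, in milliseconds."""
    if n_sentences <= 0:
        raise ValueError("n_sentences must be positive")
    return total_ms / n_sentences


# Totals reported above for the verb system (2400 sentences each).
VERB_TIMINGS_MS = {"ILP": 197467, "Beam search": 163934}

if __name__ == "__main__":
    for name, total in VERB_TIMINGS_MS.items():
        print(f"{name}: {ms_per_sentence(total, 2400):.1f} ms/sentence")
```

For the verb system this works out to roughly 82.3 ms per sentence for ILP inference and 68.3 ms per sentence for beam search.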

Performance: ILP inference

  • Number of Sentences : 2416
  • Number of Propositions : 5267
  • Percentage of perfect props : 50.56

Label       corr.  excess  missed   prec.    rec.      F1
Overall     10610    3312    3467   76.21   75.37   75.79
A0           3086     530     477   85.34   86.61   85.97
A1           3792     960    1135   79.80   76.96   78.36
A2            686     316     424   68.46   61.80   64.96
A3             94      53      79   63.95   54.34   58.75
A4             74      30      28   71.15   72.55   71.84
A5              3       4       2   42.86   60.00   50.00
AM              0       4       0    0.00    0.00    0.00
AM-ADV        264     260     242   50.38   52.17   51.26
AM-CAU         33      42      40   44.00   45.21   44.59
AM-DIR         36      39      49   48.00   42.35   45.00
AM-DIS        246     106      74   69.89   76.88   73.21
AM-EXT         14      12      18   53.85   43.75   48.28
AM-LOC        192     178     171   51.89   52.89   52.39
AM-MNR        192     181     152   51.47   55.81   53.56
AM-MOD        524      49      27   91.45   95.10   93.24
AM-NEG        207      29      23   87.71   90.00   88.84
AM-PNC         44      70      71   38.60   38.26   38.43
AM-PRD          1       6       4   14.29   20.00   16.67
AM-REC          0       1       2    0.00    0.00    0.00
AM-TMP        789     327     298   70.70   72.59   71.63
R-A0          182      47      42   79.48   81.25   80.35
R-A1          108      53      48   67.08   69.23   68.14
R-A2            6       4      10   60.00   37.50   46.15
R-A3            0       1       1    0.00    0.00    0.00
R-A4            0       0       1    0.00    0.00    0.00
R-AM-ADV        0       0       2    0.00    0.00    0.00
R-AM-CAU        1       0       3  100.00   25.00   40.00
R-AM-EXT        0       0       1    0.00    0.00    0.00
R-AM-LOC        8       1      13   88.89   38.10   53.33
R-AM-MNR        2       2       4   50.00   33.33   40.00
R-AM-TMP       26       7      26   78.79   50.00   61.18
V            5259       8       8   99.85   99.85   99.85
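
The prec., rec. and F1 columns can be derived from the three count columns. The sketch below assumes the usual reading of these counts: corr. is the number of correctly predicted arguments, excess the number of predicted arguments that are not in the gold annotation, and missed the number of gold arguments that were not predicted. The report does not state the formula; this reading is an assumption that happens to reproduce the Overall row. The function name `prf` is illustrative.

```python
def prf(corr: int, excess: int, missed: int):
    """Return (precision, recall, F1) as percentages.

    precision = corr / (corr + excess)
    recall    = corr / (corr + missed)
    F1        = harmonic mean of precision and recall
    A zero denominator yields 0.0, as in the all-zero rows of the table.
    """
    predicted = corr + excess
    gold = corr + missed
    prec = 100.0 * corr / predicted if predicted else 0.0
    rec = 100.0 * corr / gold if gold else 0.0
    f1 = 2.0 * prec * rec / (prec + rec) if (prec + rec) else 0.0
    return prec, rec, f1


if __name__ == "__main__":
    # Overall row of the verb ILP results above.
    prec, rec, f1 = prf(10610, 3312, 3467)
    print(f"{prec:.2f} {rec:.2f} {f1:.2f}")
```

Applied to the Overall counts (10610, 3312, 3467), this gives 76.21, 75.37 and 75.79, in agreement with the table.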


Performance: Beam search

  • Number of Sentences : 2416
  • Number of Propositions : 5267
  • Percentage of perfect props : 50.39

Label       corr.  excess  missed   prec.    rec.      F1
Overall     10420    3009    3657   77.59   74.02   75.77
A0           3045     490     518   86.14   85.46   85.80
A1           3702     903    1225   80.39   75.14   77.68
A2            664     289     446   69.67   59.82   64.37
A3             91      45      82   66.91   52.60   58.90
A4             75      27      27   73.53   73.53   73.53
A5              3       2       2   60.00   60.00   60.00
AM              0       1       0    0.00    0.00    0.00
AM-ADV        263     244     243   51.87   51.98   51.92
AM-CAU         33      36      40   47.83   45.21   46.48
AM-DIR         35      31      50   53.03   41.18   46.36
AM-DIS        244      93      76   72.40   76.25   74.28
AM-EXT         14      13      18   51.85   43.75   47.46
AM-LOC        187     160     176   53.89   51.52   52.68
AM-MNR        189     153     155   55.26   54.94   55.10
AM-MOD        518      47      33   91.68   94.01   92.83
AM-NEG        207      31      23   86.97   90.00   88.46
AM-PNC         47      62      68   43.12   40.87   41.96
AM-PRD          1       5       4   16.67   20.00   18.18
AM-REC          0       1       2    0.00    0.00    0.00
AM-TMP        777     286     310   73.10   71.48   72.28
R-A0          178      27      46   86.83   79.46   82.98
R-A1          105      42      51   71.43   67.31   69.31
R-A2            5       6      11   45.45   31.25   37.04
R-A3            0       1       1    0.00    0.00    0.00
R-A4            0       0       1    0.00    0.00    0.00
R-AM-ADV        0       0       2    0.00    0.00    0.00
R-AM-CAU        2       0       2  100.00   50.00   66.67
R-AM-EXT        0       0       1    0.00    0.00    0.00
R-AM-LOC        8       2      13   80.00   38.10   51.61
R-AM-MNR        1       3       5   25.00   16.67   20.00
R-AM-TMP       26       9      26   74.29   50.00   59.77
V            5258      10       9   99.81   99.83   99.82


Nominalizations

Memory and time considerations

  • Memory: At least 4 GB main memory
  • Time for inference:
    • ILP inference on 2400 sentences took 78835 ms
    • Beam search on 2400 sentences took 81746 ms

Performance: ILP inference

  • Number of Sentences : 2416
  • Number of Propositions : 3793
  • Percentage of perfect props : 40.55

Label       corr.  excess  missed   prec.    rec.      F1
Overall      4646    1632    2981   74.00   60.92   66.82
A0           1188     304     589   79.62   66.85   72.68
A1           1629     513     997   76.05   62.03   68.33
A2            598     191     408   75.79   59.44   66.63
A3            138      39      86   77.97   61.61   68.83
A4              5       9      12   35.71   29.41   32.26
A5              1       0       0  100.00  100.00  100.00
A8              2       2       3   50.00   40.00   44.44
A9              0       0       2    0.00    0.00    0.00
AM-ADV          4       7      17   36.36   19.05   25.00
AM-CAU          0       1       0    0.00    0.00    0.00
AM-DIR          1       2       1   33.33   50.00   40.00
AM-DIS          0       0       2    0.00    0.00    0.00
AM-EXT         20      14      14   58.82   58.82   58.82
AM-LOC        102      68      97   60.00   51.26   55.28
AM-MNR        216      98     125   68.79   63.34   65.95
AM-NEG         18       4      11   81.82   62.07   70.59
AM-PNC          3       0       8  100.00   27.27   42.86
AM-TMP        279      78     123   78.15   69.40   73.52
R-A0            4       3      27   57.14   12.90   21.05
R-A1            4       3      12   57.14   25.00   34.78
R-A2            0       1       7    0.00    0.00    0.00
R-A3            0       0       2    0.00    0.00    0.00
R-A4            0       0       1    0.00    0.00    0.00
SUP           434     295     437   59.53   49.83   54.25
V            2513     195     148   92.80   94.44   93.61


Performance: Beam search

  • Number of Sentences : 2416
  • Number of Propositions : 3793
  • Percentage of perfect props : 39.52

Label       corr.  excess  missed   prec.    rec.      F1
Overall      4566    1633    3061   73.66   59.87   66.05
A0           1181     360     596   76.64   66.46   71.19
A1           1585     499    1041   76.06   60.36   67.30
A2            590     190     416   75.64   58.65   66.07
A3            136      35      88   79.53   60.71   68.86
A4              4      10      13   28.57   23.53   25.81
A5              0       0       1    0.00    0.00    0.00
A8              2       2       3   50.00   40.00   44.44
A9              0       1       2    0.00    0.00    0.00
AM-ADV          4       7      17   36.36   19.05   25.00
AM-CAU          0       1       0    0.00    0.00    0.00
AM-DIR          1       1       1   50.00   50.00   50.00
AM-DIS          0       0       2    0.00    0.00    0.00
AM-EXT         20      10      14   66.67   58.82   62.50
AM-LOC        100      66      99   60.24   50.25   54.79
AM-MNR        213      97     128   68.71   62.46   65.44
AM-NEG         17       2      12   89.47   58.62   70.83
AM-PNC          3       0       8  100.00   27.27   42.86
AM-TMP        274      71     128   79.42   68.16   73.36
R-A0            4       3      27   57.14   12.90   21.05
R-A1            4       1      12   80.00   25.00   38.10
R-A2            0       0       7    0.00    0.00    0.00
R-A3            0       0       2    0.00    0.00    0.00
R-A4            0       0       1    0.00    0.00    0.00
SUP           428     277     443   60.71   49.14   54.31
V            2523     205     138   92.49   94.81   93.64