Commit 69170438 authored by Akash Kothari

pushing runtime experiment profile

parent e73b5097
2725.121326
+++++
conf1 1 1 78.78 0.0
1 gpu conv fp32 11 add fp32 1 tanh fp32 1 pool_max fp32 1
2 gpu conv fp32 11 add fp32 1 tanh fp32 1 pool_max fp32 1
3 gpu conv fp32 11 add fp32 1 tanh fp32 1
4 gpu conv fp32 11 add fp32 1 tanh fp32 1
5 gpu conv fp32 11 add fp32 1 tanh fp32 1 pool_max fp32 1
6 gpu mul fp32 11 add fp32 1
7 gpu softmax fp32 1
-----
+++++
conf2 2.1233638648528457 1.6150951710244676 78.3544 0.42560000000000286
1 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv fp16 12 add fp16 12 tanh fp16 12
4 gpu conv fp16 12 add fp16 12 tanh fp16 12
5 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
+++++
conf3 2.051295134864554 1.6122580072322763 78.3278 0.4522000000000048
1 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 267 add fp16 12 tanh fp16 12
4 gpu conv fp16 12 add fp16 12 tanh fp16 12
5 gpu conv samp_fp16 269 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
+++++
conf4 2.188609573694276 1.688911612634961 78.30120000000001 0.47879999999999256
1 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv samp_fp16 268 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv fp16 12 add fp16 12 tanh fp16 12
4 gpu conv fp16 12 add fp16 12 tanh fp16 12
5 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
+++++
conf5 2.0570505767108007 1.6000014977491621 78.2214 0.5585999999999984
1 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv samp_fp16 265 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 267 add fp16 12 tanh fp16 12
4 gpu conv fp16 12 add fp16 12 tanh fp16 12
5 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
+++++
conf6 2.009166522889861 1.5755494376470724 78.1948 0.5852000000000004
1 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv samp_fp16 269 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 267 add fp16 12 tanh fp16 12
4 gpu conv fp16 12 add fp16 12 tanh fp16 12
5 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
+++++
conf7 2.0188668300066377 1.5976556515195433 78.06179999999999 0.7182000000000102
1 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv samp_fp16 268 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv fp16 12 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 266 add fp16 12 tanh fp16 12
5 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
+++++
conf8 2.1797184471932716 1.6767378001241562 78.06179999999999 0.7182000000000102
1 gpu conv samp_fp16 263 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv samp_fp16 263 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv fp16 12 add fp16 12 tanh fp16 12
4 gpu conv fp16 12 add fp16 12 tanh fp16 12
5 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
+++++
conf9 2.064914192886025 1.6203964986881603 78.06179999999999 0.7182000000000102
1 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv samp_fp16 263 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv fp16 12 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 269 add fp16 12 tanh fp16 12
5 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
+++++
conf10 2.2070171560926672 1.7194657877315815 78.0352 0.7447999999999979
1 gpu conv samp_fp16 263 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv samp_fp16 265 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 267 add fp16 12 tanh fp16 12
4 gpu conv fp16 12 add fp16 12 tanh fp16 12
5 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
+++++
conf11 2.0161469236407057 1.5964768988685245 78.0086 0.7713999999999999
1 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv samp_fp16 269 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv fp16 12 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 269 add fp16 12 tanh fp16 12
5 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
+++++
conf12 2.157846755426679 1.6765250202752133 78.0086 0.7713999999999999
1 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv fp16 12 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 269 add fp16 12 tanh fp16 12
5 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
+++++
conf13 2.0319664118931096 1.6183541826275754 77.98200000000001 0.7979999999999876
1 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv samp_fp16 269 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 267 add fp16 12 tanh fp16 12
4 gpu conv fp16 12 add fp16 12 tanh fp16 12
5 gpu conv samp_fp16 269 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
+++++
conf14 2.354997704376988 1.7779732164691666 77.98200000000001 0.7979999999999876
1 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv fp16 12 add fp16 12 tanh fp16 12
4 gpu conv fp16 12 add fp16 12 tanh fp16 12
5 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
+++++
conf15 2.3463673263694 1.8510470086526165 77.98200000000001 0.7979999999999876
1 gpu conv samp_fp16 264 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv samp_fp16 263 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 267 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12
5 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
+++++
conf16 2.284714727579521 1.7855758235498087 77.7692 1.0108000000000033
1 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv fp16 12 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12
5 gpu conv samp_fp16 269 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
+++++
conf17 2.3463673263694 1.8510470086526165 77.68939999999999 1.0906000000000091
1 gpu conv samp_fp16 264 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv samp_fp16 263 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 267 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12
5 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
+++++
conf18 2.427840309027486 1.9007943438562696 77.68939999999999 1.0906000000000091
1 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv samp_fp16 263 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 267 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12
5 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
+++++
conf19 2.4671009475732766 1.9246545843862224 77.47659999999999 1.3034000000000106
1 gpu conv samp_fp16 264 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 267 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12
5 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
+++++
conf20 2.5567127702266332 1.9773019485322874 77.2638 1.5161999999999978
1 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 267 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12
5 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
+++++
conf21 2.557898283218207 1.9895818051250724 77.2372 1.5427999999999997
1 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv fp16 12 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12
5 gpu conv samp_fp16 267 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
+++++
conf22 2.557898283218207 1.9895818051250724 77.21060000000001 1.5693999999999875
1 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv fp16 12 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12
5 gpu conv samp_fp16 267 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
+++++
conf23 2.6457265307759883 2.029290916760937 77.1574 1.6226000000000056
1 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12
5 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
1129.3450630000002
+++++
conf1 1 1 84.76 0.0
1 gpu conv fp32 11 add fp32 1 tanh fp32 1
2 gpu conv fp32 11 add fp32 1 tanh fp32 1 pool_max fp32 1
3 gpu conv fp32 11 add fp32 1 tanh fp32 1
4 gpu conv fp32 11 add fp32 1 tanh fp32 1 pool_max fp32 1
5 gpu conv fp32 11 add fp32 1 tanh fp32 1
6 gpu conv fp32 11 add fp32 1 tanh fp32 1 pool_max fp32 1
7 gpu mul fp32 11 add fp32 1
8 gpu softmax fp32 1
-----
+++++
conf2 2.2258170210610477 1.3875307929727092 84.74 0.020000000000010232
1 gpu conv fp16 11 add fp16 12 tanh fp16 12
2 gpu conv perf_fp16 151 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv fp16 12 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
5 gpu conv fp16 12 add fp16 12 tanh fp16 12
6 gpu conv perf_fp16 160 add fp16 12 tanh fp16 12 pool_max fp16 12
7 gpu mul fp16 12 add fp16 12
8 gpu softmax fp16 12
-----
+++++
conf3 2.3673182996864846 1.4566777038051897 84.49999999999999 0.2600000000000193
1 gpu conv fp16 12 add fp16 12 tanh fp16 12
2 gpu conv perf_fp16 153 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
5 gpu conv fp16 12 add fp16 12 tanh fp16 12
6 gpu conv perf_fp16 160 add fp16 12 tanh fp16 12 pool_max fp16 12
7 gpu mul fp16 12 add fp16 12
8 gpu softmax fp16 12
-----
+++++
conf4 2.24614762418964 1.41800542976017 84.25999999999999 0.5000000000000142
1 gpu conv fp16 12 add fp16 12 tanh fp16 12
2 gpu conv perf_fp16 158 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
5 gpu conv samp_fp16 268 add fp16 12 tanh fp16 12
6 gpu conv perf_fp16 160 add fp16 12 tanh fp16 12 pool_max fp16 12
7 gpu mul fp16 12 add fp16 12
8 gpu softmax fp16 12
-----
+++++
conf5 2.304084258604824 1.4284953488024343 84.228 0.5320000000000107
1 gpu conv fp16 11 add fp16 12 tanh fp16 12
2 gpu conv perf_fp16 151 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 267 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
5 gpu conv fp16 12 add fp16 12 tanh fp16 12
6 gpu conv perf_fp16 160 add fp16 12 tanh fp16 12 pool_max fp16 12
7 gpu mul fp16 12 add fp16 12
8 gpu softmax fp16 12
-----
+++++
conf6 2.3377766277342653 1.4440340860007412 84.228 0.5320000000000107
1 gpu conv fp16 11 add fp16 12 tanh fp16 12
2 gpu conv perf_fp16 153 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
5 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12
6 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
7 gpu mul fp16 12 add fp16 12
8 gpu softmax fp16 12
-----
+++++
conf7 2.24614762418964 1.41800542976017 84.17479999999999 0.5852000000000146
1 gpu conv fp16 11 add fp16 12 tanh fp16 12
2 gpu conv perf_fp16 158 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
5 gpu conv samp_fp16 268 add fp16 12 tanh fp16 12
6 gpu conv perf_fp16 160 add fp16 12 tanh fp16 12 pool_max fp16 12
7 gpu mul fp16 12 add fp16 12
8 gpu softmax fp16 12
-----
+++++
conf8 2.3673182996864846 1.4566777038051897 84.095 0.6650000000000063
1 gpu conv fp16 11 add fp16 12 tanh fp16 12
2 gpu conv perf_fp16 153 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
5 gpu conv fp16 12 add fp16 12 tanh fp16 12
6 gpu conv perf_fp16 160 add fp16 12 tanh fp16 12 pool_max fp16 12
7 gpu mul fp16 12 add fp16 12
8 gpu softmax fp16 12
-----
+++++
conf9 2.2463714607055545 1.417884448648111 83.8024 0.9575999999999993
1 gpu conv fp16 11 add fp16 12 tanh fp16 12
2 gpu conv perf_fp16 158 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
5 gpu conv samp_fp16 266 add fp16 12 tanh fp16 12
6 gpu conv perf_fp16 160 add fp16 12 tanh fp16 12 pool_max fp16 12
7 gpu mul fp16 12 add fp16 12
8 gpu softmax fp16 12
-----
+++++
conf10 2.389025803395913 1.4732901147183992 83.77579999999999 0.9842000000000155
1 gpu conv fp16 11 add fp16 12 tanh fp16 12
2 gpu conv perf_fp16 153 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
5 gpu conv samp_fp16 268 add fp16 12 tanh fp16 12
6 gpu conv perf_fp16 160 add fp16 12 tanh fp16 12 pool_max fp16 12
7 gpu mul fp16 12 add fp16 12
8 gpu softmax fp16 12
-----
+++++
conf11 2.288831273542033 1.435952475412438 83.61619999999999 1.143800000000013
1 gpu conv fp16 11 add fp16 12 tanh fp16 12
2 gpu conv perf_fp16 158 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
5 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12
6 gpu conv perf_fp16 160 add fp16 12 tanh fp16 12 pool_max fp16 12
7 gpu mul fp16 12 add fp16 12
8 gpu softmax fp16 12
-----
+++++
conf12 2.288831273542033 1.435952475412438 83.58959999999999 1.170400000000015
1 gpu conv fp16 12 add fp16 12 tanh fp16 12
2 gpu conv perf_fp16 158 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
5 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12
6 gpu conv perf_fp16 160 add fp16 12 tanh fp16 12 pool_max fp16 12
7 gpu mul fp16 12 add fp16 12
8 gpu softmax fp16 12
-----
+++++
conf13 2.389025803395913 1.4732901147183992 83.58959999999999 1.170400000000015
1 gpu conv fp16 11 add fp16 12 tanh fp16 12
2 gpu conv perf_fp16 153 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
5 gpu conv samp_fp16 268 add fp16 12 tanh fp16 12
6 gpu conv perf_fp16 160 add fp16 12 tanh fp16 12 pool_max fp16 12
7 gpu mul fp16 12 add fp16 12
8 gpu softmax fp16 12
-----
+++++
conf14 2.3892790238475423 1.4731595166090572 83.4566 1.3034000000000106
1 gpu conv fp16 11 add fp16 12 tanh fp16 12
2 gpu conv perf_fp16 153 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
5 gpu conv samp_fp16 266 add fp16 12 tanh fp16 12
6 gpu conv perf_fp16 160 add fp16 12 tanh fp16 12 pool_max fp16 12
7 gpu mul fp16 12 add fp16 12
8 gpu softmax fp16 12
-----
+++++
conf15 2.390450803781405 1.4707319718833016 83.3768 1.3832000000000022
1 gpu conv fp16 11 add fp16 12 tanh fp16 12
2 gpu conv perf_fp16 153 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
5 gpu conv samp_fp16 266 add fp16 12 tanh fp16 12
6 gpu conv perf_fp16 157 add fp16 12 tanh fp16 12 pool_max fp16 12
7 gpu mul fp16 12 add fp16 12
8 gpu softmax fp16 12
-----
+++++
conf16 2.4373708430335537 1.49267343110314 83.3768 1.3832000000000022
1 gpu conv fp16 11 add fp16 12 tanh fp16 12
2 gpu conv perf_fp16 153 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
5 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12
6 gpu conv perf_fp16 160 add fp16 12 tanh fp16 12 pool_max fp16 12
7 gpu mul fp16 12 add fp16 12
8 gpu softmax fp16 12
-----
+++++
conf17 2.4373708430335537 1.49267343110314 83.2704 1.48960000000001
1 gpu conv fp16 12 add fp16 12 tanh fp16 12
2 gpu conv perf_fp16 153 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12
4 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
5 gpu conv samp_fp16 261 add fp16 12 tanh fp16 12
6 gpu conv perf_fp16 160 add fp16 12 tanh fp16 12 pool_max fp16 12
7 gpu mul fp16 12 add fp16 12
8 gpu softmax fp16 12
-----
3845.438677999999
+++++
conf1 1 1 68.42 0.0
1 gpu conv fp32 11 add fp32 1 relu fp32 1
2 gpu conv fp32 11 add fp32 1 relu fp32 1 pool_max fp32 1
3 gpu conv fp32 11 add fp32 1 relu fp32 1
4 gpu conv fp32 11 add fp32 1 relu fp32 1 pool_max fp32 1
5 gpu conv fp32 11 add fp32 1 relu fp32 1
6 gpu conv fp32 11 add fp32 1 relu fp32 1
7 gpu conv fp32 11 add fp32 1 relu fp32 1 pool_max fp32 1
8 gpu conv fp32 11 add fp32 1 relu fp32 1
9 gpu conv fp32 11 add fp32 1 relu fp32 1
10 gpu conv fp32 11 add fp32 1 relu fp32 1 pool_max fp32 1
11 gpu conv fp32 11 add fp32 1 relu fp32 1
12 gpu conv fp32 11 add fp32 1 relu fp32 1
13 gpu conv fp32 11 add fp32 1 relu fp32 1 pool_max fp32 1
14 gpu mul fp32 11 add fp32 1 relu fp32 1
15 gpu mul fp32 11 add fp32 1
16 gpu softmax fp32 1
-----
+++++
conf2 2.4361074671227554 1.7555866253938424 67.22 1.2000000000000028
1 gpu conv fp16 11 add fp16 12 relu fp16 12
2 gpu conv perf_fp16 163 add fp16 12 relu fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 262 add fp16 12 relu fp16 12
4 gpu conv samp_fp16 269 add fp16 12 relu fp16 12 pool_max fp16 12
5 gpu conv fp16 12 add fp16 12 relu fp16 12
6 gpu conv samp_fp16 261 add fp16 12 relu fp16 12
7 gpu conv samp_fp16 262 add fp16 12 relu fp16 12 pool_max fp16 12
8 gpu conv perf_fp16 155 add fp16 12 relu fp16 12
9 gpu conv samp_fp16 261 add fp16 12 relu fp16 12
10 gpu conv samp_fp16 262 add fp16 12 relu fp16 12 pool_max fp16 12
11 gpu conv fp16 11 add fp16 12 relu fp16 12
12 gpu conv fp16 11 add fp16 12 relu fp16 12
13 gpu conv samp_fp16 264 add fp16 12 relu fp16 12 pool_max fp16 12
14 gpu mul fp16 12 add fp16 12 relu fp16 12
15 gpu mul fp16 12 add fp16 12
16 gpu softmax fp16 12
-----
+++++
conf3 2.602684148359414 1.8286503060252126 67.10000000000001 1.3199999999999932
1 gpu conv fp16 12 add fp16 12 relu fp16 12
2 gpu conv perf_fp16 156 add fp16 12 relu fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 262 add fp16 12 relu fp16 12
4 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
5 gpu conv fp16 12 add fp16 12 relu fp16 12
6 gpu conv fp16 11 add fp16 12 relu fp16 12
7 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
8 gpu conv perf_fp16 155 add fp16 12 relu fp16 12
9 gpu conv samp_fp16 262 add fp16 12 relu fp16 12
10 gpu conv samp_fp16 262 add fp16 12 relu fp16 12 pool_max fp16 12
11 gpu conv perf_fp16 152 add fp16 12 relu fp16 12
12 gpu conv perf_fp16 151 add fp16 12 relu fp16 12
13 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
14 gpu mul fp16 12 add fp16 12 relu fp16 12
15 gpu mul fp16 12 add fp16 12
16 gpu softmax fp16 12
-----
+++++
conf4 2.661880095451371 1.886369953641946 67.06 1.3599999999999994
1 gpu conv fp16 12 add fp16 12 relu fp16 12
2 gpu conv perf_fp16 156 add fp16 12 relu fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 262 add fp16 12 relu fp16 12
4 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
5 gpu conv fp16 12 add fp16 12 relu fp16 12
6 gpu conv samp_fp16 261 add fp16 12 relu fp16 12
7 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
8 gpu conv perf_fp16 155 add fp16 12 relu fp16 12
9 gpu conv samp_fp16 262 add fp16 12 relu fp16 12
10 gpu conv samp_fp16 262 add fp16 12 relu fp16 12 pool_max fp16 12
11 gpu conv perf_fp16 152 add fp16 12 relu fp16 12
12 gpu conv perf_fp16 151 add fp16 12 relu fp16 12
13 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
14 gpu mul fp16 12 add fp16 12 relu fp16 12
15 gpu mul fp16 12 add fp16 12
16 gpu softmax fp16 12
-----
+++++
conf5 2.5990656605003855 1.8588553950032938 66.84 1.5799999999999983
1 gpu conv fp16 11 add fp16 12 relu fp16 12
2 gpu conv perf_fp16 163 add fp16 12 relu fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 262 add fp16 12 relu fp16 12
4 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
5 gpu conv fp16 12 add fp16 12 relu fp16 12
6 gpu conv samp_fp16 262 add fp16 12 relu fp16 12
7 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
8 gpu conv perf_fp16 155 add fp16 12 relu fp16 12
9 gpu conv samp_fp16 262 add fp16 12 relu fp16 12
10 gpu conv samp_fp16 262 add fp16 12 relu fp16 12 pool_max fp16 12
11 gpu conv perf_fp16 152 add fp16 12 relu fp16 12
12 gpu conv perf_fp16 151 add fp16 12 relu fp16 12
13 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
14 gpu mul fp16 12 add fp16 12 relu fp16 12
15 gpu mul fp16 12 add fp16 12
16 gpu softmax fp16 12
-----
+++++
conf6 2.5884968081531485 1.8594972115815722 66.8 1.6200000000000045
1 gpu conv fp16 11 add fp16 12 relu fp16 12
2 gpu conv perf_fp16 165 add fp16 12 relu fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 262 add fp16 12 relu fp16 12
4 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
5 gpu conv fp16 12 add fp16 12 relu fp16 12
6 gpu conv samp_fp16 261 add fp16 12 relu fp16 12
7 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
8 gpu conv perf_fp16 155 add fp16 12 relu fp16 12
9 gpu conv samp_fp16 262 add fp16 12 relu fp16 12
10 gpu conv samp_fp16 262 add fp16 12 relu fp16 12 pool_max fp16 12
11 gpu conv perf_fp16 152 add fp16 12 relu fp16 12
12 gpu conv perf_fp16 151 add fp16 12 relu fp16 12
13 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
14 gpu mul fp16 12 add fp16 12 relu fp16 12
15 gpu mul fp16 12 add fp16 12
16 gpu softmax fp16 12
-----
+++++
conf7 2.4323231936537972 1.8028228076034056 66.8 1.6200000000000045
1 gpu conv fp16 11 add fp16 12 relu fp16 12
2 gpu conv samp_fp16 269 add fp16 12 relu fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 262 add fp16 12 relu fp16 12
4 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
5 gpu conv fp16 12 add fp16 12 relu fp16 12
6 gpu conv samp_fp16 261 add fp16 12 relu fp16 12
7 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
8 gpu conv perf_fp16 155 add fp16 12 relu fp16 12
9 gpu conv samp_fp16 262 add fp16 12 relu fp16 12
10 gpu conv samp_fp16 262 add fp16 12 relu fp16 12 pool_max fp16 12
11 gpu conv perf_fp16 152 add fp16 12 relu fp16 12
12 gpu conv perf_fp16 151 add fp16 12 relu fp16 12
13 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
14 gpu mul fp16 12 add fp16 12 relu fp16 12
15 gpu mul fp16 12 add fp16 12
16 gpu softmax fp16 12
-----
+++++
conf8 2.575472326184571 1.8375078883357683 66.72 1.7000000000000028
1 gpu conv fp16 11 add fp16 12 relu fp16 12
2 gpu conv perf_fp16 161 add fp16 12 relu fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 262 add fp16 12 relu fp16 12
4 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
5 gpu conv fp16 12 add fp16 12 relu fp16 12
6 gpu conv samp_fp16 261 add fp16 12 relu fp16 12
7 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
8 gpu conv perf_fp16 155 add fp16 12 relu fp16 12
9 gpu conv samp_fp16 262 add fp16 12 relu fp16 12
10 gpu conv samp_fp16 262 add fp16 12 relu fp16 12 pool_max fp16 12
11 gpu conv perf_fp16 152 add fp16 12 relu fp16 12
12 gpu conv fp16 11 add fp16 12 relu fp16 12
13 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
14 gpu mul fp16 12 add fp16 12 relu fp16 12
15 gpu mul fp16 12 add fp16 12
16 gpu softmax fp16 12
-----
+++++
conf9 2.4912510106198957 1.848807665058795 66.58 1.8400000000000034
1 gpu conv fp16 11 add fp16 12 relu fp16 12
2 gpu conv samp_fp16 266 add fp16 12 relu fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 262 add fp16 12 relu fp16 12
4 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
5 gpu conv samp_fp16 262 add fp16 12 relu fp16 12
6 gpu conv samp_fp16 261 add fp16 12 relu fp16 12
7 gpu conv samp_fp16 262 add fp16 12 relu fp16 12 pool_max fp16 12
8 gpu conv perf_fp16 155 add fp16 12 relu fp16 12
9 gpu conv samp_fp16 261 add fp16 12 relu fp16 12
10 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
11 gpu conv samp_fp16 262 add fp16 12 relu fp16 12
12 gpu conv perf_fp16 152 add fp16 12 relu fp16 12
13 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
14 gpu mul fp16 12 add fp16 12 relu fp16 12
15 gpu mul fp16 12 add fp16 12
16 gpu softmax fp16 12
-----
+++++
conf10 2.4323231936537972 1.8028228076034056 66.53999999999999 1.8800000000000097
1 gpu conv fp16 11 add fp16 12 relu fp16 12
2 gpu conv samp_fp16 269 add fp16 12 relu fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 262 add fp16 12 relu fp16 12
4 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
5 gpu conv fp16 12 add fp16 12 relu fp16 12
6 gpu conv samp_fp16 261 add fp16 12 relu fp16 12
7 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
8 gpu conv perf_fp16 155 add fp16 12 relu fp16 12
9 gpu conv samp_fp16 262 add fp16 12 relu fp16 12
10 gpu conv samp_fp16 262 add fp16 12 relu fp16 12 pool_max fp16 12
11 gpu conv perf_fp16 152 add fp16 12 relu fp16 12
12 gpu conv perf_fp16 151 add fp16 12 relu fp16 12
13 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
14 gpu mul fp16 12 add fp16 12 relu fp16 12
15 gpu mul fp16 12 add fp16 12
16 gpu softmax fp16 12
-----
+++++
conf11 2.4027045398540046 1.7853827712848849 66.47999999999999 1.940000000000012
1 gpu conv fp16 11 add fp16 12 relu fp16 12
2 gpu conv samp_fp16 269 add fp16 12 relu fp16 12 pool_max fp16 12
3 gpu conv samp_fp16 262 add fp16 12 relu fp16 12
4 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
5 gpu conv fp16 12 add fp16 12 relu fp16 12
6 gpu conv samp_fp16 261 add fp16 12 relu fp16 12
7 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
8 gpu conv perf_fp16 155 add fp16 12 relu fp16 12
9 gpu conv samp_fp16 262 add fp16 12 relu fp16 12
10 gpu conv samp_fp16 262 add fp16 12 relu fp16 12 pool_max fp16 12
11 gpu conv perf_fp16 160 add fp16 12 relu fp16 12
12 gpu conv perf_fp16 151 add fp16 12 relu fp16 12
13 gpu conv samp_fp16 261 add fp16 12 relu fp16 12 pool_max fp16 12
14 gpu mul fp16 12 add fp16 12 relu fp16 12
15 gpu mul fp16 12 add fp16 12
16 gpu softmax fp16 12
-----
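
The files in this commit share a simple block layout: a leading number, then configurations delimited by `+++++` and `-----`, each with a header line followed by one line per layer. A minimal sketch of a reader for that layout follows. The field names are assumptions, not taken from the source: the fourth and fifth header fields look like accuracy and accuracy loss (78.78 - 78.3544 matches the 0.4256 on conf2), while `speedup` and `energy` are guesses for the second and third fields, which are both 1 for the baseline. `Config` and `parse_configs` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Config:
    name: str
    speedup: float        # assumed meaning of the 2nd header field
    energy: float         # assumed meaning of the 3rd header field
    accuracy: float       # 4th header field
    accuracy_loss: float  # 5th header field (baseline accuracy minus accuracy)
    layers: List[str] = field(default_factory=list)  # raw per-layer lines


def parse_configs(text: str) -> List[Config]:
    """Parse blocks delimited by '+++++' and '-----' into Config records."""
    configs: List[Config] = []
    current = None
    in_block = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "+++++":
            in_block, current = True, None
        elif line == "-----":
            if current is not None:
                configs.append(current)
            in_block, current = False, None
        elif in_block and current is None:
            # First line inside a block is the header.
            name, a, b, acc, loss = line.split()
            current = Config(name, float(a), float(b), float(acc), float(loss))
        elif in_block:
            current.layers.append(line)
        # Lines outside any block (such as the leading number) are skipped.
    return configs


# Two blocks copied from the first file in this commit.
SAMPLE = """\
2725.121326
+++++
conf1 1 1 78.78 0.0
1 gpu conv fp32 11 add fp32 1 tanh fp32 1 pool_max fp32 1
2 gpu conv fp32 11 add fp32 1 tanh fp32 1 pool_max fp32 1
3 gpu conv fp32 11 add fp32 1 tanh fp32 1
4 gpu conv fp32 11 add fp32 1 tanh fp32 1
5 gpu conv fp32 11 add fp32 1 tanh fp32 1 pool_max fp32 1
6 gpu mul fp32 11 add fp32 1
7 gpu softmax fp32 1
-----
+++++
conf2 2.1233638648528457 1.6150951710244676 78.3544 0.42560000000000286
1 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
2 gpu conv samp_fp16 262 add fp16 12 tanh fp16 12 pool_max fp16 12
3 gpu conv fp16 12 add fp16 12 tanh fp16 12
4 gpu conv fp16 12 add fp16 12 tanh fp16 12
5 gpu conv fp16 12 add fp16 12 tanh fp16 12 pool_max fp16 12
6 gpu mul fp16 12 add fp16 12
7 gpu softmax fp16 12
-----
"""

configs = parse_configs(SAMPLE)
for c in configs:
    print(c.name, c.accuracy, c.accuracy_loss, len(c.layers))
```

The layer lines are kept as raw strings on purpose, since the meaning of the per-operation numeric codes (for example `11`, `12`, `262`) is not stated in this commit.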