[PATCH] D57669: [SLP] Fix incorrect cost tree calculation.

Dinar Temirbulatov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Feb 3 16:10:23 PST 2019


dtemirbulatov created this revision.
dtemirbulatov added reviewers: ABataev, RKSimon, spatel, anton-afanasyev, hfinkel.
Herald added a subscriber: javed.absar.

I found that during tree cost calculation, the algorithm uses tree entries that were not supposed to be vectorized and were rejected at an early stage, yet we still estimate those entries during the whole-tree estimation. The following change fixes this issue.
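
To illustrate the idea only, here is a minimal sketch, assuming a toy model in which each tree entry carries a precomputed cost and a flag saying whether it was rejected while the tree was built. `ToyEntry`, `toyTreeCostNaive` and `toyTreeCostFiltered` are hypothetical names invented for this example; they are not the data structures or functions in SLPVectorizer.cpp, and the actual patch may differ in how it identifies rejected entries.

```cpp
#include <vector>

// Hypothetical, simplified stand-in for an SLP tree entry. This is NOT the
// real tree entry type used by SLPVectorizer.cpp.
struct ToyEntry {
  int Cost;      // estimated cost delta for this entry (negative = profitable)
  bool Rejected; // true if the entry was discarded while building the tree
};

// Sums the cost of every entry, including rejected ones. This models the
// behaviour described as incorrect above.
int toyTreeCostNaive(const std::vector<ToyEntry> &Tree) {
  int Total = 0;
  for (const ToyEntry &E : Tree)
    Total += E.Cost;
  return Total;
}

// Sums only the entries that were not rejected, so rejected entries no
// longer influence the whole-tree estimate.
int toyTreeCostFiltered(const std::vector<ToyEntry> &Tree) {
  int Total = 0;
  for (const ToyEntry &E : Tree) {
    if (E.Rejected)
      continue; // skip entries rejected at an early stage
    Total += E.Cost;
  }
  return Total;
}
```

For a toy tree with entries {-4, kept}, {3, rejected}, {-1, kept}, the naive sum is -2 while the filtered sum is -5, so counting a rejected entry can change whether the tree looks profitable.
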
Also, here is SPEC CPU2006 data before and after this change, measured on:
...
vendor_id	: GenuineIntel
cpu family	: 6
model		: 94
model name	: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
stepping	: 3
microcode	: 0xc6
cpu MHz		: 2429.650
cache size	: 6144 KB
....
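Before (columns: benchmark, reference time in seconds, run time in seconds, ratio, status; * marks the run that is repeated in the summary below the separator):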
400.perlbench                               NR
401.bzip2        9650    502         19.2   S
401.bzip2        9650    481         20.1   S
401.bzip2        9650    500         19.3   *
403.gcc          8050    248         32.5   S
403.gcc          8050    244         33.0   S
403.gcc          8050    245         32.9   *
429.mcf          9120    328         27.8   *
429.mcf          9120    322         28.3   S
429.mcf          9120    331         27.5   S
445.gobmk       10490    468         22.4   *
445.gobmk       10490    475         22.1   S
445.gobmk       10490    466         22.5   S
456.hmmer        9330    349         26.7   S
456.hmmer        9330    348         26.8   S
456.hmmer        9330    349         26.7   *
458.sjeng       12100    458         26.4   S
458.sjeng       12100    588         20.6   S
458.sjeng       12100    467         25.9   *
462.libquantum  20720    269         77.1   *
462.libquantum  20720    312         66.4   S
462.libquantum  20720    267         77.7   S
464.h264ref     22130    516         42.9   *
464.h264ref     22130    516         42.9   S
464.h264ref     22130    515         43.0   S
471.omnetpp      6250    327         19.1   S
471.omnetpp      6250    330         18.9   *
471.omnetpp      6250    333         18.8   S
473.astar        7020         --            CE
483.xalancbmk    6900         --            CE
==============================================

400.perlbench                               NR
401.bzip2        9650    500         19.3   *
403.gcc          8050    245         32.9   *
429.mcf          9120    328         27.8   *
445.gobmk       10490    468         22.4   *
456.hmmer        9330    349         26.7   *
458.sjeng       12100    467         25.9   *
462.libquantum  20720    269         77.1   *
464.h264ref     22130    516         42.9   *
471.omnetpp      6250    330         18.9   *
473.astar                                   NR
483.xalancbmk                               NR

After:
400.perlbench                               NR
401.bzip2        9650    493         19.6   S
401.bzip2        9650    491         19.6   S
401.bzip2        9650    492         19.6   *
403.gcc          8050    254         31.7   S
403.gcc          8050    253         31.8   S
403.gcc          8050    254         31.7   *
429.mcf          9120    329         27.7   S
429.mcf          9120    328         27.8   *
429.mcf          9120    327         27.9   S
445.gobmk       10490    469         22.4   S
445.gobmk       10490    468         22.4   S
445.gobmk       10490    468         22.4   *
456.hmmer        9330    347         26.9   S
456.hmmer        9330    427         21.8   S
456.hmmer        9330    348         26.8   *
458.sjeng       12100    460         26.3   S
458.sjeng       12100    662         18.3   S
458.sjeng       12100    460         26.3   *
462.libquantum  20720    268         77.3   *
462.libquantum  20720    268         77.4   S
462.libquantum  20720    341         60.8   S
464.h264ref     22130    504         43.9   S
464.h264ref     22130    500         44.2   S
464.h264ref     22130    503         44.0   *
471.omnetpp      6250    325         19.3   *
471.omnetpp      6250    324         19.3   S
471.omnetpp      6250    328         19.1   S
473.astar        7020         --            CE
483.xalancbmk    6900         --            CE
==============================================

400.perlbench                               NR
401.bzip2        9650    492         19.6   *
403.gcc          8050    254         31.7   *
429.mcf          9120    328         27.8   *
445.gobmk       10490    468         22.4   *
456.hmmer        9330    348         26.8   *
458.sjeng       12100    460         26.3   *
462.libquantum  20720    268         77.3   *
464.h264ref     22130    503         44.0   *
471.omnetpp      6250    325         19.3   *
473.astar                                   NR
483.xalancbmk                               NR
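
To make the two summaries easier to compare, the relative change in run time of the selected (*) runs can be computed as (after - before) / before. The following is only a sketch of that arithmetic: the numbers are copied from the summary tables above, while `SelectedRun`, `runTimeChangePercent` and `selectedRuns` are hypothetical helper names introduced for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One selected (*) run per benchmark, taken from the two summary tables.
struct SelectedRun {
  std::string Benchmark;
  double BeforeSeconds; // run time without the patch
  double AfterSeconds;  // run time with the patch
};

// Relative run-time change in percent; negative means the patched build
// ran faster, positive means it ran slower.
double runTimeChangePercent(double BeforeSeconds, double AfterSeconds) {
  assert(BeforeSeconds > 0 && "run time must be positive");
  return (AfterSeconds - BeforeSeconds) / BeforeSeconds * 100.0;
}

// Data copied from the Before/After summaries; benchmarks marked NR or CE
// have no run time and are left out.
const std::vector<SelectedRun> &selectedRuns() {
  static const std::vector<SelectedRun> Runs = {
      {"401.bzip2", 500.0, 492.0},      {"403.gcc", 245.0, 254.0},
      {"429.mcf", 328.0, 328.0},        {"445.gobmk", 468.0, 468.0},
      {"456.hmmer", 349.0, 348.0},      {"458.sjeng", 467.0, 460.0},
      {"462.libquantum", 269.0, 268.0}, {"464.h264ref", 516.0, 503.0},
      {"471.omnetpp", 330.0, 325.0},
  };
  return Runs;
}
```

For example, `runTimeChangePercent(245.0, 254.0)` shows 403.gcc about 3.7% slower, and `runTimeChangePercent(516.0, 503.0)` shows 464.h264ref about 2.5% faster. The individual runs listed above vary noticeably (458.sjeng ranges from 458 s to 588 s before the change), so differences this small should be read with that spread in mind.
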


https://reviews.llvm.org/D57669

Files:
  lib/Transforms/Vectorize/SLPVectorizer.cpp
  test/Transforms/SLPVectorizer/AArch64/gather-cost.ll
  test/Transforms/SLPVectorizer/AArch64/getelementptr.ll
  test/Transforms/SLPVectorizer/AArch64/horizontal.ll
  test/Transforms/SLPVectorizer/AArch64/transpose.ll
  test/Transforms/SLPVectorizer/X86/PR36280.ll
  test/Transforms/SLPVectorizer/X86/PR39774.ll
  test/Transforms/SLPVectorizer/X86/addsub.ll
  test/Transforms/SLPVectorizer/X86/alternate-fp.ll
  test/Transforms/SLPVectorizer/X86/alternate-int.ll
  test/Transforms/SLPVectorizer/X86/bad_types.ll
  test/Transforms/SLPVectorizer/X86/blending-shuffle.ll
  test/Transforms/SLPVectorizer/X86/crash_binaryop.ll
  test/Transforms/SLPVectorizer/X86/crash_cmpop.ll
  test/Transforms/SLPVectorizer/X86/crash_dequeue.ll
  test/Transforms/SLPVectorizer/X86/crash_flop7.ll
  test/Transforms/SLPVectorizer/X86/crash_gep.ll
  test/Transforms/SLPVectorizer/X86/crash_lencod.ll
  test/Transforms/SLPVectorizer/X86/crash_scheduling.ll
  test/Transforms/SLPVectorizer/X86/cse.ll
  test/Transforms/SLPVectorizer/X86/external_user.ll
  test/Transforms/SLPVectorizer/X86/hadd.ll
  test/Transforms/SLPVectorizer/X86/horizontal.ll
  test/Transforms/SLPVectorizer/X86/hsub.ll
  test/Transforms/SLPVectorizer/X86/reorder_phi.ll
  test/Transforms/SLPVectorizer/X86/simplebb.ll
  test/Transforms/SLPVectorizer/X86/treecost.ll
  test/Transforms/SLPVectorizer/X86/unreachable.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D57669.184983.patch
Type: text/x-patch
Size: 156888 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190204/d2c3dd5e/attachment-0001.bin>

