[all-commits] [llvm/llvm-project] 3c25c4: [LV] Account for the cost of predication of scalar...

Wed Mar 17 03:58:18 PDT 2021

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 3c25c40d51e80492ad4368c6bfdf37e02848f49d
      https://github.com/llvm/llvm-project/commit/3c25c40d51e80492ad4368c6bfdf37e02848f49d
  Author: David Green <david.green at arm.com>
  Date:   2021-03-17 (Wed, 17 Mar 2021)

  Changed paths:
    M llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
    M llvm/test/Transforms/LoopVectorize/X86/small-size.ll

  Log Message:
  -----------
  [LV] Account for the cost of predication of scalarized load/store

This adds the cost of an i1 extract and a branch to the cost in
getMemInstScalarizationCost when the instruction is predicated. These
predicated loads/store would generate blocks of something like:

    %c1 = extractelement <4 x i1> %C, i32 1
    br i1 %c1, label %if, label %else
  if:
    %sa = extractelement <4 x i32> %a, i32 1
    %sb = getelementptr inbounds float, float* %pg, i32 %sa
    %sv = extractelement <4 x float> %x, i32 1
    store float %sa, float* %sb, align 4
  else:

So this increases the cost by the extract and branch. This is probably
still too low in many cases due to the cost of all that branching, but
there is already an existing hack increasing the cost using
useEmulatedMaskMemRefHack. It will increase the cost of a memop if it is
a load or there are more than one store. This patch improves the cost
for when there is only a single store, and hopefully at some point in
the future the hack can be removed.

Differential Revision: https://reviews.llvm.org/D98243