[all-commits] [llvm/llvm-project] 9ddb28: [ARM] Tune getCastInstrCost for extending masked l...

Wed Jul 29 05:41:59 PDT 2020

  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 9ddb28964c92f2a2185aff0db77eaa167ac48dcf
      https://github.com/llvm/llvm-project/commit/9ddb28964c92f2a2185aff0db77eaa167ac48dcf
  Author: David Green <david.green at arm.com>
  Date:   2020-07-29 (Wed, 29 Jul 2020)

  Changed paths:
    M llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
    M llvm/test/Analysis/CostModel/ARM/cast_ldst.ll
    A llvm/test/Transforms/LoopVectorize/ARM/tail-folding-reduces-vf.ll

  Log Message:
  -----------
  [ARM] Tune getCastInstrCost for extending masked loads and truncating masked stores

This patch uses the feature added in D79162 to fix the cost of a
sext/zext of a masked load, and of a trunc feeding a masked store.
Previously, those were considered cheap or even free, but that is
not the case, because a masked load cannot be split in the same way
as a normal load.
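
As a rough illustration of the distinction described above, the
following self-contained sketch models an extend cost that depends on
whether the load is masked. It does not use the LLVM API: the enum
`MemContext`, the function `toyExtendCost`, and every cost value are
hypothetical, chosen only to show the shape of the idea. The one hard
fact assumed is the 128-bit MVE vector width.

```cpp
#include <cstdint>

// Hypothetical stand-in for the load/store context hint from D79162.
enum class MemContext { Normal, Masked };

// Toy cost of extending a loaded vector of `lanes` elements so that each
// lane becomes `dstBits` wide. All returned numbers are illustrative
// assumptions, not the values used in ARMTargetTransformInfo.cpp.
unsigned toyExtendCost(unsigned lanes, unsigned dstBits, MemContext ctx) {
  constexpr unsigned kVectorBits = 128; // MVE vector register width
  const unsigned resultBits = lanes * dstBits;

  // Assumption: if the extended result fits in one register, the extend
  // folds into an extending load, masked or not.
  if (resultBits <= kVectorBits)
    return 0;

  // Number of 128-bit registers needed to hold the extended result.
  const unsigned parts = (resultBits + kVectorBits - 1) / kVectorBits;

  // Assumption: a normal load can be split into `parts` narrower
  // extending loads, so the extend stays cheap.
  if (ctx == MemContext::Normal)
    return parts;

  // Assumption: a masked load cannot be split that way, so the extend
  // gets a placeholder per-lane penalty.
  return parts * lanes;
}
```

With these placeholder numbers, extending 8 lanes to 32 bits costs 2
after a normal load and 16 after a masked load, while a 4-lane extend
to 32 bits stays free in both contexts.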

This updates the costs to better reflect reality, and adds tests for
them in test/Analysis/CostModel/ARM/cast_ldst.ll.

It also adds a vectorizer test that showcases the improvement: in some
cases, the vectorizer will now choose a smaller VF when
tail-predication is enabled, which results in better codegen. With a
higher VF in those cases, the generated code would contain vmovs that
block tail-predication later in the process, resulting in very poor
codegen overall.
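
To make the effect on VF selection concrete, here is a minimal sketch
that picks the candidate with the lowest cost per lane. It is not the
LoopVectorize implementation: the `Candidate` struct, the `pickVF`
function, and the loop costs in the usage note are hypothetical, and
the real vectorizer weighs more factors than this.

```cpp
#include <vector>

// Hypothetical candidate: a vectorization factor and the estimated cost
// of one vector iteration of the loop at that factor.
struct Candidate {
  unsigned vf;
  unsigned loopCost;
};

// Returns the VF with the lowest cost per lane. Ties keep the earlier
// candidate. An empty list falls back to VF 1 (scalar).
unsigned pickVF(const std::vector<Candidate> &cands) {
  if (cands.empty())
    return 1;
  Candidate best = cands.front();
  for (const Candidate &c : cands) {
    // Compare c.loopCost / c.vf < best.loopCost / best.vf without
    // division by cross-multiplying.
    const auto lhs = static_cast<unsigned long long>(c.loopCost) * best.vf;
    const auto rhs = static_cast<unsigned long long>(best.loopCost) * c.vf;
    if (lhs < rhs)
      best = c;
  }
  return best.vf;
}
```

With made-up costs, `pickVF({{4, 8}, {8, 12}})` selects VF 8 because
the wider loop looks cheaper per lane. Raising the VF 8 cost to 28, as
a more realistic masked extend cost might, makes
`pickVF({{4, 8}, {8, 28}})` select VF 4 instead.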

Original Patch by Pierre van Houtryve

Differential Revision: https://reviews.llvm.org/D79163