[all-commits] [llvm/llvm-project] 670dd4: [Analysis] Fix getNumberOfParts to return 0 when t...

Wed Nov 17 04:07:24 PST 2021

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 670dd402441fdda4779fd59604a8b9b526efd7e6
      https://github.com/llvm/llvm-project/commit/670dd402441fdda4779fd59604a8b9b526efd7e6
  Author: David Sherwood <david.sherwood at arm.com>
  Date:   2021-11-17 (Wed, 17 Nov 2021)

  Changed paths:
    M llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
    M llvm/include/llvm/CodeGen/BasicTTIImpl.h
    M llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
    A llvm/test/Transforms/LoopVectorize/AArch64/sve-inductions-unusual-types.ll

  Log Message:
  -----------
  [Analysis] Fix getNumberOfParts to return 0 when the answer is unknown

When asking how many parts are required for a scalable vector type
there are occasions when it cannot be computed. For example, <vscale x 1 x i3>
is one such vector for AArch64+SVE because at the moment no matter how we
promote the i3 type we never end up with a legal vector. This means
that getTypeConversion returns TypeScalarizeScalableVector as the
LegalizeKind, and then getTypeLegalizationCost returns an invalid cost.
This then causes BasicTTImpl::getNumberOfParts to dereference an invalid
cost, which triggers an assert. This patch changes getNumberOfParts to
return 0 for such cases, since the definition of getNumberOfParts in
TargetTransformInfo.h states that we can use a return value of 0 to represent
an unknown answer.

Currently, LoopVectorize.cpp is the only place where we need to check for
0 as a return value, because all other instances will not currently
ask for the number of parts for <vscale x 1 x iX> types.

In addition, I have changed the target-independent interface for
getNumberOfParts to return 1 and assume there is a single register
that can fit the type. The loop vectoriser has lots of tests that are
target-independent and they relied upon the 0 value to mean the
answer is known and that we are not scalarising the vector.

I have added tests here that show we correctly return an invalid cost
for VF=vscale x 1 when the loop contains unusual types such as i7:

  Transforms/LoopVectorize/AArch64/sve-inductions-unusual-types.ll

Differential Revision: https://reviews.llvm.org/D113772