RFC: Enable vectorization of call instructions in the loop vectorizer

Michael Zolotukhin mzolotukhin at apple.com
Thu Feb 26 16:57:17 PST 2015


Hi Arnold,

> On Feb 26, 2015, at 1:49 PM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
> 
> Hi Michael,
> 
> thanks for following up on this. I think it is important to move forward with this.
> 
>> On Feb 26, 2015, at 12:01 PM, Michael Zolotukhin <mzolotukhin at apple.com <mailto:mzolotukhin at apple.com>> wrote:
>> 
>> Hi,
>> 
>> Some time ago James posted a patch for vectorization of calls [1]. Though this seems to be a missing feature in the vectorizer, the patch was never committed, so I decided to revive the discussion.
>> 
>> To begin with, I rebased the patch and split it into several semantically separate parts. The approach used here is pretty natural: TargetLibraryInfo keeps a mapping from a set of scalar functions to their vector counterparts; TargetTransformInfo consults it when the cost of a call instruction is calculated; and if the vector version is cheaper, the vectorizer emits a call to the vector function.
>> 
>> There are several questions about this patch:
>> 1. Is this approach still viable in general and worth pursuing?
> 
> I believe this is still important. We want to be able to vectorize function calls if vectorized versions of math library functions are available.
Thanks, I’ll continue digging into this then :)
> 
> 
>> 2. The cost model is changed. In theory it should become more accurate, but I haven’t measured the performance impact of this change yet. Do these changes look good in general? If yes, do we need to tune some target cost-model parameters, or are those values that no one has actually been using yet?
> 
> I think we will have to test the performance impact of this change, since we are most likely the first clients. We should test both with and without the target tuple that enables the vectorized function calls. If we don’t see egregious regressions in a run that uses the modified cost model but has no vectorized versions of the math library available, I think we can commit this patch set so that it gets testing by the community.
That sounds good.
> 
> 
>> 3. What is the best way to populate the scalar-to-vector function map? For now, I populate it when the target triple contains ‘Accelerate’. If we get a lot of other cases, we might think of something different here.
> 
> We use the environment substring of the target tuple (in this case a quadruple). This is in the spirit of what we already do to conditionally turn features on/off in TargetLibraryInfo. I think we should address the issue of scalability when we have a client for which it becomes an issue.
Sounds good.
> 
> 
>> 
>> Any feedback or ideas are appreciated!
>> 
>> Thanks,
>> Michael
>> 
>> [1]: http://thread.gmane.org/gmane.comp.compilers.llvm.cvs/169736
>> 
>> 
>> <0001-Make-ToVectorTy-static.patch>
>> <0002-Fix-a-copy-paste-bug.patch>
>> <0003-TLI-Use-lambda.-NFC.patch>
>> <0004-LoopVectorizer-Add-TargetTransformInfo.patch>
>> <0005-TLI-Factor-out-sanitizeFunctionName.-NFC.patch>
>> <0006-TLI-Add-interface-for-querying-whether-a-function-is.patch>
>> <0007-TTI-Add-getCallInstrCost.patch>
>> <0008-LoopVectorize-teach-loop-vectorizer-to-vectorize-cal.patch>
>> <0009-TTI-Honour-cost-model-for-estimating-cost-of-vector-.patch>
>> <0010-TLI-Add-several-entries-to-VectorizableFunctions-tab.patch>
> 
> Some comments on the code:
> 
> 
> This function could use some documentation. What are Tys? (Argument types).
I’ll add some, thanks.
> 
> +  unsigned getCallInstrCost(Function *F, Type *RetTy, ArrayRef<Type *> Tys,
> +                            const TargetLibraryInfo *TLI) {
> +
> +    // Scalar function calls are always expensive.
> +    if (!RetTy->isVectorTy())
> +      return 10;
> +
> +    // Functions with a vector form are no more expensive than a scalar call.
> +    if (TLI &&
> +        TLI->isFunctionVectorizable(F->getName(),
> +                                    RetTy->getVectorNumElements()))
> +      return 10;
> +
> +    // We have to scalarize this function call. Estimate the cost.
> +    unsigned ScalarizationCost = getScalarizationOverhead(RetTy, true, false);
> +    unsigned ScalarCalls = RetTy->getVectorNumElements();
> +    for (unsigned i = 0, ie = Tys.size(); i != ie; ++i) {
> +      if (Tys[i]->isVectorTy()) {
> +        ScalarizationCost += getScalarizationOverhead(Tys[i], false, true);
> 
> This is not obvious. Why could a function argument’s vector type have more elements than the vector return type, such that we need more function calls when scalarizing?
It’s an attempt to handle ‘unusual’ functions here. Possible examples would be conversions, where the input element size differs from the output element size, or reductions. I think that for now we can just assert that we only deal with functions whose operand types are the same as the result type.

> 
> +        ScalarCalls = std::max(ScalarCalls, Tys[i]->getVectorNumElements());
> +      }
> +    }
> +
> +    return ScalarCalls * 10 + ScalarizationCost;
> +  }
> 
> 
> There seems to be some repetition in the code below. Can we refactor that?
After re-examining this code I spotted a strange part (I didn’t change it in the patches):
          Type *Tys[] = {CI->getType()};
          if (VF > 1)
            Tys[0] = VectorType::get(CI->getType()->getScalarType(), VF);
I assume that only works for intrinsics with one argument, and those are the only cases we handle now. But shouldn’t we guard it with an assert, or try to handle the general case here?

Thanks,
Michael

> 
> +        default:
> +          bool HasScalarOpd = hasVectorInstrinsicScalarOpd(ID, 1);
> +          for (unsigned Part = 0; Part < UF; ++Part) {
> +            SmallVector<Value *, 4> Args;
> +            for (unsigned i = 0, ie = CI->getNumArgOperands(); i != ie; ++i) {
> +              if (HasScalarOpd && i == 1) {
> +                Args.push_back(CI->getArgOperand(i));
> +                continue;
> +              }
> +              VectorParts &Arg = getVectorValue(CI->getArgOperand(i));
> +              Args.push_back(Arg[Part]);
> +            }
> +            Type *Tys[] = {CI->getType()};
> +            if (VF > 1)
> +              Tys[0] = VectorType::get(CI->getType()->getScalarType(), VF);
> +
> +            Function *F = Intrinsic::getDeclaration(M, ID, Tys);
> +            Entry[Part] = Builder.CreateCall(F, Args);
> +          }
> +
> +          propagateMetadata(Entry, it);
> +          break;
> +        }
> +        // Otherwise, perform the lib call.
> +      } else if (TLI && TLI->isFunctionVectorizable(FnName, VF)) {
> +        // This is a function with a vector form.
> +        StringRef VFnName = TLI->getVectorizedFunction(FnName, VF);
> +        assert(!VFnName.empty());
> +
> +        Function *VectorF = M->getFunction(VFnName);
> +        if (!VectorF) {
> +          // Generate a declaration
> +          FunctionType *FTy = FunctionType::get(RetTy, Tys, false);
> +          VectorF =
> +              Function::Create(FTy, Function::ExternalLinkage, VFnName, M);
> +          assert(VectorF);
> +        }
> +
>          for (unsigned Part = 0; Part < UF; ++Part) {
>            SmallVector<Value *, 4> Args;
>            for (unsigned i = 0, ie = CI->getNumArgOperands(); i != ie; ++i) {
> -            if (HasScalarOpd && i == 1) {
> -              Args.push_back(CI->getArgOperand(i));
> -              continue;
> -            }
>              VectorParts &Arg = getVectorValue(CI->getArgOperand(i));
>              Args.push_back(Arg[Part]);
>            }
> -          Type *Tys[] = {CI->getType()};
> -          if (VF > 1)
> -            Tys[0] = VectorType::get(CI->getType()->getScalarType(), VF);
> -
> -          Function *F = Intrinsic::getDeclaration(M, ID, Tys);
> -          Entry[Part] = Builder.CreateCall(F, Args);
> +          Entry[Part] = Builder.CreateCall(VectorF, Args);
>          }
> 
> 
> 
