[PATCH] D80744: DAGCombiner optimization for pow(x, 0.75) even in case massv function is asked

Thu Jun 4 23:57:54 PDT 2020

steven.zhang added a comment.

> Although, this fix is not ideal. In case we want to use other vector libraries like `Accelerate` or `SVML` on PowerPC in future, this code is preventing them to generate accurate libcall for them. Any idea how to fix this issue?

Marking FPOW as Custom changes the cost that the loop vectorizer sees for llvm.pow.f32. It will then vectorize the call to llvm.pow.v4f32 **even if MASSV is disabled**, because we are telling the loop vectorizer that FPOW (llvm.pow.v4f32) is cheap to lower in the backend, which is not always true. That will **cause a regression** on the code path with MASSV disabled whenever the exponent is not 0.75.

Further, even if we want to expand POW as a libcall, lowering is not the right place to do it. It should be done in the legalizer, when we expand POW to powf.

I think the best way to do this is to change the loop vectorizer cost model for PowerPC: if the exponent is 0.75, the cost of vectorizing llvm.pow.f32 is small whether or not MASSV is enabled, so we will always get llvm.pow.v4f32. But I don't know whether that is easy to do.

Another, easier way is to turn the MASSV function call back into the intrinsic, since that is the motivation for this patch as you described it, which makes sense to me:

  diff --git a/llvm/lib/Target/PowerPC/PPCLowerMASSVEntries.cpp b/llvm/lib/Target/PowerPC/PPCLowerMASSVEntries.cpp
  index 429b8a31fbe9..74bd31b0b044 100644
  --- a/llvm/lib/Target/PowerPC/PPCLowerMASSVEntries.cpp
  +++ b/llvm/lib/Target/PowerPC/PPCLowerMASSVEntries.cpp
  @@ -105,6 +105,21 @@ bool PPCLowerMASSVEntries::lowerMASSVCall(CallInst *CI, Function &Func,
     if (CI->use_empty())
       return false;

  +  // FIXME - add necessary fast math flag check here.
  +  if (Func.getName() == "__powf4_massv") {
  +    if (Constant *Exp = dyn_cast<Constant>(CI->getArgOperand(1))) {
  +      if (ConstantFP *CFP = dyn_cast<ConstantFP>(Exp->getSplatValue())) {
  +        // If the exponent is 0.75, it is cheaper to turn the call into the
  +        // pow intrinsic so that it can be optimized into two sqrt's.
  +        if (CFP->isExactlyValue(0.75)) {
  +          CI->setCalledFunction(Intrinsic::getDeclaration(&M, Intrinsic::pow,
  +                                                          CI->getType()));
  +          return true;
  +        }
  +      }
  +    }
  +  }
  +
     std::string MASSVEntryName = createMASSVFuncName(Func, Subtarget);
     FunctionCallee FCache = M.getOrInsertFunction(
         MASSVEntryName, Func.getFunctionType(), Func.getAttributes());
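
For reference, the "two sqrt's" mentioned in the comment above rely on the identity x^0.75 = x^0.5 * x^0.25 = sqrt(x) * sqrt(sqrt(x)). The following is only a scalar sketch of that identity, not code from the patch: the names pow075_via_sqrt and close_enough are hypothetical helpers introduced here for illustration. It also ignores special inputs such as -0.0 and infinities, where the sqrt form can differ from pow, which is presumably why the FIXME asks for a fast math flag check.

```cpp
#include <cmath>

// Hypothetical helper (not part of the patch): computes x^0.75 using two
// square roots, relying on x^0.75 == x^0.5 * x^0.25.
inline float pow075_via_sqrt(float x) {
  float s = std::sqrt(x);   // x^(1/2)
  return s * std::sqrt(s);  // x^(1/2) * x^(1/4)
}

// Hypothetical helper: relative comparison, since the sqrt form and pow()
// may round differently in the last few bits.
inline bool close_enough(float a, float b, float rel_tol = 1e-5f) {
  return std::fabs(a - b) <= rel_tol * std::fmax(std::fabs(a), std::fabs(b));
}
```

For example, pow075_via_sqrt(16.0f) evaluates sqrt(16) = 4 and sqrt(4) = 2, giving 8, which matches pow(16, 0.75).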

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80744/new/

https://reviews.llvm.org/D80744