[PATCH] D27695: Add Instruction number to LSR cost model (PR23384)

Evgeny Stupachenko via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Dec 16 11:42:00 PST 2016


evstupac added a comment.

Hi Wei,

Thanks for taking a look and you comments.
This is a change affecting almost every test. With benchmarks I have it looks good. However more data is better.
Would you mind testing it on your benchmarks/machines, please?

> It may be separated as a NFC patch?

Yes. But that will be a patch with postponed effect. Before introducing Insns in solution cost I don't see a reason for x86 to get own cost function.
If we introduce Insns first that will affect others architectures as well (which I'm not specialist in). So I'd like to leave target cost functions tune for the target professionals.

> Add new cross use generation for ICmpZero that ends with zero
> Can it be split into a separate patch?

I'd like to leave this here as well. When LSR starts to count Insns, it is able to remove some unnecessary compares, but not all (because for some solution we don't generate appropriate cross use). That way for some test I need to remove cmp from one function and leave it in another. I'm trying to avoid a case when say for one arch/mode we check for cmp and for another just add.



================
Comment at: lib/Transforms/Scalar/LoopStrengthReduce.cpp:917
+  }
 
+  bool isLower(Cost &Other, const TargetTransformInfo &TTI);
----------------
wmi wrote:
> Better to use member initializer list? Cost() : C({0, 0, ... ,0}) {}
Yes. Will fix.


================
Comment at: lib/Transforms/Scalar/LoopStrengthReduce.cpp:1208
+  }
+
   // Determine how many (unfolded) adds we'll need inside the loop.
----------------
wmi wrote:
> C.NumRegs only calculate the regs used in induction var expr so it is not a good estimation for register number used in the loop. It can be much less than the real register number used.
> 
> We can have a utility like that in vectorization to get a better register number estimation used in the loop, but that can be in a further enhancement.  Before we have such a utility in place, I would rather conservatively think every loop has high register pressure,  and always add C.NumRegs  into C.Insns.  I think it avoids the case that LSR significantly increase register pressure just to reduce one addinc. 
>C.NumRegs only calculate the regs used in induction var expr so it is not a good estimation for register number used in the loop. It can be much less than the real register number used.

That is right. But we can say for sure that when we exceed TTIRegNum for a solution, we'll get at least fill. Why not to do this if already have NumRegs used by solution.

I like the idea to get better estimation. However let's leave this for a separate patch.


Repository:
  rL LLVM

https://reviews.llvm.org/D27695





More information about the llvm-commits mailing list