[PATCH] D38313: [InstCombine] Introducing Pattern Instruction Combine plug-in into InstCombine pass

Sun Oct 15 05:55:04 PDT 2017

aaboud marked 6 inline comments as done.
aaboud added a comment.

Thanks Craig and Zia for the review and sorry for the late answer.
Please, see answers below.

In https://reviews.llvm.org/D38313#891163, @craig.topper wrote:

> Do you have tests for "Truncating to different width than the original trunc instruction requires. (this is useful if we reduce the expression width, even if we are not eliminating any instruction)."

I do have tests, but this is not possible yet with this patch, we need the shift/udiv/urem instructions to get this case. I will introduce that in next patches.
In this patch, I just implemented the infrastructure for this optimization.

In https://reviews.llvm.org/D38313#897527, @zansari wrote:

> Hi Amjad,
>
> I didn't do a thorough drill down of all the changes (I'll leave that to Craig and Sanjay), but skimmed through it. The design seems reasonable, as long as others are ok with the instcombine plugin.
>  I can't tell all the effects this will have, but take care that the zext->(operations...)->trunc optimization doesn't remove a zext on a load that feed into an operation such that we turn a "movz[bw] mem -> reg" into "mov[bw] mem -> [bw]reg", since this can introduce extra merge/false dependence hits on most newer Intel architectures.
>
> Zia.

I believe that we have passes in the codegen to make sure we will not suffer from these false dependencies, right?
I do not believe that instcombine should look that far to recognize such cases.

================
Comment at: lib/Transforms/InstCombine/InstCombinePatterns.cpp:146
+  case Instruction::SExt:
+    // trunc(trunc(x)) -> trunc(x)
+    // trunc(ext(x)) -> ext(x) if the source type is smaller than the new dest
----------------
zansari wrote:
> Are you planning on doing these peepholes here? If not, is there any other reason for leaving these cases in this switch?
I do need to handle these cases here as I am calling "getRelevantOperands()" function from "getMinBitWidth()" function for all supported cases (without a switch).

================
Comment at: lib/Transforms/InstCombine/InstCombinePatterns.cpp:180
+  switch (Opc) {
+  case Instruction::Trunc:
+  case Instruction::ZExt:
----------------
zansari wrote:
> Same as above. Any reason for not breaking for trunc/zext/sext?
I can add the break here, I agree that it makes code faster.

================
Comment at: lib/Transforms/InstCombine/InstCombinePatterns.cpp:255
+  Type *NewDstSclTy = DstTy->getScalarType();
+  if (DL.isLegalInteger(OrigBitWidth) || MinBitWidth > ValidBitWidth) {
+    NewDstSclTy = DL.getSmallestLegalIntType(DstTy->getContext(), MinBitWidth);
----------------
craig.topper wrote:
> Does this call isLegalInteger using the scalar bit width for vectors? Not sure that's valid.
Can you explain, what you mean by "that's not valid"?
Are you referring to the algorithm?

Are you concerned of cases where a scalar type is legal (e.g. i32), while a vector type is not (e.g. <8xi32>), or the opposite direction?

I believe that my algorithm should not mind about the number of elements in the type, only about the width.
The reason for this check, is to keep the old case, where we will not reduce a legal to non-legal type.

Do you think that we should do a more accurate check?

https://reviews.llvm.org/D38313