[PATCH] D38313: [InstCombine] Introducing Aggressive Instruction Combine pass

Tue Dec 19 04:47:56 PST 2017

aaboud added inline comments.

================
Comment at: lib/Transforms/AggressiveInstCombine/TruncInstCombine.cpp:205
+  if (MinBitWidth > TruncBitWidth) {
+    // Use the smallest integer type in the range [MinBitWidth, OrigBitWidth).
+    Type *Ty = DL.getSmallestLegalIntType(DstTy->getContext(), MinBitWidth);
----------------
aaboud wrote:
> craig.topper wrote:
> > I'm not sure you've addressed my vector concerns. The first part of this 'if' would create a new type for vectors by using getSmallestLegalIntType.
> I think that in the "else" part, the one that I kept the same behavior as the original instcombine code, we might end up creating a vector type that was not in the IR, as for vector types we do not check for type legality.
> So, why in this case we should behave differently?
> 
> Regarding the scalar check, it might be redundant, but not always, because even if the "trunc" instruction is performed on vector type, the evaluated expression might contain scalar operations (due to the "insertelement" instruction, which will be supported in next few patches).
> 
> Furthermore, my assumption is that codegen legalizer will promote the illegal vector type back to the original type (or to a smaller one), in both cases we will not get worse code than the one we started with!
> Is that assumption too optimistic?
Just to emphasize I am adding two examples.

Case 1 (MinBitWidth == TruncBitWidth):
```
%A1 = zext <2 x i32> %X to <2 x i64>
%B1 = mul <2 x i64> %A1, %A1
%C1 = extractelement <2 x i64> %B1, i32 0
%D1 = extractelement <2 x i64> %B1, i32 1
%E1 = add i64 %C1, %D1
%T = trunc i64 %E1 to i16
  =>
%A2 = trunc <2 x i32> %X to <2 x i16>
%B2 = mul <2 x i16> %X, %X
%C2 = extractelement <2 x i16> %B2, i32 0
%D2 = extractelement <2 x i16> %B2, i32 1
%T = add i16 %C1, %D1
```

Case 2 (MinBitWidth > TruncBitWidth):
```
%A1 = zext <2 x i32> %X to <2 x i64>
%B1 = lshr <2 x i64> %A1, <i64 8, i64 8>
%C1 = mul <2 x i64> %B1, %B1
%T = trunc <2 x i64> %C1 to <2 x i8>
  =>
%A2 = trunc <2 x i32> %X to <2 x i16>
%B2 = lshr <2 x i16> %A2, <i32 8, i32 8>
%C2 = mul <2 x i16> %B2, %B2
%T = trunc <2 x i16> %C2 to <2 x i8>
```

Notice that in both cases the "new" vector type (<2 x i16>) in the transformed IR did not exist in the original IR.

Do not you think that we should perform these transformations and reduce the expression type width?

https://reviews.llvm.org/D38313