[PATCH] D133850: [AArch64] Improve codegen for "trunc <4 x i64> to <4 x i8>" for all cases

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Oct 9 03:12:02 PDT 2022


dmgreen added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:17896
+
+    switch (DestVT.getSimpleVT().SimpleTy) {
+    case MVT::v2i32:
----------------
0x59616e wrote:
> mingmingl wrote:
> > I wonder if it makes code more readable and succinct by combining this switch with the switch inside `HalfElementSize` above, given that `BitcastResultTy` and `Uzp1ResultTy` are co-related (i.e. 'BitcastResultTy' == 'TruncOperandTy', and 'TruncOperandTy' == bitcast 'Uzp1ResultTy' to double-element-size)
> > 
> Combining the two switches to make it more terse is possible, but to make it more readable is beyond my ability, since the two switches have different responsibilities.
> 
> That is, the first one is responsible for `half the element size of a 128 bit vector to a 128bit vector` :
> 
> v2i64 => v4i32
> v4i32 => v8i16
> v8i16 => v16i8
> 
> on the other hand, the second one is responsible for `double the element size of a 64bit vector to a 128bit vector`:
> 
> v2i32 => v2i64
> v4i16 => v4i32
> v8i8 => v8i16
> 
> Putting two different responsibilities on a single switch may turn it into a enigma. So I prefer to maintain the status quo. 
> 
That makes sense, if the two types are independent and we handle all combinations through the bicast. If we know that SourceOp0 == SourceOp1 though, we needn't call HalfElementSize twice. Just calculate the ResultTy once and bitcast both results to the same type.

It should check that the truncate input VT is simple though, if it is used in a switch. Just to be safe.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:17845-17846
+
+  if (!ResVT.isSimple())
+    return SDValue();
+
----------------
This doesn't need to check for Simple, given it is checking for specific MVT's.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:17848
+
+  if (!ResVT.is64BitVector())
+    return SDValue();
----------------
If we check for specific MVT's below, we are already checking for 64BitVector.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:17851
+
+  if (ResVT.getVectorNumElements() < 2)
+    return SDValue();
----------------
Same again..


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:17854
+
+  if (auto SVT = ResVT.getSimpleVT().SimpleTy;
+      SVT != MVT::v2i32 && SVT != MVT::v4i16 && SVT != MVT::v8i8)
----------------
This can just check the ResVT directly
```
if (ResVT != MVT::v2i32 && ResVT != MVT::v4i16 && ResVT != MVT::v8i8)
```


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:17902
+
+  assert(UzpOp1.getValueType() == UzpOp1.getValueType());
+  UzpResult =
----------------
This looks like it should be UzpOp0.


================
Comment at: llvm/test/CodeGen/AArch64/aarch64-uzp1-combine.ll:5
 
 ; Test cases to show when UZP1 (TRUNC, TRUNC) could be combined to TRUNC (UZP1) but not yet implemented.
 
----------------
This comment can now be updated.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133850/new/

https://reviews.llvm.org/D133850



More information about the llvm-commits mailing list