[llvm] [AArch64] Remove EXT instr before UZP when extracting elements from vector (PR #91328)

David Green via llvm-commits llvm-commits at lists.llvm.org
Wed May 8 08:25:05 PDT 2024


================
@@ -21448,6 +21448,29 @@ static SDValue performUzpCombine(SDNode *N, SelectionDAG &DAG,
   SDValue Op1 = N->getOperand(1);
   EVT ResVT = N->getValueType(0);
 
+  // uzp(extract_lo(x), extract_hi(x)) -> extract_lo(uzp x, x)
+  if (Op0.getOpcode() == ISD::EXTRACT_SUBVECTOR &&
+      Op1.getOpcode() == ISD::EXTRACT_SUBVECTOR &&
+      Op0.getOperand(0) == Op1.getOperand(0)) {
+
+    SDValue SourceVec = Op0.getOperand(0);
+    uint64_t ExtIdx0 = Op0.getConstantOperandVal(1);
+    uint64_t ExtIdx1 = Op1.getConstantOperandVal(1);
+    uint64_t NumElements = SourceVec.getValueType().getVectorMinNumElements();
+    if (ExtIdx0 == 0 && ExtIdx1 == NumElements / 2) {
+      EVT OpVT = Op0.getOperand(1).getValueType();
+      EVT WidenedResVT = ResVT.getDoubleNumVectorElementsVT(*DAG.getContext());
+      SDValue uzp2 =
----------------
davemgreen wrote:

Thanks, that does sound good. The Cortex-A55 issues Neon operations in 64-bit chunks, and I believe an xtn counts as a 64-bit operation, so it can issue two per cycle instead of one for 128-bit operations. That core is getting older now, but it might be a tiny bit nicer in places.

One problem it can cause is that, because the undef is replaced by an arbitrary register, it can sometimes create false dependencies. Hopefully this won't be much of a problem anywhere, though; we can adjust it if needed.

Variable names should be Capitalized though (e.g. `Uzp2` rather than `uzp2`)!
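
As a sanity check on the combine in the hunk above, here is a minimal scalar sketch. It is not LLVM code: `uzp1`, `extractLo`, `extractHi`, and `checkUzpIdentity` are hypothetical helpers made up for illustration, and it assumes UZP1 semantics (the even-indexed elements of the concatenated operands). It checks that uzp1(extract_lo(x), extract_hi(x)) matches the low half of uzp1(x, x), and that the low half does not depend on the second operand, which is why an undef there should be safe for the result even if it can introduce a false dependency.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar model of UZP1: the even-indexed elements of concat(A, B).
std::vector<int> uzp1(const std::vector<int> &A, const std::vector<int> &B) {
  assert(A.size() == B.size() && "operands must have the same length");
  std::vector<int> Concat(A);
  Concat.insert(Concat.end(), B.begin(), B.end());
  std::vector<int> Result;
  for (std::size_t I = 0; I < Concat.size(); I += 2)
    Result.push_back(Concat[I]);
  return Result;
}

// Model of EXTRACT_SUBVECTOR at index 0 (low half).
std::vector<int> extractLo(const std::vector<int> &V) {
  return std::vector<int>(V.begin(), V.begin() + V.size() / 2);
}

// Model of EXTRACT_SUBVECTOR at index NumElements / 2 (high half).
std::vector<int> extractHi(const std::vector<int> &V) {
  return std::vector<int>(V.begin() + V.size() / 2, V.end());
}

// Returns true if the folded and unfolded forms agree for X, and if the
// low half of uzp1(X, Other) is independent of Other.
bool checkUzpIdentity(const std::vector<int> &X,
                      const std::vector<int> &Other) {
  std::vector<int> Folded = uzp1(extractLo(X), extractHi(X));
  std::vector<int> Wide = extractLo(uzp1(X, X));
  std::vector<int> WideOther = extractLo(uzp1(X, Other));
  return Folded == Wide && Folded == WideOther;
}
```

Calling `checkUzpIdentity` with an 8-element `X` and any 8-element `Other` should return true under these assumptions.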

https://github.com/llvm/llvm-project/pull/91328