[llvm] [AArch64] Remove EXT instr before UZP when extracting elements from vector (PR #91328)
David Green via llvm-commits
llvm-commits at lists.llvm.org
Wed May 8 08:25:05 PDT 2024
================
@@ -21448,6 +21448,29 @@ static SDValue performUzpCombine(SDNode *N, SelectionDAG &DAG,
SDValue Op1 = N->getOperand(1);
EVT ResVT = N->getValueType(0);
+ // uzp(extract_lo(x), extract_hi(x)) -> extract_lo(uzp x, x)
+ if (Op0.getOpcode() == ISD::EXTRACT_SUBVECTOR &&
+ Op1.getOpcode() == ISD::EXTRACT_SUBVECTOR &&
+ Op0.getOperand(0) == Op1.getOperand(0)) {
+
+ SDValue SourceVec = Op0.getOperand(0);
+ uint64_t ExtIdx0 = Op0.getConstantOperandVal(1);
+ uint64_t ExtIdx1 = Op1.getConstantOperandVal(1);
+ uint64_t NumElements = SourceVec.getValueType().getVectorMinNumElements();
+ if (ExtIdx0 == 0 && ExtIdx1 == NumElements / 2) {
+ EVT OpVT = Op0.getOperand(1).getValueType();
+ EVT WidenedResVT = ResVT.getDoubleNumVectorElementsVT(*DAG.getContext());
+ SDValue uzp2 =
----------------
davemgreen wrote:
Thanks, that does sound good. The Cortex-A55 issues Neon operations in 64-bit chunks, and I believe an xtn counts as a 64-bit operation, so it can issue two per cycle instead of one for 128-bit operations. That core is getting older now, but it might be a tiny bit nicer in places.
One problem it can cause is that, because the undef is replaced by an arbitrary register, it can sometimes create false dependencies. Hopefully this won't be much of a problem anywhere, though; we can adjust it if needed.
Variable names should be capitalized though (`uzp2` here)!
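
For anyone wanting to sanity-check the identity in the code comment, uzp(extract_lo(x), extract_hi(x)) -> extract_lo(uzp x, x), here is a minimal scalar sketch. It is not the patch's code: `Vec`, `uzp1`, `extractSubvector` and `foldHolds` are hypothetical helpers, and it assumes UZP1 semantics (keep the even-indexed elements of the concatenated inputs) and a fixed-length vector with an even number of elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;

// Scalar model of UZP1 (assumed semantics): concatenate A and B,
// then keep the even-indexed elements of the concatenation.
Vec uzp1(const Vec &A, const Vec &B) {
  Vec Concat(A);
  Concat.insert(Concat.end(), B.begin(), B.end());
  Vec Result;
  for (std::size_t I = 0; I < Concat.size(); I += 2)
    Result.push_back(Concat[I]);
  return Result;
}

// Scalar model of EXTRACT_SUBVECTOR: Len elements of V starting at Idx.
Vec extractSubvector(const Vec &V, std::size_t Idx, std::size_t Len) {
  return Vec(V.begin() + Idx, V.begin() + Idx + Len);
}

// Returns true if both sides of the fold produce the same elements for X.
// X is assumed to have an even number of elements.
bool foldHolds(const Vec &X) {
  std::size_t Half = X.size() / 2;
  // Left-hand side: uzp(extract_lo(x), extract_hi(x)).
  Vec Before =
      uzp1(extractSubvector(X, 0, Half), extractSubvector(X, Half, Half));
  // Right-hand side: extract_lo(uzp x, x).
  Vec After = extractSubvector(uzp1(X, X), 0, Half);
  return Before == After;
}
```

The fold works because concatenating the low and high halves of x gives back x, so both sides select the even-indexed elements of x. The upper half of `uzp1(X, X)` is never read in this model; the undef mentioned above concerns the generated code, not this sketch.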
https://github.com/llvm/llvm-project/pull/91328