[llvm] [DAG] Matched FixedWidth pattern for ISD::AVGFLOORU (PR #84903)

David Green via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 14 02:52:48 PDT 2024


================
@@ -2821,6 +2821,47 @@ SDValue DAGCombiner::visitADDLike(SDNode *N) {
   return SDValue();
 }
 
+// Attempt to form avgflooru(A, B) from add(and(A, B), lshr(xor(A, B), 1))
+static SDValue combineFixedwidthToAVGFLOORU(SDNode *N, SelectionDAG &DAG) {
+  assert(N->getOpcode() == ISD::ADD && "ADD node is required here");
+  SDValue And = N->getOperand(0);
+  SDValue Lshr = N->getOperand(1);
+  if (And.getOpcode() == ISD::SRL && Lshr.getOpcode() == ISD::AND) {
+    SDValue temp = And;
+    And = Lshr;
+    Lshr = temp;
+  } else if (And.getOpcode() != ISD::AND || Lshr.getOpcode() != ISD::SRL)
+    return SDValue();
+  SDValue Xor = Lshr.getOperand(0);
+  if (Xor.getOpcode() != ISD::XOR)
+    return SDValue();
+  SDValue And1 = And.getOperand(0);
+  SDValue And2 = And.getOperand(1);
+  SDValue Xor1 = Xor.getOperand(0);
+  SDValue Xor2 = Xor.getOperand(1);
+  if (And1 == Xor2 && And2 == Xor1) {
+    SDValue temp = And1;
+    And1 = And2;
+    And2 = temp;
+  } else if (And1 != Xor1 || And2 != Xor2)
+    return SDValue();
+  // Is the right shift using an immediate value of 1?
+  ConstantSDNode *N1C = isConstOrConstSplat(Lshr.getOperand(1));
+  if (!N1C or N1C->getAPIntValue() != 1)
+    return SDValue();
+  EVT VT = And1.getValueType();
+  EVT NVT = EVT::getIntegerVT(*DAG.getContext(), VT.getSizeInBits());
+  if (VT.isVector())
+    VT = EVT::getVectorVT(*DAG.getContext(), NVT, VT.getVectorElementCount());
+  else
+    VT = NVT;
----------------
davemgreen wrote:

I wasn't sure what "Fixedwidth" refers to in the title. In LLVM, fixed-width usually describes vectors that are not scalable, but this fold should apply just as well to scalable types.
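
To illustrate why the vector kind should not matter: the fold rests on a per-element arithmetic identity, so nothing in it depends on how many lanes there are. Below is a minimal standalone sketch (not LLVM code; the function names are made up for illustration) that checks the identity exhaustively for 8-bit values.

```cpp
#include <cassert>
#include <cstdint>

// Reference result: widen to 16 bits so that a + b cannot overflow,
// then take the floor of the average.
inline uint8_t avgFloorURef(uint8_t a, uint8_t b) {
  return static_cast<uint8_t>((static_cast<uint16_t>(a) + b) >> 1);
}

// The pattern matched by the patch: add(and(A, B), lshr(xor(A, B), 1)).
// It works because a + b == 2 * (a & b) + (a ^ b).
inline uint8_t avgFloorUPattern(uint8_t a, uint8_t b) {
  return static_cast<uint8_t>((a & b) + ((a ^ b) >> 1));
}

// Exhaustively compare the two forms over all 8-bit input pairs.
inline bool patternMatchesReference() {
  for (unsigned a = 0; a < 256; ++a)
    for (unsigned b = 0; b < 256; ++b)
      if (avgFloorUPattern(static_cast<uint8_t>(a), static_cast<uint8_t>(b)) !=
          avgFloorURef(static_cast<uint8_t>(a), static_cast<uint8_t>(b)))
        return false;
  return true;
}
```

The same identity holds at any element width, which is why the choice of result type is the only place where vectors need care.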

Currently this code takes a vector type (say v4i32), creates an integer type the size of the whole original vector (an i128), and then converts that into a vector type with the same number of elements (a v4i128). That type won't be legal and won't match the original types, so the transform won't be performed where it should be. I believe it should be fine to just use the original VT from the existing operations, as they are all the same size as each other and as the node we want to produce.
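
The type computation described above can be traced with a small toy model. This is only a sketch: `ToyVT` is a hypothetical stand-in for `llvm::EVT` that tracks element width and element count and nothing else (it treats a count of 1 as a scalar, which real LLVM does not), and the two helper functions are illustrative names, not LLVM APIs.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for llvm::EVT, tracking only what matters here.
struct ToyVT {
  unsigned ElemBits; // bits per element
  unsigned NumElems; // 1 means scalar in this toy model
  bool isVector() const { return NumElems > 1; }
  unsigned getSizeInBits() const { return ElemBits * NumElems; }
  std::string str() const {
    std::string S = "i" + std::to_string(ElemBits);
    return isVector() ? "v" + std::to_string(NumElems) + S : S;
  }
};

// Mirrors the type computation in the quoted hunk: build an integer as wide
// as the whole value, then reuse it as the element type for vectors.
inline ToyVT patchResultType(ToyVT VT) {
  ToyVT NVT{VT.getSizeInBits(), 1};
  if (VT.isVector())
    return ToyVT{NVT.ElemBits, VT.NumElems};
  return NVT;
}

// The suggestion from the review: reuse the operand type unchanged.
inline ToyVT suggestedResultType(ToyVT VT) { return VT; }
```

In this model, `patchResultType(ToyVT{32, 4})` yields v4i128 while `suggestedResultType(ToyVT{32, 4})` stays v4i32; for scalars the two agree.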

https://github.com/llvm/llvm-project/pull/84903

