[PATCH] [PATCH][CodeGen] Adding "llvm.sad" intrinsic and corresponding ISD::SAD node for "Sum Of Absolute Differences" operation

Ahmed Bougacha ahmed.bougacha at gmail.com
Wed Apr 15 11:56:37 PDT 2015


A few general notes:

- A LangRef description of the intrinsic would be useful for reviewing (and of course necessary for committing).
- Also, the TTI parts are independent of the intrinsic/building/legalization parts; perhaps submit those with the patch that introduces their usage?  (the vectorizer I guess)
- I'm a bit confused about the lowering:  if you only implement the v16i8 legal case, I don't expect to see anything other than a .td pattern and the SelectionDAGBuilder code.
- In general, I think this will eventually need much stronger legalization:  once something is in IR, people *will* use it, and IMO, if there isn't a good reason, it shouldn't fail to select or crash the compiler.

Thanks for working on this, I'm excited about the vectorizer parts!
-Ahmed


REPOSITORY
  rL LLVM

================
Comment at: include/llvm/IR/Intrinsics.td:589
@@ +588,3 @@
+// vectors.
+def int_sad : Intrinsic<[llvm_anyint_ty],
+                        [ llvm_anyvector_ty, llvm_anyvector_ty ], [IntrNoMem]>;
----------------
The opposite question was asked on the thread about i32, but can we constraint the return type more than this? (open question, I don't think so)

================
Comment at: include/llvm/IR/Intrinsics.td:590
@@ +589,3 @@
+def int_sad : Intrinsic<[llvm_anyint_ty],
+                        [ llvm_anyvector_ty, llvm_anyvector_ty ], [IntrNoMem]>;
+
----------------
No spaces after '[' and before ']'.

================
Comment at: lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1291-1295
@@ -1289,3 +1290,7 @@
     break;
-
+  case ISD::SAD:
+    Action = TLI.getOperationAction(Node->getOpcode(),
+                                    Node->getOperand(0).getValueType());
+    SimpleFinishLegalizing = false;
+    break;
   default:
----------------
This basically bypasses the legalizer for SAD, for no good reason.  It should just be handled by the default case, and the legalization below.

================
Comment at: lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:2741-2760
@@ -2735,1 +2740,22 @@
 
+SDValue SelectionDAGLegalize::ExpandSAD(EVT DestVT, SDValue LHS, SDValue RHS,
+                                        SDLoc DL) {
+  SDValue result;
+  switch (LHS.getValueType().getSimpleVT().SimpleTy) {
+  default:
+    llvm_unreachable("Unhandled Expand type in SAD");
+  case MVT::i16:
+    SDValue sad = DAG.getNode(ISD::SAD, DL, MVT::v2i64, LHS, RHS);
+    EVT VecIdxTy = DAG.getTargetLoweringInfo().getVectorIdxTy();
+    SDValue BottomHalf = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, DL, MVT::i64, sad,
+                                     DAG.getConstant(0, VecIdxTy));
+    SDValue TopHalf = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, DL, MVT::i64, sad,
+                                  DAG.getConstant(1, VecIdxTy));
+
+    BottomHalf = DAG.getNode(ISD::TRUNCATE, DL, DestVT, BottomHalf);
+    TopHalf = DAG.getNode(ISD::TRUNCATE, DL, DestVT, TopHalf);
+    result = DAG.getNode(ISD::ADD, DL, DestVT, TopHalf, BottomHalf);
+  }
+  return result;
+}
+
----------------
I'm confused:  what does this achieve?

================
Comment at: lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:5096-5100
@@ -5088,2 +5095,7 @@
     return nullptr;
+  case Intrinsic::sad: {
+    setValue(&I, ExpandSad(sdl, TLI.getValueType(I.getType()), getValue(I.getArgOperand(0)),
+                           getValue(I.getArgOperand(1)), DAG));
+    return nullptr;
+  }
   case Intrinsic::cttz: {
----------------
I would inline ExpandSad here, like bswap right above.

================
Comment at: lib/Target/X86/X86ISelDAGToDAG.cpp:2087-2097
@@ -2086,2 +2086,13 @@
   default: break;
+  case ISD::SAD: {
+    SDNode *New;
+    SDValue Ops[] = {Node->getOperand(0), Node->getOperand(1)};
+
+    if (Subtarget->hasAVX() || Subtarget->hasSSE1() || Subtarget->hasSSE2())
+      New =
+          CurDAG->getMachineNode(X86::PSADBWrr, dl, Node->getValueType(0), Ops);
+    else
+      New = Node;
+    return New;
+  }
   case ISD::INTRINSIC_W_CHAIN: {
----------------
Why not a .td pattern?  Should be a one-liner (+ the SDNode in TargetSelectionDAG.td).

================
Comment at: lib/Target/X86/X86ISelDAGToDAG.cpp:2089
@@ +2088,3 @@
+    SDNode *New;
+    SDValue Ops[] = {Node->getOperand(0), Node->getOperand(1)};
+
----------------
I think getMachineNode has an overload for two operands you can use.

================
Comment at: lib/Target/X86/X86ISelDAGToDAG.cpp:2091
@@ +2090,3 @@
+
+    if (Subtarget->hasAVX() || Subtarget->hasSSE1() || Subtarget->hasSSE2())
+      New =
----------------
Two things:
- SDM says SSE1 supports this on v8i8, which isn't really legal.
- why not a single check? For instance, is hasAVX() useful here?

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:915
@@ +914,3 @@
+    setOperationAction(ISD::SAD, MVT::v8i8, Custom);
+    setOperationAction(ISD::SAD, MVT::v16i8, Expand);
+
----------------
Why not Legal?

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:16928-16929
@@ +16927,4 @@
+    EVT VecIdxTy = DAG.getTargetLoweringInfo().getVectorIdxTy();
+    SDValue V0 = DAG.getUNDEF(MVT::v8i8);
+    SDValue V1 = DAG.getUNDEF(MVT::v8i8);
+    SDValue Op0 = DAG.getNode(ISD::CONCAT_VECTORS, dl, MVT::v16i8, V0, N0);
----------------
Why not a single SDValue?

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:16930-16936
@@ +16929,9 @@
+    SDValue V1 = DAG.getUNDEF(MVT::v8i8);
+    SDValue Op0 = DAG.getNode(ISD::CONCAT_VECTORS, dl, MVT::v16i8, V0, N0);
+    SDValue Op1 = DAG.getNode(ISD::CONCAT_VECTORS, dl, MVT::v16i8, V1, N1);
+    SDValue Ops[] = {Op0, Op1};
+    SDValue SAD = DAG.getNode(ISD::SAD, dl, MVT::v2i64, Ops);
+    SDValue BottomHalf = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, MVT::i64, SAD,
+                                     DAG.getConstant(0, VecIdxTy));
+    result = DAG.getNode(ISD::TRUNCATE, dl, MVT::i32, BottomHalf);
+  } else
----------------
The original ISD::SAD building returns an integer (matching the intrinsic).  Why v2i64 + extract + trunc here, instead of just doing DAG.getNode(ISD::SAD, dl, Op.getValueType(), ...) ?

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:16932-16933
@@ +16931,4 @@
+    SDValue Op1 = DAG.getNode(ISD::CONCAT_VECTORS, dl, MVT::v16i8, V1, N1);
+    SDValue Ops[] = {Op0, Op1};
+    SDValue SAD = DAG.getNode(ISD::SAD, dl, MVT::v2i64, Ops);
+    SDValue BottomHalf = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, MVT::i64, SAD,
----------------
There's a getNode overload you can use for two operands:  getNode(..., Op0, Op1), like you do right below.

================
Comment at: lib/Target/X86/X86TargetTransformInfo.cpp:1143-1144
@@ +1142,4 @@
+
+  assert(DataTy->isVectorTy() && "Must be a vector");
+  assert(DataTy->getScalarType()->isIntegerTy() && "Elem must be an integer");
+
----------------
Why not "return false"?  It doesn't make much sense to call it on any other type, but I think that's the point of the function.

================
Comment at: test/CodeGen/X86/sad_intrinsic.ll:1
@@ +1,2 @@
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu  -mattr=+avx < %s | FileCheck %s -check-prefix=AVX
+
----------------
Why AVX?  This is all SSE2, no?

================
Comment at: test/CodeGen/X86/sad_intrinsic.ll:5
@@ +4,3 @@
+
+; AVX: psadbw     %xmm1, %xmm0 
+
----------------
Explicitly checking the movd would be useful, I think

http://reviews.llvm.org/D9029

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/






More information about the llvm-commits mailing list