[PATCH] D13633: Type legalizer for masked gather/scatter intrinsics

Mon Dec 14 17:40:50 PST 2015

mbodart added a comment.

Hi Elena,

I added few more minor comments and a question, but in general it LGTM.

regards,

- mitch


================
Comment at: ../lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp:3216
@@ +3215,3 @@
+
+  assert(OpNo == 1 && "Unexpected operand number");
+  // Widen the value
----------------
Redundant assert, already checked a few lines above.

================
Comment at: ../lib/Target/X86/X86ISelLowering.cpp:19849
@@ +19848,3 @@
+  MVT BitMaskVT = MVT::getVectorVT(MVT::i1, NumElts);
+  Mask = DAG.getNode(ISD::TRUNCATE, dl, BitMaskVT, Mask);
+
----------------
Minor mechanics question:  should we first check if Mask's element type is i1, and if so, avoid inserting the TRUNCATE?

================
Comment at: ../lib/Target/X86/X86ISelLowering.cpp:19946
@@ +19945,3 @@
+
+    // Minimal number of elements in Gather
+    NumElts = 8;
----------------
For a 32-bit target, we can gather 16 elements with a single gather.
Though the comment says 8 is the minimum number of elements in
a gather, which is true, the code also seems to be treating this as
the maximum number.  How are 16-element gathers supported?

================
Comment at: ../lib/Target/X86/X86ISelLowering.cpp:26980
@@ +26979,3 @@
+  SDLoc DL(N);
+  // The mask will be truncated anyway. The SIGN_EXTEND_INREG is redundant.
+  SDValue Mask = N->getOperand(2);
----------------
It is hard to interpret this comment, and thus the correctness of this routine,
without any additional context.
Why is a mask always truncated at this point?
How do you know that it is truncated down to
the element size of the SIGN_EXTEND_INREG's operand?
Please add some supporting comments answering these kinds of questions.

================
Comment at: ../test/CodeGen/X86/masked_gather_scatter.ll:2-4
@@ -2,1 +1,5 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu  -mattr=+avx512f < %s | FileCheck %s --check-prefix=KNL_64
+; RUN: llc -mtriple=i386-unknown-linux-gnu  -mattr=+avx512f < %s | FileCheck %s --check-prefix=KNL_32
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu  -mattr=+avx512vl -mattr=+avx512dq < %s | FileCheck %s --check-prefix=SKX
 ; RUN: opt -mtriple=x86_64-apple-darwin -codegenprepare -mcpu=corei7-avx -S < %s | FileCheck %s -check-prefix=SCALAR
----------------
I would think we also want a test for i386 and avx512vl.


Repository:
  rL LLVM

http://reviews.llvm.org/D13633