[PATCH] D13633: Type legalizer for masked gather/scatter intrinsics
Elena Demikhovsky via llvm-commits
llvm-commits at lists.llvm.org
Tue Dec 15 00:24:22 PST 2015
delena marked an inline comment as done.
================
Comment at: ../lib/Target/X86/X86ISelLowering.cpp:19849
@@ +19848,3 @@
+ MVT BitMaskVT = MVT::getVectorVT(MVT::i1, NumElts);
+ Mask = DAG.getNode(ISD::TRUNCATE, dl, BitMaskVT, Mask);
+
----------------
mbodart wrote:
> Minor mechanics question: should we first check if Mask's element type is i1, and if so, avoid inserting the TRUNCATE?
getNode(ISD::TRUNCATE..) does this optimization.
================
Comment at: ../lib/Target/X86/X86ISelLowering.cpp:19946
@@ +19945,3 @@
+
+ // Minimal number of elements in Gather
+ NumElts = 8;
----------------
mbodart wrote:
> For a 32-bit target, we can gather 16 elements with a single gather.
> Though the comment says 8 is the minimum number of elements in
> a gather, which is true, the code also seems to be treating this as
> the maximum number. How are 16-element gathers supported?
We always can sign-extend the indices. VGATHERQPS is supported in 32-bit mode.
================
Comment at: ../lib/Target/X86/X86ISelLowering.cpp:26980
@@ +26979,3 @@
+ SDLoc DL(N);
+ // The mask will be truncated anyway. The SIGN_EXTEND_INREG is redundant.
+ SDValue Mask = N->getOperand(2);
----------------
mbodart wrote:
> It is hard to interpret this comment, and thus the correctness of this routine,
> without any additional context.
> Why is a mask always truncated at this point?
> How do you know that it is truncated down to
> the element size of the SIGN_EXTEND_INREG's operand?
> Please add some supporting comments answering these kinds of questions.
I added more words here:
// Gather and Scatter instructions use k-registers for masks. The type of
// the masks is v*i1. So the mask will be truncated anyway.
// The SIGN_EXTEND_INREG my be dropped.
================
Comment at: ../test/CodeGen/X86/masked_gather_scatter.ll:2-4
@@ -2,1 +1,5 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=+avx512f < %s | FileCheck %s --check-prefix=KNL_64
+; RUN: llc -mtriple=i386-unknown-linux-gnu -mattr=+avx512f < %s | FileCheck %s --check-prefix=KNL_32
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=+avx512vl -mattr=+avx512dq < %s | FileCheck %s --check-prefix=SKX
; RUN: opt -mtriple=x86_64-apple-darwin -codegenprepare -mcpu=corei7-avx -S < %s | FileCheck %s -check-prefix=SCALAR
----------------
mbodart wrote:
> I would think we also want a test for i386 and avx512vl.
I added.
Repository:
rL LLVM
http://reviews.llvm.org/D13633
More information about the llvm-commits
mailing list