[PATCH] D37446: [x86] eliminate unnecessary vector compare for AVX masked store

Tue Sep 5 06:14:06 PDT 2017

spatel added inline comments.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:33185
+    SDValue Mask = Mst->getMask();
+    if (Mask.getOpcode() == X86ISD::PCMPGT &&
+        ISD::isBuildVectorAllZeros(Mask.getOperand(0).getNode())) {
----------------
RKSimon wrote:
> aymanmus wrote:
> > Is there any canonical form of compare-with-all-zeros that can be guaranteed here? Or should the pattern with (pcmplt X, 0) be added also?
> Add X86ISD::PCMPGTM support?
Waiting until this has been lowered to PCMPGT is a kind of canonicalization (compared to the general setcc node) because the SSE/AVX integer compares have no other ordering predicate; only PCMPEQ and PCMPGT exist. I.e., there's no other simple way to encode this; there is no PCMPLT node.
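
To make the reasoning behind the fold concrete, here is a minimal scalar sketch in C++ of why the compare is unnecessary. It assumes the usual AVX masked-store behavior (only the sign bit of each mask lane is consulted). The names `V4i32`, `pcmpgt`, `maskedStore`, and `compareIsRedundant` are hypothetical helpers for illustration, not LLVM APIs, and this is a model of the semantics rather than the code in the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V4i32 = std::array<int32_t, 4>;

// Scalar model of PCMPGT on 4 x i32: a lane is all-ones if a > b (signed), else 0.
V4i32 pcmpgt(const V4i32 &a, const V4i32 &b) {
  V4i32 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = a[i] > b[i] ? -1 : 0;
  return r;
}

// Scalar model of an AVX masked store: a lane is written only when the
// sign bit of the corresponding mask lane is set (assumption stated above).
void maskedStore(V4i32 &mem, const V4i32 &mask, const V4i32 &val) {
  for (std::size_t i = 0; i < 4; ++i)
    if (mask[i] < 0)
      mem[i] = val[i];
}

// Returns true if storing under the mask pcmpgt(0, x) writes exactly the
// same lanes as storing under x itself, i.e. the compare can be dropped.
bool compareIsRedundant(const V4i32 &x) {
  const V4i32 zero{0, 0, 0, 0};
  const V4i32 val{10, 20, 30, 40};
  V4i32 memA{-7, -7, -7, -7};
  V4i32 memB = memA;
  maskedStore(memA, pcmpgt(zero, x), val);
  maskedStore(memB, x, val);
  return memA == memB;
}

// Small self-check of the model on a mixed-sign input.
void checkModel() {
  assert(compareIsRedundant({-1, 0, 5, INT32_MIN}));
}
```

Since `0 > x` is true exactly when the sign bit of `x` is set, both masks select the same lanes, which is the property the fold relies on.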

================
Comment at: test/CodeGen/X86/masked_memop.ll:1158
 ; SKX-LABEL: trunc_mask:
 ; SKX:       ## BB#0:
 ; SKX-NEXT:    vpxor %xmm1, %xmm1, %xmm1
----------------
aymanmus wrote:
> I think the optimal code for SKX is:
> vpmovd2m %xmm2, %k1
> vmovups %xmm0, (%rdi) {%k1}
> 
Ok - let me try to shake that out of here. To be clear, we're saying this is the optimal sequence for any CPU with avx512vl/avx512bw. SKX is just an implementation of those ISAs.
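
For reference, a minimal scalar sketch in C++ of what the suggested two-instruction sequence computes. The assumptions: `vpmovd2m` copies the sign bit of each 32-bit lane into the corresponding bit of the mask register, and the `{%k1}` write mask on the store suppresses unselected lanes. Lanes are modeled as `int32_t` for simplicity even though `vmovups` moves floats. `movd2m` and `storeWithKMask` are hypothetical names, not compiler or intrinsic APIs.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V4i32 = std::array<int32_t, 4>;

// Scalar model of vpmovd2m on a 128-bit source:
// bit i of the result is the sign bit of 32-bit lane i.
uint8_t movd2m(const V4i32 &src) {
  uint8_t k = 0;
  for (std::size_t i = 0; i < 4; ++i)
    if (src[i] < 0)
      k = static_cast<uint8_t>(k | (1u << i));
  return k;
}

// Scalar model of a store under a k-register write mask:
// lane i is written only when bit i of k is set.
void storeWithKMask(V4i32 &mem, uint8_t k, const V4i32 &val) {
  for (std::size_t i = 0; i < 4; ++i)
    if (k & (1u << i))
      mem[i] = val[i];
}
```

In this model no zero vector and no vector compare are needed: the sign bits go straight into the mask register, which is why the `vpxor` in the current output looks avoidable.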

https://reviews.llvm.org/D37446