[PATCH] D37446: [x86] eliminate unnecessary vector compare for AVX masked store

Sanjay Patel via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 5 06:14:06 PDT 2017


spatel added inline comments.


================
Comment at: lib/Target/X86/X86ISelLowering.cpp:33185
+    SDValue Mask = Mst->getMask();
+    if (Mask.getOpcode() == X86ISD::PCMPGT &&
+        ISD::isBuildVectorAllZeros(Mask.getOperand(0).getNode())) {
----------------
RKSimon wrote:
> aymanmus wrote:
> > Is there any canonical form of compare-with-all-zeros that can be guaranteed here? Or should the pattern with (pcmplt X, 0) be added also?
> Add X86ISD::PCMPGTM support?
Waiting until this is PCMPGT is a kind of canonicalization (compared to the general setcc node) because the only SSE/AVX integer vector compares are PCMPEQ and PCMPGT. I.e., there's no other simple way to encode "X < 0" than (pcmpgt 0, X); there is no PCMPLT node.
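
For illustration only, here is a scalar sketch of why the compare is unnecessary when it feeds an AVX masked store. This is not LLVM code: V4i32, pcmpgt, maskedStore and compareIsRedundant are hypothetical names made up for this example, and it assumes (as with vmaskmovps/vpmaskmovd) that the store only looks at the sign bit of each mask lane.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical 4 x i32 vector type used only for this sketch.
using V4i32 = std::array<int32_t, 4>;

// Scalar model of PCMPGT: a lane is all-ones if A > B (signed), else zero.
V4i32 pcmpgt(const V4i32 &A, const V4i32 &B) {
  V4i32 R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = A[I] > B[I] ? -1 : 0;
  return R;
}

// Scalar model of an AVX masked store: only the sign bit of each mask
// lane decides whether that lane is written (assumption stated above).
void maskedStore(V4i32 &Mem, const V4i32 &Mask, const V4i32 &Val) {
  for (std::size_t I = 0; I < Mem.size(); ++I)
    if (Mask[I] < 0)
      Mem[I] = Val[I];
}

// Returns true if using (pcmpgt 0, X) as the mask and using X itself as
// the mask leave memory in the same state.
bool compareIsRedundant(const V4i32 &X, const V4i32 &Val) {
  V4i32 Zero{};
  V4i32 MemWithCompare{};
  V4i32 MemWithoutCompare{};
  maskedStore(MemWithCompare, pcmpgt(Zero, X), Val);
  maskedStore(MemWithoutCompare, X, Val);
  return MemWithCompare == MemWithoutCompare;
}
```

Since (pcmpgt 0, X) is all-ones exactly in the lanes where X is negative, its sign bits match the sign bits of X, so compareIsRedundant should hold for any X.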


================
Comment at: test/CodeGen/X86/masked_memop.ll:1158
 ; SKX-LABEL: trunc_mask:
 ; SKX:       ## BB#0:
 ; SKX-NEXT:    vpxor %xmm1, %xmm1, %xmm1
----------------
aymanmus wrote:
> I think the optimal code for SKX is:
> vpmovd2m %xmm2, %k1
> vmovups %xmm0, (%rdi) {%k1}
> 
OK, let me try to shake that out of here. To be clear, we're saying this is the optimal sequence for any CPU with avx512vl/avx512bw; SKX is just one implementation of those ISAs.
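
As a rough scalar sketch of why the suggested sequence is equivalent: vpmovd2m copies the sign bit of each dword lane into a k-register bit, which is the same mask that a compare against zero would produce. The names below (V4i32, vpmovd2m, cmpgtZeroToK, maskedStoreK) are hypothetical models written for this example, not LLVM or intrinsic APIs. As far as I know, the dword form of this instruction is part of avx512dq (plus avx512vl for xmm operands), so treat the exact feature set as something to double-check.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical 4 x i32 vector type used only for this sketch.
using V4i32 = std::array<int32_t, 4>;

// Scalar model of vpmovd2m: bit I of the result is the sign bit of lane I.
uint8_t vpmovd2m(const V4i32 &X) {
  uint8_t K = 0;
  for (std::size_t I = 0; I < X.size(); ++I)
    if (X[I] < 0)
      K |= static_cast<uint8_t>(1u << I);
  return K;
}

// Scalar model of "zero a register, then compare 0 > X into a k-register".
uint8_t cmpgtZeroToK(const V4i32 &X) {
  uint8_t K = 0;
  for (std::size_t I = 0; I < X.size(); ++I)
    if (0 > X[I])
      K |= static_cast<uint8_t>(1u << I);
  return K;
}

// Scalar model of a k-masked store such as "vmovups %xmm0, (%rdi) {%k1}":
// lane I is written only when bit I of K is set.
void maskedStoreK(V4i32 &Mem, uint8_t K, const V4i32 &Val) {
  for (std::size_t I = 0; I < Mem.size(); ++I)
    if (K & (1u << I))
      Mem[I] = Val[I];
}
```

In this model the two mask computations agree for every input, so the vpxor and the compare can be dropped in favor of the single sign-bit move.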


https://reviews.llvm.org/D37446




