[PATCH] D37446: [x86] eliminate unnecessary vector compare for AVX masked store
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 5 06:14:06 PDT 2017
spatel added inline comments.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:33185
+ SDValue Mask = Mst->getMask();
+ if (Mask.getOpcode() == X86ISD::PCMPGT &&
+ ISD::isBuildVectorAllZeros(Mask.getOperand(0).getNode())) {
----------------
RKSimon wrote:
> aymanmus wrote:
> > Is there any canonical form of compare-with-all-zeros that can be guaranteed here? Or should the pattern with (pcmplt X, 0) be added also?
> Add X86ISD::PCMPGTM support?
Waiting until this is PCMPGT is a kind of canonicalization (compared to the general setcc node) because SSE/AVX don't have any other ordered integer compare predicates; besides PCMPEQ, PCMPGT is the only one. I.e., there's no other simple way to encode this; there is no PCMPLT node.
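
To illustrate why the compare can be dropped, here is a minimal scalar sketch, not code from the patch. It assumes 32-bit lanes and relies on the fact that the AVX masked store only reads the sign bit of each mask element. The helper names (pcmpgtLane, signBit, compareIsRedundant) are hypothetical and exist only for this example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Scalar model of one lane of a signed greater-than compare (PCMPGT-style):
// the lane becomes all-ones when a > b, otherwise zero.
inline int32_t pcmpgtLane(int32_t a, int32_t b) {
  return a > b ? -1 : 0;
}

// True when the top (sign) bit of a 32-bit lane is set.
inline bool signBit(int32_t v) {
  return (static_cast<uint32_t>(v) >> 31) != 0;
}

// Hypothetical check: for every lane, the sign bit of (0 > x) matches the
// sign bit of x itself, so a consumer that only reads sign bits sees the
// same mask with or without the compare.
inline bool compareIsRedundant(const std::array<int32_t, 4> &x) {
  for (int32_t lane : x)
    if (signBit(pcmpgtLane(0, lane)) != signBit(lane))
      return false;
  return true;
}
```

Calling compareIsRedundant on any input, including lanes holding INT32_MIN, -1, 0, and INT32_MAX, should return true, which is the property the combine depends on when operand 0 of PCMPGT is the all-zeros vector.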
================
Comment at: test/CodeGen/X86/masked_memop.ll:1158
; SKX-LABEL: trunc_mask:
; SKX: ## BB#0:
; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1
----------------
aymanmus wrote:
> I think the optimal code for SKX is:
> vpmovd2m %xmm2, %k1
> vmovups %xmm0, (%rdi) {%k1}
>
Ok - let me try to shake that out of here. To be clear, we're saying this is the optimal sequence for any CPU with avx512vl/avx512bw. SKX is just an implementation of those ISAs.
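
For reference, a rough scalar model of what the suggested two-instruction sequence computes, assuming four 32-bit lanes. The function names (movd2m, maskedStore) are hypothetical stand-ins for the semantics of vpmovd2m and a k-masked vmovups; this is a sketch, not generated code or an LLVM API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Models vpmovd2m on four dword lanes: bit i of the result is the sign bit
// of lane i.
inline uint8_t movd2m(const std::array<int32_t, 4> &v) {
  uint8_t k = 0;
  for (std::size_t i = 0; i < v.size(); ++i)
    if (static_cast<uint32_t>(v[i]) >> 31)
      k |= static_cast<uint8_t>(1u << i);
  return k;
}

// Models a masked store: only lanes whose mask bit is set are written;
// the other memory lanes are left untouched.
inline void maskedStore(std::array<float, 4> &mem,
                        const std::array<float, 4> &src, uint8_t k) {
  for (std::size_t i = 0; i < mem.size(); ++i)
    if (k & (1u << i))
      mem[i] = src[i];
}
```

No explicit compare against zero appears anywhere in this model: the mask register is built directly from the sign bits of the original vector.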
https://reviews.llvm.org/D37446