[PATCH] D52964: [x86] use demanded bits to simplify masked store codegen

Sat Oct 6 15:23:46 PDT 2018

craig.topper added inline comments.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:36499
     return SDValue();
-
+  
+  EVT VT = Mst->getValue().getValueType();
----------------
Can you drop the 2 spaces at the start of this blank line?

================
Comment at: test/CodeGen/X86/masked_memop.ll:1282
 ; TODO: SimplifyDemandedBits should eliminate an ashr here.
+; It works for AVX2, but not the more complicated pattern for AVX1.

----------------
RKSimon wrote:
> We're going to have to add SimplifyDemandedBitsForTargetNode to handle this properly - @craig.topper didn't you have a patch that was going to add that at some point?
It was in https://reviews.llvm.org/D38832, but I found a simpler approach for that specific case.

================
Comment at: test/CodeGen/X86/masked_memop.ll:1283
+; It works for AVX2, but not the more complicated pattern for AVX1.

 define void @masked_store_bool_mask_demand_trunc_sext(<4 x double> %x, <4 x double>* %p, <4 x i32> %masksrc) {
----------------
We might be able to get this without target support if we stop splitting v4i32->v4i64 sign extends during DAG combine on AVX1 targets. We already handle the split in LowerSIGN_EXTEND, so we shouldn't need to split in combine. The splitting creates a sequence we can't run SimplifyDemandedBits through because we end up with 2 uses of the v4i32 input.
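
For readers following along, here is a rough scalar model of why the ashr in this test is redundant for the store. It is only a sketch, not LLVM code: `maskedStore` and `splatSignBit` are hypothetical helpers, and the model assumes the AVX masked store looks only at the sign bit of each mask element.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of a 4-lane masked store (not LLVM code).
using Vec4i32 = std::array<int32_t, 4>;
using Vec4f64 = std::array<double, 4>;

// Assumption: a lane is written only when the sign bit of its mask
// element is set, so the sign bit is the only demanded bit of the mask.
Vec4f64 maskedStore(Vec4f64 mem, const Vec4f64 &val, const Vec4i32 &mask) {
  for (int i = 0; i < 4; ++i)
    if (mask[i] < 0)
      mem[i] = val[i];
  return mem;
}

// Models an arithmetic shift right by 31 on each lane: every bit becomes a
// copy of the sign bit. Written as a select to stay portable across C++
// standards.
Vec4i32 splatSignBit(const Vec4i32 &m) {
  Vec4i32 r{};
  for (int i = 0; i < 4; ++i)
    r[i] = m[i] < 0 ? -1 : 0;
  return r;
}
```

In this model, `maskedStore(mem, val, mask)` and `maskedStore(mem, val, splatSignBit(mask))` always write the same lanes, which is the property a demanded-bits simplification would rely on to drop the shift. It says nothing about the 2-use problem above, which is about what the DAG lets us look through rather than about the arithmetic.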

https://reviews.llvm.org/D52964