[PATCH] D64142: [SLP] try to create vector loads from bitcasted scalar pointers

Thu Jul 25 04:23:58 PDT 2019

lebedev.ri added a comment.

I personally think this seems to be going in the right direction,
though it isn't obvious without some more complicated tests
that will show the further transforms this could allow.

================
Comment at: llvm/test/Transforms/SLPVectorizer/X86/load-bitcast-vec.ll:7
 ; CHECK-LABEL: @matching_scalar(
-; CHECK-NEXT:    [[BC:%.*]] = bitcast <4 x float>* [[P:%.*]] to float*
-; CHECK-NEXT:    [[R:%.*]] = load float, float* [[BC]], align 16
-; CHECK-NEXT:    ret float [[R]]
+; CHECK-NEXT:    [[TMP1:%.*]] = load <4 x float>, <4 x float>* [[P:%.*]], align 16
+; CHECK-NEXT:    [[TMP2:%.*]] = extractelement <4 x float> [[TMP1]], i32 0
----------------
spatel wrote:
> ABataev wrote:
> > Seems to me, it must be masked load rather than just load. Plus, what about the cost? This does not look like cost optimal.
> If the load is guaranteed dereferenceable, does that not allow speculated load of the entire vector?
> 
> I'm open to suggestions about the cost calc. It's not clear to me if there's an existing TTI API for this or if we need to create a new one?
I agree that there is no reason this should be a masked load.
Do we have the opposite folds for this in DAGCombine?
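
For reference, a rough sketch of the transform under discussion, reconstructed
from the CHECK lines above. The dereferenceable(16) attribute on %p and the
@matching_scalar_vec name are assumptions for illustration, not copied from the
patch; the attribute is what would make the wider load safe to speculate.

  ; Before: scalar load through a bitcast of the vector pointer.
  define float @matching_scalar(<4 x float>* dereferenceable(16) %p) {
    %bc = bitcast <4 x float>* %p to float*
    %r = load float, float* %bc, align 16
    ret float %r
  }

  ; After (hypothetical name): load the whole vector, then extract lane 0.
  define float @matching_scalar_vec(<4 x float>* dereferenceable(16) %p) {
    %v = load <4 x float>, <4 x float>* %p, align 16
    %r = extractelement <4 x float> %v, i32 0
    ret float %r
  }

Both read only bytes within the 16 dereferenceable bytes at %p, so a plain
load should suffice under that assumption; whether the vector form is cheaper
is the separate cost question raised above.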

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64142/new/

https://reviews.llvm.org/D64142