[PATCH] D106399: [VectorCombine] Widening of partial vector loads

Wed Jul 21 12:10:15 PDT 2021

lebedev.ri marked an inline comment as done.
lebedev.ri added inline comments.

================
Comment at: llvm/test/Transforms/VectorCombine/X86/load-widening.ll:172

 define <7 x float> @vec_with_7elts_256bits(<7 x float>* align 32 dereferenceable(32) %p) {
 ; CHECK-LABEL: @vec_with_7elts_256bits(
----------------
We widen this to either 2x XMM or 1x YMM, and the dereferenceable(32) info tells us that it is safe to do so.

================
Comment at: llvm/test/Transforms/VectorCombine/X86/load-widening.ll:194
 ; We can't tell if we can load more than 256 bits.
 define <9 x float> @vec_with_9elts_256bits(<9 x float>* align 32 dereferenceable(32) %p) {
 ; CHECK-LABEL: @vec_with_9elts_256bits(
----------------
spatel wrote:
> Can you explain the difference between this test and vec_with_7elts_256bits for an SSE target? It's not obvious to me why we are ok widening to 256-bit if that's not legal, but not wider than that.
We would need to widen this to either 3x XMM or 2x YMM, but dereferenceable(32) does not tell us that we can load that many bytes.

There is another problem hiding here: even if we knew we could load 3x XMM, we would still try to widen to 4x XMM,
because that is what the legalizer tells us; it only knows how to double.
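
The byte arithmetic behind both tests can be sketched as follows. This is only a rough model written for illustration, not the code in the patch: the helper names (`next_pow2`, `regs_needed`, `doubled_width_bytes`, `can_widen`) are hypothetical, and it assumes 4-byte floats, 16-byte XMM and 32-byte YMM registers, and a legalizer that rounds the element count up to the next power of two.

```python
# Hypothetical model of the arithmetic discussed above; not VectorCombine's code.
FLOAT_BYTES = 4   # size of one float element
XMM_BYTES = 16    # 128-bit register
YMM_BYTES = 32    # 256-bit register


def next_pow2(n: int) -> int:
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def regs_needed(num_elts: int, reg_bytes: int) -> int:
    """Minimum number of reg_bytes-wide registers that cover num_elts floats."""
    total_bytes = num_elts * FLOAT_BYTES
    return (total_bytes + reg_bytes - 1) // reg_bytes  # ceiling division


def doubled_width_bytes(num_elts: int) -> int:
    """Load width if the element count can only be rounded up by doubling."""
    return next_pow2(num_elts) * FLOAT_BYTES


def can_widen(num_elts: int, deref_bytes: int) -> bool:
    """Widening is only safe if the widened load stays within the known-dereferenceable bytes."""
    return doubled_width_bytes(num_elts) <= deref_bytes


# <7 x float>: 28 bytes, i.e. 2x XMM or 1x YMM; the widened load is 32 bytes.
print(regs_needed(7, XMM_BYTES), regs_needed(7, YMM_BYTES), can_widen(7, 32))

# <9 x float>: 36 bytes, i.e. at least 3x XMM or 2x YMM; doubling gives 64 bytes.
print(regs_needed(9, XMM_BYTES), regs_needed(9, YMM_BYTES), can_widen(9, 32))

# Even with 48 dereferenceable bytes (3x XMM), doubling still asks for 4x XMM.
print(doubled_width_bytes(9) // XMM_BYTES, can_widen(9, 48))
```

Under these assumptions the 7-element load fits within dereferenceable(32), while the 9-element load is rejected both for 32 and for 48 dereferenceable bytes, the latter only because of the doubling.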

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106399/new/

https://reviews.llvm.org/D106399