[PATCH] D68337: [ARM][MVE] Enable extending masked loads

Mon Oct 7 07:17:09 PDT 2019

samparker marked 2 inline comments as done.
samparker added inline comments.

================
Comment at: lib/Target/ARM/ARMISelLowering.cpp:8887
   // zero too, and other values are lowered to a select.
   SDValue ZeroVec = DAG.getNode(ARMISD::VMOVIMM, dl, VT,
                                 DAG.getTargetConstant(0, dl, MVT::i32));
----------------
dmgreen wrote:
> This is creating a zero vector of size VT, which is the size of what the masked loads returns. Should it instead be the size of the memory being loaded (because the extend happens to the passthru as well)? What happens if that isn't a legal value type?
Well, surely the result VT of the masked load has to match the VT of the passthru input. passthru is not about what memory is accessed, but what is written to the destination register. VOVIMM will also generate the same zero value for all full width vector types so for vector widths less than 128-bits, the higher elements will be zeroed and that makes sense. For vectors wider than 128-bits, I think something would have gone before here. I'll add some tests for both these cases.

================
Comment at: lib/Target/ARM/ARMTargetTransformInfo.cpp:511
+      // Only support extending integers if the memory is aligned.
+      if ((EltWidth == 16 && Alignment < 2) ||
+          (EltWidth == 32 && Alignment < 4))
----------------
dmgreen wrote:
> samparker wrote:
> > dmgreen wrote:
> > > If this is coming from codegen, can the alignment here be 0? I think in ISel it is always set (and clang will always set it), but it may not be guaranteed in llvm in general.
> > I can't see anything in the spec for any guarantees of these intrinsics, but for normal loads, it becomes defined by the target ABI. It's always safe for us to use a i8* accessor, so I don't see 0 being a problem here.
> Yeah. Alignment of 0 means ABI alignment, which means 8, not unaligned.
> 
> I think it may be better to just check this alignment is always the case, getting rid of that weird "use i8's to load unaligned masked loads" thing. That was probably a bad idea, more trouble than it's worth.
> 
> I think what will happen here at the moment is that the Vectorizer will call isLegalMaskedLoad with an scalar type and an alignment (which, lets say is unaligned). That alignment won't be checked so the masked loads and stores will be created. Then when we get to the backend the legalizer will call this with a vector type and we'll hit this check, expanding out the masked load into a that very inefficient bunch of code. Which is probably something that we want to avoid.
Hmmm, okay. I also can't see removing unaligned support having a big negative effect. Sounds like I need to add some vectorization tests too, unless we already have them?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68337/new/

https://reviews.llvm.org/D68337