[PATCH] D68337: [ARM][MVE] Enable extending masked loads

Mon Oct 7 08:36:14 PDT 2019

dmgreen added inline comments.

================
Comment at: lib/Target/ARM/ARMISelLowering.cpp:8887
   // zero too, and other values are lowered to a select.
   SDValue ZeroVec = DAG.getNode(ARMISD::VMOVIMM, dl, VT,
                                 DAG.getTargetConstant(0, dl, MVT::i32));
----------------
samparker wrote:
> dmgreen wrote:
> > This is creating a zero vector of size VT, which is the size of what the masked loads returns. Should it instead be the size of the memory being loaded (because the extend happens to the passthru as well)? What happens if that isn't a legal value type?
> Well, surely the result VT of the masked load has to match the VT of the passthru input. passthru is not about what memory is accessed, but what is written to the destination register. VOVIMM will also generate the same zero value for all full width vector types so for vector widths less than 128-bits, the higher elements will be zeroed and that makes sense. For vectors wider than 128-bits, I think something would have gone before here. I'll add some tests for both these cases.
Hmmm. Yeah OK. I see. The PassThru is explicitly extended in tryToFoldExtOfMaskedLoad?

That makes sense, and the tests look OK. (There's one that is both sext and zext the same value, but that looks correct for where it is used).

Test for masked loads/stores longer than 128 bits sounds like a good idea. We should ideally be able to deal with longer vector by splitting them just fine.

================
Comment at: lib/Target/ARM/ARMInstrMVE.td:5196
   def : MVE_vector_maskedload_typed<v4f32, MVE_VLDRWU32, alignedmaskedload32, 2>;
+  // Extending masked loads.
+  def : Pat<(v8i16 (sextmaskedload8 t2addrmode_imm7<0>:$addr, VCCR:$pred,
----------------
samparker wrote:
> dmgreen wrote:
> > There likely needs to be an anyext too. Can (or is it beneficial for) these be merged into the MVEExtLoad multiclass below?
> As much as I don't like copy-paste, I do appreciate being able to read the code! I think adding to that multiclass is more hassle than it's worth :)
Ha, Fair. I will agree with you there that sometimes more code is simpler.

================
Comment at: lib/Target/ARM/ARMTargetTransformInfo.cpp:511
+      // Only support extending integers if the memory is aligned.
+      if ((EltWidth == 16 && Alignment < 2) ||
+          (EltWidth == 32 && Alignment < 4))
----------------
samparker wrote:
> dmgreen wrote:
> > samparker wrote:
> > > dmgreen wrote:
> > > > If this is coming from codegen, can the alignment here be 0? I think in ISel it is always set (and clang will always set it), but it may not be guaranteed in llvm in general.
> > > I can't see anything in the spec for any guarantees of these intrinsics, but for normal loads, it becomes defined by the target ABI. It's always safe for us to use a i8* accessor, so I don't see 0 being a problem here.
> > Yeah. Alignment of 0 means ABI alignment, which means 8, not unaligned.
> > 
> > I think it may be better to just check this alignment is always the case, getting rid of that weird "use i8's to load unaligned masked loads" thing. That was probably a bad idea, more trouble than it's worth.
> > 
> > I think what will happen here at the moment is that the Vectorizer will call isLegalMaskedLoad with an scalar type and an alignment (which, lets say is unaligned). That alignment won't be checked so the masked loads and stores will be created. Then when we get to the backend the legalizer will call this with a vector type and we'll hit this check, expanding out the masked load into a that very inefficient bunch of code. Which is probably something that we want to avoid.
> Hmmm, okay. I also can't see removing unaligned support having a big negative effect. Sounds like I need to add some vectorization tests too, unless we already have them?
There was one added to the vectoriser tests, but not for alignment checks as far as I remember.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68337/new/

https://reviews.llvm.org/D68337