[PATCH] D23646: Generalize strided store pattern in interleave access pass

Renato Golin via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 14 12:37:59 PDT 2016


rengolin added a comment.

Hi Alina,

This is looking much better, thanks!

The code has a lot of undef handling, but not much in the way of testing it. I think we should have at least the following:

- one and two undefs in the middle
- one undef at the beginning and one at the end
- all undefs in one lane
- one undef in each lane, at different positions

Repeating the pattern of your current tests but adding undefs should be enough.

cheers,
--renato



================
Comment at: lib/CodeGen/InterleavedAccessPass.cpp:193
+      for (j = 0; j < NumSubElts-1; j++) {
+        unsigned ij = j*Factor + i;
+        if (Mask[ij] >= 0 && Mask[ij + Factor] >= 0 &&
----------------
asbirlea wrote:
> rengolin wrote:
> > Nit, `ij` is hard to follow. Try `lane` or something more expressive. (this is not a matrix :)
> Renamed to Lane. I hope changing the NumSubElts to LaneLen makes more sense too. 
> I'm inclined to change it in the lowering files as well for consistency.
Nice, much better! I agree with renaming the lowering code, too.


================
Comment at: lib/CodeGen/InterleavedAccessPass.cpp:188
+    unsigned I = 0, J;
+    for (; I < Factor; I++) {
+      bool SavedNonUndef = false;
----------------
Better to declare I and J inside the `for` declaration.


================
Comment at: lib/CodeGen/InterleavedAccessPass.cpp:200
+
+        if (LaneValue >= 0 && NextLaneValue >= 0 &&
+            LaneValue + 1  != NextLaneValue)
----------------
A comment here would help...

    // If both defined, only sequential values allowed


================
Comment at: lib/CodeGen/InterleavedAccessPass.cpp:216
+          SavedPos = J;
+        }
+      }
----------------
What about the case where the first lane is undef, but the others aren't?


================
Comment at: lib/CodeGen/InterleavedAccessPass.cpp:226
+      else {
+        for (J = 1; J < LaneLen; J++) {
+          if (Mask[J*Factor + I] >= 0) {
----------------
Instead of using a `SavedNonUndef` above, you could save the last non-undef value and the number of undefs since that value. That'd make the next-value computation easier:

    If (NextValue != SavedValue + NumUndefs)
      break;

and also help get the StartMask here, for free.


================
Comment at: lib/CodeGen/InterleavedAccessPass.cpp:237
+
+    // Find a interleaved mask of current factor.
+    if (I == Factor)
----------------
"Found an interleaved..."


================
Comment at: lib/Target/AArch64/AArch64ISelLowering.cpp:7247
+        if (Mask[j*Factor + i] >= 0) {
+          StartMask = Mask[j*Factor + i] - j;
+          break;
----------------
asbirlea wrote:
> rengolin wrote:
> > I don't get the `- j` here.
> Assuming the mask starts with a few undefs, this computes what the start of the mask would be based on the first non-undef value. 
> The computation is done first in the pass to make sure the start is a positive value (hence the correctness comment below on "StartMask cannot be negative")
Right, I agree you could repeat the naming pattern above, here.


================
Comment at: test/CodeGen/ARM/arm-interleaved-accesses.ll:321
+; NEON-LABEL: store_general_mask_factor4:
+; NEON: vst4.32 {d18, d19, d20, d21}, [r0]
+; NONEON-LABEL: store_general_mask_factor4:
----------------
asbirlea wrote:
> rengolin wrote:
> > is this really guaranteed to reproduce? they don't seem connected to the pcs directly...
> I'm not sure about this TBH, and not sure how to verify it. 
> Should I replace it by a simple vst4.32 check?
Something like:

    vst4.32 {d{{\n+}}, d{{\n+}}, d{{\n+}}, d{{\n+}}}, [r0]

would do. (I'm not sure of the triple brackets there...)


https://reviews.llvm.org/D23646





More information about the llvm-commits mailing list