[PATCH] D23646: Generalize strided store pattern in interleave access pass

Alina Sbirlea via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 14 16:39:47 PDT 2016


asbirlea added a comment.

Great point on the lack of testing. But I wasn't happy with the coverage the vst4/st4 had.
I added a pattern for vst3/st3 that covers the undefs in the middle of a lane.



================
Comment at: lib/CodeGen/InterleavedAccessPass.cpp:188
+    unsigned I = 0, J;
+    for (; I < Factor; I++) {
+      bool SavedNonUndef = false;
----------------
rengolin wrote:
> Better to declare I and J inside the `for` declaration.
There's a check on I and J following each loop. I could add an additional flag to check that we broke out of the loop early, but it seemed overkill to do that when I and J could be used if declared outside the loop.


================
Comment at: lib/CodeGen/InterleavedAccessPass.cpp:216
+          SavedPos = J;
+        }
+      }
----------------
rengolin wrote:
> What about the case where the first lane is undef, but the others aren't?
Nothing wrong with that (unless I'm missing something)..
It'll check the correctness for the ones that follow and the first one will receive a value based on the following values - that's the start mask value.


================
Comment at: lib/CodeGen/InterleavedAccessPass.cpp:226
+      else {
+        for (J = 1; J < LaneLen; J++) {
+          if (Mask[J*Factor + I] >= 0) {
----------------
rengolin wrote:
> Instead of using a `SavedNonUndef` above, you could save the last non-undef value and the number of undefs since that value. That'd make the next-value computation easier:
> 
>     If (NextValue != SavedValue + NumUndefs)
>       break;
> 
> and also help get the StartMask here, for free.
I'm still looking into this one.
I can do without SaveNonUndef, and update the condition to a "SavedLaneValue+SavedNoUndefs (+1)". 
This needs an additional if clause in the loop to increment the SavedNoUndefs, and at least another check to help with computing the mask. The second check is because right now I only store SavedLaneValue if a value is followed by an undef, but at the end of the mask we'll need this updated too to get the correct StartMask as something like SavedLaneValue+SavedNoUndefs -LaneLen (+/- 1).
Right now I find it easier to just compute the StartMask in the same j loop.
So, yeah, still looking what's the cleanest way to do this.




https://reviews.llvm.org/D23646





More information about the llvm-commits mailing list