[PATCH] D95139: [SVE][LoopVectorize] Add support for extracting the last lane of a scalable vector

Thu Jan 28 07:43:50 PST 2021

david-arm added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:4428
+      else
+        Lane = VF.getKnownMinValue() - 1;
+    }
----------------
sdesmalen wrote:
> Should Lane not be set to VF.getKnownMinValue() - 1 for VF.isScalable() as well?
Not in the way I've defined it, i.e. LastLane - Lane. This is a bit similar to how the offset is defined for Cullen's vector.splice intrinsic. A Lane of 0 is the same as LastLane - this does make the calculation of the runtime lane a bit easier.

================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:101-108
+  /// There are cases where we wish to extract a particular lane of a vector,
+  /// but don't know the exact value at compile time. For example, we may wish
+  /// to extract the last lane of a scalable vector and this requires using a
+  /// runtime calculation. The enum below describes how the 'Lane' is
+  /// interpreted. The default is LK_First, which means the lane offset from
+  /// the start of the vector. If you want to extract a lane at an offset from
+  /// the end, i.e. LastLane - Lane, you can set the lane 'Kind' to
----------------
sdesmalen wrote:
> How about:
> 
>   /// LaneKind describes how to interpret Lane.
>   /// For LK_First, Lane is the index into the first N elements of a fixed-vector <N x <etltty>> or a scalable vector <vscale x N x <eltty>>.
>   /// For LK_ScalableLast, Lane is the index into the last N elements of a scalable vector <vscale x N x  <eltty>>
Sure, although I think the way I've defined LK_ScalableLast is not quite the same as you described above. The calculation of lane in my patch is:

LaneFromStartOfVec = LastLaneOfVec - Lane.

================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:205
            "ScalarParts has wrong dimensions.");
-    return Entry[Instance.Part][Instance.Lane] != nullptr;
+    unsigned CacheIdx = mapInstanceLaneToIndex(Instance);
+    return Entry[Instance.Part][CacheIdx] != nullptr;
----------------
sdesmalen wrote:
> Can you make this a method to VPIteration, e.g. `getIndex()`
Sure. I did originally think about that, but then I wondered if that only really makes sense in the context of a cache? For example, if I move this to VPIteration then I think I should also move getNumCachedLanes() there for consistency otherwise it's a bit odd having the cache size defined in one class and the mapping to a cache index in another.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D95139/new/

https://reviews.llvm.org/D95139