[PATCH] D95139: [SVE][LoopVectorize] Add support for extracting the last lane of a scalable vector

Florian Hahn via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 28 07:52:36 PST 2021


fhahn added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:224
+  assert(!Instance.isNonConstLane() &&
+         "Non-constant lanes need handling with callbacks!");
+
----------------
can you add support here as well? The callback is going away soon, so we also need to support the non-callback version.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:110
+  enum LaneKind {
+    LK_First,
+    LK_ScalableLast,
----------------
are you planning on adding more kinds here? otherwise this can just be a boolean? Or make this an `enum class`?


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:117
+
+  bool isNonConstLane() const { return Kind != LK_First; }
 };
----------------
Can this have a better name, e.g. in line with the enum value or the Boolean variable name, if you change it?.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:205
            "ScalarParts has wrong dimensions.");
-    return Entry[Instance.Part][Instance.Lane] != nullptr;
+    unsigned CacheIdx = mapInstanceLaneToIndex(Instance);
+    return Entry[Instance.Part][CacheIdx] != nullptr;
----------------
david-arm wrote:
> sdesmalen wrote:
> > Can you make this a method to VPIteration, e.g. `getIndex()`
> Sure. I did originally think about that, but then I wondered if that only really makes sense in the context of a cache? For example, if I move this to VPIteration then I think I should also move getNumCachedLanes() there for consistency otherwise it's a bit odd having the cache size defined in one class and the mapping to a cache index in another.
I think both should be moved to `VPIteration`, as we need to support both `VPIteration` versions there as well.


================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/neon-extract-last-veclane.ll:3
+
+target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
+target triple = "aarch64-unknown-linux-gnu"
----------------
This test does not need to be neon specific, right? extracting the last lane for fixed vectors should be tested fairly well already, so not sure if the test is needed at all?


================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/sve-extract-last-veclane.ll:22
+entry:
+  %cmp12 = icmp sgt i64 %n, 0
+  br i1 %cmp12, label %for.body, label %for.cond.cleanup
----------------
not needed?


================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/sve-extract-last-veclane.ll:25
+
+for.cond.for.cond.cleanup_crit_edge:              ; preds = %for.body
+  %mul.lcssa = phi i32 [ %mul, %for.body ]
----------------
nit: the names of the blocks could be improved.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D95139/new/

https://reviews.llvm.org/D95139



More information about the llvm-commits mailing list