[PATCH] D87378: [SVE] Fix InstCombinerImpl::PromoteCastOfAllocation for scalable vectors

Mon Sep 14 03:09:40 PDT 2020

sdesmalen added inline comments.

================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp:103
+  // checks.
+  if (isa<ScalableVectorType>(AllocElTy) || isa<ScalableVectorType>(CastElTy))
+    return nullptr;
----------------
david-arm wrote:
> sdesmalen wrote:
> > This also seems to prevent the optimisation on bitcasts between scalable-vector-type T1 and an alloca with scalable-vector-type T2. You can make the expression use `!=` instead of `||`, but then the code below needs some more changes to work on TypeSize.
> Yes, this was deliberate. I think I discussed this in the commit message. I was trying to avoid making needless code changes that's all because casting alloca to scalable still causes us to fail the alignment checks below. I simply couldn't write a test case that proved the additional work needed was correct. I thought it better to avoid changing the code path and then being unable to defend it. I still have the original patch that converted everything to using TypeSize, which I could make use of here if we are able to come up with a test that showed a before and after difference.
Okay, thanks for explaining! I guess an extra argument is that because scalable vectors are not allowed in arrays, there is little benefit of making the code below work for the scalable/scalable case even if the ABI alignment would match.

If it is not too much effort, it would be nice if this function could handle:

  %tmp = alloca <vscale x 16 x i8>, align 16
  %cast = bitcast <vscale x 16 x i8>* %tmp to <vscale x 2 x i64>*
  store volatile <vscale x 2 x i64> zeroinitializer, <vscale x 2 x i64>* %cast, align 16
  %reload = load volatile <vscale x 2 x i64>, <vscale x 2 x i64>* %cast, align 16
  store <vscale x 2 x i64> %reload, <vscale x 2 x i64>* %out, align 16ret void

->

  %tmp = alloca <vscale x 2 x i64>, align 16
  store volatile <vscale x 2 x i64> zeroinitializer, <vscale x 2 x i64>* %tmp, align 16
  %reload = load volatile <vscale x 2 x i64>, <vscale x 2 x i64>* %tmp, align 16
  store <vscale x 2 x i64> %reload, <vscale x 2 x i64>* %out, align 16
  ret void

Even if that is only handled as a special case before bailing out. This case is currently supported albeit with the necessary warnings being emitted.

================
Comment at: llvm/test/Transforms/InstCombine/AArch64/sve-cast-of-alloc.ll:30
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[TMP:%.*]] = alloca <vscale x 4 x i32>, align 64
+; CHECK-NEXT:    [[CAST:%.*]] = bitcast <vscale x 4 x i32>* [[TMP]] to <16 x i32>*
----------------
Not something to be fixed in this patch, but InstCombine changing the alignment to 64 seems wrong for scalable vectors.

================
Comment at: llvm/test/Transforms/InstCombine/AArch64/sve-cast-of-alloc.ll:46
+
+define void @fixed16i32_to_scalable4i32(<vscale x 4 x i32>* %out) {
+; CHECK-LABEL: @fixed16i32_to_scalable4i32(
----------------
nit: This test does not actually test your change, because it bails out on the first alignment check. The same holds for the two functions below (`scalable16i32_to_fixed16i32` and `scalable32i32_to_scalable16i32`).

I think it's fine to leave the tests in, because at least it guards against possible regressions in case the ABI alignment or checks ever change.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87378/new/

https://reviews.llvm.org/D87378