[PATCH] D158724: [AArch64][LoopVectorize] Add truncated store values to list of types for widening

Fri Sep 1 07:51:57 PDT 2023

david-arm added inline comments.

================
Comment at: llvm/test/Transforms/LoopVectorize/optimal-epilog-vectorization.ll:743
 ; CHECK-PROFITABLE-BY-DEFAULT-LABEL: @f4(
-; CHECK-PROFITABLE-BY-DEFAULT-NEXT:  iter.check:
+; CHECK-PROFITABLE-BY-DEFAULT-NEXT:  entry:
 ; CHECK-PROFITABLE-BY-DEFAULT-NEXT:    [[WIDE_TRIP_COUNT:%.*]] = zext i32 [[N:%.*]] to i64
----------------
Hmm, it looks like we've decided not to vectorise at all now. Perhaps that's because the maximum register width is 32 bits and the largest type in the loop is now also 32 bits, so the max VF we can choose is 1? To still demonstrate some vectorisation, you might have to change the loop IR to something like this:

  %conv = trunc i32 %0 to i16
  store i16 %conv, ptr %arrayidx, align 1
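
For illustration only, the arithmetic behind the guess above can be sketched as follows. This is a simplified model, not the LoopVectorize cost model itself: the helper names `bit_floor` and `max_feasible_vf` are hypothetical, and the sketch assumes the upper bound on VF is simply the number of widest-type lanes that fit in one register, rounded down to a power of two.

```python
def bit_floor(n: int) -> int:
    """Return the largest power of two <= n, or 0 when n < 1."""
    return 1 << (n.bit_length() - 1) if n >= 1 else 0


def max_feasible_vf(widest_register_bits: int, widest_type_bits: int) -> int:
    """Hypothetical helper: upper bound on the vectorisation factor.

    Assumption: the bound is the number of lanes of the widest type
    that fit in a single register, rounded down to a power of two,
    and never less than 1 (scalar).
    """
    if widest_register_bits <= 0 or widest_type_bits <= 0:
        raise ValueError("bit widths must be positive")
    return max(1, bit_floor(widest_register_bits // widest_type_bits))


if __name__ == "__main__":
    # 32-bit registers with a 32-bit widest type leave room for one lane only.
    print(max_feasible_vf(32, 32))
    # If the widest type considered were 16 bits, two lanes would fit.
    print(max_feasible_vf(32, 16))
```

Under these assumptions a 32-bit widest type in a 32-bit register gives a bound of 1, whereas a 16-bit widest type would give 2, which is why narrowing the stored type might bring vectorisation back in this test.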

================
Comment at: llvm/test/Transforms/LoopVectorize/vplan-stress-test-no-explict-vf.ll:21
   %arrayidx = getelementptr inbounds [8 x i32], ptr @arr2, i64 0, i64 %indvars.iv21
   %0 = trunc i64 %indvars.iv21 to i32
   store i32 %0, ptr %arrayidx, align 4
----------------
Similar to the test above, you may need to change the test so you still get VF=1. You could try using a 32-bit phi and truncating that to i16?

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D158724