[PATCH] D124612: [AArch64][LV] AArch64 does not prefer vectorized addressing
Sander de Smalen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 29 03:47:35 PDT 2022
sdesmalen added a comment.
Have you done any performance measurements to get an impression of the impact of this change?
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h:151
+ bool prefersVectorizedAddressing() const;
+
----------------
Intuitively I would think that `false` would be a more sensible default anyway.
That wouldn't make much difference to this patch, because we still want to distinguish SVE and NEON.
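For illustration only, a minimal sketch of what the override in AArch64TargetTransformInfo.cpp could look like if it keys on SVE availability; the actual condition in the patch may well differ, this is just an assumption to show the SVE/NEON distinction:

  // Illustrative sketch, not the patch itself: with SVE we have native
  // gathers/scatters, so keeping the addresses vectorized is plausible;
  // without SVE (NEON only), scalarised addressing is likely cheaper.
  bool AArch64TTIImpl::prefersVectorizedAddressing() const {
    return ST->hasSVE();
  }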
================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/gather-do-not-vectorize-addressing.ll:63
+; SVE-NEXT: [[TMP9:%.*]] = getelementptr inbounds double, double* [[DATA:%.*]], <vscale x 2 x i64> [[TMP8]]
+; SVE-NEXT: [[WIDE_MASKED_GATHER:%.*]] = call <vscale x 2 x double> @llvm.masked.gather.nxv2f64.nxv2p0f64(<vscale x 2 x double*> [[TMP9]], i32 8, <vscale x 2 x i1> shufflevector (<vscale x 2 x i1> insertelement (<vscale x 2 x i1> poison, i1 true, i32 0), <vscale x 2 x i1> poison, <vscale x 2 x i32> zeroinitializer), <vscale x 2 x double> undef)
+; SVE-NEXT: [[TMP10:%.*]] = getelementptr inbounds [[STRUCT_STU:%.*]], %struct.stu* [[PARAM:%.*]], i64 0, i32 0, i64 [[TMP4]]
----------------
For a scalable VF there will be no difference in practice, because it won't try to scalarise the addresses.
If you want to test the difference between SVE and NEON, you'll need to force the VF using `-force-vector-width=2` for both RUN lines.
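For example, the two RUN lines could look something like the following (only `-force-vector-width=2` comes from the suggestion above; the other flags and the NEON check prefix are illustrative and should be adjusted to match the test's existing RUN lines):

  ; RUN: opt -mtriple=aarch64-linux-gnu -passes=loop-vectorize -force-vector-width=2 -S < %s | FileCheck %s --check-prefix=NEON
  ; RUN: opt -mtriple=aarch64-linux-gnu -mattr=+sve -passes=loop-vectorize -force-vector-width=2 -S < %s | FileCheck %s --check-prefix=SVE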
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D124612/new/
https://reviews.llvm.org/D124612