[llvm] [SVE][InstCombine] Fold ld1d and splice into ld1ro (PR #69565)

Thu Oct 19 05:21:17 PDT 2023

================
@@ -0,0 +1,32 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -S -mattr=+sve,+f64mm -passes=instcombine < %s | FileCheck %s
+
+target triple = "aarch64-unknown-linux-gnu"
+
+define <vscale x 2 x double> @combine_ld1ro_double(<vscale x 2 x i1> %pred, ptr %addr) {
+; CHECK-LABEL: @combine_ld1ro_double(
+; CHECK-NEXT:    [[RES:%.*]] = call <vscale x 2 x double> @llvm.aarch64.sve.ld1ro.nxv2f64(<vscale x 2 x i1> [[PRED:%.*]], ptr [[ADDR:%.*]])
----------------
david-arm wrote:

This transformation doesn't seem right to me. The ld1rod instruction definition says "Load four contiguous doublewords to elements of a 256-bit (octaword) vector from the memory address". So first you have to prove at compile time that you're loading exactly 256 bits, which requires vscale to be exactly 2. However, even then I am not sure the transformation is valid: don't you also need to prove that the mask used for the load and splice is all ones?
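
To make the vscale concern concrete, here is a rough Python model of the two loads. It is only a sketch, not LLVM code: `ld1d` and `ld1rod` are hypothetical helper names, vectors are plain lists of doubles, and the lane count is assumed to be a multiple of four. The splice from the original pattern is not modelled, so this only illustrates why the vector length matters.

```python
def ld1d(memory, pred):
    """Model of a predicated contiguous load: one doubleword per lane,
    inactive lanes are zeroed."""
    return [memory[i] if pred[i] else 0.0 for i in range(len(pred))]


def ld1rod(memory, pred):
    """Model of a load-and-replicate-octaword: load one 256-bit segment
    (four doublewords) under the first four predicate lanes, then repeat
    that segment across the whole vector.

    Assumption: the lane count is a multiple of four."""
    lanes = len(pred)
    assert lanes % 4 == 0, "this sketch assumes whole 256-bit segments"
    segment = [memory[i] if pred[i] else 0.0 for i in range(4)]
    return segment * (lanes // 4)


memory = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

for vscale in (2, 4):
    lanes = 2 * vscale  # <vscale x 2 x double>
    all_true = [True] * lanes
    same = ld1d(memory, all_true) == ld1rod(memory, all_true)
    print(f"vscale={vscale}: loads agree = {same}")
```

In this model the two loads agree for vscale = 2 and diverge for vscale = 4, where the contiguous load reads eight distinct doublewords but the replicating load repeats the first four.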

https://github.com/llvm/llvm-project/pull/69565