[PATCH] D94230: [AArch64][SVE] Add SVE IR pass to coalesce ptrue intrinsic calls

Bradley Smith via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 7 05:22:14 PST 2021


bsmith requested changes to this revision.
bsmith added a comment.
This revision now requires changes to proceed.

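I'm not sure this patch is correct, as it doesn't take into account how the predicates are used. For example, in the following case your patch replaces the `ptrue_b32()` predicate of the 8 x i16 load `%5` with a `ptrue_b16()`, which changes the behaviour: reinterpreted as an 8 x i1 predicate, the `ptrue_b32()` only enables every other i16 lane, whereas a `ptrue_b16()` enables all of them.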

  declare <vscale x 4 x i1> @llvm.aarch64.sve.ptrue.nxv4i1(i32 immarg)
  declare <vscale x 8 x i1> @llvm.aarch64.sve.ptrue.nxv8i1(i32 immarg)
  
  declare <vscale x 4 x i32> @llvm.aarch64.sve.ld1.nxv4i32(<vscale x 4 x i1>, i32*)
  declare <vscale x 8 x i16> @llvm.aarch64.sve.ld1.nxv8i16(<vscale x 8 x i1>, i16*)
  
  declare <vscale x 16 x i1> @llvm.aarch64.sve.convert.to.svbool.nxv4i1(<vscale x 4 x i1>)
  declare <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1>)
  
  define <vscale x 8 x i16> @coalesce_test_basic(i32* %addr1, i16* %addr2) {
    %1 = call <vscale x 4 x i1> @llvm.aarch64.sve.ptrue.nxv4i1(i32 31)
    %2 = call <vscale x 16 x i1> @llvm.aarch64.sve.convert.to.svbool.nxv4i1(<vscale x 4 x i1> %1)
    %3 = call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> %2)
  
    %4 = call <vscale x 4 x i32> @llvm.aarch64.sve.ld1.nxv4i32(<vscale x 4 x i1> %1, i32* %addr1)
    %5 = call <vscale x 8 x i16> @llvm.aarch64.sve.ld1.nxv8i16(<vscale x 8 x i1> %3, i16* %addr2)
  
    %6 = call <vscale x 8 x i1> @llvm.aarch64.sve.ptrue.nxv8i1(i32 31)
    %7 = call <vscale x 8 x i16> @llvm.aarch64.sve.ld1.nxv8i16(<vscale x 8 x i1> %6, i16* %addr2)
  
    ret <vscale x 8 x i16> %7
  }
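
To make the difference concrete, here is a rough Python model of the predicate layout. It is only a sketch and not part of the patch: it assumes vscale = 1 (a 128-bit vector, so 16 predicate bits, one per vector byte) and the all-lanes pattern (31), and the helper names `ptrue`, `to_svbool` and `from_svbool` are hypothetical stand-ins for the intrinsics above.

```python
VL_BYTES = 16  # assumption: vscale = 1, i.e. a 128-bit vector


def ptrue(elem_bytes, vl_bytes=VL_BYTES):
    """Model ptrue with the all-lanes pattern: one active flag per element."""
    return [True] * (vl_bytes // elem_bytes)


def to_svbool(lanes, vl_bytes=VL_BYTES):
    """Widen a lane predicate to one bit per vector byte.

    Lane i lands on the first byte of element i; all other bits are zero.
    """
    stride = vl_bytes // len(lanes)
    bits = [False] * vl_bytes
    for i, active in enumerate(lanes):
        bits[i * stride] = active
    return bits


def from_svbool(bits, num_lanes):
    """Narrow a per-byte predicate to num_lanes lanes.

    Lane j reads the bit for the first byte of element j.
    """
    stride = len(bits) // num_lanes
    return [bits[j * stride] for j in range(num_lanes)]


# %1 -> %2 -> %3: a ptrue_b32 reinterpreted as an 8 x i1 predicate.
p3 = from_svbool(to_svbool(ptrue(4)), 8)

# %6: a genuine ptrue_b16.
p6 = ptrue(2)

print("predicate for %5:", p3)
print("predicate for %7:", p6)
```

Under these assumptions `p3` has only its even lanes active while `p6` has every lane active, so the loads `%5` and `%7` are not interchangeable and the two ptrue calls cannot be coalesced without looking at how `%3` is used.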


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94230/new/

https://reviews.llvm.org/D94230


