[PATCH] D151184: [AArch64] Adjust costs of i1 and/or/xor reductions
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed May 31 05:49:51 PDT 2023
dmgreen added a comment.
ping
================
Comment at: llvm/test/Analysis/CostModel/AArch64/reduce-xor.ll:20
; CHECK-NEXT: Cost Model: Found an estimated cost of 15 for instruction: %V8i8 = call i8 @llvm.vector.reduce.xor.v8i8(<8 x i8> undef)
; CHECK-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %V16i8 = call i8 @llvm.vector.reduce.xor.v16i8(<16 x i8> undef)
; CHECK-NEXT: Cost Model: Found an estimated cost of 18 for instruction: %V32i8 = call i8 @llvm.vector.reduce.xor.v32i8(<32 x i8> undef)
----------------
david-arm wrote:
> Interestingly, we can also do much better for xor reductions like v16i8, v8i16, etc. by using SVE, if available. For a v8i16 xor reduction we can just do:
>
>   ptrue p0.h, vl8
>   eorv h0, p0, z0.h
>   fmov w0, s0
>
> whereas I see we currently do
>
>   ext v1.16b, v0.16b, v0.16b, #8
>   eor v0.8b, v0.8b, v1.8b
>   fmov x8, d0
>   eor x8, x8, x8, lsr #32
>   lsr x9, x8, #16
>   eor w0, w8, w9
>
OK cool.
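
For reference, here is a rough Python model of what the six-instruction NEON sequence quoted above computes for a v8i16 xor reduction. It is only a sketch for following the folding steps, not LLVM code: the function names (pack_v8i16, neon_style_xor_reduce) are made up for illustration, and it assumes lane 0 sits in the lowest 16 bits of the register.

```python
from functools import reduce
from operator import xor

MASK64 = (1 << 64) - 1


def pack_v8i16(lanes):
    """Pack eight 16-bit lanes into one 128-bit integer (lane 0 in the low bits)."""
    assert len(lanes) == 8
    value = 0
    for i, lane in enumerate(lanes):
        value |= (lane & 0xFFFF) << (16 * i)
    return value


def neon_style_xor_reduce(lanes):
    """Hypothetical model of the quoted NEON sequence, one step per instruction."""
    v0 = pack_v8i16(lanes)
    v1_low = v0 >> 64                # ext v1.16b, v0.16b, v0.16b, #8 (high half moves to low half)
    d0 = (v0 & MASK64) ^ v1_low      # eor v0.8b, v0.8b, v1.8b
    x8 = d0                          # fmov x8, d0
    x8 ^= x8 >> 32                   # eor x8, x8, x8, lsr #32
    x9 = x8 >> 16                    # lsr x9, x8, #16
    w0 = (x8 ^ x9) & 0xFFFFFFFF      # eor w0, w8, w9
    return w0 & 0xFFFF               # the i16 result is the low 16 bits of w0


def reference_xor_reduce(lanes):
    """Plain lane-by-lane xor, i.e. what llvm.vector.reduce.xor means semantically."""
    return reduce(xor, (lane & 0xFFFF for lane in lanes))
```

Each step halves the number of live bits (128 to 64 to 32 to 16), which is why the NEON version needs several dependent instructions where the SVE version uses a single eorv.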
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D151184/new/
https://reviews.llvm.org/D151184