[PATCH] D133955: [AArch64][CostModel] Add costs for fixed operations when using fixed vectors over SVE

Wed Nov 2 06:18:34 PDT 2022

peterwaller-arm added inline comments.

================
Comment at: llvm/test/Analysis/CostModel/AArch64/cast.ll:1035
+; FIXED-OVER-SVE2048-NEXT:  Cost Model: Found an estimated cost of 21 for instruction: %r232 = uitofp <16 x i8> undef to <16 x float>
+; FIXED-OVER-SVE2048-NEXT:  Cost Model: Found an estimated cost of 21 for instruction: %r233 = sitofp <16 x i8> undef to <16 x float>
+; FIXED-OVER-SVE2048-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %r234 = uitofp <16 x i16> undef to <16 x float>
----------------
peterwaller-arm wrote:
> These costs look high, consider:
> 
>   define <16 x float> @f(<16 x i8> %v) {
>     %1 = sitofp <16 x i8> %v to <16 x float>
>     ret <16 x float> %1
>   }
> 
> The codegen I see with `llc -mtriple=aarch64 -mattr=+sve -aarch64-sve-vector-bits-min=2048`, for example:
> 
>   ptrue   p0.s, vl16
>   sunpklo z0.h, z0.b
>   sunpklo z0.s, z0.h
>   scvtf   z0.s, p0/m, z0.s
>   st1w    { z0.s }, p0, [x8]
>   ret
> 
> I suspect there are more where this isn't quite right but I discovered this one by cherry picking. If I'm right, let's make sure this one is fixed and then iterate.
Through in-person discussion I've understood that these costs are high because of an issue different from the one you've set out to fix here, so I retract this comment.

The evidence for this is that the cost is the same as the NEON cost, so we can leave it for the time being.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133955/new/

https://reviews.llvm.org/D133955