[PATCH] D155592: [AArch64] Reuse larger DUPLANE if available

JinGu Kang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 20 02:16:26 PDT 2023


jaykang10 added a comment.

In D155592#4517823 <https://reviews.llvm.org/D155592#4517823>, @dmgreen wrote:

> Thanks. Looks pretty good. Do we need to handle the other "indexing" operations in the same way? For example something like this below, which is for fmul. I would guess that the extending operations (umull) are more likely to see the problem, but you can imagine the others in the "AdvSIMD indexed element" section of AArch64InstrInfo.td might all be better using dup_v8i16 now.
>
>   define <4 x float> @sel.v8i16(ptr %p, ptr %q, <4 x float> %a, <4 x float> %b, <2 x float> %c) {
>     %splat = shufflevector <4 x float> %a, <4 x float> poison, <4 x i32> zeroinitializer
>     %splat2 = shufflevector <4 x float> %a, <4 x float> poison, <2 x i32> zeroinitializer
>     
>     %r = fmul <4 x float> %b, %splat
>     %r2 = fmul <2 x float> %c, %splat2
>     store <2 x float> %r2, ptr %p
>     ret <4 x float> %r
>   }

I have added `dup_v8i16` and `dup_v4i32` to the `SIMDVectorIndexedLongSD` and `SIMDIndexedLongSD` multiclasses.
`umull` uses `SIMDVectorIndexedLongSD`, like `smull`, so it will also be matched with `dup_v8i16` and `dup_v4i32`.
Let me check the `fmul` example and the 64-bit `AArch64duplane` patterns with `VectorIndexS:$idx`.
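
To make the quoted `fmul` case concrete, here is a rough model of why a wider splat can stand in for a narrower one. It is only an illustrative Python sketch, not code from the patch. It assumes that "reusing the larger DUPLANE" amounts to taking the low half of the 128-bit splat of the same lane. The helper names `duplane` and `extract_low_half` are hypothetical, and vectors are modelled as plain lists.

```python
# Illustrative model only: lists stand in for SIMD vectors, and the helper
# names are hypothetical (they are not LLVM or AArch64 backend APIs).

def duplane(vec, lane, width):
    """Broadcast element `lane` of `vec` into a new vector of `width` lanes."""
    return [vec[lane]] * width

def extract_low_half(vec):
    """Model taking the low 64-bit half of a 128-bit vector."""
    return vec[: len(vec) // 2]

a = [1.5, 2.5, 3.5, 4.5]      # stands in for <4 x float> %a
b = [1.0, 2.0, 3.0, 4.0]      # stands in for <4 x float> %b
c = [10.0, 20.0]              # stands in for <2 x float> %c

splat = duplane(a, 0, 4)      # %splat  (128-bit splat of lane 0)
splat2 = duplane(a, 0, 2)     # %splat2 (64-bit splat of the same lane)

# %r and %r2 as written in the IR.
r = [x * y for x, y in zip(b, splat)]
r2_direct = [x * y for x, y in zip(c, splat2)]

# %r2 computed from the low half of the wider splat instead of a second dup.
r2_reused = [x * y for x, y in zip(c, extract_low_half(splat))]
```

Under this model `splat2` and the low half of `splat` hold the same values, so both forms of `%r2` agree. That is the property that would let one dup serve both multiplies.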


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155592/new/

https://reviews.llvm.org/D155592


