[PATCH] D87884: [CostModel][X86] add CostModel for SK_Select(v8f64, v8i64, v16f32, v16i32, v32i16, v64i8)

Craig Topper via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 22 00:01:56 PDT 2020


craig.topper added a comment.

In D87884#2287015 <https://reviews.llvm.org/D87884#2287015>, @yubing wrote:

> In D87884#2286958 <https://reviews.llvm.org/D87884#2286958>, @craig.topper wrote:
>
>> In D87884#2286918 <https://reviews.llvm.org/D87884#2286918>, @yubing wrote:
>>
>>> With avx512f, the cost of SK_Select (v32i16 or v64i8) should be 3 (vmovdqa64 + vpternlogq)
>>
>> The moves probably don't really count since they can be eliminated during register renaming. So only the vpternlog executes.
>
> Eh, Craig, why is this related to register renaming? I thought vpternlog's third operand should be provided by a vmovdqa64.
> Besides, we can observe the following asm for v32i16's SK_Select:
>
>   vmovdqa64       .LCPI0_0(%rip), %zmm0   # zmm0 = [0,0,0,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535]
>   vpternlogq      $202, 144(%rbp), %zmm4, %zmm0

Sorry, I thought the vmovdqa you mentioned was due to the vpternlogq reading 3 sources and clobbering one of them. So sometimes it needs a register-to-register move to preserve a register.
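
For reference, the immediate $202 (0xCA) in the quoted asm is the truth table for a bitwise select: each result bit is taken from the second source where the destination bit is 1, and from the third source where it is 0. In the AT&T operand order above, %zmm0 (the constant loaded by vmovdqa64) is the destination and acts as the mask, %zmm4 is the second source, and the memory operand is the third. The following is only a sketch of that lookup on a single 64-bit lane; `ternlog64`, `bitselect` and `checkSelectImmediate` are hypothetical names for illustration, not LLVM APIs, and the data values are made up.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): emulates the per-bit truth-table
// lookup of vpternlog on one 64-bit lane. `a` is the destination operand
// (also the first source), `b` the second source, `c` the third source.
constexpr uint64_t ternlog64(uint64_t a, uint64_t b, uint64_t c, uint8_t imm8) {
  uint64_t result = 0;
  for (int bit = 0; bit < 64; ++bit) {
    // The three input bits form a 3-bit index into imm8, with `a` as the MSB.
    const unsigned idx = static_cast<unsigned>((((a >> bit) & 1) << 2) |
                                               (((b >> bit) & 1) << 1) |
                                               ((c >> bit) & 1));
    result |= static_cast<uint64_t>((imm8 >> idx) & 1) << bit;
  }
  return result;
}

// Reference bitwise select: take `b` where mask bits are 1, else `c`.
constexpr uint64_t bitselect(uint64_t mask, uint64_t b, uint64_t c) {
  return (mask & b) | (~mask & c);
}

// Checks that imm8 = 202 (0xCA) behaves like bitselect on one sample lane.
// The mask is the first qword of the constant in the quoted asm: four i16
// elements [0, 0, 0, 65535], with element 0 in the low 16 bits.
inline void checkSelectImmediate() {
  const uint64_t mask = 0xFFFF000000000000ULL;
  const uint64_t b = 0x1111222233334444ULL; // made-up data for the zmm4 lane
  const uint64_t c = 0xAAAABBBBCCCCDDDDULL; // made-up data for the memory lane
  assert(ternlog64(mask, b, c, 202) == bitselect(mask, b, c));
}
```

With that mask, the three low i16 elements come from the memory operand and the top one from %zmm4, which matches a select shuffle. The constant has to be in the destination register before vpternlogq runs, which is why the vmovdqa64 load shows up in the sequence.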

I'm not sure if we usually cost the constant pool load, since it's loop invariant. Do we cost the load that vpermi2b/w/d/q would use for a 2-source permute?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87884/new/

https://reviews.llvm.org/D87884


