[PATCH] D129715: [LoongArch] Heuristically load FP immediates by movgr2fr from materialized integer

Tue Jul 26 19:31:33 PDT 2022

xry111 added a comment.

In D129715#3681339 <https://reviews.llvm.org/D129715#3681339>, @gonglingqin wrote:

> In D129715#3656776 <https://reviews.llvm.org/D129715#3656776>, @gonglingqin wrote:
>
>> In D129715#3654612 <https://reviews.llvm.org/D129715#3654612>, @xen0n wrote:
>>
>>> I think some assembly comparison could go a long way, but again, SPEC2006 is *horribly outdated* so actually IMO the argument for 3-instruction threshold would be a lot stronger if you could replicate this result on some more recent or comprehensive benchmark suites. (PTS or newer SPEC are all better than SPEC2006 in this regard.)
>>
>> Thanks, I will test other benchmark sets.
>
> I used SPEC CPU2017 (Fortran benchmarks excluded) to test the performance in 5 cases:
>
> 1. using constant pool,
> 2. materialized integer with 1 instruction,
> 3. materialized integer within 2 instructions,
> 4. materialized integer within 3 instructions,
> 5. materialized integer within 4 instructions.
>
> (Each case was run three times and the scores were combined with a geometric mean.)
> The results showed no change in the scores across the 5 cases. @xen0n, @xry111, do you have any suggestions?

Make it a tunable (`-loongarch-materialize-float-imm=0/1/2/3/4`), I guess.  And set the default to `0` for `-mtune=generic` or `-mtune=la464`.  Then we can set it to other values if a future uarch behaves differently.
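
A rough sketch of the decision such a tunable would drive, assuming `0` means "always use the constant pool" and `N` means "materialize the integer when it takes at most N instructions". The names `FPImmLowering` and `chooseFPImmLowering` are hypothetical and not taken from the patch:

```cpp
// Hypothetical model of the proposed tunable; not code from D129715.
enum class FPImmLowering { ConstantPool, MaterializeInteger };

// IntInsnCount: number of instructions needed to materialize the integer
//               bit pattern of the FP immediate (assumed to be computed
//               elsewhere).
// Threshold:    value of the tunable, 0..4. Assumption: 0 disables integer
//               materialization entirely.
constexpr FPImmLowering chooseFPImmLowering(unsigned IntInsnCount,
                                            unsigned Threshold) {
  return (Threshold != 0 && IntInsnCount <= Threshold)
             ? FPImmLowering::MaterializeInteger
             : FPImmLowering::ConstantPool;
}
```

With the default of `0`, every FP immediate would still go through the constant pool, so current behavior would not change unless a tuning model opts in.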

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129715/new/

https://reviews.llvm.org/D129715