[PATCH] D118534: [X86] Introduce more common modern tunings into `generic`

Sanjay Patel via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 4 06:21:32 PST 2022


spatel accepted this revision.
spatel added a comment.
This revision is now accepted and ready to land.

In D118534#3296065 <https://reviews.llvm.org/D118534#3296065>, @RKSimon wrote:

> In D118534#3295994 <https://reviews.llvm.org/D118534#3295994>, @lebedev.ri wrote:
>
>> Please split Znver changes into a separate review.
>> At least for znver3, I'm not really confident that `fsqrt` is fast,
>> https://www.agner.org/optimize/instruction_tables.pdf says ~25cy,
>> while NR takes ~19cy: https://godbolt.org/z/rK9ra4hse
>
> 'fsqrt' is the x87 instruction. I think the tuning flag (despite its name, which is IR-based rather than x87-based) is concerned with the SSE instruction (v)sqrtss - https://godbolt.org/z/qTzesKWvj

Correct. Also, AFAIK that tuning flag only comes into play when expanding a plain sqrt(X) operation, not a 1/sqrt(X) operation. But I agree that we can make that change independently for the Zen models.
So this patch LGTM.
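
For readers unfamiliar with the "NR" alternative discussed above, here is a minimal, portable sketch of computing sqrt(X) from a reciprocal-square-root estimate plus one Newton-Raphson step. It is not the code LLVM emits. The helper names (`crude_rsqrt`, `refine_rsqrt`, `sqrt_via_rsqrt`) are hypothetical, and `crude_rsqrt` merely imitates a hardware estimate by assuming a relative error of about 2^-12.

```cpp
#include <cmath>

// Hypothetical stand-in for a hardware reciprocal-sqrt estimate.
// Assumption: the estimate is off by a relative error of about 2^-12.
float crude_rsqrt(float x) {
    return (1.0f / std::sqrt(x)) * (1.0f + 1.0f / 4096.0f);
}

// One Newton-Raphson step for f(y) = 1/y^2 - x, refining an
// estimate y0 of 1/sqrt(x): y1 = y0 * (1.5 - 0.5 * x * y0^2).
float refine_rsqrt(float x, float y0) {
    return y0 * (1.5f - 0.5f * x * y0 * y0);
}

// sqrt(x) recovered as x * (1/sqrt(x)). This sketch assumes x > 0 and
// finite; x == 0 would produce NaN here (0 * inf) and needs special
// handling in any real implementation.
float sqrt_via_rsqrt(float x, float y0) {
    return x * refine_rsqrt(x, y0);
}
```

Each step roughly squares the relative error of the estimate, so a single step is usually enough for single precision. Whether this sequence beats a direct (v)sqrtss depends on the target, which is the question the tuning flag answers.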


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118534/new/

https://reviews.llvm.org/D118534


