[PATCH] D53927: [AArch64] Enable trigonometry libm vectorized functions via SLEEF

Wed Oct 31 10:41:40 PDT 2018

steleman added a comment.

In https://reviews.llvm.org/D53927#1282380, @fpetrogalli wrote:

> Hi @steleman , thank you for working on this!
>
> Before doing a detailed review, I have a couple of comments and requests.

> 2. You are using Vector Function ABI mangled names, which means you ware assuming to link agains `libsleefgnuabi.so` instead of `libsleef.so`. For this reason, I think it is worth to replace the `-fveclib=SLEEF` option with `-fveclib=SLEEFGNUABI`. This request applies to all places where you  have used `SLEEF`.

Sorry but I wholeheartedly disagree with this idea.

For starters, saying -fveclib=SLEEFGNUABI is just ugly IMO.

Furthermore, I do not understand why this change is necessary. It provides nothing that the current naming convention doesn't already provide. So, really, it's a matter of personal preference.

Also: Intel did not name their veclib argument "SVMLINTELABI". They named it "SVML".

> 3. Given that SLEEF has some degree of target independence, I would recommend using the TargetTriple `if` that selects the list of functions into a switch statement that sets the architectural extension token in the mangled names. On the long term, the list of available vector functions should be generated by setting such token instead of hard coding all names for all targets. Notice that I am not saying that you should add all targets, I am just saying that you should prepare the ground for extending this, as I think that `-fveclib=SLEEFGNUABI` should work the same for all the extensions supported by SLEEF.

I don't understand any of this. It seems to me that you are describing what I've already done.

The TargetTriple wasn't present in the TargetLibraryInfoImpl class. I had to add it, specifically because it is needed by the SLEEF switch statement case for AArch64.

> On the long term, the list of available vector functions should be generated by setting such token instead of hard coding all names for all targets.

I don't understand what this means.

Are you suggesting that the VecLib array of function name mappings should be generated by tblgen? Yes, I agree. It should be. But it currently isn't. And I don't know when tblgen will be extended to do this job. Or if it ever will be extended to do this job.

Also, we (Cavium) will not be extending the SLEEF mappings to other ISA's. Simply because we do not have the capability of testing PPC64, or Windows.

So, really, providing additional SLEEF mappings for other ISA's or platforms is best suited for those who are best versed in the subtle details of those other ISA's or platforms.

The SLEEFGNUABI ABI has very little CPU independence when compiling for for X86_64. It is, in fact, very CPU-dependent.

In fact, in the current implementation of TargetLibraryInfo, it is impossible to provide X86_64 mappings for SLEEF using the GNUABI. That is because the X86_64 GNUABI mappings require knowledge of the capabilities of the specific CPU that the object is being compiled for. These CPU capabilities are hardcoded by SLEEF, in the SLEEF GNU ABI mangled name:

'b' -> SSE
'c' -> AVX
'd' -> AVX2
'e' -> AVX512

So, at a minimum, on X86_64, you would need to know the values passed to -march= and -mtune=. But these CPU capabilities are currently inaccessible from TargetLibraryInfo.

TargetLibraryInfo could be extended to acquire the CPU capability information, but that's for another changeset. Maybe after TargetLibraryInfo is extended to support this capability we can revisit the construction of the GNU ABI mangled name.

> 4. On the longer term, I want to make you aware of the fact that there is plan to replace the TLI implementation of -fveclib with an OpenMP based one (to decouple backend and frontend). We discussed this at the last LLVM dev meeting, I will circulate an RFC soon. See https://llvm.org/devmtg/2018-10/talk-abstracts.html#bof7.

That's an interesting plan.

I'd like to make you aware that brute-forcing OpenMP for vectorization will probably run afoul of SPEC rules.

SPEC allows OpenMP in SPECspeed, but **not** in SPECrate, and that is clearly stated at the SPEC web site. Not to mention that the SPEC benchmarks code cannot be modified to add #pragma omp simd. And having clang silently and secretly modify emitted code to insert OpenMP simd will likely not fly at SPEC.

I am also not a fan of OpenMP being used as a giant hammer that can hit any nail. You would be introducing a dependency on OpenMP for programs that do not otherwise use, or need, OpenMP. So, in reality, there is no "decoupling" here. There is only a dependency substitution: the existing dependency on SLEEF is replaced by the future dependency on OpenMP.

> 5. Please change your commit messages header, the function you are adding are not just trigonometric functions. May I suggest "[AArch64] Enable libm vectorized functions via SLEEFGNUABI"?

I'll settle for "[AArch64] Enable libm vectorized functions via SLEEF".

Repository:
  rL LLVM

https://reviews.llvm.org/D53927