[flang-commits] [flang] [flang][AMDGPU] Convert math ops to AMD GPU library calls instead of libm calls (PR #99517)

Fri Jul 26 12:08:38 PDT 2024

arsenm wrote:

> If I understand this correctly the main issue is that clang and flang use OCML at all, and it should be purely internal to LLVM. 

Mainly yes.

> OCML is only used within the LLVM projects from what I can tell. It doesn't seem to be user-facing, since it is not mentioned in the ROCm documentation anymore (not for a long time), so we can perhaps assume we control where and how it is used. Once a real solution is implemented we should be able to just delete this pass and let the standard lowering passes do the work. In the meantime I think it is reasonable to use OCML as the uniform interface.

LLVM is the interface. OCML is an implementation detail.

> 
> What are your opinions on adding the "implements" attribute as a long-term solution? This would add an attribute to tell the compiler that a function implements an intrinsic, something like __attribute__ implements(llvm.pow). I was looking into this, but I don't have time to work on it right now. The discussion thread is here: https://discourse.llvm.org/t/nvptx-codegen-for-llvm-sin-and-friends/58170/33

It's not really how I wanted this to go, but the current trend is to have all of the math library functions available as intrinsics. Given that, I don't see much point in adding such an attribute. I was leaning towards removing any intrinsics that no backend can reasonably implement without a runtime call, and that would be complemented by improved libcall-by-name handling. It would make more sense to add implements if we went in the other direction. With this trend, I think it's jumping through hoops to maintain implementation detail names we control and can replace.

We lack a proper platform definition. I think it would be best if we defined amdhsa like a normal operating system, with a well-defined set of provided libm functions, using the standard names. The __ocml prefixes are a relic from how OpenCL was implemented long ago. It would make more sense to put any extension functions under an __amd or __amdhsa prefix. Another issue is amdpal and the other triples we use. Different projects have thrown assorted builtin libraries together in different ways, so it would be better to define this bottom up.

https://github.com/llvm/llvm-project/pull/99517