[PATCH] D156743: clang/OpenCL: Add inline implementations of sqrt in builtin header

Mon Aug 7 15:36:31 PDT 2023

arsenm added inline comments.

================
Comment at: clang/lib/Headers/opencl-c-base.h:832
+
+inline float __ovld __cnfn sqrt(float __x) {
+  return __builtin_elementwise_sqrt(__x);
----------------
Anastasia wrote:
> arsenm wrote:
> > svenvh wrote:
> > > Anastasia wrote:
> > > > Is this a generic implementation enough? Would some targets not need to do something different for this built-in?
> > > > 
> > > > Ideally this header is to be kept light so I am a bit worried about adding definitions of the functions here. Otherwise we will end up in the same situation as we one day were with opencl-c.h. So could these be left there instead? It might be good to check with @svenvh if TableGen header has already a way to do this function forwarding or can be extended to do such a thing. Then it would be implementable in the both header mechanisms. I don't know if Sven has some other ideas or opinions...
> > > We did already discuss this a bit on the GitHub issue: https://github.com/llvm/llvm-project/issues/64264
> > As I mentioned on the ticket, it's only this one case so I'm not worried about adding a lot more to the base header. I think we can start by assuming llvm.sqrt always works correctly, I don't want to add more complexity to handle this case without a specific reason
> > As I mentioned on the ticket, it's only this one case so I'm not worried about adding a lot more to the base header.
> 
> This is how things normally start. Someone else might want to continue this approach because it is already there.
> 
> >I think we can start by assuming llvm.sqrt always works correctly, I don't want to add more complexity to handle this case without a specific reason
> 
> Do you mean it would apply to all implementations? What I am missing here is why it is required to be in the headers? Is this because it needs to be inlined or is it because the compiler must see `__builtin_elementwise_sqrt` with the surrounding code where it is called from?
Yes, I would expect any llvm consumer to correctly lower llvm.sqrt, and such an implementation would be correctly rounded and pass conformance with -cl-fp32-correctly-rounded-divide-sqrt

The goal is to get !fpmath attached to an llvm.sqrt call. The way this currently happens is based on the language, it gets emitted from the various __builtin_sqrt* calls

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156743/new/

https://reviews.llvm.org/D156743