[clang] [NFC][HIP] Disable device-side kernel launches for HIP (PR #171043)
via cfe-commits
cfe-commits at lists.llvm.org
Tue Dec 9 21:04:15 PST 2025
darkbuck wrote:
> > > > We added sema check @ https://github.com/llvm/llvm-project/blob/8378a6fa4f5c83298fb0b5e240bb7f254f7b1137/clang/lib/Sema/SemaCUDA.cpp#L83
> > > >
> > > > to generate an error message on HIP based on Sam's request, as HIP currently doesn't support device-side kernel calls. I don't follow how we could have `CUDAKernelCallExpr` in the device compilation. Could you elaborate in detail?
> > >
> > >
> > > The sema check doesn't work as is for `hipstdpar`, because it's gated on the current target being either a `__global__` function or a `__device__` function. What happens is that we do the parsing on a normal function, the <<<>>> expression is semantically valid, and then we try to `EmitCUDAKernelCallExpr`, because at CodeGen that is gated on whether the entire compilation is host or device, not on whether or not the caller is `__global__` or `__device__`. So either the latter check should actually establish the caller's context, or we should bypass this altogether when compiling for hipstdpar. This is the simplest NFC workaround to unbreak things.
> >
> >
> > Why not add a `getLangOpts().HIPStdPar` check in Sema to skip generating the device-side kernel call, so that we have a central place to make that decision?
>
> Because, as far as I can ascertain, the `Sema` check is insufficient / the separate assert in `EmitCUDAKernelCallExpr` is disjoint. Here's what would happen:
>
> 1. In Sema what we see is that `IsDeviceKernelCall` is false - this is fine, but we still would emit a `CudaKernelCallExpr` for the `<<<>>>` callsite, which was the case anyways before this change;
You mean that, so far, we could generate a `CudaKernelCallExpr` in the device compilation that is not a device-side kernel call? I don't follow how that could happen. Do you mean that, under hipstdpar, `<<<>>>` could be used on the device side without being treated as a device kernel call? What are the semantics of that?
> 2. Later on, when we get to `CodeGen`, we see the `CudaKernelCallExpr` and try to handle it, except now the assumption is that if we're compiling for device and we see that, it must be a device-side launch, so we go look up a non-existent symbol and run into the bug.
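For reference, the mismatch described in the two quoted steps can be sketched as a toy model. This is only an illustration of the described gating, not Clang's code: every name below (`Caller`, `Compilation`, `semaSeesDeviceLaunch`, `codeGenAssumesDeviceLaunch`, `gatesMismatch`) is hypothetical, and the model assumes the Sema decision depends only on the caller's attribute while the CodeGen decision depends only on the compilation side, as stated in the quoted explanation.

```cpp
// Toy model only: every name below is hypothetical and none of it is Clang's API.
enum class Caller { Plain, Device, Global };  // attribute on the calling function
enum class Compilation { Host, Device };      // which side is being compiled

// Sema-style gate (assumed): looks only at the caller's attribute.
constexpr bool semaSeesDeviceLaunch(Caller c) {
  return c == Caller::Device || c == Caller::Global;
}

// CodeGen-style gate (assumed): looks only at the compilation side.
constexpr bool codeGenAssumesDeviceLaunch(Compilation m) {
  return m == Compilation::Device;
}

// The problematic case: CodeGen assumes a device-side launch
// that the Sema-style gate never classified as one.
constexpr bool gatesMismatch(Caller c, Compilation m) {
  return codeGenAssumesDeviceLaunch(m) && !semaSeesDeviceLaunch(c);
}

// An unannotated caller in a device compilation is the only mismatching case here.
static_assert(gatesMismatch(Caller::Plain, Compilation::Device), "gates disagree");
static_assert(!gatesMismatch(Caller::Global, Compilation::Device), "gates agree");
static_assert(!gatesMismatch(Caller::Plain, Compilation::Host), "gates agree");
```

In this model the two gates only disagree for an unannotated caller compiled for device, which is the hipstdpar situation described in the quoted text.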
https://github.com/llvm/llvm-project/pull/171043