[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

Wed Sep 22 12:38:09 PDT 2021

tra added a comment.

In D110089#3016145 <https://reviews.llvm.org/D110089#3016145>, @jlebar wrote:

>> Depending on which particular operation is used, the arguments vary, too.
>
> So something like
>
>   T __nv_tex_surf_handler(name, arg1) {
>     switch (name) {
>       ...
>       default:
>         panic();
>     }
>   }
>   
>   T __nv_tex_surf_handler(name, arg1, arg2) {
>     switch(...) { ... }
>   }
>
> and so on?

Yes, and there will be multiple such overloads for each `name`. So the switch will have to be replicated/adjusted in each overload.

> If it didn't buy us anything, I'd agree.  The thing I'm concerned about is readability of this code.  Which, if we want to tie it back to users, affects our ability to catch bugs in this implementation.
>
>> Having one trivial compiler builtin that simplifies things a lot is a better trade-off, IMO.
>
> Ah, maybe I wasn't clear then.  I'm not actually super-concerned with the compiler builtin.  It'd be nice to get rid of it if there's a clean way to do so, but if we don't, that's ok.  Basically, the builtin is just for changing `strcmp(x, "foo")` into `builtin(x) == builtin("foo")`.  Fine.
>
> What I'm more concerned with is the spaghetti of macros here to do something as simple as a series of overloaded functions.  It seems like a premature optimization, and I don't feel confident I can check it for bugs.

The choice is between using macros to generate the boilerplate vs replicating things manually. If we could get templates to generaste inline asm operands, that would be great. Unfortunately it requires using literals, so the macros are the only way to construct the right instruction. 
If I do not do that with macros, I'll have to manually write each instruction variant and get it right every time.

I can preprocess the macros and commit the results. That would be about an order of magnitude more code than what we have now. That would be harder to change en masse without errors. E.g. try spotting the differences between any two neighboring functions there: https://gist.github.com/Artem-B/ec4290809650f5092d61d6dafa6b0131

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110089/new/

https://reviews.llvm.org/D110089