[PATCH] D28538: [CodeGen] [CUDA] Add the ability set default attrs on functions in linked modules.

Tue Jan 17 09:19:51 PST 2017

jlebar added inline comments.

================
Comment at: clang/lib/CodeGen/CGCall.cpp:1720
+      getLLVMContext(), llvm::AttributeSet::FunctionIndex, FuncAttrs);
+  F.addAttributes(llvm::AttributeSet::FunctionIndex, AS);
+}
----------------
hfinkel wrote:
> mehdi_amini wrote:
> > jlebar wrote:
> > > mehdi_amini wrote:
> > > > Are we sure it always work? What if a function is annotated with an attribute incompatible with the attributes you're gonna add?
> > > I have no idea -- what is going to happen?
> > > 
> > > In the CUDA use-case we make tons of assumptions about the contents of libdevice -- if we're making an additional assumption here, I'm fine with that.  Obviously we don't just want to nuke all of the attributes on the incoming function, because some of them might be derived from the IR as opposed to the codegen options.
> > > I have no idea -- what is going to happen?
> > 
> > BAD THINGS WILL HAPPEN! ;)
> > 
> > (The verifier will fail)
> I think we need to decide what happens here. We can either remove attributes we're going to set, merge them properly, or skip them. I think that we should do the following:
> 
>  1. Merge attributes properly (or just skip those already present on the function)
>  2. Add a flag to Clang, that can be used when compiling libdevice (or whatever), that causes functions to be generated with only minimal attributes (i.e. keep readnone/only for pure/const, but don't add unsafe-math flags or target info).
> 
> Also, higher-level question: Why is this bitcode-linking functionality specific to CUDA at all?
> 
> Merge attributes properly (or just skip those already present on the function)

There are 27 attributes here -- I am pretty hesitant to add merging logic for each one individually.  That logic will be dead code, with all the resulting problems (yagni, probably doesn't actually work, etc.).

Either skipping attrs already present on the function or explicitly overwriting them seems to be the Simplest Thing That Could Possibly Work, so if it's OK with you all, I'd prefer to do one of those two.

Looking through the list of attrs, many of them are things you'd want to overwrite-if-present rather than skip-if-present, at least in my use-case (which is the only one we have atm).  And indeed, per below, using overwrite-if-present gets us out of adding additional complexity and dead code to clang, so my inclination is to go towards that.  

If there are a few attrs here that we *don't* want to overwrite-if-present, maybe we could just move them out of this function entirely and not set them when linking.  All I actually care about are the math flags and the CUDA flags; I just thought it was a cleaner API to say that we set all these attrs than to cherrypick.

> Add a flag to Clang, that can be used when compiling libdevice (or whatever), that causes functions to be generated with only minimal attributes (i.e. keep readnone/only for pure/const, but don't add unsafe-math flags or target info).

We do not compile libdevice, so this will again be dead code.  Do we have a user in mind for this?

To me the Simplest Thing would be to overwrite-if-present, per above.  Then we don't need to add any dead code or additional flags.  I also think that's probably what you want when you're doing libdevice linking, which is our only usecase at the moment.

> Why is this bitcode-linking functionality specific to CUDA at all?

What in the patch makes you think it is?  Is the issue that we can get this functionality only via -mlink-cuda-bitcode -- -mlink-bitcode does something different?

Again we could "fix" this by coming up with some "link spec" you pass to -mlink-bitcode along with the bitcode file, but that'd be added complexity for little gain, I think?

https://reviews.llvm.org/D28538