[PATCH] D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given

Mon May 23 15:25:24 PDT 2022

tra added a comment.

In D125904#3532545 <https://reviews.llvm.org/D125904#3532545>, @jhuber6 wrote:

> clang a.c -c -o a-0.o // Has some externalized static variable.
> clang a.c -c -o a-1.o
> clang a-0.o a-1.o // Redefined symbol error

Ah. OK. This is a bit less of a concern. As long as we take compiler options into account it should work.

There are use cases when the same file is recompiled with different -DFOO macros. And that *is* supported by NVCC and, I think it is (or used to be?) broken in clang.
I think we currently end up renaming the source files for each compilation variant.

>> The fact that NVCC didn't always generate the same output **was** an issue when we were using it for CUDA compilation.
>> In general, "not supported by NVCC" is not quite applicable here, IMO. The goal here is to make clang work correctly.
>
> I feel like linking a file with itself is pretty uncommon, but in order to support that we'd definitely need the CUID method so we can pass it to both the host and device. I'm personally fine with this and the CUID living together so if for whatever reason there's a symbol clash, the user can specify a CUID to make it go away.

I agree that compiling and linking the same file is probably not very common and I can't think of practical use case for it. 
That said, I would consider compiling the same source with different preprocessor options to be a legitimate use case that we should support. 
Explicitly passing cuid would work as a workaround in those cases, so it's not a major issue if we can't make it work out of the box without explicit cuid.

> We also discussed the problem of non-static source trees which neither this nor the current CUID would solve. As far as I can tell, this method would work fine for like 99.99% of codes, but getting that last 0.01% would require something like generating a UUID for each compilation job, which requires intervention from the driver to set up offloading compilation properly. So I'm not sure if it's the best trade-off.

Acknowledged. Creating globally-unique, yet consistent across all sub-compilations  ID based only the info available to individual subcompilation is probably hard-to-impossible to do.

================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:6845
+    }
+    OS << llvm::format("%x", ID.getFile()) << llvm::format("%x", ID.getDevice())
+       << PLoc.getLine();
----------------
Considering that the file name may have arbitrary symbols in it, that may result in a symbol name that we can't really use.
I'd print a hash of the (finename+device+line) -- that would guarantee that we know there are no funky characters in the suffix.

I'm also not sure if the line number makes any difference here. There's no need to differentiate between symbols within the same TU within the same compilation, only across different compilations and for that filename+device should be sufficient.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125904/new/

https://reviews.llvm.org/D125904