[PATCH] D80750: llvm-link: Add module flag behavior MergeTargetID

Tue Jul 28 15:15:43 PDT 2020

jdoerfert added a comment.

I believe there are three things in this patch, but feel free to correct me:

1. A way to specify a target triple + cpu. Basically like `target triple = ...` but in the module metadata plus some additional target cpu suffix, which is so far in the `target-cpu` function attribute list.
2. A way to specify global target features, which are so far in the `target-features` function attribute list.
3. Making llvm-link aware of 1) and 2) and verifying they match (under some rules).

If this is the case, what is the benefit over a toplevel module entry that allows you to specify `target-cpu` and `target-feature` for the entire module? I mean, they seem to be very much the same thing as `target triple`, yet we go a totally different route to add them and verify the match during linking. I believe other people might benefit from this, e.g., to get rid of function-level attributes, so we should shoot for a generic solution.

================
Comment at: llvm/docs/LangRef.rst:6493
+           is in both source and destination target ID's but with different
+           signs, an error of conflict module flags will be emitted.
+
----------------
yaxunl wrote:
> jdoerfert wrote:
> > yaxunl wrote:
> > > jdoerfert wrote:
> > > > yaxunl wrote:
> > > > > jdoerfert wrote:
> > > > > > This sounds an awful lot like mismatches in regular target triples. Do we really need a new mechanism and wording here, and if so, couldn't we restrict it to the features? I mean, there is a target triple already in the module, right?
> > > > > Yes. It is a design decision made by AMD after thorough internal discussions.
> > > > > 
> > > > > We need to have an efficient way to identify device binaries embedded in a host executable for single source languages e.g. HIP. There are multiple device binaries embedded. The device binaries are not just per processor, they are per processor/feature combination. We do not want to encode features in GPU names since it incurs combination explosion. In stead, we need to use processor:feature1+:feature2+ (so called target ID) to identify a device binary.
> > > > > 
> > > > > The target ID is a real ID to identify a device binary. It is specified by user to clang and will be passed to backend to embed in device binary to be used by runtime. Since it is per module, it needs to be represented as a module flag. And since modules with different target ID may be linked together, they need to be checked to ensure compatibility and merged if necessary.
> > > > > 
> > > > > Checking target feature directly is not suitable here since target feature is per function. Also not all target features are part of target ID.
> > > > > 
> > > > > Currently target ID is only supported by AMDGPU target. It is NFC for other targets. However it can be adopted by other targets easily. 
> > > > > 
> > > > > Yes. It is a design decision made by AMD after thorough internal discussions.
> > > > 
> > > > Given that this is a problem for various people and languages, maybe such a discussion should happen in the open such that we implement a solution which can be used by HIP, OPENMP, CUDA, SYCL, ...
> > > > 
> > > > As an alternative design, I have (for a while now) a prototype to allow multi-target LLVM-IR modules. It seems that would solve your problem as well but (IMHO) closer aligned to what we use "usually" to define the target (namely the target string in the module).
> > > > 
> > > > I'll share my prototype this week and start a discussion. Feel free to let me know beforehand what you think.
> > > Have you shared your prototype? We would like to evaluate using it in place of the module flag. Thanks.
> > Sorry for the wait, this was not good of me.
> > 
> > I wrote the email I was postponing for months: http://lists.llvm.org/pipermail/llvm-dev/2020-July/143808.html
> > And you can see one way of having different targets in one module prototyped in D84728.
> > 
> > Please let me know what you think.
> Thanks for sharing the info.
> 
> Overall I support the idea of heterogeneous LLVM IR and I think this is the right direction. I would like to suggest to make global values not per triple but per triple-processor, since the IR of a function depends on processor in general. Another issue is that the name of global value may be different for different triple, e.g. when compiling on Windows with MSVC, the host IR and device IR may use different name mangling scheme.
> 
> I can see it will take considerable efforts and time to adopt the heterogeneous IR in the compiler pipe line. I don't think it is feasible to defer all the current feature development which are potentially rely on heterogeneous IR. A feasible approach is that how to transit this feature to heterogeneous IR when time comes.
> 
> My patch introduced a target-id module flag, which is actually a generalization of triple-processor module flag. We need this since we need it in llvm codegen and also we want to check compatibility of modules when linking.
> 
> The current implementation does not consider heterogeneous IR. From what I see, we still need a module flag to convey the information about what triple-processors this heterogeneous IR is compiled for. The difference is that now it becomes a list of triple-processors instead of just one triple-processor. At least this will be true for amdgpu target. Since this module flag is optional, a target can choose to use it or not, so it will have no impact on other targets.
> 
> So what I need to do is to make it future proof. Basically instead of assuming each module has one target ID, assuming each module can have a list of target ID's. 
> 
Could you also respond on the list. The feedback is very valuable and has more reach there :)

I am not (trying to) blocking this patch but I still doubt it is the right direction. We have too many levels of triple and features and this is yet another one which will only solve a particular problem you are having right now (as far as I can tell).

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80750/new/

https://reviews.llvm.org/D80750