<div dir="ltr"><div dir="ltr"><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Mar 8, 2021 at 11:23 AM Liu, Yaxun (Sam) <<a href="mailto:Yaxun.Liu@amd.com">Yaxun.Liu@amd.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div lang="EN-US" style="overflow-wrap: break-word;">
<div class="gmail-m_-7898721503581791544WordSection1">
<p class="gmail-m_-7898721503581791544msipheader251902e5" style="margin:0in"><span style="font-size:10pt;font-family:Arial,sans-serif;color:rgb(49,113,0)">[AMD Public Use]</span><u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">The amdgpu xnack and sramecc features need to be part of the GPU arch name, the same way as for --offload-arch, e.g.<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">--offload=amdgcn-gfx906:xnack+,amdgcn-gfx906:xnack-<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">They behave like GPU arch.<u></u><u></u></p>
<p class="MsoNormal"><u></u> </p></div></div></blockquote><div><br></div><div><div class="gmail_default" style="font-family:verdana,sans-serif">It's just that it's rather unwieldy to use in practice. It's not a showstopper, but perhaps now may be a convenient point to consider the naming scheme for AMDGPU sub-compilations again.</div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif">It should be easy enough to add useful or commonly used names/aliases.</div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif">E.g. `--offload=nvidia-ampere` would be equivalent to `--offload=sm_80,sm_86`.</div><div class="gmail_default" style="font-family:verdana,sans-serif">Or `--offload=amd-navi33` -> `--offload=gfx3011:+something:-something_else`</div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif">Even for CUDA and NVIDIA GPUs that've been around for a pretty long time, I still get questions from users -- "I've got this GTX/RTX-whatever video card and can't figure out how to compile for it. What are those compute_XY and sm_YZ and which ones should I use?"</div><div class="gmail_default" style="font-family:verdana,sans-serif">I can only imagine trying to explain to someone: "You need to use gfx-XYZ<colon><dash>xnack<colon><plus>sram-ecc.... 
Oh, you must have mistyped that, let's try it again."</div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif">Perhaps we need to split the offloading machinery further.</div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif">The --offload=target option still serves the double purpose of creating a sub-compilation *and* specifying the target details, providing the initial set of parameters for the given target. It also prevents the creation of multiple sub-compilations for targets with minor differences, which may be one of the reasons that led AMDGPU to encode various features in the target name.</div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif">What if we were to modify the scheme a bit so that it handles multiple variants of the same target better?</div><div class="gmail_default" style="font-family:verdana,sans-serif">E.g.:</div><div class="gmail_default" style="font-family:verdana,sans-serif">--offload=gfx906@A,gfx906@B -- creates two sub-compilations both targeting gfx906. 
Optional @suffix makes it possible to match them independently.</div><div class="gmail_default" style="font-family:verdana,sans-serif">-Xoffload=@A --set-features=xnack+,sram-ecc-</div><div class="gmail_default" style="font-family:verdana,sans-serif">-Xoffload=@B --set-features=xnack-,sram-ecc+</div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif">Would something like this help with AMDGPU's feature handling?</div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif">--Artem</div><div class="gmail_default" style="font-family:verdana,sans-serif"><span style="font-family:Arial,Helvetica,sans-serif"> </span><br></div></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div lang="EN-US" style="overflow-wrap: break-word;"><div class="gmail-m_-7898721503581791544WordSection1"><p class="MsoNormal"><u></u></p>
<p class="MsoNormal">Sam <u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<div style="border-right:none;border-bottom:none;border-left:none;border-top:1pt solid rgb(225,225,225);padding:3pt 0in 0in">
<p class="MsoNormal"><b>From:</b> Artem Belevich <<a href="mailto:tra@google.com" target="_blank">tra@google.com</a>> <br>
<b>Sent:</b> Monday, March 8, 2021 2:01 PM<br>
<b>To:</b> Liu, Yaxun (Sam) <<a href="mailto:Yaxun.Liu@amd.com" target="_blank">Yaxun.Liu@amd.com</a>><br>
<b>Cc:</b> Doerfert, Johannes <<a href="mailto:jdoerfert@anl.gov" target="_blank">jdoerfert@anl.gov</a>>; Ben Boeckel <<a href="mailto:ben.boeckel@kitware.com" target="_blank">ben.boeckel@kitware.com</a>>; Lieberman, Ron <<a href="mailto:Ron.Lieberman@amd.com" target="_blank">Ron.Lieberman@amd.com</a>>; <a href="mailto:a.bataev@hotmail.com" target="_blank">a.bataev@hotmail.com</a>; Chan, SiuChi <<a href="mailto:siuchi.chan@amd.com" target="_blank">siuchi.chan@amd.com</a>>; Searles, Mark <<a href="mailto:Mark.Searles@amd.com" target="_blank">Mark.Searles@amd.com</a>>; cfe-dev (<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>)
<<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>>; <a href="mailto:jeffrey.sandoval@hpe.com" target="_blank">jeffrey.sandoval@hpe.com</a>; Jon Chesterfield <<a href="mailto:jonathanchesterfield@gmail.com" target="_blank">jonathanchesterfield@gmail.com</a>>; Rodgers, Gregory <<a href="mailto:Gregory.Rodgers@amd.com" target="_blank">Gregory.Rodgers@amd.com</a>><br>
<b>Subject:</b> Re: [cfe-dev] [RFC] Unified offloading option for CUDA/HIP/OpenMP<u></u><u></u></p>
</div>
</div>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">[CAUTION: External Email] <u></u><u></u></p>
<div>
<div>
<div>
<div>
<p class="MsoNormal"><span style="font-family:Verdana,sans-serif"><u></u> <u></u></span></p>
</div>
</div>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<div>
<p class="MsoNormal">On Sat, Mar 6, 2021 at 7:13 AM Liu, Yaxun (Sam) <<a href="mailto:Yaxun.Liu@amd.com" target="_blank">Yaxun.Liu@amd.com</a>> wrote:<u></u><u></u></p>
</div>
<blockquote style="border-top:none;border-right:none;border-bottom:none;border-left:1pt solid rgb(204,204,204);padding:0in 0in 0in 6pt;margin:5pt 0in 5pt 4.8pt">
<p class="MsoNormal">
We need different target triples since it may not always be possible to infer the target triple from the CPU name. So I guess it would be like:<br>
<br>
"--offload=amdgcn-gfx906,amdgcn-gfx1010"<br>
"--Xoffload=amdgcn-gfx* options common to all AMD GPUs"<br>
"--Xoffload=amdgcn-gfx906 -mcpu=gfx906 --fsomething-specific-to-gfx906"<u></u><u></u></p>
</blockquote>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<div>
<p class="MsoNormal"><span style="font-family:Verdana,sans-serif">SGTM.<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:Verdana,sans-serif">Do you expect the AMDGPU's features (+xnack, -ecc, etc) to be part of the offload target ? Or would they be specified via -Xoffload arguments?<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:Arial,sans-serif"> </span><span style="font-family:Verdana,sans-serif"><u></u><u></u></span></p>
</div>
</div>
<div>
<p class="MsoNormal"><span style="font-family:Arial,sans-serif">--Artem</span><span style="font-family:Verdana,sans-serif"><u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:Verdana,sans-serif"><u></u> <u></u></span></p>
</div>
<blockquote style="border-top:none;border-right:none;border-bottom:none;border-left:1pt solid rgb(204,204,204);padding:0in 0in 0in 6pt;margin:5pt 0in 5pt 4.8pt">
<p class="MsoNormal"><br>
Sam<br>
<br>
-----Original Message-----<br>
From: Doerfert, Johannes <<a href="mailto:jdoerfert@anl.gov" target="_blank">jdoerfert@anl.gov</a>>
<br>
Sent: Friday, March 5, 2021 1:25 PM<br>
To: Artem Belevich <<a href="mailto:tra@google.com" target="_blank">tra@google.com</a>>; Liu, Yaxun (Sam) <<a href="mailto:Yaxun.Liu@amd.com" target="_blank">Yaxun.Liu@amd.com</a>><br>
Cc: Ben Boeckel <<a href="mailto:ben.boeckel@kitware.com" target="_blank">ben.boeckel@kitware.com</a>>; Lieberman, Ron <<a href="mailto:Ron.Lieberman@amd.com" target="_blank">Ron.Lieberman@amd.com</a>>;
<a href="mailto:a.bataev@hotmail.com" target="_blank">a.bataev@hotmail.com</a>; Chan, SiuChi <<a href="mailto:siuchi.chan@amd.com" target="_blank">siuchi.chan@amd.com</a>>; Searles, Mark <<a href="mailto:Mark.Searles@amd.com" target="_blank">Mark.Searles@amd.com</a>>;
cfe-dev (<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>) <<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>>;
<a href="mailto:jeffrey.sandoval@hpe.com" target="_blank">jeffrey.sandoval@hpe.com</a>; Jon Chesterfield <<a href="mailto:jonathanchesterfield@gmail.com" target="_blank">jonathanchesterfield@gmail.com</a>>; Rodgers, Gregory <<a href="mailto:Gregory.Rodgers@amd.com" target="_blank">Gregory.Rodgers@amd.com</a>><br>
Subject: Re: [cfe-dev] [RFC] Unified offloading option for CUDA/HIP/OpenMP<br>
<br>
On 3/4/21 3:05 PM, Artem Belevich wrote:<br>
> On Thu, Mar 4, 2021 at 10:34 AM Liu, Yaxun (Sam) <<a href="mailto:Yaxun.Liu@amd.com" target="_blank">Yaxun.Liu@amd.com</a>> wrote:<br>
><br>
>> There is another aspect we need to consider: how to modify the <br>
>> -target option by additional options?<br>
>><br>
>> For the existing --offload-arch option, we could use -Xarch_ to add <br>
>> specific options for it.<br>
>><br>
> `-Xarch_xxx` as implemented right now is a rather limited hack. IIRC <br>
> it only accepts options w/o arguments which limits its usability.<br>
><br>
><br>
>> Assuming we have an -offload="amdgcn -mcpu=gfx906" option, then we <br>
>> want to add some options specific to it by an additional option, what <br>
>> should we do?<br>
>><br>
> I think we've been conflating telling the driver what to compile for <br>
> and customizing individual sub-compilations.<br>
><br>
> We could explicitly separate the two tasks. E.g.:<br>
> `--[no-]offload=target1,target2,target3...`<br>
> `--Xoffload=target_pattern target_options...`<br>
><br>
> This way your example would be handled with:<br>
> "--offload=gfx906,gfx1010"<br>
> "--Xoffload=gfx* options common to all AMD GPUs"<br>
> "--Xoffload=gfx906 -mcpu=gfx906 --fsomething-specific-to-gfx906"<br>
><br>
> In the end `-Xarch_xxx` would become an alias for '-Xoffload=xxx'.<br>
<br>
+1<br>
<br>
<br>
> --Artem<br>
><br>
><br>
><br>
><br>
>> Thanks.<br>
>><br>
>> Sam<br>
>><br>
>> -----Original Message-----<br>
>> From: Doerfert, Johannes <<a href="mailto:jdoerfert@anl.gov" target="_blank">jdoerfert@anl.gov</a>><br>
>> Sent: Thursday, February 11, 2021 12:59 PM<br>
>> To: Artem Belevich <<a href="mailto:tra@google.com" target="_blank">tra@google.com</a>>; Liu, Yaxun (Sam)
<br>
>> <<a href="mailto:Yaxun.Liu@amd.com" target="_blank">Yaxun.Liu@amd.com</a>><br>
>> Cc: Ben Boeckel <<a href="mailto:ben.boeckel@kitware.com" target="_blank">ben.boeckel@kitware.com</a>>; Lieberman, Ron <
<br>
>> <a href="mailto:Ron.Lieberman@amd.com" target="_blank">Ron.Lieberman@amd.com</a>>;
<a href="mailto:a.bataev@hotmail.com" target="_blank">a.bataev@hotmail.com</a>; Chan, SiuChi <
<br>
>> <a href="mailto:siuchi.chan@amd.com" target="_blank">siuchi.chan@amd.com</a>>; Searles, Mark <<a href="mailto:Mark.Searles@amd.com" target="_blank">Mark.Searles@amd.com</a>>; cfe-dev (<br>
>> <a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>) <<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>>;
<br>
>> <a href="mailto:jeffrey.sandoval@hpe.com" target="_blank">jeffrey.sandoval@hpe.com</a>; Jon Chesterfield
<br>
>> <<a href="mailto:jonathanchesterfield@gmail.com" target="_blank">jonathanchesterfield@gmail.com</a>><br>
>> Subject: Re: [cfe-dev] [RFC] Unified offloading option for <br>
>> CUDA/HIP/OpenMP<br>
>><br>
>> I'm OK with either.<br>
>><br>
>> On 2/11/21 11:42 AM, Artem Belevich wrote:<br>
>>> On Thu, Feb 11, 2021 at 8:30 AM Liu, Yaxun (Sam) <<a href="mailto:Yaxun.Liu@amd.com" target="_blank">Yaxun.Liu@amd.com</a>><br>
>> wrote:<br>
>>>> Sorry for the delay.<br>
>>>><br>
>>>><br>
>>>><br>
>>>> Both Johannes’ and Artem’s proposals should satisfy the needs of users:<br>
>>>><br>
>>>><br>
>>>><br>
>>>> Option 1:<br>
>>>><br>
>>>><br>
>>>><br>
>>>> `-offload=<offload-pattern> optA optB optC`.<br>
>>>><br>
>>>><br>
>>>><br>
>>>> Option 2:<br>
>>>><br>
>>>><br>
>>>><br>
>>>> `-offload=<offload-pattern>,optA,optB,optC`.<br>
>>>><br>
>>> I'm fine with #2. We're using something similar with our build tools <br>
>>> and it works reasonably well.<br>
>>> However, it does have one annoying corner case. There's no easy way <br>
>>> to pass an option which has a comma in it. E.g. if I want to pass <br>
>>> `-Wl,something,something`. Perhaps we could use a sed-like approach <br>
>>> and allow changing the separator. E.g. `s/a/b/` == `s@a@b@`.<br>
>>><br>
>>> --Artem<br>
>>><br>
>>><br>
>>><br>
>>>> Compared to the old options, they are more concise and more readable.<br>
>>>><br>
>>>><br>
>>>><br>
>>>> The main difference is the delimiter. To me option 2 is more <br>
>>>> attractive since it does not need quotations for most cases.<br>
>>>><br>
>>>><br>
>>>><br>
>>>> Can we reach an agreement on option 2?<br>
>>>><br>
>>>><br>
>>>><br>
>>>> Thanks.<br>
>>>><br>
>>>><br>
>>>><br>
>>>> Sam<br>
>>>><br>
>>>><br>
>>>><br>
>>>><br>
>>>><br>
>>>> *From:* Artem Belevich <<a href="mailto:tra@google.com" target="_blank">tra@google.com</a>><br>
>>>> *Sent:* Tuesday, December 15, 2020 2:13 PM<br>
>>>> *To:* Ben Boeckel <<a href="mailto:ben.boeckel@kitware.com" target="_blank">ben.boeckel@kitware.com</a>><br>
>>>> *Cc:* Doerfert, Johannes <<a href="mailto:jdoerfert@anl.gov" target="_blank">jdoerfert@anl.gov</a>>; Liu, Yaxun (Sam) <
<br>
>>>> <a href="mailto:Yaxun.Liu@amd.com" target="_blank">Yaxun.Liu@amd.com</a>>; Lieberman, Ron <<a href="mailto:Ron.Lieberman@amd.com" target="_blank">Ron.Lieberman@amd.com</a>>;
<br>
>>>> <a href="mailto:a.bataev@hotmail.com" target="_blank">a.bataev@hotmail.com</a>; Chan, SiuChi <<a href="mailto:siuchi.chan@amd.com" target="_blank">siuchi.chan@amd.com</a>>; Searles,
<br>
>>>> Mark < <a href="mailto:Mark.Searles@amd.com" target="_blank">Mark.Searles@amd.com</a>>; cfe-dev (<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>) <
<br>
>>>> <a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>><br>
>>>> *Subject:* Re: [cfe-dev] [RFC] Unified offloading option for <br>
>>>> CUDA/HIP/OpenMP<br>
>>>><br>
>>>><br>
>>>><br>
>>>> On Tue, Dec 15, 2020 at 10:23 AM Ben Boeckel <br>
>>>> <<a href="mailto:ben.boeckel@kitware.com" target="_blank">ben.boeckel@kitware.com</a>><br>
>>>> wrote:<br>
>>>><br>
>>>> On Mon, Dec 14, 2020 at 14:04:43 -0800, Artem Belevich via cfe-dev<br>
>> wrote:<br>
>>>>> It all may be an utter overkill, too. WDYT?<br>
>>>> Note that tools such as ccache and sccache generally need to be <br>
>>>> able to understand what's going on (I believe distcc and other <br>
>>>> distributed compilation tools also generally need to know too), so <br>
>>>> making it sensible enough for interpretation based on just the <br>
>>>> flags to be possible should be considered.<br>
>>>><br>
>>>><br>
>>>><br>
>>>> I think this is somewhat orthogonal to how we specify per-target<br>
>> options.<br>
>>>> Such a tool almost never knows about all possible compiler options <br>
>>>> and has to pass through the unknown options as-is. However, any <br>
>>>> form<br>
>> of 'nested'<br>
>>>> options specified on the command line will have a chance to confuse <br>
>>>> such a tool. E.g. if I want to pass '-E' to some sub-tool for a <br>
>>>> particular offload-target, ccache, not being aware that it's not a <br>
>>>> top-level compilation option, may interpret it as an attempt to<br>
>> preprocess the TU.<br>
>>>><br>
>>>><br>
>>>> I wonder if it would make sense to just move all this per-target <br>
>>>> option complexity into an external response file. As far as <br>
>>>> existing tools are concerned, it would look like <br>
>>>> `--offload-options=target-opts.file` without affecting the tool's <br>
>>>> general idea of what this compilation is about to do, and the external <br>
>>>> file would allow us to be as flexible as we need to be to specify <br>
>>>> per-target<br>
>> options. It could be just a flat list of pairs `-Xarch_...<br>
>>>> optA`. Or we could use YAML.<br>
>>>><br>
>>>><br>
>>>><br>
>>>> That approach, however, has its own issues and would still need to <br>
>>>> be optional. If it's the only way to specify offload options, that <br>
>>>> will complicate other use cases as now they would have to deal with <br>
>>>> temporary files.<br>
>>>><br>
>>>><br>
>>>><br>
>>>> Maybe a slightly modified variant of jdoerfert@'s idea would work<br>
>> better:<br>
>>>>>>> -offload="amd -march=gfx906 -fno-vectorize" -fopenmp<br>
>>>><br>
>>>> Implement it in a way similar to -Wl,optA,optB,optC and extend it <br>
>>>> to match an offload scope glob/regex.<br>
>>>><br>
>>>> E.g. `-offload=<offload-pattern>,optA,optB,optC`.<br>
>>>><br>
>>>> As far as the external tools are concerned, it's just one option to <br>
>>>> pass through. At the same time it should be flexible enough to apply <br>
>>>> the options to a subset of offload targets in a human-manageable way.<br>
>>>><br>
>>>><br>
>>>><br>
>>>> --<br>
>>>><br>
>>>> --Artem Belevich<br>
>>>><br>
><u></u><u></u></p>
</blockquote>
</div>
<p class="MsoNormal"><br clear="all">
<u></u><u></u></p>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<p class="MsoNormal">-- <u></u><u></u></p>
<div>
<div>
<p class="MsoNormal">--Artem Belevich<u></u><u></u></p>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr">--Artem Belevich</div></div></div>
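<div><br></div>
<div>For illustration only, here is a rough Python sketch of how a driver might interpret the `--offload=gfx906@A,gfx906@B` plus `-Xoffload=@A --set-features=...` scheme proposed above. Everything in it is an assumption: neither the `@suffix` syntax nor `--set-features` is an existing clang option, and `parse_offload` and its dictionary layout are hypothetical names chosen for the example. It also assumes each `-Xoffload=@LABEL` applies only to the single option that follows it.</div>
<pre>
```python
OFFLOAD = "--offload="
XOFFLOAD = "-Xoffload=@"
FEATURES = "--set-features="


def parse_offload(argv):
    """Map each sub-compilation label to its arch and feature list.

    Hypothetical helper: the option spellings come from the proposal in
    this thread, not from any released compiler driver.
    """
    subs = {}
    # First pass: every --offload entry creates one sub-compilation.
    for arg in argv:
        if arg.startswith(OFFLOAD):
            for spec in arg[len(OFFLOAD):].split(","):
                arch, _, label = spec.partition("@")
                key = label or arch  # unlabeled targets are keyed by arch
                if key in subs:
                    raise ValueError(f"duplicate sub-compilation: {spec}")
                subs[key] = {"arch": arch, "features": []}
    # Second pass: -Xoffload=@LABEL applies the next option to that label.
    selected = None
    for arg in argv:
        if arg.startswith(XOFFLOAD):
            selected = arg[len(XOFFLOAD):]
            if selected not in subs:
                raise ValueError(f"unknown label: @{selected}")
        elif arg.startswith(FEATURES) and selected is not None:
            subs[selected]["features"] += arg[len(FEATURES):].split(",")
            selected = None
    return subs


result = parse_offload([
    "--offload=gfx906@A,gfx906@B",
    "-Xoffload=@A", "--set-features=xnack+,sram-ecc-",
    "-Xoffload=@B", "--set-features=xnack-,sram-ecc+",
])
print(result)
```
</pre>
<div>The point of the sketch is that the label, not the arch name, is the key: both entries target gfx906, yet each keeps its own feature list, so the features no longer have to be encoded in the target name.</div>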