[PATCH] D81780: AMDGPU/AMDHSA: Implement new target ID support in AMDGPU backend

Scott Linder via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jun 22 11:49:40 PDT 2020


scott.linder added inline comments.


================
Comment at: llvm/docs/AMDGPUUsage.rst:657
      ``ELFOSABI_AMDGPU_MESA3D``      66
-     ``ELFABIVERSION_AMDGPU_HSA``    1
+     ``ELFABIVERSION_AMDGPU_HSA``    1 or 2
      ``ELFABIVERSION_AMDGPU_PAL``    0
----------------
Why do we enumerate `ELFABIVERSION_*` if they can have multiple actual values? It seems like the only other place we mention these is to say that they describe the version of their corresponding `ELFOSABI_*`. Could we just replace those places with the literal values that are possible for each `ELFOSABI_*`?
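
For example, something like this (a sketch of the suggestion; I'm inferring the legal `EI_ABIVERSION` values for PAL and MESA3D from the current table):

```
  ``ELFOSABI_AMDGPU_HSA``       64    ``EI_ABIVERSION`` must be 1 or 2
  ``ELFOSABI_AMDGPU_PAL``       65    ``EI_ABIVERSION`` must be 0
  ``ELFOSABI_AMDGPU_MESA3D``    66    ``EI_ABIVERSION`` must be 0
```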


================
Comment at: llvm/docs/AMDGPUUsage.rst:726
 
-  .. table:: AMDGPU ELF Header ``e_flags``
-     :name: amdgpu-elf-header-e_flags-table
+  .. table:: AMDGPU ELF Header ``e_flags`` (``EI_ABIVERSION=0`` and ``EI_ABIVERSION=1``)
+     :name: amdgpu-elf-header-e_flags-table-01
----------------
Are these shared between all `ELFOSABI_*`? Other places seem to also mention the `EI_OSABI` field when specifying where these values are legal.


================
Comment at: llvm/docs/AMDGPUUsage.rst:778
+                                                  representing 3 values:
+                                                  ``EF_AMDGPU_FEATURE_XNACK_DEFAULT`` (1),
+                                                  ``EF_AMDGPU_FEATURE_XNACK_OFF`` (2),
----------------
Nit: I would prefer the values in parentheses be written in hex to be consistent with the `Value` column.


================
Comment at: llvm/include/llvm/BinaryFormat/ELF.h:718
 
   // Indicates if the "xnack" target feature is enabled for all code contained
   // in the object.
----------------
Can you update these comments to explicitly state that they are only applicable when `EI_ABIVERSION<2`?
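
Something like this, for example (a sketch of the wording I have in mind; the flag name/value are from the existing header):

```
  // Indicates if the "xnack" target feature is enabled for all code contained
  // in the object. Only applicable when EI_ABIVERSION<2.
  EF_AMDGPU_XNACK = 0x100,
```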


================
Comment at: llvm/include/llvm/BinaryFormat/ELF.h:726
+  // XNACK selection mask for EF_AMDGPU_FEATURE_XNACK_* values. Applicable
+  // to EI_OSABI=ELFOSABI_AMDGPU_HSA and EI_ABIVERSION=2.
+  EF_AMDGPU_FEATURE_XNACK = 0x300,
----------------
Same question as above: are these really only valid under the specified `EI_OSABI`, and if so, can that be reflected in AMDGPUUsage?


================
Comment at: llvm/include/llvm/BinaryFormat/ELF.h:729
+  // XNACK is any/default.
+  EF_AMDGPU_FEATURE_XNACK_DEFAULT=0x100,
+  // XNACK is off.
----------------
Is there a reason for this to be non-zero? Maybe it would make sense to have four values:

* 0b00: Not an applicable feature for the given `MACH`
* 0b01: Default
* 0b10: Off
* 0b11: On

But from my original reading of the proposal, my understanding was that we would instead use:

* 0b00: If applicable, Default. Otherwise, the only legal value for a non-applicable target feature.
* 0b01: Off
* 0b10: On

In effect the context of the `MACH` tells you whether you even need to consider the feature bits for a given feature, so differentiating the cases isn't useful. I guess we don't save any bits this way, though?

The original proposal also mentions that this "makes using a mask to check for a match easier as value & test will be 0 if exact match or matching against default", although I'm not actually clear on what that means.
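
To make the two options concrete, here is a sketch of both encodings (enumerator names follow the patch; the `UNSUPPORTED` name and all of the Option B values are mine, not from the patch or the proposal):

```
#include <cstdint>

// Option A: four values, so "not applicable" is distinguishable from
// "default".
namespace option_a {
enum : uint32_t {
  EF_AMDGPU_FEATURE_XNACK_UNSUPPORTED = 0x000, // 0b00: not applicable for MACH
  EF_AMDGPU_FEATURE_XNACK_DEFAULT     = 0x100, // 0b01: applicable, any/default
  EF_AMDGPU_FEATURE_XNACK_OFF         = 0x200, // 0b10
  EF_AMDGPU_FEATURE_XNACK_ON          = 0x300, // 0b11
};
} // namespace option_a

// Option B (my reading of the original proposal): zero does double duty as
// "default" and as the only legal value when the feature is not applicable.
namespace option_b {
enum : uint32_t {
  EF_AMDGPU_FEATURE_XNACK_DEFAULT = 0x000, // 0b00
  EF_AMDGPU_FEATURE_XNACK_OFF     = 0x100, // 0b01
  EF_AMDGPU_FEATURE_XNACK_ON      = 0x200, // 0b10
};
} // namespace option_b
```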


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:136
+  if (enableNewTargetID()) {
+    if (!isSubtargetInfoEquivalentToTargetID(getGlobalSTI(), TargetID)) {
+      report_fatal_error("Subtarget info does not match TargetID");
----------------
I think I need some help to understand what case this intends to diagnose. Is the GlobalSTI a property of the module? Where is it derived from if the subtarget is a per-function concept?


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMachineModuleInfo.cpp:46
+  for (const auto &MFE : ModuleFlags) {
+    if (MFE.Behavior != Module::MergeTargetId) {
+      continue;
----------------
It seems like the condition/assertion here should be flipped. We should identify the metadata by the key, then assert it has the required merge behavior. Is it just inverted to avoid repeated string comparisons, and because we can be certain there won't be other uses of the merge behavior for AMDGPU IR?
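
i.e., something like this (sketch; whether the key is literally "target-id" is an assumption on my part):

```
  for (const auto &MFE : ModuleFlags) {
    // Identify the flag by its key first...
    if (MFE.Key->getString() != "target-id")
      continue;
    // ...then require the merge behavior, rather than the other way around.
    assert(MFE.Behavior == Module::MergeTargetId &&
           "target-id module flag must use MergeTargetId behavior");
    // ... handle the target-id value ...
  }
```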


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMachineModuleInfo.cpp:53
+  }
+  if (TargetID.empty()) {
+    auto TargetTriple = MMI.getTarget().getTargetTriple();
----------------
Do we really want to have this "fallback"? It seems like we should just define the "target-id" metadata entry as required, and fail here instead.

I can't remember exactly, but my understanding is that the Target you get from the MMI is essentially incorrect, and is only there for legacy purposes. We have had bugs in the past when we used the MMI instead of per-function Subtargets, some of which led us to the whole TargetID proposal in the first place. Not in AMDGPU, but see http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150316/266041.html for an example.

I am worried that this behavior will just lead to more bugs. Is it here to avoid updating tests or something similar?
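
If we do make it required, I would expect roughly (sketch; the diagnostic wording is just illustrative):

```
  if (TargetID.empty())
    report_fatal_error("module is missing the required \"target-id\" module flag");
```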


================
Comment at: llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp:3764
 
-  std::string ExpectedTarget;
-  raw_string_ostream ExpectedTargetOS(ExpectedTarget);
-  IsaInfo::streamIsaVersion(&getSTI(), ExpectedTargetOS);
+  if (!enableNewTargetID()) {
+    std::string ExpectedTargetIDStr;
----------------
Nit: can this condition be inverted, i.e. with the new case handled first?
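
i.e. (sketch):

```
  if (enableNewTargetID()) {
    // New TargetID path first...
  } else {
    // ...legacy path in the else branch.
  }
```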


================
Comment at: llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp:181
+    // V3 + Original TargetID.
+    std::string ConstructedTargetIDStr;
+    raw_string_ostream ConstructedTargetIDOStr(ConstructedTargetIDStr);
----------------
Why is `TargetID` ignored here and then reconstructed in the same way, when `AMDGPUAsmParser::ParseDirectiveAMDGCNTarget` already passes it in for this case?
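
If the streamer is guaranteed to receive a matching string, I would expect something closer to this (sketch; assuming the streamer can just trust the `TargetID` the parser hands it):

```
void AMDGPUTargetAsmStreamer::EmitDirectiveAMDGCNTarget(StringRef TargetID) {
  // Emit the caller-provided TargetID verbatim instead of reconstructing it
  // from the subtarget.
  OS << "\t.amdgcn_target \"" << TargetID << "\"\n";
}
```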


================
Comment at: llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp:432
+    EFlags &= ~ELF::EF_AMDGPU_FEATURE_XNACK;
+    EFlags |= ELF::EF_AMDGPU_FEATURE_XNACK_DEFAULT;
+
----------------
Same question as above about the bit pattern for the Default case: is there a need to differentiate the non-supported-feature case from the supported-and-Default case? If not, can we just change the bit patterns of the three values and delete these bitwise ORs?
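
With a zero Default encoding (Option B from my earlier comment), the OR becomes a no-op and this pair collapses to (sketch):

```
  // Clearing the field already leaves it at the Default (0) encoding, so no
  // explicit OR of a DEFAULT value is needed.
  EFlags &= ~ELF::EF_AMDGPU_FEATURE_XNACK;
```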


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81780/new/

https://reviews.llvm.org/D81780




