[PATCH] D81780: AMDGPU/AMDHSA: Implement new target ID support in AMDGPU backend

Tue Aug 11 08:11:44 PDT 2020

kzhuravl marked 8 inline comments as done.
kzhuravl added inline comments.

================
Comment at: llvm/docs/AMDGPUUsage.rst:657
      ``ELFOSABI_AMDGPU_MESA3D``      66
-     ``ELFABIVERSION_AMDGPU_HSA``    1
+     ``ELFABIVERSION_AMDGPU_HSA``    1 or 2
      ``ELFABIVERSION_AMDGPU_PAL``    0
----------------
scott.linder wrote:
> Why do we enumerate `ELFABIVERSION_*` if they can have multiple actual values? It seems like the only other place we mention these is to say that they describe the version of their corresponding `ELFOSABI_*`. Could we just replace those places with the literal values that are possible for each `ELFOSABI_*`?
doc update in separate review.

================
Comment at: llvm/docs/AMDGPUUsage.rst:726

-  .. table:: AMDGPU ELF Header ``e_flags``
-     :name: amdgpu-elf-header-e_flags-table
+  .. table:: AMDGPU ELF Header ``e_flags`` (``EI_ABIVERSION=0`` and ``EI_ABIVERSION=1``)
+     :name: amdgpu-elf-header-e_flags-table-01
----------------
scott.linder wrote:
> Are these shared between all `ELFOSABI_*`? Other places seem to also mention the `EI_OSABI` field when specifying where these values are legal.
doc update in separate review.

================
Comment at: llvm/docs/AMDGPUUsage.rst:778
+                                                  representing 3 values:
+                                                  ``EF_AMDGPU_FEATURE_XNACK_DEFAULT`` (1),
+                                                  ``EF_AMDGPU_FEATURE_XNACK_OFF`` (2),
----------------
scott.linder wrote:
> Nit: I would prefer the value in parenthesis be written in hex to be consistent with the `Value` column.
doc update in separate review.

================
Comment at: llvm/include/llvm/BinaryFormat/ELF.h:726
+  // XNACK selection mask for EF_AMDGPU_FEATURE_XNACK_* values. Applicable
+  // to EI_OSABI=ELFOSABI_AMDGPU_HSA and EI_ABIVERSION=2.
+  EF_AMDGPU_FEATURE_XNACK = 0x300,
----------------
scott.linder wrote:
> Same question as above; are these really only under the specified `EI_OSABI` and if so can that be reflected in AMDGPUUsage?
yes.

================
Comment at: llvm/include/llvm/BinaryFormat/ELF.h:729
+  // XNACK is any/default.
+  EF_AMDGPU_FEATURE_XNACK_DEFAULT=0x100,
+  // XNACK is off.
----------------
scott.linder wrote:
> Is there a reason for this to be non-zero? Maybe it would make sense to have four values:
> 
> * 0b00: Not an applicable feature for the given `MACH`
> * 0b01: Default
> * 0b10: Off
> * 0b11: On
> 
> But from my original reading of the proposal my understanding was we would instead use:
> 
> * 0b00: If applicable, Default. Otherwise, the only legal value for a non-applicable target feature.
> * 0b01: Off
> * 0b10: On
> 
> In effect the context of the `ARCH` tells you if you even need to consider the feature bits for a given feature, so differentiating the case isn't useful. I guess we don't save any bits this way though?
> 
> The original proposal also mentions "makes using a mask to check for a match easier as value & test will be 0 if exact match or matching against default", although I'm not actually clear what it means to say.
discussed offline. docs have been updated to match this implementation.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMachineModuleInfo.cpp:46
+  for (const auto &MFE : ModuleFlags) {
+    if (MFE.Behavior != Module::MergeTargetId) {
+      continue;
----------------
scott.linder wrote:
> It seems like the condition/assertion here should be flipped. We should identify the metadata by the key, then assert it has the required merge behavior. Is it just inverted to avoid repeated string comparisons, and because we can be certain there won't be other uses of the merge behavior for AMDGPU IR?
yes, avoiding string comparisons. also implementation of MergeTargetID implies there won't be other uses of the merge behavior.

================
Comment at: llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp:432
+    EFlags &= ~ELF::EF_AMDGPU_FEATURE_XNACK;
+    EFlags |= ELF::EF_AMDGPU_FEATURE_XNACK_DEFAULT;
+
----------------
scott.linder wrote:
> Same question as above about the bit-pattern for the Default case; is there a need to differentiate the non-supported-feature case from the supported-and-Default case? If not, can we just change the bitpatterns of the three values and delete these bitwise-ORs?
no, according to latest docs.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81780/new/

https://reviews.llvm.org/D81780