[PATCH] D92483: AMDGPU - Use MUBUF instructions for global address space access

Tony Tye via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Dec 23 22:51:30 PST 2020


t-tye added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp:92
 
   if (isAmdHsaOS()) // Turn on FlatForGlobal for HSA.
     FullFS += "+flat-for-global,+unaligned-access-mode,+trap-handler,";
----------------
Suggest:

  // Turn on features that HSA ABI requires. Also turn on FlatForGlobal by
  // default.
  if (isAmdHsaOS())



================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp:97-105
   // Disable mutually exclusive bits.
   if (FS.find_lower("+wavefrontsize") != StringRef::npos) {
     if (FS.find_lower("wavefrontsize16") == StringRef::npos)
       FullFS += "-wavefrontsize16,";
     if (FS.find_lower("wavefrontsize32") == StringRef::npos)
       FullFS += "-wavefrontsize32,";
     if (FS.find_lower("wavefrontsize64") == StringRef::npos)
----------------
I do not understand how this is supposed to work. The code does not seem to match the comment. The parser seems to convert all these features into a single WavefrontSizeLog2 so last + one would win. So why is this even needed?


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp:111-113
+  // Implement the "generic" processors, which acts as the default when no
+  // generation features are enabled (e.g for -mcpu=''). There are two variants 
+  // one for hsa and one for everything else.
----------------
Suggest:

  // Implement the "generic" processors, which acts as the default when no
  // generation features are enabled (e.g for -mcpu=''). HSA OS defaults to the
  // first amdgcn target that supports flat addressing. Other OSes default to
  // the first amdgcn target.



================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp:122-132
   // Unless +-flat-for-global is specified, turn on FlatForGlobal for all OS-es
   // on VI and newer hardware to avoid assertion failures due to missing ADDR64
   // variants of MUBUF instructions.
   if (!hasAddr64() && !FS.contains("flat-for-global")) {
     FlatForGlobal = true;
   }
 
----------------
Suggest:

  // Targets must either support 64-bit offsets for MUBUF instructions, and/or
  // support flat operations, otherwise they cannot access a 64-bit global
  // address space.
  assert(hasAddr64() || hasFlat());
  // Unless +-flat-for-global is specified, turn on FlatForGlobal for targets
  // that do not support ADDR64 variants of MUBUF instructions. Such targets
  // cannot use a 64 bit offset with a MUBUF instruction to access the global
  // address space.
  if (!hasAddr64() && !FS.contains("flat-for-global") && !FlatForGlobal) {
    ToggleFeature(AMDGPU::FeatureFlatForGlobal);
    FlatForGlobal = true;
  }
  // Unless +-flat-for-global is specified, use MUBUF instructions for global
  // address space access if flat operations are not available.
  if (!hasFlat() && !FS.contains("flat-for-global") && FlatForGlobal) {
    ToggleFeature(AMDGPU::FeatureFlatForGlobal);
    FlatForGlobal = false;
  }

Also add the following to llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h since that is the way other similar predicates are defined:

  bool hasFlat() const {
    return (getGeneration() > AMDGPUSubtarget::SOUTHERN_ISLANDS);
  }



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D92483/new/

https://reviews.llvm.org/D92483



More information about the llvm-commits mailing list