[PATCH] D11067: AMDGPU/SI: Add VI patterns to select FLAT instructions for global memory ops

Matt Arsenault Matthew.Arsenault at amd.com
Thu Jul 9 10:05:45 PDT 2015


arsenm accepted this revision.
arsenm added a comment.
This revision is now accepted and ready to land.

LGTM


================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:271-283
@@ -264,3 +270,15 @@
   case AMDGPUAS::GLOBAL_ADDRESS:
-  case AMDGPUAS::CONSTANT_ADDRESS: // XXX - Should we assume SMRD instructions?
+    if (Subtarget->getGeneration() >= AMDGPUSubtarget::VOLCANIC_ISLANDS) {
+      // Assume the we will use FLAT for all global memory accesses
+      // on VI.
+      // FIXME: This assumption is currently wrong.  On VI we still use
+      // MUBUF instructions for the r + i addressing mode.  As currently
+      // implemented, the MUBUF instructions only work on buffer < 4GB.
+      // It may be possible to support > 4GB buffers with MUBUF instructions,
+      // by setting the stride value in the resource descriptor which would
+      // increase the size limit to (stride * 4GB).  However, this is risky,
+      // because it has never been validated.
+      return isLegalFlatAddressingMode(AM);
+    }
+    // fall-through
   case AMDGPUAS::PRIVATE_ADDRESS:
----------------
This should go under all of the cases, since the code assumes MUBUF for all of these address spaces.

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:271-283
@@ -264,3 +270,15 @@
   case AMDGPUAS::GLOBAL_ADDRESS:
-  case AMDGPUAS::CONSTANT_ADDRESS: // XXX - Should we assume SMRD instructions?
+    if (Subtarget->getGeneration() >= AMDGPUSubtarget::VOLCANIC_ISLANDS) {
+      // Assume the we will use FLAT for all global memory accesses
+      // on VI.
+      // FIXME: This assumption is currently wrong.  On VI we still use
+      // MUBUF instructions for the r + i addressing mode.  As currently
+      // implemented, the MUBUF instructions only work on buffer < 4GB.
+      // It may be possible to support > 4GB buffers with MUBUF instructions,
+      // by setting the stride value in the resource descriptor which would
+      // increase the size limit to (stride * 4GB).  However, this is risky,
+      // because it has never been validated.
+      return isLegalFlatAddressingMode(AM);
+    }
+    // fall-through
   case AMDGPUAS::PRIVATE_ADDRESS:
----------------
arsenm wrote:
> This should go under all of the cases, since the code assumes MUBUF for all of these address spaces.
I'm not sure about using this only for global, but it sort of makes sense

================
Comment at: test/CodeGen/AMDGPU/cgp-addressing-modes.ll:7-11
@@ -5,7 +6,7 @@
 
 ; OPT-LABEL: @test_sink_global_small_offset_i32(
 ; OPT-NOT: getelementptr i32, i32 addrspace(1)* %in
 ; OPT: br i1
 ; OPT: ptrtoint
 
 ; GCN-LABEL: {{^}}test_sink_global_small_offset_i32:
----------------
A test that shows an offset that is sunk on CI and not sunk on VI would be useful. I'm surprised that one of these existing test cases didn't break.


http://reviews.llvm.org/D11067







More information about the llvm-commits mailing list