[PATCH] D20694: AMDGPU/SI: Enable load-store-opt by default.

Changpeng Fang via llvm-commits llvm-commits at lists.llvm.org
Thu May 26 11:22:32 PDT 2016


cfang added inline comments.

================
Comment at: test/CodeGen/AMDGPU/fmin3.ll:15-17
@@ -14,5 +14,5 @@
 define void @test_fmin3_olt_0(float addrspace(1)* %out, float addrspace(1)* %aptr, float addrspace(1)* %bptr, float addrspace(1)* %cptr) nounwind {
-  %a = load float, float addrspace(1)* %aptr, align 4
-  %b = load float, float addrspace(1)* %bptr, align 4
-  %c = load float, float addrspace(1)* %cptr, align 4
+  %a = load volatile float, float addrspace(1)* %aptr, align 4
+  %b = load volatile float, float addrspace(1)* %bptr, align 4
+  %c = load volatile float, float addrspace(1)* %cptr, align 4
   %f0 = call float @llvm.minnum.f32(float %a, float %b) nounwind readnone
----------------
arsenm wrote:
> Why did this test change as it doesn't use local memory?
This is the pass setup the patch adds after the machine scheduler:

  if (getOptLevel() > CodeGenOpt::None && ST.loadStoreOptEnabled()) {
    // Don't do this with no optimizations since it throws away debug info by
    // merging nonadjacent loads.

    // This should be run after scheduling, but before register allocation. It
    // also needs extra copies to the address operand to be eliminated.
    insertPass(&MachineSchedulerID, &SILoadStoreOptimizerID);
    insertPass(&MachineSchedulerID, &RegisterCoalescerID);
  }

I think it is the RegisterCoalescer pass that makes the difference. An ideal approach would be to add the coalescer pass only when load-store-opt actually merges something, but I think there is no harm in having this additional coalescer pass here.
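
As a side note: if we ever want to confirm that it is the extra coalescer run (and not the load-store optimizer itself) that perturbs tests like fmin3.ll, a command-line flag could gate just the second insertPass. The sketch below is hypothetical and not part of this patch; it assumes the insertions live in GCNPassConfig::addPreRegAlloc() in AMDGPUTargetMachine.cpp, and the flag name "amdgpu-extra-coalescing" / EnableExtraCoalescing are made up for illustration.

  #include "llvm/Support/CommandLine.h"

  using namespace llvm;

  // Hypothetical flag, named here only for illustration.
  static cl::opt<bool> EnableExtraCoalescing(
      "amdgpu-extra-coalescing",
      cl::desc("Run an extra RegisterCoalescer pass after the machine "
               "scheduler when the SI load-store optimizer is enabled"),
      cl::init(true));

  void GCNPassConfig::addPreRegAlloc() {
    // Subtarget lookup as in the existing code (assumed context).
    const AMDGPUSubtarget &ST = *getAMDGPUTargetMachine().getSubtargetImpl();

    if (getOptLevel() > CodeGenOpt::None && ST.loadStoreOptEnabled()) {
      // Merge loads/stores after scheduling, as in the patch.
      insertPass(&MachineSchedulerID, &SILoadStoreOptimizerID);
      // Gate only the extra coalescer run behind the hypothetical flag.
      if (EnableExtraCoalescing)
        insertPass(&MachineSchedulerID, &RegisterCoalescerID);
    }
  }

With cl::init(true) the default pipeline is unchanged, and running llc with -amdgpu-extra-coalescing=false would show whether the fmin3.ll differences come from the coalescer run alone. As for the test itself, the volatile markers presumably just pin the three loads so they cannot be merged or reordered, which keeps the CHECK lines stable under this pipeline change.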

http://reviews.llvm.org/D20694