[PATCH] D20694: AMDGPU/SI: Enable load-store-opt by default.
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Thu May 26 11:23:57 PDT 2016
arsenm added inline comments.
================
Comment at: test/CodeGen/AMDGPU/fmin3.ll:15-17
@@ -14,5 +14,5 @@
define void @test_fmin3_olt_0(float addrspace(1)* %out, float addrspace(1)* %aptr, float addrspace(1)* %bptr, float addrspace(1)* %cptr) nounwind {
- %a = load float, float addrspace(1)* %aptr, align 4
- %b = load float, float addrspace(1)* %bptr, align 4
- %c = load float, float addrspace(1)* %cptr, align 4
+ %a = load volatile float, float addrspace(1)* %aptr, align 4
+ %b = load volatile float, float addrspace(1)* %bptr, align 4
+ %c = load volatile float, float addrspace(1)* %cptr, align 4
%f0 = call float @llvm.minnum.f32(float %a, float %b) nounwind readnone
----------------
cfang wrote:
> arsenm wrote:
> > Why did this test change, given that it doesn't use local memory?
> if (getOptLevel() > CodeGenOpt::None && ST.loadStoreOptEnabled()) {
> // Don't do this with no optimizations since it throws away debug info by
> // merging nonadjacent loads.
>
> // This should be run after scheduling, but before register allocation. It
> // also needs extra copies to the address operand to be eliminated.
> insertPass(&MachineSchedulerID, &SILoadStoreOptimizerID);
> insertPass(&MachineSchedulerID, &RegisterCoalescerID);
> }
>
> I think the RegisterCoalescer pass makes the difference. An ideal approach would be to add the coalescer pass only when load-store-opt actually runs, but I think there is no harm in having the additional coalescer pass here.
Oh, OK. That extra run is a workaround anyway. We should fix the pass so it doesn't depend on the scheduler and can run in SSA form.
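
For reference, here is a minimal sketch (mine, not from this patch) of the pattern SILoadStoreOptimizer actually targets: adjacent LDS accesses that can be merged, e.g. two ds_read_b32 into a single ds_read2_b32. A global-memory-only test like fmin3.ll is only affected indirectly, through the extra RegisterCoalescer run, which is presumably why its loads were made volatile to keep the CHECK lines stable. The @merge_lds_example name is made up for illustration:

; Hypothetical example: two adjacent LDS loads the pass can combine.
define void @merge_lds_example(float addrspace(1)* %out, float addrspace(3)* %lds) nounwind {
  %gep1 = getelementptr float, float addrspace(3)* %lds, i32 1
  %a = load float, float addrspace(3)* %lds, align 4   ; ds_read_b32
  %b = load float, float addrspace(3)* %gep1, align 4  ; ds_read_b32
  ; With load-store-opt enabled, these become one ds_read2_b32 offset0:0 offset1:1.
  %sum = fadd float %a, %b
  store float %sum, float addrspace(1)* %out, align 4
  ret void
}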
http://reviews.llvm.org/D20694