[PATCH] D20694: AMDGPU/SI: Enable load-store-opt by default.
Changpeng Fang via llvm-commits
llvm-commits at lists.llvm.org
Thu May 26 11:22:32 PDT 2016
cfang added inline comments.
================
Comment at: test/CodeGen/AMDGPU/fmin3.ll:15-17
@@ -14,5 +14,5 @@
define void @test_fmin3_olt_0(float addrspace(1)* %out, float addrspace(1)* %aptr, float addrspace(1)* %bptr, float addrspace(1)* %cptr) nounwind {
- %a = load float, float addrspace(1)* %aptr, align 4
- %b = load float, float addrspace(1)* %bptr, align 4
- %c = load float, float addrspace(1)* %cptr, align 4
+ %a = load volatile float, float addrspace(1)* %aptr, align 4
+ %b = load volatile float, float addrspace(1)* %bptr, align 4
+ %c = load volatile float, float addrspace(1)* %cptr, align 4
%f0 = call float @llvm.minnum.f32(float %a, float %b) nounwind readnone
----------------
arsenm wrote:
> Why did this test change as it doesn't use local memory?
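For context: as of this patch, SILoadStoreOptimizer itself only combines local-memory (DS) accesses, so it should not touch this test directly. A minimal sketch of the kind of input it does target (hypothetical function, not from the patch, assuming two adjacent LDS reads that can become a single ds_read2_b32 after instruction selection):

define void @merge_lds_reads(float addrspace(3)* %lds, float addrspace(1)* %out) {
  ; Hypothetical example: two 4-byte loads at adjacent LDS offsets, which
  ; the load-store optimizer can combine into one ds_read2_b32.
  %p0 = getelementptr float, float addrspace(3)* %lds, i32 0
  %p1 = getelementptr float, float addrspace(3)* %lds, i32 1
  %v0 = load float, float addrspace(3)* %p0, align 4
  %v1 = load float, float addrspace(3)* %p1, align 4
  %sum = fadd float %v0, %v1
  store float %sum, float addrspace(1)* %out
  ret void
}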
What does change this test is this part of the pass setup:

if (getOptLevel() > CodeGenOpt::None && ST.loadStoreOptEnabled()) {
  // Don't do this with no optimizations since it throws away debug info by
  // merging nonadjacent loads.
  // This should be run after scheduling, but before register allocation. It
  // also needs extra copies to the address operand to be eliminated.
  insertPass(&MachineSchedulerID, &SILoadStoreOptimizerID);
  insertPass(&MachineSchedulerID, &RegisterCoalescerID);
}
I think it is the RegisterCoalescer pass that makes the difference. Ideally we would add the coalescer pass only when the load-store optimizer actually fires, but I think there is no harm in having this additional coalescer run here.
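For reference, the volatile markers in the test update above are the usual way to keep such loads stable under pipeline changes; a minimal sketch (hypothetical function, not from the patch):

define float @keep_loads_separate(float addrspace(1)* %p, float addrspace(1)* %q) {
  ; Volatile loads may not be merged or reordered relative to each other,
  ; so the test's CHECK lines keep matching one instruction per load.
  %a = load volatile float, float addrspace(1)* %p, align 4
  %b = load volatile float, float addrspace(1)* %q, align 4
  %s = fadd float %a, %b
  ret float %s
}

Marking the loads volatile pins each access in place without changing what the test is actually checking.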
http://reviews.llvm.org/D20694