[PATCH] D30751: [MachineCopyForwarding] Add new pass to do register COPY forwarding at end of register allocation.
Geoff Berry via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Mar 14 14:01:15 PDT 2017
gberry added a comment.
@javed.absar The purpose of this pass is not to reduce register pressure (since it is run just after register allocation), but to allow more scheduling flexibility and, to a lesser degree, to remove some redundant COPYs. I'll elaborate on this in my response to Quentin.
As for your question about why more ARM tests aren't affected, I don't have a good answer, but my guess would be that there are simply more X86 lit test cases, both in general and among those that are sensitive to changes in register allocation.
================
Comment at: test/CodeGen/AArch64/arm64-zero-cycle-regmov.ll:7
; CHECK-LABEL: t:
-; CHECK: mov x0, [[REG1:x[0-9]+]]
-; CHECK: mov x1, [[REG2:x[0-9]+]]
+; CHECK: mov [[REG2:x[0-9]+]], x3
+; CHECK: mov [[REG1:x[0-9]+]], x2
----------------
javed.absar wrote:
> Would it be better to rewrite these as MIR tests?
I'm not sure how that would help. In this test, as in the one Hal asked about before, the newly checked 'mov's aren't new; I just needed to add them to capture the new register numbers. Here is the full diff of the generated code for this test case:
```
_t: ; @t
; BB#0: ; %entry
stp x20, x19, [sp, #-32]! ; 8-byte Folded Spill
stp x29, x30, [sp, #16] ; 8-byte Folded Spill
mov x19, x3
mov x20, x2
- mov x0, x20
- mov x1, x19
+ mov x0, x2
+ mov x1, x3
bl _foo
mov x0, x20
mov x1, x19
bl _foo
```
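To make the effect in the diff above concrete, here is a minimal Python sketch of copy forwarding over straight-line code. It is a toy model and not the implementation in D30751: the tuple encoding `(opcode, defs, uses)`, the names `forward_copies`, `CALLER_SAVED`, and `FIXED_USE_OPS`, and the rule that a call clobbers x0 through x18 are all assumptions made for illustration.

```python
# Toy model only; not the actual MachineCopyForwarding pass.
# Each instruction is (opcode, defs, uses), with registers as strings.

# Simplified assumption: a call clobbers x0..x18.
CALLER_SAVED = tuple(f"x{i}" for i in range(19))

# Assumption: call arguments are implicit uses of fixed physical
# registers, so they must not be renamed.
FIXED_USE_OPS = {"bl"}


def forward_copies(instrs):
    """Replace uses of a copied register with the copy's source when safe."""
    available = {}  # copy destination -> copy source, while both are intact
    result = []
    for op, defs, uses in instrs:
        if op not in FIXED_USE_OPS:
            uses = tuple(available.get(r, r) for r in uses)
        # A write to a register kills every copy that defines or reads it.
        clobbered = set(defs)
        available = {
            d: s for d, s in available.items()
            if d not in clobbered and s not in clobbered
        }
        # Record the new copy (after its own source has been forwarded).
        if op == "mov" and uses[0] != defs[0]:
            available[defs[0]] = uses[0]
        result.append((op, defs, uses))
    return result


code = [
    ("mov", ("x19",), ("x3",)),
    ("mov", ("x20",), ("x2",)),
    ("mov", ("x0",), ("x20",)),
    ("mov", ("x1",), ("x19",)),
    ("bl", CALLER_SAVED, ("x0", "x1")),
    ("mov", ("x0",), ("x20",)),
    ("mov", ("x1",), ("x19",)),
    ("bl", CALLER_SAVED, ("x0", "x1")),
]
out = forward_copies(code)
for op, defs, uses in out:
    if op == "mov":
        print(f"mov {defs[0]}, {uses[0]}")
    else:
        print("bl _foo")
```

In this model the first pair of argument copies is rewritten to read x2 and x3 directly, while the pair after the first call is left alone because the call has clobbered x2 and x3, which matches the shape of the diff above.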
================
Comment at: test/CodeGen/AArch64/neg-imm.ll:9
; CHECK_LABEL: test:
; CHECK_LABEL: %entry
+; CHECK: subs [[REG0:w[0-9]+]],
----------------
javed.absar wrote:
> Would it be better adding new/separate test file instead of changing the purpose of this one ?
Again, I'm not trying to change the purpose of this test. My change just caused things to be scheduled slightly differently. The test is still checking that the condition is computed by a 'subs' feeding a 'csel'. Here are the full diffs:
```
test: // @test
str x20, [sp, #-32]! // 8-byte Folded Spill
stp x19, x30, [sp, #16] // 8-byte Folded Spill
+ subs w8, w0, #1 // =1
mov w19, w0
- subs w8, w19, #1 // =1
csel w20, wzr, w8, lt
.LBB0_1: // %for.body
// =>This Inner Loop Header: Depth=1
cmp w19, w20
b.eq .LBB0_3
// BB#2: // %if.then3
// in Loop: Header=BB0_1 Depth=1
mov w0, w20
bl foo
.LBB0_3: // %for.inc
// in Loop: Header=BB0_1 Depth=1
cmp w20, w19
add w20, w20, #1 // =1
b.le .LBB0_1
// BB#4: // %for.cond.cleanup
ldp x19, x30, [sp, #16] // 8-byte Folded Reload
ldr x20, [sp], #32 // 8-byte Folded Reload
ret
```
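The scheduling change in this diff follows from a removed dependency: once the 'subs' reads w0 instead of w19, it no longer has to wait for the 'mov'. The Python sketch below illustrates that with a simple register hazard check. The helper name `depends_on`, the tuple encoding, and the modelling of the flags as a register named `nzcv` are assumptions for illustration, not how the LLVM scheduler represents dependencies.

```python
# Toy dependency check; instructions are (opcode, defs, uses).

def depends_on(later, earlier):
    """True if `later` must stay after `earlier` (RAW, WAR, or WAW hazard)."""
    _, e_defs, e_uses = earlier
    _, l_defs, l_uses = later
    raw = set(e_defs) & set(l_uses)   # later reads what earlier writes
    war = set(e_uses) & set(l_defs)   # later overwrites what earlier reads
    waw = set(e_defs) & set(l_defs)   # both write the same register
    return bool(raw or war or waw)


copy = ("mov", ("w19",), ("w0",))
subs_before = ("subs", ("w8", "nzcv"), ("w19",))  # without forwarding
subs_after = ("subs", ("w8", "nzcv"), ("w0",))    # with forwarding

print(depends_on(subs_before, copy))
print(depends_on(subs_after, copy))
```

Under these assumptions the un-forwarded 'subs' has a read-after-write dependency on the copy, and the forwarded one has none, so a scheduler is free to place it ahead of the 'mov' as seen in the new output.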
https://reviews.llvm.org/D30751