[llvm] 9966021 - [AArch64][GlobalISel] When generating SUBS for compares, don't write to wzr/xzr.

Amara Emerson via llvm-commits llvm-commits at lists.llvm.org
Tue May 26 15:07:12 PDT 2020



> On May 26, 2020, at 9:41 AM, Philip Reames <listmail at philipreames.com> wrote:
> 
> Going purely off the submission comment, have we explored fixing the tail merge logic to understand the semantics of a zero register? It really seems like the previous codegen was better.
> 
> (This is well out of my area, so if I'm wrong, just say so.)
No, that’s a fair point. I was just trying to close the performance deficit against SelectionDAG here. I didn’t dig much into why a physreg write seemed to block the tail merge optimization (the codegen difference is sketched below); if I have time in the coming weeks I’ll see if I can look at it more closely.
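
Roughly, the difference in the selected MIR looks like this (operand and vreg names are placeholders; the actual test updates in the commit below use numbered vregs):

  ; before this patch: the compare writes its unused result to the zero register
  $wzr = SUBSWrr %lhs, %rhs, implicit-def $nzcv
  ; after this patch: the unused result goes into a fresh virtual register
  %tmp:gpr32 = SUBSWrr %lhs, %rhs, implicit-def $nzcv

In both forms only the $nzcv flags are consumed.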

In terms of whether the previous codegen was better: in theory it doesn’t really matter, since regalloc is free to allocate any register to satisfy the regclass (see the sketch below).
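
As a sketch of that point (register and operand names here are illustrative, not taken from the commit), a selected compare such as

  %tmp:gpr32 = SUBSWrr %lhs, %rhs, implicit-def $nzcv

can be assigned any register in the class by the allocator, for example

  dead $w8 = SUBSWrr $w0, $w1, implicit-def $nzcv

with the def marked dead, since only $nzcv is read by the compare’s users.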

Thanks,
Amara
> 
> Philip
> 
> On 5/23/20 11:29 PM, Amara Emerson via llvm-commits wrote:
>> Author: Amara Emerson
>> Date: 2020-05-23T22:59:49-07:00
>> New Revision: 99660217e930c131407766f68024f76acc375597
>> 
>> URL: https://github.com/llvm/llvm-project/commit/99660217e930c131407766f68024f76acc375597
>> DIFF: https://github.com/llvm/llvm-project/commit/99660217e930c131407766f68024f76acc375597.diff
>> 
>> LOG: [AArch64][GlobalISel] When generating SUBS for compares, don't write to wzr/xzr.
>> 
>> Although writing to wzr/xzr is correct, since we only care about the flags
>> of the sub and not its result, doing so causes tail merging of blocks to fail.
>> 
>> Writing to an unused virtual register instead allows the optimization to fire,
>> improving performance significantly on 256.bzip2.
>> 
>> Differential Revision: https://reviews.llvm.org/D80460
>> 
>> Added:
>>     
>> Modified:
>>     llvm/lib/Target/AArch64/AArch64InstructionSelector.cpp
>>     llvm/test/CodeGen/AArch64/GlobalISel/opt-fold-compare.mir
>>     llvm/test/CodeGen/AArch64/GlobalISel/select-arith-immed-compare.mir
>>     llvm/test/CodeGen/AArch64/GlobalISel/select-cmp.mir
>>     llvm/test/CodeGen/AArch64/GlobalISel/select.mir
>>     llvm/test/CodeGen/AArch64/GlobalISel/tbnz-slt.mir
>>     llvm/test/CodeGen/AArch64/GlobalISel/tbz-sgt.mir
>> 
>> Removed:
>>     
>> 
>> ################################################################################
>> diff  --git a/llvm/lib/Target/AArch64/AArch64InstructionSelector.cpp b/llvm/lib/Target/AArch64/AArch64InstructionSelector.cpp
>> index abcbdd3b9684..5789d1d2531c 100644
>> --- a/llvm/lib/Target/AArch64/AArch64InstructionSelector.cpp
>> +++ b/llvm/lib/Target/AArch64/AArch64InstructionSelector.cpp
>> @@ -3699,10 +3699,10 @@ AArch64InstructionSelector::emitIntegerCompare(
>>           "Expected scalar or pointer");
>>    if (CmpTy == LLT::scalar(32)) {
>>      CmpOpc = AArch64::SUBSWrr;
>> -    ZReg = AArch64::WZR;
>> +    ZReg = MRI.createVirtualRegister(&AArch64::GPR32RegClass);
>>    } else if (CmpTy == LLT::scalar(64) || CmpTy.isPointer()) {
>>      CmpOpc = AArch64::SUBSXrr;
>> -    ZReg = AArch64::XZR;
>> +    ZReg = MRI.createVirtualRegister(&AArch64::GPR64RegClass);
>>    } else {
>>      return {nullptr, CmpInst::Predicate::BAD_ICMP_PREDICATE};
>>    }
>> 
>> diff  --git a/llvm/test/CodeGen/AArch64/GlobalISel/opt-fold-compare.mir b/llvm/test/CodeGen/AArch64/GlobalISel/opt-fold-compare.mir
>> index d6bc336323e9..26e4e8c7a8a6 100644
>> --- a/llvm/test/CodeGen/AArch64/GlobalISel/opt-fold-compare.mir
>> +++ b/llvm/test/CodeGen/AArch64/GlobalISel/opt-fold-compare.mir
>> @@ -111,7 +111,7 @@ body:             |
>>      ; CHECK: [[COPY2:%[0-9]+]]:gpr32 = COPY $wzr
>>      ; CHECK: [[MOVi32imm:%[0-9]+]]:gpr32 = MOVi32imm 1
>>      ; CHECK: [[SUBSWrr:%[0-9]+]]:gpr32 = SUBSWrr [[COPY2]], [[COPY1]], implicit-def $nzcv
>> -    ; CHECK: $wzr = SUBSWrr [[COPY]], [[SUBSWrr]], implicit-def $nzcv
>> +    ; CHECK: [[SUBSWrr1:%[0-9]+]]:gpr32 = SUBSWrr [[COPY]], [[SUBSWrr]], implicit-def $nzcv
>>      ; CHECK: [[CSELWr:%[0-9]+]]:gpr32 = CSELWr [[MOVi32imm]], [[COPY2]], 11, implicit $nzcv
>>      ; CHECK: $w0 = COPY [[CSELWr]]
>>      ; CHECK: RET_ReallyLR implicit $w0
>> @@ -144,7 +144,7 @@ body:             |
>>      ; CHECK: [[COPY2:%[0-9]+]]:gpr32 = COPY $wzr
>>      ; CHECK: [[MOVi32imm:%[0-9]+]]:gpr32 = MOVi32imm 1
>>      ; CHECK: [[SUBSWrr:%[0-9]+]]:gpr32 = SUBSWrr [[COPY2]], [[COPY]], implicit-def $nzcv
>> -    ; CHECK: $wzr = SUBSWrr [[SUBSWrr]], [[COPY1]], implicit-def $nzcv
>> +    ; CHECK: [[SUBSWrr1:%[0-9]+]]:gpr32 = SUBSWrr [[SUBSWrr]], [[COPY1]], implicit-def $nzcv
>>      ; CHECK: [[CSELWr:%[0-9]+]]:gpr32 = CSELWr [[MOVi32imm]], [[COPY2]], 11, implicit $nzcv
>>      ; CHECK: $w0 = COPY [[CSELWr]]
>>      ; CHECK: RET_ReallyLR implicit $w0
>> @@ -244,7 +244,7 @@ body:             |
>>      ; CHECK: [[MOVi32imm:%[0-9]+]]:gpr32 = MOVi32imm 1
>>      ; CHECK: [[SUBREG_TO_REG:%[0-9]+]]:gpr64 = SUBREG_TO_REG 0, [[MOVi32imm]], %subreg.sub_32
>>      ; CHECK: [[SUBSXrr:%[0-9]+]]:gpr64 = SUBSXrr [[COPY2]], [[COPY1]], implicit-def $nzcv
>> -    ; CHECK: $xzr = SUBSXrr [[COPY]], [[SUBSXrr]], implicit-def $nzcv
>> +    ; CHECK: [[SUBSXrr1:%[0-9]+]]:gpr64 = SUBSXrr [[COPY]], [[SUBSXrr]], implicit-def $nzcv
>>      ; CHECK: [[CSELXr:%[0-9]+]]:gpr64 = CSELXr [[SUBREG_TO_REG]], [[COPY2]], 11, implicit $nzcv
>>      ; CHECK: $x0 = COPY [[CSELXr]]
>>      ; CHECK: RET_ReallyLR implicit $x0
>> @@ -278,7 +278,7 @@ body:             |
>>      ; CHECK: [[MOVi32imm:%[0-9]+]]:gpr32 = MOVi32imm 1
>>      ; CHECK: [[SUBREG_TO_REG:%[0-9]+]]:gpr64 = SUBREG_TO_REG 0, [[MOVi32imm]], %subreg.sub_32
>>      ; CHECK: [[SUBSXrr:%[0-9]+]]:gpr64 = SUBSXrr [[COPY2]], [[COPY]], implicit-def $nzcv
>> -    ; CHECK: $xzr = SUBSXrr [[SUBSXrr]], [[COPY1]], implicit-def $nzcv
>> +    ; CHECK: [[SUBSXrr1:%[0-9]+]]:gpr64 = SUBSXrr [[SUBSXrr]], [[COPY1]], implicit-def $nzcv
>>      ; CHECK: [[CSELXr:%[0-9]+]]:gpr64 = CSELXr [[SUBREG_TO_REG]], [[COPY2]], 11, implicit $nzcv
>>      ; CHECK: $x0 = COPY [[CSELXr]]
>>      ; CHECK: RET_ReallyLR implicit $x0
>> @@ -498,7 +498,7 @@ body:             |
>>      ; CHECK: liveins: $x0, $x1
>>      ; CHECK: [[COPY:%[0-9]+]]:gpr64 = COPY $x0
>>      ; CHECK: [[COPY1:%[0-9]+]]:gpr64 = COPY $x1
>> -    ; CHECK: $xzr = SUBSXrr [[COPY]], [[COPY1]], implicit-def $nzcv
>> +    ; CHECK: [[SUBSXrr:%[0-9]+]]:gpr64 = SUBSXrr [[COPY]], [[COPY1]], implicit-def $nzcv
>>      ; CHECK: [[CSINCWr:%[0-9]+]]:gpr32 = CSINCWr $wzr, $wzr, 1, implicit $nzcv
>>      ; CHECK: $w0 = COPY [[CSINCWr]]
>>      ; CHECK: RET_ReallyLR implicit $x0
>> 
>> diff  --git a/llvm/test/CodeGen/AArch64/GlobalISel/select-arith-immed-compare.mir b/llvm/test/CodeGen/AArch64/GlobalISel/select-arith-immed-compare.mir
>> index 91780e2601fe..59fcbd09c4c1 100644
>> --- a/llvm/test/CodeGen/AArch64/GlobalISel/select-arith-immed-compare.mir
>> +++ b/llvm/test/CodeGen/AArch64/GlobalISel/select-arith-immed-compare.mir
>> @@ -462,7 +462,7 @@ body:             |
>>      ; CHECK: liveins: $w0
>>      ; CHECK: [[COPY:%[0-9]+]]:gpr32 = COPY $w0
>>      ; CHECK: [[MOVi32imm:%[0-9]+]]:gpr32 = MOVi32imm -2147483648
>> -    ; CHECK: $wzr = SUBSWrr [[COPY]], [[MOVi32imm]], implicit-def $nzcv
>> +    ; CHECK: [[SUBSWrr:%[0-9]+]]:gpr32 = SUBSWrr [[COPY]], [[MOVi32imm]], implicit-def $nzcv
>>      ; CHECK: [[CSINCWr:%[0-9]+]]:gpr32 = CSINCWr $wzr, $wzr, 10, implicit $nzcv
>>      ; CHECK: [[ANDWri:%[0-9]+]]:gpr32sp = ANDWri [[CSINCWr]], 0
>>      ; CHECK: $w0 = COPY [[ANDWri]]
>> @@ -498,7 +498,7 @@ body:             |
>>      ; CHECK: liveins: $x0
>>      ; CHECK: [[COPY:%[0-9]+]]:gpr64 = COPY $x0
>>      ; CHECK: [[MOVi64imm:%[0-9]+]]:gpr64 = MOVi64imm -9223372036854775808
>> -    ; CHECK: $xzr = SUBSXrr [[COPY]], [[MOVi64imm]], implicit-def $nzcv
>> +    ; CHECK: [[SUBSXrr:%[0-9]+]]:gpr64 = SUBSXrr [[COPY]], [[MOVi64imm]], implicit-def $nzcv
>>      ; CHECK: [[CSINCWr:%[0-9]+]]:gpr32 = CSINCWr $wzr, $wzr, 10, implicit $nzcv
>>      ; CHECK: [[DEF:%[0-9]+]]:gpr64all = IMPLICIT_DEF
>>      ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:gpr64 = INSERT_SUBREG [[DEF]], [[CSINCWr]], %subreg.sub_32
>> @@ -537,7 +537,7 @@ body:             |
>>      ; CHECK: liveins: $w0
>>      ; CHECK: [[COPY:%[0-9]+]]:gpr32 = COPY $w0
>>      ; CHECK: [[MOVi32imm:%[0-9]+]]:gpr32 = MOVi32imm 2147483647
>> -    ; CHECK: $wzr = SUBSWrr [[COPY]], [[MOVi32imm]], implicit-def $nzcv
>> +    ; CHECK: [[SUBSWrr:%[0-9]+]]:gpr32 = SUBSWrr [[COPY]], [[MOVi32imm]], implicit-def $nzcv
>>      ; CHECK: [[CSINCWr:%[0-9]+]]:gpr32 = CSINCWr $wzr, $wzr, 12, implicit $nzcv
>>      ; CHECK: [[ANDWri:%[0-9]+]]:gpr32sp = ANDWri [[CSINCWr]], 0
>>      ; CHECK: $w0 = COPY [[ANDWri]]
>> @@ -574,7 +574,7 @@ body:             |
>>      ; CHECK: liveins: $x0
>>      ; CHECK: [[COPY:%[0-9]+]]:gpr64 = COPY $x0
>>      ; CHECK: [[MOVi64imm:%[0-9]+]]:gpr64 = MOVi64imm 9223372036854775807
>> -    ; CHECK: $xzr = SUBSXrr [[COPY]], [[MOVi64imm]], implicit-def $nzcv
>> +    ; CHECK: [[SUBSXrr:%[0-9]+]]:gpr64 = SUBSXrr [[COPY]], [[MOVi64imm]], implicit-def $nzcv
>>      ; CHECK: [[CSINCWr:%[0-9]+]]:gpr32 = CSINCWr $wzr, $wzr, 12, implicit $nzcv
>>      ; CHECK: [[DEF:%[0-9]+]]:gpr64all = IMPLICIT_DEF
>>      ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:gpr64 = INSERT_SUBREG [[DEF]], [[CSINCWr]], %subreg.sub_32
>> 
>> diff  --git a/llvm/test/CodeGen/AArch64/GlobalISel/select-cmp.mir b/llvm/test/CodeGen/AArch64/GlobalISel/select-cmp.mir
>> index b5c498d29bae..6932f59aee6d 100644
>> --- a/llvm/test/CodeGen/AArch64/GlobalISel/select-cmp.mir
>> +++ b/llvm/test/CodeGen/AArch64/GlobalISel/select-cmp.mir
>> @@ -60,7 +60,7 @@ body:             |
>>      ; CHECK: [[COPY:%[0-9]+]]:gpr64 = COPY $x0
>>      ; CHECK: [[MOVi32imm:%[0-9]+]]:gpr32 = MOVi32imm 13132
>>      ; CHECK: [[SUBREG_TO_REG:%[0-9]+]]:gpr64 = SUBREG_TO_REG 0, [[MOVi32imm]], %subreg.sub_32
>> -    ; CHECK: $xzr = SUBSXrr [[COPY]], [[SUBREG_TO_REG]], implicit-def $nzcv
>> +    ; CHECK: [[SUBSXrr:%[0-9]+]]:gpr64 = SUBSXrr [[COPY]], [[SUBREG_TO_REG]], implicit-def $nzcv
>>      ; CHECK: [[CSINCWr:%[0-9]+]]:gpr32 = CSINCWr $wzr, $wzr, 1, implicit $nzcv
>>      ; CHECK: $w0 = COPY [[CSINCWr]]
>>      ; CHECK: RET_ReallyLR implicit $w0
>> 
>> diff  --git a/llvm/test/CodeGen/AArch64/GlobalISel/select.mir b/llvm/test/CodeGen/AArch64/GlobalISel/select.mir
>> index e64b62699ec4..2e38f1ce62e9 100644
>> --- a/llvm/test/CodeGen/AArch64/GlobalISel/select.mir
>> +++ b/llvm/test/CodeGen/AArch64/GlobalISel/select.mir
>> @@ -199,13 +199,13 @@ registers:
>>    - { id: 11, class: gpr }
>> 
>>  # CHECK:  body:
>> -# CHECK:    $wzr = SUBSWrr %0, %0, implicit-def $nzcv
>> +# CHECK:    SUBSWrr %0, %0, implicit-def $nzcv
>>  # CHECK:    %1:gpr32 = CSINCWr $wzr, $wzr, 1, implicit $nzcv
>> 
>> -# CHECK:    $xzr = SUBSXrr %2, %2, implicit-def $nzcv
>> +# CHECK:    SUBSXrr %2, %2, implicit-def $nzcv
>>  # CHECK:    %3:gpr32 = CSINCWr $wzr, $wzr, 3, implicit $nzcv
>> 
>> -# CHECK:    $xzr = SUBSXrr %4, %4, implicit-def $nzcv
>> +# CHECK:    SUBSXrr %4, %4, implicit-def $nzcv
>>  # CHECK:    %5:gpr32 = CSINCWr $wzr, $wzr, 0, implicit $nzcv
>> 
>>  body:             |
>> 
>> diff  --git a/llvm/test/CodeGen/AArch64/GlobalISel/tbnz-slt.mir b/llvm/test/CodeGen/AArch64/GlobalISel/tbnz-slt.mir
>> index abdacf8b129f..a01fe760085a 100644
>> --- a/llvm/test/CodeGen/AArch64/GlobalISel/tbnz-slt.mir
>> +++ b/llvm/test/CodeGen/AArch64/GlobalISel/tbnz-slt.mir
>> @@ -132,7 +132,7 @@ body:             |
>>    ; CHECK:   successors: %bb.0(0x40000000), %bb.1(0x40000000)
>>    ; CHECK:   %copy:gpr64 = COPY $x0
>>    ; CHECK:   %zero:gpr64 = COPY $xzr
>> -  ; CHECK:   $xzr = SUBSXrr %zero, %copy, implicit-def $nzcv
>> +  ; CHECK:   [[SUBSXrr:%[0-9]+]]:gpr64 = SUBSXrr %zero, %copy, implicit-def $nzcv
>>    ; CHECK:   %cmp:gpr32 = CSINCWr $wzr, $wzr, 10, implicit $nzcv
>>    ; CHECK:   TBNZW %cmp, 0, %bb.1
>>    ; CHECK:   B %bb.0
>> 
>> diff  --git a/llvm/test/CodeGen/AArch64/GlobalISel/tbz-sgt.mir b/llvm/test/CodeGen/AArch64/GlobalISel/tbz-sgt.mir
>> index 329e555c81cb..16b8ff5e6a8f 100644
>> --- a/llvm/test/CodeGen/AArch64/GlobalISel/tbz-sgt.mir
>> +++ b/llvm/test/CodeGen/AArch64/GlobalISel/tbz-sgt.mir
>> @@ -132,7 +132,7 @@ body:             |
>>    ; CHECK:   successors: %bb.0(0x40000000), %bb.1(0x40000000)
>>    ; CHECK:   %copy:gpr64 = COPY $x0
>>    ; CHECK:   %negative_one:gpr64 = MOVi64imm -1
>> -  ; CHECK:   $xzr = SUBSXrr %negative_one, %copy, implicit-def $nzcv
>> +  ; CHECK:   [[SUBSXrr:%[0-9]+]]:gpr64 = SUBSXrr %negative_one, %copy, implicit-def $nzcv
>>    ; CHECK:   Bcc 12, %bb.1, implicit $nzcv
>>    ; CHECK:   B %bb.0
>>    ; CHECK: bb.1:
>> 
>> 


