[llvm] [AArch64][FEAT_CMPBR] Codegen for Armv9.6-a CBB and CBH (PR #164899)
David Tellenbach via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 24 10:16:01 PDT 2025
dtellenbach wrote:
> What I really don't want is a situation where the semantics are inconsistent; one bit of code tries to optimize based on the instruction only accessing the low bits, then another bit of code optimizes based on assertions that aren't valid anymore.
Agreed, that would be a bad spot to be in.
Maybe I am still misunderstanding you, or I did a bad job explaining what's going on, but the new CB[B,H] pseudos neither consume the assertions nor perform the extension; they simply ignore them. If the registers have other users that need the extension or that rely on assert[z,s]ext, those extensions and assertions stay in place. Here is an example:
```
int bar(unsigned char);
unsigned char foo(unsigned char x, unsigned char y) {
  if (x > y)
    __builtin_trap();
  return bar(x + y);
}
```
On targets that require the caller to zero-/sign-extend (e.g. Darwin), we codegen `zeroext`:
```
define zeroext i8 @foo(i8 noundef zeroext %x, i8 noundef zeroext %y) local_unnamed_addr #0 {
entry:
  %cmp = icmp ugt i8 %x, %y
  br i1 %cmp, label %if.then, label %if.end
if.then: ; preds = %entry
  tail call void @llvm.trap()
  unreachable
if.end: ; preds = %entry
  %add = add i8 %y, %x
  %call = tail call i32 @bar(i8 noundef zeroext %add) #3
  %conv6 = trunc i32 %call to i8
  ret i8 %conv6
}
```
With and without cmpbr, the inputs to `%cmp` don't need zero-extension (because of `assertzext` during ISel), but `%add` needs zero-extension because the ABI requires the caller to extend the argument it passes to `bar`. Crucially, the zero-extension for `%add` sticks even with `cmpbr`, and the final codegen is
```
_foo: ; @foo
; %bb.0: ; %entry
    cbbhi w0, w1, LBB0_2
; %bb.1: ; %if.end
    stp x29, x30, [sp, #-16]! ; 16-byte Folded Spill
    mov x29, sp
    add w8, w1, w0
    and w0, w8, #0xff
    bl _bar
    and w0, w0, #0xff
    ldp x29, x30, [sp], #16 ; 16-byte Folded Reload
    ret
LBB0_2: ; %if.then
    brk #0x1
```
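To make the role of the surviving `and w0, w8, #0xff` concrete, here is a minimal C sketch that models the `w` registers as `uint32_t`. The helper names (`add_w`, `zext8`, `check_add_needs_zext`) are hypothetical and only for illustration; they are not part of the patch. It shows that two already zero-extended inputs can still produce a sum with bits set above bit 7, so the caller has to re-extend before calling `bar`:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not from the patch: a w register is a uint32_t. */

/* add w8, w1, w0: plain 32-bit add, bits above bit 7 may become non-zero. */
static uint32_t add_w(uint32_t w0, uint32_t w1) {
    return w1 + w0;
}

/* and w0, w8, #0xff: re-establish the zeroext guarantee for an i8 value. */
static uint32_t zext8(uint32_t w) {
    return w & 0xffu;
}

/* Both inputs are valid zero-extended i8 values, yet the raw sum is not. */
static void check_add_needs_zext(void) {
    uint32_t x = 200u, y = 100u;           /* both fit in 8 bits */
    assert(add_w(x, y) > 0xffu);           /* 300: bit 8 is set */
    assert(zext8(add_w(x, y)) == 44u);     /* 300 mod 256 */
}
```

The compare-and-branch at the top of the function does not change any of this: it only reads `w0` and `w1`, so the extension required for the call is unaffected.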
For targets that require the callee to zero- or sign-extend (I think that's all targets using AAPCS64), the inputs to the comparison need zero-extension:
```
define i8 @foo(i8 noundef %x, i8 noundef %y) {
entry:
  %cmp = icmp ugt i8 %x, %y
  br i1 %cmp, label %if.then, label %if.end
if.then: ; preds = %entry
  tail call void @llvm.trap()
  unreachable
if.end: ; preds = %entry
  %add = add i8 %y, %x
  %call = tail call i32 @bar(i8 noundef %add) #3
  %conv6 = trunc i32 %call to i8
  ret i8 %conv6
}
```
Codegen without cmpbr:
```
foo: // @foo
// %bb.0: // %entry
    and w8, w0, #0xff
    cmp w8, w1, uxtb
    b.hi .LBB0_2
// %bb.1: // %if.end
    stp x29, x30, [sp, #-16]! // 16-byte Folded Spill
    add w0, w1, w0
    mov x29, sp
    bl bar
    ldp x29, x30, [sp], #16 // 16-byte Folded Reload
    ret
.LBB0_2: // %if.then
    brk #0x1
```
Codegen with cmpbr:
```
foo: // @foo
// %bb.0: // %entry
    cbbhi w0, w1, .LBB0_2
// %bb.1: // %if.end
    stp x29, x30, [sp, #-16]! // 16-byte Folded Spill
    add w0, w1, w0
    mov x29, sp
    bl bar
    ldp x29, x30, [sp], #16 // 16-byte Folded Reload
    ret
.LBB0_2: // %if.then
    brk #0x1
```
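As a rough sanity check of why dropping the `and`/`uxtb` is fine here, the two branch conditions can be modeled in C. This is a sketch under the assumption that `cbbhi` performs an unsigned compare of only the low 8 bits of its two register operands; the function names are hypothetical and the `w` registers are modeled as `uint32_t` whose upper 24 bits may hold garbage (the callee-extends case):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of: and w8, w0, #0xff ; cmp w8, w1, uxtb ; b.hi */
static bool taken_and_cmp(uint32_t w0, uint32_t w1) {
    uint32_t w8 = w0 & 0xffu;   /* explicit zero-extension of x */
    uint32_t rhs = w1 & 0xffu;  /* uxtb on the second operand */
    return w8 > rhs;            /* hi: unsigned greater-than */
}

/* Assumed model of: cbbhi w0, w1, <label> (compares low bytes only). */
static bool taken_cbbhi(uint32_t w0, uint32_t w1) {
    return (uint8_t)w0 > (uint8_t)w1;
}

/* Exhaustively compare both models for every pair of low bytes,
   with arbitrary junk in the upper 24 bits of each register. */
static void check_equivalence(void) {
    for (uint32_t a = 0; a < 256; ++a) {
        for (uint32_t b = 0; b < 256; ++b) {
            uint32_t w0 = 0xdead0000u | a;
            uint32_t w1 = 0xabcdef00u | b;
            assert(taken_and_cmp(w0, w1) == taken_cbbhi(w0, w1));
            assert(taken_cbbhi(w0, w1) == (a > b));
        }
    }
}
```

Under that assumption the branch decision never depends on the upper bits, which matches the point above: the pseudo does not need the assertions and does not have to materialize the extension itself.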
Maybe I'm missing a case that could go wrong?
https://github.com/llvm/llvm-project/pull/164899