[llvm] [AArch64][FEAT_CMPBR] Codegen for Armv9.6-a CBB and CBH (PR #164899)

David Tellenbach via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 24 10:16:01 PDT 2025


dtellenbach wrote:

> What I really don't want is a situation where the semantics are inconsistent; one bit of code tries to optimize based on the instruction only accessing the low bits, then another bit of code optimizes based on assertions that aren't valid anymore.

Agreed, that would be a bad spot to be in.

Maybe I am still misunderstanding you, or I did a bad job explaining what's going on, but the new CB[B,H] pseudos neither consume the assertions nor perform the extension; they simply ignore them. If other users of those registers need an extension or carry an assert[z,s]ext, that extension or assertion sticks. Here is an example:

```
int bar(unsigned char);

unsigned char foo(unsigned char x, unsigned char y) {
  if (x > y)
    __builtin_trap();
  return bar(x + y);
}
```

On targets that require the caller to zero-/sign-extend (e.g. Darwin), we codegen `zeroext`:

```
define zeroext i8 @foo(i8 noundef zeroext %x, i8 noundef zeroext %y) local_unnamed_addr #0 {
entry:
  %cmp = icmp ugt i8 %x, %y
  br i1 %cmp, label %if.then, label %if.end

if.then:                                          ; preds = %entry
  tail call void @llvm.trap()
  unreachable

if.end:                                           ; preds = %entry
  %add = add i8 %y, %x
  %call = tail call i32 @bar(i8 noundef zeroext %add) #3
  %conv6 = trunc i32 %call to i8
  ret i8 %conv6
}
```

With and without cmpbr, the inputs to `%cmp` don't need zero-extension (because of `assertzext` during ISel), but `%add` needs zero-extension because the ABI requires the caller to extend the argument it passes. Crucially, the zero-extension for `%add` sticks even with `cmpbr`, and the final codegen is

```
_foo:                                   ; @foo
; %bb.0:                                ; %entry
	cbbhi	w0, w1, LBB0_2
; %bb.1:                                ; %if.end
	stp	x29, x30, [sp, #-16]!           ; 16-byte Folded Spill
	mov	x29, sp
	add	w8, w1, w0
	and	w0, w8, #0xff
	bl	_bar
	and	w0, w0, #0xff
	ldp	x29, x30, [sp], #16             ; 16-byte Folded Reload
	ret
LBB0_2:                                 ; %if.then
	brk	#0x1
```
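As a rough illustration of why the `and w0, w8, #0xff` before the call has to stay, here is a small C model of the two instructions that compute the argument to `bar`. The function names are made up for this sketch and registers are simply modelled as `uint32_t`; it is not what the backend does internally.

```c
#include <stdint.h>

/* Hypothetical model of the instructions feeding bar() in the Darwin output.
   Both inputs are assumed to arrive zero-extended from 8 bits (zeroext). */
static uint32_t bar_argument(uint32_t w0, uint32_t w1) {
  uint32_t w8 = w1 + w0; /* add w8, w1, w0 */
  return w8 & 0xffu;     /* and w0, w8, #0xff */
}

/* The same add without the mask, to show what would reach bar() otherwise. */
static uint32_t bar_argument_unmasked(uint32_t w0, uint32_t w1) {
  return w1 + w0;
}
```

For x = 100 and y = 200 (so the trap is not taken), the unmasked sum is 300, which no longer fits in 8 bits, while the masked value is 44. The 32-bit add can carry into bit 8 even when both inputs are properly zero-extended, so the extension of `%add` is independent of how the compare is lowered.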

For targets that require the callee to zero- or sign-extend (I think that's every target using AAPCS64), the inputs to the comparison need zero-extension:

```
define i8 @foo(i8 noundef %x, i8 noundef %y)  {
entry:
  %cmp = icmp ugt i8 %x, %y
  br i1 %cmp, label %if.then, label %if.end

if.then:                                          ; preds = %entry
  tail call void @llvm.trap()
  unreachable

if.end:                                           ; preds = %entry
  %add = add i8 %y, %x
  %call = tail call i32 @bar(i8 noundef %add) #3
  %conv6 = trunc i32 %call to i8
  ret i8 %conv6
}
```

Codegen without cmpbr:

```
foo:                                    // @foo
// %bb.0:                               // %entry
	and	w8, w0, #0xff
	cmp	w8, w1, uxtb
	b.hi	.LBB0_2
// %bb.1:                               // %if.end
	stp	x29, x30, [sp, #-16]!           // 16-byte Folded Spill
	add	w0, w1, w0
	mov	x29, sp
	bl	bar
	ldp	x29, x30, [sp], #16             // 16-byte Folded Reload
	ret
.LBB0_2:                                // %if.then
	brk	#0x1
```

Codegen with cmpbr:

```
foo:                                    // @foo
// %bb.0:                               // %entry
	cbbhi	w0, w1, .LBB0_2
// %bb.1:                               // %if.end
	stp	x29, x30, [sp, #-16]!           // 16-byte Folded Spill
	add	w0, w1, w0
	mov	x29, sp
	bl	bar
	ldp	x29, x30, [sp], #16             // 16-byte Folded Reload
	ret
.LBB0_2:                                // %if.then
	brk	#0x1
```
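The two AAPCS64 sequences above can be compared with a small C model. This is only a sketch: it assumes `cbbhi` looks at bits [7:0] of each register and nothing else, the helper names are hypothetical, and registers are modelled as `uint32_t` whose upper 24 bits may hold arbitrary values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of `and w8, w0, #0xff; cmp w8, w1, uxtb; b.hi`:
   both operands are explicitly zero-extended from 8 bits before comparing. */
static bool taken_without_cmpbr(uint32_t w0, uint32_t w1) {
  uint32_t w8 = w0 & 0xffu;  /* and w8, w0, #0xff */
  uint32_t rhs = w1 & 0xffu; /* uxtb applied to w1 */
  return w8 > rhs;           /* b.hi: unsigned higher */
}

/* Model of `cbbhi w0, w1` under the assumption stated above:
   only the low byte of each register takes part in the comparison. */
static bool taken_with_cmpbr(uint32_t w0, uint32_t w1) {
  return (uint8_t)w0 > (uint8_t)w1;
}

/* Checks every pair of low bytes under a few upper-bit patterns and
   returns true if both models always take the same branch. */
static bool models_agree(void) {
  static const uint32_t upper[] = {0x00000000u, 0xffffff00u, 0xdeadbe00u,
                                   0x00000100u};
  const unsigned n = sizeof(upper) / sizeof(upper[0]);
  for (unsigned i = 0; i < n; ++i)
    for (unsigned j = 0; j < n; ++j)
      for (uint32_t a = 0; a < 256; ++a)
        for (uint32_t b = 0; b < 256; ++b) {
          uint32_t w0 = upper[i] | a;
          uint32_t w1 = upper[j] | b;
          if (taken_without_cmpbr(w0, w1) != taken_with_cmpbr(w0, w1))
            return false;
        }
  return true;
}
```

Under that assumption the two models agree for every input, including ones with garbage in the upper bits, which is why the explicit `and`/`uxtb` can be dropped for the compare without touching any other user of `w0` or `w1`.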

Maybe I'm missing a case that could go wrong?

https://github.com/llvm/llvm-project/pull/164899

