[PATCH] D74228: [PatternMatch] Match XOR variant of unsigned-add overflow check.

Tue Feb 11 10:51:48 PST 2020

fhahn added a comment.

In D74228#1869653 <https://reviews.llvm.org/D74228#1869653>, @spatel wrote:

> In D74228#1869304 <https://reviews.llvm.org/D74228#1869304>, @fhahn wrote:
>
> > I remember that there was a discussion on llvm-dev about this topic. Do you know if we already have similar tests in test-suite?
>
>
> I don't follow test-suite closely, but I'm not seeing anything that checks asm output at first glance....but it's probably worth raising again on the list because that topic seems to come up every few months.
>
> But back to this patch - I'm just now looking at the test case in this patch, and I think that I missed the point of the earlier comment by @nikic. Let's step back and answer if CGP is the right place to do this transform.
>
> If we are comparing codegen for these 2 patterns:
>
>   define i64 @uaddo_no_math_use(i64 %a, i64 %b) {
>     %t = call { i64, i1 } @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)
>     %ov = extractvalue { i64, i1 } %t, 1
>     %Q = select i1 %ov, i64 %b, i64 42
>     ret i64 %Q
>   }
>  
>   define i64 @no_intrinsic(i64 %a, i64 %b) {
>     %x = xor i64 %a, -1
>     %cmp = icmp ult i64 %x, %b
>     %Q = select i1 %cmp, i64 %b, i64 42
>     ret i64 %Q
>   }
>  
>
>
> ...then should we form an intrinsic in CGP or create ISD::UADDO later? Although this CGP transform uses a TLI hook (shouldFormOverflowOp), that hook is aggressive by default - it assumed we would generate this intrinsic only if we needed both results (similarly, we can argue that we're missing a canonicalization to replace the intrinsic call in the first example). If we want to create the intrinsic even if we don't need both results, then we may need a DAG reversal because the default expansion does not seem to give us optimal asm for all targets.

The example below is interesting! It seems like we have the same problem with the add version of UAddWithOverflow, as we also generate uadd calls if the sum is not used besides the compare (e.g. the @uaddo1 test case in llvm/test/Transforms/CodeGenPrepare/X86/overflow-intrinsics.ll)

Maybe we should adjust the shouldFormOverflowOp hook to check if there are actual users of the sum and default to OFF if there are actual users? And then opt in on X86 & AArch64, where I think it should be beneficial for overflow-only checks. I think doing it in CGP is more convenient then in SelDag and we would also need an appropriate target hook there I think. Unless we match that pattern in the target implementations.

What do you think?

> For example:
> 
>   $ llc -o - ov.ll -mtriple=sparcv9
>   uaddo_no_math_use:                      ! @uaddo_no_math_use
>   ! %bb.0:
>   	add %o0, %o1, %o2
>   	mov	%g0, %o3
>   	cmp %o2, %o0
>   	movcs	%xcc, 1, %o3
>   	mov	42, %o0
>   	cmp %o3, 0
>   	retl
>   	movne	%icc, %o1, %o0
>   no_intrinsic:                           ! @no_intrinsic
>   ! %bb.0:
>   	xor %o0, -1, %o2
>   	mov	42, %o0
>   	cmp %o2, %o1
>   	retl
>   	movcs	%xcc, %o1, %o0
>   
> 
> 
> Hopefully, I'm seeing the whole problem now. Let me know if I'm still missing it.

I think that should cover it, thanks for sharing the problematic case on SPARC.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74228/new/

https://reviews.llvm.org/D74228