[llvm-dev] Shift-by-signext - sext is bad for analysis - ignore its use count?

Tue Oct 1 11:09:11 PDT 2019

On 9/27/19 1:40 PM, Roman Lebedev via llvm-dev wrote:
> In https://reviews.llvm.org/D68103 InstCombine learned that shift-by-sext
> is simply a shift-by-zext.

Just to make sure I'm following, the reasoning here is that the shift 
amount must be non-negative (and less than the bit width) or the shift 
would produce poison? And thus it's safe to treat the sext as a zext, 
because we've (at worst) removed UB in the original program?
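
To make that reasoning concrete, here is a small brute-force sketch in 
Python. It is only a model, not InstCombine code: POISON, sext, zext, shl 
and is_refinement are hypothetical names, and the model assumes the usual 
rule that a shift by an amount >= the bit width yields poison.

```python
# Toy model of LLVM shift semantics on small integer widths.
# All names here are illustrative; none of this is LLVM API.
POISON = object()

def sext(value, from_bits, to_bits):
    """Sign-extend an unsigned from_bits value to to_bits (result unsigned)."""
    sign = 1 << (from_bits - 1)
    signed = (value ^ sign) - sign
    return signed & ((1 << to_bits) - 1)

def zext(value, from_bits, to_bits):
    """Zero-extend: the numeric value is unchanged."""
    return value & ((1 << from_bits) - 1)

def shl(x, amount, bits):
    """shl on a `bits`-wide integer; out-of-range amounts give poison."""
    if amount >= bits:
        return POISON
    return (x << amount) & ((1 << bits) - 1)

def is_refinement(from_bits, to_bits):
    """True if replacing sext with zext never changes a non-poison result."""
    for amt in range(1 << from_bits):
        for x in range(1 << to_bits):
            before = shl(x, sext(amt, from_bits, to_bits), to_bits)
            after = shl(x, zext(amt, from_bits, to_bits), to_bits)
            if before is not POISON and before != after:
                return False
    return True
```

In the 3-bit to 8-bit case, amounts 4..7 are poison when sign-extended 
but in range when zero-extended, which is the "(at worst) removed UB" 
situation; every other amount gives identical results.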

If so, two slightly off-topic ideas.

1) This feels like a demanded bits problem.  We know that any shift 
value outside of a given range is UB, and thus only need to demand the 
bits necessary to represent the defined range.  Might be an interesting 
extension.

2) Are we possibly missing opportunities by not exploiting knowledge of 
a known negative shift amount?
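
A rough sketch of idea 1, again as a Python model rather than anything 
resembling LLVM's actual demanded-bits machinery. The helper names 
amount_mask and high_bits_imply_poison are made up for illustration, and 
amount_bits only bounds the brute-force enumeration. The point is that 
any shift amount with a bit set outside the mask is necessarily out of 
range, so those bits can be assumed zero in every defined execution.

```python
def amount_mask(bits):
    """Smallest all-ones mask covering every in-range amount 0..bits-1."""
    return (1 << (bits - 1).bit_length()) - 1

def high_bits_imply_poison(bits, amount_bits):
    """Check that any amount with a bit outside the mask is >= bits."""
    mask = amount_mask(bits)
    for amount in range(1 << amount_bits):
        if amount & ~mask and amount < bits:
            return False
    return True
```

For a 32-bit shift the mask is 31, so only the low five bits of the 
amount would need to be demanded under this model.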

> But the transform is limited to single-use sext.
> We can quite trivially get a case where there are two shifts by the same sext:
> https://godbolt.org/z/j6mO3t  <- We should handle those cases.
>
> In https://reviews.llvm.org/D68103#1686130 Sanjay Patel notes that this
> sext is intrusive for analysis, that we will gain far better analysis with zext,
> so we should just forgo the one-use check,
> and simply replace all shift-by-sext with shift-by-zext.

Doing the multi-use case is unfortunately complicated.  Your limited use 
scan might be a reasonable option in practice, but the need for cutoffs 
creates undesirable dynamics.

A couple ideas on how to possibly approach the problem:

1) If we can prove that one shift dominates the other uses, then if we 
can find UB which triggers based on overflow, we can do the replacement.

2) Having a general multiple use demanded use routine would be very 
powerful.  Is it worth exploring the harder topic for generality?

3) If we had an anyextend IR node, it might be reasonable to eagerly 
produce the duplicate nodes, and rely on later CSE.  I keep running 
across cases where we have an extend and know the high bits don't 
matter.  Maybe it's time to represent that?

> I implemented this proposed suggestion here:
> https://reviews.llvm.org/D68150
>
> Does anyone see any problems with that trade-off?
>
> Roman.