[llvm-commits] [PATCH] fold umax(zext A, zext B) -> zext (umax(A, B))

Thu Jun 18 13:27:52 PDT 2009

On Jun 18, 2009, at 12:37 AM, Török Edwin wrote:

> On 2009-06-18 01:12, Dan Gohman wrote:
>
>> On Jun 17, 2009, at 1:44 PM, Török Edwin wrote:
>>
>>
>>
>>
>>
>>
>>
>>> On 2009-06-17 23:25, Dan Gohman wrote:
>>>
>>>
>>>
>>>
>>>
>>>> On Jun 17, 2009, at 11:35 AM, Török Edwin wrote:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> Hi,
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> I noticed that umax (zext t1 %X to %t0, zext t1 %Y to %t0) isn't
>>>>>
>>>>>
>>>>>
>>>>> folded,
>>>>>
>>>>>
>>>>>
>>>>> the attached patch folds this into:
>>>>>
>>>>>
>>>>>
>>>>> umax (zext t1 %X to %t0, zext t1 %Y to %t0).
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> It also folds umax (sext t1 %X to %t0, sext t1 %Y to %t0) ->  sext
>>>>>
>>>>> t1
>>>>>
>>>>>
>>>>>
>>>>> (umax (%X, %Y)) to %t0.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> zext is very often encountered in SCEV expressions on x86-64,  
>>>>> since
>>>>>
>>>>>
>>>>>
>>>>> pointer indexes for GEP are i64.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Thoughts?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>> Another question to ask is whether this kind of thing belongs in
>>>>
>>>>
>>>>
>>>> ScalarEvolution, or if it would be more appropriate for
>>>>
>>>>
>>>>
>>>> instcombine. Instcombine looks at all instructions in a program,
>>>>
>>>>
>>>>
>>>> while ScalarEvolution typically only looks at those related to
>>>>
>>>>
>>>>
>>>> loop iteration. Also, instcombine could more easily handle more
>>>>
>>>>
>>>>
>>>> generalized cases of this optimization, for example with umin.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On the other hand, there are cases where it makes sense to do
>>>>
>>>>
>>>>
>>>> such simplifications in ScalarEvolution. Can you given an
>>>>
>>>>
>>>>
>>>> example where you're seeing this kind of code?
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>> It doesn't have much to do with code generation/optimization, but
>>>
>>> rather
>>>
>>> with analysis.
>>>
>>> I have a pass that tries to find buffer overflow bugs, doing that
>>>
>>> involves lots of umax() expressions.
>>>
>>> If I can move the zext out of umax I can decide quite early that an
>>>
>>> access is valid:
>>>
>>>
>>>
>>> int foo(unsigned n, unsigned i)
>>>
>>> {
>>>
>>> char *a = malloc(n);
>>>
>>> if (!a)
>>>
>>>  return -1;
>>>
>>> if (i < n)
>>>
>>>   a[i]=0;
>>>
>>> }
>>>
>>>
>>>
>>
>>
>> Where does the umax come from in this code?  I guess that you're
>>
>> transforming it in some way; could you show what the code looks
>>
>> like after the transformation?
>>
>>
>>
>
> %0 = malloc i8, i32 %n          ; <i8*> [#uses=2]
> %2 = zext i32 %i to i64         ; <i64> [#uses=1]
> %3 = getelementptr i8* %0, i64 %2               ; <i8*> [#uses=1]
>
> %3 is '((zext i32 %i to i64) + %0' according to SCEV,
> %0 has size %n =>
>
> offset is (zext i32 %i to i64), hence one of the safety conditions  
> is i
>
>> = 0 &&  i < n. We already have i >= 0 since I treat all
>>
> values as unsigned, so that leaves us with i < n:
>
> if ((zext i32 %n to i64) umax (zext i32 %i to i64)) == (zext i32 %i to
> i64) then access is always valid
> if ((zext i32 %n to i64) umax (zext i32 %i to i64)) == (zext i32 %n to
> i64) then access is always invalid

Thanks, though it's still not clear why umax is needed.
Wouldn't the code be a lot simpler expanded like this:

if (zext %i to i64) < (zext %n to i64) then access is always valid
else access is always invalid

Stripping out the zexts from comparisons is fairly trivial.

> Otherwise I use a solver to decide
>
> There are 2 more conditions (access size < size, offset +  
> access_size <
> size) with similar expressions.
> As I said I could do the zext simplification in my pass only, but I
> thought this is something that would benefit SCEV(Expander) in  
> general.

I agree that the transformation sounds more generally applicable,
though umax is relatively uncommon in common SCEV usage, so I was
curious where you were seeing this kind of expression. Having
SCEVExpander do clever things is fine, though it's often easier
to just let it expand expressions in a simple way and let a
subsequent instcombine run tidy things up.

Dan