[llvm-commits] [PATCH] fold umax(zext A, zext B) -> zext (umax(A, B))

Wed Jun 17 15:12:50 PDT 2009

On Jun 17, 2009, at 1:44 PM, Török Edwin wrote:

> On 2009-06-17 23:25, Dan Gohman wrote:
>
>> On Jun 17, 2009, at 11:35 AM, Török Edwin wrote:
>>
>>> Hi,
>>>
>>> I noticed that umax (zext t1 %X to %t0, zext t1 %Y to %t0) isn't
>>> folded; the attached patch folds this into:
>>> zext t1 (umax (%X, %Y)) to %t0.
>>>
>>> It also folds umax (sext t1 %X to %t0, sext t1 %Y to %t0) -> sext t1
>>> (umax (%X, %Y)) to %t0.
>>>
>>> zext is very often encountered in SCEV expressions on x86-64, since
>>> pointer indexes for GEP are i64.
>>>
>>> Thoughts?
>>>
>>
>> Another question to ask is whether this kind of thing belongs in
>> ScalarEvolution, or if it would be more appropriate for
>> instcombine. Instcombine looks at all instructions in a program,
>> while ScalarEvolution typically only looks at those related to
>> loop iteration. Also, instcombine could more easily handle more
>> generalized cases of this optimization, for example with umin.
>>
>> On the other hand, there are cases where it makes sense to do
>> such simplifications in ScalarEvolution. Can you give an
>> example where you're seeing this kind of code?
>>
>
> It doesn't have much to do with code generation/optimization, but rather
> with analysis.
> I have a pass that tries to find buffer overflow bugs; doing that
> involves lots of umax() expressions.
> If I can move the zext out of umax, I can decide quite early that an
> access is valid:
>
> #include <stdlib.h>
>
> int foo(unsigned n, unsigned i)
> {
>   char *a = malloc(n);
>   if (!a)
>     return -1;
>   if (i < n)       /* the bounds check the analysis has to reason about */
>     a[i] = 0;
>   return 0;
> }

Where does the umax come from in this code?  I guess that you're
transforming it in some way; could you show what the code looks
like after the transformation?

Thanks,

Dan
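
The folds discussed in this thread rely on zext and sext both preserving unsigned order, so they can be moved outside umax (and, for the generalization mentioned above, umin). What follows is only a minimal sketch that checks this exhaustively for one narrow case, taking t1 = i8 and t0 = i16 as an assumption. It is not the code from the attached patch, and all helper names (umax8, zext8to16, count_fold_mismatches, and so on) are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned max/min on the narrow (i8) and wide (i16) types. */
static uint8_t  umax8(uint8_t a, uint8_t b)    { return a > b ? a : b; }
static uint16_t umax16(uint16_t a, uint16_t b) { return a > b ? a : b; }
static uint8_t  umin8(uint8_t a, uint8_t b)    { return a < b ? a : b; }
static uint16_t umin16(uint16_t a, uint16_t b) { return a < b ? a : b; }

/* zext i8 -> i16: the high bits are filled with zeros. */
static uint16_t zext8to16(uint8_t x) { return (uint16_t)x; }

/* sext i8 -> i16: the high bits are filled with copies of the sign bit. */
static uint16_t sext8to16(uint8_t x)
{
    return (x & 0x80u) ? (uint16_t)(0xFF00u | x) : (uint16_t)x;
}

/* Count the (x, y) pairs for which any of the three folds would change
 * the result. All 256 * 256 input pairs are tried. */
static int count_fold_mismatches(void)
{
    int mismatches = 0;
    for (unsigned a = 0; a < 256; ++a) {
        for (unsigned b = 0; b < 256; ++b) {
            uint8_t x = (uint8_t)a, y = (uint8_t)b;

            /* umax(zext x, zext y) == zext(umax(x, y)) */
            if (umax16(zext8to16(x), zext8to16(y)) != zext8to16(umax8(x, y)))
                ++mismatches;

            /* umax(sext x, sext y) == sext(umax(x, y)) */
            if (umax16(sext8to16(x), sext8to16(y)) != sext8to16(umax8(x, y)))
                ++mismatches;

            /* umin(zext x, zext y) == zext(umin(x, y)) */
            if (umin16(zext8to16(x), zext8to16(y)) != zext8to16(umin8(x, y)))
                ++mismatches;
        }
    }
    return mismatches;
}

/* Aborts via assert if any fold is unsound for i8 -> i16. */
static void check_folds(void)
{
    assert(count_fold_mismatches() == 0);
}
```

Calling check_folds() from a main function runs the whole check. It covers a single pair of widths only, so it illustrates the identity but does not prove it for arbitrary t1 and t0.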