[LLVMdev] Convert fdiv - X/Y -> X*1/Y
Chad Rosier
chad.rosier at gmail.com
Thu Aug 8 14:39:08 PDT 2013
On Thu, Aug 8, 2013 at 5:23 PM, Chandler Carruth <chandlerc at google.com> wrote:
>
> On Thu, Aug 8, 2013 at 2:07 PM, Chad Rosier <chad.rosier at gmail.com> wrote:
>
>> On Thu, Aug 8, 2013 at 1:56 PM, Mark Lacey <mark.lacey at apple.com> wrote:
>>
>>>
>>> On Aug 8, 2013, at 9:56 AM, Jim Grosbach <grosbach at apple.com> wrote:
>>>
>>> Hi Chad,
>>>
>>> This is a great transform to do, but you’re right that it’s only safe
>>> under fast-math. This is particularly interesting when the original divisor
>>> is a constant so you can materialize the reciprocal at compile-time. You’re
>>> right that in either case, this optimization should only kick in when there
>>> is more than one divide instruction that will be changed to a mul.
>>>
>>>
>>> It can be worthwhile to do this even in the case where there is only a
>>> single divide since 1/Y might be loop invariant, and could then be hoisted
>>> out later by LICM. You just need to be able to fold it back together when
>>> there is only a single use, and that use is not inside a more deeply nested
>>> loop.
>>>
>>
>> Ben's patch does exactly this, so perhaps that is the right approach.
>>
>
> Just to be clear of what is being proposed (which I rather like):
>
> 1) Canonical form is to use the reciprocal when allowed (by the fast math
> flags, whichever we decide are appropriate).
> 2) The backend folds a single-use reciprocal into a direct divide.
>
> Did I get it right? If so, I think this is a really nice way to capture
> all of the potential benefits of forming reciprocals without pessimizing
> code where it isn't helpful.
>
I believe you're describing Ben's patch perfectly. A few transformations
are pessimized, however.
From test/Transforms/InstCombine/fast-math.ll:
1. Previously x/y + x/z was not transformed. Now it becomes x*(1/y+1/z).
define float @fact_div1(float %x, float %y, float %z) {
  %t1 = fdiv fast float %x, %y
  %t2 = fdiv fast float %x, %z
  %t3 = fadd fast float %t1, %t2
  ret float %t3
}
combines to:
define float @fact_div1(float %x, float %y, float %z) {
  %reciprocal = fdiv fast float 1.000000e+00, %y
  %reciprocal1 = fdiv fast float 1.000000e+00, %z
  %1 = fadd fast float %reciprocal, %reciprocal1
  %2 = fmul fast float %1, %x
  ret float %2
}
I don't believe the fixup in CodeGenPrepare will undo such a transformation.
2. Similarly, x/y + z/x was not previously changed, but now we generate
x*(1/y) + z*(1/x).
I believe we can undo this transformation.
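For illustration, case 2 might look roughly like the following in IR. This is a hand-written sketch, not actual InstCombine output: the function names @example_case2_before and @example_case2_after and the value names are made up, and the real instruction order may differ. The point it tries to show is that each reciprocal here has exactly one use, and that use is an fmul, which is presumably why a single-use fold can turn each pair back into a plain divide.

```llvm
; Hypothetical input for case 2: x/y + z/x.
define float @example_case2_before(float %x, float %y, float %z) {
  %t1 = fdiv fast float %x, %y
  %t2 = fdiv fast float %z, %x
  %t3 = fadd fast float %t1, %t2
  ret float %t3
}

; Sketch of the reciprocal form: x*(1/y) + z*(1/x).
; Each reciprocal feeds exactly one fmul, so the pair
; (fdiv 1.0, %y ; fmul) could be folded back into a single fdiv.
define float @example_case2_after(float %x, float %y, float %z) {
  %reciprocal = fdiv fast float 1.000000e+00, %y
  %1 = fmul fast float %reciprocal, %x
  %reciprocal1 = fdiv fast float 1.000000e+00, %x
  %2 = fmul fast float %reciprocal1, %z
  %3 = fadd fast float %1, %2
  ret float %3
}
```

By contrast, in case 1 the reciprocals feed an fadd rather than an fmul, so the same single-use pattern does not apply there.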
3. Previously we would transform y/x + z/x => (y+z)/x. Now y/x + z/x is
transformed to y*(1/x) + z*(1/x).
This might be an ordering problem, or perhaps we could just transform
y*(1/x) + z*(1/x) => (y+z)/x. The same holds true for y/x - z/x.
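As a rough sketch of case 3, the two forms being compared might look like this in IR. Again this is hand-written for illustration: the function names are hypothetical, the value names are made up, and the actual output may differ (for example, the two identical reciprocals could be merged by a later pass).

```llvm
; Previous result for y/x + z/x: one add and one divide, (y+z)/x.
define float @example_case3_old(float %x, float %y, float %z) {
  %1 = fadd fast float %y, %z
  %2 = fdiv fast float %1, %x
  ret float %2
}

; Sketch of the reciprocal form: y*(1/x) + z*(1/x).
; Two divides, two multiplies and one add, unless a later
; fold rewrites it back to (y+z)/x.
define float @example_case3_new(float %x, float %y, float %z) {
  %reciprocal = fdiv fast float 1.000000e+00, %x
  %1 = fmul fast float %reciprocal, %y
  %reciprocal1 = fdiv fast float 1.000000e+00, %x
  %2 = fmul fast float %reciprocal1, %z
  %3 = fadd fast float %1, %2
  ret float %3
}
```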
Chad