[LLVMdev] Question about FMA formation
milseman at apple.com
Wed Dec 12 20:14:10 PST 2012
On Dec 12, 2012, at 5:20 PM, Eric Christopher <echristo at gmail.com> wrote:
> Why not just form them via a fast IR level pass and just have patterns match in fast isel instead of trying to form code? Or are we saying the same thing? (Your words of "fast isel spot"ting and "form better code" caused me to think you mean to do optimizations within the fast isel pass).
Sorry, we've been jumping around a bit. I'll try to spell out what's being debated. We have a few options as far as benefiting fast-isel is concerned.
We can write a pass to form fmuladds. The intent would be to run this very late, perhaps before, or as part of, codegen-prepare. The downside is that it somewhat goes against the point of fast-isel. Fast-isel allows us to skip extra representations of the program, and replacing IR with intrinsic calls is similar to having an extra representation, albeit only for part of the program.
However, the basic task of spotting an fadd of an fmul is simple enough that fast-isel could just emit the FMA equivalent if it likes. This has the benefit that we avoid the extra representation, but the downside that it makes fast-isel a little more complicated and it only does simple patterns.
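To make the "simple pattern" concrete, here is a toy sketch in Python. It is not LLVM's fast-isel code: the `Op` node type and the `select` function are made up for illustration, and the sketch ignores the checks a real selector would need (for example, that the fmul has no other users and that fusing is permitted at all).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical expression node; operands are Op nodes or variable names."""
    name: str    # "fadd", "fmul", or "fma"
    args: tuple


def select(node):
    """Rewrite fadd(fmul(a, b), c) into fma(a, b, c), bottom-up."""
    if isinstance(node, str):
        return node
    args = tuple(select(a) for a in node.args)
    if node.name == "fadd":
        lhs, rhs = args
        # fadd is commutative, so the multiply may sit on either side.
        if isinstance(lhs, Op) and lhs.name == "fmul":
            return Op("fma", (*lhs.args, rhs))
        if isinstance(rhs, Op) and rhs.name == "fmul":
            return Op("fma", (*rhs.args, lhs))
    return Op(node.name, args)


simple = Op("fadd", (Op("fmul", ("a", "b")), "c"))
fused = select(simple)
```

Here `fused` is a single `fma` node over `a`, `b`, and `c`. The match only looks one level down, which is why it stays cheap and also why it misses anything that needs the expression to be regrouped first.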
Shuxin was showing some more complicated patterns that require re-association to match (fast-math flags permitting). For those, we're considering whether a re-associate-for-FMA facility in codegen-prepare would solve the problem. That way, we can re-associate in codegen-prepare and emit the FMA in fast-isel.
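A toy sketch of that split, again in Python and under assumptions: the grouping used is the one asked about in the quoted message below, which may not be exactly what Shuxin had in mind, and `Op`, `reassociate`, `emit_fma`, and the `allow_reassoc` flag are hypothetical names standing in for codegen-prepare, fast-isel, and the fast-math check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical expression node; operands are Op nodes or variable names."""
    name: str
    args: tuple


def is_mul(n):
    return isinstance(n, Op) and n.name == "fmul"


def reassociate(node, allow_reassoc):
    """(fadd (fadd m1 m2) e) -> (fadd m1 (fadd m2 e)) when m1 and m2 are fmuls."""
    if isinstance(node, str) or not allow_reassoc:
        return node  # regrouping changes rounding, so do nothing unless allowed
    args = tuple(reassociate(a, allow_reassoc) for a in node.args)
    if node.name == "fadd":
        inner, e = args
        if (isinstance(inner, Op) and inner.name == "fadd"
                and all(is_mul(a) for a in inner.args)):
            m1, m2 = inner.args
            return Op("fadd", (m1, Op("fadd", (m2, e))))
    return Op(node.name, args)


def emit_fma(node):
    """Simple selector: only matches fadd(fmul(a, b), c)."""
    if isinstance(node, str):
        return node
    args = tuple(emit_fma(a) for a in node.args)
    if node.name == "fadd" and is_mul(args[0]):
        return Op("fma", (*args[0].args, args[1]))
    return Op(node.name, args)


expr = Op("fadd", (Op("fadd", (Op("fmul", ("a", "b")),
                               Op("fmul", ("c", "d")))), "e"))
without = emit_fma(expr)
with_reassoc = emit_fma(reassociate(expr, allow_reassoc=True))
```

Without the regrouping step, the simple selector forms one `fma` and leaves a separate `fmul` and `fadd`. With it, the same selector produces `fma(a, b, fma(c, d, e))`, so the selector itself does not have to get any smarter.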
> On Wed, Dec 12, 2012 at 5:14 PM, Michael Ilseman <milseman at apple.com> wrote:
>> Right now we're leaning towards having a re-association helper in codegen-prepare that will re-associate expressions (if allowed). This would allow fast-isel to more easily spot FMA opportunities, and form better code.
>> On Dec 12, 2012, at 5:11 PM, Eric Christopher <echristo at gmail.com> wrote:
>>>> You hit send right when I did!
>>>> For your example, do you mean that it's grouped like:
>>>> (fadd (fadd (fmul a b) (fmul c d)) e)
>>>> How would your pass go about handling these patterns and is that something that would be too complicated for fast-isel to do on the fly?
>>> Depends on how they're grouped, but if the formation happens prior to codegen then fast-isel will just handle whatever new instruction you've got. An example of IR would be useful though :)