Fw: [PATCH] Fw: [LLVMdev] More FMA folding opportunities

Olivier H Sallenave ohsallen at us.ibm.com
Mon Jan 5 07:51:22 PST 2015



Hi,

Here's a more comprehensive patch that also accounts for FP_EXTEND being
removed later on PPC. Accordingly, it implements new patterns in the
PPC-specific combiner:

> Finally, specifically for the PPC target, we could ignore FP_EXTEND
> in the patterns above as it will be removed by the Machine Common
> Subexpression Elimination pass. For instance:
>
> fold (fadd (fpext (fmul x, y)), z) -> (fma x, y, z)
> fold (fadd (fpext (fma x, y, (fmul u, v))), z) -> (fma x, y, (fma u,
> v, z))
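
For illustration only, here is a toy model of the two PPC patterns quoted
above. It is not the code from the patch: the tuple representation and the
names is_op and fold_fadd_fpext are hypothetical, and the sketch does not
model value types, so it only shows the shape of the rewrite when the
fpext node is looked through.

```python
# Toy DAG nodes: ("fadd", a, b), ("fmul", a, b), ("fma", a, b, c),
# ("fpext", a); leaves are plain strings. All names are hypothetical.

def is_op(node, name):
    return isinstance(node, tuple) and node[0] == name

def fold_fadd_fpext(node):
    """Apply one of the two fpext patterns to `node`, or return it unchanged."""
    if not is_op(node, "fadd"):
        return node
    _, lhs, z = node
    if not is_op(lhs, "fpext"):
        return node
    inner = lhs[1]
    # fold (fadd (fpext (fmul x, y)), z) -> (fma x, y, z)
    if is_op(inner, "fmul"):
        return ("fma", inner[1], inner[2], z)
    # fold (fadd (fpext (fma x, y, (fmul u, v))), z)
    #   -> (fma x, y, (fma u, v, z))
    if is_op(inner, "fma") and is_op(inner[3], "fmul"):
        _, x, y, (_, u, v) = inner
        return ("fma", x, y, ("fma", u, v, z))
    return node

simple = fold_fadd_fpext(("fadd", ("fpext", ("fmul", "x", "y")), "z"))
nested = fold_fadd_fpext(
    ("fadd", ("fpext", ("fma", "x", "y", ("fmul", "u", "v"))), "z"))
```

Here simple and nested hold the right-hand sides of the two patterns.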

I also moved the tests into new files (fma-ext.ll, fma-assoc.ll) for
clarity.

Thanks,
Olivier

(See attached file: patch-v2.diff)



----- Forwarded by Olivier H Sallenave/Watson/IBM on 01/05/2015 10:44 AM
-----

From:	Olivier H Sallenave/Watson/IBM
To:	llvm-commits at cs.uiuc.edu
Cc:	"Hal Finkel" <hfinkel at anl.gov>
Date:	12/29/2014 05:09 PM
Subject:	[PATCH] Fw: [LLVMdev] More FMA folding opportunities


Hi,

Attached is a patch to support more FMA folding opportunities (especially
for PPC) as discussed below.

Thanks,
Olivier

(See attached file: patch.diff)



----- Forwarded by Olivier H Sallenave/Watson/IBM on 12/29/2014 04:56 PM
-----

From:	Hal Finkel <hfinkel at anl.gov>
To:	Olivier H Sallenave/Watson/IBM at IBMUS
Cc:	<llvmdev at cs.uiuc.edu>
Date:	09/30/2014 08:08 PM
Subject:	Re: [LLVMdev] More FMA folding opportunities



----- Original Message -----
> From: "Olivier H Sallenave" <ohsallen at us.ibm.com>
> To: llvmdev at cs.uiuc.edu
> Sent: Monday, September 29, 2014 3:34:51 PM
> Subject: [LLVMdev] More FMA folding opportunities
>
> Hi,
>
> I think more FMA folding opportunities could be added to the DAG
> combiner; please tell me what you think. Right now, these cases are
> implemented:
>
> fold (fadd (fmul x, y), z) -> (fma x, y, z)
> fold (fadd x, (fmul y, z)) -> (fma y, z, x)
>
> When the TLI callback "enableAggressiveFMAFusion" returns true, we
> might also support:
>
> fold (fadd (fma x, y, (fmul u, v)), z) -> (fma x, y, (fma u, v, z))
> fold (fadd x, (fma y, z, (fmul u, v))) -> (fma y, z, (fma u, v, x))
>
> This kind of reassociation generates two FMAs for (x^2 + y^2 + z).
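
As a rough sketch of the four folds listed above (a toy model, not the DAG
combiner's actual code), nodes can be written as tuples and rewritten with
a small function. The names is_op and fold_fadd and the aggressive flag
are hypothetical; the flag stands in for the enableAggressiveFMAFusion
callback mentioned above.

```python
# Toy DAG nodes: ("fadd", a, b), ("fmul", a, b), ("fma", a, b, c);
# leaves are plain strings. All names here are hypothetical.

def is_op(node, name):
    return isinstance(node, tuple) and node[0] == name

def fold_fadd(node, aggressive=False):
    """Apply one of the four fadd folds to `node`, or return it unchanged."""
    if not is_op(node, "fadd"):
        return node
    _, lhs, rhs = node
    if aggressive:
        # fold (fadd (fma x, y, (fmul u, v)), z) -> (fma x, y, (fma u, v, z))
        if is_op(lhs, "fma") and is_op(lhs[3], "fmul"):
            _, x, y, (_, u, v) = lhs
            return ("fma", x, y, ("fma", u, v, rhs))
        # fold (fadd x, (fma y, z, (fmul u, v))) -> (fma y, z, (fma u, v, x))
        if is_op(rhs, "fma") and is_op(rhs[3], "fmul"):
            _, y, z, (_, u, v) = rhs
            return ("fma", y, z, ("fma", u, v, lhs))
    # fold (fadd (fmul x, y), z) -> (fma x, y, z)
    if is_op(lhs, "fmul"):
        return ("fma", lhs[1], lhs[2], rhs)
    # fold (fadd x, (fmul y, z)) -> (fma y, z, x)
    if is_op(rhs, "fmul"):
        return ("fma", rhs[1], rhs[2], lhs)
    return node

# (x*x + y*y) + z, folded bottom-up.
inner = fold_fadd(("fadd", ("fmul", "x", "x"), ("fmul", "y", "y")))
plain = fold_fadd(("fadd", inner, "z"))
fused = fold_fadd(("fadd", inner, "z"), aggressive=True)
```

With aggressive=False the outer fadd is left alone, so plain keeps one
fma, one fmul and one fadd; with aggressive=True, fused contains two fma
nodes and nothing else.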

Yes, this all sounds reasonable.

Thanks again,
Hal
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch-v2.diff
Type: application/octet-stream
Size: 13956 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150105/59e0e278/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch.diff
Type: application/octet-stream
Size: 6299 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150105/59e0e278/attachment-0001.obj>

