[PATCH] "float2int": Add a new pass to demote from float to int where possible.
Hal Finkel
hfinkel at anl.gov
Wed Mar 18 19:54:15 PDT 2015
----- Original Message -----
> From: "Sean Silva" <chisophugis at gmail.com>
> To: reviews+D7790+public+42b40e367a6aa137 at reviews.llvm.org
> Cc: "James Molloy" <james.molloy at arm.com>, "Hal Finkel"
> <hfinkel at anl.gov>, llvm-commits at cs.uiuc.edu
> Sent: Wednesday, March 18, 2015 7:18:44 PM
> Subject: Re: [PATCH] "float2int": Add a new pass to demote from float
> to int where possible.
> On Thu, Mar 5, 2015 at 7:36 AM, hfinkel at anl.gov < hfinkel at anl.gov >
> wrote:
> > In http://reviews.llvm.org/D7790#134758 , @mkuper wrote:
>
> > > Hi James,
>
> > >
>
> > > Just so we have a record of what we talked about on IRC (and can
> > > give Hal a chance to disagree :-)
>
> > Good; I disagree :-)
>
> > The first question to answer is: What is the most useful and
> > reasonable canonical form? The reason I support running this pass
> > early in the pipeline is because I believe that demoting these int
> > -> fp -> int sequences to int sequences, when semantically
> > equivalent, is the most useful canonical form.
>
> > If it is useful, because of microarchitectural features, to use FP
> > vector ops instead of integer vector ops, then that should be
> > 'actively' handled later (instead of just taking advantage of it
> > when it happens to happen).
>
> Actually, the opposite transformation might be useful in any backend
> and is not limited to vector ops. Currently, extremely integer-heavy
> workloads (there are many applications that fall into this category,
> e.g. LLVM itself) end up leaving all the floating point units idle
> regardless of architecture. So it's just a matter of the relative
> domain-crossing costs vs. the extra ILP due to having more execution
> resources.
I think that the tough part of the modeling here is actually the instruction latency. In situations where you have enough ILP, this does not matter too much, but floating-point operations often have higher latency than the integer ones. So you need to make sure you have enough spare ILP to cover the increased latency.
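To make the latency-versus-ILP trade-off concrete, here is a rough sketch of the kind of bound a cost model might compute. Everything in it is an assumption for illustration: the `UnitClass` struct, the `estimateCycles` and `estimateSplit` helpers, and the unit counts and latencies in the usage note are hypothetical and do not correspond to any existing LLVM API or to measured hardware numbers. It treats the workload as a set of independent dependency chains and takes the larger of the latency bound and the throughput bound.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical description of one class of execution units.
struct UnitClass {
  unsigned NumUnits; // pipelined units, each assumed to issue one op per cycle
  unsigned Latency;  // cycles from issue to result
};

// Lower bound on the cycles needed to run `Chains` independent dependency
// chains, each `ChainLen` ops long, on the given class of units.
uint64_t estimateCycles(uint64_t Chains, uint64_t ChainLen, UnitClass U) {
  if (Chains == 0 || ChainLen == 0)
    return 0;
  // A single chain cannot finish faster than its serialized latency.
  uint64_t LatencyBound = ChainLen * U.Latency;
  // All ops together cannot issue faster than the units allow (rounded up).
  uint64_t TotalOps = Chains * ChainLen;
  uint64_t ThroughputBound = (TotalOps + U.NumUnits - 1) / U.NumUnits;
  return std::max(LatencyBound, ThroughputBound);
}

// Estimate when `Moved` of the chains run on the FP units instead, paying a
// fixed domain-crossing cost `CrossCost` on the critical path of the moved work.
uint64_t estimateSplit(uint64_t Chains, uint64_t Moved, uint64_t ChainLen,
                       UnitClass Int, UnitClass FP, unsigned CrossCost) {
  Moved = std::min(Moved, Chains); // cannot move more chains than exist
  uint64_t IntTime = estimateCycles(Chains - Moved, ChainLen, Int);
  uint64_t FPTime =
      Moved ? estimateCycles(Moved, ChainLen, FP) + CrossCost : 0;
  // Both halves run concurrently; the slower one determines the total.
  return std::max(IntTime, FPTime);
}
```

With made-up parameters of 2 integer units at latency 1 and 4 FP units at latency 4, moving one of only two 10-op chains to the FP side makes this estimate worse, because the FP chain's latency dominates. With 64 such chains, moving 40 of them improves the estimate, because there is enough spare ILP to hide the extra latency.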
-Hal
> On architectures like x86 that have a memory-->register int to FP
> conversion instruction, some of the domain crossing cost can be
> avoided. A cursory look at the wikipedia page for POWER8 indicates
> that the core has 2x integer units, but 7 other units that can do
> basic arithmetic (4x FPU, 2x VMX, 1x Decimal FP).
> -- Sean Silva
> > So I think that this should run early by default, x86 included. We
> > should also reverse the transformation later, perhaps within the
> > vectorizer, using an actual cost model, if that proves useful.
>
> > > > On x86, vector i64 muls can be much worse than vector double
> > > > muls. Since this is pre-LoopV, and we don't know if we'll end up
> > > > with vector or scalar code, I think the safe thing to do on x86
> > > > would be to disable this for cases where we'll do a double -> i64
> > > > transformation.
>
> > >
>
> > > > This means we should probably have a target hook for that that
> > > > x86 can override.
>
> > REPOSITORY
>
> > rL LLVM
>
> > http://reviews.llvm.org/D7790
>
> > EMAIL PREFERENCES
>
> > http://reviews.llvm.org/settings/panel/emailpreferences/
>
> > _______________________________________________
>
> > llvm-commits mailing list
>
> > llvm-commits at cs.uiuc.edu
>
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory