[llvm-commits] [RFC/PATCH] PPCDoubleDouble compile-time arithmetic
Hal Finkel
hfinkel at anl.gov
Mon Oct 29 10:26:38 PDT 2012
----- Original Message -----
> From: "Duncan Sands" <baldrick at free.fr>
> To: llvm-commits at cs.uiuc.edu
> Sent: Monday, October 29, 2012 12:04:33 PM
> Subject: Re: [llvm-commits] [RFC/PATCH] PPCDoubleDouble compile-time arithmetic
>
> Hi Ulrich,
>
> On 29/10/12 17:11, Ulrich Weigand wrote:
> > Chris Lattner <clattner at apple.com> wrote on 28.10.2012 05:03:31:
> >
> >> Given that the PowerPC format expands out into operations on two
> >> doubles, how reasonable would it be for clang to generate pre-
> >> expanded IR that exposed this lowering to the optimizers?
> >>
> >> This wouldn't help you with constant parsing, but would simplify
> >> the IR and optimizer and almost certainly give you better code
> >> quality for this type.
> >
> > First of all, I'd tend to agree with Hal and Bill that expanding
> > PowerPC double-double in the front end might be an interesting
> > optimization for the longer term, but this is really an independent
> > issue of what I'm addressing with the current patch set. As you
> > note as well, we'd still need APFloat support for things like
> > constant parsing ...
>
> it could be built on top of APFloat instead of being part of APFloat.
>
> > So I'd certainly propose we should commit something along the lines
> > of my patch soon, since without this long double is pretty much
> > unusable. Anything further can then still be done later on. Do
> > you agree, or would you object to the patch at this stage?
> >
> >
> > Now, thinking further about what we could do in the future: it
> > seems to me that to really "simplify the IR" would mean to
> > completely remove "ppc_fp128" as a primitive type on the IR level.
>
> I'm pretty sure this is what Chris has in mind (based on previous
> discussions).
Yes, this is also my understanding.
Nevertheless, I think that, regardless of the later direction, this is the correct incremental step. It does, after all, make APFloat cleaner, removing a bunch of non-working code and replacing it with (simpler) working code. Unless someone has a specific objection, let's commit this.
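On the constant-parsing point quoted above: whatever happens to the IR type, something still has to turn a decimal literal into a (high, low) pair of doubles. A minimal Python sketch of that requirement, assuming the usual convention that the high part is the literal rounded to the nearest double and the low part is the rounded remainder. The name parse_double_double is hypothetical (it is not an APFloat or clang function), and the sketch ignores overflow, infinities and NaNs:

```python
from fractions import Fraction

def parse_double_double(text):
    """Split a decimal literal into a (hi, lo) pair of doubles.

    Hypothetical helper for illustration only.
    """
    exact = Fraction(text)            # exact rational value of the literal
    hi = float(exact)                 # nearest double to the literal
    lo = float(exact - Fraction(hi))  # nearest double to what hi missed
    return hi, lo

hi, lo = parse_double_double("0.1")
print(hi, lo)
```

For "0.1" the low part is non-zero and negative, because the nearest double to 0.1 lies slightly above it.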
>
> > As long as it is still there, we'd still have to deal with it.
> > Is that what you had in mind?
> >
> > Now, in order to get rid of ppc_fp128 completely, I think there's
> > a couple of issues that need to be considered. I'm not sure I
> > understand enough LLVM infrastructure at this point to come up
> > with an exhaustive list, but here's some points that come to mind
> > immediately:
> >
> > - What about other front-ends than clang? They'd all have to be
> > changed to likewise eliminate generation of ppc_fp128 ...
>
> Correct. However if LLVM gains some utility libraries for manipulating
> "floating point number pairs" like PPC long double, this shouldn't be
> too bad. There are a bunch of classical algorithms for taking a pair
> of floating point numbers (of arbitrary precision) and having the pair
> quack like a floating point number of twice the precision. It would be
> neat to have a completely generic (generic in the size of the
> underlying floating point type) implementation of this, and use it for
> PPC long double (as far as I know PPC long doubles are an instance of
> this technique).
I agree this would be nice. We could then add it as a clang extension as well, and I think a lot of people would really like that. Nevertheless, there are a number of special cases in the algorithms, and the code will take time to develop.
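As a rough illustration of the classical pair algorithms Duncan mentions, here is a Python sketch (Python floats are IEEE doubles) of pair addition built from the TwoSum and FastTwoSum error-free transformations. The function names are mine, this is the simple "sloppy" variant, and it deliberately skips the special cases (infinities, NaNs, signed zeros, overflow). It is not a transcription of what __gcc_qadd does:

```python
def two_sum(a, b):
    """Return (s, err) with s = fl(a + b) and s + err == a + b exactly."""
    s = a + b
    bb = s - a
    err = (a - (s - bb)) + (b - bb)
    return s, err

def fast_two_sum(a, b):
    """Same contract as two_sum, but assumes |a| >= |b|."""
    s = a + b
    err = b - (s - a)
    return s, err

def dd_add(x, y):
    """Add two (hi, lo) pairs; 'sloppy' variant, no special-case handling."""
    s, e = two_sum(x[0], y[0])   # exact sum of the high parts
    e += x[1] + y[1]             # fold in both low parts
    return fast_two_sum(s, e)    # renormalize into a (hi, lo) pair

tiny = 2.0 ** -60
print(1.0 + tiny == 1.0)                  # plain double loses the small term
print(dd_add((1.0, 0.0), (tiny, 0.0)))    # the pair keeps it in the low part
```

Nothing here depends on the underlying type being double, which is what makes a generic implementation plausible. The special cases are where the real work would be.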
>
> > - How to represent ppc_fp128 values used as function arguments
> > or return values? It seems the back-end still needs to handle
> > them differently; for example, passing a long double is *nearly*
> > the same as passing two doubles, except that a long double may
> > never be split such that one half is passed in register and
> > the other in memory. *Returning* a long double is even more
> > special, since it is returned in a float register pair, unlike
> > any other type ...
>
> I think this is an issue for the front-end. If both doubles should
> go on the stack, then both should get the onstack attribute (not yet
> implemented), if both should go in registers they both get inreg. As
> for returning them, it sounds analogous to returning { double, double }
> which is what x86-64 does to return a complex number IIRC (i.e. in a
> pair of floating point registers).
>
> >
> > - What should the expansion of arithmetic operations look like?
> > Currently, these are done by calling library routines like
> > __gcc_qadd. We could expand those calls (as calls) in the
> > front end. But that would actually *reduce* the opportunities
> > for the optimizers to work on long double: currently, they
> > see "add" nodes throughout optimization (and thus can act on
> > things like operands becoming known constant). If they saw
> > only function calls, this might be more difficult.
>
> We could teach the optimizers the semantics of these library calls.
And this is where things start to get messy ;)
>
> > On the other hand, we could expand the whole algorithm used
> > by those helper routines inline. This would expose the internals
> > to the optimizers. But those algorithms are somewhat large
> > (and carefully tuned the way they are to attempt to contain
> > build-up of inaccuracies ...), and it's unclear that
> > unconditional inline expansion would really lead to better
> > performance overall, taking code growth into account.
>
> They are currently expanded inline by the code generators, so you
> already have the code growth problem.
They're not expanded unless the runtime libraries are compiled with LLVM and we're using LTO.
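To give a feel for the code-growth question, here is a Python sketch of just one building block a fully inlined multiply would need: the error-free product of two doubles via Veltkamp splitting and Dekker's product. This is the textbook construction, not necessarily what the helper routines do, and the names are mine. Without a fused multiply-add it costs 17 double operations before any low parts or renormalization are handled:

```python
SPLITTER = 134217729.0  # 2**27 + 1, the Veltkamp constant for 53-bit doubles

def split(a):
    """Split a into hi + lo, each fitting in roughly half the mantissa bits."""
    t = SPLITTER * a
    hi = t - (t - a)
    lo = a - hi
    return hi, lo

def two_prod(a, b):
    """Return (p, err) with p = fl(a * b) and p + err == a * b exactly
    (barring overflow and underflow)."""
    p = a * b
    ah, al = split(a)
    bh, bl = split(b)
    err = ((ah * bh - p) + ah * bl + al * bh) + al * bl
    return p, err

a = 1.0 + 2.0 ** -30
print(two_prod(a, a))
```

On hardware with fused multiply-add the error term collapses to a single operation, so the inline cost depends heavily on the target.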
Thanks again,
Hal
>
> Ciao, Duncan.
>
> >
> > I guess this would need some experimentation to find out what the
> > best way is, and what performance improvements (if any) we can
> > find. Overall, I expect that at this point other improvements
> > to PowerPC code generation have bigger opportunities to visibly
> > help overall performance (VSX support? full support for new-ish
> > instruction sets in general?), so I'd probably put long double
> > improvements lower on the priority list (once we actually get
> > it working at all, of course).
> >
> > Bye,
> > Ulrich
> >
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >
>
--
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory