[llvm-commits] [RFC/PATCH] PPCDoubleDouble compile-time arithmetic

Sun Oct 28 17:57:13 PDT 2012

Hi Chris,

I think we should add the possibility of exposing the internals of long
double to the lengthy list of PowerPC tasks, but as Hal indicated, it
will be a rather large step and not simple; so it would be appropriate
to treat this as a longer-term effort.  Because "long double" is a
special entity at the ABI level and has its own rules for parameter
passing, structural layout, and so forth, care would have to be taken to
ensure the "long-doubleness" remains visible at function call boundaries
and when calculating alignment.  Within structures, a pair of contiguous
doubles has 8-byte alignment, but a long double has 16-byte alignment.
I am sure all these issues can be solved with sufficient hackery, I mean
design ;), but it requires careful thought and nontrivial effort.
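
To make the layout point concrete, here is a minimal C sketch.  The
struct and function names are hypothetical, made up purely for
illustration, and the values it yields depend entirely on the target
ABI; the 8-byte versus 16-byte figures mentioned above are the 64-bit
PowerPC Linux case, not something this code guarantees elsewhere.

```c
#include <stddef.h>

/* Hypothetical types, for illustration only. */
struct double_pair { double hi; double lo; };  /* what pre-expanded code would see */
struct with_pair   { char tag; struct double_pair v; };
struct with_ld     { char tag; long double v; };

/* Offset of the payload inside each wrapper struct.  The two results
 * differ whenever the ABI aligns long double more strictly than double. */
size_t pair_offset(void) { return offsetof(struct with_pair, v); }
size_t ld_offset(void)   { return offsetof(struct with_ld, v); }
```

Wherever pair_offset() and ld_offset() differ, a front end that replaced
long double with a bare pair of doubles would lay out with_ld
incorrectly unless it carried the stricter alignment along.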

Right now we are working hard just to get the PPC64 ELF backend into ABI
compliance, and there is much remaining work before that will be
completed.  Having a functional long double implementation will be a big
step for eventually getting LLVM/Clang to self-host on 64-bit PowerPC
Linux.  After we all have more experience with LLVM/Clang, optimizations
such as you suggest here will be more feasible for us to implement.

If it's OK with you, I'd like to record this discussion in our list of
eventual work items, but for now Uli's patch will really be a
significant milestone for us.  I hope that seems reasonable.

Thanks!
Bill

-- 
Bill Schmidt, Ph.D.
IBM Advance Toolchain for PowerLinux
IBM Linux Technology Center
wschmidt at us.ibm.com
wschmidt at linux.vnet.ibm.com

> I'm not going to stand in the way of this patch if it's the right incremental step, but I would really really like to get rid of double double support from LLVM IR.  It just doesn't make any sense to have a type that is always expanded in codegen.  It would also be nice to generate faster and more correct code than gcc!
> 
> -Chris
> 
> On Oct 27, 2012, at 6:45 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> > ----- Original Message -----
> >> From: "Chris Lattner" <clattner at apple.com>
> >> To: "Ulrich Weigand" <Ulrich.Weigand at de.ibm.com>
> >> Cc: llvm-commits at cs.uiuc.edu
> >> Sent: Saturday, October 27, 2012 11:03:31 PM
> > >> Subject: Re: [llvm-commits] [RFC/PATCH] PPCDoubleDouble compile-time arithmetic
> >> 
> >> Hi Ulrich,
> >> 
> >> Given that the PowerPC format expands out into operations on two
> >> doubles, how reasonable would it be for clang to generate
> >> pre-expanded IR that exposed this lowering to the optimizers?
> > 
> > As I recall, Bill had some thought about how this would interact with the ABI requirements. Bill?
> > 
> > Chris, I'd like to get this patch in, even if we would like to move some/all support into the frontend. With this patch, we move from something that is 100% broken to something that is 99.9% functional, plus we get a nice cleanup in APFloat. Moving double-double support into clang looks like a major project.
> > 
> > I do certainly agree, however, that being able to inline the arithmetic seems like a nice performance win (and it could then be vectorized too).
> > 
> > Thanks again,
> > Hal
> > 
> >> 
> >> This wouldn't help you with constant parsing, but would simplify the
> >> IR and optimizer and almost certainly give you better code quality
> >> for this type.
> >> 
> >> -Chris
> >> 
> >> On Oct 26, 2012, at 3:27 AM, Ulrich Weigand
> >> <Ulrich.Weigand at de.ibm.com> wrote:
> >> 
> >>> 
> >>> Hello,
> >>> 
> > >>> on PowerPC, there is no true "long double" data type supported by
> > >>> hardware.  The PowerPC ABI instead defines "long double" to be a
> > >>> 128-bit type interpreted as a pair of doubles.  The LLVM back-end
> > >>> seems to support code generation involving this data type well
> > >>> enough.  However, the clang front-end currently has only extremely
> > >>> limited support for this type; in particular, it is unable to
> > >>> parse long double floating-point constants.
> >>> 
> > >>> The reason for this is that while the APFloat data type provides
> > >>> PPCDoubleDouble floating-point semantics, it disallows any
> > >>> compile-time arithmetic on such numbers.  One way to implement
> > >>> this would be to fully emulate the operations done by run-time
> > >>> arithmetic routines on double pairs.  However, this is a
> > >>> significant effort to ensure equivalent results, and would also
> > >>> require restructuring of the APFloat data type and operations.
> >>> 
> > >>> On the other hand, GCC doesn't implement long double compile-time
> > >>> arithmetic on PowerPC this way either.  Instead, GCC's real.c
> > >>> simply pretends the type is a 106-bit IEEE floating-point type,
> > >>> and implements all operations using its regular IEEE arithmetic
> > >>> routines, parametrized to the corresponding "pretend" mantissa and
> > >>> exponent sizes.  This has the effect that not all operations give
> > >>> the same result as run-time operations on double-double pairs
> > >>> would, but it is good enough for the most common use cases (where
> > >>> "long double" is in fact used as if it were an IEEE type with
> > >>> larger mantissa).  In particular, it's good enough to parse
> > >>> floating-point constants ...
> >>> 
> > >>> It turns out that it is quite straightforward to implement long
> > >>> double arithmetic along those same lines in LLVM's APFloat.  The
> > >>> patch below implements a representation that is exactly equivalent
> > >>> to GCC's real.c representation of long double on PowerPC.  This
> > >>> fixes a large number of test suite failures (no test fails due to
> > >>> long double issues any more):
> >>> Clang :: ARCMT/objcmt-numeric-literals.m
> >>> Clang :: CXX/expr/p9.cpp
> >>> Clang :: CXX/lex/lex.literal/lex.ext/p4.cpp
> >>> Clang :: CXX/lex/lex.literal/lex.ext/p7.cpp
> >>> Clang :: CodeGen/2008-01-21-PackedStructField.c
> >>> Clang :: CodeGen/builtins.c
> >>> Clang :: CodeGen/global-with-initialiser.c
> >>> Clang :: Sema/builtin-unary-fp.c
> >>> Clang :: Sema/constant-builtins-2.c
> >>> Clang :: Sema/constant-builtins.c
> >>> Clang :: SemaCXX/cxx11-ast-print.cpp
> >>> Clang :: SemaObjC/objc-literal-nsnumber.m
> > >>> Clang-Unit :: AST/Release+Asserts/ASTTests/StmtPrinter.TestFloatingPointLiteral
> >>> MultiSource/Applications/sqlite3/sqlite3
> >>> MultiSource/Benchmarks/McCat/08-main/main
> >>> MultiSource/Benchmarks/MiBench/automotive-basicmath/automotive-basicmath
> >>> MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4  (*)
> >>> SingleSource/Benchmarks/CoyoteBench/fftbench
> >>> SingleSource/Benchmarks/Misc-C++-EH/spirit
> >>> SingleSource/Benchmarks/Misc-C++/Large/ray
> >>> SingleSource/Benchmarks/Misc/mandel
> >>> SingleSource/UnitTests/2009-04-16-BitfieldInitialization
> >>> SingleSource/UnitTests/byval-alignment
> >>> 
> >>> (*) additionally requires two other patches to fix unrelated
> >>> problems
> >>> 
> >>> 
> > >>> The first patch appended below implements the core arithmetic
> > >>> routines to treat PPCDoubleDouble as a 106-bit mantissa type,
> > >>> including a couple of unit tests verifying basic behaviour.  Two
> > >>> follow-on patches clean up APFloat code further: the first by
> > >>> removing the now unused "sign2" and "exponent2" bit fields, and
> > >>> the second by removing the now unused "arithmeticOK" logic.  A
> > >>> final fourth patch removes a number of special-case checks for
> > >>> PPCDoubleDouble in the LLVM back-end, where the code used to
> > >>> explicitly avoid performing compile-time arithmetic on such
> > >>> numbers since it wasn't implemented.
> >>> 
> > >>> Note that this fourth patch also includes a tweak to a test case;
> > >>> that test explicitly verified that converting a constant integer 0
> > >>> to PowerPC long double invokes a run-time library call.  Since
> > >>> this is now actually done at compile time, that routine is no
> > >>> longer used in that test.
> >>> 
> >>> 
> >>> Would this be OK to commit?
> >>> 
> >>> Bye,
> >>> Ulrich
> >>> 
> >>> (See attached file: diff-llvm-ppcdoubledouble)
> >>> (See attached file: diff-llvm-ppcdoubledouble-cleanup)
> >>> (See attached file: diff-llvm-ppcdoubledouble-arithmeticok)
> >>> (See attached file: diff-llvm-ppcdoubledouble-enable)
> >>> _______________________________________________
> >>> llvm-commits mailing list
> >>> llvm-commits at cs.uiuc.edu
> >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> > 
> > -- 
> > Hal Finkel
> > Postdoctoral Appointee
> > Leadership Computing Facility
> > Argonne National Laboratory
>