[llvm-dev] A libc in LLVM
Stephen Canon via llvm-dev
llvm-dev at lists.llvm.org
Fri Jun 28 09:58:08 PDT 2019
> On Jun 26, 2019, at 2:20 PM, Siva Chandra via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 5. Avoid assembly language as far as possible - Again, there will be
> places where one cannot avoid assembly level implementations. But,
> wherever possible, we want to avoid assembly level implementations.
> There are a few reasons here as well:
> a) We want to leverage the compiler for performance wherever possible,
> and as part of the LLVM project, fix compiler bugs rather than use
As a long-time libm and libc developer, and occasional compiler contributor, I will point out that this goal is either fundamentally in conflict with your other stated goals, entails a commitment to wide-ranging compiler improvements, or requires some very specific choices about your implementation. Much of a libc can be implemented quite easily in C or C++. However:
- You say you want to conform to relevant standards; however, e.g. the UNIX test suite requires that math.h functions not set spurious flags. This is impossible to reliably achieve in C with clang, because clang and LLVM do not precisely model the floating-point environment. On Apple’s platforms, much of the math library is written in assembly as much for this reason as for performance. I see four basic options for you here:
1. You could partially work around this by adding builtins and an extensive conformance suite, making your implementations fragile to compiler optimization but detecting the breakages immediately.
2. You could do the work of precisely modeling the floating-point environment.
3. You could simply declare that you are not going to care about flags at all, which is fine for 99% of users, but is a clear break from relevant standards (and would make your libc unable to be adopted by some platform maintainers).
4. You could implement significant pieces of the math library in assembly.
None of these is a decision to be undertaken lightly. Have you thought about this issue at all?
I would also be curious what your plans are with regard to reproducible results in the math library: is it your intention to produce the same result on all platforms? On all microarchitectures? If so, and you’re developing for baseline x86_64 first, you’re locking yourself out of using many architectural features (static rounding control, FMA, etc.) that are critical to delivering 30-50% of performance for these functions on other platforms, and even on newer x86. Even if you don’t care about that, implementation choices you make around x86_64 will severely restrict your performance on other platforms if exact reproducibility is a requirement and you don’t carefully choose a set of “required ISA operations” on which to implement your math functions.
- For most platforms, there are significant performance wins available for some of the core strings and memory functions using assembly, even as compared to the best compiler auto-vectorization output. There are a few reasons for this, but one of the major ones is that in assembly, on most architectures, we can safely do aligned memory accesses that are partially outside the buffer that has been passed in, and mask off or ignore the bytes that are invalid. This is a hugely significant optimization for edging around core vector loops, and it’s simply unavailable in C and C++ because of the abstract memory models they define. A compiler could do this for you automatically, but this is not yet implemented in LLVM (and you don’t want to be tightly coupled to LLVM anyway?). In practice, on many systems, the small-buffer case dominates usage for these functions, so getting the most efficient edging code is basically the only thing that matters.
1. Are you going to teach LLVM to perform these optimizations? If so, awesome, but this is not at all a small project—you’re not just fixing an isolated perf bug, you’re fundamentally reworking autovectorization. What about other compilers?
2. Are you going to simply write off performance in these cases and let the autovectorizer do what it does?
3. Will you use assembly instead purely for optimization purposes?
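For reference, here is a rough sketch of what the edging code looks like when it has to stay inside the C memory model. The function sketch_memchr and its helper are hypothetical and are not taken from any existing libc. The head and tail loops run byte by byte because C does not permit touching bytes outside the buffer; an assembly implementation would typically replace them with one aligned load each and mask off the invalid bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if any byte of w is zero (classic bit trick). */
static int has_zero_byte(uint64_t w) {
    return ((w - 0x0101010101010101ULL) & ~w & 0x8080808080808080ULL) != 0;
}

/* Hypothetical memchr-like search that never reads outside [buf, buf + n). */
const void *sketch_memchr(const void *buf, int c, size_t n) {
    const unsigned char *p = buf;
    const unsigned char target = (unsigned char)c;
    const uint64_t pattern = 0x0101010101010101ULL * target;

    /* Head: byte-at-a-time until p is 8-byte aligned. */
    while (n > 0 && ((uintptr_t)p % sizeof(uint64_t)) != 0) {
        if (*p == target) return p;
        ++p;
        --n;
    }

    /* Body: whole aligned words that lie entirely inside the buffer. */
    while (n >= sizeof(uint64_t)) {
        uint64_t w;
        memcpy(&w, p, sizeof w);               /* legal, in-bounds load */
        if (has_zero_byte(w ^ pattern)) break; /* match is in this word */
        p += sizeof w;
        n -= sizeof w;
    }

    /* Tail: remaining bytes, including the word that stopped the body. */
    while (n > 0) {
        if (*p == target) return p;
        ++p;
        --n;
    }
    return NULL;
}
```

For a buffer shorter than one word, only the byte loops ever run, which is the small-buffer case described above.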
A bunch of other questions will probably come to me around the math library, but I would encourage you to think very carefully about what specifications you want to have for a libm before you start building one. All that said, I think having more libc implementations is great, but I would be very careful to define what design tradeoffs you’re making around these choices and to what spec(s) you plan to conform, and why they necessitate a new libc rather than adapting an existing one.