[flang-dev] [F18] Build time investigations
David Truby via flang-dev
flang-dev at lists.llvm.org
Mon Dec 9 08:03:23 PST 2019
Hi all,
I’ve continued looking into the build time issues mentioned before. Having looked at the new Boost Spirit design, it seems they have a mechanism for splitting their parsers across multiple files, as well as ensuring that a parsing rule doesn’t need access to the entire definition of other rules when being instantiated.
Splitting across multiple files should help with peak memory usage as well as increasing the potential for parallelism. The second result is more interesting, however, as it means that when instantiating a parsing rule the template instantiation need not be recursive.
The following example of a very simple grammar shows what this looks like:
----------------------------------------------------------------------------
x3::rule<class Add, ast::Add> const Add = "add";
x3::rule<class Sub, ast::Sub> const Sub = "sub";
x3::rule<class Paren, ast::Paren> const Paren = "paren";
auto const Paren_def = x3::double_ | ("(" >> Add >> ")");
auto const Sub_def = Paren >> "-" >> Sub;
auto const Add_def = Sub >> "+" >> Add;
BOOST_SPIRIT_DEFINE(Add, Sub, Paren);
---------------------------------------------------------------------------
In essence, the BOOST_SPIRIT_DEFINE macro creates (for each rule passed to it) a parse function that associates the declaration of the rule (as given by x3::rule…. = “name”;) with its definition. However, the definitions only need to see that a rule exists, not what it contains or which other rules it references. As such, the template instantiation depth ends up being much lower than in our parser.
If we could somehow leverage a similar mechanism in our own parsing library, we could use this to significantly reduce compile times without much changing the current design (although a relatively large amount of code change might be necessary).
Interestingly, the Boost Spirit developers found that this also increased runtime performance of the generated parsers, as the code given to the optimizer is much simpler and therefore optimizes much better.
Does anyone else have any thoughts about whether this idea might be worth investigating further?
As a side note, I am out of office from the 12th of December to the 6th of January, so I will not be able to look into this further during that time.
David Truby
On 28 Nov 2019, at 10:30, David Truby <David.Truby at arm.com<mailto:David.Truby at arm.com>> wrote:
Hi Richard,
With regards to what I’m going to look at in the immediate future, I noticed after sending the email yesterday that Boost::Spirit has recently undergone a redesign that they claim gives a 3-4x compile time speedup relative to their old implementation. While I realise that we may not want to introduce a dependency on Boost, if it really does improve compile times by this much we can at least look at what they’ve done to get this benefit. However, I don’t know where our parser sits on the spectrum of build times between Boost Spirit 2 and 3, so I’ll want to investigate that first. It’s also worth noting that Boost Spirit has its own variant implementation designed specifically for representing ASTs, which may also provide some benefit for us, but I haven’t looked into that in detail. If some combination of these things gives us good compile times while keeping the same design we have now, that would seem to me to be a good way forward.
My concern with modules is that they require changes to the code and build system that would introduce a hard dependency on a compiler and cmake version that support them. It’s not as simple as flicking a switch and getting whatever improvements modules may offer.
My calculations on when it would be possible for an in-tree LLVM project to use these are based on the policy that LLVM would like to be able to build (for example) with the system compiler in the oldest supported Ubuntu LTS release.
The Ubuntu LTS releases have 5 years of support from release and track the odd-numbered GCC releases. So the next release in April will contain GCC 9, which has no module support. That means that if we follow the same compiler policy as LLVM, the earliest we could change our code to require modules would be 5.5 years from now.
That’s assuming GCC 11 will have module support; however, there is no (even partial) module support in GCC 10 as of yet either, so assuming that modules will be stable in GCC 11 might be optimistic. If they aren’t, that gives us an Ubuntu release with no modules in 2022, which would need supporting until 2027. So although 10 years is possibly pessimistic, if we follow the same compiler policy as the rest of the LLVM tree we would be looking at 5-8 years minimum.
It’s possible that the LLVM compiler policy isn’t this concrete; however, I believe they are currently sticking to GCC 5.1 as the oldest supported compiler because it’s the system compiler in Ubuntu 16.04.
Thanks
David
On 27 Nov 2019, at 17:53, Richard Barton <Richard.Barton at arm.com<mailto:Richard.Barton at arm.com>> wrote:
Hi David
Thanks for this great update on the progress so far. I’m obviously privy to most of it already sitting as I do in the same office in Manchester, UK.
I wonder if you could share your strawman proposal of what ideas you will look into next in the absence of any other feedback from the community? Do you intend to prototype the external parser generator idea, or look further into modules to see if they would be a good fix, even if some years out?
Modules in 10 years seems like a very conservative estimate. What is your rationale here?
Ta
Rich
From: flang-dev <flang-dev-bounces at lists.llvm.org<mailto:flang-dev-bounces at lists.llvm.org>> On Behalf Of David Truby via flang-dev
Sent: 27 November, 2019 16:44
To: flang-dev at lists.llvm.org<mailto:flang-dev at lists.llvm.org>
Subject: [flang-dev] [F18] Build time investigations
Hi all,
As I mentioned on the list last week, I’ve been looking into the build time/memory usage of F18 builds and trying to narrow down where the problem lies and what possible solutions there might be. I thought I’d post an update on here to see if anyone else has any input about my findings so far.
First I’ll talk about things I tried that I don’t think would help:
I looked first into using modules to improve build times without significant code changes. However, I quickly hit some stumbling blocks with regard to implementation: there does not yet exist a module implementation that is compliant with the approved C++20 modules proposal. Clang 9 has a partial implementation, but there are some gaps, particularly as regards module header units (the import <header_file.hpp> mechanism), which I could not get to work correctly. Furthermore, cmake doesn’t have support for modules in either the current release or its source trunk; they have some plans for module support but nothing concrete as yet. As such, if we went forward with modules we would be changing our code to introduce into LLVM a dependency on an as-yet-nonexistent compiler and some future version of cmake. So my conclusion here is that this could be a solution 5 or 10 years down the line but doesn’t help currently.
With regards to writing our own llvm::variant type and using that, I did some investigation into the state of the art of variant implementations to see what their build time characteristics are. First, I found online an extremely simple implementation that didn’t have all of variant’s operations and simply used dynamic_cast/RTTI everywhere rather than the template trickery found in std::variant (this was just a proof of concept, as RTTI is not acceptable for us). I couldn’t get this to compile fully because it didn’t implement all of the variant operations we actually use, but it took longer to tell me it wouldn’t compile than our current implementation takes to compile in total, so this wouldn’t help even if I went further and got it to compile. I also investigated the variant implementation found at https://github.com/mpark/variant. This implementation has had a lot of build time performance work done on it, as you can see from the website (https://mpark.github.io/variant/), but it gives very similar performance to the std::variant currently found in libc++, on which I believe it is based. It seems that the number and depth of our template instantiations is the problem, rather than anything specific to the variant implementation.
I also briefly looked at using a custom tagged union implementation rather than std::variant, but I quickly discarded this idea due to the abnormal behaviour of non-trivial types in unions. In short, constructors and destructors are not called automatically, and when changing the active field you need to manually call the destructor of the old type and then placement-new the new type into the same memory, which is incredibly error prone. As such I don’t see this as a valid way forward.
I have also been looking at the parser, since after another look at the trace it seems a large amount of time is spent in the template instantiations of the overloaded operators that are used to generate the parser, as well as the parse_tree. This is something I’m currently looking into further, but it’s possible that with some changes to the way the parser is written we could keep the current variant-based parse tree and still see acceptable build times. I believe it is the combination of these two things that is causing the build time behaviour we see, and that changing either one could bring us down to acceptable levels. I remember using Boost::Spirit (to which the f18 parser seems to have a similar design) for my dissertation and seeing very similar build time behaviour to what we see here. In the end I changed to using an external parser generator to avoid this behaviour.
If anyone has any input on this it’d be very welcome so we can look at a way forward.
Thanks
David Truby