[llvm-dev] [RFC] Abstracting over SSA form IRs to implement generic analyses
Sameer Sahasrabuddhe via llvm-dev
llvm-dev at lists.llvm.org
Thu Apr 1 01:54:29 PDT 2021
Nicolai Hähnle writes:
> A few high-level remarks from what I remember. Let's always keep in mind
> that there are a whole bunch of puzzle pieces in flight here, and there's
> always a risk that the discussion gets off track by losing sight of that
> bigger picture. Some of those puzzle pieces can hopefully be
> non-controversial. In particular, I'm thinking here of:
>
> 1. Aligning the different IR classes so that they implement common concepts
> and allow writing templates idiomatically.
> 2. Allowing the use of opaque handles
> <https://docs.google.com/document/d/1sbeGw5uNGFV0ZPVk6h8Q5_dRhk4qFnKHa-uZ-O3c4UY/edit?ts=5fd124e3#heading=h.adqu14d3b7jd>
> as described in the document.
I agree, if you mean to say that the above two things are not mutually
exclusive and that both are beneficial. At least from my point of view,
templates over a common set of concepts are very valuable for writing
utility functions, but become cumbersome for writing large
analyses. Opaque handles (combined with a context class) can make it
easier to separate the IR-independent implementation from the
IR-dependent interface.
The context class provides a way to look up properties of an opaque
handle (such as the successors of a BlockHandle object) through
virtual functions. But having a non-abstract base class greatly
reduces the surface area of these virtual functions, since the most
common operations, like successors(), are readily available in the
base class itself.
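To make that concrete, here is a minimal sketch of the combination I
have in mind. All the names (SsaContext, BlockHandle, BasicBlockBase)
are hypothetical and only illustrate the shape of the design, not an
existing interface:

  #include "llvm/ADT/ArrayRef.h"
  #include "llvm/ADT/SmallVector.h"
  #include "llvm/ADT/StringRef.h"
  using namespace llvm;

  // Opaque handles that wrap an IR-specific object.
  struct BlockHandle { void *Ptr = nullptr; };
  struct ValueHandle { void *Ptr = nullptr; };

  // Common, non-abstract base class: the hottest queries are plain
  // member functions and never go through a virtual call.
  class BasicBlockBase {
    SmallVector<BasicBlockBase *, 2> Succs;
    SmallVector<BasicBlockBase *, 2> Preds;

  public:
    ArrayRef<BasicBlockBase *> successors() const { return Succs; }
    ArrayRef<BasicBlockBase *> predecessors() const { return Preds; }
  };

  // Context class: only the queries that cannot live in the base
  // class remain virtual, e.g. mapping a value to its defining block.
  class SsaContext {
  public:
    virtual ~SsaContext() = default;
    virtual BlockHandle getDefiningBlock(ValueHandle V) const = 0;
    virtual StringRef getName(BlockHandle B) const = 0;
  };

An IR-specific subclass of SsaContext (one for LLVM IR, one for MIR,
and so on) implements the virtual lookups, while generic code
traverses the CFG directly through BasicBlockBase.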
That is related to what Mehdi asked in a sibling email:
> I'm fairly old school here I guess, as I'd have naively written an
> abstract base class with virtual methods (or function_ref callback
> into the implementation), but there are concerns with the indirection
> compared to a templated implementation.
> But aren't we trading the cost of this indirection with even more cost
> if we have to duplicate information in a base class?
Yes, we are avoiding indirect calls (not to be confused with mere
indirection through a data pointer). It's useful to ask whether
indirect calls (through virtual functions) are acceptable when
computing an analysis, as opposed to doing something that modifies the
IR itself. Is it an option to let each analysis worry about its own
performance, perhaps by caching the internal results of the virtual
calls? Presumably, using the results of an analysis, or updating it,
incurs only an incremental cost from indirect calls?
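As a rough illustration of the caching idea (reusing the hypothetical
SsaContext, ValueHandle and BlockHandle types from the sketch above),
an analysis could pay for the indirect call once per value and answer
repeated queries from its own table:

  #include "llvm/ADT/DenseMap.h"

  // Sketch only: cache the answer to a virtual query so that the
  // indirect call happens once per value rather than once per use.
  class MyAnalysis {
    const SsaContext &Ctx;
    DenseMap<void *, BlockHandle> DefBlockCache;

  public:
    explicit MyAnalysis(const SsaContext &Ctx) : Ctx(Ctx) {}

    BlockHandle getDefiningBlock(ValueHandle V) {
      auto It = DefBlockCache.find(V.Ptr);
      if (It != DefBlockCache.end())
        return It->second;
      BlockHandle B = Ctx.getDefiningBlock(V); // the only indirect call
      DefBlockCache.try_emplace(V.Ptr, B);
      return B;
    }
  };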
If we want to move away from templates, then dynamic polymorphism
cannot be completely avoided, for example when looking up the
definition of a value. But having a common base class can take out
large chunks of the surface area. The poster child for this is CFG
traversal along the successor/predecessor pointers.
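For example, a generic reachability walk needs nothing beyond the base
class (again the hypothetical BasicBlockBase from above), so no
indirect calls are made on the hot path:

  #include "llvm/ADT/SmallPtrSet.h"
  #include "llvm/ADT/SmallVector.h"

  // Sketch: collect every block reachable from Entry using only the
  // successor vectors stored in the common base class.
  void collectReachable(BasicBlockBase *Entry,
                        SmallPtrSetImpl<BasicBlockBase *> &Reached) {
    SmallVector<BasicBlockBase *, 16> Worklist = {Entry};
    while (!Worklist.empty()) {
      BasicBlockBase *BB = Worklist.pop_back_val();
      if (!Reached.insert(BB).second)
        continue; // already visited
      for (BasicBlockBase *Succ : BB->successors())
        Worklist.push_back(Succ); // non-virtual successor access
    }
  }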
> An observation about the performance implications of common base classes
> for basic blocks. I'm taking it as a given that a common base class would
> have predecessor and successor vectors. In addition to making predecessor
> and successor iteration simpler, this could actually end up _saving_ memory
> for LLVM IR in many cases if we make the following additional changes:
>
> 1. Change the definition of phi nodes so that they only contain the
> predecessor _values_ in their argument lists (removing all the arguments
> referring to the predecessor basic blocks).
> 2. Change the definition of terminator instructions so that they no longer
> contain the successor basic block(s) in their argument lists and instead
> implicitly refer to the basic block's successor list. For example, a
> conditional branch would be defined such that its only argument is the i1
> condition. If the condition is true, the program branches to
> brInst->getParent()->getSuccessors()[0]; if the condition is false, it
> branches to brInst->getParent()->getSuccessors()[1].
>
> This is an option that I hadn't taken into account when I calculated those
> 3-4% of memory overhead, but it should end up saving memory whenever a
> basic block has >= 2 phi nodes (the vectors themselves have an overhead, so
> it depends). My experiment would have to be re-done to get a better idea of
> the final balance.
>
> The changes mentioned above are admittedly very invasive and complicated
> changes. It would be good to get an idea of what sort of acceptance
> criterion the LLVM community has here.
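To restate the proposed branch semantics in code (getSuccessors() on
BasicBlock is of course hypothetical; today a branch carries its
targets as operands):

  // Sketch of how a conditional branch would resolve its targets if
  // successor lists lived only on the parent block.
  BasicBlock *getTakenSuccessor(BranchInst *BI, bool CondIsTrue) {
    BasicBlock *Parent = BI->getParent();
    if (BI->isUnconditional())
      return Parent->getSuccessors()[0];       // hypothetical accessor
    return CondIsTrue ? Parent->getSuccessors()[0]
                      : Parent->getSuccessors()[1];
  }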
Guess what, I had actually attempted this before reviving this thread!
My first experiment was to put an "assert(getParent())" in the
{get,set}Successor() methods. That didn't result in any interesting
failures. The real problem is with the terminator constructors. Usually,
any Instruction::Create()-style function optionally takes an
InsertBefore or InsertAtEnd argument. But if we require terminators to
access their successors from the parent block, then variants of, say,
BranchInst::Create() that take successor pointers as arguments must now
require either an InsertBefore or an InsertAtEnd. That breaks all sorts
of code that manipulates the CFG.
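For reference, this is the kind of existing pattern that breaks.
BranchInst::Create(), insertBefore() and eraseFromParent() are the
real API; the surrounding function is just an illustration:

  #include "llvm/IR/BasicBlock.h"
  #include "llvm/IR/Instructions.h"
  using namespace llvm;

  // Replace the terminator of BB with a conditional branch. TrueBB,
  // FalseBB and Cond are placeholders for whatever the caller computed.
  void replaceTerminator(BasicBlock *BB, BasicBlock *TrueBB,
                         BasicBlock *FalseBB, Value *Cond) {
    Instruction *OldTerm = BB->getTerminator();
    // The branch is created in a vacuum, with its successors as operands.
    BranchInst *Br = BranchInst::Create(TrueBB, FalseBB, Cond);
    Br->insertBefore(OldTerm); // the parent is only known at this point
    OldTerm->eraseFromParent();
    // If successors lived on the parent block instead, Create() would
    // already need an insertion point just to know which block's
    // successor list to populate.
  }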
Requiring a parent block when creating a terminator is a lot of work,
and I wasn't sure that the gain was worth the effort. It would require
refactoring the many places that are built around the assumption that
anyone can create an instruction in a vacuum. Instead, redefining
BasicBlock so that it is not derived from the Value class seemed more
practical to me. It eliminates the same overhead of redundantly
tracking CFG edges. For the special uses of llvm::BasicBlock as a
value, we could provide a new BasicBlockValue class. I tried removing
the Value base class from the BasicBlock declaration, but that
experiment is too big to try on my own, especially without some
consensus forming in this discussion.
Sameer.