[llvm-dev] [RFC] Abstracting over SSA form IRs to implement generic analyses

Mehdi AMINI via llvm-dev llvm-dev at lists.llvm.org
Tue Mar 30 17:25:47 PDT 2021


On Tue, Mar 30, 2021 at 9:20 AM Sameer Sahasrabuddhe <sameer at sbuddhe.net>
wrote:

> ---- On Thu, 11 Feb 2021 06:07:17 +0530 David Blaikie via llvm-dev <
> llvm-dev at lists.llvm.org> wrote ----
>
>  > Owen - perhaps you've got some relevant perspective here given the GPU
>  > context/background, and general middle end optimization design
>  > questions. Could you take a look?
>  > (& perhaps Johannes?)
>  >
>  > On Mon, Feb 1, 2021 at 11:36 AM David Blaikie <dblaikie at gmail.com>
> wrote:
>  > >
>  > > Thanks for sending this out!
>  > >
>  > > +Mehdi - if you have a chance to look, I'd appreciate your thoughts
> on this, partly as a general LLVM contributor, but also with an eye to
> abstractions over multiple IRs given your work on MLIR
>  > > +Lang - you mentioned maybe some colleagues of yours might be able to
> take a look at the design here?
>  > >
>  > > I'd like to try writing up alternatives to a couple of the patches
> in the series to see how they compare, but haven't done that yet.
>  > >
>  > > My brief take is that I think the opaque handles make a fair bit of
> sense/seem high-value/low-cost (if it turned out the handle for one of
> these IR types needed to become a bit heavier, I don't think it'd be super
> difficult to revisit this code - make the generic handle large enough to
> accommodate whatever the largest concrete handle happens to be, etc).
> Though I'm still a bit less certain about the full runtime polymorphic
> abstractions.
>
> For context, this is about the options laid out in the following document
> from Nicolai:
>
>
> https://docs.google.com/document/d/1sbeGw5uNGFV0ZPVk6h8Q5_dRhk4qFnKHa-uZ-O3c4UY/edit?usp=sharing
>
> The above document was first introduced in the following email, which is
> the starting point of the current thread:
>
> https://lists.llvm.org/pipermail/llvm-dev/2020-December/147433.html
>
> The main issue is the need for an abstraction that greatly improves the
> experience of writing analyses that work on both LLVM IR and MIR (and
> potentially also on MLIR). The way I understand it, Nicolai proposed an
> abstraction that uses dynamic polymorphism to hide the details of the
> underlying IR, while David was more in favour of first unifying the IRs
> to the point that dynamic polymorphism is needed only for occasional
> corner cases.
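>
> To make that contrast concrete, here is a minimal sketch of the kind of
> dynamically polymorphic interface being discussed. The names here are
> illustrative, not the actual classes from Nicolai's patches:
>
>   #include "llvm/ADT/SmallVector.h"
>
>   // Opaque, cheap-to-copy handle to a block of some underlying IR.
>   struct CfgBlockRef { void *Ptr = nullptr; };
>
>   // Abstract interface with one concrete subclass per IR. A generic
>   // analysis is written against this, paying a virtual call per query.
>   class CfgInterface {
>   public:
>     virtual ~CfgInterface() = default;
>     // Append the predecessors/successors of Block to List.
>     virtual void appendPredecessors(
>         CfgBlockRef Block,
>         llvm::SmallVectorImpl<CfgBlockRef> &List) const = 0;
>     virtual void appendSuccessors(
>         CfgBlockRef Block,
>         llvm::SmallVectorImpl<CfgBlockRef> &List) const = 0;
>   };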
>
> I am currently working on decoupling AMDGPU from its dependency on the
> abstractions being discussed here, and the one thing that immediately
> jumps out is the ability to traverse the predecessors and successors of
> a basic block. From all the emails so far and the resulting proposal,
> it seems that both David and Nicolai think that having a common
> non-abstract base class for basic blocks is a good idea. Such a base
> class would provide vectors of predecessors and successors that can be
> used to traverse the CFG independently of the actual IR.
>
> The current situation is that the basic blocks in LLVM IR and MLIR do not
> explicitly track their preds and succs. Preds are determined from the uses
> of the block, while succs are determined from the terminator instruction.
> MIR has explicit pred and succ vectors, but the succ pointers are
> duplicated by their presence as operands to the terminator instructions.
> The MachineBasicBlock is not a value, so there is no notion of traversing
> its uses for predecessors.
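>
> For reference, this is what the traversals look like today (the APIs
> below are the real in-tree ones; only the wrapper functions are made up
> for illustration):
>
>   #include "llvm/CodeGen/MachineBasicBlock.h"
>   #include "llvm/IR/CFG.h"
>
>   void walkIR(llvm::BasicBlock &BB) {
>     // Preds are recovered from the uses of BB, succs from its
>     // terminator instruction.
>     for (llvm::BasicBlock *Pred : llvm::predecessors(&BB)) { (void)Pred; }
>     for (llvm::BasicBlock *Succ : llvm::successors(&BB)) { (void)Succ; }
>   }
>
>   void walkMIR(llvm::MachineBasicBlock &MBB) {
>     // MIR reads the explicit vectors stored on the block itself.
>     for (llvm::MachineBasicBlock *P : MBB.predecessors()) { (void)P; }
>     for (llvm::MachineBasicBlock *S : MBB.successors()) { (void)S; }
>   }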
>
> So if we create a common non-abstract base class for basic blocks, then
> MLIR and LLVM IR will incur the overhead of two vectors per block, one
> each for the preds and succs. Nicolai had reported a 3.1% to 4.1%
> increase in the size of LLVM IR for some typical programs:
>
>
> https://docs.google.com/spreadsheets/d/1cwRy2K4XjWCjwfK53MuCqws1TsQ0S6FtRjbwoUdGBfo/edit?usp=sharing
>
> Is this overhead the main issue in deciding whether we should introduce
> the common base class for traversing a CFG? Do note that MIR already
> duplicates successors. One could argue that duplicating the successor
> pointers in LLVM IR and MLIR merely brings them to the same level as
> MIR. If that argument sticks, then the "real" overhead is only the half
> of the reported increase that comes from the duplicated predecessor
> pointers.
>
> The duplication of predecessor pointers stems from the fact that basic
> blocks in LLVM IR and MLIR are also values that are (primarily) used as
> operands to the terminator and PHI instructions. We could redefine these
> instructions to use non-value basic blocks as operands. CFG edges are not
> the typical use-def relation that other values represent, and it's
> reasonable to say that PHINodes and terminators are special in this one
> way.  I have not managed to see how far that rabbit hole goes, but I am not
> really convinced that this is attractive in any way. Is this even an option?
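>
> For concreteness, these are the two places where LLVM IR reaches a
> block through its uses as a value (the APIs are the real ones; the
> wrapper function is just for illustration):
>
>   #include "llvm/IR/Instructions.h"
>
>   void inspect(llvm::PHINode &Phi, llvm::BranchInst &Br) {
>     // Each incoming block of a PHI is an operand, i.e. a use of the
>     // block-as-value.
>     for (unsigned I = 0, E = Phi.getNumIncomingValues(); I != E; ++I)
>       (void)Phi.getIncomingBlock(I);
>     // Likewise, a terminator's successors are block operands.
>     for (unsigned I = 0, E = Br.getNumSuccessors(); I != E; ++I)
>       (void)Br.getSuccessor(I);
>   }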
>

I'm not knowledgeable about everything you're looking at, but one minor
thing: in MLIR, blocks aren't regular value operands and aren't part of
the value hierarchy, so they don't participate in the usual use-def
chains (and MLIR does not have PHI nodes either).
Note that LLVM, by contrast, takes advantage of blocks being values to
implement the `blockaddress` constant, which would be difficult to model
in MLIR at the moment.
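
For concreteness, a minimal sketch of how `blockaddress` leans on blocks
being values (the helper function is made up, but the C++ API calls are
the real ones):

  #include "llvm/IR/Constants.h"    // llvm::BlockAddress
  #include "llvm/IR/IRBuilder.h"
  #include "llvm/IR/Instructions.h" // llvm::IndirectBrInst

  // Taking the address of a block is only expressible because the
  // block is a first-class value that a constant can refer to.
  llvm::IndirectBrInst *emitIndirectJump(llvm::IRBuilder<> &B,
                                         llvm::Function *F,
                                         llvm::BasicBlock *Target) {
    llvm::BlockAddress *Addr = llvm::BlockAddress::get(F, Target);
    llvm::IndirectBrInst *IBr =
        B.CreateIndirectBr(Addr, /*NumDests=*/1);
    IBr->addDestination(Target);
    return IBr;
  }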

Best,

-- 
Mehdi