[llvm-dev] RFC: Refactor SubclassData

Ehud Katz via llvm-dev llvm-dev at lists.llvm.org
Wed Jan 1 23:54:11 PST 2020

The solution can support most (if not all) of the features you have
mentioned. It can always be improved if needed, but for now, it is a great,
neat, solution for the classes in the LLVM IR library.

Just for an example: using the new implementation, I have managed to move
the SSID field into the Subclass Data of the Instructions, and in turn
reduce the size of those instructions by 8 bytes.

I have opened a new review for the RFC:

Please review it.

On Mon, Dec 30, 2019 at 3:23 PM Bruno Ricci <riccibrun at gmail.com> wrote:

> Hi,
> Do you have some code we can look at (even if it is in a nasty unpolished
> state, just mark it WIP
> and put it on Phab) ? It is hard to evaluate an alternative without the
> code. That said I think
> that the table is a little bit one-sided. I have added some inline
> comments.
> On 30/12/2019 11:53, Ehud Katz wrote:
> > The solution in Clang is still very complicated and error prone. A lot
> of synchronization work (between different classes - at least in the same
> hierarchy) needs to be done manually.
> >
> > I'll summarize the capabilities of the 3 approaches in the following
> (kinda) table, using the following columns *[ Current LLVM | Clang | New
> RFC ]* :
> >
> >   * *[_|X|X] static assert *- that the declared accumulated bitfields do
> not exceed the underlying subclass data size (note that int the New
> implementation it is automatically added on declaration)
> >       o *[_|_|X] automatic static assert *- is adding the static assert
> needs to be manually or is it done automatically with the declaration of
> the new bitfield.
> This is not actually true. There is only a single static_assert for the
> size of
> the union in Stmt [1] and the same could be done for the union in DeclBase.
> >   * *[_|_|X] runtime assert* - that a new value set, fits into the the
> bitfield (without truncation).
> This is true, I agree here that this is useful.
> >   * *[_|_|X] typed* - as opposed to using a representative type (like
> `int`) and then cast to the actual required type (like `bool` or `enum`).
> Typed (ordinary) bitfields cannot be implemented correctly in MSVC, as the
> types of all the bitfields must be of the same type. Using typed bitfields
> also saves us the need to synchronize the use of `unsigned/signed int` with
> the actual type needed.
> >   * *[X|_|X] declare in actual class* - as opposed to one of the base
> classes.
> True, but I am curious to see how you can avoid doing this in the base.
> The whole point
> of doing this bit-field thing is to save space by reusing the padding
> after the base
> subobject. Maybe with an aligned char array reinterpret_cast'ed by the
> derived classes
> as mentioned before? The trick is that we don't want to repeat the members
> from the
> base classes. Also we have to avoid UB.
> >   * *[_|_|X] declare (a bitfield) in a single line* - as opposed to the
> need to declare helpers or somekind, like `enum` (manually)>   * *[_|_|X]
> clean bitfields* - without exposing a bit manipulation `enum`.
> The enum is only used by the other bit-field classes. This is invisible to
> the class itself.
> No helpers are needed: just write SomeClassBits.SomeBit to refer to
> SomeBit in the
> bit-field of SomeClass.
> >   * *[_|_|X] automatic inheritance of unused bits* - no need to get
> offset from super (manually).
> >   * *[_|_|X] automatic calculation of unused bits* - changing a single
> bitfield doesn't require any other change, but the actual bitfield itself
> (as opposed to changing also the sum of the bit count used by the class, in
> an `enum` - for exmple).
> >   * *[_|_|X] implicit reference to superclass* - as opposed to the need
> to use the base class' info explicitly.
> >   * *[_|_|X] no need to know anything about any of the base classes*.
> I agree that forgetting to update the offset in the enum can be a source
> of error (but
> the enum is just a few lines below so it is hard to miss). Some things to
> keep in mind:
> - Some bit-fields are aligned to a byte boundary for (benchmarked) faster
> access [2]
> - Some bit-field classes have a hole which is used by derived classes [3]
> - Some bit-fields do not have a fixed size, just something "large enough".
>   (this is not ideal and it would be better to have a well-defined limit
>    which could then be used to trigger an error instead of just
> overflowing,
>    that's on my TODO list...) [4]
> >
> > I think the table speaks for itself.
> >
> > Craig, regarding the `getSubclassDataFromInstruction()`, it still does
> not turn the tides of the table, above, into the current implementation.
> >
> Cheers!
> [1]
> https://github.com/llvm/llvm-project/blob/master/clang/include/clang/AST/Stmt.h#L1088
> [2]
> https://github.com/llvm/llvm-project/blob/master/clang/include/clang/AST/Stmt.h#L471
> [3]
> https://github.com/llvm/llvm-project/blob/master/clang/include/clang/AST/Stmt.h#L849
> [4]
> https://github.com/llvm/llvm-project/blob/master/clang/include/clang/AST/Stmt.h#L804
> Bruno Ricci
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200102/60ef37cc/attachment.html>

More information about the llvm-dev mailing list