[llvm-dev] RFC: Refactor SubclassData

Ehud Katz via llvm-dev llvm-dev at lists.llvm.org
Tue Jan 28 12:45:05 PST 2020


It's been awhile... any comment?

On Thu, 2 Jan 2020 at 9:54 Ehud Katz <ehudkatz at gmail.com> wrote:

> The solution can support most (if not all) of the features you have
> mentioned. It can always be improved if needed, but for now, it is a great,
> neat, solution for the classes in the LLVM IR library.
>
> Just for an example: using the new implementation, I have managed to move
> the SSID field into the Subclass Data of the Instructions, and in turn
> reduce the size of those instructions by 8 bytes.
>
> I have opened a new review for the RFC:
> https://reviews.llvm.org/D72068
>
> Please review it.
> Thanks,
> Ehud.
>
> On Mon, Dec 30, 2019 at 3:23 PM Bruno Ricci <riccibrun at gmail.com> wrote:
>
>> Hi,
>>
>> Do you have some code we can look at (even if it is in a nasty unpolished
>> state, just mark it WIP
>> and put it on Phab) ? It is hard to evaluate an alternative without the
>> code. That said I think
>> that the table is a little bit one-sided. I have added some inline
>> comments.
>>
>> On 30/12/2019 11:53, Ehud Katz wrote:
>> > The solution in Clang is still very complicated and error prone. A lot
>> of synchronization work (between different classes - at least in the same
>> hierarchy) needs to be done manually.
>> >
>> > I'll summarize the capabilities of the 3 approaches in the following
>> (kinda) table, using the following columns *[ Current LLVM | Clang | New
>> RFC ]* :
>> >
>> >   * *[_|X|X] static assert *- that the declared accumulated bitfields
>> do not exceed the underlying subclass data size (note that int the New
>> implementation it is automatically added on declaration)
>> >       o *[_|_|X] automatic static assert *- is adding the static assert
>> needs to be manually or is it done automatically with the declaration of
>> the new bitfield.
>>
>> This is not actually true. There is only a single static_assert for the
>> size of
>> the union in Stmt [1] and the same could be done for the union in
>> DeclBase.
>>
>> >   * *[_|_|X] runtime assert* - that a new value set, fits into the the
>> bitfield (without truncation).
>>
>> This is true, I agree here that this is useful.
>>
>> >   * *[_|_|X] typed* - as opposed to using a representative type (like
>> `int`) and then cast to the actual required type (like `bool` or `enum`).
>> Typed (ordinary) bitfields cannot be implemented correctly in MSVC, as the
>> types of all the bitfields must be of the same type. Using typed bitfields
>> also saves us the need to synchronize the use of `unsigned/signed int` with
>> the actual type needed.
>> >   * *[X|_|X] declare in actual class* - as opposed to one of the base
>> classes.
>>
>> True, but I am curious to see how you can avoid doing this in the base.
>> The whole point
>> of doing this bit-field thing is to save space by reusing the padding
>> after the base
>> subobject. Maybe with an aligned char array reinterpret_cast'ed by the
>> derived classes
>> as mentioned before? The trick is that we don't want to repeat the
>> members from the
>> base classes. Also we have to avoid UB.
>>
>> >   * *[_|_|X] declare (a bitfield) in a single line* - as opposed to the
>> need to declare helpers or somekind, like `enum` (manually)>   * *[_|_|X]
>> clean bitfields* - without exposing a bit manipulation `enum`.
>>
>> The enum is only used by the other bit-field classes. This is invisible
>> to the class itself.
>> No helpers are needed: just write SomeClassBits.SomeBit to refer to
>> SomeBit in the
>> bit-field of SomeClass.
>>
>> >   * *[_|_|X] automatic inheritance of unused bits* - no need to get
>> offset from super (manually).
>> >   * *[_|_|X] automatic calculation of unused bits* - changing a single
>> bitfield doesn't require any other change, but the actual bitfield itself
>> (as opposed to changing also the sum of the bit count used by the class, in
>> an `enum` - for exmple).
>> >   * *[_|_|X] implicit reference to superclass* - as opposed to the need
>> to use the base class' info explicitly.
>> >   * *[_|_|X] no need to know anything about any of the base classes*.
>>
>> I agree that forgetting to update the offset in the enum can be a source
>> of error (but
>> the enum is just a few lines below so it is hard to miss). Some things to
>> keep in mind:
>>
>> - Some bit-fields are aligned to a byte boundary for (benchmarked) faster
>> access [2]
>> - Some bit-field classes have a hole which is used by derived classes [3]
>> - Some bit-fields do not have a fixed size, just something "large enough".
>>   (this is not ideal and it would be better to have a well-defined limit
>>    which could then be used to trigger an error instead of just
>> overflowing,
>>    that's on my TODO list...) [4]
>>
>> >
>> > I think the table speaks for itself.
>> >
>> > Craig, regarding the `getSubclassDataFromInstruction()`, it still does
>> not turn the tides of the table, above, into the current implementation.
>> >
>>
>> Cheers!
>>
>> [1]
>> https://github.com/llvm/llvm-project/blob/master/clang/include/clang/AST/Stmt.h#L1088
>> [2]
>> https://github.com/llvm/llvm-project/blob/master/clang/include/clang/AST/Stmt.h#L471
>> [3]
>> https://github.com/llvm/llvm-project/blob/master/clang/include/clang/AST/Stmt.h#L849
>> [4]
>> https://github.com/llvm/llvm-project/blob/master/clang/include/clang/AST/Stmt.h#L804
>>
>> Bruno Ricci
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200128/6d3ecb2c/attachment.html>


More information about the llvm-dev mailing list