[cfe-dev] [RFC] Handling implementation limits

Wed Jan 1 20:14:16 PST 2020

On 1 Jan 2020, at 11:16, Mark de Wever wrote:
> This RFC asks the community for its interest in centralizing Clang's
> implementation limits and how you feel about the proposed approach.
>
> Abstract
> ========
>
> At the moment the implementation limits of the Clang compiler are hidden
> in magic numbers in the code. These numbers sometimes differ between the
> frontend and the backend. They are not always properly guarded by compiler
> warnings; instead, they trigger assertion failures.

*Technically*, crashing is still a valid way of indicating non-acceptance,
although obviously I agree that we should diagnose these things properly.
(They can’t just be warnings, though.)

> This proposal suggests a TableGen based solution to better maintain these
> limits. This solution will be able to generate a header and documentation
> of these limits. The latter is required by the C++ standard [implimits].

I think this is a great approach.

> The problem
> ===========
>
> The proposal tries to solve 2 issues:
> * Implementation limits are not always clear and are duplicated in the
>   source.
> * Documenting the C++ implementation quantities is required by the C++
>   standard [implimits].

I suspect that the biggest part of this project by far will be testing
these implementation limits and then figuring out all the places that they
fall over.

> Unclear limits
> --------------
>
> The compiler has several limitations often 'hidden' as the width of a
> bit-field in a structure. This makes it hard to find these limits. Not
> only for our users but also for developers. While looking at a proper
> limit for the maximum width of a bit-field in the frontend I discovered
> it didn't matter what the frontend picked, the backend already had a
> limit [D71142]. To fix this issue the frontend and backend should have
> the same limit. To avoid duplicating the value it should be in a header
> available to the frontend and backend.

FWIW, we don’t generally refer to IRGen as the “backend”.

In many cases, the right code change will probably be to introduce a
static assertion linking the implementation limit to some value in code,
rather than using the limit directly.  For example, many limits will
be absolute numbers, and it is probably better to
`static_assert(IQ_MaxWidgets < (1ULL << SomeBitFieldWidth))` than to
try to make that bit-field have the exact right width.  This is also
a pattern that works even if the limit is stored in a normal field
of type (say) `unsigned`; such places should also have a `static_assert`
in order to remind the reader/maintainer that there’s an implementation
limit affected.
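
A minimal sketch of that pattern, to make it concrete. Every name here is
hypothetical (`IQ::MaxWidgets`, `IQ::MaxGadgetDepth`, `NumWidgetsBits`, the
two structs), and the values are made up; in the proposal the limits would
come from the generated header rather than being written by hand.

```cpp
#include <cstdint>
#include <limits>

namespace IQ {
// Hypothetical limits with illustrative values, standing in for constants
// that TableGen would generate.
constexpr std::uint64_t MaxWidgets = 4095;
constexpr std::uint64_t MaxGadgetDepth = 256;
} // namespace IQ

// Hypothetical storage. The bit-field is wider than the limit strictly
// needs; its width is a layout decision, not the limit itself.
constexpr unsigned NumWidgetsBits = 16;

struct WidgetInfo {
  unsigned NumWidgets : NumWidgetsBits;
  unsigned Flags : 16;
};

// Ties the documented limit to the storage without forcing an exact width.
static_assert(IQ::MaxWidgets < (1ULL << NumWidgetsBits),
              "NumWidgets bit-field is too narrow for IQ::MaxWidgets");

// The same reminder for a limit stored in a plain `unsigned` field.
struct GadgetInfo {
  unsigned Depth;
};

static_assert(IQ::MaxGadgetDepth <= std::numeric_limits<unsigned>::max(),
              "Depth cannot represent IQ::MaxGadgetDepth");

// The check that would guard a diagnostic compares against the limit,
// not against whatever the storage happens to hold.
constexpr bool exceedsWidgetLimit(std::uint64_t Count) {
  return Count > IQ::MaxWidgets;
}
```

If someone later narrows the bit-field or raises the limit past what the
storage can hold, the build breaks at the `static_assert` instead of the
value silently truncating.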

John.

> Since the values are often stored in a bit-field there is no standard way
> to get the maximum value. This means the limit needs a small helper
> function to get the value, for example [D63975] uses
> `getMaxFunctionScopeDepth()`.
>
>
> Standard conformance
> --------------------
>
> This is rather simple. Annex B of the C++ standard states:
>    Because computers are finite, C++ implementations are inevitably
>    limited in the size of the programs they can successfully process.
>    Every implementation shall document those limitations where known.
>
> Currently this documentation is not available.
>
>
> The proposed solution
> =====================
>
> In order to solve this issue I created a proof-of-concept solution
> [D72053]. It contains a new file
> clang/include/clang/Basic/ImplementationQuantities.td with two TableGen
> generators which create:
> * An inc file with a set of constants containing the limits. This file
>   $BUILD_DIR/tools/clang/include/clang/Basic/ImplementationQuantities.inc
>   is included in the header
>   clang/include/clang/Basic/ImplementationQuantities.h.
> * The document clang/docs/ImplementationQuantities.rst
>   This document documents the limits of Clang. These limits include, but
>   are not limited to the quantities in [implimits].
>
> The quantity limit has the following possible types:
> * The quantity is limited by the number of bits in a bit-field. TableGen
>   generates two constants:
>   * FieldNameBits, the number of bits in the bit-field.
>   * MaxFieldName, the maximum value of the bit-field. (This assumes all
>     bit-fields can be stored in an unsigned.)
> * The quantity is limited by a value. TableGen generates one constant:
>   * MaxFieldName, the maximum value of the field.
> * The quantity's limit is determined by a compiler flag. TableGen
>   generates one constant:
>   * FieldNameDefault, the default value of the compiler flag.
> * The quantity's limit cannot be expressed in a number. For example it
>   depends on the stack size or available memory of the system. In this
>   case TableGen generates no constant, only documentation.
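
For illustration, the four kinds of quantity listed above might come out of
the generator looking roughly like this. This is a sketch under assumptions,
not the actual output of [D72053]: the quantity names, the values, and the
`maxValueForBits` helper are all hypothetical. Only the `...Bits`, `Max...`
and `...Default` naming scheme and the `IQ` namespace are taken from the
proposal.

```cpp
#include <limits>

namespace IQ {
// Hypothetical helper: largest value an unsigned bit-field of the given
// width can hold. The guard avoids shifting by the full width of `unsigned`.
constexpr unsigned maxValueForBits(unsigned Bits) {
  return Bits >= std::numeric_limits<unsigned>::digits
             ? std::numeric_limits<unsigned>::max()
             : (1u << Bits) - 1u;
}

// Kind 1: limited by the width of a bit-field (two constants).
constexpr unsigned NestingDepthBits = 7;
constexpr unsigned MaxNestingDepth = maxValueForBits(NestingDepthBits);

// Kind 2: limited by a plain value (one constant).
constexpr unsigned MaxWidgets = 4095;

// Kind 3: limited by a compiler flag (one constant, the flag's default).
constexpr unsigned RecursionDepthDefault = 512;

// Kind 4: not expressible as a number (depends on stack size or available
// memory), so no constant is emitted, only documentation.
} // namespace IQ

static_assert(IQ::MaxNestingDepth == 127, "7 bits hold values 0 through 127");
```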
>
> For all types documentation is generated. The documentation shows the
> description of the limit and the limit implemented in Clang. If the
> recommended value is not 0 it is a value described in the C++ Standard. In
> this case the recommended value is shown in the documentation.
>
>
> Questions
> =========
>
> * If this proposal is accepted, how do we make sure the document is updated
>   before releasing a new version of clang?
> * The compiler flag limits are also documented in the UserManual. This means
>   these limits are still duplicated. Do we want to let the UserManual also be
>   generated and use the values here or just keep the duplication?
>
>
> Bikeshed
> ========
>
> I picked IQ for the namespace of the quantities since it's short and not
> often used in the codebase so it's easily greppable.
>
>
> [implimits] http://eel.is/c++draft/implimits
> [D63975] https://reviews.llvm.org/D63975
> [D71142] https://reviews.llvm.org/D71142
> [D72053] https://reviews.llvm.org/D72053
>
>
> Kind regards,
> Mark de Wever