[cfe-dev] __PRETTY_FUNCTION__ output loses enum type info for non-enumerated values

Richard Smith via cfe-dev cfe-dev at lists.llvm.org
Tue Sep 11 16:55:56 PDT 2018


On Mon, 10 Sep 2018 at 09:47, will wray via cfe-dev <cfe-dev at lists.llvm.org>
wrote:

> Thanks Richard.
>
> Ok; ignore Question 2 - you are absolutely right (what I was thinking).
> Printing different labels for the same type defeats the point of this
> request.
>
> OTOH, Question 1 is a request for Clang to print different output for
> different types.
>

I think you might be under the impression that I said that we shouldn't
address this, which was very much not my intent. What I tried to say was
that we *should* emit a cast in the case of a deduced template parameter
type, when the type of the argument expression we'd otherwise print is
different from that of the parameter. That should be sufficient to ensure
that Clang prints different outputs for different types. Note that
special-casing enums is the wrong way to fix that problem, as the problem
is not specific to enums. Consider, for example:

template<auto> struct X {};
X<(short)0> xa;
X<(long)0> xb;

The types of xa and xb both pretty-print as X<0> currently.

There are (at least) two natural ways to fix this: by always printing a
cast when the type of the argument isn't the same as that of the parameter
(which is the unnecessary verbosity that I was concerned with, and is what
we'd get if we generalized the proposed approach for enumerations to
address the whole problem), or by doing something smarter that depends on
whether the template parameter was deduced (which is what I'm suggesting is
the right way to address this).

The linked bug report is an example of where it is important to distinguish
> types.
>
Here's a summary of suggested output vs current output for Clang, GCC, MSVC.
> https://godbolt.org/z/QEae7F
>
> Plain, unscoped enum
>    enum e { a, b, c=b };
>    auto_name<a, e(1), c, e(3)>
>
>   Suggested: a, b, b, (e)3
>       Clang: a, b, b, 3
>         GCC: (e)0, (e)1, (e)1, (e)3
>        MSCV: 0, 1, 1, 3
>
> Scoped / opaque enum:
>    enum class E { a, b, c=b };
>    auto_name<E::a, E{1}, E::c, E{3}>
>
>   Suggested:  E::a, E::b, E::b, (E)3
>       Clang:  E::a, E::b, E::b, 3
>         GCC:  (E)0, (E)1, (E)1, (E)3
>        MSCV: 0, 1, 1, 3
>
> None of these are ideal -
> - Clang loses type info for non-enumerated values.
> - GCC does not distinguish enumerated / non-enumerated.
> - MSVC loses type all info, does not distinguish.
>
> Interestingly, it turns out that GCC's behaviour is a bug!
> (The *intent* of gcc code is to print the suggested output.
>  I intend to submit a bug report and a patch.)
>
> You are concerned that this suggested change would
>
>> make the template argument string more verbose without (necessarily)
>> providing any more information.
>
>
> Yet the information is necessary in order to distinguish different types.
> This change is needed for correctness.
>

For what it's worth, I don't see a correctness concern here. Anything
assuming our pretty-printed type names are unique is wrong. The bug you
referenced is a bug caused by a tool getting this wrong and using the
pretty type name string inappropriately, and the contributors on that bug
report seem happy to acknowledge that.

That said, as a quality of implementation matter, we should try to ensure
our pretty-printed type names are unique anyway wherever possible.

Verbosity is a secondary concern.
> Clang's current behaviour here could be treated as a bug
> (it has been described as "seriously suboptimal" by a gcc dev).
>
> For Clang, the increased verbosity is only for non-enumerated values.
> Enum-templates would be expected to mostly use enumerated values.
> Also, verbosity becomes more similar for enumerated / non-enumerated
> the number of characters in the identifier label vs the number of digits.
>
> I'm now preferring C-style cast notation over C++ function-style
> as it simplifies output and parsing code.
>
> Please reflect on this and consider - is this suggested change acceptable?
>

As noted above, the problem you described, while valid, does not justify
the solution you're suggesting. However, for enums specifically, I think
there's another argument to consider (one that I think you've not made
yet): passing a plain integer as a template argument to an enumeration
template parameter won't compile, so our pretty-printed type is not valid
C++. It's reasonable to consider how valuable it is to fix that, compared
with the value of having a shorter pretty-printed type name, but as you
point out, we're already paying the cost of including the enum type name in
the template argument in the case where an scoped enumeration is named, and
an integer cast to enum type would be of comparable length. So I think the
change you suggested is a good idea for that reason.

On Tue, Sep 4, 2018 at 4:30 PM Richard Smith <richard at metafoo.co.uk> wrote:
>
>> On Tue, 4 Sep 2018 at 09:09, will wray via cfe-dev <
>> cfe-dev at lists.llvm.org> wrote:
>>
>>> A request to change Clang's __PRETTY_FUNCTION__ output to fix an issue.
>>>
>>> The code used in pretty function is also used to generate debug info in
>>> DWARF etc
>>> .
>>> (I believe). See, for instance the comment on this bug report:
>>> https://sourceware.org/bugzilla/show_bug.cgi?id=21492#c1
>>>
>>>> Clang shows these arguments with the same DW_AT_name, X<1>, but they're
>>>> distinct types
>>>
>>>
>>> https://godbolt.org/z/0Ik0Hl
>>> Pretty function output, for auto... args
>>> (variadic just so that outputs display together):
>>>
>>> template <auto... V>
>>> constexpr char const* auto_name()
>>> {
>>> return __PRETTY_FUNCTION__;
>>> }
>>>
>>> enum e { a, b, c=b };
>>> enum class E { a, b, c=b };
>>>
>>> char const* nameU() { return auto_name<a, b, c, e(3)>(); }
>>> char const* nameS() { return auto_name<E::a, E::b, E::c, E{3}>(); }
>>>
>>>
>>> Gives output with these strings embedded;
>>>
>>> nameU:
>>> <a, b, b, 3>
>>>
>>> nameS:
>>> <E::a, E::b, E::b, 3>
>>>
>>>
>>> The first two enumerated values, a & b, are returned with no loss of
>>> type info as a, b.
>>> The next enumerated value, c, is a duplicate label for b's value, and is
>>> reported as b.
>>> The final value is not enumerated - there is no enum label with value 3 -
>>> it is reported with an unnecessary loss of type info
>>> as plain digits 3.
>>>
>>> The non-enumerated value should be reported with a cast or constructor:
>>>
>>> unscoped e: 3 -> (e)3 or e(3)
>>>
>>>   scoped   E: 3 -> (E)3 or E(3) or E{3}
>>>
>>>  (init-list E{3} only for scoped enum and only since c++17)
>>>
>>>
>>> Question 1:
>>>
>>> Will it be acceptable to change the output for non-enumerated values as
>>> so?
>>>
>>
>> In the case where the template parameter involves a deduced type (`auto`
>> or `decltype(auto)` or eventually a deduced template specialization type),
>> it would make a lot of sense to include the actual type in the printed
>> template argument if it differs from the type of the formatted argument
>> string. But I think we should leave out the type in all other cases, as it
>> would make the template argument string more verbose without (necessarily)
>> providing any more information.
>>
>> This will retain type info and allow round-trip without loss.
>>> Either C-style cast (e)3  or  function-cast / type-constructor e(3) -
>>> seem ok
>>> (gcc uses C-style casts and Clang uses this same style in source labels).
>>>
>>> For the duplicate-valued labels case there is no loss of type info,
>>> but there is a loss of information about the label used for the auto arg.
>>> Presumably, the code works from the supplied underlying-type value
>>> and picks the first label with the given value. I'd guess that Clang
>>> knows
>>> the label used in a direct instantiation and could return it in that
>>> case.
>>> (c.f. named member-pointer args have to retain the id for reporting).
>>>
>>
>> In the case of duplicated enumerators, there simply is no distinction
>> between 'b' and 'c' -- they are both merely names for the value 1 of type
>> 'e'. auto_name<b>() and auto_name<c>() are the same function and therefore
>> must return the same string.
>>
>>
>>> Question 2:
>>>
>>> Is it possible / acceptable to retain the label in the duplicate enum
>>> value case?
>>>
>>
>> No, per the above, we cannot and must not do that.
>>
>> Can anyone point me to the relevant code?
>>>
>>
>> This is printIntegral, at the top of lib/AST/TemplateBase.cpp. The "else"
>> case at the bottom would need to print out additional text representing the
>> cast when needed.
>>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20180911/2ee55047/attachment.html>


More information about the cfe-dev mailing list