[cfe-dev] __PRETTY_FUNCTION__ output loses enum type info for non-enumerated values

will wray via cfe-dev cfe-dev at lists.llvm.org
Mon Sep 10 09:46:45 PDT 2018


Thanks Richard.

Ok; ignore Question 2 - you are absolutely right (what I was thinking).
Printing different labels for the same type defeats the point of this
request.

OTOH, Question 1 is a request for Clang to print different output for
different types.
The linked bug report is an example of where it is important to distinguish
types.

Here's a summary of suggested output vs current output for Clang, GCC, MSVC.
https://godbolt.org/z/QEae7F

Plain, unscoped enum
   enum e { a, b, c=b };
   auto_name<a, e(1), c, e(3)>

  Suggested: a, b, b, (e)3
      Clang: a, b, b, 3
        GCC: (e)0, (e)1, (e)1, (e)3
       MSCV: 0, 1, 1, 3

Scoped / opaque enum:
   enum class E { a, b, c=b };
   auto_name<E::a, E{1}, E::c, E{3}>

  Suggested:  E::a, E::b, E::b, (E)3
      Clang:  E::a, E::b, E::b, 3
        GCC:  (E)0, (E)1, (E)1, (E)3
       MSCV: 0, 1, 1, 3

None of these are ideal -
- Clang loses type info for non-enumerated values.
- GCC does not distinguish enumerated / non-enumerated.
- MSVC loses type all info, does not distinguish.

Interestingly, it turns out that GCC's behaviour is a bug!
(The *intent* of gcc code is to print the suggested output.
 I intend to submit a bug report and a patch.)

You are concerned that this suggested change would

> make the template argument string more verbose without (necessarily)
> providing any more information.


Yet the information is necessary in order to distinguish different types.
This change is needed for correctness. Verbosity is a secondary concern.
Clang's current behaviour here could be treated as a bug
(it has been described as "seriously suboptimal" by a gcc dev).

For Clang, the increased verbosity is only for non-enumerated values.
Enum-templates would be expected to mostly use enumerated values.
Also, verbosity becomes more similar for enumerated / non-enumerated
the number of characters in the identifier label vs the number of digits.

I'm now preferring C-style cast notation over C++ function-style
as it simplifies output and parsing code.

Please reflect on this and consider - is this suggested change acceptable?


On Tue, Sep 4, 2018 at 4:30 PM Richard Smith <richard at metafoo.co.uk> wrote:

> On Tue, 4 Sep 2018 at 09:09, will wray via cfe-dev <cfe-dev at lists.llvm.org>
> wrote:
>
>> A request to change Clang's __PRETTY_FUNCTION__ output to fix an issue.
>>
>> The code used in pretty function is also used to generate debug info in
>> DWARF etc
>> .
>> (I believe). See, for instance the comment on this bug report:
>> https://sourceware.org/bugzilla/show_bug.cgi?id=21492#c1
>>
>>> Clang shows these arguments with the same DW_AT_name, X<1>, but they're
>>> distinct types
>>
>>
>> https://godbolt.org/z/0Ik0Hl
>> Pretty function output, for auto... args
>> (variadic just so that outputs display together):
>>
>> template <auto... V>
>> constexpr char const* auto_name()
>> {
>> return __PRETTY_FUNCTION__;
>> }
>>
>> enum e { a, b, c=b };
>> enum class E { a, b, c=b };
>>
>> char const* nameU() { return auto_name<a, b, c, e(3)>(); }
>> char const* nameS() { return auto_name<E::a, E::b, E::c, E{3}>(); }
>>
>>
>> Gives output with these strings embedded;
>>
>> nameU:
>> <a, b, b, 3>
>>
>> nameS:
>> <E::a, E::b, E::b, 3>
>>
>>
>> The first two enumerated values, a & b, are returned with no loss of type
>> info as a, b.
>> The next enumerated value, c, is a duplicate label for b's value, and is
>> reported as b.
>> The final value is not enumerated - there is no enum label with value 3 -
>> it is reported with an unnecessary loss of type info
>> as plain digits 3.
>>
>> The non-enumerated value should be reported with a cast or constructor:
>>
>> unscoped e: 3 -> (e)3 or e(3)
>>
>>   scoped   E: 3 -> (E)3 or E(3) or E{3}
>>
>>  (init-list E{3} only for scoped enum and only since c++17)
>>
>>
>> Question 1:
>>
>> Will it be acceptable to change the output for non-enumerated values as
>> so?
>>
>
> In the case where the template parameter involves a deduced type (`auto`
> or `decltype(auto)` or eventually a deduced template specialization type),
> it would make a lot of sense to include the actual type in the printed
> template argument if it differs from the type of the formatted argument
> string. But I think we should leave out the type in all other cases, as it
> would make the template argument string more verbose without (necessarily)
> providing any more information.
>
> This will retain type info and allow round-trip without loss.
>> Either C-style cast (e)3  or  function-cast / type-constructor e(3) -
>> seem ok
>> (gcc uses C-style casts and Clang uses this same style in source labels).
>>
>> For the duplicate-valued labels case there is no loss of type info,
>> but there is a loss of information about the label used for the auto arg.
>> Presumably, the code works from the supplied underlying-type value
>> and picks the first label with the given value. I'd guess that Clang knows
>> the label used in a direct instantiation and could return it in that case.
>> (c.f. named member-pointer args have to retain the id for reporting).
>>
>
> In the case of duplicated enumerators, there simply is no distinction
> between 'b' and 'c' -- they are both merely names for the value 1 of type
> 'e'. auto_name<b>() and auto_name<c>() are the same function and therefore
> must return the same string.
>
>
>> Question 2:
>>
>> Is it possible / acceptable to retain the label in the duplicate enum
>> value case?
>>
>
> No, per the above, we cannot and must not do that.
>
> Can anyone point me to the relevant code?
>>
>
> This is printIntegral, at the top of lib/AST/TemplateBase.cpp. The "else"
> case at the bottom would need to print out additional text representing the
> cast when needed.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20180910/e7b5cb95/attachment.html>


More information about the cfe-dev mailing list