[cfe-dev] [Proposal]: Fast non-portable extension to retrieve a type name from a pointer as a GV: __builtin_type_name(T* Ptr, [int Style])

Ben Craig via cfe-dev cfe-dev at lists.llvm.org
Thu Sep 6 10:46:34 PDT 2018


What happens if you…

  1.  typeid(ClassWithExternalVtable) from a -fno-rtti TU, and the vtable is in a -frtti TU?
  2.  typeid in the -frtti TU, vtable in the -fno-rtti TU?

Wild guesses…

  1.  This will cause an extra, weak-linkage type_info to be created, but the linker or loader will use the one from the -frtti TU.  Object file size suffers a little bit, but people are generally happy.
  2.  I’m expecting an error at link or load time here, because the -frtti TU is expecting someone else to provide the type_info, but no one does.

From: James Y Knight <jyknight at google.com>
Sent: Thursday, September 6, 2018 10:49 AM
To: Ben Craig <ben.craig at ni.com>
Cc: Richard Smith <richard at metafoo.co.uk>; kristina at nym.hush.com; Clang Dev <cfe-dev at lists.llvm.org>
Subject: Re: [cfe-dev] [Proposal]: Fast non-portable extension to retrieve a type name from a pointer as a GV: __builtin_type_name(T* Ptr, [int Style])

On Wed, Sep 5, 2018 at 10:32 PM Ben Craig <ben.craig at ni.com> wrote:
Which TU(s) will the type_info object go in, and what will the linkage of those objects be?  If you call typeid(SomeClass), and all of SomeClass’s functions are out-of-lined in another TU, will you generate a type_info object of weak_linkage in the current TU?  Or will you expect the TU that has SomeClass’s functions in it to define the type_info object?

I think if you want things to work, then you will need to generate a weak linkage type_info in each TU that calls typeid(SomeClass).  I’m not sure if that’s ok or not though.

Yes, the last. It is what we do already -- only in special cases can we assume that there is typeinfo available externally (certain standard types, and when RTTI is enabled, types with an externally available vtable). In other cases, the typeinfo is already emitted where required, with weak linkage.


From: Richard Smith <richard at metafoo.co.uk>
Sent: Tuesday, September 4, 2018 5:49 PM
To: Ben Craig <ben.craig at ni.com>
Cc: James Y Knight <jyknight at google.com>; kristina at nym.hush.com; Clang Dev <cfe-dev at lists.llvm.org>
Subject: Re: [cfe-dev] [Proposal]: Fast non-portable extension to retrieve a type name from a pointer as a GV: __builtin_type_name(T* Ptr, [int Style])

On Tue, 4 Sep 2018 at 15:02, Ben Craig via cfe-dev <cfe-dev at lists.llvm.org> wrote:
If this proves useful, I could amend my in-flight paper ( http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1105r0.html#rtti ) to allow for typeid(type) in freestanding, while still allowing typeid(variable) to be ill-formed.  I would need to carefully evaluate how much of <typeinfo> and <typeindex> would make sense.

If you go down this path, it would make more sense to only disallow the dynamic form of typeid(expr) (that is, when the operand is a glvalue of polymorphic class type). There's also implementation experience of that: that's what MSVC's /GR- flag does.
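
To make the static/dynamic distinction concrete, here is a minimal sketch. The types `Base`, `Derived` and `Plain` and the three helper functions are hypothetical names invented for illustration; only the last function uses the dynamic form, which is the one that would be disallowed under this suggestion.

```cpp
#include <typeinfo>

// Hypothetical types, for illustration only.
struct Base { virtual ~Base() = default; };  // polymorphic class
struct Derived : Base {};
struct Plain {};                             // non-polymorphic class

// Static form: the operand is a type, so the result is known at compile time.
bool static_form_on_type() {
  return typeid(Base) != typeid(Derived);
}

// Static form: the operand is a glvalue of non-polymorphic type,
// so the static type of the expression is used.
bool static_form_on_plain_object() {
  Plain p;
  return typeid(p) == typeid(Plain);
}

// Dynamic form: the operand is a glvalue of polymorphic class type,
// so the most-derived type is looked up at run time.
bool dynamic_form_on_polymorphic_object() {
  Derived d;
  Base& b = d;
  return typeid(b) == typeid(Derived);
}
```

With RTTI enabled, all three functions return true; only the third one needs the run-time lookup.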

From: cfe-dev <cfe-dev-bounces at lists.llvm.org> On Behalf Of James Y Knight via cfe-dev
Sent: Tuesday, September 4, 2018 4:42 PM
To: kristina at nym.hush.com
Cc: Clang Dev <cfe-dev at lists.llvm.org>
Subject: Re: [cfe-dev] [Proposal]: Fast non-portable extension to retrieve a type name from a pointer as a GV: __builtin_type_name(T* Ptr, [int Style])

As discussed on IRC, ISTM this would be better spelled as:
  typeid(<typename>).name();

The issue with that at the moment is that typeid() is an error if you build with -fno-rtti. However, there appears to be no reason why we cannot support `typeid(<typename>)` even with -fno-rtti. Unlike `typeid(<variable>)`, it requires no extra data to be emitted, since there's no possibility that dynamic dispatch is required. Therefore, similarly to how exception support still functions with -fno-rtti by emitting the explicitly required typeinfo data on demand, so too can typeid(<typename>).
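
As a rough sketch of that spelling, assuming a build where `typeid` is accepted (today that means RTTI enabled), a small wrapper could look like the following. `static_type_name`, `Widget` and `names_are_distinct` are hypothetical names used only for this example. The string returned by `name()` is implementation-defined (a mangled name on Itanium-ABI targets), so the sketch makes no claim about its exact contents.

```cpp
#include <cstring>
#include <typeinfo>

// Hypothetical helper: the operand of typeid is a type, never an
// expression, so no dynamic dispatch can ever be involved.
template <typename T>
const char* static_type_name() {
  return typeid(T).name();
}

// Hypothetical user type, for illustration only.
struct Widget {};

// Different types are expected to produce different name strings.
bool names_are_distinct() {
  return std::strcmp(static_type_name<Widget>(),
                     static_type_name<int>()) != 0;
}
```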

I note also that when .name() is the only value used, the remainder of the typeinfo data is already omitted from the output when compiling with optimizations.

It looks like supporting this would be only a few-line change in clang, since all the underlying infrastructure is there already to support the EH with -fno-rtti use-case.

On Mon, Sep 3, 2018 at 10:41 AM via cfe-dev <cfe-dev at lists.llvm.org> wrote:
The concept itself is pretty trivial: the builtin takes one or two arguments, the first of
which has to be a pointer to an arbitrary type. This makes it particularly useful with C++
auto declarations, or even for printing a type by simply casting a null pointer to
the type in question. Its qualifiers are retained for the sake of printing them
depending on the requested style.

Implementation
^^^^^^^^^^^^^^

const char* TypeName =  __builtin_type_name(T* Ptr, [int Style])

After validating the one- or two-argument form (i.e. the pointed-to type, and whether it's a
record declaration), SemaChecking sets the return type with TheCall->setType(
Context.getPointerType(Context.CharTy.withConst())), leaving the rest for Clang's CodeGen
to deal with.

The second argument controls the output format in the form of a bitmask
passed down to PrintingPolicy (as I port this I will decouple the values so the
builtin's behavior isn't dependent on further changes in PrintingPolicy). At
that point the type is retrieved using `getAsString` and stored in the CGM
with `GetAddrOfConstantCString`, allowing coalescing of those strings later
during linking, as it's cast to an Int8Ptr.

This is all done in Clang without needing to add anything to the LLVM core.

Things left to do before submitting for code review
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* There is no test coverage, so I need to write up a test file, comprehensively
  testing cases I haven't considered before like Objective-C pointers, any
  pointer to a special Clang type that may behave unexpectedly, and complex
  C++ templates (or simple ones, which were my main use of it). The target is IR
  since Clang fully lowers it, so in theory it's platform agnostic provided there
  is enough space (the original use was for Embedded C++ with no RTTI).
* While this is out of scope, a GUID buffer as a style would provide a form of
  type IDs in the absence of RTTI, or alternatively smaller types like u32 or u16
  (aimed at single-module builds where they can remain unique, i.e. embedded
  systems with a kernel using those to represent kernel object types).

Rationale
^^^^^^^^^
It's clear that this functionality is desired outside of the embedded domain,
typically with hacks involving a lambda and __PRETTY_FUNCTION__, one case
being in `include/llvm/Support/TypeName.h` in LLVMSupport. Many non-portable
hacks that depend on compiler versions exist. This doesn't aim to be portable,
just to be compact, not have a runtime cost, and provide this information
either for debugging or even for more practical reasons.
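
For readers who have not seen this kind of hack, a minimal sketch follows. It is not the LLVMSupport code, and it uses a function template rather than a lambda. `raw_signature` and `pretty_type_name` are hypothetical names, and the parsing assumes the `T = ...` spelling that Clang and GCC currently put in `__PRETTY_FUNCTION__`, which is exactly the kind of compiler-version dependence described above.

```cpp
#include <string>

// Hypothetical helper: returns the compiler's spelling of this function's
// signature, which on Clang and GCC includes the substitution "T = <type>".
template <typename T>
const char* raw_signature() {
#if defined(__clang__) || defined(__GNUC__)
  return __PRETTY_FUNCTION__;
#else
  return "";  // Other compilers are not handled in this sketch.
#endif
}

// Hypothetical helper: cuts the type name out of the signature string.
// Assumption: the name ends at the first ';' or ']' after "T = ", so
// types whose spelling contains ']' (e.g. arrays) are truncated.
template <typename T>
std::string pretty_type_name() {
  const std::string sig = raw_signature<T>();
  const std::string key = "T = ";
  const std::string::size_type start = sig.find(key);
  if (start == std::string::npos) return std::string();
  const std::string::size_type begin = start + key.size();
  const std::string::size_type end = sig.find_first_of(";]", begin);
  return sig.substr(begin,
                    end == std::string::npos ? end : end - begin);
}
```

Unlike a builtin that emits a constant string, this version parses the signature and builds a `std::string` at run time on every call.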

I wanted to find out if there was interest in this kind of thing, since
I have developed a variety of useful and not so useful extensions,
originally for embedded use, that I want to upstream for general use.
I want to know what the consensus is on 1) this particular extension and
2) further extensions that were originally developed for embedded use
but have turned out to be useful in a lot of contexts where you are willing
to sacrifice portability (now with `__has_builtin` this is extremely easy to
test for and fall back on something else).
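
A sketch of that test-and-fall-back pattern, assuming the one-argument form of the proposed builtin: `__builtin_type_name` is only a proposal in this thread and is not available in any released compiler that I know of, so the fallback branch is the one that actually compiles today. `type_name_or_fallback` is a hypothetical name, and the fallback assumes RTTI is enabled.

```cpp
#include <typeinfo>

// Compilers without __has_builtin get a definition that always says "no".
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical wrapper around the proposed builtin.
template <typename T>
const char* type_name_or_fallback() {
#if __has_builtin(__builtin_type_name)
  // Proposed extension: pass a null pointer cast to the type in question.
  return __builtin_type_name(static_cast<T*>(nullptr));
#else
  // Fallback (assumption: RTTI is enabled in this translation unit).
  return typeid(T).name();
#endif
}
```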

On the other hand it's a lot to do overall, so I would prefer to get a consensus
on whether each feature (even this small one) is worth cleaning up and putting
up for code review, since I understand that something like builtin bloat
or non-portability may be of concern. As for the formal name, I'd like to call
it "extensions for Embedded C++ 2", gating it behind an opt-in flag such as
`-fecxx2-extensions`.

Other things involve limited reflection retrieving the bare minimum that's
needed, an llvm::formatv-style formatter accelerator, and getting names of
record scope at different levels, with 0 being similar to the __CLASSNAME__
macro (desired at least by some).

Looking forward to any feedback whether positive or negative.
Thank you for your time.
- Kristina

_______________________________________________
cfe-dev mailing list
cfe-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev