[cfe-users] [cfe-dev] Removing or obfuscating RTTI type name strings

David Blaikie via cfe-users cfe-users at lists.llvm.org
Fri Sep 3 16:18:33 PDT 2021


On Fri, Sep 3, 2021 at 2:55 PM Richard Smith via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> On Fri, 27 Aug 2021 at 11:03, Andy Gibbs via cfe-users <
> cfe-users at lists.llvm.org> wrote:
>
>> Hi there,
>>
>> I'm hitting a rather difficult problem.  I have to compile with RTTI data
>> structures generated because, even though I am not using dynamic_cast or
>> typeid in my application code, I am linking and using a library that does
>> use dynamic_cast.  Therefore my code will crash if I compile with -fno-rtti.
>>
>> The problem is, then, that the size of my code is greatly increased, and
>> also (which is more important) critical information is being leaked into
>> the resulting application binary by the RTTI type name string that is
>> generated.  Unlike normal symbols, these cannot be stripped from the
>> executable.
>>
>> Therefore I would like to make a change to the clang compiler to either
>> replace all the type name strings with a single "?" string (this would be
>> best) or do something like a rot-x encryption on the complete string (I
>> would rather not do this since these strings are literally hundreds of
>> characters long given that the types are complex template types).
>>
>> My suggestion would be that I would attempt to add a -fno-rtti-names
>> parameter.  If this is of interest to the general clang community I would
>> be happy to submit a patch for consideration, but at the very least I need
>> something for my own purposes.
>>
>> This brings me to my request.  I would be very grateful if someone here
>> might be able to direct me into the right place for making such a change.
>>
>> Looking at the source code there is an
>> ItaniumRTTIBuilder::BuildTypeInfo(...) function in
>> CodeGen/ItaniumCXXABI.cpp (see
>> https://github.com/llvm/llvm-project/blob/fe177a1773e4f88dde1aa37d34a0d3f8cb582f14/clang/lib/CodeGen/ItaniumCXXABI.cpp#L3730).
>> In there, the first thing it does is lay down a field for the mangled type
>> name.  My guess is that it should be possible to substitute the line
>>
>>     llvm::GlobalVariable *TypeName = GetAddrOfTypeName(Ty, Linkage);
>>
>> with something that generates a static string "?" and returns the address
>> of that.  Then it will build the table pointing at this string, I am
>> guessing.
>>
>> Is this a feasible approach or will this break loads of things elsewhere
>> in the compiler and/or c++ runtime?  I am not interested in a run-time
>> ability to get the mangled (or otherwise) name of the class, so if
>> replacing this string has no effect on, for example, the correct
>> functioning of dynamic_cast or typeid and only means that std::type_info::name
>> returns a bogus value, then I'm happy with that.
>>
>
> The address and (sometimes) contents of the _ZTS type_info name are used
> for type_info equality comparisons. The implementation of dynamic_cast
> internally uses type_info comparisons to find the destination type within
> the source type's type_info tree. So changing the contents of the string to
> be non-unique may lead to problems, especially if the ABI rule in question
> results in the use of strcmp. You could perhaps instead consider replacing
> the string contents with something like a hash of the mangled name of the
> type (though be aware that the ABI library will want to interpret it as a
> nul-terminated byte string).
>
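
To make the two comparison rules above concrete, here is a minimal sketch in
C++. It is not taken from any ABI library: FakeTypeInfo, equal_by_address,
equal_by_contents and collapsed_names_collide are hypothetical names used
only for illustration, and real runtimes choose between the two strategies
according to platform rules.

```cpp
#include <cstring>

// Hypothetical stand-in for a type_info object. In a real Itanium-ABI
// implementation the name pointer refers to the _ZTS string of the type.
struct FakeTypeInfo {
    const char *name;
};

// Strategy 1: each type's name string is assumed unique in the program,
// so comparing the addresses of the strings is sufficient.
inline bool equal_by_address(const FakeTypeInfo &a, const FakeTypeInfo &b) {
    return a.name == b.name;
}

// Strategy 2: the same type may have several copies of its name string
// (for example in different shared objects), so the contents are compared.
inline bool equal_by_contents(const FakeTypeInfo &a, const FakeTypeInfo &b) {
    return a.name == b.name || std::strcmp(a.name, b.name) == 0;
}

// Two unrelated types whose names were both replaced by "?".
// Returns true when the contents-based rule wrongly treats them as equal.
inline bool collapsed_names_collide() {
    static const char name_of_first[] = "?";
    static const char name_of_second[] = "?";
    FakeTypeInfo first{name_of_first};
    FakeTypeInfo second{name_of_second};
    return !equal_by_address(first, second) && equal_by_contents(first, second);
}
```

Under these assumptions, collapsing every name to "?" keeps the address
rule working but makes every pair of types equal under the contents rule,
which is the failure mode described above.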

Hashing the name (while preserving nul-termination, as you say, e.g. by
base64-encoding the hash) might overlap with an idea that gets thrown
around from time to time: reducing symbol name length generally, both in
the DWARF and in the ELF symbol tables. The latter would mostly or only
apply if you're OK with an ABI break or a floating ABI (i.e. you build all
your C++ code with this exact compiler, with no prebuilt libraries, etc.).
So it's not /exactly/ the same thing, but there might be enough overlap to
benefit from some common machinery or family of options. I hadn't actually
thought about the RTTI side of things. I should look at it in more detail;
it may be another source of redundant names, and another potential size
benefit of this overall direction.
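
As a rough sketch of the hash-and-encode idea, assuming a 64-bit FNV-1a
hash and a 6-bits-per-character encoding (neither is what clang does or is
proposed to do; fnv1a64 and obfuscated_type_name are hypothetical names):

```cpp
#include <cstdint>
#include <string>

// 64-bit FNV-1a. Chosen here only because it is short; it is not a
// cryptographic hash, and a real option would likely want a wider hash
// to make collisions between distinct type names implausible.
inline std::uint64_t fnv1a64(const std::string &s) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ULL;                  // FNV prime
    }
    return h;
}

// Encode the hash using 64 printable characters, 6 bits per character.
// This is base64-like but not standard base64. The result is always
// 11 characters and never contains an embedded nul byte, so it can be
// stored as an ordinary nul-terminated string.
inline std::string obfuscated_type_name(const std::string &mangled) {
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_$";
    std::uint64_t h = fnv1a64(mangled);
    std::string out;
    for (int i = 0; i < 11; ++i) {  // 11 * 6 = 66 bits covers all 64
        out.push_back(alphabet[h & 63]);
        h >>= 6;
    }
    return out;
}
```

For example, obfuscated_type_name("3Foo") yields a fixed 11-character
string regardless of how long the mangled name is, which addresses both
the size and the information-leak concerns, provided that every
translation unit in the program uses the same scheme.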

- Dave