[LLVMdev] Offset to C++ structure members

Eli Friedman eli.friedman at gmail.com
Mon Oct 1 21:58:01 PDT 2012


On Mon, Oct 1, 2012 at 9:33 PM, Paul J. Lucas <paul at lucasmail.org> wrote:
> Given the C++ struct:
>
>         struct S {
>           string a;
>           string b;
>         };
>
> I also have C "thunk" functions that I call from LLVM code:
>
>         // calls S::S()
>         void* T_S_M_new( void *heap );
>
>         // calls string::assign(char const*)
>         void T_string_M_assign_Pv( void *that, void *value );
>
> I want to do the LLVM equivalent of the following C++ (where 's' is a pointer to an instance of 'S'):
>
>         s->b.assign( "Hello, world!" ); // assign to S::b
>
> If there were an S member function:
>
>         void S::assign_to_b( char const* );
>
> it would be easy to write a "thunk" wrapper to call it.  However, assume that there is no such S member function.  I therefore need a way to get the offset of 'b' and add it to 's' so that I can call T_string_M_assign_Pv() on it.
>
> Given this helper function:
>
>         template<class ClassType,class MbrType> inline
>         ptrdiff_t mbr_offset_of( MbrType ClassType::*p ) {
>           ClassType const *const c = static_cast<ClassType*>( nullptr );
>           return reinterpret_cast<ptrdiff_t>( &(c->*p) );
>         }
>
> I could take a pointer to an S, use ptrtoint, add the offset, use inttoptr, and pass the resulting pointer as the 'this' argument to T_string_M_assign_Pv().  The LLVM code generated via the IRBuilder is:
>
>   %0 = call i8* @T_S_M_new(i8* %heap)
>   %1 = ptrtoint i8* %0 to i64
>   %2 = add i64 %1, 8           ; 8 is what's returned by mbr_offset_of()
>   %3 = inttoptr i64 %2 to i8*
>   call void @T_string_M_assign_A_Pv(i8* %3, i8* getelementptr inbounds ([15 x i8]* @0, i64 0, i64 0))
>
> The code does in fact work.  My questions are:
>
> * Is this an "OK" thing to do?

Doing arithmetic on pointers is perfectly legitimate when you know
the offsets.  Clang itself generates code like this for certain casts
that can't be represented in the type system.

Using GEP on an i8* is a bit nicer to the optimizer, though, because
a ptrtoint/inttoptr round trip hides where the pointer came from and
makes alias analysis more conservative.
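
In the IR above, that would mean replacing the ptrtoint/add/inttoptr
sequence with a single getelementptr on %0 with an index of 8.  As a
rough illustration of the same idea in C++ (a sketch only: member_at
and offset_of_b are made-up helper names, and the thunk body is a
stand-in for whatever the real T_string_M_assign_Pv does), stepping a
byte pointer forward keeps the value a pointer the whole time:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct S {
  std::string a;
  std::string b;
};

// Hypothetical stand-in for the T_string_M_assign_Pv thunk in the question.
extern "C" void T_string_M_assign_Pv(void *that, void *value) {
  static_cast<std::string *>(that)->assign(static_cast<char const *>(value));
}

// Hypothetical helper: step a byte pointer forward by a known offset.
// This is the C++ counterpart of a GEP on i8*; the pointer is never
// converted to an integer and back.
inline void *member_at(void *base, std::ptrdiff_t offset) {
  assert(base != nullptr);
  return static_cast<char *>(base) + offset;
}

// Offset of S::b measured on a live object, so no null pointer is
// dereferenced to compute it.
inline std::ptrdiff_t offset_of_b(S &s) {
  return reinterpret_cast<char *>(&s.b) - reinterpret_cast<char *>(&s);
}
```

Calling T_string_M_assign_Pv(member_at(&s, offset_of_b(s)), msg) on
an S named s then assigns to s.b and leaves s.a untouched.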

> * Is there a better way?

I'm not entirely sure how you're using mbr_offset_of, but it's broken
if there are any classes with virtual bases involved, because the
offset of a virtual base depends on the dynamic type of the object
and isn't a compile-time constant.  Getting this case right probably
involves using clang somehow (either to synthesize the relevant
thunks, or to query for the right offsets and generate the code
yourself).
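
To make the virtual-base problem concrete, here is a small sketch
(Base, Mid, Derived and base_offset are invented for illustration,
and the result assumes a typical ABI such as the Itanium C++ ABI).
The distance from a Mid subobject to its virtual Base differs between
a standalone Mid and a Mid embedded in a Derived, so no single
constant offset can be baked into the generated code:

```cpp
#include <cassert>
#include <cstddef>

struct Base { int x = 0; };
struct Mid : virtual Base { int m = 0; };
// The extra data pushes the virtual Base subobject further from Mid.
struct Derived : Mid { long extra[4] = {}; };

// Byte distance from a Mid subobject to its Base subobject.  The
// Mid* -> Base* conversion is resolved at run time for a virtual base,
// so the result depends on the most-derived type of the object.
inline std::ptrdiff_t base_offset(Mid *mid) {
  assert(mid != nullptr);
  Base *base = mid;
  return reinterpret_cast<char *>(base) - reinterpret_cast<char *>(mid);
}
```

Comparing base_offset on a plain Mid with base_offset on the Mid part
of a Derived gives two different values on such an ABI, which is why
a null-pointer trick like mbr_offset_of can't work here.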

-Eli


