[LLVMdev] Offset to C++ structure members

Paul J. Lucas paul at lucasmail.org
Tue Oct 2 15:28:22 PDT 2012


On Oct 2, 2012, at 2:34 PM, Eli Friedman <eli.friedman at gmail.com> wrote:

> On Tue, Oct 2, 2012 at 11:33 AM, Paul J. Lucas <paul at lucasmail.org> wrote:
> 
>> My understanding is that, in order to use GEP, you have to provide the LLVM code with the struct layout, i.e., build a StructType object.  In my case, that struct is declared in C++ code already and, in order to use GEP, I'd have to replicate the struct layout (exactly as the C++ compiler would) in LLVM code -- something that I'd rather not do, not to mention that it's fairly "brittle" even if I could manage to get it right.  (Simple structs would probably be easy, but structs that have virtual functions or multiple base classes would be much harder.)
> 
> No, you don't have to... you can just use GEP on i8*'s.  The LLVM type
> system doesn't have any semantic significance.

Oh!  I just tried it and it works.  :-)
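
For the archives, here is roughly what that amounts to in C++ terms.  This is only a sketch: Point, member_at, and set_y_via_offset are made-up names for illustration, not anything from my code base.  A GEP on an i8* with a single index is just base-plus-N-bytes arithmetic, so the only thing LLVM needs from me is the right N.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical struct standing in for an already existing C++ type.
struct Point { int x; int y; };

// Roughly what "getelementptr i8* %base, i64 %off" computes: base + off bytes.
// No struct layout is consulted; the caller must supply a correct offset.
inline void *member_at( void *base, std::ptrdiff_t off ) {
  return static_cast<char*>( base ) + off;
}

// Stores through the computed address, then reads the member back normally.
inline int set_y_via_offset( Point &pt, int value ) {
  int *const py = static_cast<int*>( member_at( &pt, offsetof( Point, y ) ) );
  assert( py == &pt.y );  // the byte offset lands exactly on the member
  *py = value;
  return pt.y;
}
```

Calling set_y_via_offset( pt, 42 ) stores through the computed address, and pt.y then reads 42 through ordinary member access.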

>>> I'm not entirely sure how you're using mbr_offset_of
>> 
>> Given 't', an instance of some class T, and some member T::m, find the integer offset in bytes from &t to &t.m.  This offset, when added to &t, should be &t.m.
>> 
>> I'm using mbr_offset_of to get the C++ compiler to do the work of telling me what the correct offset is for the already existing struct.
> 
> If you can do that, why not just generate a thunk to perform the addressing?

Because if I can create a thunk to do that, I can just as easily create a thunk to provide a "setter" for the struct member (something I'd prefer not to do).

I'm trying to compute the offset "inline" in the LLVM code rather than (a) have to create yet another C thunk and (b) call it.
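
To make "inline" concrete: the idea is that the C++ compiler computes the number once and my code generator emits it as a constant in the IR.  A minimal sketch of the C++ half, assuming an instance of the class is available so that no null pointer is involved (mbr_offset_in, Base, Rec, and assign_b_via_offset are hypothetical names used only here):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Byte distance from the start of obj to the member selected by p.
template<class ClassType,class MbrType>
inline std::ptrdiff_t mbr_offset_in( ClassType const &obj, MbrType ClassType::*p ) {
  char const *const base = reinterpret_cast<char const*>( &obj );
  char const *const mbr  = reinterpret_cast<char const*>( &(obj.*p) );
  return mbr - base;
}

// Hypothetical types: ordinary (non-virtual) inheritance plus string members.
struct Base { int id; };
struct Rec : Base { std::string a; std::string b; };

// Uses only the raw offset to reach Rec::b, as the generated code would.
inline std::string const &assign_b_via_offset( Rec &r, char const *text ) {
  std::ptrdiff_t const off = mbr_offset_in( r, &Rec::b );
  std::string *const p =
    reinterpret_cast<std::string*>( reinterpret_cast<char*>( &r ) + off );
  assert( p == &r.b );  // offset-based address matches the real member
  p->assign( text );
  return r.b;
}
```

The value returned by mbr_offset_in is what would end up as the constant index of the i8* GEP.  Note that the member pointer has to name a member declared in ClassType itself, otherwise template deduction sees two different classes.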

>>> but it's broken if there are any classes with virtual bases involved.
>> 
>> Really?  This simple code works just fine:
>> 
>>        #include <cstddef>
>>        #include <iostream>
>>        #include <string>
>>        using namespace std;
>> 
>>        struct A             { int ai; };
>>        struct X : virtual A { int xi; };
>>        struct Y : virtual A { int yi; };
>> 
>>        struct S : X, Y {
>>          string a;
>>          string b;
>>        };
>> 
>>        template<class ClassType,class MbrType> inline
>>        ptrdiff_t mbr_offset_of( MbrType ClassType::*p ) {
>>          ClassType const *const c = static_cast<ClassType*>( nullptr );
>>          return reinterpret_cast<ptrdiff_t>( &(c->*p) );
>>        }
>> 
>>        int main() {
>>          ptrdiff_t offset = mbr_offset_of( &S::b );
>>          S s;
>>          string *p = (string*)((char*)&s + offset);
>>          p->assign( "Hello, world!" );
>>          cout << *p << endl;
>>          return 0;
>>        }
> 
> It starts to become an issue when you try to compute the offset to
> e.g. A::ai in your example.

Hmmmm.... I just changed:

	s/int ai/string as/
	s/&S::b/&S::as/

and the code still works.  The offset is 0, which is what you'd expect with class 'A' being a public virtual (shared) base class.
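
One thing I have not ruled out, so take it as a guess about my own test: since the member is declared in A, my reading of the standard is that naming it through S still yields a pointer to member of A, in which case mbr_offset_of deduces ClassType = A and the 0 is the offset within A, not within S.  A small sketch that checks both halves of that using the original int member (a_offset_within is a hypothetical helper; the nonzero offset is what I'd expect from the common ABIs, not something the standard promises):

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct A             { int ai; };
struct X : virtual A { int xi; };
struct Y : virtual A { int yi; };
struct S : X, Y      { int si; };

// Naming the member through S still gives a pointer to member of A.
static_assert( std::is_same<decltype( &S::ai ), int A::*>::value,
               "&S::ai has type int A::*, not int S::*" );

// Byte distance from the start of an S object to its shared A subobject.
// The static_cast lets the compiler locate the virtual base for this object.
inline std::ptrdiff_t a_offset_within( S &s ) {
  return reinterpret_cast<char*>( static_cast<A*>( &s ) )
       - reinterpret_cast<char*>( &s );
}
```

If a_offset_within( s ) comes back nonzero, then adding 0 to &s does not point at the A member at all, which would fit Eli's warning about virtual bases.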

>> Despite that, however, the equivalent code in LLVM crashes once I introduce a base class for S, even with just ordinary inheritance.  I don't understand why.  I print out the offset, and the correct value is getting added to the pointer.
> 
> No idea what's happening here.

The IR code is now:

	@0 = private unnamed_addr constant [14 x i8] c"Hello, world!\00"
	...
	%0 = call i8* @T_S_M_new(i8* %heap)
	%1 = getelementptr i8* %0, i64 16
	call void @T_string_M_assign_A_Pv(i8* %1, i8* getelementptr inbounds ([14 x i8]* @0, i64 0, i64 0)) 

where the "16" is the correct offset (it agrees with my pure C++ version of the code), yet it still crashes.  It's not obvious why.

- Paul




