[LLVMdev] Offset to C++ structure members
Paul J. Lucas
paul at lucasmail.org
Tue Oct 2 15:28:22 PDT 2012
On Oct 2, 2012, at 2:34 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Tue, Oct 2, 2012 at 11:33 AM, Paul J. Lucas <paul at lucasmail.org> wrote:
>
>> My understanding is that, in order to use GEP, you have to provide the LLVM code with the struct layout, i.e., build a StructType object. In my case, that struct is declared in C++ code already and, in order to use GEP, I'd have to replicate the struct layout (exactly as the C++ compiler would) in LLVM code -- something that I'd rather not do, not to mention that it's fairly "brittle" even if I could manage to get it right. (Simple structs would probably be easy, but structs that have virtual functions or multiple base classes would be much harder.)
>
> No, you don't have to... you can just use GEP on i8*'s. The LLVM type
> system doesn't have any semantic significance.
Oh! I just tried it and it works. :-)
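For readers of the archive, here is a minimal C++ sketch of what a GEP on an i8* with a byte offset amounts to: plain char* arithmetic followed by a cast. The struct Point and the helpers member_at and store_y are hypothetical names used only for illustration, and offsetof is used here only because Point is a simple standard-layout struct.

```cpp
#include <cstddef>

// Hypothetical standard-layout struct, used only for illustration.
struct Point {
    int x;
    double y;
};

// Byte-offset addressing: roughly the C++ analogue of
//   getelementptr i8* %base, i64 <offset>
// followed by a cast to the member's pointer type.
template<class MbrType>
MbrType* member_at(void* base, std::ptrdiff_t offset) {
    return reinterpret_cast<MbrType*>(static_cast<char*>(base) + offset);
}

// Writes through the computed address, then reads the value back
// through the ordinary member access to show both name the same object.
inline double store_y(Point& p, double v) {
    *member_at<double>(&p, offsetof(Point, y)) = v;
    return p.y;
}
```

No struct layout is described to the addressing code; only the byte offset and the final pointer type matter.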
>>> I'm not entirely sure how you're using mbr_offset_of
>>
>> Given 't', an instance of some class T, and some member T::m, find the integer offset in bytes from &t to &t.m. This offset, when added to &t, should be &t.m.
>>
>> I'm using mbr_offset_of to get the C++ compiler to do the work of telling me what the correct offset is for the already existing struct.
>
> If you can do that, why not just generate a thunk to perform the addressing?
Because if I can create a thunk to do that, I can just as easily create a thunk to provide a "setter" for the struct member (something I'd prefer not to do).
I'm trying to compute the offset "inline" in the LLVM code rather than (a) have to create yet another C thunk and (b) call it.
>>> but it's broken if there are any classes with virtual bases involved.
>>
>> Really? This simple code works just fine:
>>
>> #include <cstddef>
>> #include <iostream>
>> #include <string>
>> using namespace std;
>>
>> struct A { int ai; };
>> struct X : virtual A { int xi; };
>> struct Y : virtual A { int yi; };
>>
>> struct S : X, Y {
>>   string a;
>>   string b;
>> };
>>
>> // Applies the member pointer to a null object pointer and reads the
>> // resulting address as a byte offset (formally undefined behavior).
>> template<class ClassType,class MbrType> inline
>> ptrdiff_t mbr_offset_of( MbrType ClassType::*p ) {
>>   ClassType const *const c = static_cast<ClassType*>( nullptr );
>>   return reinterpret_cast<ptrdiff_t>( &(c->*p) );
>> }
>>
>> int main() {
>>   ptrdiff_t offset = mbr_offset_of( &S::b );
>>   S s;
>>   string *p = (string*)((char*)&s + offset);
>>   p->assign( "Hello, world!" );
>>   cout << *p << endl;
>>   return 0;
>> }
>
> It starts to become an issue when you try to compute the offset to
> e.g. A::ai in your example.
Hmmmm.... I just changed:
s/int ai/string as/
s/&S::b/&S::as/
and the code still works. The offset is 0 which is what you'd expect with class 'A' being a public virtual (shared) base class.
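A possible explanation for the 0, offered as an assumption rather than something established in this thread: a pointer to a member inherited from A has type "T A::*", so with a member of A the template would deduce ClassType as A, and the offset would be measured from the start of the A subobject rather than from the start of S. The sketch below checks only the type deduction. It uses int members to stay short, mirrors the class names from the example above, and deduced_class_is is a hypothetical helper.

```cpp
#include <type_traits>

struct A { int ai; };
struct X : virtual A { int xi; };
struct Y : virtual A { int yi; };
struct S : X, Y { int si; };

// Hypothetical helper: reports whether the class deduced from a
// pointer-to-member argument is Expected. The parameter has the same
// shape as the one taken by mbr_offset_of.
template<class Expected, class ClassType, class MbrType>
constexpr bool deduced_class_is(MbrType ClassType::*) {
    return std::is_same<Expected, ClassType>::value;
}

// &S::ai names a member declared in A, so its type is int A::*,
// even though it is spelled with S.
static_assert(std::is_same<decltype(&S::ai), int A::*>::value,
              "member of A yields a pointer to member of A");
```

Under that assumption, deduced_class_is<A>(&S::ai) is true while deduced_class_is<S>(&S::ai) is false, so an offset computed this way would say nothing about where the A subobject sits inside an S.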
>> Despite that, however, the equivalent code in LLVM crashes once I introduce a base class for S, even with just ordinary inheritance. I don't understand why. I print out the offset, and the correct value is being added to the pointer.
>
> No idea what's happening here.
The IR code is now:
@0 = private unnamed_addr constant [14 x i8] c"Hello, world!\00"
...
%0 = call i8* @T_S_M_new(i8* %heap)
%1 = getelementptr i8* %0, i64 16
call void @T_string_M_assign_A_Pv(i8* %1, i8* getelementptr inbounds ([14 x i8]* @0, i64 0, i64 0))
where the "16" is the correct offset (it agrees with my pure C++ version of the code), yet it still crashes. It's not obvious why.
- Paul