[cfe-dev] [LLVMdev] Mapping field names to GEP indices in clang-compiled C

Jeffrey Yasskin jyasskin at google.com
Wed May 20 10:20:20 PDT 2009


On Mon, May 11, 2009 at 4:56 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Mon, May 11, 2009 at 2:17 PM, Jeffrey Yasskin <jyasskin at google.com> wrote:
>>> clang's current behavior here is a bug, but it's a low priority to fix
>>> because the generated IR isn't incorrect, just somewhat difficult to
>>> read.
>>
>> Oh, good. Is there a bug number for that? I'd assumed that it was
>> intentional to give clang more control over struct layout, but if it's
>> accidental I'll look forward to a fix. If I get annoyed enough at it,
>> I may even try to fix it myself. Do you have any pointers to code I
>> should look at to fix it?
>
> I don't think there's a bug filed.  The relevant code is
> RecordOrganizer::layoutStructFields in lib/CodeGen/CodeGenTypes.cpp.
>
> If I recall correctly, I originally wrote the code in question.  I
> wanted to make the first implementation as simple as possible, and
> therefore I only wrote the general case.  The way to fix this is
> basically to add detection for structs where the amount of padding
> LLVM would insert is never more than the necessary amount, and use an
> unpacked struct in those cases.
>
> -Eli
>
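
For anyone who does want to attempt the clang-side fix, the check Eli describes might look roughly like the sketch below. It is not clang code: FieldInfo, AlignUp and NaturalLayoutMatches are hypothetical names, and it only handles the strict case where natural alignment reproduces every required offset exactly, which is narrower than "never more than the necessary amount".

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one field: the offset the C layout requires,
// plus the size and alignment LLVM would use for the field's type.
struct FieldInfo {
    std::uint64_t required_offset;  // byte offset dictated by the C layout
    std::uint64_t size;             // size of the field's type in bytes
    std::uint64_t abi_align;        // alignment of the field's type in bytes
};

// Rounds value up to the next multiple of align.
static std::uint64_t AlignUp(std::uint64_t value, std::uint64_t align) {
    assert(align != 0 && "alignment must be nonzero");
    return (value + align - 1) / align * align;
}

// Returns true if laying the fields out in order with natural alignment
// reproduces every required offset and the required total size. In that
// case an unpacked struct would describe the layout without explicit padding.
bool NaturalLayoutMatches(const std::vector<FieldInfo> &fields,
                          std::uint64_t required_size) {
    std::uint64_t offset = 0;
    std::uint64_t max_align = 1;
    for (const FieldInfo &field : fields) {
        offset = AlignUp(offset, field.abi_align);
        if (offset != field.required_offset)
            return false;  // implicit padding would not match the C layout
        offset += field.size;
        if (field.abi_align > max_align)
            max_align = field.abi_align;
    }
    // The struct's size is rounded up to its largest member alignment.
    return AlignUp(offset, max_align) == required_size;
}
```

A 4-byte field followed by an 8-byte, 8-aligned field at offset 8 passes this check; the same two fields with the second forced to offset 4 do not, so that struct would still need the packed representation.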

Instead of fixing this in clang, I took the other advice from you and
Chris and used offsetof to look up the GEP index for each field. The
patch is http://code.google.com/p/unladen-swallow/source/detail?r=555
and the relevant part looks like the following. Sorry for the
Python-specific code in there; hopefully the meaning is clear for other
projects.

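// Maps a byte offset within a C struct to the index of the LLVM struct
// element that starts at that offset.
unsigned int
_PyTypeBuilder_GetFieldIndexFromOffset(
    const llvm::StructType *type, size_t offset)
{
    // Look up the target's layout information once and reuse it.
    static const llvm::TargetData *const target_data =
        PyGlobalLlvmData::Get()->getExecutionEngine()->getTargetData();
    const llvm::StructLayout *layout = target_data->getStructLayout(type);
    unsigned int index = layout->getElementContainingOffset(offset);
    // Reject offsets that fall in the middle of an element.
    assert(layout->getElementOffset(index) == offset &&
           "offset must be at start of element");
    return index;
}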

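// Defines a static accessor named FIELD_NAME that emits a struct GEP to
// that field of *ptr. The LLVM element index is computed once, from the
// C compiler's offsetof(TYPE, FIELD_NAME).
#define DEFINE_FIELD(TYPE, FIELD_NAME) \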
    static Value *FIELD_NAME(IRBuilder<> &builder, Value *ptr) { \
        assert(ptr->getType() == PyTypeBuilder<TYPE*>::get() && \
               "*ptr must be of type " #TYPE); \
        static const unsigned int index = \
            _PyTypeBuilder_GetFieldIndexFromOffset( \
                PyTypeBuilder<TYPE>::get(), \
                offsetof(TYPE, FIELD_NAME)); \
        return builder.CreateStructGEP(ptr, index, #FIELD_NAME); \
    }
...
    DEFINE_FIELD(PyListObject, ob_size)
    DEFINE_FIELD(PyListObject, ob_item)
    DEFINE_FIELD(PyListObject, allocated)
...

where PyListObject was defined as:

typedef struct {
    PyObject_VAR_HEAD  /* includes ob_size along with other common fields */
    PyObject **ob_item;
    Py_ssize_t allocated;
} PyListObject;
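
For projects that want to see the offset-to-index step without any LLVM types, here is a minimal standalone sketch. Everything in it is an assumption for illustration: Sample is a hypothetical stand-in for a struct like PyListObject, SampleOffsets plays the role of the offset table that StructLayout provides, and FieldIndexFromOffset only approximates what getElementContainingOffset does, using a binary search over the element start offsets.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a struct such as PyListObject.
struct Sample {
    long refcnt;
    void *type;
    long size;
    void **items;
    long allocated;
};

// Start offset of each element, in declaration order. With LLVM this table
// would come from the struct layout rather than from offsetof.
std::vector<std::uint64_t> SampleOffsets() {
    return {offsetof(Sample, refcnt), offsetof(Sample, type),
            offsetof(Sample, size), offsetof(Sample, items),
            offsetof(Sample, allocated)};
}

// Returns the index of the element that starts exactly at offset.
unsigned FieldIndexFromOffset(const std::vector<std::uint64_t> &element_offsets,
                              std::uint64_t offset) {
    assert(!element_offsets.empty() && element_offsets.front() <= offset);
    // upper_bound finds the first element that starts strictly after offset,
    // so the element just before it is the one containing offset.
    auto it = std::upper_bound(element_offsets.begin(), element_offsets.end(),
                               offset);
    unsigned index = static_cast<unsigned>(it - element_offsets.begin()) - 1;
    assert(element_offsets[index] == offset &&
           "offset must be at start of element");
    return index;
}
```

The point of going through offsets is that the index stays correct even if the front end adds padding elements or otherwise changes how it numbers the fields, since only the byte offset from the C compiler is trusted.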


Thanks again for the help!
Jeffrey



