[LLVMdev] PROPOSAL: IR representation of detailed struct assignment information
Chris Lattner
clattner at apple.com
Mon Aug 27 22:22:10 PDT 2012
<moving this to llvmdev now that the lists are back up!>
On Aug 23, 2012, at 4:37 PM, Dan Gohman <gohman at apple.com> wrote:
> On Aug 23, 2012, at 4:05 PM, Chris Lattner <clattner at apple.com> wrote:
>> On Aug 23, 2012, at 3:59 PM, Dan Gohman <gohman at apple.com> wrote:
>>> On Aug 23, 2012, at 3:31 PM, Chris Lattner <clattner at apple.com> wrote:
>>>> Interesting approach. The IR type for a struct may or may not be enough to describe holes (think unions and other cases), have you considered a more explicit MDNode that describes the ranges of any holes?
>>>
>>> What's the issue with unions? Do you mean unions containing structs
>>> containing holes?
>>
>> Unions don't lower to a unique or useful IR type. In general, I'm skeptical of anything that uses IR types to reason about source level types (except primitives like integers and floats).
>
> I'm confused. It seems a big difference here between your expectations
> and my understanding is that you're expecting to see source level types
> here, whereas it hadn't even occurred to me that we should try to represent
> source level types.
My point here is that the frontend reasons about two things: 1) a source level construct of a type, and 2) LLVM IR types. The LLVM IR type lowering is not guaranteed to cover all fields in the source type (e.g. in the case of unions).
Let me give you a dumb example. Consider:
union x {
  struct { char b; int c; } a;
  short b;
} u;
On my system, Clang codegens this to:
%union.x = type { %struct.anon }
%struct.anon = type { i8, i32 }
This isn't a safe IR type to use to describe a memcpy (because it wouldn't copy all of "b"), so implementing your proposal would require implementing yet another conversion from AST types to LLVM types that *is* guaranteed to cover all the fields.
Instead of implementing this, it would be a lot easier for clang to walk a type and produce a mask describing all the holes in a type, using a simple recursive algorithm (where a union intersects its members' "hole sets", finding that only bytes 3 and 4 of the union, at offsets 2 and 3, are a hole).
Given this, it makes a lot more sense to explicitly model this hole set in an MDNode (e.g. by using a list of byte ranges?) instead of representing the holes with a null pointer constant of some IR type.
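To make the recursive walk concrete, here is a rough C++ sketch. It is not Clang code: the Type model, markUsed, and holeRanges are hypothetical names invented for illustration, and the layout (sizes and offsets) is assumed to be supplied by the caller. It tracks the bytes that some member stores data in; the holes are whatever is left over. For a union this is the same as intersecting the members' hole sets.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified type model (NOT Clang's AST).
struct Type {
    enum Kind { Scalar, Struct, Union } kind;
    std::size_t size;  // total size in bytes, padding included
    struct Member { std::size_t offset; const Type *type; };
    std::vector<Member> members;  // empty for scalars
};

// Mark every byte that some scalar occupies. A byte left unmarked is a hole.
// Struct fields and union members are handled the same way: a byte is "used"
// if any member uses it, so a union's hole set is the intersection of its
// members' hole sets (each taken over the full size of the union).
void markUsed(const Type &t, std::size_t base, std::vector<bool> &used) {
    assert(base + t.size <= used.size());
    if (t.kind == Type::Scalar) {
        for (std::size_t i = 0; i < t.size; ++i) used[base + i] = true;
        return;
    }
    for (const Type::Member &m : t.members)
        markUsed(*m.type, base + m.offset, used);
}

// Half-open byte range [first, second).
using ByteRange = std::pair<std::size_t, std::size_t>;

// Collapse the unmarked bytes into a list of byte ranges, which is the
// kind of data an explicit "hole set" could carry.
std::vector<ByteRange> holeRanges(const Type &t) {
    std::vector<bool> used(t.size, false);
    markUsed(t, 0, used);
    std::vector<ByteRange> holes;
    for (std::size_t i = 0; i < t.size;) {
        if (used[i]) { ++i; continue; }
        std::size_t begin = i;
        while (i < t.size && !used[i]) ++i;
        holes.emplace_back(begin, i);
    }
    return holes;
}
```

Assuming the layout shown above (an 8-byte union, with the char at offset 0, the int at offset 4, and the short at offset 0), the anonymous struct on its own has the hole [1, 4), while the union has only [2, 4), because the short covers byte 1.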
Does this make sense?
-Chris