[LLVMdev] PROPOSAL: IR representation of detailed struct assignment information (new version)

Mon Sep 10 14:46:41 PDT 2012

On Sep 10, 2012, at 11:54 AM, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:

> On 9/10/2012 1:29 PM, Chandler Carruth wrote:
>> 
>> The idea would be to make all struct types be packed[1], and to
>> represent padding as explicit members of the struct.
>> 
>> [...]
>>
>> Thoughts?
> 
> Frankly, I like this idea a lot.  I have one comment though: the data type used for the padding fields would need to always be the same, or else we run into the issue of having two types that are equal with respect to the non-padded data, but differ in the types (but not lengths) of the padding.  Those should be considered identical.

After the great struct type rewrite, LLVM no longer unifies structurally
equivalent struct types, so this issue doesn't seem relevant any more.
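
For concreteness, here is a rough sketch of the layout question behind the
quoted concern, using Python's struct module rather than LLVM IR. The
two-member struct { char a; int32 b; } is a made-up example, and the sketch
assumes a target where a 32-bit int is 4 bytes and 4-byte aligned. It compares
the naturally aligned layout with a packed layout that spells out its padding,
and with a second packed layout whose padding has a different type but the
same length.

```python
import struct

# Hypothetical example struct: { char a; int32 b; }
# "@" uses native sizes and native alignment, so the padding is implicit.
NATURAL = "@ci"
# "=" uses native byte order with no alignment; "3x" is three explicit pad bytes.
PACKED_EXPLICIT = "=c3xi"
# Same padding length, but expressed as one pad byte plus a 16-bit integer.
ALT_PADDING = "=cxhi"

natural_size = struct.calcsize(NATURAL)
packed_size = struct.calcsize(PACKED_EXPLICIT)
alt_size = struct.calcsize(ALT_PADDING)

# Pack the same member values under both the implicit and explicit layouts.
natural_bytes = struct.pack(NATURAL, b"A", 7)
packed_bytes = struct.pack(PACKED_EXPLICIT, b"A", 7)

print(natural_size, packed_size, alt_size, natural_bytes == packed_bytes)
```

Under those assumptions the three layouts place the non-padding members at the
same offsets, which is the sense in which differently typed padding describes
the same data.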

> 
> It brings my attention back to this:
> 
> On 8/31/2012 3:15 AM, Renato Golin wrote:
> > On 30 August 2012 21:30, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:
> >> I guess I'm late to the party, but another possibility would be to model
> >> structure types as lists of members with their offsets from the beginning of
> >> the parent aggregate.  This would require extensive changes to LLVM, so I'm
> >> not sure if it's an option.
> >
> > This has been proposed already, and could also be used by bitfields,
> > but it required too many changes and was not accepted.
> >
> > I think the biggest reason against was that it was strongly based on
> > C++ semantics and not generic enough to be considered IR material.
> 
> 
> This would simply omit any non-member information from a type, and provide explicit placement (offset) of the members.  What were the specific concerns regarding this idea in the past?

I don't know of any semantic problems with this. Bytes in memory in LLVM do not have
built-in types, so struct types are "just" sized and aligned blobs with associated
addressing hints. The main design problem here is ease of use, and there are several
different kinds of user here.
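
As a rough illustration of the "blob plus addressing hints" view, here is a
small Python sketch. It is not LLVM code. The LAYOUT table, the member names,
and the store/load helpers are all hypothetical, and little-endian byte order
is fixed only to keep the example deterministic. The type is modeled as
nothing more than a size and a list of members with explicit offsets.

```python
import struct

# Hypothetical layout: member name -> (byte offset, struct format code).
# "<" fixes little-endian byte order and disables implicit alignment.
LAYOUT = {"a": (0, "<B"), "b": (4, "<i")}
SIZE = 8  # total size of the blob in bytes


def store(blob, member, value):
    """Write a member at its explicit offset inside the blob."""
    offset, fmt = LAYOUT[member]
    struct.pack_into(fmt, blob, offset, value)


def load(blob, member):
    """Read a member back from its explicit offset."""
    offset, fmt = LAYOUT[member]
    return struct.unpack_from(fmt, blob, offset)[0]


blob = bytearray(SIZE)
store(blob, "a", 65)
store(blob, "b", 0x01020304)

# The bytes carry no type of their own: the four bytes of "b" can just as
# well be read back as two unsigned 16-bit values.
low, high = struct.unpack_from("<HH", blob, 4)
print(load(blob, "a"), hex(load(blob, "b")), hex(low), hex(high))
```

Bytes 1 through 3 are never named by any member here, so the padding is
simply the part of the blob that the offset list does not mention.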

For example, front-end writers focused on micro-optimizing memory usage might welcome
a change which would give them more control by default. Optimizer writers often
appreciate that in the current system, simple things are very simple and, not
coincidentally, similar to C.

Conceptually, much of my response to Chandler's recent proposal applies to this
proposal as well. I don't dislike it outright, but it doesn't seem to be the best
way to solve the specific problems I started this thread with, and it doesn't
(yet) seem sufficiently motivated otherwise (it'd be a big change).

Dan