[PATCH] D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type

Wed Aug 9 22:16:28 PDT 2017

On 08/10/2017 12:01 AM, Chandler Carruth wrote:
> On Wed, Aug 9, 2017 at 9:51 PM Hal Finkel <hfinkel at anl.gov> wrote:
>
>
>     On 08/09/2017 11:03 PM, Chandler Carruth wrote:
>>     Hal already answered much of this, just continuing this part of
>>     the discussion...
>>
>>     On Wed, Aug 9, 2017 at 8:56 PM Xinliang David Li via llvm-commits
>>     <llvm-commits at lists.llvm.org> wrote:
>>
>>         On Wed, Aug 9, 2017 at 8:37 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>
>>
>>             On 08/09/2017 10:14 PM, Xinliang David Li via
>>             llvm-commits wrote:
>>>              Can you elaborate here too? If there were missed
>>>             optimizations that later got fixed, there should be
>>>             regression tests for them, right?  And what information
>>>             is missing?
>>
>>             To make a general statement, if we load (a, i8) and (a+2,
>>             i16), for example, and these came from some structure,
>>             we've lost the information that the load (a+1, i8) would
>>             have been legal (i.e. is known to be dereferenceable). This
>>             is not specific to bit fields, but the fact that we lose
>>             information on the dereferenceable byte ranges around
>>             memory accesses turns into a problem when we later can't
>>             legally widen. There may be a better way to keep this
>>             information other than producing wide loads (which is an
>>             imperfect mechanism, especially the way we do it by
>>             restricting to legal integer types),
>>
>>
>>     I don't think we have such a restriction? Maybe I'm missing
>>     something. When I originally added this logic, it definitely was
>>     not restricted to legal integer types.
>
>     I believe you're right for bitfields. For general structures,
>     however, we certainly load individual fields instead of loading
>     the whole structure with some wide integer in order to preserve
>     dereferenceability information.
>
>
> I don't believe structures provide that information. See below.
>
>
>
>>
>>             but at the moment, we don't have anything better.
>>
>>
>>         Ok, as you mentioned, widening looks like a workaround to
>>         paper over the IR's inability to annotate this information.
>>         More importantly, my question is whether this is just a
>>         theoretical concern.
>>
>>
>>     I really disagree with this being a workaround.
>>
>>     I think it is very fundamentally the correct model -- the
>>     semantics are that this is a single, wide memory operation that a
>>     narrow data type is extracted from.
>
>     That is one option. We do need to preserve this information (maybe
>     we can do this with TBAA, or similar, or maybe using some other
>     mechanism entirely). However, we do try harder to do this with
>     bitfields than with other aggregates. If I have struct { int a, b,
>     c, d; } S; and I load S.d, we don't do this by loading a 128-bit
>     integer and then extracting some part of it. Should we? Probably not.
>
>
> We cannot, it isn't allowed (I'm pretty sure...)
>
> 1) It violates the C++ (and C) memory model -- another thread could be
> writing to the other variables.

Ah, indeed, you're correct. That does indeed motivate bitfields being a 
special case. Do the comments explain that somewhere?
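
To make the point concrete for myself, here is a rough sketch of why the
128-bit load is off the table. It only models which bytes each access would
touch; ByteRange, overlaps, member_access and widened_access are names made up
for illustration, not anything in Clang or LLVM. The assumption is the usual
one: each non-bit-field scalar member is its own memory location, so another
thread may legally write S.a while this thread reads S.d.

```cpp
#include <cstddef>

struct S { int a, b, c, d; };

// Half-open byte range [begin, end) inside an S object (illustrative only).
struct ByteRange { std::size_t begin, end; };

constexpr bool overlaps(ByteRange x, ByteRange y) {
    return x.begin < y.end && y.begin < x.end;
}

// Bytes touched by an ordinary access to one int member at `offset`.
constexpr ByteRange member_access(std::size_t offset) {
    return {offset, offset + sizeof(int)};
}

// Bytes touched by a hypothetical widened access covering all of S.
constexpr ByteRange widened_access() { return {0, sizeof(S)}; }

// Another thread may be writing S.a while this thread reads S.d.
constexpr ByteRange write_a = member_access(offsetof(S, a));
constexpr ByteRange read_d  = member_access(offsetof(S, d));

static_assert(!overlaps(read_d, write_a),
              "a narrow load of d never touches a");
static_assert(overlaps(widened_access(), write_a),
              "a widened load would read a while it may be written");
```

The second assertion is the problem case: the widened load would read bytes
that a concurrent writer is allowed to modify, which the source program never
asked for.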

I'll need to add this to my mental list of sometimes-unfortunate semantics.

  -Hal

>
> 2) Related to #1, there are applications that rely on this memory 
> model, for example structures where entire regions of the structure 
> live in protected pages and cannot be correctly accessed.
>
> 3) Again related to #1, there are applications that rely on the memory 
> model when doing memory-mapped IO to avoid reading or writing regions 
> that are being updated by the OS or other processes.
>
> Bitfields are the only place where we have specific license to widen 
> access in the C++ memory model (that I'm aware of)....
>
>     I suspect having better support for aggregate memory access would
>     be a better solution. Or, as noted, using metadata or some other
>     secondary mechanism.
>
>
> FWIW, I actually agree that if we want to do more of this, we would be 
> better served by a different IR, but I strongly suspect it would look 
> more like first class aggregates rather than metadata so that we could 
> reason about it more fundamentally in terms of SSA.
>
> But bitfields are (IMO) an importantly different problem in that they 
> are mergeable in interesting and important ways due to being integers 
> and often sub-byte integers. This is why a single large integer 
> combined with late narrowing seems like a particularly desirable way 
> to represent the fundamental information of the semantic constraints 
> of the program.
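
For the case in the patch title (a field whose width is a legal integer type
and whose bit offset is naturally aligned), the wide form and the narrow form
read the same value. A minimal sketch, assuming a 32-bit storage unit whose
bits [16, 32) hold a 16-bit field; the function names are hypothetical and
this is not how Clang or LLVM implement it:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Wide form: load the whole 32-bit storage unit, then narrow late.
inline std::uint16_t load_wide_then_narrow(const unsigned char *p) {
    std::uint32_t word;
    std::memcpy(&word, p, sizeof word);
    return static_cast<std::uint16_t>(word >> 16);
}

// Narrow form: load only the two bytes that hold bits [16, 32).
inline std::uint16_t load_narrow(const unsigned char *p) {
    // The byte offset of the upper half depends on endianness, so probe it.
    const std::uint32_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    const std::size_t offset = (first == 1) ? 2 : 0;  // little : big endian
    std::uint16_t half;
    std::memcpy(&half, p + offset, sizeof half);
    return half;
}
```

The two functions agree on the value; they differ only in how many bytes they
touch, which is the information the wide form keeps and the narrow form drops.
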
>
>     Maybe more aggressively preserving this information for bit fields
>     is the right answer, empirically. I can believe that's true. The
>     more-general problem still exists, however.
>
>
> For other languages / semantics, yes. Increasingly I think a (better 
> designed / integrated / spec'ed, etc) system like FCAs would work 
> particularly well at making this easy to express and reason about. But 
> it would be a pretty significant change.
>
>
>     The thing that appeals to me about the IR-transformation approach
>     is the ability to handle "hand coded" bit fields as effectively as
>     language-level bit fields. I've certainly seen my share of these,
>     and they're definitely important. Moreover, this is true
>     regardless of what we think about the underlying optimal model for
>     preserving aggregate dereferenceability in general.
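
By "hand coded" I mean something along these lines. The layout (a 12-bit count
in bits [4, 16) of a 32-bit word) and the names are invented for illustration:

```cpp
#include <cstdint>

// Hypothetical layout: bits [4, 16) of the word hold a 12-bit `count`.
constexpr std::uint32_t kCountShift = 4;
constexpr std::uint32_t kCountMask = 0xFFFu << kCountShift;

// Read the field: mask out the other bits, then shift down.
inline std::uint32_t get_count(std::uint32_t word) {
    return (word & kCountMask) >> kCountShift;
}

// Write the field: clear its bits, then merge in the new value.
inline std::uint32_t set_count(std::uint32_t word, std::uint32_t count) {
    return (word & ~kCountMask) | ((count << kCountShift) & kCountMask);
}
```

The front end never sees a bit-field here, only wide integer loads, masks and
shifts, so any narrowing has to happen on the IR.
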
>
>
> Completely agree. Teaching LLVM to handle wide integer accesses will 
> be beneficial no matter what decisions are made here.
>

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
