[PATCH] D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type
Hal Finkel via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 9 22:16:28 PDT 2017
On 08/10/2017 12:01 AM, Chandler Carruth wrote:
> On Wed, Aug 9, 2017 at 9:51 PM Hal Finkel <hfinkel at anl.gov
> <mailto:hfinkel at anl.gov>> wrote:
>
>
> On 08/09/2017 11:03 PM, Chandler Carruth wrote:
>> Hal already answered much of this, just continuing this part of
>> the discussion...
>>
>> On Wed, Aug 9, 2017 at 8:56 PM Xinliang David Li via llvm-commits
>> <llvm-commits at lists.llvm.org
>> <mailto:llvm-commits at lists.llvm.org>> wrote:
>>
>> On Wed, Aug 9, 2017 at 8:37 PM, Hal Finkel <hfinkel at anl.gov
>> <mailto:hfinkel at anl.gov>> wrote:
>>
>>
>> On 08/09/2017 10:14 PM, Xinliang David Li via
>> llvm-commits wrote:
>>> Can you elaborate here too? If there were missed
>>> optimization that later got fixed, there should be
>>> regression tests for them, right? And what information
>>> is missing?
>>
>> To make a general statement, if we load (a, i8) and (a+2,
>> i16), for example, and these came from some structure,
>> we've lost the information that the load (a+1, i8) would
>> have been legal (i.e. is known to be deferenceable). This
>> is not specific to bit fields, but the fact that we lose
>> information on the dereferenceable byte ranges around
>> memory access turns into a problem when we later can't
>> legally widen. There may be a better way to keep this
>> information other than producing wide loads (which is an
>> imperfect mechanism, especially the way we do it by
>> restricting to legal integer types),
>>
>>
>> I don't think we have such a restriction? Maybe I'm missing
>> something. When I originally added this logic, it definitely was
>> not restricted to legal integer types.
>
> I believe you're right for bitfields. For general structures,
> however, we certainly load individual fields instead of loading
> the whole structure with some wide integer in order to preserve
> dereferenceability information.
>
>
> I don't believe structures provide that information. See below.
>
>
>
>>
>> but at the moment, we don't have anything better.
>>
>>
>> Ok, as you mentioned, widening looks like a workaround to
>> paper over the weakness in IR to annotate the information.
>> More importantly, my question is whether this is a just
>> theoretical concern.
>>
>>
>> I really disagree with this being a workaround.
>>
>> I think it is very fundamentally the correct model -- the
>> semantics are that this is a single, wide memory operation that a
>> narrow data type is extracted from.
>
> That is one option. We do need to preserve this information (maybe
> we can do this with TBAA, or similar, or maybe using some other
> mechanism entirely). However, we do try harder to do this with
> bitfields than with other aggregates. If I have struct { int a, b,
> c, d; } S; and I load S.d, we don't do this by loading a 128-bit
> integer and then extracting some part of it. Should we? Probably not.
>
>
> We cannot, it isn't allowed (I'm pretty sure...)
>
> 1) It violates C++ (and C) memory model -- another thread could be
> writing to the other variables.
Ah, indeed, you're correct. That does indeed motivate bitfields being a
special case. Do the comments explain that somewhere?
I'll need to add this to my mental list of sometimes-unfortunate semantics.
-Hal
>
> 2) Related to #1, there are applications that rely on this memory
> model, for example structures where entire regions of the structure
> live in protected pages and cannot be correctly accessed.
>
> 3) Again related to #1, there are applications that rely on the memory
> model when doing memory-mapped IO to avoid reading or writing regions
> that are being updated by the OS or other processes.
>
> Bitfields are the only place where we have specific license to widen
> access in the C++ memory model (that I'm aware of)....
>
> I suspect having better support for aggregate memory access would
> be a better solution. Or, as noted, using metadata or some other
> secondary mechanism.
>
>
> FWIW, I actually agree that if we want to do more of this, we would be
> better served by a different IR, but I strongly suspect it would look
> more like first class aggregates rather than metadata so that we could
> reason about it more fundamentally in terms of SSA.
>
> But bitfields are (IMO) an importantly different problem in that they
> are mergeable in interesting and important ways due to being integers
> and often times sub-byte integers. This is why a single large integer
> combined with late narrowing seems like a particularly desirable way
> to represent the fundamental information of the semantic constraints
> of the program.
>
> Maybe more aggressively preserving this information for bit fields
> is the right answer, empirically. I can believe that's true. The
> more-general problem still exists, however.
>
>
> For other languages / semantics, yes. Increasingly I think a (better
> designed / integrated / spec'ed, etc) system like FCAs would work
> particularly well at making this easy to express and reason about. But
> it would be a pretty significant change.
>
>
> The thing that appeals to me about the IR-transformation approach
> is the ability to handle "hand coded" bit fields as effectively as
> language-level bit fields. I've certainly seen my share of these,
> and they're definitely important. Moreover, this is true
> regardless of what we think about the underlying optimal model for
> preserving aggregate derefereceability in general.
>
>
> Completely agree. Teaching LLVM to handle wide integer accesses will
> be beneficial no matter what decisions are made here.
>
--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170810/e11e8f73/attachment.html>
More information about the llvm-commits
mailing list