[llvm-dev] [cfe-dev] RFC: Enforcing pointer type alignment in Clang

Thu Jan 14 17:13:39 PST 2016

> On Jan 14, 2016, at 3:21 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> ----- Original Message -----
>> From: "John McCall via cfe-dev" <cfe-dev at lists.llvm.org>
>> To: cfe-dev at lists.llvm.org, llvm-dev at lists.llvm.org
>> Sent: Thursday, January 14, 2016 2:56:37 PM
>> Subject: [cfe-dev] RFC: Enforcing pointer type alignment in Clang
>> 
>> C 6.3.2.3p7 (N1548) says:
>>  A pointer to an object type may be converted to a pointer to a
>>  different object type. If the resulting pointer is not correctly
>>  aligned for the referenced type, the behavior is undefined.
>> 
>> C++ [expr.reinterpret.cast]p7 (N4527) defines pointer conversions in
>> terms of conversions from void*:
>>  An object pointer can be explicitly converted to an object pointer
>>  of a different type. When a prvalue v of object pointer type is
>>  converted to the object pointer type “pointer to cv T”, the result
>>  is static_cast<cv T*>(static_cast<cv void*>(v)).
>> 
>> C++ [expr.static.cast]p13 says of conversions from void*:
>>  A prvalue of type “pointer to cv1 void” can be converted to a
>>  prvalue of type “pointer to cv2 T” .... If the original pointer
>>  value represents the address A of a byte in memory and A satisfies
>>  the alignment requirement of T, then the resulting pointer value
>>  represents the same address as the original pointer value, that is,
>>  A. The result of any other such pointer conversion is unspecified.
>> 
>> The clear intent of these rules is that the implementation may assume
>> that any pointer is adequately aligned for its pointee type, because
>> any attempt to actually create a misaligned pointer has undefined
>> behavior.  It is very likely that, if we found a hole in those rules
>> that seemed to permit the creation of unaligned pointers, we could go
>> to the committees and have that hole closed.  The language policy here
>> is clear.
>> 
>> There are architectures where this policy is mandatory.  The classic
>> example is (I believe) the Cray C90, which provides word-addressed
>> 32-bit pointers.  Pointers to most types can use this native
>> representation directly.  char*, however, requires sub-word addressing,
>> which means void* and char* are actually 64 bits in order to permit the
>> storage of the sub-word offset.  An int* therefore literally cannot
>> express an arbitrary void*.
>> 
>> Less dramatically, there are architectural features that clearly
>> depend on alignment.  It's unreasonable to expect processors to
>> support atomic accesses that straddle the basic unit of their cache
>> coherence implementations.  Supporting small unaligned accesses has a
>> fairly marginal cost in extra hardware, but as accesses grow to 128
>> bits or larger, those costs can spiral out of control.  These
>> restrictions are fairly widely understood by compiler users.
>> 
>> Everything below is mushier.  It's clearly advantageous for the
>> compiler to be able to make stronger assumptions about alignment when
>> accessing memory.  ISAs often allow more efficient accesses to
>> properly-aligned memory; for example, 32-bit ARM can perform a 64-bit
>> memory access in a single instruction, but the address is required to
>> be 8-byte-aligned.  Alignment also affects compiler decisions even
>> when the architecture doesn't enforce it; for example, it can be
>> profitable to combine two adjacent loads into a single, wider load,
>> but this will often slow down code if the wider load is no longer
>> properly aligned.
>> 
>> As is the case with most forms of undefined behavior, programmers have
>> at best an abstract appreciation for the positive effects of these
>> optimizations, but they have a very concrete understanding of the
>> disruptive life effects of being forced to fix crashes from
>> mis-alignment.
>> 
>> Our standard response in LLVM/Clang is to explain the undefined
>> behavior rule, explain the benefits it provides, and politely ask
>> users to, well, deal with it.  And that's appropriate; most forms of
>> undefined behavior under the standard(s) are reasonable requests with
>> reasonable code workarounds.  However, we have also occasionally
>> looked at a particular undefined behavior rule and decided that
>> there's a real usability problem with enforcing it as written.  In
>> these cases, we use our power as implementors to make some subset of
>> that behavior well-defined in order to fix that problem.  For example,
>> we did this for TBAA, because we recognized that certain "obvious"
>> aliasing violations were idiomatic and only had awkward workarounds
>> under the standard.
>> 
>> There's a similar problem here.  Much like TBAA, fixing it doesn't
>> require completely abandoning the idea of enforcing type-based
>> alignment assumptions.  It does, however, require a significant
>> adjustment to the language rule.
>> 
>> The problem is this: the standards make it undefined behavior to even
>> create an unaligned pointer.  Therefore, as soon as I've got such a
>> pointer, I'm basically doomed; it is no longer possible to locally
>> work around the problem.  I have to change the type of the pointer to
>> something that requires less alignment, and not just where I'm using
>> it, or even just within my function, but all the way up to wherever
>> it came from.
>> 
>> For example, suppose I've got this function:
>> 
>>  void processBuffer(const int32_t *buffer, size_t length) {
>>    ...
>>  }
>> 
>> I get a bug report saying that my function is crashing, and I decide
>> that the right fix is to make the function handle unaligned buffers
>> correctly.  Maybe that's a binary-compatibility requirement, or maybe
>> the buffer is usually coming from a serialized format that doesn't
>> guarantee alignment, and it's clearly unreasonable to copy the buffer
>> just to satisfy my function.
>> 
>> So how can I make this function handle unaligned buffers?  The type of
>> the argument itself means that being passed an unaligned buffer has
>> undefined behavior.  Now, I can change that parameter to use an
>> unaligned typedef:
>> 
>>  typedef int32_t unaligned_int32_t __attribute__((aligned(1)));
>>  void processBuffer(const unaligned_int32_t *buffer, size_t length) {
>>    ...
>>  }
>> 
>> But this has severe problems.  First off, this is a GCC/Clang
>> extension; a lot of programmers feel uncomfortable adopting that,
>> especially to fix a problem that's in principle common across
>> compilers.  Second, alignment attributes are not really part of the
>> type system, which means that they can be silently dropped by any
>> number of things, including both major features like templates and
>> just day-to-day quality-of-implementation stuff like the common-type
>> logic of the conditional operator.  And finally, my callers still have
>> undefined behavior, and I really need to go audit all of them to make
>> sure they're using the same sort of typedef.  This is not a reliable
>> solution to the bug.
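
A minimal sketch of the portable alternative, not something proposed in
the RFC itself: take the buffer as const void* and read each element
with memcpy, so that no int32_t* into unaligned memory is ever formed or
dereferenced.  The names load_i32_unaligned and sumBuffer are
hypothetical, and summing merely stands in for whatever processBuffer
actually does with the data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read one int32_t from a possibly unaligned address.
   memcpy copies byte-wise, so it places no alignment requirement on p. */
static int32_t load_i32_unaligned(const void *p) {
    int32_t value;
    memcpy(&value, p, sizeof value);
    return value;
}

/* Hypothetical stand-in for processBuffer: sums `length` int32_t elements
   stored back to back starting at `buffer`, whatever its alignment. */
static int64_t sumBuffer(const void *buffer, size_t length) {
    const unsigned char *bytes = buffer;
    int64_t total = 0;
    for (size_t i = 0; i < length; ++i)
        total += load_i32_unaligned(bytes + i * sizeof(int32_t));
    return total;
}
```

Compilers commonly lower a fixed-size memcpy like this to a single load
on targets that permit unaligned access, but that is an optimization and
not a guarantee.  The cost is the one described above: every caller has
to stop passing int32_t* for this to be fully clean.
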
>> 
>> Furthermore, the compiler doesn't really care whether the pointer is
>> abstractly aligned independent of any access to memory.  There aren't
>> very many interesting alignment-based optimizations on pointer values
>> as mere values.  In principle, we could optimize operations that cast
>> the pointer to an integral type and examine the low bits, but those
>> operations are not very common, and when they're there, it's probably
>> for a good reason; that kind of optimization is very likely to just
>> create miscompiles without really showing any benefit.
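
For concreteness, the low-bits check described above usually looks
something like the sketch below.  is_aligned_to is a hypothetical helper
name, and the mask trick assumes the alignment is a power of two.  If
the compiler assumed that every int32_t* value was 4-byte-aligned, a
call such as is_aligned_to(p, 4) on such a pointer could be folded to
true, which would defeat the very check the programmer wrote.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reports whether p is a multiple of `alignment`.
   Assumption: `alignment` is a nonzero power of two, so that
   (alignment - 1) is a mask covering exactly the low bits. */
static bool is_aligned_to(const void *p, size_t alignment) {
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```
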
>> 
>> Therefore, I would like to propose that Clang formally adopt a
>> significantly weaker language rule for enforcing the alignment of
>> pointers.  The basic idea is this:
>> 
>>  It is not undefined behavior to create a pointer that is less
>>  aligned than its pointee type.  Instead, it is only undefined
>>  behavior to access memory through a pointer that is less aligned
>>  than its pointee type.
>> 
>> That is, the only thing that matters is the type when you actually
>> perform the access, not any type the pointer might have had at some
>> earlier point during execution.
>> 
>> Notably, I believe that this rule doesn't require any changes in our
>> current behavior, so adopting it is just a restriction on future
>> compiler optimization.
> 
> For the sake of completeness, I'll mention one exception. If the pointer (or its type via a typedef) has the __attribute__((align_value(N))) attribute, then we do emit alignment attributes on the pointer values themselves and use that information in later optimizations. This is by design, but given that it is explicitly opt-in, I feel this falls into a different category than the situations you've described.

Sure, that seems reasonable.  It’s the default language rule I’m concerned about.
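
For reference, a small sketch of that opt-in form, following the typedef
usage shown in the Clang attribute documentation.  ALIGN_VALUE and sum4
are hypothetical names, and the macro expands to nothing on compilers
that do not report support for align_value, in which case the promise is
simply not made.

```c
/* Use align_value only where the compiler reports support for it
   (Clang does); elsewhere the macro is empty and nothing is promised. */
#if defined(__has_attribute)
#  if __has_attribute(align_value)
#    define ALIGN_VALUE(n) __attribute__((align_value(n)))
#  endif
#endif
#ifndef ALIGN_VALUE
#  define ALIGN_VALUE(n)
#endif

/* Pointer typedef whose *value* is promised to be 64-byte-aligned. */
typedef double *aligned_double_ptr ALIGN_VALUE(64);

/* Hypothetical example: the caller must really pass a 64-byte-aligned
   pointer, because the optimizer may rely on the promise. */
static double sum4(aligned_double_ptr p) {
    return p[0] + p[1] + p[2] + p[3];
}
```
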

John.

> 
> Realistically, if we ever were to implement optimizations based on default type alignments, we'd need a flag to turn off those assumptions (just like we have a flag to turn off strict aliasing assumptions).
> 
> -Hal
> 
> 
>> For the most part, LLVM IR only attaches alignment to loads, stores,
>> and specific intrinsics like llvm.memcpy; there is no way to say that
>> a pointer value is expected to have a particular alignment.  The one
>> exception that I'm aware of is that an indirect parameter can have an
>> expected alignment.  However, Clang currently only sets this for
>> by-value arguments that the calling convention says to pass
>> indirectly, and that remains acceptable under this new rule because
>> it's an ABI rule rather than a constraint on programmer behavior
>> (other than assembly programmers).  The rule just means that we can't
>> start setting it on arbitrary pointer parameters.
>> 
>> It is also a very portable rule; I'm not aware of any compilers that
>> do try to take advantage of the formal alignment of pointer values
>> independent of access.
>> 
>> The key question in this new rule is what counts as an "access".  I'll
>> spell this out in more detail, but it's mostly intuitive: anything
>> that ultimately requires a load or store.  The only thing that's
>> perhaps questionable is that we'd like to treat calls to library
>> functions that access memory as if they were direct accesses to their
>> arguments.  For example, we'd like to assume that the pointer
>> arguments to memcpy are properly aligned for their types (that is,
>> their explicit types, before the implicit conversion to void*) so that
>> we can generate a more efficient copy operation.  This analysis
>> currently relies on the language rule that pointers may not be
>> misaligned; preserving it requires us to treat calls to library
>> functions as special, which of course we already do.  Programmers can
>> still suppress this assumption by explicitly casting the arguments to
>> void*.
>> 
>> Here's the proposed new rule, expressed more formally:
>> 
>> ---
>> 
>> It is well-defined behavior to construct a pointer to memory that
>> is less aligned than the alignment of the pointee type (if a complete
>> type).  However, it is undefined behavior to “access” an expression
>> that is an r-value of type T* or an l-value of type T if T is a
>> complete type and the memory is less aligned than T.
>> 
>> An r-value expression of pointer type is accessed if:
>> - it is dereferenced (with *) and the resulting l-value is accessed,
>> - it is implicitly converted to another pointer type and the
>>   result is accessed,
>> - it undergoes pointer addition and the result is accessed,
>> - it is passed to a function in the C standard library that is known
>>   to access the memory,
>> - in C++, it is converted to a pointer to a virtual base, or
>> - in C++, it is explicitly cast (other than by a reinterpret_cast) to
>>   a related class pointer type and the result is accessed.
>> 
>> An l-value expression is accessed if:
>> - it undergoes an lvalue-to-rvalue conversion (i.e. it is loaded),
>> - it is the LHS of an assignment operator (including the
>>   compound assignments),
>> - it is the base of a member access (with .) and the resulting l-value
>>   is accessed (recall that x->y is defined as (*x).y),
>> - its address is taken (with &) and the resulting pointer is accessed,
>> - in C++, it is implicitly converted to be an l-value to a base type
>>   and the result is accessed,
>> - in C++, it is converted to be an l-value of a virtual base type,
>> - in C++, it is used as the "this" argument of a call to a
>>   non-static member function, or
>> - in C++, a reference is bound to it (which includes explicit
>>   casts to reference type).
>> 
>> These are the cases covered by the language standard.  There is a
>> very long tail of other kinds of expression that obviously access
>> memory, like the atomic and overflow builtins, which I can't
>> reasonably enumerate.  The intent should be obvious, but I'm willing
>> to spell it out in other cases where necessary.
>> 
>> Note that this definition is *syntactic*, meaning that it is expressed
>> in terms of the components of a single statement.  This means that an
>> access that might be undefined behavior if written as a single
>> statement:
>>  highlyAlignedStruct->charMember = 0;
>> may not be undefined behavior if split across two statements:
>>  char *member = &highlyAlignedStruct->charMember;
>>  *member = 0;
>> In effect, the compiler promises to never propagate alignment
>> assumptions between statements through its knowledge of how a pointer
>> was constructed.  This is necessary in order to allow local
>> workarounds to be reliable.
>> 
>> Note also that this definition does not propagate through explicit
>> casts, other than class-hierarchy casts in C++.  Again, this is a
>> deliberate choice to make misalignment workarounds more
>> straightforward.
>> 
>> But note that this rule does still allow the compiler to make stronger
>> abstract assumptions about the alignment of C++ references and the
>> "this" pointer.
>> 
>> ---
>> 
>> Please let me know what you think.
>> 
>> John.
>> 
> 
> -- 
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory