[cfe-dev] RFC: Enforcing pointer type alignment in Clang

Thu Jan 14 15:21:13 PST 2016

----- Original Message -----
> From: "John McCall via cfe-dev" <cfe-dev at lists.llvm.org>
> To: cfe-dev at lists.llvm.org, llvm-dev at lists.llvm.org
> Sent: Thursday, January 14, 2016 2:56:37 PM
> Subject: [cfe-dev] RFC: Enforcing pointer type alignment in Clang
> 
> C 6.3.2.3p7 (N1548) says:
>   A pointer to an object type may be converted to a pointer to a
>   different object type. If the resulting pointer is not correctly
>   aligned for the referenced type, the behavior is undefined.
> 
> C++ [expr.reinterpret.cast]p7 (N4527) defines pointer conversions in
> terms
> of conversions from void*:
>   An object pointer can be explicitly converted to an object pointer
>   of a different type. When a prvalue v of object pointer type is
>   converted to the object pointer type “pointer to cv T”, the result
>   is static_cast<cv T*>(static_cast<cv void*>(v)).
> 
> C++ [expr.static.cast]p13 says of conversions from void*:
>   A prvalue of type “pointer to cv1 void” can be converted to a
>   prvalue of type “pointer to cv2 T” .... If the original pointer
>   value
>   represents the address A of a byte in memory and A satisfies the
>   alignment
>   requirement of T, then the resulting pointer value represents the
>   same
>   address as the original pointer value, that is, A. The result of
>   any
>   other such pointer conversion is unspecified.
> 
> The clear intent of these rules is that the implementation may assume
> that any pointer is adequately aligned for its pointee type, because
> any
> attempt to actually create such a pointer has undefined behavior.  It
> is
> very likely that, if we found a hole in those rules that seemed to
> permit
> the creation of unaligned pointers, we could go to the committees and
> have that hole closed.  The language policy here is clear.
> 
> There are architectures where this policy is mandatory.  The classic
> example is (I believe) the Cray C90, which provides word-addressed
> 32-bit
> pointers.  Pointers to most types can use this native representation
> directly.
> char*, however, requires sub-word addressing, which means void* and
> char*
> are actually 64 bits in order to permit the storage of the sub-word
> offset.
> An int* therefore literally cannot express an arbitrary void*.
> 
> Less dramatically, there are architectural features that clearly
> depend
> on alignment.  It's unreasonable to expect processors to support
> atomic
> accesses that straddle the basic unit of their cache coherence
> implementations.
> Supporting small unaligned accesses has a fairly marginal cost in
> extra
> hardware, but as accesses grow to 128 bits or larger, those costs can
> spiral
> out of control.  These restrictions are fairly widely understood by
> compiler
> users.
> 
> Everything below is mushier.  It's clearly advantageous for the
> compiler to
> be able to make stronger assumptions about alignment when accessing
> memory.
> ISAs often allow more efficient accesses to properly-aligned memory;
> for
> example, 32-bit ARM can perform a 64-bit memory access in a single
> instruction, but the address is required to be 8-byte-aligned.
>  Alignment
> also affects compiler decisions even when the architecture doesn't
> enforce
> it; for example, it can be profitable to combine two adjacent loads
> into
> a single, wider load, but this will often slow down code if the wider
> load is
> no longer properly aligned.
> 
> As is the case with most forms of undefined behavior, programmers
> have at
> best an abstract appreciation for the positive effects of these
> optimizations,
> but they have a very concrete understanding of the disruptive life
> effects
> of being forced to fix crashes from mis-alignment.
> 
> Our standard response in LLVM/Clang is to explain the undefined
> behavior
> rule, explain the benefits it provides, and politely ask users to,
> well,
> deal with it.  And that's appropriate; most forms of undefined
> behavior
> under the standard(s) are reasonable requests with reasonable code
> workarounds.  However, we have also occasionally looked at a
> particular
> undefined behavior rule and decided that there's a real usability
> problem
> with enforcing it as written.  In these cases, we use our power as
> implementors to make some subset of that behavior well-defined in
> order to
> fix that problem.  For example, we did this for TBAA, because we
> recognized
> that certain "obvious" aliasing violations were idiomatic and only
> had
> awkward workarounds under the standard.
> 
> There's a similar problem here.  Much like TBAA, fixing it doesn't
> require
> completely abandoning the idea of enforcing type-based alignment
> assumptions.
> It does, however, require a significant adjustment to the language
> rule.
> 
> The problem is this: the standards make it undefined behavior to even
> create an unaligned pointer.  Therefore, as soon as I've got such a
> pointer, I'm basically doomed; it is no longer possible to locally
> work around the problem.  I have to change the type of the pointer to
> something that requires less alignment, and not just where I'm using
> it, or even just within my function, but all the way up to wherever
> it
> came from.
> 
> For example, suppose I've got this function:
> 
>   void processBuffer(const int32_t *buffer, size_t length) {
>     ...
>   }
> 
> I get a bug report saying that my function is crashing, and I decide
> that the right fix is to make the function handle unaligned buffers
> correctly.  Maybe that's a binary-compatibility requirement, or maybe
> the
> buffer is usually coming from a serialized format that doesn't
> guarantee
> alignment, and it's clearly unreasonable to copy the buffer just to
> satisfy
> my function.
> 
> So how can I make this function handle unaligned buffers?  The type
> of the
> argument itself means that being passed an unaligned buffer has
> undefined
> behavior.  Now, I can change that parameter to use an unaligned
> typedef:
> 
>   typedef int32_t unaligned_int32_t __attribute__((aligned(1)));
>   void processBuffer(const unaligned_int32_t *buffer, size_t length)
>   {
>     ...
>   }
> 
> But this has severe problems.  First off, this is a GCC/Clang
> extension; a lot
> of programmers feel uncomfortable adopting that, especially to fix a
> problem
> that's in principle common across compilers.  Second, alignment
> attributes
> are not really part of the type system, which means that they can be
> silently dropped by any number of things, including both major
> features
> like templates and just day-to-day quality-of-implementation stuff
> like the
> common-type logic of the conditional operator.  And finally, my
> callers
> still have undefined behavior, and I really need to go audit all of
> them
> to make sure they're using the same sort of typedef.  This is not a
> reliable
> solution to the bug.
> 
> Furthermore, the compiler doesn't really care whether the pointer is
> abstractly aligned independent of any access to memory.  There aren't
> very
> many interesting alignment-based optimizations on pointer values as
> mere
> values.  In principle, we could optimize operations that cast the
> pointer
> to an integral type and examine the low bits, but those operations
> are not
> very common, and when they're there, it's probably for a good reason;
> that's the kind of optimization that is very likely to just create
> miscompiles
> without really showing any benefit.
> 
> Therefore, I would like to propose that Clang formally adopt a
> significantly
> weaker language rule for enforcing the alignment of pointers.  The
> basic
> idea is this:
> 
>   It is not undefined behavior to create a pointer that is less
>   aligned
>   than its pointee type.  Instead, it is only undefined behavior to
>   access memory through a pointer that is less aligned than its
>   pointee
>   type.
> 
> That is, the only thing that matters is the type when you actually
> perform
> the access, not any type the pointer might have had at some earlier
> point
> during execution.
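
As a rough illustration of the local workaround this rule is meant to permit, here is a minimal C sketch. The names sumBuffer and load_i32_unaligned are hypothetical and do not come from the proposal; sumBuffer stands in for processBuffer. It assumes the proposed rule holds, so that merely receiving a misaligned int32_t* is fine as long as no memory is accessed through that type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads one int32_t from an address that may be
   misaligned. The source is passed as void*, so the memcpy call carries
   no int32_t alignment assumption. */
static int32_t load_i32_unaligned(const void *p) {
    int32_t value;
    memcpy(&value, p, sizeof value);
    return value;
}

/* Hypothetical stand-in for processBuffer: sums `length` elements.
   The parameter keeps its int32_t* type, but the body never accesses
   memory through that type. */
int64_t sumBuffer(const int32_t *buffer, size_t length) {
    assert(buffer != NULL || length == 0);
    /* Explicit cast: per the proposal, alignment assumptions do not
       propagate through explicit casts. */
    const unsigned char *bytes = (const unsigned char *)buffer;
    int64_t total = 0;
    for (size_t i = 0; i < length; ++i)
        total += load_i32_unaligned(bytes + i * sizeof(int32_t));
    return total;
}
```

The fix stays inside the function: callers keep passing const int32_t*, and nothing upstream has to change type.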
> 
> Notably, I believe that this rule doesn't require any changes in our
> current behavior, so adopting it is just a restriction on future
> compiler
> optimization.

For the sake of completeness, I'll mention one exception. If the pointer (or its type via a typedef) has the __attribute__((align_value(N))) attribute, then we do emit alignment attributes on the pointer values themselves and use that information in later optimizations. This is by design, but given that it is explicitly opt-in, I feel this falls into a different category than the situations you've described.
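
A minimal sketch of what that opt-in looks like in C, for readers who have not used it. The ALIGN_VALUE macro, the aligned_double_ptr typedef, and sumAligned are illustrative names chosen here; the macro falls back to nothing on compilers that do not report support for align_value, in which case no alignment promise is made at all.

```c
#include <assert.h>
#include <stddef.h>

/* Use align_value only where the compiler reports support for it;
   elsewhere the macro expands to nothing. */
#if defined(__has_attribute)
#  if __has_attribute(align_value)
#    define ALIGN_VALUE(n) __attribute__((align_value(n)))
#  endif
#endif
#ifndef ALIGN_VALUE
#  define ALIGN_VALUE(n)
#endif

/* Pointers of this type are promised to point at 64-byte-aligned
   objects. The promise is attached to the pointer value itself, not
   just to individual loads and stores. */
typedef double *aligned_double_ptr ALIGN_VALUE(64);

/* Sums `count` doubles. Passing a pointer that is not 64-byte aligned
   breaks the promise made by the typedef. */
double sumAligned(aligned_double_ptr values, size_t count) {
    assert(values != NULL || count == 0);
    double total = 0.0;
    for (size_t i = 0; i < count; ++i)
        total += values[i];
    return total;
}
```

Because the caller has to spell out the typedef, the stronger assumption is visible at the point where it is made.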

Realistically, if we ever were to implement optimizations based on default type alignments, we'd need a flag to turn off those assumptions (just like we have a flag to turn off strict aliasing assumptions).

 -Hal

>  For the most part, LLVM IR only attaches alignment to
> loads,
> stores, and specific intrinsics like llvm.memcpy; there is no way to
> say
> that a pointer value is expected to have a particular alignment.  The
> one exception that I'm aware of is that an indirect parameter can
> have
> an expected alignment.  However, Clang currently only sets this for
> by-value arguments that the calling convention says to pass
> indirectly,
> and that remains acceptable under this new rule because it's an ABI
> rule
> rather than a constraint on programmer behavior (other than assembly
> programmers).  The rule just means that we can't start setting it on
> arbitrary pointer parameters.
> 
> It is also a very portable rule; I'm not aware of any compilers that
> do
> try to take advantage of the formal alignment of pointer values
> independent
> of access.
> 
> The key question in this new rule is what counts as an "access".
>  I'll spell
> this out in more detail, but it's mostly intuitive: anything that
> ultimately
> requires a load or store.  The only thing that's perhaps questionable
> is that
> we'd like to treat calls to library functions that access memory as
> if they
> were direct accesses to their arguments.  For example, we'd like to
> assume
> that the pointer arguments to memcpy are properly aligned for their
> types
> (that is, their explicit types, before the implicit conversion to
> void*) so
> that we can generate a more efficient copy operation.  This analysis
> currently relies on the language rule that pointers may not be
> misaligned;
> preserving it requires us to treat calls to library functions as
> special,
> which of course we already do.  Programmers can still suppress this
> assumption by explicitly casting the arguments to void*.
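
To make the memcpy point concrete, here is a small C sketch of the two spellings. The function names copyWords and copyWordsUnaligned are hypothetical. The comments describe what the proposed rule would allow the compiler to assume, not what any particular compiler version is known to do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Typed arguments: under the proposed rule this call is treated as an
   access through int32_t*, so the compiler may assume both pointers
   are properly aligned for int32_t. */
void copyWords(int32_t *dst, const int32_t *src, size_t count) {
    assert(count == 0 || (dst != NULL && src != NULL));
    memcpy(dst, src, count * sizeof(int32_t));
}

/* Explicit casts to void*: the arguments no longer have an int32_t*
   type at the call, which is the proposal's way to suppress the
   alignment assumption. */
void copyWordsUnaligned(int32_t *dst, const int32_t *src, size_t count) {
    assert(count == 0 || (dst != NULL && src != NULL));
    memcpy((void *)dst, (const void *)src, count * sizeof(int32_t));
}
```

Both functions copy the same bytes; the only difference is the pointer type the compiler sees at the memcpy call.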
> 
> Here's the proposed new rule, expressed more formally:
> 
> ---
> 
> It is well-defined behavior to construct a pointer to memory that
> is less aligned than the alignment of the pointee type (if a complete
> type).  However, it is undefined behavior to “access” an expression
> that
> is an r-value of type T* or an l-value of type T if T is a complete
> type
> and the memory is less aligned than T.
> 
> An r-value expression of pointer type is accessed if:
>  - it is dereferenced (with *) and the resulting l-value is accessed,
>  - it is implicitly converted to another pointer type and the
>    result is accessed,
>  - it undergoes pointer addition and the result is accessed,
>  - it is passed to a function in the C standard library that is known
>    to access the memory,
>  - in C++, it is converted to a pointer to a virtual base, or
>  - in C++, it is explicitly cast (other than by a reinterpret_cast)
>    to a related class pointer type and the result is accessed.
> 
> An l-value expression is accessed if:
>  - it undergoes an lvalue-to-rvalue conversion (i.e. it is loaded),
>  - it is the LHS of an assignment operator (including the
>    compound assignments),
>  - it is the base of a member access (with .) and the resulting
>    l-value is accessed (recall that x->y is defined as (*x).y),
>  - its address is taken (with &) and the resulting pointer is
>    accessed,
>  - in C++, it is implicitly converted to be an l-value to a base type
>    and the result is accessed,
>  - in C++, it is converted to be an l-value of a virtual base type,
>  - in C++, it is used as the "this" argument of a call to a
>    non-static member function, or
>  - in C++, a reference is bound to it (which includes explicit
>    casts to reference type).
> 
> These are the cases covered by the language standard.  There is a
> very long tail of other kinds of expression that obviously access
> memory,
> like the atomic and overflow builtins, which I can't reasonably
> enumerate.
> The intent should be obvious, but I'm willing to spell it out in
> other
> cases where necessary.
> 
> Note that this definition is *syntactic*, meaning that it is
> expressed
> in terms of the components of a single statement.  This means that an
> access that might be undefined behavior if written as a single
> statement:
>   highlyAlignedStruct->charMember = 0;
> may not be undefined behavior if split across two statements:
>   char *member = &highlyAlignedStruct->charMember;
>   *member = 0;
> In effect, the compiler promises to never propagate alignment
> assumptions
> between statements through its knowledge of how a pointer was
> constructed.
> This is necessary in order to allow local workarounds to be reliable.
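
The one-statement and two-statement forms above can be sketched as complete C functions. The HighlyAligned type and both function names are made up for illustration, and the comments describe the behavior under the proposed rule rather than under the current standards.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: the 16-byte-aligned payload raises the alignment
   of the whole struct to 16, while charMember only needs 1. */
typedef struct {
    _Alignas(16) unsigned char payload[16];
    char charMember;
} HighlyAligned;

/* Single statement: the store is an access through s, so under the
   proposed rule s must be 16-byte aligned here. */
void clearMemberDirect(HighlyAligned *s) {
    assert(s != NULL);
    s->charMember = 0;
}

/* Two statements: the first only forms a char*, which is not an
   access; the second accesses memory through char*, whose alignment
   requirement is 1. Under the proposed rule this is well-defined even
   if s is misaligned. */
void clearMemberSplit(HighlyAligned *s) {
    assert(s != NULL);
    char *member = &s->charMember;
    *member = 0;
}
```

The split form is the kind of local workaround the syntactic definition is meant to keep reliable.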
> 
> Note also that this definition does not propagate through explicit
> casts,
> other than class-hierarchy casts in C++.  Again, this is a deliberate
> choice to make misalignment workarounds more straightforward.
> 
> But note that this rule does still allow the compiler to make
> stronger
> abstract assumptions about the alignment of C++ references and the
> "this" pointer.
> 
> ---
> 
> Please let me know what you think.
> 
> John.

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory