[cfe-dev] RFC: Enforcing pointer type alignment in Clang

John McCall via cfe-dev cfe-dev at lists.llvm.org
Thu Jan 14 12:56:37 PST 2016


C 6.3.2.3p7 (N1548) says:
  A pointer to an object type may be converted to a pointer to a
  different object type. If the resulting pointer is not correctly
  aligned for the referenced type, the behavior is undefined.

C++ [expr.reinterpret.cast]p7 (N4527) defines pointer conversions in terms
of conversions from void*:
  An object pointer can be explicitly converted to an object pointer
  of a different type. When a prvalue v of object pointer type is
  converted to the object pointer type “pointer to cv T”, the result
  is static_cast<cv T*>(static_cast<cv void*>(v)).

C++ [expr.static.cast]p13 says of conversions from void*:
  A prvalue of type “pointer to cv1 void” can be converted to a
  prvalue of type “pointer to cv2 T” .... If the original pointer value
  represents the address A of a byte in memory and A satisfies the alignment
  requirement of T, then the resulting pointer value represents the same
  address as the original pointer value, that is, A. The result of any
  other such pointer conversion is unspecified.

The clear intent of these rules is that the implementation may assume
that any pointer is adequately aligned for its pointee type, because any
attempt to actually create a misaligned pointer has undefined behavior.  It is
very likely that, if we found a hole in those rules that seemed to permit
the creation of unaligned pointers, we could go to the committees and
have that hole closed.  The language policy here is clear.

There are architectures where this policy is mandatory.  The classic
example is (I believe) the Cray C90, which provides word-addressed 32-bit
pointers.  Pointers to most types can use this native representation directly.
char*, however, requires sub-word addressing, which means void* and char*
are actually 64 bits in order to permit the storage of the sub-word offset.
An int* therefore literally cannot express an arbitrary void*.

Less dramatically, there are architectural features that clearly depend
on alignment.  It's unreasonable to expect processors to support atomic
accesses that straddle the basic unit of their cache coherence implementations.
Supporting small unaligned accesses has a fairly marginal cost in extra
hardware, but as accesses grow to 128 bits or larger, those costs can spiral
out of control.  These restrictions are fairly widely understood by compiler
users.

Everything below is mushier.  It's clearly advantageous for the compiler to
be able to make stronger assumptions about alignment when accessing memory.
ISAs often allow more efficient accesses to properly-aligned memory; for
example, 32-bit ARM can perform a 64-bit memory access in a single
instruction, but the address is required to be 8-byte-aligned.  Alignment
also affects compiler decisions even when the architecture doesn't enforce
it; for example, it can be profitable to combine two adjacent loads into
a single, wider load, but this will often slow down code if the wider load is
no longer properly aligned.

As is the case with most forms of undefined behavior, programmers have at
best an abstract appreciation for the positive effects of these optimizations,
but they have a very concrete understanding of the disruptive life effects
of being forced to fix crashes from mis-alignment.

Our standard response in LLVM/Clang is to explain the undefined behavior
rule, explain the benefits it provides, and politely ask users to, well,
deal with it.  And that's appropriate; most forms of undefined behavior
under the standard(s) are reasonable requests with reasonable code
workarounds.  However, we have also occasionally looked at a particular
undefined behavior rule and decided that there's a real usability problem
with enforcing it as written.  In these cases, we use our power as
implementors to make some subset of that behavior well-defined in order to
fix that problem.  For example, we did this for TBAA, because we recognized
that certain "obvious" aliasing violations were idiomatic and only had
awkward workarounds under the standard.

There's a similar problem here.  Much like TBAA, fixing it doesn't require
completely abandoning the idea of enforcing type-based alignment assumptions.
It does, however, require a significant adjustment to the language rule.

The problem is this: the standards make it undefined behavior to even
create an unaligned pointer.  Therefore, as soon as I've got such a
pointer, I'm basically doomed; it is no longer possible to locally
work around the problem.  I have to change the type of the pointer to
something that requires less alignment, and not just where I'm using
it, or even just within my function, but all the way up to wherever it
came from.

For example, suppose I've got this function:

  void processBuffer(const int32_t *buffer, size_t length) {
    ...
  }

I get a bug report saying that my function is crashing, and I decide
that the right fix is to make the function handle unaligned buffers
correctly.  Maybe that's a binary-compatibility requirement, or maybe the
buffer is usually coming from a serialized format that doesn't guarantee
alignment, and it's clearly unreasonable to copy the buffer just to satisfy
my function.

So how can I make this function handle unaligned buffers?  The type of the
argument itself means that being passed an unaligned buffer has undefined
behavior.  Now, I can change that parameter to use an unaligned typedef:

  typedef int32_t unaligned_int32_t __attribute__((aligned(1)));
  void processBuffer(const unaligned_int32_t *buffer, size_t length) {
    ...
  }

But this has severe problems.  First off, this is a GCC/Clang extension; a lot
of programmers feel uncomfortable adopting that, especially to fix a problem
that's in principle common across compilers.  Second, alignment attributes
are not really part of the type system, which means that they can be
silently dropped by any number of things, including both major features
like templates and just day-to-day quality-of-implementation stuff like the
common-type logic of the conditional operator.  And finally, my callers
still have undefined behavior, and I really need to go audit all of them
to make sure they're using the same sort of typedef.  This is not a reliable
solution to the bug.

Furthermore, the compiler doesn't really care whether the pointer is
abstractly aligned independent of any access to memory.  There aren't very
many interesting alignment-based optimizations on pointer values as mere
values.  In principle, we could optimize operations that cast the pointer
to an integral type and examine the low bits, but those operations are not
very common, and when they're there, it's probably for a good reason;
that's the kind of optimization that is very likely to just create miscompiles
without really showing any benefit.
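
To make the low-bits case concrete, here is a minimal sketch of the kind of
check meant above.  The helper name is_aligned_to is hypothetical, and the
pointer-to-integer conversion is implementation-defined, so this assumes a
flat address space where the uintptr_t value is the address itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from any library): reports whether the address
   held in p is a multiple of alignment.
   Assumption: a flat address space where the uintptr_t value is the address. */
bool is_aligned_to(const void *p, size_t alignment) {
    /* The mask trick below only works for powers of two. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (uintptr_t)(alignment - 1)) == 0;
}
```

Code usually performs a check like this to choose between a fast aligned path
and a byte-wise fallback.  An optimizer that folded the check to "true" from
the static type of the pointer would silently remove that fallback.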

Therefore, I would like to propose that Clang formally adopt a significantly
weaker language rule for enforcing the alignment of pointers.  The basic
idea is this:

  It is not undefined behavior to create a pointer that is less aligned
  than its pointee type.  Instead, it is only undefined behavior to
  access memory through a pointer that is less aligned than its pointee
  type.

That is, the only thing that matters is the type when you actually perform
the access, not any type the pointer might have had at some earlier point
during execution.

Notably, I believe that this rule doesn't require any changes in our
current behavior, so adopting it is just a restriction on future compiler
optimization.  For the most part, LLVM IR only attaches alignment to loads,
stores, and specific intrinsics like llvm.memcpy; there is no way to say
that a pointer value is expected to have a particular alignment.  The
one exception that I'm aware of is that an indirect parameter can have
an expected alignment.  However, Clang currently only sets this for
by-value arguments that the calling convention says to pass indirectly,
and that remains acceptable under this new rule because it's an ABI rule
rather than a constraint on programmer behavior (other than assembly
programmers).  The rule just means that we can't start setting it on
arbitrary pointer parameters.

It is also a very portable rule; I'm not aware of any compilers that do
try to take advantage of the formal alignment of pointer values independent
of access.

The key question in this new rule is what counts as an "access".  I'll spell
this out in more detail, but it's mostly intuitive: anything that ultimately
requires a load or store.  The only thing that's perhaps questionable is that
we'd like to treat calls to library functions that access memory as if they
were direct accesses to their arguments.  For example, we'd like to assume
that the pointer arguments to memcpy are properly aligned for their types
(that is, their explicit types, before the implicit conversion to void*) so
that we can generate a more efficient copy operation.  This analysis
currently relies on the language rule that pointers may not be misaligned;
preserving it requires us to treat calls to library functions as special,
which of course we already do.  Programmers can still suppress this
assumption by explicitly casting the arguments to void*.
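
As a rough sketch of what a local workaround could look like under the
proposed rule, the function below reads a possibly misaligned buffer one
element at a time.  The name sumBuffer and its summing body are hypothetical
stand-ins for the elided body of processBuffer.  Note the assumption: under
the standards as written, merely passing a misaligned int32_t* is already
undefined, so this relies on the rule proposed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for processBuffer: sums the elements of a buffer
   that may be misaligned for int32_t. */
int64_t sumBuffer(const int32_t *buffer, size_t length) {
    assert(buffer != NULL || length == 0);
    /* Explicit cast: under the proposed rule, the int32_t alignment
       assumption does not propagate through it. */
    const unsigned char *bytes = (const unsigned char *)buffer;
    int64_t total = 0;
    for (size_t i = 0; i < length; ++i) {
        int32_t value;
        /* The source is explicitly void*, so the copy may assume only
           byte alignment; the destination is an ordinary aligned local. */
        memcpy(&value, (const void *)(bytes + i * sizeof value), sizeof value);
        total += value;
    }
    return total;
}
```

The signature is unchanged, so callers do not need to be audited or switched
to a special typedef; the fix stays inside the function.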

Here's the proposed new rule, expressed more formally:

---

It is well-defined behavior to construct a pointer to memory that
is less aligned than the alignment of the pointee type (if a complete
type).  However, it is undefined behavior to “access” an expression that
is an r-value of type T* or an l-value of type T if T is a complete type
and the memory is less aligned than T.

An r-value expression of pointer type is accessed if:
 - it is dereferenced (with *) and the resulting l-value is accessed,
 - it is implicitly converted to another pointer type and the
   result is accessed,
 - it undergoes pointer addition and the result is accessed,
 - it is passed to a function in the C standard library that is known
   to access the memory,
 - in C++, it is converted to a pointer to a virtual base, or
 - in C++, it is explicitly cast (other than by a reinterpret_cast) to
   a related class pointer type and the result is accessed.

An l-value expression is accessed if:
 - it undergoes an lvalue-to-rvalue conversion (i.e. it is loaded),
 - it is the LHS of an assignment operator (including the
   compound assignments),
 - it is the base of a member access (with .) and the resulting l-value
   is accessed (recall that x->y is defined as (*x).y),
 - it has its address taken (with &) and the resulting pointer is accessed,
 - in C++, it is implicitly converted to be an l-value to a base type
   and the result is accessed,
 - in C++, it is converted to be an l-value of a virtual base type,
 - in C++, it is used as the "this" argument of a call to a
   non-static member function, or
 - in C++, a reference is bound to it (which includes explicit
   casts to reference type).

These are the cases covered by the language standard.  There is a
very long tail of other kinds of expression that obviously access memory,
like the atomic and overflow builtins, which I can't reasonably enumerate.
The intent should be obvious, but I'm willing to spell it out in other
cases where necessary.

Note that this definition is *syntactic*, meaning that it is expressed
in terms of the components of a single statement.  This means that an
access that might be undefined behavior if written as a single statement:
  highlyAlignedStruct->charMember = 0;
may not be undefined behavior if split across two statements:
  char *member = &highlyAlignedStruct->charMember;
  *member = 0;
In effect, the compiler promises to never propagate alignment assumptions
between statements through its knowledge of how a pointer was constructed.
This is necessary in order to allow local workarounds to be reliable.
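
A self-contained sketch of the two-statement form follows.  The struct
layout and the function name clearCharMember are hypothetical, and the
comments describe behavior under the proposed rule only; under the standards
as written, a misaligned struct pointer is already undefined before either
statement runs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: the _Alignas member raises the struct's alignment to 16. */
struct HighlyAligned {
    _Alignas(16) int header;
    char charMember;
};

/* Clears charMember without accessing memory as a struct HighlyAligned. */
void clearCharMember(struct HighlyAligned *highlyAlignedStruct) {
    assert(highlyAlignedStruct != NULL);
    /* Statement 1: only computes an address; nothing is loaded or stored,
       so (under the proposed rule) no alignment is assumed here. */
    char *member = &highlyAlignedStruct->charMember;
    /* Statement 2: the access goes through char*, which has no alignment
       requirement beyond one byte. */
    *member = 0;
}
```

Writing the same store as the single statement shown earlier would make the
16-byte alignment of the struct part of the access, which is exactly the
assumption the split avoids.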

Note also that this definition does not propagate through explicit casts,
other than class-hierarchy casts in C++.  Again, this is a deliberate
choice to make misalignment workarounds more straightforward.

But note that this rule does still allow the compiler to make stronger
abstract assumptions about the alignment of C++ references and the
"this" pointer.

---

Please let me know what you think.

John.

