[llvm-bugs] [Bug 35503] Bug in union member handling in TBAA

via llvm-bugs llvm-bugs at lists.llvm.org
Mon Dec 4 02:53:42 PST 2017


https://bugs.llvm.org/show_bug.cgi?id=35503

Ivan Kosarev <ikosarev at accesssoftek.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---
                 CC|                            |ch3root at openwall.com

--- Comment #4 from Ivan Kosarev <ikosarev at accesssoftek.com> ---
The failure is caused by this snippet:

---
  short *q;
  p->u.vec[i] = 0;
  q = &p->u.vec[16];
  *q = 1;
  return p->u.vec[i];
---

LLVM currently implements a TBAA model that assumes that differently-typed
accesses may never alias. For this reason we require accesses to union members
to have their most enclosed union objects specified explicitly, and taking
addresses of union members is thus not allowed in strict-aliasing mode. In this
snippet we take the address of an element of the indirect union member 'vec'
and then dereference that address as if it were a pointer to a short object
that is not a member of a union.
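
For contrast, here is a minimal sketch of the same sequence with the union kept
explicit in every access. The original test case's declarations are not quoted
in this comment, so 'struct S', its anonymous union type, the array sizes and
the function name 'store_via_union' are all assumptions made up for
illustration:

```c
/* Hypothetical declarations; the real test case may differ. */
struct S {
  union {
    short vec[32];
    int words[16];
  } u;
};

/* Same stores and load as the snippet above, but the second store
   names the union member directly instead of going through 'q'. */
short store_via_union(struct S *p, int i) {
  p->u.vec[i] = 0;
  p->u.vec[16] = 1;   /* Access path still mentions the union 'u'. */
  return p->u.vec[i]; /* 1 when i == 16, otherwise 0. */
}
```

In this form no address of a union member escapes into a plain 'short *', which
is the pattern the paragraph above describes as unsupported.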

Hal, answering your response in D39455:

> demonstrating that you can't use "union member" as the access type here.
> You need to use the actual access type, or something derived from it, because
> non-type-changing scalar accesses are still allowed to alias with the union-
> member accesses.

That wouldn't help: whichever objects we pick so that accesses to them are
treated as no-alias, those objects can still be members of the same union
object. So this problem is independent of how we represent union members.
Consider this:

struct A { int i; };
struct B { int i; };

In our current TBAA model we treat accesses to A::i and B::i as ones that are
not allowed to overlap:

int foo(A *a, B *b) {
  a->i = 0;
  b->i++;  // Doesn't alias with a->i.
  return a->i;
}

But they obviously may overlap if (*a) and (*b) are members of the same union
object:

union U { A a; B b; };

int bar(U *u) {
  return foo(&u->a, &u->b);
}

Which is why we require explicit references to union objects when we access
their members.
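
As a sketch of what such an explicit reference could look like for the example
above (written as C with typedefs so the names A, B and U can be used as they
are in the snippets; the function name 'foo_via_union' is made up here):

```c
typedef struct A { int i; } A;
typedef struct B { int i; } B;
typedef union U { A a; B b; } U;

/* Both accesses go through *u, so it is visible from the access
   paths alone that a.i and b.i may occupy the same storage. */
int foo_via_union(U *u) {
  u->a.i = 0;
  u->b.i++;      /* May alias u->a.i. */
  return u->a.i; /* Reads back the incremented value. */
}
```

Unlike foo(A *, B *), nothing here hides the union from the compiler, so the
load of u->a.i cannot be assumed to still see the 0 stored on the first line.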

This means that the test case was passing only by accident; we do not support
code like this and the patch didn't change anything in this regard.

Same considerations apply to untyped memory, except that with untyped memory
there is no way to specify that we access something that may actually work as a
union. For this reason in the current model we won't be able to support untyped
memory.

If you wonder how this differs from what gcc does, the answer is that gcc does
not assume that differently-typed accesses do not overlap. Instead, it tries to
figure out whether a given pair of pointers could be pointing to the same
object without violating the requirement that the types of read accesses shall
match the types of the last-written values.

So, here's how I see the situation:
1) As of today, we do not support taking addresses of union members and the
provided test case is not supposed to pass.
2) The patch is correct as long as we follow the current TBAA model.
3) At some point we should consider supporting the gcc model, which would be
a huge improvement in terms of the amount of code in the wild we can compile in
strict-aliasing mode.
