[LLVMdev] [cfe-dev] Clang devirtualization proposal
Reid Kleckner
rnk at google.com
Fri Jul 31 18:18:16 PDT 2015
On Fri, Jul 31, 2015 at 3:53 PM, Philip Reames <listmail at philipreames.com>
wrote:
>
> I'm wondering if there's a problematic interaction with CSE here.
> Consider this example in pseudo LLVM IR:
> v1 = load i64, %p, !invariant.group !Type1
> ; I called destructor/placement new for the same type, but that optimized
> entirely away
> p2 = invariant.group.barrier(p1)
> if (p1 != p2) return.
> store i64 0, %p2, !invariant.group !Type1
> v2 = load i64, %p2, !invariant.group !Type1
> ret i64 v1 - v2
>
> (Assume that !Type1 is used to describe a write-once integer field within
> some class. Not all instances have the same integer value.)
>
> Having CSE turn this into:
> v1 = load i64, %p, !invariant.group !Type1
> p2 = invariant.group.barrier(p1)
> if (p1 != p2) return.
> store i64 0, %p1, !invariant.group !Type1
> v2 = load i64, %p1, !invariant.group !Type1
> ret i64 v1 - v2
>
> And then GVN turn this into:
> v1 = load i64, %p, !invariant.group !Type1
> p2 = invariant.group.barrier(p1)
> if (p1 != p2) return.
> ret i64 v1 - v1 (-> 0)
>
> This doesn't seem like the result I'd expect. Is there something about my
> initial IR which is wrong/invalid in some way? Is the invariant.group
> required to be specific to a single bitpattern across all usages within a
> function/module/context? That would be reasonable, but I don't think it is
> explicitly stated right now. It also makes !invariant.group effectively
> useless for describing constant fields which are constant per instance
> rather than per-class.
>
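To make the concern in the quoted example concrete, here is a rough C++ model of it. This is only a sketch: barrier_model, as_written and after_cse_gvn are hypothetical names, the barrier is assumed to return the same address at run time, and the -1 sentinel for the early return is an arbitrary choice that is not in the original IR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for invariant.group.barrier: assumed to hand back
// the same address, which is what makes the p1 == p2 path reachable.
std::int64_t* barrier_model(std::int64_t* p) {
    assert(p != nullptr);
    return p;
}

// The quoted sequence, executed literally.
std::int64_t as_written(std::int64_t* p1) {
    std::int64_t v1 = *p1;                 // v1 = load %p
    std::int64_t* p2 = barrier_model(p1);  // p2 = invariant.group.barrier(p1)
    if (p1 != p2) return -1;               // early exit; -1 is an arbitrary sentinel
    *p2 = 0;                               // store i64 0, %p2
    std::int64_t v2 = *p2;                 // v2 = load %p2
    return v1 - v2;
}

// The shape left after the CSE and GVN steps: v2 has been replaced by v1.
std::int64_t after_cse_gvn(std::int64_t* p1) {
    std::int64_t v1 = *p1;
    std::int64_t* p2 = barrier_model(p1);
    if (p1 != p2) return -1;
    return v1 - v1;
}
```

With a field that initially holds 7, the literal version returns 7 while the transformed version returns 0, which is the mismatch the quoted message is asking about.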
Yes, this family of examples scares me. :) It seems we've discovered a new
device for testing IR soundness. We used it to build a test case that shows
that 'readonly' on arguments without 'nocapture' doesn't let you forward
stores across such a call.
Consider this pseudo-IR and some possible transforms that I would expect to
be semantics preserving:

  void f(i32* readonly %a, i32* %b) {
    llvm.assume(%a == %b)
    store i32 42, i32* %b
  }
  ...
  %p = alloca i32
  store i32 13, i32* %p
  call f(i32* readonly %p, i32* %p)
  %r = load i32, i32* %p

; Propagate llvm.assume info

  void f(i32* readonly %a, i32* %b) {
    store i32 42, i32* %a
  }
  ...
  %p = alloca i32
  store i32 13, i32* %p
  call f(i32* readonly %p, i32* %p)
  %r = load i32, i32* %p

; Delete dead args

  void f(i32* readonly %a) {
    store i32 42, i32* %a
  }
  ...
  %p = alloca i32
  store i32 13, i32* %p
  call f(i32* readonly %p)
  %r = load i32, i32* %p

; Forward store %p to load %p, since the only use of %p is readonly

  void f(i32* readonly %a) {
    store i32 42, i32* %a
  }
  ...
  %p = alloca i32
  call f(i32* readonly %p)
  %r = i32 13

Today LLVM will not do the final transform because it requires readonly on
the entire function, or nocapture on the argument. nocapture cannot be
inferred due to the assume comparison.