[llvm-dev] RFC: A change in InstCombine canonical form

Ehsan Amiri via llvm-dev llvm-dev at lists.llvm.org
Wed Mar 16 09:35:12 PDT 2016


Hal knows better. My understanding is that, there is a similarity in the
code pattern generated, in that there will be no intervening bitcasts
between load and store.

Having said that, I just double checked one of the test cases that was
committed with canonicalization work.  My proposed solution may result in
lowering of memcpy to non-integer load and store. (See test1 in
test/Transforms/InstCombine/struct-assign-tbaa.ll).  This might be a
blocker. (Even if we can fix it, the proposed solution is now more
complicated than what I thought).



On Wed, Mar 16, 2016 at 11:34 AM, Mehdi Amini <mehdi.amini at apple.com> wrote:

> Hi,
>
> How do it interact with the "typeless pointers" work?
>
> Thanks,
>
> --
> Mehdi
>
> On Mar 16, 2016, at 6:41 AM, Ehsan Amiri via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> === PROBLEM === (See this bug https://llvm.org/bugs/show_bug.cgi?id=26445)
>
> IR contains code for loading a float from float * and storing it to a
> float * address. After canonicalization of load in InstCombine [1], new
> bitcasts are added to the IR (see bottom of the email for code samples).
> This prevents select speculation in SROA to work. Also after SROA we have
> bitcasts from int32 to float. (Whereas originally after instCombine,
> bitcasts are only done on pointer types).
>
> === PROPOSED SOLUTION===
>
> [1] implies that we need load canonicalization when we load a value only
> to store it again. The reason is to avoid generating slightly different
> code (due to different ways of adding bitcasts), in different situations.
> In all examples presented in [1] there is a non-zero number of bitcasts. I
> think when we load a value of type T from a T* address and store it as a
> type T value to one or more T* address (and there is no other use or
> store), we can redefine canonical form to mean there should not be any
> bitcasts. So we still have a canonical form, but its definition is slightly
> different.
>
> === REASONS FOR / AGAINST===
>
> Hal Finkel warns that while this may be useful for power pc, this may hurt
> more than one other platform and become a very large project. Despite this
> he is fine with bringing up the issue to the mailing list to get feedback,
> mostly because this seems inline with our future direction of having a
> unique type for all pointers.  (Hal please correct me if I misunderstood
> your comment)
>
> This is a much simpler fix compared to alternatives. (ignoring potential
> regressions)
>
> === ALTERNATIVE SOLUTION ===
>
> Fix select speculation in SROA to see through bitcasts. Handle remaining
> bitcasts during code gen. Other alternative solutions are welcome.
>
> Should I implement the proposed solution or is it too risky? I understand
> that we may need to undo it if it breaks too many things. Comments are
> welcome.
>
>
> [1] http://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html
> r226781  git commit id: b778cbc0c8
>
>
>
> Code Samples (only relevant part is copied):
>
> --------------------  Before Canonicalization (contains call to std::max):
> --------------------
> entry:
>   %max_value = alloca float, align 4
>   %1 = load float, float* %input, align 4, !tbaa !1
>   store float %1, float* %max_value, align 4, !tbaa !1
>
> for.body:
>   %call = call dereferenceable(4) float* @_ZSt3maxIfERKT_S2_S2_(float*
> dereferenceable(4) %max_value, float* dereferenceable(4) %arrayidx1)
>   %3 = load float, float* %call, align 4, !tbaa !1
>   store float %3, float* %max_value, align 4, !tbaa !1
>
> --------------------  After Canonicalization (contains call to
> std::max):--------------------
>
> entry:
>   %max_value = alloca float, align 4
>   %1 = bitcast float* %input to i32*
>   %2 = load i32, i32* %1, align 4, !tbaa !1
>   %3 = bitcast float* %max_value to i32*
>   store i32 %2, i32* %3, align 4, !tbaa !1
>
> for.body:
>   %call = call dereferenceable(4) float* @_ZSt3maxIfERKT_S2_S2_(float*
> nonnull dereferenceable(4) %max_value, float* dereferenceable(4) %arrayidx1)
>   %5 = bitcast float* %call to i32*
>   %6 = load i32, i32* %5, align 4, !tbaa !1
>   %7 = bitcast float* %max_value to i32*
>   store i32 %6, i32* %7, align 4, !tbaa !1
>
> -------------------- After SROA (the call to std::max is inlined
> now):--------------------
> entry:
>   %max_value.sroa.0 = alloca i32
>   %0 = bitcast float* %input to i32*
>   %1 = load i32, i32* %0, align 4, !tbaa !1
>   store i32 %1, i32* %max_value.sroa.0
>
> for.body:
>   %max_value.sroa.0.0.max_value.sroa.0.0.6 = load i32, i32*
> %max_value.sroa.0
>   %3 = bitcast i32 %max_value.sroa.0.0.max_value.sroa.0.0.6 to float
>   %max_value.sroa.0.0.max_value.sroa_cast8 = bitcast i32*
> %max_value.sroa.0 to float*
>   %__b.__a.i = select i1 %cmp.i, float* %arrayidx1, float*
> %max_value.sroa.0.0.max_value.sroa_cast8
>   %5 = bitcast float* %__b.__a.i to i32*
>   %6 = load i32, i32* %5, align 4, !tbaa !1
>   store i32 %6, i32* %max_value.sroa.0
>
> -------------------- After SROA when Canonicalization is turned
> off--------------------
> entry:
>   %0 = load float, float* %input, align 4, !tbaa !1
>
> for.cond:                                         ; preds = %for.body,
> %entry
>   %max_value.0 = phi float [ %0, %entry ], [ %.sroa.speculated, %for.body ]
>
> for.body:
>   %1 = load float, float* %arrayidx1, align 4, !tbaa !1
>   %cmp.i = fcmp olt float %max_value.0, %1
>   %.sroa.speculate.load.true = load float, float* %arrayidx1, align 4,
> !tbaa !1
>   %.sroa.speculated = select i1 %cmp.i, float %.sroa.speculate.load.true,
> float %max_value.0
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160316/e8f55458/attachment.html>


More information about the llvm-dev mailing list