[llvm-dev] RFC: A change in InstCombine canonical form

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Wed Mar 16 11:13:56 PDT 2016


On Wed, Mar 16, 2016 at 11:00 AM, Ehsan Amiri <ehsanamiri at gmail.com> wrote:

> David,
>
> Could you give us an update on the status of typeless pointer work? How
> much work is left and when you think it might be ready?
>

It's a bit of an onion peel, really - since it will eventually involve
generalizing/fixing every optimization that's currently leaning on typed
pointers to keep the performance while removing the crutch they're
currently leaning on. (in cases where bitcasts are literally just getting
in the way, those won't require cleaning up & should just become "free
performance wins" once we remove them, though)

At the moment we can roundtrip every LLVM IR test case through bitcode and
textual IR (reading/writing both formats) while using only a narrow
whitelist of places that request the type of a pointer (things like the
verifier, the parser/printer where it actually needs the typed pointer to
verify it matches the explicit type, etc).

The next thing on the list is probably figuring out the byval/inalloca
representation (add an explicit pointee type? just make the number of bytes
explicit with no type information?).

Then start migrating optimizations over - doing the same sort of testing I
did for the IR/bitcode roundtripping - assert that the pointee type is not
accessed, whitelist places that need it until the bitcasts go away, fix
anything else... it'll still be a fair bit of work & I don't really know
how much. It should parallelize pretty well (doing any of this work is
really helpful, each optimization is indepednent, etc) if anyone wants
to/is able to help.

- Dave


>
> Thanks
> Ehsan
>
>
> On Wed, Mar 16, 2016 at 1:12 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>>
>>
>> On Wed, Mar 16, 2016 at 8:34 AM, Mehdi Amini via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> Hi,
>>>
>>> How do it interact with the "typeless pointers" work?
>>>
>>
>> Right - the goal of the typeless pointer work is to fix all these bugs
>> related to "didn't look through bitcasts" in optimizations. Sometimes
>> that's going to mean more work (because the code is leaning on the absence
>> of bitcasts & the presence of convenient (but not guaranteed) type
>> information to inform optimization decisions) but if we remove typed
>> pointers while keeping optimization quality in the cases we have today,
>> then we should've also fixed the cases that were broken because the type
>> information didn't end up aligning to produce the optimal output.
>>
>> & I know I've been off the typeless pointer stuff for a few months
>> working on llvm-dwp - but happy for any help (the next immediate piece is
>> probably figuring out teh right representation for byval and inalloca -
>> there were some code reviews sent out for that that I'll need to come back
>> around to - but also any optimizations people want to help rework/improve
>> would be great too & I can provide some techniques/tools to help people
>> approach those)
>>
>> - Dave
>>
>>
>>>
>>> Thanks,
>>>
>>> --
>>> Mehdi
>>>
>>> On Mar 16, 2016, at 6:41 AM, Ehsan Amiri via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>> === PROBLEM === (See this bug
>>> https://llvm.org/bugs/show_bug.cgi?id=26445)
>>>
>>> IR contains code for loading a float from float * and storing it to a
>>> float * address. After canonicalization of load in InstCombine [1], new
>>> bitcasts are added to the IR (see bottom of the email for code samples).
>>> This prevents select speculation in SROA to work. Also after SROA we have
>>> bitcasts from int32 to float. (Whereas originally after instCombine,
>>> bitcasts are only done on pointer types).
>>>
>>> === PROPOSED SOLUTION===
>>>
>>> [1] implies that we need load canonicalization when we load a value only
>>> to store it again. The reason is to avoid generating slightly different
>>> code (due to different ways of adding bitcasts), in different situations.
>>> In all examples presented in [1] there is a non-zero number of bitcasts. I
>>> think when we load a value of type T from a T* address and store it as a
>>> type T value to one or more T* address (and there is no other use or
>>> store), we can redefine canonical form to mean there should not be any
>>> bitcasts. So we still have a canonical form, but its definition is slightly
>>> different.
>>>
>>> === REASONS FOR / AGAINST===
>>>
>>> Hal Finkel warns that while this may be useful for power pc, this may
>>> hurt more than one other platform and become a very large project. Despite
>>> this he is fine with bringing up the issue to the mailing list to get
>>> feedback, mostly because this seems inline with our future direction of
>>> having a unique type for all pointers.  (Hal please correct me if I
>>> misunderstood your comment)
>>>
>>> This is a much simpler fix compared to alternatives. (ignoring potential
>>> regressions)
>>>
>>> === ALTERNATIVE SOLUTION ===
>>>
>>> Fix select speculation in SROA to see through bitcasts. Handle remaining
>>> bitcasts during code gen. Other alternative solutions are welcome.
>>>
>>> Should I implement the proposed solution or is it too risky? I
>>> understand that we may need to undo it if it breaks too many things.
>>> Comments are welcome.
>>>
>>>
>>> [1] http://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html
>>> r226781  git commit id: b778cbc0c8
>>>
>>>
>>>
>>> Code Samples (only relevant part is copied):
>>>
>>> --------------------  Before Canonicalization (contains call to
>>> std::max): --------------------
>>> entry:
>>>   %max_value = alloca float, align 4
>>>   %1 = load float, float* %input, align 4, !tbaa !1
>>>   store float %1, float* %max_value, align 4, !tbaa !1
>>>
>>> for.body:
>>>   %call = call dereferenceable(4) float* @_ZSt3maxIfERKT_S2_S2_(float*
>>> dereferenceable(4) %max_value, float* dereferenceable(4) %arrayidx1)
>>>   %3 = load float, float* %call, align 4, !tbaa !1
>>>   store float %3, float* %max_value, align 4, !tbaa !1
>>>
>>> --------------------  After Canonicalization (contains call to
>>> std::max):--------------------
>>>
>>> entry:
>>>   %max_value = alloca float, align 4
>>>   %1 = bitcast float* %input to i32*
>>>   %2 = load i32, i32* %1, align 4, !tbaa !1
>>>   %3 = bitcast float* %max_value to i32*
>>>   store i32 %2, i32* %3, align 4, !tbaa !1
>>>
>>> for.body:
>>>   %call = call dereferenceable(4) float* @_ZSt3maxIfERKT_S2_S2_(float*
>>> nonnull dereferenceable(4) %max_value, float* dereferenceable(4) %arrayidx1)
>>>   %5 = bitcast float* %call to i32*
>>>   %6 = load i32, i32* %5, align 4, !tbaa !1
>>>   %7 = bitcast float* %max_value to i32*
>>>   store i32 %6, i32* %7, align 4, !tbaa !1
>>>
>>> -------------------- After SROA (the call to std::max is inlined
>>> now):--------------------
>>> entry:
>>>   %max_value.sroa.0 = alloca i32
>>>   %0 = bitcast float* %input to i32*
>>>   %1 = load i32, i32* %0, align 4, !tbaa !1
>>>   store i32 %1, i32* %max_value.sroa.0
>>>
>>> for.body:
>>>   %max_value.sroa.0.0.max_value.sroa.0.0.6 = load i32, i32*
>>> %max_value.sroa.0
>>>   %3 = bitcast i32 %max_value.sroa.0.0.max_value.sroa.0.0.6 to float
>>>   %max_value.sroa.0.0.max_value.sroa_cast8 = bitcast i32*
>>> %max_value.sroa.0 to float*
>>>   %__b.__a.i = select i1 %cmp.i, float* %arrayidx1, float*
>>> %max_value.sroa.0.0.max_value.sroa_cast8
>>>   %5 = bitcast float* %__b.__a.i to i32*
>>>   %6 = load i32, i32* %5, align 4, !tbaa !1
>>>   store i32 %6, i32* %max_value.sroa.0
>>>
>>> -------------------- After SROA when Canonicalization is turned
>>> off--------------------
>>> entry:
>>>   %0 = load float, float* %input, align 4, !tbaa !1
>>>
>>> for.cond:                                         ; preds = %for.body,
>>> %entry
>>>   %max_value.0 = phi float [ %0, %entry ], [ %.sroa.speculated,
>>> %for.body ]
>>>
>>> for.body:
>>>   %1 = load float, float* %arrayidx1, align 4, !tbaa !1
>>>   %cmp.i = fcmp olt float %max_value.0, %1
>>>   %.sroa.speculate.load.true = load float, float* %arrayidx1, align 4,
>>> !tbaa !1
>>>   %.sroa.speculated = select i1 %cmp.i, float
>>> %.sroa.speculate.load.true, float %max_value.0
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160316/adb51e99/attachment.html>


More information about the llvm-dev mailing list