[llvm-dev] RFC: A change in InstCombine canonical form

Tue Mar 22 20:57:46 PDT 2016


On 03/22/2016 06:50 PM, Ehsan Amiri wrote:
>
>
> On Tue, Mar 22, 2016 at 6:42 PM, Philip Reames 
> <listmail at philipreames.com <mailto:listmail at philipreames.com>> wrote:
>
>
>
>     On 03/22/2016 02:44 PM, Ehsan Amiri wrote:
>>     Thanks.
>>
>>     *Phillip, *As Hal said I do not think (1) is a very large item.
>>     Please let me know if I am mistaken.
>     I have no specific reason to believe it will be a large amount of
>     work, but prior experience tells me changes to canonical form have
>     a tendency of exposing unexpected issues.  To be clear, I am
>     supportive of you implementing solution 1.
>>
>
> Thanks Phillip. Do you mean exposing bugs that were unknown before? or 
> something else?
>
> If the problem is that some transformations will start to have 
> problems because they do not know how to deal with the new canonical 
> form, that's alarming. If the chances of exposing such problem is not 
> too high, then we can possibly go ahead with (1) and fallback to (2) 
> if (1) starts to be complicated. But if the risk is too high, then we 
> may want to prefer (2) to (1).
Most are not correctness bugs, but performance bugs.  Later passes 
(particular CGP and later) do not do well when presented arbitrary 
non-canonical IR.  They won't crash, but they may not be as effective as 
expected and may miss obvious performance wins.  When you change what 
canonical is, you sometimes miss the existing sweat spot and have to 
adjust the code to handle the new canonical form.

My suggestion would be to implement your change, run a reasonable set of 
performance tests, fix any problems, and iterate.  Just don't be 
surprised if what seems like a simple patch snowballs into a series of 
mostly minor patches.  I'm not trying to warn you off or scare you; I 
just want to make sure you have reasonable expectations going in.
>
> Any suggestions?
>
>
>
>
>
>
>>     *David *I think (1) is more inline with typeless pointer work
>>     than (2). Contributing to typeless pointer work will be great,
>>     but given its unknown time frame we cannot stop fixing existing
>>     problems. Of course, we should follow an approach consistent with
>>     the long-term solution.
>>
>>     On Tue, Mar 22, 2016 at 5:30 PM, David Blaikie
>>     <dblaikie at gmail.com <mailto:dblaikie at gmail.com>> wrote:
>>
>>
>>
>>         On Tue, Mar 22, 2016 at 1:41 PM, Mehdi Amini
>>         <mehdi.amini at apple.com <mailto:mehdi.amini at apple.com>> wrote:
>>
>>             Sorry I should have been more clear (writing to many
>>             email in parallel)
>>
>>             You're right. I was adding a +1 to you here.
>>
>>             Especially I wrote "unless there is an acknowledgement
>>             that typeless pointers won't be there before a couple of
>>             years" with  the PassManager in mind, and I was expecting
>>             from David some good indication of a timeframe for the
>>             typeless pointers.
>>             If the typeless pointer work is stalled or if it is not
>>             planned for LLVM 3.9,
>>
>>
>>         It's neither stalled nor planned, as such.
>>
>>             I agree with Philip to not block anything.
>>
>>
>>         All I'm suggesting is that if there are people who want to
>>         fix these bugs, I'd really appreciate them helping out on the
>>         typeless pointer work - it's totally
>>         parallelizable/shardable/shareable work & the project as a
>>         whole seemed to agree it was the right direction. Why not
>>         just get it done?
>>
>>         The Pass Manager work is a bit different & harder to share at
>>         certain points (I don't tihnk there's ever been a point at
>>         which someone could ask me "what can I do to help with the
>>         typeless pointer work" I couldn't've given some pretty large
>>         areas I wasn't anywhere near touching that they could've
>>         gotten their teeth into) - I think it's reaching the point
>>         where multiple passes can be ported independently & seen some
>>         work like that in Polly, etc. So at this point it seems like
>>         if people want to address the issues the new pass manager is
>>         aimed at addressing, they could pitch in on that effort.
>>
>>         That said, I'm not in a position (nor would I do so, even if
>>         I were) to block other patches, just to encourage people to
>>         help out on the bigger efforts in whatever way they can
>>         (either directly, or indirectly through ensuring stopgaps/new
>>         work is done in a way that makes that future work easier
>>         (where it's reasonable to judge what might be easier or
>>         harder, etc), etc)
>>
>>         - Dave
>>
>>
>>             -- 
>>             Mehdi
>>
>>
>>>             On Mar 22, 2016, at 1:37 PM, Philip Reames
>>>             <listmail at philipreames.com
>>>             <mailto:listmail at philipreames.com>> wrote:
>>>
>>>             But not what David was stating, unless I misread?  I was
>>>             specifically responding to David's wording:
>>>             "If we're talking about making an optimization able to
>>>             ignore the bitcast instructions - yes, that work is
>>>             unnecessary & perhaps questionable given the typeless
>>>             pointer work. Not outright off limits, but the same time
>>>             might be better invested in moving typeless pointers
>>>             forward if the contributor is so inclined/able to shift
>>>             in that direction."
>>>
>>>             Both "perhaps questionable" and "not outright off
>>>             limits" seem to strongly imply such work should be
>>>             discouraged. I disagree with that view which is why I
>>>             wrote my response.
>>>
>>>             Philip
>>>
>>>             On 03/22/2016 01:34 PM, Mehdi Amini wrote:
>>>>             This is roughly what I wrote...
>>>>
>>>>>             On Mar 22, 2016, at 1:31 PM, Philip Reames
>>>>>             <listmail at philipreames.com
>>>>>             <mailto:listmail at philipreames.com>> wrote:
>>>>>
>>>>>             I feel very strongly that blocking work on making
>>>>>             optimization bitcast-ignorant on the typeless pointer
>>>>>             work would be a major mistake. Unless we expected the
>>>>>             typeless pointer work to be concluded within the near
>>>>>             term (say 3-6 months maximum), we should not block any
>>>>>             development which would be accepted in the typeless
>>>>>             pointer work wasn't planned.
>>>>>
>>>>>             In my view, this is one of the largest mistakes we've
>>>>>             made with the pass manager work, it has seriously cost
>>>>>             us, and I don't want to see it happening again.
>>>>>
>>>>>             Philip
>>>>>
>>>>>             On 03/22/2016 01:09 PM, David Blaikie wrote:
>>>>>>             Ultimately everything is going to be made to not rely
>>>>>>             on the types of pointers - that's nearly equivalent
>>>>>>             to bitcast-ignorant (the difference being that the
>>>>>>             presence of an extra instruction (the bitcast) might
>>>>>>             trip up some optimizations - but the presence of the
>>>>>>             /type/ information implied by the bitcast should not
>>>>>>             trip up or be necessary for optimizations (two sides
>>>>>>             of the same coin))
>>>>>>
>>>>>>             If we're talking about making an optimization able to
>>>>>>             ignore the bitcast instructions - yes, that work is
>>>>>>             unnecessary & perhaps questionable given the typeless
>>>>>>             pointer work. Not outright off limits, but the same
>>>>>>             time might be better invested in moving typeless
>>>>>>             pointers forward if the contributor is so
>>>>>>             inclined/able to shift in that direction.
>>>>>>
>>>>>>             But if we're talking about the work to make the
>>>>>>             optimization not use the type of pointers as a crutch
>>>>>>             - that work is a necessary precursor to the typeless
>>>>>>             pointer work and would be most welcome.
>>>>>>
>>>>>>             - David
>>>>>>
>>>>>>             On Tue, Mar 22, 2016 at 12:39 PM, Mehdi Amini
>>>>>>             <mehdi.amini at apple.com
>>>>>>             <mailto:mehdi.amini at apple.com>> wrote:
>>>>>>
>>>>>>                 I don't really mind, but the intermediate stage
>>>>>>                 will not be very nice: that a lot of code / tests
>>>>>>                 that needs to be written with bitcast, and all of
>>>>>>                 that while they are deemed to disappear. The
>>>>>>                 added value isn't clear to me considering the
>>>>>>                 added work. I'm not sure it wouldn't add more
>>>>>>                 work for all the cleanup required by the
>>>>>>                 "typeless pointer", but I'm not sure what's
>>>>>>                 involved here and if David thinks the
>>>>>>                 intermediate steps of handling bit casts
>>>>>>                 everywhere is not making it harder I'm fine with it.
>>>>>>
>>>>>>                 -- 
>>>>>>                 Mehdi
>>>>>>
>>>>>>>                 On Mar 22, 2016, at 12:36 PM, Philip Reames
>>>>>>>                 <listmail at philipreames.com
>>>>>>>                 <mailto:listmail at philipreames.com>> wrote:
>>>>>>>
>>>>>>>                 I'd phrase this differently: being
>>>>>>>                 pointer-bitcast agnostic is a step towards
>>>>>>>                 support typeless pointers.  :) We can either
>>>>>>>                 become bitcast agnostic all in one big change or
>>>>>>>                 incrementally. Personally, I'd prefer the later
>>>>>>>                 since it reduces the risk associated with
>>>>>>>                 enabling typeless pointers in the end.
>>>>>>>
>>>>>>>                 Philip
>>>>>>>
>>>>>>>                 On 03/22/2016 12:16 PM, Mehdi Amini via llvm-dev
>>>>>>>                 wrote:
>>>>>>>>                 I don't know enough about the tradeoff for 1,
>>>>>>>>                 but 2 seems like a bandaid for something that
>>>>>>>>                 is not a correctness issue neither a
>>>>>>>>                 regression. I'm not sure it justifies "bandaid
>>>>>>>>                 patches" while there is a clear path forward,
>>>>>>>>                 i.e. typeless pointers, unless there is an
>>>>>>>>                 acknowledgement that typeless pointers won't be
>>>>>>>>                 there before a couple of years.
>>>>>>>>
>>>>>>>>                 -- 
>>>>>>>>                 Mehdi
>>>>>>>>
>>>>>>>>>                 On Mar 22, 2016, at 11:32 AM, Ehsan Amiri
>>>>>>>>>                 <ehsanamiri at gmail.com
>>>>>>>>>                 <mailto:ehsanamiri at gmail.com>> wrote:
>>>>>>>>>
>>>>>>>>>                 Back to the discussion on the RFC, I still see
>>>>>>>>>                 some advantage in following the proposed
>>>>>>>>>                 solution. I see two paths forward:
>>>>>>>>>
>>>>>>>>>                 1- Change canonical form, possibly lower
>>>>>>>>>                 memcpy to non-integer load and store in
>>>>>>>>>                 InstCombine. Then teach the backends to
>>>>>>>>>                 convert that to integer load and store if that
>>>>>>>>>                 is more profitable. Notice that we are talking
>>>>>>>>>                 about loads that have no use other than store.
>>>>>>>>>                 So it is a fairly simple change in the backends.
>>>>>>>>>
>>>>>>>>>                 2- Do not change the canonical form. Then for
>>>>>>>>>                 this bug, we need to teach select speculation
>>>>>>>>>                 to see through bitcasts. We will probably need
>>>>>>>>>                 to teach other optimizations to see though
>>>>>>>>>                 bitcasts in the future as problems are
>>>>>>>>>                 uncovered. That is until typeless pointer work
>>>>>>>>>                 is complete. Once the typeless pointer work is
>>>>>>>>>                 complete, we have some extra code in each
>>>>>>>>>                 optimization for seeing through bitcasts which
>>>>>>>>>                 is possibly no longer needed.
>>>>>>>>>
>>>>>>>>>                 Based on this I think (1) is the right thing
>>>>>>>>>                 to do. But there might be other reasons for
>>>>>>>>>                 the current canonical form that I am not aware
>>>>>>>>>                 of. Please let me know what you think.
>>>>>>>>>
>>>>>>>>>                 Thanks
>>>>>>>>>                 Ehsan
>>>>>>>>>
>>>>>>>>>                 On Wed, Mar 16, 2016 at 2:13 PM, David Blaikie
>>>>>>>>>                 <dblaikie at gmail.com
>>>>>>>>>                 <mailto:dblaikie at gmail.com>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                     On Wed, Mar 16, 2016 at 11:00 AM, Ehsan
>>>>>>>>>                     Amiri <ehsanamiri at gmail.com
>>>>>>>>>                     <mailto:ehsanamiri at gmail.com>> wrote:
>>>>>>>>>
>>>>>>>>>                         David,
>>>>>>>>>
>>>>>>>>>                         Could you give us an update on the
>>>>>>>>>                         status of typeless pointer work? How
>>>>>>>>>                         much work is left and when you think
>>>>>>>>>                         it might be ready?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                     It's a bit of an onion peel, really -
>>>>>>>>>                     since it will eventually involve
>>>>>>>>>                     generalizing/fixing every optimization
>>>>>>>>>                     that's currently leaning on typed pointers
>>>>>>>>>                     to keep the performance while removing the
>>>>>>>>>                     crutch they're currently leaning on. (in
>>>>>>>>>                     cases where bitcasts are literally just
>>>>>>>>>                     getting in the way, those won't require
>>>>>>>>>                     cleaning up & should just become "free
>>>>>>>>>                     performance wins" once we remove them, though)
>>>>>>>>>
>>>>>>>>>                     At the moment we can roundtrip every LLVM
>>>>>>>>>                     IR test case through bitcode and textual
>>>>>>>>>                     IR (reading/writing both formats) while
>>>>>>>>>                     using only a narrow whitelist of places
>>>>>>>>>                     that request the type of a pointer (things
>>>>>>>>>                     like the verifier, the parser/printer
>>>>>>>>>                     where it actually needs the typed pointer
>>>>>>>>>                     to verify it matches the explicit type, etc).
>>>>>>>>>
>>>>>>>>>                     The next thing on the list is probably
>>>>>>>>>                     figuring out the byval/inalloca
>>>>>>>>>                     representation (add an explicit pointee
>>>>>>>>>                     type? just make the number of bytes
>>>>>>>>>                     explicit with no type information?).
>>>>>>>>>
>>>>>>>>>                     Then start migrating optimizations over -
>>>>>>>>>                     doing the same sort of testing I did for
>>>>>>>>>                     the IR/bitcode roundtripping - assert that
>>>>>>>>>                     the pointee type is not accessed,
>>>>>>>>>                     whitelist places that need it until the
>>>>>>>>>                     bitcasts go away, fix anything else...
>>>>>>>>>                     it'll still be a fair bit of work & I
>>>>>>>>>                     don't really know how much. It should
>>>>>>>>>                     parallelize pretty well (doing any of this
>>>>>>>>>                     work is really helpful, each optimization
>>>>>>>>>                     is indepednent, etc) if anyone wants to/is
>>>>>>>>>                     able to help.
>>>>>>>>>
>>>>>>>>>                     - Dave
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                         Thanks
>>>>>>>>>                         Ehsan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                         On Wed, Mar 16, 2016 at 1:12 PM, David
>>>>>>>>>                         Blaikie <dblaikie at gmail.com
>>>>>>>>>                         <mailto:dblaikie at gmail.com>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                             On Wed, Mar 16, 2016 at 8:34 AM,
>>>>>>>>>                             Mehdi Amini via llvm-dev
>>>>>>>>>                             <llvm-dev at lists.llvm.org
>>>>>>>>>                             <mailto:llvm-dev at lists.llvm.org>>
>>>>>>>>>                             wrote:
>>>>>>>>>
>>>>>>>>>                                 Hi,
>>>>>>>>>
>>>>>>>>>                                 How do it interact with the
>>>>>>>>>                                 "typeless pointers" work?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                             Right - the goal of the typeless
>>>>>>>>>                             pointer work is to fix all these
>>>>>>>>>                             bugs related to "didn't look
>>>>>>>>>                             through bitcasts" in
>>>>>>>>>                             optimizations. Sometimes that's
>>>>>>>>>                             going to mean more work (because
>>>>>>>>>                             the code is leaning on the absence
>>>>>>>>>                             of bitcasts & the presence of
>>>>>>>>>                             convenient (but not guaranteed)
>>>>>>>>>                             type information to inform
>>>>>>>>>                             optimization decisions) but if we
>>>>>>>>>                             remove typed pointers while
>>>>>>>>>                             keeping optimization quality in
>>>>>>>>>                             the cases we have today, then we
>>>>>>>>>                             should've also fixed the cases
>>>>>>>>>                             that were broken because the type
>>>>>>>>>                             information didn't end up aligning
>>>>>>>>>                             to produce the optimal output.
>>>>>>>>>
>>>>>>>>>                             & I know I've been off the
>>>>>>>>>                             typeless pointer stuff for a few
>>>>>>>>>                             months working on llvm-dwp - but
>>>>>>>>>                             happy for any help (the next
>>>>>>>>>                             immediate piece is probably
>>>>>>>>>                             figuring out teh right
>>>>>>>>>                             representation for byval and
>>>>>>>>>                             inalloca - there were some code
>>>>>>>>>                             reviews sent out for that that
>>>>>>>>>                             I'll need to come back around to -
>>>>>>>>>                             but also any optimizations people
>>>>>>>>>                             want to help rework/improve would
>>>>>>>>>                             be great too & I can provide some
>>>>>>>>>                             techniques/tools to help people
>>>>>>>>>                             approach those)
>>>>>>>>>
>>>>>>>>>                             - Dave
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                                 Thanks,
>>>>>>>>>
>>>>>>>>>                                 -- 
>>>>>>>>>                                 Mehdi
>>>>>>>>>
>>>>>>>>>>                                 On Mar 16, 2016, at 6:41 AM,
>>>>>>>>>>                                 Ehsan Amiri via llvm-dev
>>>>>>>>>>                                 <llvm-dev at lists.llvm.org
>>>>>>>>>>                                 <mailto:llvm-dev at lists.llvm.org>>
>>>>>>>>>>                                 wrote:
>>>>>>>>>>
>>>>>>>>>>                                 === PROBLEM === (See this bug
>>>>>>>>>>                                 https://llvm.org/bugs/show_bug.cgi?id=26445)
>>>>>>>>>>
>>>>>>>>>>                                 IR contains code for loading
>>>>>>>>>>                                 a float from float * and
>>>>>>>>>>                                 storing it to a float *
>>>>>>>>>>                                 address. After
>>>>>>>>>>                                 canonicalization of load in
>>>>>>>>>>                                 InstCombine [1], new bitcasts
>>>>>>>>>>                                 are added to the IR (see
>>>>>>>>>>                                 bottom of the email for code
>>>>>>>>>>                                 samples). This prevents
>>>>>>>>>>                                 select speculation in SROA to
>>>>>>>>>>                                 work. Also after SROA we have
>>>>>>>>>>                                 bitcasts from int32 to float.
>>>>>>>>>>                                 (Whereas originally after
>>>>>>>>>>                                 instCombine, bitcasts are
>>>>>>>>>>                                 only done on pointer types).
>>>>>>>>>>
>>>>>>>>>>                                 === PROPOSED SOLUTION===
>>>>>>>>>>
>>>>>>>>>>                                 [1] implies that we need load
>>>>>>>>>>                                 canonicalization when we load
>>>>>>>>>>                                 a value only to store it
>>>>>>>>>>                                 again. The reason is to avoid
>>>>>>>>>>                                 generating slightly different
>>>>>>>>>>                                 code (due to different ways
>>>>>>>>>>                                 of adding bitcasts), in
>>>>>>>>>>                                 different situations. In all
>>>>>>>>>>                                 examples presented in [1]
>>>>>>>>>>                                 there is a non-zero number of
>>>>>>>>>>                                 bitcasts. I think when we
>>>>>>>>>>                                 load a value of type T from a
>>>>>>>>>>                                 T* address and store it as a
>>>>>>>>>>                                 type T value to one or more
>>>>>>>>>>                                 T* address (and there is no
>>>>>>>>>>                                 other use or store), we can
>>>>>>>>>>                                 redefine canonical form to
>>>>>>>>>>                                 mean there should not be any
>>>>>>>>>>                                 bitcasts. So we still have a
>>>>>>>>>>                                 canonical form, but its
>>>>>>>>>>                                 definition is slightly different.
>>>>>>>>>>
>>>>>>>>>>                                 === REASONS FOR / AGAINST===
>>>>>>>>>>
>>>>>>>>>>                                 Hal Finkel warns that while
>>>>>>>>>>                                 this may be useful for power
>>>>>>>>>>                                 pc, this may hurt more than
>>>>>>>>>>                                 one other platform and become
>>>>>>>>>>                                 a very large project. Despite
>>>>>>>>>>                                 this he is fine with bringing
>>>>>>>>>>                                 up the issue to the mailing
>>>>>>>>>>                                 list to get feedback, mostly
>>>>>>>>>>                                 because this seems inline
>>>>>>>>>>                                 with our future direction of
>>>>>>>>>>                                 having a unique type for all
>>>>>>>>>>                                 pointers. (Hal please correct
>>>>>>>>>>                                 me if I misunderstood your
>>>>>>>>>>                                 comment)
>>>>>>>>>>
>>>>>>>>>>                                 This is a much simpler fix
>>>>>>>>>>                                 compared to alternatives.
>>>>>>>>>>                                 (ignoring potential regressions)
>>>>>>>>>>
>>>>>>>>>>                                 === ALTERNATIVE SOLUTION ===
>>>>>>>>>>
>>>>>>>>>>                                 Fix select speculation in
>>>>>>>>>>                                 SROA to see through bitcasts.
>>>>>>>>>>                                 Handle remaining bitcasts
>>>>>>>>>>                                 during code gen. Other
>>>>>>>>>>                                 alternative solutions are
>>>>>>>>>>                                 welcome.
>>>>>>>>>>
>>>>>>>>>>                                 Should I implement the
>>>>>>>>>>                                 proposed solution or is it
>>>>>>>>>>                                 too risky? I understand that
>>>>>>>>>>                                 we may need to undo it if it
>>>>>>>>>>                                 breaks too many things.
>>>>>>>>>>                                 Comments are welcome.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                                 [1]
>>>>>>>>>>                                 http://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html
>>>>>>>>>>                                 r226781  git commit id:
>>>>>>>>>>                                 b778cbc0c8
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                                 Code Samples (only relevant
>>>>>>>>>>                                 part is copied):
>>>>>>>>>>
>>>>>>>>>>                                 -------------------- Before
>>>>>>>>>>                                 Canonicalization (contains
>>>>>>>>>>                                 call to std::max):
>>>>>>>>>>                                 --------------------
>>>>>>>>>>                                 entry:
>>>>>>>>>>                                   %max_value = alloca float,
>>>>>>>>>>                                 align 4
>>>>>>>>>>                                   %1 = load float, float*
>>>>>>>>>>                                 %input, align 4, !tbaa !1
>>>>>>>>>>                                   store float %1, float*
>>>>>>>>>>                                 %max_value, align 4, !tbaa !1
>>>>>>>>>>
>>>>>>>>>>                                 for.body:
>>>>>>>>>>                                   %call = call
>>>>>>>>>>                                 dereferenceable(4) float*
>>>>>>>>>>                                 @_ZSt3maxIfERKT_S2_S2_(float*
>>>>>>>>>>                                 dereferenceable(4)
>>>>>>>>>>                                 %max_value, float*
>>>>>>>>>>                                 dereferenceable(4) %arrayidx1)
>>>>>>>>>>                                   %3 = load float, float*
>>>>>>>>>>                                 %call, align 4, !tbaa !1
>>>>>>>>>>                                   store float %3, float*
>>>>>>>>>>                                 %max_value, align 4, !tbaa !1
>>>>>>>>>>
>>>>>>>>>>                                 -------------------- After
>>>>>>>>>>                                 Canonicalization (contains
>>>>>>>>>>                                 call to
>>>>>>>>>>                                 std::max):--------------------
>>>>>>>>>>
>>>>>>>>>>                                 entry:
>>>>>>>>>>                                   %max_value = alloca float,
>>>>>>>>>>                                 align 4
>>>>>>>>>>                                   %1 = bitcast float* %input
>>>>>>>>>>                                 to i32*
>>>>>>>>>>                                   %2 = load i32, i32* %1,
>>>>>>>>>>                                 align 4, !tbaa !1
>>>>>>>>>>                                   %3 = bitcast float*
>>>>>>>>>>                                 %max_value to i32*
>>>>>>>>>>                                   store i32 %2, i32* %3,
>>>>>>>>>>                                 align 4, !tbaa !1
>>>>>>>>>>
>>>>>>>>>>                                 for.body:
>>>>>>>>>>                                   %call = call
>>>>>>>>>>                                 dereferenceable(4) float*
>>>>>>>>>>                                 @_ZSt3maxIfERKT_S2_S2_(float*
>>>>>>>>>>                                 nonnull dereferenceable(4)
>>>>>>>>>>                                 %max_value, float*
>>>>>>>>>>                                 dereferenceable(4) %arrayidx1)
>>>>>>>>>>                                   %5 = bitcast float* %call
>>>>>>>>>>                                 to i32*
>>>>>>>>>>                                   %6 = load i32, i32* %5,
>>>>>>>>>>                                 align 4, !tbaa !1
>>>>>>>>>>                                   %7 = bitcast float*
>>>>>>>>>>                                 %max_value to i32*
>>>>>>>>>>                                   store i32 %6, i32* %7,
>>>>>>>>>>                                 align 4, !tbaa !1
>>>>>>>>>>
>>>>>>>>>>                                 -------------------- After
>>>>>>>>>>                                 SROA (the call to std::max is
>>>>>>>>>>                                 inlined
>>>>>>>>>>                                 now):--------------------
>>>>>>>>>>                                 entry:
>>>>>>>>>>                                 %max_value.sroa.0 = alloca i32
>>>>>>>>>>                                   %0 = bitcast float* %input
>>>>>>>>>>                                 to i32*
>>>>>>>>>>                                   %1 = load i32, i32* %0,
>>>>>>>>>>                                 align 4, !tbaa !1
>>>>>>>>>>                                   store i32 %1, i32*
>>>>>>>>>>                                 %max_value.sroa.0
>>>>>>>>>>
>>>>>>>>>>                                 for.body:
>>>>>>>>>>                                 %max_value.sroa.0.0.max_value.sroa.0.0.6
>>>>>>>>>>                                 = load i32, i32*
>>>>>>>>>>                                 %max_value.sroa.0
>>>>>>>>>>                                   %3 = bitcast i32
>>>>>>>>>>                                 %max_value.sroa.0.0.max_value.sroa.0.0.6
>>>>>>>>>>                                 to float
>>>>>>>>>>                                 %max_value.sroa.0.0.max_value.sroa_cast8
>>>>>>>>>>                                 = bitcast i32*
>>>>>>>>>>                                 %max_value.sroa.0 to float*
>>>>>>>>>>                                   %__b.__a.i = select i1
>>>>>>>>>>                                 %cmp.i, float* %arrayidx1,
>>>>>>>>>>                                 float*
>>>>>>>>>>                                 %max_value.sroa.0.0.max_value.sroa_cast8
>>>>>>>>>>                                   %5 = bitcast float*
>>>>>>>>>>                                 %__b.__a.i to i32*
>>>>>>>>>>                                   %6 = load i32, i32* %5,
>>>>>>>>>>                                 align 4, !tbaa !1
>>>>>>>>>>                                   store i32 %6, i32*
>>>>>>>>>>                                 %max_value.sroa.0
>>>>>>>>>>
>>>>>>>>>>                                 -------------------- After
>>>>>>>>>>                                 SROA when Canonicalization is
>>>>>>>>>>                                 turned off--------------------
>>>>>>>>>>                                 entry:
>>>>>>>>>>                                   %0 = load float, float*
>>>>>>>>>>                                 %input, align 4, !tbaa !1
>>>>>>>>>>
>>>>>>>>>>                                 for.cond: ; preds =
>>>>>>>>>>                                 %for.body, %entry
>>>>>>>>>>                                   %max_value.0 = phi float [
>>>>>>>>>>                                 %0, %entry ], [
>>>>>>>>>>                                 %.sroa.speculated, %for.body ]
>>>>>>>>>>
>>>>>>>>>>                                 for.body:
>>>>>>>>>>                                   %1 = load float, float*
>>>>>>>>>>                                 %arrayidx1, align 4, !tbaa !1
>>>>>>>>>>                                   %cmp.i = fcmp olt float
>>>>>>>>>>                                 %max_value.0, %1
>>>>>>>>>>                                 %.sroa.speculate.load.true =
>>>>>>>>>>                                 load float, float*
>>>>>>>>>>                                 %arrayidx1, align 4, !tbaa !1
>>>>>>>>>>                                 %.sroa.speculated = select i1
>>>>>>>>>>                                 %cmp.i, float
>>>>>>>>>>                                 %.sroa.speculate.load.true,
>>>>>>>>>>                                 float %max_value.0
>>>>>>>>>>                                 _______________________________________________
>>>>>>>>>>                                 LLVM Developers mailing list
>>>>>>>>>>                                 llvm-dev at lists.llvm.org
>>>>>>>>>>                                 <mailto:llvm-dev at lists.llvm.org>
>>>>>>>>>>                                 http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                                 _______________________________________________
>>>>>>>>>                                 LLVM Developers mailing list
>>>>>>>>>                                 llvm-dev at lists.llvm.org
>>>>>>>>>                                 <mailto:llvm-dev at lists.llvm.org>
>>>>>>>>>                                 http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>                 _______________________________________________
>>>>>>>>                 LLVM Developers mailing list
>>>>>>>>                 llvm-dev at lists.llvm.org
>>>>>>>>                 <mailto:llvm-dev at lists.llvm.org>
>>>>>>>>                 http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>>
>>
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160322/dc9d4c6a/attachment-0001.html>