[llvm] r274098 - [ValueTracking] Teach computeKnownBits for PHI nodes to compute sign bit for a recurrence with a NSW addition.
Craig Topper via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 29 15:42:13 PDT 2016
sdiv also prevents converting a divide by a power of 2 into a shift. That's
what was happening in the loop I was looking at. For x86 we ended up with a
multiple-instruction abs idiom, a shift for the divide, and then undoing
On Wed, Jun 29, 2016 at 12:20 PM, Huang, Li1 <li1.huang at intel.com> wrote:
> Actually, D18777 is doing the same thing as Craig's change, and is based
> on the same motivation: sdiv/srem -> udiv/urem. Udiv/urem is generally
> faster than sdiv/srem on most archs.
> The reason it breaks the FE test is that it converts some sexts to zexts
> before indvars (during instcombine), but indvars cannot recognize that the
> zext is unnecessary and should be removed. Instead, after widening the IV,
> indvars inserts truncs before the zexts.
> D18867 was proposed to fix this problem; it teaches indvars that a zext of
> a signed value which is known non-negative is unnecessary. However, this
> change by itself is not enough: a loop-rotate pass needs to run before
> indvars for it to take effect.
> The test uses O1, but the O1 pipeline doesn't have a loop-rotate before
> indvars, and neither do O2 or O3. So I proposed changing the test in D21773.
> Li Huang
> -----Original Message-----
> From: Sanjoy Das [mailto:sanjoy at playingwithpointers.com]
> Sent: Wednesday, June 29, 2016 12:35 AM
> To: Craig Topper <craig.topper at gmail.com>
> Cc: Adam Nemet <anemet at apple.com>; llvm-commits <
> llvm-commits at lists.llvm.org>; Huang, Li1 <li1.huang at intel.com>
> Subject: Re: [llvm] r274098 - [ValueTracking] Teach computeKnownBits for
> PHI nodes to compute sign bit for a recurrence with a NSW addition.
> I think I have to redact my "I suspect this isn't anything complicated"
> tone. :)
> To do the right thing in cases like the one you pastebinned, we'll have to
> track `IsSigned` on individual `NarrowIVDefUse` instances.
> Then for instance, in
> %i.050 = phi i32 [ %add15, %for.body ], [ 0, %for.body.preheader ]
> %idxprom45 = zext i32 %i.050 to i64
> %arrayidx = getelementptr inbounds i32, i32* %A, i64 %idxprom45
> %0 = load i32, i32* %arrayidx, align 4
> %sub = add nsw i32 %i.050, -1
> %idxprom1 = sext i32 %sub to i64
> %arrayidx2 = getelementptr inbounds i32, i32* %B, i64 %idxprom1
> %1 = load i32, i32* %arrayidx2, align 4
> we'll first replace `%idxprom45` with the zero-extended wide IV, and then
> when we get to `%sub`, we'll replace it with `add %wide.zexted.iv, -1`, but
> the corresponding `NarrowIVDefUse` we push to the worklist will have to
> remember that this new wide add instruction is really a `sext` of `%sub`
> (and accordingly handle its users).
> I'll be on vacation from 30th June to 10th of July, but I should be able
> to get to this once I'm back (unfortunately, I probably won't have time
> before I leave). If you want to fix this and have someone else or me (once
> I'm back) review it that's fine too.
> -- Sanjoy