[llvm] r344142 - [DAGCombine] Improve Load-Store Forwarding
Nirav Davé via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 23 02:41:16 PDT 2018
Given it's an out of tree test, its probably best if you write the patch
and I'll quickly LGTM it.
Nirav
On Tue, Oct 23, 2018, 04:26 Mikael Holmén <mikael.holmen at ericsson.com>
wrote:
> Hi,
>
> On 10/17/18 5:00 PM, Nirav Davé wrote:
> > Regardless, restricting this optimization to only byte-multiples
> > seems like the way to go then.
>
> Alright, should I create a patch with that or will you?
>
> I'll be out of office for a week now but I can create a patch after that
> and add you as reviewer if noone beats me to it.
>
> Regards,
> Mikael
>
> >
> > On Wed, Oct 17, 2018 at 2:34 AM, Mikael Holmén
> > <mikael.holmen at ericsson.com <mailto:mikael.holmen at ericsson.com>>
> wrote:
> >
> > Hi,
> >
> > On 10/16/2018 10:28 PM, Nirav Davé wrote:
> > > I thought the current state of non-8-bit bytes was not yet
> viable as
> > > things were baked in pretty hard.
> >
> > Yes there are many "* 8", "/ 8", ">> 3" etc in-tree that are
> > connected
> > to the byte size, but we've changed most of them in our clone and
> > generalized things so we use the bytesize that we've added to the
> > DataLayout instead. It works :)
> >
> > Every now and then there pops up a new hardcoded 8, 3 or similar
> > in-tree
> > that we must deal with but most of the time we notice that
> > pretty quickly.
> >
> > > Insisting that the LDType and STType
> > > are some multiple of Byte should be sufficient, but that's
> probably
> > > something that exists only for you.
> > >
> > > Can you try replacing with the following additional type
> legality check?
> > >
> > > -Nirav
> > > + if (DAG.getDataLayout().isBigEndian()) {
> > > + // Avoid dealing with big endian type legalization.
> > > + if (!TLI.isTypeLegal(STType) || !TLI.isTypeLegal(LDType))
> > > + return SDValue();
> > > + Offset =
> > > + (STMemType.getSizeInBits() -
> LDMemType.getSizeInBits()) / 8 -
> > > Offset;
> > > + }
> > > +
> >
> > Unfortunately it doesn't help in this case.
> >
> > We have
> >
> > STType: i40
> > LDType: i32
> >
> > and both i32 and i40 are legal on my target. (Yes, we've added
> > i40 to
> > MVT which has also been interesting, there are a couple of
> > places where
> > it's assumed that the MVTs are all power-of-2, which i40 clearly
> > isn't).
> >
> > Checking that the sizes of LDType and STType are multiple of
> > bytes like
> >
> > // Normalize for Endianness.
> > if (DAG.getDataLayout().isBigEndian()) {
> > // Avoid dealing with big endian type legalization.
> > if (!TLI.isTypeLegal(STType) || !TLI.isTypeLegal(LDType) ||
> > (LDType.getSizeInBits() % bitsPerByte() != 0) ||
> > (STType.getSizeInBits() % bitsPerByte() != 0))
> > return SDValue();
> > Offset =
> > (STMemType.getSizeInBits() - LDMemType.getSizeInBits()) /
> > bitsPerByte() - Offset;
> > }
> >
> > seems to work since it then bails out on the i40.
> >
> >
> > I still have a nagging feeling that this could be solved by
> > involving
> > getStoreSizeInBits() instead of, or in addition to,
> > getSizeInBits() in
> > some way though. If we calculate Offset as
> >
> > Offset =
> > (STMemType.getStoreSizeInBits() -
> > LDMemType.getStoreSizeInBits())
> > / bitsPerByte() - Offset;
> >
> > then we would get (48 - 32)/16 - 0 so Offset would be 1 and then
> > we'd
> > abort transformation with
> >
> > // TODO: Deal with nonzero offset.
> > if (LD->getBasePtr().isUndef() || Offset != 0)
> > return SDValue();
> >
> > so that would work too in this particular case but I don't know
> if
> > that's correct or not.
> >
> > I also don't know if this is really just a problem for bigendian
> > targets
> > or not, or if it's indeed just a problem for my target with all
> its
> > oddities.
> >
> > Thanks,
> > Mikael
> >
> > >
> > >
> > >
> > >
> > >
> > >
> > > 16-bit bytes? My understanding is that we've baked in 8-bit
> bytes pretty
> > > strongly into the backend. Are you
> > >
> > >
> > > On Mon, Oct 15, 2018 at 2:43 AM, Mikael Holmén
> > > <mikael.holmen at ericsson.com
> > <mailto:mikael.holmen at ericsson.com>
> > <mailto:mikael.holmen at ericsson.com
> > <mailto:mikael.holmen at ericsson.com>>> wrote:
> > >
> > > Hi,
> > >
> > > On 10/12/2018 04:03 PM, Nirav Davé wrote:
> > > > Hmm. I wonder if this is an issue with bigendian vector
> > representations.
> > > >
> > >
> > > No vectors involved in this case though.
> > >
> > > I think it's a size vs stores size issue, possibly that
> > might affect
> > > bigendian vectors too? I don't know.
> > >
> > > > I suspect there are not too many cases where we
> > actually get to the
> > > > getTruncatedStoreValue in your case. Can you tell me
> > what STType,
> > > > LDType, STMemType, LDMemType, and final offset are when
> > we get there for
> > > > your test case?
> > >
> > > When we reach getTruncatedStoreValue we have the
> > following DAG:
> > >
> > > SelectionDAG has 22 nodes:
> > > t0: ch = EntryToken
> > > t2: i24,ch = CopyFromReg t0, Register:i24 %0
> > > t23: ch = TokenFactor t2:1, t22
> > > t32: i32 = srl t38, Constant:i16<16>
> > > t17: ch,glue = CopyToReg t23, Register:i32 $a1_32, t32
> > > t27: i16 = add FrameIndex:i16<0>,
> Constant:i16<2>
> > > t34: i32,ch = load<(dereferenceable load 1 from
> > %ir._p48 + 2),
> > > zext from i16> t22, t27, undef:i16
> > > t30: i32 = shl t38, Constant:i16<16>
> > > t37: i32 = add t34, t30
> > > t19: ch,glue = CopyToReg t17, Register:i32 $a0_32,
> > t37, t17:1
> > > t24: i40 = any_extend t2
> > > t6: i40 = shl nuw t24, Constant:i16<16>
> > > t22: ch = store<(store 3 into %ir._p40, align 1)> t0,
> t6,
> > > FrameIndex:i16<0>, undef:i16
> > > t38: i32,ch = load<(dereferenceable load 2 from
> > %ir._p48, align 1)>
> > > t22, FrameIndex:i16<0>, undef:i16
> > > t20: ch = PHXISD::RETURN t19, Register:i32 $a1_32,
> > Register:i32
> > > $a0_32, t19:1
> > >
> > > and the following types
> > >
> > > STType: i40
> > > LDType: i32
> > > STMemType: i40
> > > LDMemType: i32
> > >
> > > The Offset is 0.
> > >
> > > Complicating factor: the byte size on my target is 16,
> > not 8, so we've
> > > replaced the hardcoded 8 in "* 8" and "/ 8" with
> > "bitsPerByte()" which
> > > in our case is 16 so the code looks like:
> > >
> > > bool STCoversLD =
> > > BasePtrST.equalBaseIndex(BasePtrLD, DAG, Offset)
> > && (Offset
> > > >= 0) &&
> > > (Offset * bitsPerByte() <=
> > LDMemType.getSizeInBits()) &&
> > > (Offset * bitsPerByte() +
> > LDMemType.getSizeInBits() <=
> > > STMemType.getSizeInBits());
> > >
> > > if (!STCoversLD)
> > > return SDValue();
> > >
> > > // Normalize for Endianness.
> > > if (DAG.getDataLayout().isBigEndian())
> > > Offset =
> > > (STMemType.getSizeInBits() -
> > LDMemType.getSizeInBits()) /
> > > bitsPerByte() - Offset;
> > >
> > > The Offset was thus calculated as
> > >
> > > (40 - 32)/16 - 0
> > >
> > > which I suspect is incorrect.
> > >
> > > STMemType is i40, so the size in bits is 40. However, the
> > store size
> > > for
> > > i40 is 48!
> > >
> > > So I think that we need to involve the store size in some
> > way here.
> > >
> > > Regards,
> > > Mikael
> > >
> > > >
> > > > -Nirav
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Fri, Oct 12, 2018 at 9:33 AM, Mikael Holmén
> > > > <mikael.holmen at ericsson.com
> > <mailto:mikael.holmen at ericsson.com>
> > <mailto:mikael.holmen at ericsson.com
> > <mailto:mikael.holmen at ericsson.com>>
> > > <mailto:mikael.holmen at ericsson.com
> > <mailto:mikael.holmen at ericsson.com>
> > > <mailto:mikael.holmen at ericsson.com
> > <mailto:mikael.holmen at ericsson.com>>>> wrote:
> > > >
> > > > Hi again,
> > > >
> > > > There seems to be another problem too.
> > > >
> > > > I don't have an in-tree reproducer for it but I
> > think the problem
> > > > has to
> > > > do with cases where the store size of the VT is
> > larger than the
> > > > "normal"
> > > > size.
> > > >
> > > >
> > > > Then I think the new code can miscalculate what
> > part of the
> > > stored
> > > > value
> > > > it should use instead of the load. Not sure if
> > this is only a
> > > problem
> > > > for bigendian targets or not.
> > > >
> > > > The end result is code that fails at runtime so
> > it's quite nasty.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > Regards,
> > > > Mikael
> > > >
> > > > On 10/12/2018 08:08 AM, Mikael Holmén wrote:
> > > > > Yep, thanks!
> > > > >
> > > > > Regards,
> > > > > Mikael
> > > > >
> > > > > On 10/11/2018 08:44 PM, Nirav Davé wrote:
> > > > >> Should be fixed in rL344272.
> > > > >>
> > > > >> Thanks for the catch.
> > > > >>
> > > > >> -Nirav
> > > > >>
> > > > >> On Thu, Oct 11, 2018 at 7:41 AM, Mikael Holmén
> > > > >> <mikael.holmen at ericsson.com
> > <mailto:mikael.holmen at ericsson.com>
> > > <mailto:mikael.holmen at ericsson.com
> > <mailto:mikael.holmen at ericsson.com>>
> > > <mailto:mikael.holmen at ericsson.com
> > <mailto:mikael.holmen at ericsson.com>
> > <mailto:mikael.holmen at ericsson.com
> > <mailto:mikael.holmen at ericsson.com>>>
> > > > <mailto:mikael.holmen at ericsson.com
> > <mailto:mikael.holmen at ericsson.com>
> > > <mailto:mikael.holmen at ericsson.com
> > <mailto:mikael.holmen at ericsson.com>>
> > > > <mailto:mikael.holmen at ericsson.com
> > <mailto:mikael.holmen at ericsson.com>
> > > <mailto:mikael.holmen at ericsson.com
> > <mailto:mikael.holmen at ericsson.com>>>>> wrote:
> > > > >>
> > > > >> Hi,
> > > > >>
> > > > >> Reproducer for powerpc:
> > > > >>
> > > > >> llc bug.ll -o - -O1
> > > > >>
> > > > >> Without the patch we get
> > > > >>
> > > > >> addis 4, 2, .LC0 at toc@ha
> > > > >> sth 3, -2(1)
> > > > >> ld 4, .LC0 at toc@l(4)
> > > > >> lbz 3, -2(1)
> > > > >> stb 3, 0(4)
> > > > >> blr
> > > > >>
> > > > >> and with
> > > > >>
> > > > >> addis 4, 2, .LC0 at toc@ha
> > > > >> sth 3, -2(1)
> > > > >> ld 4, .LC0 at toc@l(4)
> > > > >> stb 3, 0(4)
> > > > >> blr
> > > > >>
> > > > >> Admittedly I'm no ppc expert but I think
> > the final
> > > stb will
> > > > write
> > > > >> bits
> > > > >> [0-7] of 3 to 0(4).
> > > > >>
> > > > >> Before your patch those 8 bits were setup
> > from the
> > > > >>
> > > > >> lbz 3, -2(1)
> > > > >>
> > > > >> but with the patch, I think the bits we're
> > after are
> > > placed
> > > > at bit
> > > > >> [8-15] in 3 so we'll get the wrong byte.
> > > > >>
> > > > >> /Mikael
> > > > >>
> > > > >> On 10/11/2018 12:54 PM, Mikael Holmén
> wrote:
> > > > >> > Hi Nirav,
> > > > >> >
> > > > >> > I suspect that this patch doesn't
> > handle big
> > > endian targets
> > > > >> correctly.
> > > > >> >
> > > > >> > I'll try to get back with a reproducer
> > for some
> > > in-tree
> > > > target,
> > > > >> but for
> > > > >> > my out-of-tree bigendian target it
> > looks like it
> > > changes
> > > > somethng
> > > > >> like
> > > > >> >
> > > > >> > store i16 %v, i16* %p16
> > > > >> > %p8 = bitcast i16* %p16 to i8*
> > > > >> > %ld = load i16, i16* %p8
> > > > >> >
> > > > >> > to
> > > > >> >
> > > > >> > store i16 %v, i16* %p16
> > > > >> > %ld = truncate i16 %v to i8
> > > > >> >
> > > > >> > but I think it should rather be
> > > > >> >
> > > > >> > store i16 %v, i16* %p16
> > > > >> > %tmp = lshr i16 %v, 8
> > > > >> > %ld = truncate i16 %tmp to i8
> > > > >> >
> > > > >> > I.e. if the target is bigendian, the
> > load will
> > > read the
> > > > high 8
> > > > >> bits from
> > > > >> > %v rather than the low. And the
> > truncate that this
> > > patch
> > > > >> generates gives
> > > > >> > us the low bits.
> > > > >> >
> > > > >> > Regards,
> > > > >> > Mikael
> > > > >> >
> > > > >> > On 10/10/2018 04:15 PM, Nirav Dave via
> > > llvm-commits wrote:
> > > > >> >> Author: niravd
> > > > >> >> Date: Wed Oct 10 07:15:52 2018
> > > > >> >> New Revision: 344142
> > > > >> >>
> > > > >> >> URL:
> > > >
> > http://llvm.org/viewvc/llvm-project?rev=344142&view=rev
> > <http://llvm.org/viewvc/llvm-project?rev=344142&view=rev>
> > > <http://llvm.org/viewvc/llvm-project?rev=344142&view=rev
> > <http://llvm.org/viewvc/llvm-project?rev=344142&view=rev>>
> > > >
> > <http://llvm.org/viewvc/llvm-project?rev=344142&view=rev
> > <http://llvm.org/viewvc/llvm-project?rev=344142&view=rev>
> > > <http://llvm.org/viewvc/llvm-project?rev=344142&view=rev
> > <http://llvm.org/viewvc/llvm-project?rev=344142&view=rev>>>
> > > > >>
> > > <http://llvm.org/viewvc/llvm-project?rev=344142&view=rev
> > <http://llvm.org/viewvc/llvm-project?rev=344142&view=rev>
> > > <http://llvm.org/viewvc/llvm-project?rev=344142&view=rev
> > <http://llvm.org/viewvc/llvm-project?rev=344142&view=rev>>
> > > >
> > <http://llvm.org/viewvc/llvm-project?rev=344142&view=rev
> > <http://llvm.org/viewvc/llvm-project?rev=344142&view=rev>
> > > <http://llvm.org/viewvc/llvm-project?rev=344142&view=rev
> > <http://llvm.org/viewvc/llvm-project?rev=344142&view=rev>>>>
> > > > >> >> Log:
> > > > >> >> [DAGCombine] Improve Load-Store
> Forwarding
> > > > >> >>
> > > > >> >> Summary:
> > > > >> >> Extend analysis forwarding loads from
> > preceeding
> > > stores to
> > > > >> work with
> > > > >> >> extended loads and truncated stores to
> > the same
> > > address
> > > > so long
> > > > >> as the
> > > > >> >> load is fully subsumed by the store.
> > > > >> >>
> > > > >> >> Hexagon's swp-epilog-phis.ll and
> > > swp-memrefs-epilog1.ll
> > > > test are
> > > > >> >> deleted as they've no longer seem to
> > be relevant.
> > > > >> >>
> > > > >> >> Reviewers: RKSimon, rnk, kparzysz,
> > javed.absar
> > > > >> >>
> > > > >> >> Subscribers: sdardis, nemanjai,
> > hiraditya, atanasyan,
> > > > >> llvm-commits
> > > > >> >>
> > > > >> >> Differential Revision:
> > > https://reviews.llvm.org/D49200
> > <https://reviews.llvm.org/D49200>
> > <https://reviews.llvm.org/D49200 <
> https://reviews.llvm.org/D49200>>
> > > > <https://reviews.llvm.org/D49200
> > <https://reviews.llvm.org/D49200>
> > > <https://reviews.llvm.org/D49200
> > <https://reviews.llvm.org/D49200>>>
> > > > >> <https://reviews.llvm.org/D49200
> > <https://reviews.llvm.org/D49200>
> > > <https://reviews.llvm.org/D49200
> > <https://reviews.llvm.org/D49200>>
> > > > <https://reviews.llvm.org/D49200
> > <https://reviews.llvm.org/D49200>
> > > <https://reviews.llvm.org/D49200
> > <https://reviews.llvm.org/D49200>>>>
> > > > >> >>
> > > > >> >> Removed:
> > > > >> >>
> > > llvm/trunk/test/CodeGen/Hexagon/swp-epilog-phis.ll
> > > > >> >>
> > > llvm/trunk/test/CodeGen/Hexagon/swp-memrefs-epilog1.ll
> > > > >> >> Modified:
> > > > >> >>
> > > llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> > > > >> >>
> > > llvm/trunk/test/CodeGen/AArch64/arm64-ld-from-st.ll
> > > > >> >>
> > > >
> > llvm/trunk/test/CodeGen/AArch64/regress-tblgen-chains.ll
> > > > >> >>
> > > llvm/trunk/test/CodeGen/Hexagon/clr_set_toggle.ll
> > > > >> >>
> > llvm/trunk/test/CodeGen/Mips/cconv/vector.ll
> > > > >> >>
> > > > >>
> > >
> > llvm/trunk/test/CodeGen/Mips/indirect-jump-hazard/jumptables.ll
> > > > >> >>
> > llvm/trunk/test/CodeGen/Mips/o32_cc_byval.ll
> > > > >> >>
> > llvm/trunk/test/CodeGen/Mips/o32_cc_vararg.ll
> > > > >> >>
> > > llvm/trunk/test/CodeGen/PowerPC/addi-offset-fold.ll
> > > > >> >>
> > > >
> > llvm/trunk/test/CodeGen/SystemZ/store_nonbytesized_vecs.ll
> > > > >> >>
> > > llvm/trunk/test/CodeGen/X86/i386-shrink-wrapping.ll
> > > > >> >>
> > llvm/trunk/test/CodeGen/X86/pr32108.ll
> > > > >> >>
> > llvm/trunk/test/CodeGen/X86/pr38533.ll
> > > > >> >>
> > llvm/trunk/test/CodeGen/X86/win64_vararg.ll
> > > > >> >>
> > > > >> >> Modified:
> > > > llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> > > > >> >> URL:
> > > > >> >>
> > > > >>
> > > > >>
> > > >
> > >
> >
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff
> > <
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff
> >
> > >
> > <
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff
> <
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff
> >>
> > > >
> > >
> > <
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff
> <
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff>
> <
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff
> <
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff
> >>>
> > > >
> > > > >>
> > > > >>
> > > > >>
> > > >
> > >
> > <
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff
> <
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff>
> <
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff
> <
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff
> >>
> > > >
> > >
> > <
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff
> <
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff>
> <
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff
> <
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff
> >>>>
> > > >
> > > > >>
> > > > >>
> > > > >> >>
> > > > >> >>
> > > > >>
> > > > >>
> > > >
> > >
> >
> ==============================================================================
> > > >
> > > > >>
> > > > >>
> > > > >> >>
> > > > >> >> ---
> > > llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> > > > >> (original)
> > > > >> >> +++
> > > llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Wed
> > > > >> Oct 10
> > > > >> >> 07:15:52 2018
> > > > >> >> @@ -250,6 +250,11 @@ namespace {
> > > > >> >> SDValue
> > SplitIndexingFromLoad(LoadSDNode *LD);
> > > > >> >> bool SliceUpLoad(SDNode *N);
> > > > >> >> + // Scalars have size 0 to
> > distinguish from
> > > singleton
> > > > >> vectors.
> > > > >> >> + SDValue
> > > ForwardStoreValueToDirectLoad(LoadSDNode *LD);
> > > > >> >> + bool
> > getTruncatedStoreValue(StoreSDNode *ST,
> > > > SDValue &Val);
> > > > >> >> + bool
> > extendLoadedValueToExtension(LoadSDNode
> > > *LD,
> > > > SDValue
> > > > >> &Val);
> > > > >> >> +
> > > > >> >> /// Replace an
> > ISD::EXTRACT_VECTOR_ELT of a
> > > load
> > > > with a
> > > > >> narrowed
> > > > >> >> /// load.
> > > > >> >> ///
> > > > >> >> @@ -12762,6 +12767,133 @@ SDValue
> > > > >> DAGCombiner::SplitIndexingFromLo
> > > > >> >> return DAG.getNode(Opc, SDLoc(LD),
> > > > BP.getSimpleValueType(),
> > > > >> BP, Inc);
> > > > >> >> }
> > > > >> >> +static inline int
> > numVectorEltsOrZero(EVT T) {
> > > > >> >> + return T.isVector() ?
> > T.getVectorNumElements()
> > > : 0;
> > > > >> >> +}
> > > > >> >> +
> > > > >> >> +bool
> > > DAGCombiner::getTruncatedStoreValue(StoreSDNode *ST,
> > > > >> SDValue
> > > > >> >> &Val) {
> > > > >> >> + Val = ST->getValue();
> > > > >> >> + EVT STType = Val.getValueType();
> > > > >> >> + EVT STMemType = ST->getMemoryVT();
> > > > >> >> + if (STType == STMemType)
> > > > >> >> + return true;
> > > > >> >> + if (isTypeLegal(STMemType))
> > > > >> >> + return false; // fail.
> > > > >> >> + if (STType.isFloatingPoint() &&
> > > > STMemType.isFloatingPoint() &&
> > > > >> >> +
> TLI.isOperationLegal(ISD::FTRUNC,
> > > STMemType)) {
> > > > >> >> + Val = DAG.getNode(ISD::FTRUNC,
> > SDLoc(ST),
> > > > STMemType, Val);
> > > > >> >> + return true;
> > > > >> >> + }
> > > > >> >> + if (numVectorEltsOrZero(STType) ==
> > > > >> numVectorEltsOrZero(STMemType) &&
> > > > >> >> + STType.isInteger() &&
> > STMemType.isInteger()) {
> > > > >> >> + Val = DAG.getNode(ISD::TRUNCATE,
> > SDLoc(ST),
> > > > STMemType, Val);
> > > > >> >> + return true;
> > > > >> >> + }
> > > > >> >> + if (STType.getSizeInBits() ==
> > > > STMemType.getSizeInBits()) {
> > > > >> >> + Val = DAG.getBitcast(STMemType,
> Val);
> > > > >> >> + return true;
> > > > >> >> + }
> > > > >> >> + return false; // fail.
> > > > >> >> +}
> > > > >> >> +
> > > > >> >> +bool
> > > >
> > DAGCombiner::extendLoadedValueToExtension(LoadSDNode *LD,
> > > > >> >> SDValue &Val) {
> > > > >> >> + EVT LDMemType = LD->getMemoryVT();
> > > > >> >> + EVT LDType = LD->getValueType(0);
> > > > >> >> + assert(Val.getValueType() ==
> > LDMemType &&
> > > > >> >> + "Attempting to extend value
> > of non-matching
> > > > type");
> > > > >> >> + if (LDType == LDMemType)
> > > > >> >> + return true;
> > > > >> >> + if (LDMemType.isInteger() &&
> > LDType.isInteger()) {
> > > > >> >> + switch (LD->getExtensionType()) {
> > > > >> >> + case ISD::NON_EXTLOAD:
> > > > >> >> + Val = DAG.getBitcast(LDType,
> Val);
> > > > >> >> + return true;
> > > > >> >> + case ISD::EXTLOAD:
> > > > >> >> + Val =
> > DAG.getNode(ISD::ANY_EXTEND, SDLoc(LD),
> > > > LDType,
> > > > >> Val);
> > > > >> >> + return true;
> > > > >> >> + case ISD::SEXTLOAD:
> > > > >> >> + Val =
> > DAG.getNode(ISD::SIGN_EXTEND, SDLoc(LD),
> > > > LDType,
> > > > >> Val);
> > > > >> >> + return true;
> > > > >> >> + case ISD::ZEXTLOAD:
> > > > >> >> + Val =
> > DAG.getNode(ISD::ZERO_EXTEND, SDLoc(LD),
> > > > LDType,
> > > > >> Val);
> > > > >> >> + return true;
> > > > >> >> + }
> > > > >> >> + }
> > > > >> >> + return false;
> > > > >> >> +}
> > > > >> >> +
> > > > >> >> +SDValue
> > > >
> DAGCombiner::ForwardStoreValueToDirectLoad(LoadSDNode
> > > > >> *LD) {
> > > > >> >> + if (OptLevel == CodeGenOpt::None ||
> > > LD->isVolatile())
> > > > >> >> + return SDValue();
> > > > >> >> + SDValue Chain = LD->getOperand(0);
> > > > >> >> + StoreSDNode *ST =
> > > dyn_cast<StoreSDNode>(Chain.getNode());
> > > > >> >> + if (!ST || ST->isVolatile())
> > > > >> >> + return SDValue();
> > > > >> >> +
> > > > >> >> + EVT LDType = LD->getValueType(0);
> > > > >> >> + EVT LDMemType = LD->getMemoryVT();
> > > > >> >> + EVT STMemType = ST->getMemoryVT();
> > > > >> >> + EVT STType =
> > ST->getValue().getValueType();
> > > > >> >> +
> > > > >> >> + BaseIndexOffset BasePtrLD =
> > > > BaseIndexOffset::match(LD, DAG);
> > > > >> >> + BaseIndexOffset BasePtrST =
> > > > BaseIndexOffset::match(ST, DAG);
> > > > >> >> + int64_t Offset;
> > > > >> >> +
> > > > >> >> + bool STCoversLD =
> > > > >> >> +
> > BasePtrST.equalBaseIndex(BasePtrLD, DAG,
> > > Offset) &&
> > > > >> (Offset >=
> > > > >> >> 0) &&
> > > > >> >> + (Offset * 8 <=
> > LDMemType.getSizeInBits()) &&
> > > > >> >> + (Offset * 8 +
> > LDMemType.getSizeInBits() <=
> > > > >> >> STMemType.getSizeInBits());
> > > > >> >> +
> > > > >> >> + if (!STCoversLD)
> > > > >> >> + return SDValue();
> > > > >> >> +
> > > > >> >> + // Memory as copy space
> > (potentially masked).
> > > > >> >> + if (Offset == 0 && LDType == STType
> &&
> > > STMemType ==
> > > > >> LDMemType) {
> > > > >> >> + // Simple case: Direct
> > non-truncating forwarding
> > > > >> >> + if (LDType.getSizeInBits() ==
> > > > LDMemType.getSizeInBits())
> > > > >> >> + return CombineTo(LD,
> > ST->getValue(), Chain);
> > > > >> >> + // Can we model the truncate and
> > extension
> > > with an
> > > > and mask?
> > > > >> >> + if (STType.isInteger() &&
> > > LDMemType.isInteger() &&
> > > > >> >> !STType.isVector() &&
> > > > >> >> + !LDMemType.isVector() &&
> > > LD->getExtensionType() !=
> > > > >> >> ISD::SEXTLOAD) {
> > > > >> >> + // Mask to size of LDMemType
> > > > >> >> + auto Mask =
> > > > >> >> +
> > > > >>
> > DAG.getConstant(APInt::getLowBitsSet(STType.getSizeInBits(),
> > > > >> >> +
> > > > >> >> STMemType.getSizeInBits()),
> > > > >> >> + SDLoc(ST),
> > STType);
> > > > >> >> + auto Val = DAG.getNode(ISD::AND,
> > > SDLoc(LD), LDType,
> > > > >> >> ST->getValue(), Mask);
> > > > >> >> + return CombineTo(LD, Val,
> Chain);
> > > > >> >> + }
> > > > >> >> + }
> > > > >> >> +
> > > > >> >> + // TODO: Deal with nonzero offset.
> > > > >> >> + if (LD->getBasePtr().isUndef() ||
> > Offset != 0)
> > > > >> >> + return SDValue();
> > > > >> >> + // Model necessary truncations /
> > extenstions.
> > > > >> >> + SDValue Val;
> > > > >> >> + // Truncate Value To Stored Memory
> > Size.
> > > > >> >> + do {
> > > > >> >> + if (!getTruncatedStoreValue(ST,
> Val))
> > > > >> >> + continue;
> > > > >> >> + if (!isTypeLegal(LDMemType))
> > > > >> >> + continue;
> > > > >> >> + if (STMemType != LDMemType) {
> > > > >> >> + if
> > (numVectorEltsOrZero(STMemType) ==
> > > > >> >> numVectorEltsOrZero(LDMemType) &&
> > > > >> >> + STMemType.isInteger() &&
> > > LDMemType.isInteger())
> > > > >> >> + Val =
> > DAG.getNode(ISD::TRUNCATE, SDLoc(LD),
> > > > LDMemType,
> > > > >> Val);
> > > > >> >> + else
> > > > >> >> + continue;
> > > > >> >> + }
> > > > >> >> + if
> > (!extendLoadedValueToExtension(LD, Val))
> > > > >> >> + continue;
> > > > >> >> + return CombineTo(LD, Val, Chain);
> > > > >> >> + } while (false);
> > > > >> >> +
> > > > >> >> + // On failure, cleanup dead nodes
> > we may have
> > > created.
> > > > >> >> + if (Val->use_empty())
> > > > >> >> + deleteAndRecombine(Val.getNode());
> > > > >> >> + return SDValue();
> > > > >> >> +}
> > > > >> >> +
> > > > >> >> SDValue
> > DAGCombiner::visitLOAD(SDNode *N) {
> > > > >> >> LoadSDNode *LD =
> cast<LoadSDNode>(N);
> > > > >> >> SDValue Chain = LD->getChain();
> > > > >> >> @@ -12828,17 +12960,8 @@ SDValue
> > > > DAGCombiner::visitLOAD(SDNode *N
> > > > >> >> // If this load is directly
> > stored, replace
> > > the load
> > > > value
> > > > >> with
> > > > >> >> the stored
> > > > >> >> // value.
> > > > >> >> - // TODO: Handle store large -> read
> > small portion.
> > > > >> >> - // TODO: Handle TRUNCSTORE/LOADEXT
> > > > >> >> - if (OptLevel != CodeGenOpt::None &&
> > > > >> >> - ISD::isNormalLoad(N) &&
> > !LD->isVolatile()) {
> > > > >>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20181023/12018cfa/attachment-0001.html>
More information about the llvm-commits
mailing list