<div dir="auto">Given it's an out of tree test, its probably best if you write the patch and I'll quickly LGTM it.<div dir="auto"><br></div><div dir="auto">Nirav</div></div><br><div class="gmail_quote"><div dir="ltr">On Tue, Oct 23, 2018, 04:26 Mikael Holmén <<a href="mailto:mikael.holmen@ericsson.com">mikael.holmen@ericsson.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br>
<br>
On 10/17/18 5:00 PM, Nirav Davé wrote:<br>
> Regardless, restricting this optimization to only byte-multiples<br>
> seems like the way to go then.<br>
<br>
Alright, should I create a patch with that or will you?<br>
<br>
I'll be out of office for a week now but I can create a patch after that <br>
and add you as reviewer if noone beats me to it.<br>
<br>
Regards,<br>
Mikael<br>
<br>
> <br>
> On Wed, Oct 17, 2018 at 2:34 AM, Mikael Holmén<br>
> <<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a> <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a>>> wrote:<br>
> <br>
> Hi,<br>
> <br>
> On 10/16/2018 10:28 PM, Nirav Davé wrote:<br>
> > I thought the current state of non-8-bit bytes was not yet viable as <br>
> > things were baked in pretty hard.<br>
> <br>
> Yes there are many "* 8", "/ 8", ">> 3" etc in-tree that are<br>
> connected<br>
> to the byte size, but we've changed most of them in our clone and<br>
> generalized things so we use the bytesize that we've added to the<br>
> DataLayout instead. It works :)<br>
> <br>
> Every now and then there pops up a new hardcoded 8, 3 or similar<br>
> in-tree<br>
> that we must deal with but most of the time we notice that<br>
> pretty quickly.<br>
> <br>
> > Insisting that the LDType and STType <br>
> > are some multiple of Byte should be sufficient, but that's probably <br>
> > something that exists only for you.<br>
> > <br>
> > Can you try replacing with the following additional type legality check?<br>
> > <br>
> > -Nirav<br>
> > + if (DAG.getDataLayout().isBigEndian()) {<br>
> > + // Avoid dealing with big endian type legalization.<br>
> > + if (!TLI.isTypeLegal(STType) || !TLI.isTypeLegal(LDType))<br>
> > + return SDValue();<br>
> > + Offset =<br>
> > + (STMemType.getSizeInBits() - LDMemType.getSizeInBits()) / 8 -<br>
> > Offset;<br>
> > + }<br>
> > +<br>
> <br>
> Unfortunately it doesn't help in this case.<br>
> <br>
> We have<br>
> <br>
> STType: i40<br>
> LDType: i32<br>
> <br>
> and both i32 and i40 are legal on my target. (Yes, we've added<br>
> i40 to<br>
> MVT which has also been interesting, there are a couple of<br>
> places where<br>
> it's assumed that the MVTs are all power-of-2, which i40 clearly<br>
> isn't).<br>
> <br>
> Checking that the sizes of LDType and STType are multiple of<br>
> bytes like<br>
> <br>
> // Normalize for Endianness.<br>
> if (DAG.getDataLayout().isBigEndian()) {<br>
> // Avoid dealing with big endian type legalization.<br>
> if (!TLI.isTypeLegal(STType) || !TLI.isTypeLegal(LDType) ||<br>
> (LDType.getSizeInBits() % bitsPerByte() != 0) ||<br>
> (STType.getSizeInBits() % bitsPerByte() != 0))<br>
> return SDValue();<br>
> Offset =<br>
> (STMemType.getSizeInBits() - LDMemType.getSizeInBits()) /<br>
> bitsPerByte() - Offset;<br>
> }<br>
> <br>
> seems to work since it then bails out on the i40.<br>
> <br>
> <br>
> I still have a nagging feeling that this could be solved by<br>
> involving<br>
> getStoreSizeInBits() instead of, or in addition to,<br>
> getSizeInBits() in<br>
> some way though. If we calculate Offset as<br>
> <br>
> Offset =<br>
> (STMemType.getStoreSizeInBits() -<br>
> LDMemType.getStoreSizeInBits())<br>
> / bitsPerByte() - Offset;<br>
> <br>
> then we would get (48 - 32)/16 - 0 so Offset would be 1 and then<br>
> we'd<br>
> abort transformation with<br>
> <br>
> // TODO: Deal with nonzero offset.<br>
> if (LD->getBasePtr().isUndef() || Offset != 0)<br>
> return SDValue();<br>
> <br>
> so that would work too in this particular case but I don't know if<br>
> that's correct or not.<br>
> <br>
> I also don't know if this is really just a problem for bigendian<br>
> targets<br>
> or not, or if it's indeed just a problem for my target with all its<br>
> oddities.<br>
> <br>
> Thanks,<br>
> Mikael<br>
> <br>
> > <br>
> > <br>
> > <br>
> > <br>
> > <br>
> > <br>
> > 16-bit bytes? My understanding is that we've baked in 8-bit bytes pretty <br>
> > strongly into the backend. Are you<br>
> > <br>
> > <br>
> > On Mon, Oct 15, 2018 at 2:43 AM, Mikael Holmén <br>
> > <<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a><br>
> <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a>><br>
> <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a><br>
> <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a>>>> wrote:<br>
> ><br>
> > Hi,<br>
> ><br>
> > On 10/12/2018 04:03 PM, Nirav Davé wrote:<br>
> > > Hmm. I wonder if this is an issue with bigendian vector<br>
> representations.<br>
> > ><br>
> ><br>
> > No vectors involved in this case though.<br>
> ><br>
> > I think it's a size vs stores size issue, possibly that<br>
> might affect<br>
> > bigendian vectors too? I don't know.<br>
> ><br>
> > > I suspect there are not too many cases where we<br>
> actually get to the<br>
> > > getTruncatedStoreValue in your case. Can you tell me<br>
> what STType,<br>
> > > LDType, STMemType, LDMemType, and final offset are when<br>
> we get there for<br>
> > > your test case?<br>
> ><br>
> > When we reach getTruncatedStoreValue we have the<br>
> following DAG:<br>
> ><br>
> > SelectionDAG has 22 nodes:<br>
> > t0: ch = EntryToken<br>
> > t2: i24,ch = CopyFromReg t0, Register:i24 %0<br>
> > t23: ch = TokenFactor t2:1, t22<br>
> > t32: i32 = srl t38, Constant:i16<16><br>
> > t17: ch,glue = CopyToReg t23, Register:i32 $a1_32, t32<br>
> > t27: i16 = add FrameIndex:i16<0>, Constant:i16<2><br>
> > t34: i32,ch = load<(dereferenceable load 1 from<br>
> %ir._p48 + 2),<br>
> > zext from i16> t22, t27, undef:i16<br>
> > t30: i32 = shl t38, Constant:i16<16><br>
> > t37: i32 = add t34, t30<br>
> > t19: ch,glue = CopyToReg t17, Register:i32 $a0_32,<br>
> t37, t17:1<br>
> > t24: i40 = any_extend t2<br>
> > t6: i40 = shl nuw t24, Constant:i16<16><br>
> > t22: ch = store<(store 3 into %ir._p40, align 1)> t0, t6,<br>
> > FrameIndex:i16<0>, undef:i16<br>
> > t38: i32,ch = load<(dereferenceable load 2 from<br>
> %ir._p48, align 1)><br>
> > t22, FrameIndex:i16<0>, undef:i16<br>
> > t20: ch = PHXISD::RETURN t19, Register:i32 $a1_32,<br>
> Register:i32<br>
> > $a0_32, t19:1<br>
> ><br>
> > and the following types<br>
> ><br>
> > STType: i40<br>
> > LDType: i32<br>
> > STMemType: i40<br>
> > LDMemType: i32<br>
> ><br>
> > The Offset is 0.<br>
> ><br>
> > Complicating factor: the byte size on my target is 16,<br>
> not 8, so we've<br>
> > replaced the hardcoded 8 in "* 8" and "/ 8" with<br>
> "bitsPerByte()" which<br>
> > in our case is 16 so the code looks like:<br>
> ><br>
> > bool STCoversLD =<br>
> > BasePtrST.equalBaseIndex(BasePtrLD, DAG, Offset)<br>
> && (Offset<br>
> > >= 0) &&<br>
> > (Offset * bitsPerByte() <=<br>
> LDMemType.getSizeInBits()) &&<br>
> > (Offset * bitsPerByte() +<br>
> LDMemType.getSizeInBits() <=<br>
> > STMemType.getSizeInBits());<br>
> ><br>
> > if (!STCoversLD)<br>
> > return SDValue();<br>
> ><br>
> > // Normalize for Endianness.<br>
> > if (DAG.getDataLayout().isBigEndian())<br>
> > Offset =<br>
> > (STMemType.getSizeInBits() -<br>
> LDMemType.getSizeInBits()) /<br>
> > bitsPerByte() - Offset;<br>
> ><br>
> > The Offset was thus calculated as<br>
> ><br>
> > (40 - 32)/16 - 0<br>
> ><br>
> > which I suspect is incorrect.<br>
> ><br>
> > STMemType is i40, so the size in bits is 40. However, the<br>
> store size<br>
> > for<br>
> > i40 is 48!<br>
> ><br>
> > So I think that we need to involve the store size in some<br>
> way here.<br>
> ><br>
> > Regards,<br>
> > Mikael<br>
> ><br>
> > ><br>
> > > -Nirav<br>
> > ><br>
> > ><br>
> > ><br>
> > ><br>
> > ><br>
> > ><br>
> > ><br>
> > > On Fri, Oct 12, 2018 at 9:33 AM, Mikael Holmén<br>
> > > <<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a><br>
> <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a>><br>
> <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a><br>
> <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a>>><br>
> > <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a><br>
> <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a>><br>
> > <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a><br>
> <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a>>>>> wrote:<br>
> > ><br>
> > > Hi again,<br>
> > ><br>
> > > There seems to be another problem too.<br>
> > ><br>
> > > I don't have an in-tree reproducer for it but I<br>
> think the problem<br>
> > > has to<br>
> > > do with cases where the store size of the VT is<br>
> larger than the<br>
> > > "normal"<br>
> > > size.<br>
> > ><br>
> > ><br>
> > > Then I think the new code can miscalculate what<br>
> part of the<br>
> > stored<br>
> > > value<br>
> > > it should use instead of the load. Not sure if<br>
> this is only a<br>
> > problem<br>
> > > for bigendian targets or not.<br>
> > ><br>
> > > The end result is code that fails at runtime so<br>
> it's quite nasty.<br>
> > ><br>
> > ><br>
> > ><br>
> > ><br>
> > ><br>
> > > Regards,<br>
> > > Mikael<br>
> > ><br>
> > > On 10/12/2018 08:08 AM, Mikael Holmén wrote:<br>
> > > > Yep, thanks!<br>
> > > ><br>
> > > > Regards,<br>
> > > > Mikael<br>
> > > ><br>
> > > > On 10/11/2018 08:44 PM, Nirav Davé wrote:<br>
> > > >> Should be fixed in rL344272.<br>
> > > >><br>
> > > >> Thanks for the catch.<br>
> > > >><br>
> > > >> -Nirav<br>
> > > >><br>
> > > >> On Thu, Oct 11, 2018 at 7:41 AM, Mikael Holmén<br>
> > > >> <<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a><br>
> <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a>><br>
> > <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a><br>
> <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a>>><br>
> > <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a><br>
> <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a>><br>
> <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a><br>
> <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a>>>><br>
> > > <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a><br>
> <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a>><br>
> > <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a><br>
> <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a>>><br>
> > > <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a><br>
> <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a>><br>
> > <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a><br>
> <mailto:<a href="mailto:mikael.holmen@ericsson.com" target="_blank" rel="noreferrer">mikael.holmen@ericsson.com</a>>>>>> wrote:<br>
> > > >><br>
> > > >> Hi,<br>
> > > >><br>
> > > >> Reproducer for powerpc:<br>
> > > >><br>
> > > >> llc bug.ll -o - -O1<br>
> > > >><br>
> > > >> Without the patch we get<br>
> > > >><br>
> > > >> addis 4, 2, .LC0@toc@ha<br>
> > > >> sth 3, -2(1)<br>
> > > >> ld 4, .LC0@toc@l(4)<br>
> > > >> lbz 3, -2(1)<br>
> > > >> stb 3, 0(4)<br>
> > > >> blr<br>
> > > >><br>
> > > >> and with<br>
> > > >><br>
> > > >> addis 4, 2, .LC0@toc@ha<br>
> > > >> sth 3, -2(1)<br>
> > > >> ld 4, .LC0@toc@l(4)<br>
> > > >> stb 3, 0(4)<br>
> > > >> blr<br>
> > > >><br>
> > > >> Admittedly I'm no ppc expert but I think<br>
> the final<br>
> > stb will<br>
> > > write<br>
> > > >> bits<br>
> > > >> [0-7] of 3 to 0(4).<br>
> > > >><br>
> > > >> Before your patch those 8 bits were setup<br>
> from the<br>
> > > >><br>
> > > >> lbz 3, -2(1)<br>
> > > >><br>
> > > >> but with the patch, I think the bits we're<br>
> after are<br>
> > placed<br>
> > > at bit<br>
> > > >> [8-15] in 3 so we'll get the wrong byte.<br>
> > > >><br>
> > > >> /Mikael<br>
> > > >><br>
> > > >> On 10/11/2018 12:54 PM, Mikael Holmén wrote:<br>
> > > >> > Hi Nirav,<br>
> > > >> ><br>
> > > >> > I suspect that this patch doesn't<br>
> handle big<br>
> > endian targets<br>
> > > >> correctly.<br>
> > > >> ><br>
> > > >> > I'll try to get back with a reproducer<br>
> for some<br>
> > in-tree<br>
> > > target,<br>
> > > >> but for<br>
> > > >> > my out-of-tree bigendian target it<br>
> looks like it<br>
> > changes<br>
> > > somethng<br>
> > > >> like<br>
> > > >> ><br>
> > > >> > store i16 %v, i16* %p16<br>
> > > >> > %p8 = bitcast i16* %p16 to i8*<br>
> > > >> > %ld = load i16, i16* %p8<br>
> > > >> ><br>
> > > >> > to<br>
> > > >> ><br>
> > > >> > store i16 %v, i16* %p16<br>
> > > >> > %ld = truncate i16 %v to i8<br>
> > > >> ><br>
> > > >> > but I think it should rather be<br>
> > > >> ><br>
> > > >> > store i16 %v, i16* %p16<br>
> > > >> > %tmp = lshr i16 %v, 8<br>
> > > >> > %ld = truncate i16 %tmp to i8<br>
> > > >> ><br>
> > > >> > I.e. if the target is bigendian, the<br>
> load will<br>
> > read the<br>
> > > high 8<br>
> > > >> bits from<br>
> > > >> > %v rather than the low. And the<br>
> truncate that this<br>
> > patch<br>
> > > >> generates gives<br>
> > > >> > us the low bits.<br>
> > > >> ><br>
> > > >> > Regards,<br>
> > > >> > Mikael<br>
> > > >> ><br>
> > > >> > On 10/10/2018 04:15 PM, Nirav Dave via<br>
> > llvm-commits wrote:<br>
> > > >> >> Author: niravd<br>
> > > >> >> Date: Wed Oct 10 07:15:52 2018<br>
> > > >> >> New Revision: 344142<br>
> > > >> >><br>
> > > >> >> URL:<br>
> > ><br>
> <a href="http://llvm.org/viewvc/llvm-project?rev=344142&view=rev" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=344142&view=rev</a><br>
> <<a href="http://llvm.org/viewvc/llvm-project?rev=344142&view=rev" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=344142&view=rev</a>><br>
> > <<a href="http://llvm.org/viewvc/llvm-project?rev=344142&view=rev" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=344142&view=rev</a><br>
> <<a href="http://llvm.org/viewvc/llvm-project?rev=344142&view=rev" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=344142&view=rev</a>>><br>
> > > <br>
> <<a href="http://llvm.org/viewvc/llvm-project?rev=344142&view=rev" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=344142&view=rev</a><br>
> <<a href="http://llvm.org/viewvc/llvm-project?rev=344142&view=rev" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=344142&view=rev</a>><br>
> > <<a href="http://llvm.org/viewvc/llvm-project?rev=344142&view=rev" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=344142&view=rev</a><br>
> <<a href="http://llvm.org/viewvc/llvm-project?rev=344142&view=rev" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=344142&view=rev</a>>>><br>
> > > >><br>
> > <<a href="http://llvm.org/viewvc/llvm-project?rev=344142&view=rev" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=344142&view=rev</a><br>
> <<a href="http://llvm.org/viewvc/llvm-project?rev=344142&view=rev" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=344142&view=rev</a>><br>
> > <<a href="http://llvm.org/viewvc/llvm-project?rev=344142&view=rev" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=344142&view=rev</a><br>
> <<a href="http://llvm.org/viewvc/llvm-project?rev=344142&view=rev" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=344142&view=rev</a>>><br>
> > > <br>
> <<a href="http://llvm.org/viewvc/llvm-project?rev=344142&view=rev" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=344142&view=rev</a><br>
> <<a href="http://llvm.org/viewvc/llvm-project?rev=344142&view=rev" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=344142&view=rev</a>><br>
> > <<a href="http://llvm.org/viewvc/llvm-project?rev=344142&view=rev" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=344142&view=rev</a><br>
> <<a href="http://llvm.org/viewvc/llvm-project?rev=344142&view=rev" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=344142&view=rev</a>>>>><br>
> > > >> >> Log:<br>
> > > >> >> [DAGCombine] Improve Load-Store Forwarding<br>
> > > >> >><br>
> > > >> >> Summary:<br>
> > > >> >> Extend analysis forwarding loads from<br>
> preceeding<br>
> > stores to<br>
> > > >> work with<br>
> > > >> >> extended loads and truncated stores to<br>
> the same<br>
> > address<br>
> > > so long<br>
> > > >> as the<br>
> > > >> >> load is fully subsumed by the store.<br>
> > > >> >><br>
> > > >> >> Hexagon's swp-epilog-phis.ll and<br>
> > swp-memrefs-epilog1.ll<br>
> > > test are<br>
> > > >> >> deleted as they've no longer seem to<br>
> be relevant.<br>
> > > >> >><br>
> > > >> >> Reviewers: RKSimon, rnk, kparzysz,<br>
> javed.absar<br>
> > > >> >><br>
> > > >> >> Subscribers: sdardis, nemanjai,<br>
> hiraditya, atanasyan,<br>
> > > >> llvm-commits<br>
> > > >> >><br>
> > > >> >> Differential Revision:<br>
> > <a href="https://reviews.llvm.org/D49200" rel="noreferrer noreferrer" target="_blank">https://reviews.llvm.org/D49200</a><br>
> <<a href="https://reviews.llvm.org/D49200" rel="noreferrer noreferrer" target="_blank">https://reviews.llvm.org/D49200</a>><br>
> <<a href="https://reviews.llvm.org/D49200" rel="noreferrer noreferrer" target="_blank">https://reviews.llvm.org/D49200</a> <<a href="https://reviews.llvm.org/D49200" rel="noreferrer noreferrer" target="_blank">https://reviews.llvm.org/D49200</a>>><br>
> > > <<a href="https://reviews.llvm.org/D49200" rel="noreferrer noreferrer" target="_blank">https://reviews.llvm.org/D49200</a><br>
> <<a href="https://reviews.llvm.org/D49200" rel="noreferrer noreferrer" target="_blank">https://reviews.llvm.org/D49200</a>><br>
> > <<a href="https://reviews.llvm.org/D49200" rel="noreferrer noreferrer" target="_blank">https://reviews.llvm.org/D49200</a><br>
> <<a href="https://reviews.llvm.org/D49200" rel="noreferrer noreferrer" target="_blank">https://reviews.llvm.org/D49200</a>>>><br>
> > > >> <<a href="https://reviews.llvm.org/D49200" rel="noreferrer noreferrer" target="_blank">https://reviews.llvm.org/D49200</a><br>
> <<a href="https://reviews.llvm.org/D49200" rel="noreferrer noreferrer" target="_blank">https://reviews.llvm.org/D49200</a>><br>
> > <<a href="https://reviews.llvm.org/D49200" rel="noreferrer noreferrer" target="_blank">https://reviews.llvm.org/D49200</a><br>
> <<a href="https://reviews.llvm.org/D49200" rel="noreferrer noreferrer" target="_blank">https://reviews.llvm.org/D49200</a>>><br>
> > > <<a href="https://reviews.llvm.org/D49200" rel="noreferrer noreferrer" target="_blank">https://reviews.llvm.org/D49200</a><br>
> <<a href="https://reviews.llvm.org/D49200" rel="noreferrer noreferrer" target="_blank">https://reviews.llvm.org/D49200</a>><br>
> > <<a href="https://reviews.llvm.org/D49200" rel="noreferrer noreferrer" target="_blank">https://reviews.llvm.org/D49200</a><br>
> <<a href="https://reviews.llvm.org/D49200" rel="noreferrer noreferrer" target="_blank">https://reviews.llvm.org/D49200</a>>>>><br>
> > > >> >><br>
> > > >> >> Removed:<br>
> > > >> >><br>
> > llvm/trunk/test/CodeGen/Hexagon/swp-epilog-phis.ll<br>
> > > >> >><br>
> > llvm/trunk/test/CodeGen/Hexagon/swp-memrefs-epilog1.ll<br>
> > > >> >> Modified:<br>
> > > >> >><br>
> > llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp<br>
> > > >> >><br>
> > llvm/trunk/test/CodeGen/AArch64/arm64-ld-from-st.ll<br>
> > > >> >><br>
> > > <br>
> llvm/trunk/test/CodeGen/AArch64/regress-tblgen-chains.ll<br>
> > > >> >><br>
> > llvm/trunk/test/CodeGen/Hexagon/clr_set_toggle.ll<br>
> > > >> >> <br>
> llvm/trunk/test/CodeGen/Mips/cconv/vector.ll<br>
> > > >> >><br>
> > > >><br>
> > <br>
> llvm/trunk/test/CodeGen/Mips/indirect-jump-hazard/jumptables.ll<br>
> > > >> >> <br>
> llvm/trunk/test/CodeGen/Mips/o32_cc_byval.ll<br>
> > > >> >> <br>
> llvm/trunk/test/CodeGen/Mips/o32_cc_vararg.ll<br>
> > > >> >><br>
> > llvm/trunk/test/CodeGen/PowerPC/addi-offset-fold.ll<br>
> > > >> >><br>
> > > <br>
> llvm/trunk/test/CodeGen/SystemZ/store_nonbytesized_vecs.ll<br>
> > > >> >><br>
> > llvm/trunk/test/CodeGen/X86/i386-shrink-wrapping.ll<br>
> > > >> >> <br>
> llvm/trunk/test/CodeGen/X86/pr32108.ll<br>
> > > >> >> <br>
> llvm/trunk/test/CodeGen/X86/pr38533.ll<br>
> > > >> >> <br>
> llvm/trunk/test/CodeGen/X86/win64_vararg.ll<br>
> > > >> >><br>
> > > >> >> Modified:<br>
> > > llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp<br>
> > > >> >> URL:<br>
> > > >> >><br>
> > > >><br>
> > > >><br>
> > ><br>
> ><br>
> <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff</a><br>
> <<a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff</a>><br>
> > <br>
> <<a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff</a> <<a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff</a>>><br>
> > ><br>
> > <br>
> <<a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff</a> <<a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff</a>> <<a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff</a> <<a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff</a>>>><br>
> > ><br>
> > > >><br>
> > > >><br>
> > > >><br>
> > ><br>
> > <br>
> <<a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff</a> <<a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff</a>> <<a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff</a> <<a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff</a>>><br>
> > ><br>
> > <br>
> <<a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff</a> <<a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff</a>> <<a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff</a> <<a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff" rel="noreferrer noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=344142&r1=344141&r2=344142&view=diff</a>>>>><br>
> > ><br>
> > > >><br>
> > > >><br>
> > > >> >><br>
> > > >> >><br>
> > > >><br>
> > > >><br>
> > ><br>
> > <br>
> ==============================================================================<br>
> > ><br>
> > > >><br>
> > > >><br>
> > > >> >><br>
> > > >> >> ---<br>
> > llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp<br>
> > > >> (original)<br>
> > > >> >> +++<br>
> > llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Wed<br>
> > > >> Oct 10<br>
> > > >> >> 07:15:52 2018<br>
> > > >> >> @@ -250,6 +250,11 @@ namespace {<br>
> > > >> >> SDValue<br>
> SplitIndexingFromLoad(LoadSDNode *LD);<br>
> > > >> >> bool SliceUpLoad(SDNode *N);<br>
> > > >> >> + // Scalars have size 0 to<br>
> distinguish from<br>
> > singleton<br>
> > > >> vectors.<br>
> > > >> >> + SDValue<br>
> > ForwardStoreValueToDirectLoad(LoadSDNode *LD);<br>
> > > >> >> + bool<br>
> getTruncatedStoreValue(StoreSDNode *ST,<br>
> > > SDValue &Val);<br>
> > > >> >> + bool<br>
> extendLoadedValueToExtension(LoadSDNode<br>
> > *LD,<br>
> > > SDValue<br>
> > > >> &Val);<br>
> > > >> >> +<br>
> > > >> >> /// Replace an<br>
> ISD::EXTRACT_VECTOR_ELT of a<br>
> > load<br>
> > > with a<br>
> > > >> narrowed<br>
> > > >> >> /// load.<br>
> > > >> >> ///<br>
> > > >> >> @@ -12762,6 +12767,133 @@ SDValue<br>
> > > >> DAGCombiner::SplitIndexingFromLo<br>
> > > >> >> return DAG.getNode(Opc, SDLoc(LD),<br>
> > > BP.getSimpleValueType(),<br>
> > > >> BP, Inc);<br>
> > > >> >> }<br>
> > > >> >> +static inline int<br>
> numVectorEltsOrZero(EVT T) {<br>
> > > >> >> + return T.isVector() ?<br>
> T.getVectorNumElements()<br>
> > : 0;<br>
> > > >> >> +}<br>
> > > >> >> +<br>
> > > >> >> +bool<br>
> > DAGCombiner::getTruncatedStoreValue(StoreSDNode *ST,<br>
> > > >> SDValue<br>
> > > >> >> &Val) {<br>
> > > >> >> + Val = ST->getValue();<br>
> > > >> >> + EVT STType = Val.getValueType();<br>
> > > >> >> + EVT STMemType = ST->getMemoryVT();<br>
> > > >> >> + if (STType == STMemType)<br>
> > > >> >> + return true;<br>
> > > >> >> + if (isTypeLegal(STMemType))<br>
> > > >> >> + return false; // fail.<br>
> > > >> >> + if (STType.isFloatingPoint() &&<br>
> > > STMemType.isFloatingPoint() &&<br>
> > > >> >> + TLI.isOperationLegal(ISD::FTRUNC,<br>
> > STMemType)) {<br>
> > > >> >> + Val = DAG.getNode(ISD::FTRUNC,<br>
> SDLoc(ST),<br>
> > > STMemType, Val);<br>
> > > >> >> + return true;<br>
> > > >> >> + }<br>
> > > >> >> + if (numVectorEltsOrZero(STType) ==<br>
> > > >> numVectorEltsOrZero(STMemType) &&<br>
> > > >> >> + STType.isInteger() &&<br>
> STMemType.isInteger()) {<br>
> > > >> >> + Val = DAG.getNode(ISD::TRUNCATE,<br>
> SDLoc(ST),<br>
> > > STMemType, Val);<br>
> > > >> >> + return true;<br>
> > > >> >> + }<br>
> > > >> >> + if (STType.getSizeInBits() ==<br>
> > > STMemType.getSizeInBits()) {<br>
> > > >> >> + Val = DAG.getBitcast(STMemType, Val);<br>
> > > >> >> + return true;<br>
> > > >> >> + }<br>
> > > >> >> + return false; // fail.<br>
> > > >> >> +}<br>
> > > >> >> +<br>
> > > >> >> +bool<br>
> > > <br>
> DAGCombiner::extendLoadedValueToExtension(LoadSDNode *LD,<br>
> > > >> >> SDValue &Val) {<br>
> > > >> >> + EVT LDMemType = LD->getMemoryVT();<br>
> > > >> >> + EVT LDType = LD->getValueType(0);<br>
> > > >> >> + assert(Val.getValueType() ==<br>
> LDMemType &&<br>
> > > >> >> + "Attempting to extend value<br>
> of non-matching<br>
> > > type");<br>
> > > >> >> + if (LDType == LDMemType)<br>
> > > >> >> + return true;<br>
> > > >> >> + if (LDMemType.isInteger() &&<br>
> LDType.isInteger()) {<br>
> > > >> >> + switch (LD->getExtensionType()) {<br>
> > > >> >> + case ISD::NON_EXTLOAD:<br>
> > > >> >> + Val = DAG.getBitcast(LDType, Val);<br>
> > > >> >> + return true;<br>
> > > >> >> + case ISD::EXTLOAD:<br>
> > > >> >> + Val =<br>
> DAG.getNode(ISD::ANY_EXTEND, SDLoc(LD),<br>
> > > LDType,<br>
> > > >> Val);<br>
> > > >> >> + return true;<br>
> > > >> >> + case ISD::SEXTLOAD:<br>
> > > >> >> + Val =<br>
> DAG.getNode(ISD::SIGN_EXTEND, SDLoc(LD),<br>
> > > LDType,<br>
> > > >> Val);<br>
> > > >> >> + return true;<br>
> > > >> >> + case ISD::ZEXTLOAD:<br>
> > > >> >> + Val =<br>
> DAG.getNode(ISD::ZERO_EXTEND, SDLoc(LD),<br>
> > > LDType,<br>
> > > >> Val);<br>
> > > >> >> + return true;<br>
> > > >> >> + }<br>
> > > >> >> + }<br>
> > > >> >> + return false;<br>
> > > >> >> +}<br>
> > > >> >> +<br>
> > > >> >> +SDValue<br>
> > > DAGCombiner::ForwardStoreValueToDirectLoad(LoadSDNode<br>
> > > >> *LD) {<br>
> > > >> >> + if (OptLevel == CodeGenOpt::None ||<br>
> > LD->isVolatile())<br>
> > > >> >> + return SDValue();<br>
> > > >> >> + SDValue Chain = LD->getOperand(0);<br>
> > > >> >> + StoreSDNode *ST =<br>
> > dyn_cast<StoreSDNode>(Chain.getNode());<br>
> > > >> >> + if (!ST || ST->isVolatile())<br>
> > > >> >> + return SDValue();<br>
> > > >> >> +<br>
> > > >> >> + EVT LDType = LD->getValueType(0);<br>
> > > >> >> + EVT LDMemType = LD->getMemoryVT();<br>
> > > >> >> + EVT STMemType = ST->getMemoryVT();<br>
> > > >> >> + EVT STType =<br>
> ST->getValue().getValueType();<br>
> > > >> >> +<br>
> > > >> >> + BaseIndexOffset BasePtrLD =<br>
> > > BaseIndexOffset::match(LD, DAG);<br>
> > > >> >> + BaseIndexOffset BasePtrST =<br>
> > > BaseIndexOffset::match(ST, DAG);<br>
> > > >> >> + int64_t Offset;<br>
> > > >> >> +<br>
> > > >> >> + bool STCoversLD =<br>
> > > >> >> + <br>
> BasePtrST.equalBaseIndex(BasePtrLD, DAG,<br>
> > Offset) &&<br>
> > > >> (Offset >=<br>
> > > >> >> 0) &&<br>
> > > >> >> + (Offset * 8 <=<br>
> LDMemType.getSizeInBits()) &&<br>
> > > >> >> + (Offset * 8 +<br>
> LDMemType.getSizeInBits() <=<br>
> > > >> >> STMemType.getSizeInBits());<br>
> > > >> >> +<br>
> > > >> >> + if (!STCoversLD)<br>
> > > >> >> + return SDValue();<br>
> > > >> >> +<br>
> > > >> >> + // Memory as copy space<br>
> (potentially masked).<br>
> > > >> >> + if (Offset == 0 && LDType == STType &&<br>
> > STMemType ==<br>
> > > >> LDMemType) {<br>
> > > >> >> + // Simple case: Direct<br>
> non-truncating forwarding<br>
> > > >> >> + if (LDType.getSizeInBits() ==<br>
> > > LDMemType.getSizeInBits())<br>
> > > >> >> + return CombineTo(LD,<br>
> ST->getValue(), Chain);<br>
> > > >> >> + // Can we model the truncate and<br>
> extension<br>
> > with an<br>
> > > and mask?<br>
> > > >> >> + if (STType.isInteger() &&<br>
> > LDMemType.isInteger() &&<br>
> > > >> >> !STType.isVector() &&<br>
> > > >> >> + !LDMemType.isVector() &&<br>
> > LD->getExtensionType() !=<br>
> > > >> >> ISD::SEXTLOAD) {<br>
> > > >> >> + // Mask to size of LDMemType<br>
> > > >> >> + auto Mask =<br>
> > > >> >> +<br>
> > > >><br>
> DAG.getConstant(APInt::getLowBitsSet(STType.getSizeInBits(),<br>
> > > >> >> +<br>
> > > >> >> STMemType.getSizeInBits()),<br>
> > > >> >> + SDLoc(ST),<br>
> STType);<br>
> > > >> >> + auto Val = DAG.getNode(ISD::AND,<br>
> > SDLoc(LD), LDType,<br>
> > > >> >> ST->getValue(), Mask);<br>
> > > >> >> + return CombineTo(LD, Val, Chain);<br>
> > > >> >> + }<br>
> > > >> >> + }<br>
> > > >> >> +<br>
> > > >> >> + // TODO: Deal with nonzero offset.<br>
> > > >> >> + if (LD->getBasePtr().isUndef() ||<br>
> Offset != 0)<br>
> > > >> >> + return SDValue();<br>
> > > >> >> + // Model necessary truncations /<br>
> extenstions.<br>
> > > >> >> + SDValue Val;<br>
> > > >> >> + // Truncate Value To Stored Memory<br>
> Size.<br>
> > > >> >> + do {<br>
> > > >> >> + if (!getTruncatedStoreValue(ST, Val))<br>
> > > >> >> + continue;<br>
> > > >> >> + if (!isTypeLegal(LDMemType))<br>
> > > >> >> + continue;<br>
> > > >> >> + if (STMemType != LDMemType) {<br>
> > > >> >> + if<br>
> (numVectorEltsOrZero(STMemType) ==<br>
> > > >> >> numVectorEltsOrZero(LDMemType) &&<br>
> > > >> >> + STMemType.isInteger() &&<br>
> > LDMemType.isInteger())<br>
> > > >> >> + Val =<br>
> DAG.getNode(ISD::TRUNCATE, SDLoc(LD),<br>
> > > LDMemType,<br>
> > > >> Val);<br>
> > > >> >> + else<br>
> > > >> >> + continue;<br>
> > > >> >> + }<br>
> > > >> >> + if<br>
> (!extendLoadedValueToExtension(LD, Val))<br>
> > > >> >> + continue;<br>
> > > >> >> + return CombineTo(LD, Val, Chain);<br>
> > > >> >> + } while (false);<br>
> > > >> >> +<br>
> > > >> >> + // On failure, cleanup dead nodes<br>
> we may have<br>
> > created.<br>
> > > >> >> + if (Val->use_empty())<br>
> > > >> >> + deleteAndRecombine(Val.getNode());<br>
> > > >> >> + return SDValue();<br>
> > > >> >> +}<br>
> > > >> >> +<br>
> > > >> >> SDValue<br>
> DAGCombiner::visitLOAD(SDNode *N) {<br>
> > > >> >> LoadSDNode *LD = cast<LoadSDNode>(N);<br>
> > > >> >> SDValue Chain = LD->getChain();<br>
> > > >> >> @@ -12828,17 +12960,8 @@ SDValue<br>
> > > DAGCombiner::visitLOAD(SDNode *N<br>
> > > >> >> // If this load is directly<br>
> stored, replace<br>
> > the load<br>
> > > value<br>
> > > >> with<br>
> > > >> >> the stored<br>
> > > >> >> // value.<br>
> > > >> >> - // TODO: Handle store large -> read<br>
> small portion.<br>
> > > >> >> - // TODO: Handle TRUNCSTORE/LOADEXT<br>
> > > >> >> - if (OptLevel != CodeGenOpt::None &&<br>
> > > >> >> - ISD::isNormalLoad(N) &&<br>
> !LD->isVolatile()) {<br>
> > > >> </blockquote></div>