<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">I've looked into this. The main issue is when the register allocator folded reloads from fixed slots the loads ended up with alignment 1. That's because by default all fixed slots have alignment of 1. This is overly conservative. DAG combiner will infer the correct alignments from stack pointer alignments and frame offset. But at register allocation time we are simply taking the frame object's alignments.<div><br></div><div>I think the right fix is to make sure the fixed objects have the inferred alignments when they are created. Does anyone see a problem with that?<br><div><br></div><div>Evan</div><div><br><div><div>On Jun 29, 2010, at 6:03 PM, Evan Cheng wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><br><div><div>On Jun 29, 2010, at 1:39 PM, Jakob Stoklund Olesen wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div><br>On Jun 28, 2010, at 10:46 PM, Evan Cheng wrote:<br><br><blockquote type="cite"><br></blockquote><blockquote type="cite">On Jun 28, 2010, at 10:09 PM, Jakob Stoklund Olesen wrote:<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">On Jun 28, 2010, at 10:00 PM, Evan Cheng wrote:<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">On Jun 28, 2010, at 6:13 PM, Jakob Stoklund Olesen wrote:<br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">Author: stoklund<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">Date: Mon Jun 28 20:13:07 2010<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">New Revision: 107114<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">URL: <a href="http://llvm.org/viewvc/llvm-project?rev=107114&view=rev">http://llvm.org/viewvc/llvm-project?rev=107114&view=rev</a><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">Log:<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">When no memoperands are present, assume unaligned, volatile.<br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">Have you checked if this disabled a lot of potential optimizations? I am afraid assuming volatileness is overly conservative.<br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">I haven't checked anything but running unit tests.<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">We need both non-volatile and 4-byte alignment to safely combine loads and stores. If memoperands are missing, I don't see any alternative. How else can we guarantee the optimization is valid?<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">If this causes missed optimizations, we should ensure that memory operands are present.<br></blockquote></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">I agree. I am just wondering if that's indeed the case. Can you add a llc beta check to look for differences in # of load / store multiple optimizations?<br></blockquote><br>I ran an assembly diff across the entire nightly test suite.<br><br>We lost 117 of 56783 ldms and 40 of 47702 stms.<font class="Apple-style-span"><font class="Apple-style-span" color="#144FAE"><br></font></font></div></blockquote><br></div><div>Thanks. Can you file a bug about this? We should take a look.</div><div><br></div><div>Evan</div><div><br></div></div>_______________________________________________<br>llvm-commits mailing list<br><a href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a><br>http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits<br></blockquote></div><br></div></div></body></html>