<div dir="ltr">Hi Florian,<div><br></div><div>First, thank you for working on this. I'm really glad to see this work so close to being enabled.<br><div><br></div><div>I think the run-time numbers look good, and the benefits of switching in all configurations are clear.</div><div><br></div><div>For compile time, the current regressions are noticeable, but not a deal breaker in my opinion. I'm very much in favor of switching in all configurations.</div><div><br></div><div>To address some of the concerns, it may make sense to lower the threshold somewhat to minimize the impact at this time (accepting that the benefits will not be as large at the time of the switch). I'm talking about getting the geomean compile-time regression closer to 1% in all configurations, if possible.</div><div>I believe that the regressions introduced by this flag flip can be recovered by using MemorySSA in the other passes that currently use MemDepAnalysis, which would offset the cost of computing MemorySSA in the first place. The threshold could be raised again to eliminate more stores once MemCpyOpt+MSSA and NewGVN become the default.</div><div><br></div><div>If reducing the thresholds is not possible, or removes most of the run-time benefits, I would vote for enabling as is.</div><div><br></div><div>Best,</div><div>Alina</div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Aug 19, 2020 at 7:37 AM Florian Hahn via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
<br>
> On Aug 18, 2020, at 22:14, Florian Hahn via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:<br>
> <br>
> <br>
> <br>
>> On Aug 18, 2020, at 16:59, Michael Kruse <<a href="mailto:llvmdev@meinersbur.de" target="_blank">llvmdev@meinersbur.de</a>> wrote:<br>
>> <br>
>> Thanks for all the work. The reductions in stores look promising. Do you also have performance numbers showing how much this improves execution time? Did you observe any regressions where MSSA resulted in fewer removed stores?<br>
> <br>
> I did not gather numbers for execution time yet, but I’ll try to share some tomorrow.<br>
<br>
<br>
Here are some execution time results for ARM64 with -O3 -flto, comparing MemorySSA-DSE against the current DSE implementation on CINT2006 (a negative % means a reduction in execution time with MemorySSA-DSE). Small changes within the noise (<= 0.5%) are excluded.<br>
<br>
Test | Exec_time | Number of stores removed<br>
test-suite...T2006/456.hmmer/456.hmmer.test | -1.6% | +70.8%<br>
test-suite.../CINT2006/403.gcc/403.gcc.test | -1.4% | +35.7%<br>
test-suite...0.perlbench/400.perlbench.test | -1.2% | +33.2%<br>
test-suite...3.xalancbmk/483.xalancbmk.test | -1.0% | +3.02%<br>
test-suite...T2006/401.bzip2/401.bzip2.test | -0.8% | +70.6%<br>
<br>
</blockquote></div>