<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Hi Chandler,<div class=""><br class=""></div><div class="">Here is a test case for the biggest offender (oourafft.c).</div><div class="">To reproduce:</div><div class="">llc <span style="font-family: Menlo; font-size: 11px;" class="">-mcpu=core-avx-i </span><span style="font-family: Menlo; font-size: 11px;" class="">-x86-experimental-vector-shuffle-lowering=true repro.ll</span></div><div class="">llc <span style="font-family: Menlo; font-size: 11px;" class="">-mcpu=core-avx-i </span><span style="font-family: Menlo; font-size: 11px;" class="">-x86-experimental-vector-shuffle-lowering=false repro.ll</span></div><div class=""><span style="font-family: Menlo; font-size: 11px;" class=""><br class=""></span></div><div class="">The main problem is that we miss:</div><div class=""><div style="margin: 0px; font-size: 11px; font-family: Menlo;" class=""><span class="Apple-tab-span" style="white-space:pre"> </span>vmovsd<span class="Apple-tab-span" style="white-space:pre"> </span>(%rdi,%rcx,8), %xmm2</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;" class=""><span class="Apple-tab-span" style="white-space:pre"> </span>vmovlhps<span class="Apple-tab-span" style="white-space:pre"> </span>%xmm2, %xmm2, %xmm2 ## xmm2 = xmm2[0,0]</div></div><div class="">=></div><div class=""><div style="margin: 0px; font-size: 11px; font-family: Menlo;" class=""><span class="Apple-tab-span" style="white-space:pre"> </span>vmovddup<span class="Apple-tab-span" style="white-space:pre"> </span>(%rdi,%rcx,8), %xmm2</div></div><div class=""><br class=""></div><div class="">I do not know how problematic that is (it seems we catch up on the performance with just the previous transformation), but we also miss:</div><div class=""><div style="margin: 0px; font-size: 11px; font-family: Menlo;" 
class=""><span class="Apple-tab-span" style="white-space:pre"> </span>vsubpd<span class="Apple-tab-span" style="white-space:pre"> </span>%xmm1, %xmm0, %xmm2</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;" class=""><span class="Apple-tab-span" style="white-space:pre"> </span>vaddpd<span class="Apple-tab-span" style="white-space:pre"> </span>%xmm1, %xmm0, %xmm0</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;" class=""><span class="Apple-tab-span" style="white-space:pre"> </span>vshufpd<span class="Apple-tab-span" style="white-space:pre"> </span>$2, %xmm0, %xmm2, %xmm0 ## xmm0 = xmm2[0],xmm0[1]</div></div><div class="">=></div><div class=""><div style="margin: 0px; font-size: 11px; font-family: Menlo;" class=""><span class="Apple-tab-span" style="white-space:pre"> </span>vaddsubpd<span class="Apple-tab-span" style="white-space:pre"> </span>%xmm1, %xmm0, %xmm0</div></div><div class=""><br class=""></div><div class="">I’ll look into the other regressions.</div><div class=""><br class=""></div><div class="">Thanks,</div><div class="">-Quentin</div><div class=""></div></body></html>