<html>

    <head>

      <base href="https://llvm.org/bugs/" />

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW --- - spec2000/188.ammp, spec2006/433.milc, 444.namd, 447.dealII, 453.povray compilation fails on LTO stage after commit r256394"

   href="https://llvm.org/bugs/show_bug.cgi?id=25999">25999</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>spec2000/188.ammp, spec2006/433.milc, 444.namd, 447.dealII, 453.povray compilation fails on LTO stage after commit r256394

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>libraries

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>trunk

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>Linux

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Keywords</th>

          <td>miscompilation

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>normal

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>Backend: X86

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>sergey.k.okunev@gmail.com

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>david.l.kreitzer@intel.com, denis.briltz@intel.com, elena.demikhovsky@intel.com, llvm-bugs@lists.llvm.org, sergos.gnu@gmail.com, spatel+llvm@rotateright.com, zia.ansari@intel.com

          </td>

        </tr>

        <tr>

          <th>Classification</th>

          <td>Unclassified

          </td>

        </tr></table>

      <p>

        <div>

<pre>Bisection shows that LLVM revision r256394 introduced the failures. The

commit message follows.

commit 75759ab3e9255fe5f716e4a71ca1ee56901dedf8

Author: Sanjay Patel <<a href="mailto:spatel@rotateright.com">spatel@rotateright.com</a>>

Date:   Thu Dec 24 21:17:56 2015 +0000

    [InstCombine] transform more extract/insert pairs into shuffles (PR2109)

    This is an extension of the shuffle combining from r203229:

    <a href="http://reviews.llvm.org/rL203229">http://reviews.llvm.org/rL203229</a>

    The idea is to widen a short input vector with undef elements so the

    existing shuffle transform for extract/insert can kick in.

    The motivation is to finally solve PR2109:

    <a class="bz_bug_link 

          bz_status_RESOLVED  bz_closed"

   title="RESOLVED FIXED - Missed optimization for extract/insertelements equivalent to movhps"

   href="show_bug.cgi?id=2109">https://llvm.org/bugs/show_bug.cgi?id=2109</a>

    For that example, the IR becomes:

    %1 = bitcast <2 x i32>* %P to <2 x float>*

    %ld1 = load <2 x float>, <2 x float>* %1, align 8

    %2 = shufflevector <2 x float> %ld1, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>

    %i2 = shufflevector <4 x float> %A, <4 x float> %2, <4 x i32> <i32 0, i32 1, i32 4, i32 5>

    ret <4 x float> %i2

    And x86 SSE output improves from:

    movq        (%rdi), %xmm1           ## xmm1 = mem[0],zero

    movdqa      %xmm1, %xmm2

    shufps      $229, %xmm2, %xmm2      ## xmm2 = xmm2[1,1,2,3]

    shufps      $48, %xmm0, %xmm1       ## xmm1 = xmm1[0,0],xmm0[3,0]

    shufps      $132, %xmm1, %xmm0      ## xmm0 = xmm0[0,1],xmm1[0,2]

    shufps      $32, %xmm0, %xmm2       ## xmm2 = xmm2[0,0],xmm0[2,0]

    shufps      $36, %xmm2, %xmm0       ## xmm0 = xmm0[0,1],xmm2[2,0]

    retq

    To the almost optimal:

    movhpd      (%rdi), %xmm0

    Note: There's a tension in the existing transform related to generating

    arbitrary shufflevector masks. We avoid that in other places in InstCombine

    because we're scared that codegen can't handle strange masks, but it looks

    like we're ok with producing those here. I purposely chose weird insert/extract

    indexes for the regression tests to see the effect in these cases.

    For PowerPC+Altivec, AArch64, and X86+SSE/AVX, I think the codegen is equal or

    better for these examples.

    Differential Revision: <a href="http://reviews.llvm.org/D15096">http://reviews.llvm.org/D15096</a>

    git-svn-id: <a href="https://llvm.org/svn/llvm-project/llvm/trunk@256394">https://llvm.org/svn/llvm-project/llvm/trunk@256394</a> 91177308-0d34-0410-b5e6-96231b3b80d8

Clang options: -m64 -fuse-ld=gold -Ofast -funroll-loops -flto -static -mfpmath=sse -march=core-avx2

During the LTO phase, the SPEC benchmarks fail with the error below (shown here

for spec2006/444.namd).

runspec --config=lnx-x86_64-clang-default.cfg --rebuild -a build -e ref64 -T base 444

…………………………………………

clang++ -m64  -m64  -fuse-ld=gold  -Ofast -funroll-loops -flto -static 

-mfpmath=sse -march=core-avx2   -DSPEC_CPU_LP64        Compute.o ComputeList.o

ComputeNonbondedUtil.o LJTable.o Molecule.o Patch.o PatchList.o ResultSet.o

SimParameters.o erf.o spec_namd.o                     -o namd

Instruction does not dominate all uses!

  %782 = extractelement <2 x double> %721, i32 1

  %779 = insertelement <4 x double> undef, double %782, i32 0

Instruction does not dominate all uses!

  %1053 = extractelement <2 x double> %974, i32 1

  %1050 = insertelement <4 x double> undef, double %1053, i32 0

Instruction does not dominate all uses!

  %1332 = shufflevector <2 x double> %1263, <2 x double> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>

  %1330 = shufflevector <4 x double> %1329, <4 x double> %1332, <4 x i32> <i32 0, i32 5, i32 undef, i32 undef>

LLVM ERROR: Broken function found, compilation aborted!

clang-3.8: error: linker command failed with exit code 1 (use -v to see invocation)

specmake: *** [namd] Error 1

Okunev Sergey,

Software Engineer

Intel Compiler Team</pre>
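        <pre>The widening step described in the quoted commit message can be sketched
with a toy model of shufflevector semantics. This is only an illustration: the
function name, the list encoding, and the use of None for undef lanes are
assumptions made for this sketch and are not LLVM API.

```python
# Toy model of LLVM's shufflevector; None stands in for an undef lane.
# The function name and list encoding are illustrative, not LLVM API.
UNDEF = None

def shufflevector(a, b, mask):
    """Select lanes from the concatenation of a and b; an undef index yields undef."""
    if len(a) != len(b):
        raise ValueError("both inputs must have the same number of lanes")
    lanes = list(a) + list(b)
    return [UNDEF if i is UNDEF else lanes[i] for i in mask]

# Stand-ins for the 4-lane %A and the 2-lane load %ld1 from the commit message.
A = ["a0", "a1", "a2", "a3"]
ld1 = ["p0", "p1"]

# Step 1: widen the 2-lane vector to 4 lanes, padding with undef.
widened = shufflevector(ld1, [UNDEF, UNDEF], [0, 1, UNDEF, UNDEF])

# Step 2: keep lanes 0-1 of %A, then take lanes 4-5,
# which are lanes 0-1 of the widened vector.
i2 = shufflevector(A, widened, [0, 1, 4, 5])

print(widened)  # ['p0', 'p1', None, None]
print(i2)       # ['a0', 'a1', 'p0', 'p1']
```

The second shuffle is legal only because both of its inputs have four lanes,
which is why the short vector has to be widened first.</pre>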

        </div>

      </p>
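      <pre>The "Instruction does not dominate all uses!" messages mean that a value is
used at a point its definition does not reach. A minimal sketch of that rule
for a single straight-line block is shown below. It is a simplification: the
real verifier works on a dominator tree over the whole control-flow graph, and
the instruction order used here is an assumption inferred from the value
numbers in the report (%779 is numbered before %782).

```python
# Toy use-before-def check for one straight-line basic block.
# This only illustrates the rule; LLVM's verifier uses a dominator tree
# over the whole control-flow graph. All names here are hypothetical.

def find_use_before_def(block):
    """block: list of (result_name, operand_names) in program order.
    Returns (definition, user) pairs where the user does not follow the definition."""
    position = {name: idx for idx, (name, _) in enumerate(block)}
    violations = []
    for idx, (name, operands) in enumerate(block):
        for op in operands:
            # Operands defined outside this block are assumed to dominate it.
            if op in position and position[op] >= idx:
                violations.append((op, name))
    return violations

# Shape of the first report: %779 uses %782.
# The ordering is an assumption based on the value numbers.
block = [
    ("%779", ["%782"]),  # insertelement ... double %782
    ("%782", ["%721"]),  # extractelement ... %721
]

print(find_use_before_def(block))  # [('%782', '%779')]
```

Swapping the two entries so that %782 comes first makes the check return an
empty list.</pre>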


    </body>

</html>