<html>

    <head>

      <base href="https://llvm.org/bugs/" />

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW --- - Incorrect codegen for &quot;store &lt;2 x i48&gt;&quot; triggered by -fslp-vectorize-aggressive"

   href="https://llvm.org/bugs/show_bug.cgi?id=28845">28845</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>Incorrect codegen for "store <2 x i48>" triggered by -fslp-vectorize-aggressive

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>new-bugs

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>trunk

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>All

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>normal

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>new bugs

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>babokin@gmail.com

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>elena.demikhovsky@intel.com, llvm-bugs@lists.llvm.org, Vsevolod.Livinskij@frtk.ru

          </td>

        </tr>

        <tr>

          <th>Classification</th>

          <td>Unclassified

          </td>

        </tr></table>

      <p>

        <div>

        <pre>Created <span class=""><a href="attachment.cgi?id=16882" name="attach_16882" title="reproducer">attachment 16882</a> <a href="attachment.cgi?id=16882&action=edit" title="reproducer">[details]</a></span>

reproducer

The attached test case has a structure with a number of bit fields (ignore the

weirdness of the C definition of the structure; it's not important, whereas the

LLVM IR definition is).

LLVM IR structure definition:

%struct.struct_1 = type { [6 x i8], [6 x i8], i24 }

Initialization happens by read-modify-write of two 48-bit chunks; no magic

here.

define void @_Z4initv() local_unnamed_addr #0 {

entry:

  %bf.load = load i48, i48* bitcast (%struct.struct_1* @s1 to i48*), align 8

  %bf.clear = and i48 %bf.load, -8796091973633

  %bf.set3 = or i48 %bf.clear, 7326889148416

  store i48 %bf.set3, i48* bitcast (%struct.struct_1* @s1 to i48*), align 8

  %bf.load4 = load i48, i48* bitcast ([6 x i8]* getelementptr inbounds

(%struct.struct_1, %struct.struct_1* @s1, i64 0, i32 1) to i48*), align 2

  %bf.clear5 = and i48 %bf.load4, -2198956146689

  %bf.set6 = or i48 %bf.clear5, 822419128320

  store i48 %bf.set6, i48* bitcast ([6 x i8]* getelementptr inbounds

(%struct.struct_1, %struct.struct_1* @s1, i64 0, i32 1) to i48*), align 2

  ret void

}
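
For readers less used to IR, the read-modify-write above corresponds roughly to
the C++ sketch below. It is only an illustration: the helper name rmw48 and the
constant kMask48 are hypothetical, and a little-endian host (such as x86) is
assumed.

```cpp
#include <cstdint>
#include <cstring>

// Low 48 bits set; used to keep values within the i48 range.
constexpr std::uint64_t kMask48 = (std::uint64_t{1} << 48) - 1;

// Hypothetical helper: read-modify-write of one 6-byte (i48) chunk at p.
// Assumes a little-endian host, so the 6 bytes land in the low bits of v.
inline void rmw48(std::uint8_t* p, std::uint64_t clear_mask,
                  std::uint64_t set_bits) {
    std::uint64_t v = 0;
    std::memcpy(&v, p, 6);            // load i48
    v = (v & clear_mask) | set_bits;  // the "and" followed by the "or"
    v &= kMask48;                     // stay within 48 bits
    std::memcpy(p, &v, 6);            // store i48: exactly 6 bytes are written
}
```

The point of the sketch is that each scalar store touches exactly 6 bytes, so
the second chunk starts at byte offset 6.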

But when the test case is compiled with -fslp-vectorize-aggressive, this is

optimized into vector operations:

define void @_Z4initv() local_unnamed_addr #0 {

entry:

  %bf.load = load <2 x i48>, <2 x i48>* bitcast (%struct.struct_1* @s1 to <2 x

i48>*), align 8

  %bf.clear = and <2 x i48> %bf.load, <i48 -8796091973633, i48 -2198956146689>

  %bf.set3 = or <2 x i48> %bf.clear, <i48 7326889148416, i48 822419128320>

  store <2 x i48> %bf.set3, <2 x i48>* bitcast (%struct.struct_1* @s1 to <2 x

i48>*), align 8

  ret void

}

This seems legal, but it leads to incorrect code generation. More specifically,

instead of two *consecutive* 48-bit stores, the stores happen with a 16-bit gap.

Good:

    movl    %ecx, s1(%rip)

    movw    %cx, s1+4(%rip)

    movl    %ecx, s1+6(%rip)

    movw    %cx, s1+10(%rip)

Bad:

    movq    %xmm1, s1(%rip)

    movq    %xmm0, s1+8(%rip)

So the problem is a differing interpretation of <2 x i48>: the LLVM IR assumes

consecutive 48-bit locations, while code generation assumes 64-bit alignment of

each element. My guess is that it's a codegen bug.
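
The two layouts can be compared with a small C++ model. This is a sketch under
stated assumptions, not the actual codegen: Buffer, store_packed and
store_padded are hypothetical names, the host is assumed to be little-endian,
and the upper 16 bits of each 64-bit lane are assumed to be zero.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical model of the 16-byte object s1.
struct Buffer { std::uint8_t bytes[16]; };

constexpr std::size_t kElemBytes = 6;  // an i48 element occupies 6 bytes
constexpr std::size_t kLaneBytes = 8;  // one movq writes 8 bytes

// Byte offset of element i when the elements are consecutive (IR view).
constexpr std::size_t packed_offset(std::size_t i) { return i * kElemBytes; }

// Byte offset of element i when each element gets a 64-bit lane (asm view).
constexpr std::size_t padded_offset(std::size_t i) { return i * kLaneBytes; }

// "Good": two 6-byte stores at offsets 0 and 6.
Buffer store_packed(Buffer buf, std::uint64_t e0, std::uint64_t e1) {
    std::memcpy(buf.bytes + packed_offset(0), &e0, kElemBytes);
    std::memcpy(buf.bytes + packed_offset(1), &e1, kElemBytes);
    return buf;
}

// "Bad": two 8-byte stores at offsets 0 and 8, like the two movq above.
Buffer store_padded(Buffer buf, std::uint64_t e0, std::uint64_t e1) {
    std::memcpy(buf.bytes + padded_offset(0), &e0, kLaneBytes);
    std::memcpy(buf.bytes + padded_offset(1), &e1, kLaneBytes);
    return buf;
}
```

In this model the padded variant places the second element at byte 8 instead of
byte 6, and it also writes bytes 12 and 13, which presumably belong to the i24
field of %struct.struct_1.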

To reproduce:

<span class="quote">>clang++ func.cpp test.cpp -o out -fslp-vectorize-aggressive -O2</span ></pre>

        </div>

      </p>


    </body>

</html>