<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Suboptimal codegen for storing std::pair<std::uint32_t, std::uint32_t>"
   href="https://bugs.llvm.org/show_bug.cgi?id=43864">43864</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Suboptimal codegen for storing std::pair<std::uint32_t, std::uint32_t>
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>clang
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>C++17
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedclangbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>denis.yaroshevskij@gmail.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>blitzrakete@gmail.com, erik.pilkington@gmail.com, llvm-bugs@lists.llvm.org, richard-llvm@metafoo.co.uk
          </td>
        </tr></table>
      <p>
        <div>
        <pre>When storing a pair<uint32_t, uint32_t> (and probably similar types), it is
significantly faster to first combine the two ints with a shift and then do one
wider store (according to my measurements, up to two times faster for uint32_t).

However, clang currently emits two separate moves for std::pair. When the shifts
are written by hand, clang does not turn them back into two moves (even though
in other cases I can clearly see that it knows perfectly well how to switch
between shifts and moves).

Can clang generate shifts + 1 store for pairs instead of two stores?
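
For illustration, a minimal sketch of the two variants being compared. The
function names store_pair and store_packed are hypothetical and are not taken
from the linked Godbolt snippet; which instructions a compiler actually emits
for either one depends on the compiler version and flags.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Variant 1: assign the pair directly. Per the report, clang compiles this
// kind of store into two separate 32-bit moves.
void store_pair(std::pair<std::uint32_t, std::uint32_t>* out,
                std::uint32_t a, std::uint32_t b) {
    assert(out != nullptr);
    *out = {a, b};
}

// Variant 2: combine both values with a shift and an OR, then do a single
// 64-bit store. `a` lands in the low 32 bits, `b` in the high 32 bits.
void store_packed(std::uint64_t* out, std::uint32_t a, std::uint32_t b) {
    assert(out != nullptr);
    *out = static_cast<std::uint64_t>(a) |
           (static_cast<std::uint64_t>(b) << 32);
}
```

On a little-endian target the packed value has the same byte layout as the
pair {a, b}, which is why the two forms are interchangeable there.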

<span class="quote">>>>>>>>>>></span >
Data:

Note: uint_tuple is a bit of template magic that packs several uints into one bigger uint.
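
A rough sketch of what such a packed type could look like for two elements.
This is NOT the actual uint_tuple implementation; packed_pair and wider are
hypothetical names, and only element widths of 8, 16 and 32 bits are covered
because a standard double-width type exists for those.

```cpp
#include <cstdint>

// Hypothetical helper: maps an element type to an unsigned type twice as wide.
template <typename T> struct wider;
template <> struct wider<std::uint8_t>  { using type = std::uint16_t; };
template <> struct wider<std::uint16_t> { using type = std::uint32_t; };
template <> struct wider<std::uint32_t> { using type = std::uint64_t; };

// Hypothetical packed pair: both elements live in one wider integer, so
// writing the object is a single store of that integer.
template <typename T>
class packed_pair {
    using storage_t = typename wider<T>::type;
    static constexpr unsigned bits = sizeof(T) * 8;
    storage_t data_ = 0;

public:
    constexpr packed_pair() = default;

    // `first` goes into the low half, `second` into the high half.
    constexpr packed_pair(T first, T second)
        : data_(static_cast<storage_t>(
              static_cast<storage_t>(first) |
              (static_cast<storage_t>(second) << bits))) {}

    constexpr T first() const { return static_cast<T>(data_); }
    constexpr T second() const { return static_cast<T>(data_ >> bits); }
};

static_assert(sizeof(packed_pair<std::uint32_t>) == sizeof(std::uint64_t),
              "the packed pair occupies exactly one wider integer");
```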


Benchmark:
populate a vector of 1000 pairs. The pairs are pair<uint8_t, uint8_t>,
pair<uint16_t, uint16_t>, and so on, up to 64 bits. The alternative stores the
same pair packed into one bigger integer.
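
A simplified sketch of the two workloads for the 32-bit case, without any
timing harness. The function names and the values written are assumptions for
illustration; the actual benchmark code is behind the links in this report.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Baseline: every element is written as a std::pair.
std::vector<std::pair<std::uint32_t, std::uint32_t>>
populate_pairs(std::size_t n) {
    std::vector<std::pair<std::uint32_t, std::uint32_t>> v(n);
    for (std::size_t i = 0; i < n; ++i) {
        v[i] = {static_cast<std::uint32_t>(i),
                static_cast<std::uint32_t>(i + 1)};
    }
    return v;
}

// Alternative: the same two values packed into one 64-bit integer,
// first value in the low half, second value in the high half.
std::vector<std::uint64_t> populate_packed(std::size_t n) {
    std::vector<std::uint64_t> v(n);
    for (std::size_t i = 0; i < n; ++i) {
        const std::uint64_t lo = static_cast<std::uint32_t>(i);
        const std::uint64_t hi = static_cast<std::uint32_t>(i + 1);
        v[i] = lo | (hi << 32);
    }
    return v;
}
```

Calling each function with n = 1000 inside a benchmark loop mirrors the setup
described above.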


Godbolt with codegen for different operations:
<a href="https://gcc.godbolt.org/z/GKf7lE">https://gcc.godbolt.org/z/GKf7lE</a>

Measurement results:
Quick-Bench: <a href="http://quick-bench.com/aDq3iN3dpi9VWQc8XSd6o7Hlzl4">http://quick-bench.com/aDq3iN3dpi9VWQc8XSd6o7Hlzl4</a>
My machine: <a href="https://denisyaroshevskiy.github.io/algorithm_dumpster/#uint_tuple">https://denisyaroshevskiy.github.io/algorithm_dumpster/#uint_tuple</a>


<span class="quote">>>>>>>>>></span >

P.S. trunk gcc does something new when storing pairs:
<a href="https://gcc.godbolt.org/z/Zbz96a">https://gcc.godbolt.org/z/Zbz96a</a></pre>
        </div>
      </p>


    </body>
</html>