<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/54949">54949</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [libcxx] std::rotate gcd is 3-10x slower than the default variant.
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          easyaspi314
      </td>
    </tr>
</table>

<pre>
    The benchmark linked below rotates 500000 ints by random amounts using the GCD rotate, the default rotate code (selected by forcing the BiDi tag on the internal `__rotate` overload), and a block swap adapted from GrailSort. The block swap is measured both with the worst-case optimization (`left % right == 1 || right % left == 1`), which requires a temporary (as used in `__rotate_forward` and `__rotate_backward`), and without it.

https://quick-bench.com/q/HpzwAmp718EZTGKvEdjvUn5e1KU

On an AArch64 device (2.21GHz Snapdragon 730/Cortex-A76-like), a similar benchmark showed a whopping **10x slowdown**: 0.74s for the default rotate vs 7.77s for the GCD rotate, with block swap taking 0.29s. This is likely due to this processor having only 64 KiB of L1 cache, compared to the 128-192 KiB standard on many x86 processors.
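
For context, here is a minimal sketch of a GCD ("juggling") rotate on a plain int array. It is not libc++'s implementation; the names `gcd_euclid` and `rotate_gcd` are hypothetical and exist only for this illustration. It shows that each cycle walks the array with a stride of `k` elements, so for large rotation amounts nearly every access lands on a different cache line, which would be consistent with the cache-size explanation above.

```cpp
// Greatest common divisor via Euclid's algorithm (illustrative helper).
unsigned gcd_euclid(unsigned a, unsigned b) {
    while (b != 0) {
        unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// Rotate the n elements of a left by k positions using cycle rotation.
// There are gcd(n, k) independent cycles; each one moves elements
// that are k positions apart, wrapping around at n.
void rotate_gcd(int* a, unsigned n, unsigned k) {
    if (n == 0) return;
    k %= n;
    if (k == 0) return;
    unsigned cycles = gcd_euclid(n, k);
    for (unsigned start = 0; start != cycles; ++start) {
        int saved = a[start];          // first element of this cycle
        unsigned dst = start;
        for (;;) {
            unsigned src = dst + k;    // strided access: k elements ahead
            if (src >= n) src -= n;    // wrap around the end of the array
            if (src == start) break;   // cycle is closed
            a[dst] = a[src];
            dst = src;
        }
        a[dst] = saved;                // last slot of the cycle
    }
}
```

Each element is moved exactly once, which looks optimal on paper, but the access pattern is non-sequential. Block swap and the default forward rotate touch memory mostly in order.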
</pre>