Hello everyone, I discovered a way to perform optimization on the following code (I gave an example that uses 32bit integer, but it works with any size.):<div><br><div>const uint32 d,r;//d is an odd number</div><div>//d is the divisor, r is the remainder</div><div>bool check_remainder(uint32 x)</div><div>{</div><div>return x%d==r;</div><div>}</div><div><br></div><div>if we know d and r at compile time, and d is an odd integer, we can use modular multiplicative inverse to bypass the division operation.</div><div>I wrote the following code to calculate the M.M.I of a number over 2^b. I give anyone the permission to use it (unlicensed).</div><div><pre style="box-sizing:border-box;margin-top:0px;margin-bottom:0px;overflow:scroll;padding:15px 0px"><div class="line js-file-line" id="LC3" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px"><span class="pl-c1" style="box-sizing:border-box;color:rgb(0,92,197)">uint64_t</span> <span class="pl-en" style="box-sizing:border-box;color:rgb(111,66,193)">find_mod_mul_inverse</span>(<span class="pl-c1" style="box-sizing:border-box;color:rgb(0,92,197)">uint64_t</span> x, <span class="pl-c1" style="box-sizing:border-box;color:rgb(0,92,197)">uint64_t</span> bits)</div><div class="line js-file-line" id="LC4" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">{</div><div class="line js-file-line" id="LC5" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">        <span class="pl-k" style="box-sizing:border-box;color:rgb(215,58,73)">if</span> (bits > <span class="pl-c1" style="box-sizing:border-box;color:rgb(0,92,197)">64</span> || ((x&<span class="pl-c1" style="box-sizing:border-box;color:rgb(0,92,197)">1</span>)==<span class="pl-c1" style="box-sizing:border-box;color:rgb(0,92,197)">0</span>))</div><div class="line js-file-line" id="LC6" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">                <span class="pl-k" style="box-sizing:border-box;color:rgb(215,58,73)">return</span> <span class="pl-c1" style="box-sizing:border-box;color:rgb(0,92,197)">0</span>;// invalid parameters</div><div class="line js-file-line" id="LC7" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">        <span class="pl-c1" style="box-sizing:border-box;color:rgb(0,92,197)">uint64_t</span> mask;</div><div class="line js-file-line" id="LC8" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">        <span class="pl-k" style="box-sizing:border-box;color:rgb(215,58,73)">if</span> (bits == <span class="pl-c1" style="box-sizing:border-box;color:rgb(0,92,197)">64</span>)</div><div class="line js-file-line" id="LC9" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">                mask = -<span class="pl-c1" style="box-sizing:border-box;color:rgb(0,92,197)">1</span>;</div><div class="line js-file-line" id="LC10" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">        <span class="pl-k" style="box-sizing:border-box;color:rgb(215,58,73)">else</span></div><div class="line js-file-line" id="LC11" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">        {                </div><div class="line js-file-line" id="LC12" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">                mask = <span class="pl-c1" style="box-sizing:border-box;color:rgb(0,92,197)">1</span>;</div><div class="line js-file-line" id="LC13" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">                mask<<=bits;</div><div class="line js-file-line" id="LC14" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">                mask--;</div><div class="line js-file-line" id="LC15" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">        }</div><div class="line js-file-line" id="LC16" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">        x&=mask;</div><div class="line js-file-line" id="LC17" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">        <span class="pl-c1" style="box-sizing:border-box;color:rgb(0,92,197)">uint64_t</span> result=<span class="pl-c1" style="box-sizing:border-box;color:rgb(0,92,197)">1</span>, state=x, ctz=<span class="pl-c1" style="box-sizing:border-box;color:rgb(0,92,197)">0</span>;</div><div class="line js-file-line" id="LC18" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">        <span class="pl-k" style="box-sizing:border-box;color:rgb(215,58,73)">while</span>(state!=<span class="pl-c1" style="box-sizing:border-box;color:rgb(0,92,197)">1ULL</span>)</div><div class="line js-file-line" id="LC19" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">        {</div><div class="line js-file-line" id="LC20" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">                ctz=<span class="pl-c1" style="box-sizing:border-box;color:rgb(0,92,197)">__builtin_ctzll</span>(state^<span class="pl-c1" style="box-sizing:border-box;color:rgb(0,92,197)">1</span>);</div><div class="line js-file-line" id="LC21" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">                result|=<span class="pl-c1" style="box-sizing:border-box;color:rgb(0,92,197)">1ULL</span><<ctz;</div><div class="line js-file-line" id="LC22" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">                state+=x<<ctz;</div><div class="line js-file-line" id="LC23" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">                state&=mask;</div><div class="line js-file-line" id="LC24" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">        }</div><div class="line js-file-line" id="LC25" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">        <span class="pl-k" style="box-sizing:border-box;color:rgb(215,58,73)">return</span> result;</div><div class="line js-file-line" id="LC26" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">}</div><div class="line js-file-line" id="LC26" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px"><br></div><div class="line js-file-line" id="LC26" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">now consider the following steps:</div><div class="line js-file-line" id="LC26" style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre;box-sizing:border-box;padding:0px 15px">from the 2 constants (d and r) we create 3 constants (with the same bit length):</div><div class="line js-file-line" id="LC26" style="box-sizing:border-box;padding:0px 15px"><font color="#24292e" face="Consolas, Liberation Mono, Courier, monospace"><span style="font-size:14px;white-space:pre">constants uint32 s,u,mmi;</span></font></div><div class="line js-file-line" id="LC26" style="box-sizing:border-box;padding:0px 15px"><font color="#24292e" face="Consolas, Liberation Mono, Courier, monospace"><span style="font-size:14px;white-space:pre">mmi = </span></font><span style="color:rgb(111,66,193);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre">find_mod_mul_inverse(d,32);</span></div><div class="line js-file-line" id="LC26" style="box-sizing:border-box;padding:0px 15px"><font color="#24292e" face="Consolas, Liberation Mono, Courier, monospace"><span style="font-size:14px;white-space:pre">s = (r*mmi);
u = (UINT32_MAX-r)/d; // </span></font><span style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre">UINT32_MAX corresponds to pow(2,32)-1.</span></div><div class="line js-file-line" id="LC26" style="box-sizing:border-box;padding:0px 15px"><span style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre">the idea behind these constants is the following formula:</span></div><div class="line js-file-line" id="LC26" style="box-sizing:border-box;padding:0px 15px"><span style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre">mmi_of(d)*x=x/d+(x%d)*mmi_of(d)</span></div><div class="line js-file-line" id="LC26" style="box-sizing:border-box;padding:0px 15px"><span style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre"><br></span></div><div class="line js-file-line" id="LC26" style="box-sizing:border-box;padding:0px 15px"><font color="#24292e" face="Consolas, Liberation Mono, Courier, monospace"><span style="font-size:14px;white-space:pre">now after we generated the constants, we will just emit the following code instead of the former:
</span></font><div style="font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;white-space:normal">bool check_remainder(uint32 x)</div><div style="font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;white-space:normal">{</div><div style="font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;white-space:normal">return <span style="color:rgb(36,41,46);font-family:Consolas,"Liberation Mono",Courier,monospace;font-size:14px;white-space:pre">((x*mmi)-s)<=u</span>;</div><span style="font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;white-space:normal">}</span><font color="#24292e" face="Consolas, Liberation Mono, Courier, monospace"><span style="font-size:14px;white-space:pre"><br></span></font></div><div class="line js-file-line" id="LC26" style="box-sizing:border-box;padding:0px 15px"><span style="font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;white-space:normal"><br></span></div><div class="line js-file-line" id="LC26" style="box-sizing:border-box;padding:0px 15px"><span style="font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;white-space:normal">Anyway, I looked at the file IntegerDivision.cpp, it seems to me that this new optimization is more effective then the optimization used there. However, I have no experience with compiler development, so I can just give you my idea. if further explanation is needed, just ask. I tested my method and it gives the correct results.</span></div></pre></div></div>