[llvm-dev] FENV_ACCESS and floating point LibFunc calls

Michael Clark via llvm-dev llvm-dev at lists.llvm.org
Thu May 11 19:23:23 PDT 2017


> On 12 May 2017, at 1:53 PM, Michael Clark <michaeljclark at mac.com> wrote:
> 
> 
>> On 12 May 2017, at 1:48 PM, Tim Northover <t.p.northover at gmail.com> wrote:
>> 
>> On 11 May 2017 at 18:30, Michael Clark via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>>> I note that on your bug that you have stated that the branch is faster than
>>> the conditional move. Faster code is a side effect of the fix in this
>>> particular case.
>> 
>> On the contrary: the faster code is pretty much the only reason this
>> can happen before the rest of the FENV support lands.
>> 
>> It's been said before, but I'll reiterate: LLVM IR does not model the
>> FENV on its instructions. CodeGen and other passes are free to
>> de-conditionalize exceptions, remove them, or add spurious ones just
>> for the giggles. What LLVM does now is not incorrect.
> 
> OK. So we are in fact lucky that the correct case is actually faster, and it’s a bug in the predicate lowering i.e. speculative execution and conditional move being slower than a branch.
> 
> I’m curious how the select lowering models the cost, when I figure out where to look in the codebase…


Just as a few data points on the x86 branch predictor.

I have 6 small integer benchmarks that I am using to test a RISC-V to x86 binary translator, and I was using perf last night to read the performance counters. I had these stats in my command line history as I was curious about branch predictor accuracy. Branch prediction accuracy is > 99% in all of my experiments.

Note that the test programs are compiled by RISC-V GCC. RISC-V has no conditional moves and the branch mispredict latency is only 3 cycles on Rocket, so it’s also an architecture that prefers branches over predication. We translate RISC-V branches to x86 branches and don’t use conditional moves in any of our translations. I believe a predicted branch has just 1 cycle of latency on x86.

Here is the translator: http://rv8.io/

BTW, the RISC-V interpreter rv-sim seems to be a pathological test case for the Clang/LLVM optimiser: the Clang/LLVM code runs at just over half the speed of the GCC-generated code. The translator itself is not really affected by the speed of Clang, as we spend most of the time in the JIT code.


$ perf stat -e cycles,instructions,branches,branch-misses ./build/linux_x86_64/bin/rv-jit build/riscv64-unknown-elf/bin/test-sha512 

 Performance counter stats for './build/linux_x86_64/bin/rv-jit build/riscv64-unknown-elf/bin/test-sha512':

     2,386,668,826      cycles                                                     
     8,226,368,806      instructions              #    3.45  insn per cycle        
       556,426,385      branches                                                   
         1,120,630      branch-misses             #    0.20% of all branches       

       0.766480608 seconds time elapsed

$ perf stat -e cycles,instructions,branches,branch-misses ./build/linux_x86_64/bin/rv-jit build/riscv64-unknown-elf/bin/test-aes 

 Performance counter stats for './build/linux_x86_64/bin/rv-jit build/riscv64-unknown-elf/bin/test-aes':

     3,390,012,091      cycles                                                     
     8,165,055,539      instructions              #    2.41  insn per cycle        
       166,612,327      branches                                                   
           393,687      branch-misses             #    0.24% of all branches       

       0.999783799 seconds time elapsed

$ perf stat -e cycles,instructions,branches,branch-misses ./build/linux_x86_64/bin/rv-jit build/riscv64-unknown-elf/bin/test-primes 

 Performance counter stats for './build/linux_x86_64/bin/rv-jit build/riscv64-unknown-elf/bin/test-primes':

       585,513,229      cycles                                                     
     1,570,274,312      instructions              #    2.68  insn per cycle        
       199,550,674      branches                                                   
         1,373,005      branch-misses             #    0.69% of all branches       

       0.180905897 seconds time elapsed

$ perf stat -e cycles,instructions,branches,branch-misses ./build/linux_x86_64/bin/rv-jit build/riscv64-unknown-elf/bin/test-miniz 

 Performance counter stats for './build/linux_x86_64/bin/rv-jit build/riscv64-unknown-elf/bin/test-miniz':

     7,181,383,837      cycles                                                     
    12,171,106,005      instructions              #    1.69  insn per cycle        
     1,309,704,230      branches                                                   
        10,246,710      branch-misses             #    0.78% of all branches       

       2.120649526 seconds time elapsed

$ perf stat -e cycles,instructions,branches,branch-misses ./build/linux_x86_64/bin/rv-jit build/riscv64-unknown-elf/bin/test-dhrystone 

 Performance counter stats for './build/linux_x86_64/bin/rv-jit build/riscv64-unknown-elf/bin/test-dhrystone':

     1,705,866,284      cycles                                                     
     5,902,622,960      instructions              #    3.46  insn per cycle        
       852,430,738      branches                                                   
            65,576      branch-misses             #    0.01% of all branches       

       0.530201822 seconds time elapsed

$ perf stat -e cycles,instructions,branches,branch-misses ./build/linux_x86_64/bin/rv-jit build/riscv64-unknown-elf/bin/test-qsort 

 Performance counter stats for './build/linux_x86_64/bin/rv-jit build/riscv64-unknown-elf/bin/test-qsort':

       951,218,523      cycles                                                     
     2,060,457,742      instructions              #    2.17  insn per cycle        
       432,171,433      branches                                                   
         3,844,290      branch-misses             #    0.89% of all branches       

       0.288089656 seconds time elapsed
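
For anyone who wants to cross-check the percentages, here is a small sketch that recomputes IPC and branch prediction accuracy from the raw counters above. The counter values are copied from the perf output; the names (COUNTERS, summarize, mispredict_cycle_share) are made up for illustration, and the 16-cycle x86 mispredict penalty is purely an assumed figure that was not measured here.

```python
# Raw counters copied from the perf stat runs above:
# (cycles, instructions, branches, branch_misses)
COUNTERS = {
    "test-sha512":    (2_386_668_826, 8_226_368_806, 556_426_385, 1_120_630),
    "test-aes":       (3_390_012_091, 8_165_055_539, 166_612_327, 393_687),
    "test-primes":    (585_513_229, 1_570_274_312, 199_550_674, 1_373_005),
    "test-miniz":     (7_181_383_837, 12_171_106_005, 1_309_704_230, 10_246_710),
    "test-dhrystone": (1_705_866_284, 5_902_622_960, 852_430_738, 65_576),
    "test-qsort":     (951_218_523, 2_060_457_742, 432_171_433, 3_844_290),
}

# ASSUMPTION: a flat per-mispredict penalty in cycles. This number is
# hypothetical and only used to get a rough upper-level feel for the cost.
ASSUMED_MISS_PENALTY = 16


def summarize(cycles, instructions, branches, misses):
    """Derive IPC, miss rate and prediction accuracy from raw counters."""
    miss_pct = 100.0 * misses / branches
    return {
        "ipc": instructions / cycles,
        "miss_pct": miss_pct,
        "accuracy_pct": 100.0 - miss_pct,
    }


def mispredict_cycle_share(cycles, misses, penalty=ASSUMED_MISS_PENALTY):
    """Percent of all cycles attributable to mispredicts under the assumed penalty."""
    return 100.0 * misses * penalty / cycles


if __name__ == "__main__":
    for name, (cycles, insns, branches, misses) in COUNTERS.items():
        s = summarize(cycles, insns, branches, misses)
        share = mispredict_cycle_share(cycles, misses)
        print(f"{name:15s} ipc={s['ipc']:.2f} "
              f"accuracy={s['accuracy_pct']:.2f}% "
              f"mispredict-cycles~{share:.2f}% (assumed penalty)")
```

The IPC and miss-rate figures should match what perf printed; the last column only indicates how sensitive each benchmark would be to the penalty if the assumed value were in the right ballpark.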
