[llvm-bugs] [Bug 27103] New: Improve NEON autovectorization?
    via llvm-bugs 
    llvm-bugs at lists.llvm.org
       
    Mon Mar 28 16:49:46 PDT 2016
    
    
  
https://llvm.org/bugs/show_bug.cgi?id=27103
            Bug ID: 27103
           Summary: Improve NEON autovectorization?
           Product: libraries
           Version: 3.8
          Hardware: Other
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Backend: ARM
          Assignee: unassignedbugs at nondot.org
          Reporter: tulipawn at gmail.com
                CC: llvm-bugs at lists.llvm.org
    Classification: Unclassified
Created attachment 16107
  --> https://llvm.org/bugs/attachment.cgi?id=16107&action=edit
VFP assembly
Benchmarking the Rust matrix multiplication crate matrixmultiply
(https://github.com/bluss/matrixmultiply/) on a Cortex-A5 gives these results:
Using VFP:
test mat_mul_f32::m004 ... bench:       1,632 ns/iter (+/- 51)
test mat_mul_f32::m007 ... bench:       3,767 ns/iter (+/- 56)
test mat_mul_f32::m008 ... bench:       4,151 ns/iter (+/- 96)
test mat_mul_f32::m012 ... bench:       8,712 ns/iter (+/- 408)
Using NEON:
test mat_mul_f32::m004 ... bench:       1,588 ns/iter (+/- 89)
test mat_mul_f32::m007 ... bench:       3,307 ns/iter (+/- 94)
test mat_mul_f32::m008 ... bench:       3,056 ns/iter (+/- 62)
test mat_mul_f32::m012 ... bench:       6,197 ns/iter (+/- 181)
The NEON speedup over VFP only reaches 2x starting at m >= 16. Is there room
for improvement on the smaller sizes?
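
To make the gap concrete, here is a minimal sketch that computes the NEON-over-VFP speedup for each of the four benchmarks listed above. The timings are copied from the report; the `speedup` helper and the tuple layout are illustrative names only and are not part of matrixmultiply.

```rust
/// Speedup of the NEON build over the VFP build (hypothetical helper):
/// values above 1.0 mean the NEON build was faster.
fn speedup(vfp_ns: f64, neon_ns: f64) -> f64 {
    vfp_ns / neon_ns
}

fn main() {
    // (benchmark, VFP ns/iter, NEON ns/iter), taken from the report above.
    let results = [
        ("m004", 1632.0, 1588.0),
        ("m007", 3767.0, 3307.0),
        ("m008", 4151.0, 3056.0),
        ("m012", 8712.0, 6197.0),
    ];

    for (name, vfp, neon) in results.iter() {
        println!("{}: {:.2}x", name, speedup(*vfp, *neon));
    }
}
```

For these sizes the ratio ranges from roughly 1.03x (m004) to roughly 1.41x (m012), well short of the 2x seen at m >= 16.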