[cfe-dev] Performance problem with SIMD support

Richard Hadsell hadsell at blueskystudios.com
Fri Sep 6 15:07:47 PDT 2013


On 09/06/2013 05:27 PM, Richard Hadsell wrote:
> On 09/06/2013 05:16 PM, Richard Hadsell wrote:
>> I'll see what I can do about a simpler test case.
>>
>> I just talked with a colleague more familiar with our SIMD code, and he pointed out that our function called in this test case is using asm code for the SIMD instructions, not the intrinsic functions.  My guess that the difference was due to _mm_ functions was wrong.  I apologize for jumping to the wrong conclusion.
>>
>> This colleague thinks the problem might be related to data packing, which compilers could handle differently.  I will try to send our code in a case that demonstrates the performance issue.
>
> Another colleague pointed out that we are compiling the functions using asm code with '-fno-dse' for G++ builds.  Perhaps this option will help for Clang++, too.
>
> I will try it out as soon as I can re-enable our SIMD code and rebuild everything.
>
This pinpoints the problem.  With G++ we compile the asm-based SIMD code using '-fno-dse', which disables GCC's dead store elimination pass.  Clang++ ignores the option and produces code that runs more slowly.

Is there any plan to support this option in Clang?

BTW, we are unable to compile the asm code with Clang in a debug build (-O0).  We get a bunch of errors like this:

xxx.cc:504:2: error: ran out of registers during register allocation
         compute_factors (f0, f1, f2, df0, df1, df2,
         ^
xxx.cc:302:8: note: expanded from macro 'compute_factors'
         asm ( "movapd %3, %%xmm0                \n\t"   /* xmm0 = f0 */ \
               ^
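
For illustration only: the compute_factors macro itself is not shown in this message, so the following is a minimal sketch of the two styles discussed in this thread, inline asm versus the _mm_ intrinsic functions.  The helper names (add_pd_asm, add_pd_intrinsic, add2) are hypothetical and are not taken from our code.  Unlike the macro above, which names %xmm0 directly, the asm variant here uses "x" constraints so that the compiler chooses the SSE registers.  Whether that change would avoid the -O0 register-allocation error in the real macro is untested.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics; x86 / x86-64 only

// Hypothetical helper (not from the original code): adds two pairs of
// doubles with GCC-style inline asm. The "x" constraints ask the compiler
// to pick the SSE registers instead of hard-coding %xmm0.
static inline __m128d add_pd_asm(__m128d a, __m128d b) {
    asm("addpd %1, %0"   // AT&T operand order: a = a + b
        : "+x"(a)        // operand 0: read and written, any SSE register
        : "x"(b));       // operand 1: read only, any SSE register
    return a;
}

// The same operation written with an SSE2 intrinsic.
static inline __m128d add_pd_intrinsic(__m128d a, __m128d b) {
    return _mm_add_pd(a, b);
}

// Small wrapper so both variants can be compared on plain arrays.
static inline void add2(const double *x, const double *y, double *out,
                        bool use_asm) {
    __m128d a = _mm_loadu_pd(x);  // unaligned loads, so any double[2] works
    __m128d b = _mm_loadu_pd(y);
    __m128d r = use_asm ? add_pd_asm(a, b) : add_pd_intrinsic(a, b);
    _mm_storeu_pd(out, r);
}
```

Calling add2 with use_asm set to true or false on the same inputs should give identical results.  The difference lies only in how much freedom the compiler has to schedule and allocate registers around the operation.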


-- 
Dick Hadsell			203-992-6320  Fax: 203-992-6001
Reply-to:			hadsell at blueskystudios.com
Blue Sky Studios                http://www.blueskystudios.com
1 American Lane, Greenwich, CT 06831-2560



