[cfe-dev] Performance problem with SIMD support

Richard Hadsell hadsell at blueskystudios.com
Tue Sep 10 12:04:08 PDT 2013


On 09/06/2013 06:12 PM, Eric Christopher wrote:
>> This pinpoints the problem. We can compile the asm-based SIMD code with G++ using '-fno-dse', but Clang++ ignores the option and produces code that runs more slowly. 
> That's for the dead store elimination pass. We do eliminate some dead
> stores, and a testcase would be ideal.
I am ready to send you a test case.  The asm SIMD code runs slower than the equivalent C++ non-SIMD code.  How do I send it to you?
>> BTW, we are unable to compile the asm code with Clang in a debug build
>> (-O0).  We get a bunch of errors like this:
>>
>> xxx.cc:504:2: error: ran out of registers during register allocation
>>          compute_factors (f0, f1, f2, df0, df1, df2,
>>          ^
>> xxx.cc:302:8: note: expanded from macro 'compute_factors'
>>          asm ( "movapd %3, %%xmm0                \n\t"   /* xmm0 = f0 */ \
> Means that you've got a lot of inline assembly that basically depends
> upon the compiler picking decent memory operands for your inline asm.
> A couple of comments here:
>
> a) that's a lot of inline assembly then,
> b) you'll be better off using the intrinsics, especially with clang
>
The test case also reproduces the reported errors when compiled with -g instead of -O2.
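
For reference, the intrinsics route suggested in (b) might look roughly like the sketch below. This is not the actual compute_factors macro, which is not shown in this thread; scale_add_pd and its arguments are hypothetical names chosen for illustration. It assumes an x86 target with SSE2 and uses unaligned loads and stores, so it makes no alignment assumptions (movapd would require 16-byte alignment). The point is that the compiler, not the asm author, picks the registers and memory operands.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics (x86/x86-64 only)

// Hypothetical stand-in for one step of a hand-written asm macro:
// out[i] = a[i] * b[i] + c[i] for i = 0, 1 (two doubles per XMM register).
static inline void scale_add_pd(const double* a, const double* b,
                                const double* c, double* out) {
    __m128d va = _mm_loadu_pd(a);   // unaligned load of a[0], a[1]
    __m128d vb = _mm_loadu_pd(b);
    __m128d vc = _mm_loadu_pd(c);
    // Multiply, then add; register allocation is left to the compiler.
    _mm_storeu_pd(out, _mm_add_pd(_mm_mul_pd(va, vb), vc));
}
```

Because no registers are named explicitly, code written this way should not be able to trigger the "ran out of registers" error in the way a large asm block with many operands can at -O0, though I have not verified that against our code.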

-- 
Dick Hadsell			203-992-6320  Fax: 203-992-6001
Reply-to:			hadsell at blueskystudios.com
Blue Sky Studios                http://www.blueskystudios.com
1 American Lane, Greenwich, CT 06831-2560



