[cfe-dev] Performance problem with SIMD support

Eric Christopher echristo at gmail.com
Fri Sep 6 15:12:49 PDT 2013


> This pinpoints the problem.  We can compile the asm-based SIMD code with G++
> using '-fno-dse', but Clang++ ignores the option and produces code that runs
> more slowly.

That option controls the dead store elimination pass. We do eliminate
some dead stores as well, so a testcase showing the slowdown would be
ideal.
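
For reference, a dead store is a write that gets overwritten before
anything reads it. A minimal sketch of the pattern follows; the function
name is made up for illustration and this is not taken from the
reporter's code:

```cpp
// Hypothetical reduced example of a dead store (illustration only).
inline double last_write_wins(double* slot, double a, double b) {
    *slot = a;  // dead store: overwritten below before anything reads it
    *slot = b;  // only this store is observable by the caller
    return *slot;
}
```

An optimizer may drop the first store because doing so does not change
what the caller can observe. A useful testcase would be a small function
like this where the generated code differs between the two compilers.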

>
> Is there any plan to support this option in Clang?
>
> BTW, we are unable to compile the asm code with Clang in a debug build
> (-O0).  We get a bunch of errors like this:
>
> xxx.cc:504:2: error: ran out of registers during register allocation
>         compute_factors (f0, f1, f2, df0, df1, df2,
>         ^
> xxx.cc:302:8: note: expanded from macro 'compute_factors'
>         asm ( "movapd %3, %%xmm0                \n\t"   /* xmm0 = f0 */ \
>

That error means you've got a lot of inline assembly that basically
depends upon the compiler picking decent memory operands for it.
A couple of comments here:

a) that's a lot of inline assembly, and
b) you'll be better off using the intrinsics, especially with clang.
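
As a rough sketch of what the intrinsics route looks like, assuming SSE2
and a made-up kernel (scale_add_pd and its arithmetic are hypothetical
and not what compute_factors actually does):

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics (__m128d, _mm_load_pd, ...)

// Hypothetical stand-in for one step of an asm-based kernel:
//   out[i] = f[i] + scale * df[i]   for i = 0, 1
// f, df and out must be 16-byte aligned, the same requirement movapd has.
inline void scale_add_pd(const double* f, const double* df,
                         double scale, double* out) {
    assert(reinterpret_cast<std::uintptr_t>(f) % 16 == 0);
    assert(reinterpret_cast<std::uintptr_t>(df) % 16 == 0);
    assert(reinterpret_cast<std::uintptr_t>(out) % 16 == 0);

    __m128d vf  = _mm_load_pd(f);       // aligned load of two doubles
    __m128d vdf = _mm_load_pd(df);
    __m128d vs  = _mm_set1_pd(scale);   // broadcast scale to both lanes
    __m128d r   = _mm_add_pd(vf, _mm_mul_pd(vdf, vs));
    _mm_store_pd(out, r);               // aligned store of the result
}
```

There are no fixed xmm registers and no operand constraints here, so the
register allocator is free to pick registers and spill where it needs to,
including in a debug build.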

-eric


