<div>I wrote a small function and compiled it with both clang and gcc:</div>
<div> </div>
<div>File test.c:</div>
<div>double X[100]; double Y[100]; double DA = 0.3;</div>
<div>int f()<br>{<br> int i;</div>
<div> for (i = 0; i < 100; i++)<br> Y[i] = Y[i] - DA * X[i];</div>
<div> return 0;<br>}<br></div>
<div>clang -S -O3 -o test.s test.c -march=native -ccc-echo</div>
<div>result:</div>
<div>"D:/work/trunk/bin/Release/clang.exe" -cc1 -triple i686-pc-win32 -S -disable-fr<br>e -disable-llvm-verifier -main-file-name test.c -mrelocation-model static -mdis<br>ble-fp-elim -masm-verbose -mconstructor-aliases -target-cpu corei7 -momit-leaf-<br>
rame-pointer -coverage-file test.s -resource-dir "D:/work/trunk/bin/Release\\..<br>\lib\\clang\\3.1" -fmodule-cache-path "C:\\DOCUME~1\\ADMINI~1\\LOCALS~1\\Temp\\<br>lang-module-cache" -internal-isystem D:/work/trunk/bin/Release/../lib/clang/3.1<br>
include -internal-isystem "C:\\Program Files\\Microsoft Visual Studio 9.0\\VC\\<br>nclude" -internal-isystem "C:\\Program Files\\Microsoft SDKs\\Windows\\v6.0A\\\<br>include" -O3 -ferror-limit 19 -fmessage-length 80 -mstackrealign -fms-extension<br>
-fms-compatibility -fmsc-version=1300 -fdelayed-template-parsing -fgnu-runtime<br>-fobjc-runtime-has-arc -fobjc-runtime-has-weak -fobjc-fragile-abi -fdiagnostics<br>show-option -fcolor-diagnostics -o test.s -x c test.c</div>
<div> </div>
<div> .def _f;<br> .scl 2;<br> .type 32;<br> .endef<br> .text<br> .globl _f<br> .align 16, 0x90<br>_f: # @f<br># BB#0:<br> movl $-800, %eax # imm = 0xFFFFFFFFFFFFFCE0<br> movsd _DA, %xmm0<br>
.align 16, 0x90<br>LBB0_1: # =>This Inner Loop Header: Depth=1<br> movsd _X+800(%eax), %xmm1<br> mulsd %xmm0, %xmm1<br> movsd _Y+800(%eax), %xmm2<br> subsd %xmm1, %xmm2<br> movsd %xmm2, _Y+800(%eax)<br>
addl $8, %eax<br> jne LBB0_1<br># BB#2:<br> xorl %eax, %eax<br> ret</div>
<div> .data<br> .globl _DA # @DA<br> .align 8<br>_DA:<br> .quad 4599075939470750515 # double 3.000000e-01</div>
<div> .comm _Y,800,3 # @Y<br> .comm _X,800,3 # @X</div>
<div> </div>
<div> </div>
<div>gcc -S -O3 -o test2.s test.c -march=native</div>
<div>result:</div>
<div> .file "test.c"<br> .text<br> .p2align 4,,15<br>.globl _f<br> .def _f; .scl 2; .type 32; .endef<br>_f:<br> pushl %ebp<br> movddup _DA, %xmm2<br> movl %esp, %ebp<br> xorl %eax, %eax<br> .p2align 4,,10<br>L2:<br>
movapd _Y(%eax), %xmm0<br> movapd _X(%eax), %xmm1<br> mulpd %xmm2, %xmm1<br> subpd %xmm1, %xmm0<br> movapd %xmm0, _Y(%eax)<br> addl $16, %eax<br> cmpl $800, %eax<br> jne L2<br> xorw %ax, %ax<br> leave<br> ret<br>.globl _DA<br>
.data<br> .align 16<br>_DA:<br> .long 858993459<br> .long 1070805811<br> .comm _X, 800, 5<br> .comm _Y, 800, 5<br></div>
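<div>To make the difference concrete, here is a rough hand-written sketch of what gcc's loop appears to do, using SSE2 intrinsics: DA is broadcast into both lanes once, then two doubles are processed per iteration. The name f_sse2 is hypothetical, and I use unaligned loads and stores as an assumption so the sketch does not depend on how the arrays are aligned (gcc itself used movapd because it aligned X and Y).</div>

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

double X[100]; double Y[100]; double DA = 0.3;

/* Hypothetical hand-vectorized counterpart of f(); not compiler output. */
int f_sse2(void)
{
    /* Broadcast DA into both 64-bit lanes (gcc did this with movddup). */
    __m128d da = _mm_set1_pd(DA);
    int i;

    /* 100 is even, so stepping by 2 covers every element exactly once. */
    for (i = 0; i < 100; i += 2) {
        __m128d x = _mm_loadu_pd(&X[i]);          /* X[i], X[i+1] */
        __m128d y = _mm_loadu_pd(&Y[i]);          /* Y[i], Y[i+1] */
        y = _mm_sub_pd(y, _mm_mul_pd(da, x));     /* Y - DA * X, two lanes */
        _mm_storeu_pd(&Y[i], y);
    }
    return 0;
}
```

<div>This runs 50 iterations instead of the 100 in the scalar code clang produced, which is the gap I am asking about.</div>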
<div> </div>
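<div>It seems gcc emits more efficient instructions: it vectorizes the loop with packed SSE operations (movapd/mulpd/subpd, two doubles per iteration), while clang emits scalar movsd/mulsd/subsd, one double per iteration. Are there any clang command-line arguments that produce a similar result?</div>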
<div> </div>
<div> </div>