[PATCH] Add memory variant of vcvtps2ph.
llvm-dev at redking.me.uk
Mon Feb 2 06:04:53 PST 2015
Thanks for working on this, Alex. I have a concern about the 128->64 bit folded store.
Comment at: lib/Target/X86/X86InstrSSE.td:8523
@@ +8522,3 @@
+ (VCVTPS2PHmr addr:$dst, VR128:$src1, imm:$src2)>;
+ def : Pat<(store (v8i16 (int_x86_vcvtps2ph_256 VR256:$src1, i32:$src2)),
VCVTPS2PHrr writes an entire xmm register (the four half-float results go to the lower 64 bits, and the upper 64 bits are cleared), but VCVTPS2PHmr writes only 64 bits to memory. With this pattern, the upper 64 bits of the 128-bit store would therefore be left untouched in memory rather than zeroed. Could you please update the pattern to use a suitable extract, so that only the lower 64 bits are stored?
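To make the mismatch concrete, here is a small Python model of the two forms. It is only a sketch, not LLVM code: the function names are hypothetical, the rounding-control immediate is ignored (round-to-nearest is assumed), and memory is modelled as a bytearray pre-filled with 0xAA so that untouched bytes are visible.

```python
import struct

def cvtps2ph_rr(src):
    """Model of the register form: four floats become four half floats in
    the low 8 bytes of a 16-byte xmm image; the high 8 bytes are zeroed."""
    assert len(src) == 4
    return struct.pack("<4e", *src) + bytes(8)

def cvtps2ph_mr(mem, offset, src):
    """Model of the memory form: only 8 bytes are written at the address."""
    assert len(src) == 4
    mem[offset:offset + 8] = struct.pack("<4e", *src)

def store128(mem, offset, xmm):
    """Model of a full 128-bit vector store."""
    assert len(xmm) == 16
    mem[offset:offset + 16] = xmm

src = [1.0, 2.0, 0.5, -1.5]

# What the IR asks for: convert in a register, then store all 128 bits.
expected = bytearray(b"\xAA" * 16)
store128(expected, 0, cvtps2ph_rr(src))

# What the folded pattern would emit: a single 64-bit memory-form store.
folded = bytearray(b"\xAA" * 16)
cvtps2ph_mr(folded, 0, src)

low_match = expected[:8] == folded[:8]    # converted values agree
high_match = expected[8:] == folded[8:]   # zeros vs. stale bytes

print("low 64 bits match: ", low_match)
print("high 64 bits match:", high_match)
```

In this model the low halves agree and the high halves do not, which is why the fold is only safe when the stored value is the lower 64 bits extracted from the result.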