[PATCH] prevent folding a scalar FP load into a packed logical FP instruction (PR22371)
Sanjay Patel
spatel at rotateright.com
Thu Feb 12 15:20:47 PST 2015
================
Comment at: lib/Target/X86/X86InstrFragmentsSIMD.td:374
@@ +373,3 @@
+def loadf32_128 : PatFrag<(ops node:$ptr),
+ (bitconvert (v4f32 (scalar_to_vector (loadf32 node:$ptr))))>;
+def loadf64_128 : PatFrag<(ops node:$ptr),
----------------
qcolombet wrote:
> This is not valid, is it?
>
> When this matches we will read 128 bits from memory, i.e., past what we do for the load32, won't we?
>
> Something correct would be load128 -> extract element.
> Though I do not think that happens a lot…
I don't understand; we only want to match a 32-bit load that has been extended to fit in the 128-bit register, right? Isn't that what loadf32 guarantees? Perhaps this should be a zero-extend rather than scalar_to_vector though?
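To make the concern concrete, here is a minimal sketch in plain C++ (not TableGen, and not code from this patch) of the two access widths being discussed. It assumes that the memory operand of a packed logical instruction such as ANDPS is a full 128-bit read, while loadf32 reads only 32 bits. The names Vec128, scalar_load_to_vector, packed_memory_operand, and access_in_bounds are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a 128-bit XMM register viewed as four 32-bit lanes.
struct Vec128 { std::uint32_t lane[4]; };

// Models (v4f32 (scalar_to_vector (loadf32 ptr))): only 4 bytes are read.
// In the DAG the upper lanes are undefined; they are zeroed here purely so
// the sketch is deterministic.
Vec128 scalar_load_to_vector(const unsigned char* mem, std::size_t offset,
                             std::size_t& bytes_read) {
    Vec128 v{};
    std::memcpy(&v.lane[0], mem + offset, 4);
    bytes_read = 4;
    return v;
}

// Models the memory operand of a packed logical instruction (assumed to be a
// 128-bit access): all 16 bytes starting at the address are read.
Vec128 packed_memory_operand(const unsigned char* mem, std::size_t offset,
                             std::size_t& bytes_read) {
    Vec128 v{};
    std::memcpy(v.lane, mem + offset, 16);
    bytes_read = 16;
    return v;
}

// True if reading `width` bytes at `offset` stays inside an object of `size` bytes.
bool access_in_bounds(std::size_t size, std::size_t offset, std::size_t width) {
    return offset <= size && width <= size - offset;
}
```

For a lone 4-byte float, access_in_bounds(4, 0, 4) holds but access_in_bounds(4, 0, 16) does not, which is the over-read being asked about if the scalar load were folded into the packed instruction.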
http://reviews.llvm.org/D7474