[PATCH] prevent folding a scalar FP load into a packed logical FP instruction (PR22371)
Quentin Colombet
qcolombet at apple.com
Thu Feb 12 15:30:00 PST 2015
================
Comment at: lib/Target/X86/X86InstrFragmentsSIMD.td:374
@@ +373,3 @@
+def loadf32_128 : PatFrag<(ops node:$ptr),
+ (bitconvert (v4f32 (scalar_to_vector (loadf32 node:$ptr))))>;
+def loadf64_128 : PatFrag<(ops node:$ptr),
----------------
spatel wrote:
> qcolombet wrote:
> > This is not valid, is it?
> >
> > When this matches, we will read 128 bits from memory, i.e., past what we read for the load32, won't we?
> >
> > Something correct would be load128 -> extract element.
> > Though I do not think that happens a lot…
> I don't understand; we only want to match a 32-bit load that has been extended to fit in the 128-bit register, right? Isn't that what loadf32 guarantees? Perhaps this should be a zero-extend rather than scalar_to_vector though?
>
Well, I may certainly be misreading the uses of loadf32_128, but isn't this used to fold the load into the related operation, so that we read 128 bits from memory?
http://reviews.llvm.org/D7474