[PATCH] prevent folding a scalar FP load into a packed logical FP instruction (PR22371)

Quentin Colombet qcolombet at apple.com
Fri Feb 13 10:20:12 PST 2015


================
Comment at: lib/Target/X86/X86InstrFragmentsSIMD.td:374
@@ +373,3 @@
+def loadf32_128 : PatFrag<(ops node:$ptr),
+  (bitconvert (v4f32 (scalar_to_vector (loadf32 node:$ptr))))>;
+def loadf64_128 : PatFrag<(ops node:$ptr),
----------------
spatel wrote:
> qcolombet wrote:
> > spatel wrote:
> > > qcolombet wrote:
> > > > This is not valid, is it?
> > > > 
> > > > When this matches we will read 128-bit from the memory, i.e., pass what we do for the load32. Aren’t we?
> > > > 
> > > > Something correct would be load128 -> extract element.
> > > > Though I do not think that happens a lot…
> > > I don't understand; we only want to match a 32-bit load that has been extended to fit in the 128-bit register, right? Isn't that what loadf32 guarantees? Perhaps this should be a zero-extend rather than scalar_to_vector though?
> > > 
> > Well I may certainly misread the uses of loadf32_128, but does not this is used to fold the load in the related operation, thus we read 128-bit in memory, don't we?
> I think I understand now: this is only working because it's difficult to match the more complicated pattern. If we somehow managed to match that pattern, then we'd trigger the same bug all over again.
> 
> So the fundamental problem is that we're creating instruction definitions that don't exist in x86 reality. I see 2 potential fixes: 
> 
> (1) Pass a null / void memory operand pattern to the existing multiclass:
> 
>  defm PS : sse12_fp_packed<opc, !strconcat(OpcodeStr, "ps"), OpNode, FR32,
>  f32, f128mem, [how to specify that nothing works here?] , SSEPackedSingle, itins>, PS;
> 
> (2) Write a new multiclass that just has a register-register variant; no reg-mem option is possible.
For #1, I'm not sure this is possible.

For #2, that sounds like the right approach. Just make sure that the related opcode are not used anywhere. If they are double check that this is correct and if it is correct, make sure to have a definition for those, just omit the patterns ([]).

That's a lot of "if" :).

Thanks.

http://reviews.llvm.org/D7474

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/






More information about the llvm-commits mailing list