[PATCH] D11438: Part 2 to fix x86_64 fp128 calling convention.

Sat Dec 5 21:32:31 PST 2015

davidxl added inline comments.

================
Comment at: lib/Target/X86/X86InstrSSE.td:8862
@@ +8861,3 @@
+          (MOVAPSmr addr:$dst, (COPY_TO_REGCLASS (f128 FR128:$src), VR128))>;
+// When the data is used as floating point, "movaps" should be faster and shorter
+// than "movdqa". "movaps" is in SSE and movdqa is in SSE2.
----------------
Move the comment above the pattern definition.

1) movaps is shorter than movdqa; state that as a fact rather than 'should be'.
2) For the 'faster' claim, cite a reference. In fact, f128 operations should be treated as integer-domain, so movdqa should be used to avoid the domain bypass penalty.

================
Comment at: lib/Target/X86/X86InstrSSE.td:8868
@@ +8867,3 @@
+
+// andps is faster and shorter than andpd, andps is SSE and andpd is SSE2
+def : Pat<(X86fand FR128:$src1, (loadf128 addr:$src2)),
----------------
pand is the SIMD integer form of this operation. andps is shorter, though.
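
To make the 'shorter' point concrete, here is a rough sketch of the encoded sizes. It assumes the legacy (non-VEX) SSE encodings, register-to-register forms with xmm0 and xmm1 as illustrative operands, and no REX prefix. The ENCODINGS table and the encoded_size helper are names made up for this illustration and are not part of the patch.

```python
# Legacy (non-VEX) SSE encodings of "<op> xmm0, xmm1".
# ModRM byte 0xC1 = mod 11 (register), reg = xmm0, rm = xmm1.
# The operands are illustrative assumptions, not taken from the patch.
ENCODINGS = {
    "movaps": bytes([0x0F, 0x28, 0xC1]),        # no mandatory prefix
    "movdqa": bytes([0x66, 0x0F, 0x6F, 0xC1]),  # mandatory 0x66 prefix
    "andps":  bytes([0x0F, 0x54, 0xC1]),        # no mandatory prefix
    "andpd":  bytes([0x66, 0x0F, 0x54, 0xC1]),  # mandatory 0x66 prefix
    "pand":   bytes([0x66, 0x0F, 0xDB, 0xC1]),  # mandatory 0x66 prefix
}


def encoded_size(mnemonic):
    """Return the size in bytes of the register-to-register form."""
    return len(ENCODINGS[mnemonic])


for name, enc in ENCODINGS.items():
    hex_bytes = " ".join(f"{b:02x}" for b in enc)
    print(f"{name:7s} {hex_bytes:12s} {len(enc)} bytes")
```

The one-byte difference comes entirely from the 0x66 prefix, so it says nothing about speed; the domain bypass question still needs a reference.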

http://reviews.llvm.org/D11438