[PATCH] D45616: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR

Thu Jun 14 06:50:43 PDT 2018

spatel added inline comments.

================
Comment at: lib/CodeGen/CGBuiltin.cpp:10107-10112
+    case 0x0b: // FALSE_OQ
+    case 0x1b: // FALSE_OS
+      return llvm::Constant::getNullValue(ConvertType(E->getType()));
+    case 0x0f: // TRUE_UQ
+    case 0x1f: // TRUE_US
+      return llvm::Constant::getAllOnesValue(ConvertType(E->getType()));
----------------
GBuella wrote:
> spatel wrote:
> > On 2nd thought, why are we optimizing when we have matching IR predicates for these?
> > Just translate to FCMP_TRUE / FCMP_FALSE instead of special-casing these values.
> > InstSimplify can handle the constant folding if optimization is on.
> I don't know, these TRUE/FALSE cases were already handled here, I only rearranged the code.
> Does this cause any problems? I mean, if it meant an extra dozen lines of code I would get it, but as it is, does it hurt anything?
It's mostly about being consistent. I think it's generally out-of-bounds for clang to optimize code. That's not its job.

The potential end user difference is that in unoptimized code, a user might expect to see the vcmpXX asm corresponding to the source-level intrinsic when debugging.

I agree that this is changing existing behavior, so it's better if we make this change in a separate patch, either before or after this one.
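
As a rough sketch of the suggestion above (not the code in this patch): the four constant-result immediates would select a predicate rather than return a constant. The enum `FPred` and the function `classifyCmpImm` are hypothetical stand-ins so the example compiles without LLVM headers; in CGBuiltin.cpp the two results would presumably correspond to llvm::CmpInst::FCMP_FALSE and llvm::CmpInst::FCMP_TRUE, with the fcmp emitted as for any other predicate.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two llvm::CmpInst predicates discussed here.
enum class FPred { False, True, Other };

// Map a comparison immediate to a predicate instead of folding it to a
// constant in the frontend. Only the four constant-result immediates from
// the quoted hunk are modeled; everything else is reported as Other.
inline FPred classifyCmpImm(std::uint8_t Imm) {
  switch (Imm) {
  case 0x0b: // FALSE_OQ
  case 0x1b: // FALSE_OS
    return FPred::False; // would translate to FCMP_FALSE
  case 0x0f: // TRUE_UQ
  case 0x1f: // TRUE_US
    return FPred::True; // would translate to FCMP_TRUE
  default:
    return FPred::Other; // remaining immediates are not modeled in this sketch
  }
}

// Small self-check of the mapping.
inline void checkConstantPredicates() {
  assert(classifyCmpImm(0x0b) == FPred::False);
  assert(classifyCmpImm(0x1f) == FPred::True);
}
```

With this shape, unoptimized builds would still carry a compare in the IR, and InstSimplify could fold the fcmp true / fcmp false when optimization is on.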

https://reviews.llvm.org/D45616