[PATCH] D48225: [llvm-mca][X86] Teach how to identify register writes that implicitly clear the upper portion of a super-register.

Tue Jun 19 10:54:43 PDT 2018

andreadb added inline comments.

================
Comment at: lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp:348
+    // bits of the underlying ZMM register.
+    if ((HasVEX || HasEVEX) && VR256XRC.contains(RegID))
+      return true;
----------------
craig.topper wrote:
> What should happen if you enable avx512f and xop instructions at the same time? I know no real CPU supports it, but should a 256-bit xop instruction clear the upper bits of zmm?
For XOP instructions, the "AMD64 Architecture Programmer’s Manual Volume 4: 128-Bit and 256-Bit Media Instructions" says:

```Bits [255:128] of the YMM register that corresponds to the destination are cleared```

That sentence applies to XOP instructions that write an XMM register.

However, the same document uses a very similar sentence when describing AVX instructions. For example, for VADDPD we have this:

```
XMM Encoding:
The first source operand is an XMM register. The second source operand is either an XMM register or a 128-bit memory location. The destination is a third XMM register. Bits [255:128] of the YMM register that corresponds to the destination are cleared.
```

VLMAX (that is, the concept of a "maximum vector register width" for the processor) is not mentioned anywhere in the document. So I honestly don't know what the right answer to your question is.

If we want to be conservative, we can assume for now that XOP does not update the upper bits of a ZMM register. In the future (if AMD decides not to drop XOP), we can revisit this choice and update/simplify this code. What do you think?
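
To make the conservative option concrete, here is a minimal standalone sketch. It is not the patch code: `Encoding` and `clearsUpperZMMBits` are hypothetical names, and the register-class check is reduced to a plain boolean instead of `VR256XRC.contains(RegID)`.

```cpp
// Hypothetical stand-in for the instruction encoding flags.
// These are illustrative only, not the real LLVM types.
struct Encoding {
  bool HasVEX = false;
  bool HasEVEX = false;
  bool HasXOP = false;
};

// Conservative rule (an assumption, pending clarification from the AMD
// manual): a write to a 256-bit vector register clears the upper bits of
// the underlying ZMM register only for VEX or EVEX encoded instructions.
// HasXOP is deliberately ignored, so an XOP-only instruction returns false.
bool clearsUpperZMMBits(const Encoding &Enc, bool IsVR256Write) {
  if (!IsVR256Write)
    return false;
  return Enc.HasVEX || Enc.HasEVEX;
}
```

With this rule, a 256-bit XOP write is treated as leaving the upper ZMM bits untouched. If we later decide that XOP behaves like VEX, the change is a single extra term in the return expression.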

https://reviews.llvm.org/D48225