[PATCH] D29874: [X86] Generate VZEROUPPER for Skylake-avx512

Amjad Aboud via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 15 04:08:11 PST 2017


aaboud marked an inline comment as done.
aaboud added a comment.

Thanks Michael for the comments.
Answer inline.



================
Comment at: lib/Target/X86/X86VZeroUpper.cpp:222
+    // explicitly specified.
+    if (IsCall && !callHasRegMask(MI))
       continue;
----------------
hliao wrote:
> aaboud wrote:
> > craig.topper wrote:
> > > aaboud wrote:
> > > > craig.topper wrote:
> > > > > Why did this change to not check specific register clobbers?
> > > > We can get to this point only if ClobberAllTmmReg is true, in such case ClobberAnyYmmReg will also be true, and we will fall-through exactly as we do with this implementation.
> > > > This was a kind of did code.
> > > > 
> > > > Also, can you explain what is unique with "cannot clobber any" and "cannot clobber all" regarding these lines?
> > > This is probably the first time I've really looked at this code so I think explain what is unique without more thought.
> > Michael added this code about 3 years ago, he might be able to explain better why he implemented it like this.
> for CALL, clobberAllTmmReg checks whether it clobbers ALL tmm regs. Here, clobberAnyTmmReg checks whether that call will uses any YMM registers. they are not equivalent.
> 
> however, a previous test case added seems not valid any more due to the calling into builtin library _ftol2 is replaced with native code sequence. not sure whether we will have other code to reproduce PR17631.
That is right that clobberAllYmmReg and clobberAnyYmmReg are different.
However:
1. If clobberAllYmmReg is false, we will not get to this line.
2. If clobberAllYmmReg is true, then clobberAnyYmmReg is true, i.e. !clobberAnyYmmReg is false.

So at this point, we will "continue" only if Call instruction has no RegMask.

Now, can you explain why we would not add vzeroupper before call instruction that does not clobberAnyTmmReg (i.e. has all YMM registers in its RegMask), but we do not want to add the vzeroupper before function that has only some of the YMM registers in the RegMask?
I believe that we want to add vzeroupper in both cases.

In fact, I believe that we should not check for clobbering at all in this pass (but this is for other patch).


https://reviews.llvm.org/D29874





More information about the llvm-commits mailing list