[LLVMdev] X86VZeroUpper optimization question.

Lang Hames lhames at gmail.com
Fri Mar 14 17:57:08 PDT 2014


Hi all,

Upon further investigation I can confirm that the pass does miss some
important cases, and visits some instructions more often than it needs to. A
rewrite is in the works and will be posted soonish.

Cheers,
Lang.


On Thu, Mar 13, 2014 at 3:36 PM, Lang Hames <lhames at gmail.com> wrote:

> Hi Bruno,
>
> I'm looking at a test case where we're failing to insert a vzeroupper
> between an instruction that dirties the YMM regs and a call that uses SSE
> regs. No test case yet - I'm still trying to reduce it to something sane. I
> can see where the logic in the X86VZeroUpper optimization goes off the
> rails though: The entry state for the basic block is ST_UNKNOWN, and the
> optimization contains the following logic:
>
> if (CurState == ST_DIRTY) {
>   // Only insert the VZEROUPPER in case the entry state isn't unknown.
>   // When unknown, only compute the information within the block to have
>   // it available in the exit if possible, but don't change the block.
>   if (EntryState != ST_UNKNOWN) {
>     BuildMI(BB, I, dl, TII->get(X86::VZEROUPPER));
>     ++NumVZU;
>   }
>   // After the inserted VZEROUPPER the state becomes clean again, but
>   // other YMM may appear before other subsequent calls or even before
>   // the end of the BB.
>   CurState = ST_CLEAN;
> }
>
> If CurState == ST_DIRTY and EntryState == ST_UNKNOWN, then some
> instruction in this basic block has dirtied the YMM regs. In that case, why
> would you want to avoid putting a vzeroupper instruction in? Is it just to
> avoid inserting duplicate vzerouppers when the block is revisited? If
> that's the case then I think the problem is actually in
> runOnMachineFunction, which contains the comment: "Each BB state depends on
> all predecessors, loop over until everything converges.  (Once we converge,
> we can implicitly mark everything that is still ST_UNKNOWN as ST_CLEAN.)".
> We do iterate to convergence, but we don't mark anything as clean
> afterwards, nor do a final re-visit of the basic blocks that had previously
> had ST_UNKNOWN entry states. Is that an oversight?
>
> Cheers,
> Lang.
>
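
For illustration only, here is a minimal standalone sketch of the two-phase
scheme the quoted message hints at: iterate the per-block states to
convergence without touching the code, then treat any entry state that is
still unknown as clean and visit each block exactly once to place the
vzerouppers. This is a toy model, not the actual pass or its rewrite. The
names Block, Op, State, entryState, transfer and placeVZeroUpper are all
hypothetical, and it assumes that the function entry is clean and that the
state is clean after any call.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy model, not LLVM's real data structures: every name here is hypothetical.
enum class State { Unknown, Clean, Dirty };
enum class Op { Other, UsesYMM, Call };

struct Block {
  std::vector<Op> Ops;
  std::vector<std::size_t> Preds; // indices of predecessor blocks
};

using InsertList = std::vector<std::pair<std::size_t, std::size_t>>;

// Merge predecessor exit states: any Dirty wins, then any Unknown, else Clean.
// Assumption: a block with no predecessors (the function entry) starts Clean.
State entryState(const Block &B, const std::vector<State> &Exit) {
  State S = State::Clean;
  for (std::size_t P : B.Preds) {
    assert(P < Exit.size() && "predecessor index out of range");
    if (Exit[P] == State::Dirty)
      return State::Dirty;
    if (Exit[P] == State::Unknown)
      S = State::Unknown;
  }
  return S;
}

// Walk one block and return its exit state. If Inserts is non-null, record a
// (block, op index) pair for every call that is reached in the Dirty state;
// a vzeroupper would be placed immediately before that call.
State transfer(const Block &B, std::size_t Id, State Cur, InsertList *Inserts) {
  for (std::size_t I = 0; I < B.Ops.size(); ++I) {
    if (B.Ops[I] == Op::UsesYMM) {
      Cur = State::Dirty;
    } else if (B.Ops[I] == Op::Call) {
      if (Cur == State::Dirty && Inserts)
        Inserts->emplace_back(Id, I);
      Cur = State::Clean; // simplification: clean after any call
    }
  }
  return Cur;
}

InsertList placeVZeroUpper(const std::vector<Block> &F) {
  std::vector<State> Exit(F.size(), State::Unknown);

  // Phase 1: iterate exit states to a fixed point; nothing is inserted here.
  for (bool Changed = true; Changed;) {
    Changed = false;
    for (std::size_t I = 0; I < F.size(); ++I) {
      State S = transfer(F[I], I, entryState(F[I], Exit), nullptr);
      if (S != Exit[I]) {
        Exit[I] = S;
        Changed = true;
      }
    }
  }

  // Phase 2: entry states that are still Unknown are treated as Clean, and
  // each block is visited exactly once, so no duplicates can be recorded.
  InsertList Inserts;
  for (std::size_t I = 0; I < F.size(); ++I) {
    State In = entryState(F[I], Exit);
    if (In == State::Unknown)
      In = State::Clean;
    transfer(F[I], I, In, &Inserts);
  }
  return Inserts;
}
```

Because insertion only happens in the second phase, a block whose entry state
never resolves (for example, one that follows a loop containing neither YMM
uses nor calls) still gets its vzeroupper when it dirties the registers itself
before a call, and revisiting blocks during convergence cannot produce
duplicates.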