[PATCH] D110869: [X86] Implement -fzero-call-used-regs option

Bill Wendling via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 8 12:11:25 PST 2022


void added inline comments.


================
Comment at: llvm/lib/Target/X86/X86FrameLowering.cpp:595
+        continue;
+      XorOp = X86::PXORrr;
+    } else if (X86::VR256RegClass.contains(Reg)) {
----------------
nickdesaulniers wrote:
> void wrote:
> > nickdesaulniers wrote:
> > > Is there any way to figure the `XorOp` outside of this loop? Seems unfortunate to repeatedly scan almost every register class for every used register.
> > > 
> > > Like instead of querying each register set whether a given register is in it, is it possible to ask a register what register class it's in? Or can a register belong to more than one register class?
> > There's a function in TRI that you can call to grab the RegClass of a register, but it calls `contains` on each register class to see whether the register belongs to it, so it would be worse than this code.
> > 
> > In practice, the register classes won't have many members in them. It sucks, but it's probably something like `16*16` in a worst case scenario.
> > 
> > (I think registers can belong to multiple register classes (e.g. sub- and super-classes), but don't quote me on that.)
> Right, I just get a sinking feeling that we're going to repeatedly scan these RegClass lists for every MachineFunction, when the answer doesn't change; we should eventually be able to look these up in O(1).
That could be a separate change, since it's incidental to this feature. We could build a map from register to this information once and reuse it, which would avoid the repeated `contains` scans.
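
A rough sketch of what such a map could look like, not part of this patch. The `Reg` and `XorOp` enums and the function names below are hypothetical stand-ins for the real X86 register numbers and opcodes (only `PXORrr` appears in the hunk above), and the subtarget checks the real loop performs are ignored here:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the X86 register and opcode enums.
enum Reg : uint8_t { EAX, ECX, XMM0, XMM1, YMM0, NumRegs };
enum XorOp : uint8_t { NoXor, XOR32rr, PXORrr, VPXORYrr };

// Slow path: test each (hypothetical) register class in turn, which is
// roughly what the chain of RegClass.contains(Reg) calls does today.
static XorOp classifyByScan(Reg R) {
  if (R == EAX || R == ECX)
    return XOR32rr;
  if (R == XMM0 || R == XMM1)
    return PXORrr;
  if (R == YMM0)
    return VPXORYrr;
  return NoXor;
}

// Run the scan once per register and remember the answer.
static std::array<XorOp, NumRegs> buildXorTable() {
  std::array<XorOp, NumRegs> Table{};
  for (unsigned I = 0; I < NumRegs; ++I)
    Table[I] = classifyByScan(static_cast<Reg>(I));
  return Table;
}

// Fast path: a single array index per query after the first call.
XorOp xorOpFor(Reg R) {
  static const std::array<XorOp, NumRegs> Table = buildXorTable();
  assert(R < NumRegs && "register out of range");
  return Table[R];
}
```

The table is filled on first use, so the per-class scan runs once per register instead of once per register per MachineFunction.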


================
Comment at: llvm/lib/Target/X86/X86RegisterInfo.cpp:629-633
+  if (!ST.is64Bit())
+    return llvm::any_of(
+               SmallVector<MCRegister>{X86::EAX, X86::ECX, X86::EDX},
+               [&](MCRegister &RegA) { return IsSubReg(RegA, Reg); }) ||
+           (ST.hasMMX() && X86::VR64RegClass.contains(Reg));
----------------
nickdesaulniers wrote:
> Do we need to clear stack slots for i386?
> 
> The Linux kernel uses `-mregparm=3` to use a faster, though custom, calling convention. This corresponds to the `inreg` parameter attribute in LLVM IR.
> 
> Otherwise, perhaps a todo and diagnose `-m32` in the front end?
GCC doesn't clear stack slots for i386 (https://godbolt.org/z/af8e61d3q), so I think we can omit that as well.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110869/new/

https://reviews.llvm.org/D110869


