[PATCH] D110869: [X86] Implement -fzero-call-used-regs option
Bill Wendling via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Feb 8 12:11:25 PST 2022
void added inline comments.
================
Comment at: llvm/lib/Target/X86/X86FrameLowering.cpp:595
+ continue;
+ XorOp = X86::PXORrr;
+ } else if (X86::VR256RegClass.contains(Reg)) {
----------------
nickdesaulniers wrote:
> void wrote:
> > nickdesaulniers wrote:
> > > Is there any way to figure the `XorOp` outside of this loop? Seems unfortunate to repeatedly scan almost every register class for every used register.
> > >
> > > Like instead of querying each register set whether a given register is in it, is it possible to ask a register what register class it's in? Or can a register belong to more than one register class?
> > There's a function in TRI that you can call to grab the RegClass of a register, but it calls a "`contains`" on each register class to see if it belongs in it, so it would be worse than this code.
> >
> > In practice, the register classes won't have many members in it. It sucks, but it's probably something like `16*16` in a worst case scenario.
> >
> > (I think registers can belong to multiple register classes (e.g. sub- and super-classes), but I don't quote me on that.)
> Right, I just get a sinking feeling that we're going to repeatedly scan these RegClass lists for every MachineFunction, when the answer doesn't change; we should eventually be able to map these O(1).
That could be a separate change, since it's incidental to this feature. We could create a map of this information, which should help performance a lot.
================
Comment at: llvm/lib/Target/X86/X86RegisterInfo.cpp:629-633
+ if (!ST.is64Bit())
+ return llvm::any_of(
+ SmallVector<MCRegister>{X86::EAX, X86::ECX, X86::EDX},
+ [&](MCRegister &RegA) { return IsSubReg(RegA, Reg); }) ||
+ (ST.hasMMX() && X86::VR64RegClass.contains(Reg));
----------------
nickdesaulniers wrote:
> Do we need to clear stack slots for i386?
>
> The Linux kernel uses `-mreg-param=3` to use a faster, though custom calling convention. This corresponds to the `inreg` parameter attribute in LLVM IR.
>
> Otherwise, perhaps a todo and diagnose `-m32` in the front end?
Clearing stack slots for i386 isn't done by GCC (https://godbolt.org/z/af8e61d3q). I think we can omit doing that.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D110869/new/
https://reviews.llvm.org/D110869
More information about the llvm-commits
mailing list