[PATCH] D110869: [X86] Implement -fzero-call-used-regs option

Bill Wendling via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 7 16:20:59 PST 2022


void added a comment.

In D110869#3296047 <https://reviews.llvm.org/D110869#3296047>, @craig.topper wrote:

> In D110869#3296031 <https://reviews.llvm.org/D110869#3296031>, @void wrote:
>
>> In D110869#3295912 <https://reviews.llvm.org/D110869#3295912>, @craig.topper wrote:
>>
>>> In D110869#3295906 <https://reviews.llvm.org/D110869#3295906>, @void wrote:
>>>
>>>> In D110869#3295578 <https://reviews.llvm.org/D110869#3295578>, @nickdesaulniers wrote:
>>>>
>>>>> In D110869#3295559 <https://reviews.llvm.org/D110869#3295559>, @void wrote:
>>>>>
>>>>>> Weird. We generate similar code to GCC:
>>>>>>
>>>>>>   Clang:
>>>>>>   _paravirt_ident_64:                     # @_paravirt_ident_64
>>>>>>           movq    %rdi, %rax
>>>>>>           xorq    %rdi, %rdi
>>>>>>           retq
>>>>>>   
>>>>>>   GCC:
>>>>>>   _paravirt_ident_64:
>>>>>>           movq    %rdi, %rax      # tmp85, x
>>>>>>           xorl    %edi, %edi      #
>>>>>>           ret
>>>>>
>>>>> Does `xorl` not zero the upper 32b?
>>>>
>>>> I'm thinking no. But it's odd, because both are using `%rdi` but GCC is only zeroing out the bottom 32-bits. Seems a bit counter-intuitive.
>>>
>>> Every write to a 32-bit register on x86-64 zeros bits 63:32 of the register. `xorl %edi, %edi` has the same behavior as `xorq %rdi, %rdi`, but is 1 byte shorter to encode.
>>
>> Oh, interesting! TIL. So it's really not profitable to use `xorq` at all here...Though it does beg the question of why `xorq` exists. :-)
>
> xorq is still useful when the registers are different.

So `xorl %ecx, %edx` doesn't zero out all 64-bits of `%rcx` and `%rdx`? That's two 32-bit writes to two different registers, isn't it?



================
Comment at: clang/test/CodeGen/zero-call-used-regs.c:1-2
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu %s -emit-llvm -fzero-call-used-regs=skip -o - | FileCheck %s --check-prefix CHECK-SKIP
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu %s -emit-llvm -fzero-call-used-regs=used-gpr-arg -o - | FileCheck %s --check-prefix CHECK-USED-GPR-ARG
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu %s -emit-llvm -fzero-call-used-regs=used-gpr -o - | FileCheck %s --check-prefix CHECK-USED-GPR
----------------
nickdesaulniers wrote:
> If you use `--check-prefixes=<unique>,<non-unique>` i.e.
> 
> `--check-prefixes=CHECK-USED-GPR-ARG,CHECK`
> 
> then you could combine a whole lot of the below `CHECK-<unique>` into un-suffixed `CHECK:`.
> 
> i.e.
> 
> ```
> // CHECK-SKIP:               attributes #[[ATTR_NUM_SKIP]] = {{.*}} "zero-call-used-regs"="skip"
> ```
> could become
> ```
> // CHECK:               attributes #[[ATTR_NUM_SKIP]] = {{.*}} "zero-call-used-regs"="skip"
> ```
> for all `-fzero-call-used-regs=` tests.
> 
> Really, the behavior is unique only for the unattributed `@no_attribute`; this test is much larger than it needs to be.
The tests don't have exactly the same output. The attributes differ between them.


================
Comment at: llvm/lib/CodeGen/PrologEpilogInserter.cpp:1203-1205
+  const bool OnlyGPR = static_cast<unsigned>(ZeroRegsKind) & ONLY_GPR;
+  const bool OnlyUsed = static_cast<unsigned>(ZeroRegsKind) & ONLY_USED;
+  const bool OnlyArg = static_cast<unsigned>(ZeroRegsKind) & ONLY_ARG;
----------------
nickdesaulniers wrote:
> Do we need to check ENABLED? I'm curious if `ZeroCallUsedRegsKind::All` will do anything here when set? Looks like no; I think it should be doing something?
I modified it to set All to `0` instead of `ENABLED`.


================
Comment at: llvm/lib/Target/X86/X86FrameLowering.cpp:523
+  const MachineFunction &MF = *MBB.getParent();
+  const X86Subtarget &ST = MF.getSubtarget<X86Subtarget>();
+
----------------
nickdesaulniers wrote:
> sink the declaration of `ST` closer to its use (L555?)
I guess.


================
Comment at: llvm/lib/Target/X86/X86FrameLowering.cpp:595
+        continue;
+      XorOp = X86::PXORrr;
+    } else if (X86::VR256RegClass.contains(Reg)) {
----------------
nickdesaulniers wrote:
> Is there any way to figure the `XorOp` outside of this loop? Seems unfortunate to repeatedly scan almost every register class for every used register.
> 
> Like instead of querying each register set whether a given register is in it, is it possible to ask a register what register class it's in? Or can a register belong to more than one register class?
There's a function in TRI that you can call to grab the RegClass of a register, but it calls a "`contains`" on each register class to see if it belongs in it, so it would be worse than this code.

In practice, the register classes won't have many members in it. It sucks, but it's probably something like `16*16` in a worst case scenario.

(I think registers can belong to multiple register classes (e.g. sub- and super-classes), but I don't quote me on that.)


================
Comment at: llvm/lib/Target/X86/X86RegisterInfo.cpp:637
+
+  if (CC == CallingConv::X86_64_SysV && IsSubReg(X86::RAX, Reg))
+    return true;
----------------
nickdesaulniers wrote:
> I thought the 64b x86 cc used `rdi, rsi, rdx, rcx, r8, r9` as registers for arguments?
That's captured in the if-then below this one.


================
Comment at: llvm/lib/Target/X86/X86RegisterInfo.cpp:650-651
+
+  if (ST.hasSSE1() && X86::VR128RegClass.contains(Reg))
+    return true;
+
----------------
nickdesaulniers wrote:
> so ANY of the `X86::VR128RegClass` are argument registers?
[XYZ]MM0-7 are apparently. I'll make it explicit.


================
Comment at: llvm/test/CodeGen/X86/zero-call-used-regs.ll:186-200
+; I386-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; I386-NEXT:    fldz
+; I386-NEXT:    fldz
+; I386-NEXT:    fldz
+; I386-NEXT:    fldz
+; I386-NEXT:    fldz
+; I386-NEXT:    fldz
----------------
nickdesaulniers wrote:
> is this how we clear the x87 stack?
Yes.


================
Comment at: llvm/test/CodeGen/X86/zero-call-used-regs.ll:287
+entry:
+  store volatile i32 2, i32* @result, align 4
+  ret i32 0
----------------
nickdesaulniers wrote:
> necessary?
It's main. Of course it's necessary. :-) (And I want to check main explicitly.)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110869/new/

https://reviews.llvm.org/D110869



More information about the llvm-commits mailing list