[PATCH] D143463: [X86] Use the CFA when appropriate for better variable locations around calls.

Wed Apr 12 13:00:13 PDT 2023

scott.linder added subscribers: probinson, aprantl, dblaikie.
scott.linder added a comment.

In D143463#4259947 <https://reviews.llvm.org/D143463#4259947>, @khuey wrote:

> In D143463#4259475 <https://reviews.llvm.org/D143463#4259475>, @scott.linder wrote:
>
>> As a bit of a side-note, I think X86 always maintains a "precise CFA" to refer to, but that may not always be true (see https://reviews.llvm.org/D14948).
>
> This is relying on the CFA being correct when it's present. CFI is not always present (see X86FrameLowering::needsDwarfCFI which at a minimum excludes some Win64 stuff) but it might be present in all cases that matter. I'm not sure.
>
>> I'm curious about a couple points:
>>
>> - If we do always have the CFA available, then is there any reason not to use it unconditionally?
>
> In decreasing order of persuasiveness to me
>
> 1. If the frame pointer is present (or the stack pointer remains unchanged) using it is simpler for the debug info consumer (as there's no need to consult the CFI and run through its state machine). In theory there might also be old tools out there that don't support CFI as well.

Very reasonable, but if GCC is already using the CFA unconditionally it seems likely that implies a pretty good support base?

> 2. If the frame pointer is present (or the stack pointer remains unchanged) using it works today. Don't fix what isn't broke.

I definitely think this holds a lot of weight, but just having one concept (CFA) instead of several (stack-pointer, frame-pointer, frame-base, CFA) is also very attractive. The fact that the patch is even necessary also demonstrates something is broken, although it might be the exception that proves the rule.

> 3. If there are any bugs out there that result in an incorrect CFA the machine registers are more likely to be correct as they're actually used by the program and thus issues are more likely to be noticed.

If the CFA is incorrect, unwinding seems like it is doomed to fail, which should constitute a program execution bug if e.g. exceptions are enabled, right? I don't necessarily think we will end up with more elusive debug-info bugs as a result of relying on the CFA.

> Although it seems the gcc maintainers were not persuaded by any of this. gcc emits DW_OP_call_frame_cfa in every circumstance I can think of (at least on x86-64).
>
>> - If we cannot use it unconditionally, can the condition be more direct, i.e. can we remove the heuristic?
>
> I'm not aware of any reason we cannot use the CFA unconditionally (at least on x86) if it is present. I believe it is possible to remove the setHasCFIAdjustCfa heuristic and use the CFA in place of the stack pointer unconditionally if the above arguments are not persuasive. I believe it's also possible to use the CFA in place of the frame pointer even when it's present, if desired.

I meant to propose another middle-ground, where we use the CFA whenever `!hasFP`, which (IIUC) is a precondition for x86 ever adjusting the CFA. It would mean we use the CFA more than with the heuristic, but still not unconditionally.

I think I am mostly concerned with the statefulness of the heuristic, coupled with the fact that it seems to not be completely contained. The majority of it is handled by `X86FrameLowering`, but there is at least the one instance where it has to be explicitly managed in `X86MCInstLower`. If others are not as concerned with this I'm happy to concede on it, but it is the main concern I have.

>> - Is the "frame base" value ABI-stable?
>
> In the debug info? I don't see why it would be. As mentioned above gcc emits DW_AT_frame_base = DW_OP_call_frame_cfa.
>
>> If not, is there a strong reason not to rework the variable expressions to instead be CFA-relative, and then just define the "frame base" to always be equal to the CFA? It would avoid the extra offset needed in the current version of the patch.
>
> The reason I didn't do this originally was that I expected it to require updating a massive number of tests, but it turns out `git grep DW_OP_fbreg | wc -l` has fewer than 100 hits in LLVM so it probably wouldn't be that bad to update the tests. Looking into it a bit more, it would require changes in DwarfExpression::addMachineRegExpression which is 1) already pretty complex and 2) shared across architectures. It's not entirely clear to me what's going on with CFI on other arches, CFIInstrInserter is definitely x86 specific but other arches seem to have at least some CFI stuff in the target specific frame lowering code. Untangling all this seems like a significantly bigger task than merely switching to using the CFA + offset.

I definitely understand the complexity in `DwarfExpression::addMachineRegExpression` makes any change very difficult. I am happy with continuing to apply the offset to the CFA in the frame base, and if some other brave soul wants to change that in a future patch they won't be any worse off.

My current leaning is to change to using the CFA unconditionally, but offset to maintain the same frame base value. It will result in a slightly larger expression in the default case, but I would be much more confident that the result is always correct.

I would appreciate any other opinions from @aprantl @dblaikie @probinson et al.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143463/new/

https://reviews.llvm.org/D143463