[PATCH] D95518: [Debug-Info][XCOFF] support dwarf for XCOFF for assembly output

Thu Mar 4 18:39:34 PST 2021

jasonliu added inline comments.

================
Comment at: llvm/lib/MC/MCAsmStreamer.cpp:2323
+
+  // FIXME: use section end symbol as end of the Section. We need to consider
+  // the explicit sections and -ffunction-sections when we try to generate or
----------------
shchenz wrote:
> jasonliu wrote:
> > shchenz wrote:
> > > jasonliu wrote:
> > > > I hope this FIXME is not going to stay here for long (and that we will quickly have a patch to address it). This is really a workaround, and it creates different behavior between object file generation and assembly generation, which is undesirable.
> > > I left this as a FIXME because I cannot find an easy way to fix it. We need this function because `changeSection` behaves differently in `MCAsmStreamer` and `MCObjectStreamer`: `MCObjectStreamer` moves the insert point to the end of the section, while `MCAsmStreamer` emits a new section start in the assembly output. The current `MCAsmStreamer` behavior is not what we need for debug lines, because we cannot add another `.text` section when we generate assembly for `.dwline`.
> > > So if we want `MCAsmStreamer` and `MCObjectStreamer` to behave the same way, we must know which function ends a section, so that we can add a section end symbol in `emitFunctionBodyEnd`. But this seems hard because of explicit sections. When we handle a function in the `AsmPrinter` pass, it is hard to know whether other functions (which will be handled later) also belong to the same section, unless a CU-level analysis is performed to fetch this information.
> > I guess one way to do this for assembly would be:
> > 1. collect all the seen text csects somehow into an array.
> > 2. After all the text csects are emitted (doFinalization or emitEndOfAsmFile?), traverse (and switch to) each text csect in the array and emit the section end label for it. 
> > I think in this way, you could make sure that the section end labels are indeed emitted to the section end.
> Hmm, doing this in `doFinalization` for the explicit sections would be too late, just as it is when we do this during DWARF line emission. We cannot switch back to the previous section at the end of all text sections.
> 
> Collecting this info in `doInitialization` is one possible way. Before we run `AsmPrinter::runOnMachineFunction()` for any function, we go through all functions in the module and find out which function is the last one in each section.
> 
> But I am not sure that the function order (when we iterate over them in the module) is the same as the order in which they appear in the source file. We need more investigation here if we want to fix this later.
I'm not sure I understand what you mean by saying that we cannot switch back to the previous section at the end of all text sections.
Assuming you have csects like this:
```
.csect main[PR]
xxx
.csect text[PR]
xxx
.csect main[PR]
xxx
.csect text[PR]
xxx
```
At finalization, we know that no more text csects will be emitted. We could then switch back to each of the csects above and emit its end label:
```
.csect main[PR]
.L..sec_end0:
.csect text[PR]
.L..sec_end1:
```
You could then refer back to these labels from the DWARF sections. Each label should give you the end address of its particular csect.
Wouldn't this work?
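
To make the bookkeeping concrete, here is a minimal sketch of the two steps, written as a plain Python model and not as LLVM code. The function name `emit_section_end_labels` and the list-of-strings stand-in for the streamer output are hypothetical; the `.L..sec_endN` naming simply follows the example above.

```python
def emit_section_end_labels(csect_switches):
    """Model of the proposal: given every text csect switch seen while
    emitting functions, return the assembly lines that would be appended
    at finalization.

    Hypothetical helper for illustration only; it does not use any LLVM API.
    """
    # Step 1: collect each text csect once, in first-seen order.
    seen = []
    for name in csect_switches:
        if name not in seen:
            seen.append(name)

    # Step 2: after all csects are emitted, switch back to each one and
    # place a label there. Nothing else is emitted into that csect
    # afterwards, so the label marks its end.
    lines = []
    for index, name in enumerate(seen):
        lines.append(f"\t.csect {name}")
        lines.append(f".L..sec_end{index}:")
    return lines


if __name__ == "__main__":
    switches = ["main[PR]", "text[PR]", "main[PR]", "text[PR]"]
    print("\n".join(emit_section_end_labels(switches)))
```

The sketch assumes that switching back to an already-seen csect in assembly output appends to that csect, which is what the example above relies on.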

================
Comment at: llvm/test/DebugInfo/XCOFF/empty.ll:9
+target datalayout = "E-m:a-p:32:32-i64:64-n32"
+
+; Function Attrs: noinline nounwind optnone
----------------
shchenz wrote:
> jasonliu wrote:
> > So I tried a simple program with this patch:
> > ```
> > (gdb) list
> > 1       int bar2() {return 1;}
> > 2       int  main() {
> > 3         int c = 10;
> > 4         return bar2()+c;
> > 5       }
> > (gdb) b 3
> > Breakpoint 1 at 0x100004d8: file /gsa/tlbgsa/home/j/a/jasonli/defect/XCOFF/main.c, line 3.
> > (gdb) r
> > Starting program: /gsa/tlbgsa-h1/06/jasonli/defect/XCOFF/main 
> > 
> > Breakpoint 1, main () at /gsa/tlbgsa/home/j/a/jasonli/defect/XCOFF/main.c:3
> > 3         int c = 10;
> > (gdb) n
> > 4         return bar2()+c;
> > (gdb) s
> > bar2 () at /gsa/tlbgsa/home/j/a/jasonli/defect/XCOFF/main.c:1
> > 1       int bar2() {return 1;}
> > (gdb) n
> > 0x100002fc in __start ()
> > ```
> > It seems that when we issue the last "next", it skips the `main` function and goes directly into `__start()`. But gcc on AIX seems to be able to advance to `main` before going to `__start()`.
> Yeah, this is a difference between GCC and clang. We also get the same behavior for clang on Linux for this small case.
> 
> I think clang's behavior may be better than GCC's. In the debugger, when we break on a function name, we always skip the function's prologue and stop after it. So when we return to a function at an address in its epilogue, shouldn't we just execute the epilogue and not stop there? If so, clang does the right thing: we don't stop at the function's epilogue, and we return to the function's caller.
I see. That's fine if this is just the behavior of clang. It's a corner case anyway. 

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D95518/new/

https://reviews.llvm.org/D95518