[PATCH] D89049: [AIX][XCOFF] print out the traceback info

Thu Sep 2 01:37:46 PDT 2021

jhenderson added inline comments.

================
Comment at: llvm/lib/Object/XCOFFObjectFile.cpp:1203
+    if (ExtensionTable.getValue() & ExtendedTBTableFlag::TB_EH_INFO) {
+      // There is 4 byte alignment before eh_info displacement.
+      uint64_t Pos = Cur.tell();
----------------

================
Comment at: llvm/lib/Object/XCOFFObjectFile.cpp:1204-1206
+      uint64_t Pos = Cur.tell();
+      if (Pos % 4 != 0)
+        DE.skip(Cur, 4 - (Pos % 4));
----------------
I think you should be able to use the alignTo function to do this alignment calculation. However, it looks like the `Cursor` and `DataExtractor` classes are missing any form of `seek` function to set the position directly, without needing to calculate from the current location. I think it would be useful to add this function to one or the other. You could then do something like one of the following:
```
Cur.seek(alignTo(Cur.tell(), 4));
DE.seek(Cur, alignTo(Cur.tell(), 4));
```
which I'm sure you'll agree is much clearer and less likely to have any issues.

================
Comment at: llvm/test/CodeGen/PowerPC/aix-emit-tracebacktable-clobber-register.ll:9

+; RUN: llc -verify-machineinstrs -mtriple powerpc-ibm-aix-xcoff -mcpu=pwr4 \
+; RUN:     -mattr=+altivec -vec-extabi -xcoff-traceback-table=true -filetype=obj -o %t.o  < %s
----------------
Is there any particular reason you can't fold this in with the other cases?

================
Comment at: llvm/test/CodeGen/PowerPC/aix-emit-tracebacktable-clobber-register.ll:11
+; RUN:     -mattr=+altivec -vec-extabi -xcoff-traceback-table=true -filetype=obj -o %t.o  < %s
+; RUN: llvm-objdump -D --traceback-table  --symbol-description  %t.o | \
+; RUN:   FileCheck --check-prefix=OBJ_DIS %s
----------------

================
Comment at: llvm/test/CodeGen/PowerPC/aix-emit-tracebacktable-clobber-register.ll:12
+; RUN: llvm-objdump -D --traceback-table  --symbol-description  %t.o | \
+; RUN:   FileCheck --check-prefix=OBJ_DIS %s
+
----------------
Use `-` not `_` in your prefix names, as per the prevailing style in this file.

================
Comment at: llvm/test/CodeGen/PowerPC/aix-emit-tracebacktable-clobber-register.ll:106
+; OBJ_DIS:           9c: 00 00 00 00    # Traceback table begin
+; OBJ_DIS_NEXT:      a0: 00             # Version = 0
+; OBJ_DIS_NEXT:      a1: 09             # Language = CPlusPlus
----------------
This won't work. `NEXT` suffixes must be prefixed by a `-` not `_`.

================
Comment at: llvm/test/CodeGen/PowerPC/aix-emit-tracebacktable-clobber-register.ll:148
+; OBJ_DIS_NEXT:     13c: 00 00 00 08    # Except info displacement
+
----------------
Delete blank lines at file end.

================
Comment at: llvm/test/tools/llvm-objdump/XCOFF/disassemble-invalid-traceback-table.test:1
+# REQUIRES: powerpc-registered-target
+
----------------
1) This test needs additional comments explaining what the test is trying to test. How is the input invalid?
2) You should invest time in adding functionality for writing traceback tables to yaml2obj, so you can use it to generate the invalid output, rather than checking in an opaque and impossible to maintain binary (most people won't have access to the xlc compiler, for example, should there be a problem with the test).

================
Comment at: llvm/test/tools/llvm-objdump/XCOFF/disassemble-invalid-traceback-table.test:14
+
+#WARN:     {{.*}}llvm-objdump: warning: '{{.*}}xcoff-invalid-traceback-table.o': traceback table parsing failure
+#WARN-NEXT: >>> unexpected end of data at offset 0x100 while reading [0x9a, 0x10d)
----------------
Here and below.

================
Comment at: llvm/test/tools/llvm-objdump/XCOFF/disassemble-invalid-traceback-table.test:24
+#WARN-NEXT:         ...
+
----------------
Get rid of additional blank lines.

================
Comment at: llvm/test/tools/llvm-objdump/XCOFF/disassemble-traceback-table.test:7
+## xcoff-traceback-table.o compiled with IBM XL C/C++ for AIX, V16.1.0
+## compiler command: xlc -o xcoff-traceback-table.o -c foo.c
+## foo.c:
----------------
DiggerLin wrote:
> Xiangling_L wrote:
> > Should we use xlclang here?
> yes, of course, we can use xlclang here, but  I think using xlc is ok too.
See my above comments - add traceback table generation to yaml2obj rather than adding canned binaries that people are unable to update.

================
Comment at: llvm/tools/llvm-objdump/ObjdumpOpts.td:49
+  HelpText<"Decode traceback table for disassembly. This "
+           "option is for XCOFF files only.">;
+
----------------
Aside from the also incorrect --symbol-description help text, the help text for llvm-objdump options does not include a trailing full stop. Please remove.

================
Comment at: llvm/tools/llvm-objdump/XCOFFDump.cpp:22
+
+#include <sstream>

----------------
Don't use std::stringstream. Use the LLVM `raw_string_ostream` class instead.

================
Comment at: llvm/tools/llvm-objdump/XCOFFDump.cpp:168
+        Twine("traceback table parsing failure\n>>> ") +
+            addOffsetToValueInString(toString(TTOrErr.takeError()), Address) +
+            "\n>>> Raw traceback table data is:",
----------------
So I really don't like the need to parse the error message for numbers and updating them. Aside from anything else, it's possible that a future message will include other numbers in it, unrelated to the address, so this function would incorrectly change them.

I assume you want to make it clear that the error message offsets are relative to the start of the traceback table. I think the following would be a way of clarifying:
"failure parsing traceback table with address 0x1234\n>>> " (etc)
And then leave the remaining numbers untouched. This helps because I think in most cases, it will be fairly obvious that the offsets in the error messages are relative to the address, because they won't make sense otherwise (e.g. they'll be smaller than the table's address). The raw data will also have the leading address values, which again will not usually line up with the offset, although you could also consider having the leading address actually be the offset value within the traceback table instead, whichever is simpler.

Finally, I don't think we need to worry too much about any risk of confusion - hopefully this warning will never be seen by users, and when it is, it should be straightforward enough to figure things out from context. Therefore, I think we can afford to keep things simple.

================
Comment at: llvm/tools/llvm-objdump/XCOFFDump.cpp:174
+
+    // The value of Size has been changed in function XCOFFTracebackTable::create()
+    Size = End - Address;
----------------
Remember to put a full stop at the end of sentences.

================
Comment at: llvm/tools/llvm-objdump/XCOFFDump.cpp:201
+
+  auto printOutBytes = [&](uint64_t n) {
+    printRawData(Bytes.slice(Index, n), Address + Index, OS, STI);
----------------
Shorter and more grammatically correct name. You "print" something usually, not "print out" something.

================
Comment at: llvm/tools/llvm-objdump/XCOFFDump.cpp:287
+
+  // Print out function name length and function name if there is.
+  if (TbTable.isFuncNamePresent()) {
----------------

================
Comment at: llvm/tools/llvm-objdump/XCOFFDump.cpp:292
+    assert(FunctionNameLen > 0 &&
+           "The length of the function name must be large than zero.");
+
----------------

================
Comment at: llvm/tools/llvm-objdump/XCOFFDump.cpp:340
+
+    // There is two bytes of padding after vector info.
+    OS << "\n";
----------------

================
Comment at: llvm/tools/llvm-objdump/XCOFFDump.cpp:355
+  if (TbTable.getEhInfoDisp()) {
+    // There is 4 bytes alignment before eh info displacement.
+    if (Index % 4) {
----------------

================
Comment at: llvm/tools/llvm-objdump/XCOFFDump.cpp:364
+    OS << "\t# Except info displacement";
+    // Print out another 4 bytes for 64 bit.
+    if (Is64Bit) {
----------------
Why do we need another 4 bytes printed here? Explain why in comments, not what (it is clear from the code what is happening).

================
Comment at: llvm/tools/llvm-objdump/XCOFFDump.cpp:114
+
+std::string AddOffsetToValueInString(const std::string &StrMsg,
+                                     unsigned Offset) {
----------------
DiggerLin wrote:
> jhenderson wrote:
> > Can this function be static too?
> I still use the const std::string &StrMsg as parameter here, for   int Num = std::stoi(NumString, 0, 16) and Res += std::string(StrMsg, PrePos, Pos - PrePos);  we still need to convert the StringRef to std:string.
There are plenty of LLVM functions for string concatenation and manipulation using Twine and StringRef. Please use those, not std::string. Convert only to a std::string on return.

If you haven't already, please read the LLVM Programmer's Manual, especially the section here: https://llvm.org/docs/ProgrammersManual.html#passing-strings-the-stringref-and-twine-classes

That all being said, I don't think you should do any of this - see my comment below.

================
Comment at: llvm/tools/llvm-objdump/XCOFFDump.cpp:140-146
+#define SPLIT                                                                  \
+  OS << "\n";                                                                  \
+  OS.indent(TabStop)
+
+#define PRINTOUTBYTES(N)                                                       \
+  formatTracebackTableOutput(Bytes.slice(Index, N), Address + Index, OS);      \
+  Index += N;
----------------
DiggerLin wrote:
> jasonliu wrote:
> > If you don't need the convenient of a macro, could you write them in lambda's instead?
> > For example, the SPLIT and PRINTOUTBYTES.
> I change the PRINTOUTBYTES to lambda, but I can not see a benefit of change SPLIT from Macro to lambda
Macros are a pain to debug typically, if nothing else, and could cause problems later in the file (unlikely, but possible), unless you undef them.

================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.h:155
+using namespace llvm;
+unsigned getInstStartColumn(const MCSubtargetInfo &STI);
+void printRawData(llvm::ArrayRef<uint8_t> Bytes, uint64_t Address,
----------------
DiggerLin wrote:
> DiggerLin wrote:
> > Xiangling_L wrote:
> > > May I ask why we don't put these two functions in objdump namespace as well?
> > 
> > the call stack as 
> > PrettyPrinter::printInst() ->printRawData() ->getInstStartColumn()
> > 
> > the class PrettyPrinter is defined in anonymous  namespace  and  these two new function only used by the class PrettyPrinter. So I define two function as the same namespace as  PrettyPrinter .
> sorry . I change the above comment 
> these two new function only used by the class PrettyPrinter
> to 
> these two new function is used by the class PrettyPrinter
These functions are in a public header. Anybody who includes "llvm-objdump.h" whill get these two functions without any form of namespace scoping which is very undesirable.

Either 1) if the functions are only used in one file, move their definition into that file, or 2) if the functions are used in multiple places, put them in the `llvm` namespace at the bare minimum (and since you are in objdump, they should be in the `llvm::objdump` namespace.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89049/new/

https://reviews.llvm.org/D89049