[PATCH] D85024: [llvm-objdump] Implement --prefix option

James Henderson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 25 01:36:39 PDT 2020


jhenderson added inline comments.


================
Comment at: llvm/docs/CommandGuide/llvm-objdump.rst:172
+
+  When disassembling with the ``--source`` option, prepend ``prefix`` to absolute paths.
+
----------------
jhenderson wrote:
> 
Ping this comment.


================
Comment at: llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths.test:15
+
+; RUN: llvm-objdump --prefix %p --source %t.o 2>&1 | FileCheck %s --check-prefix CHECK-MISS-PREFIX-FIX
+; CHECK-MISS-PREFIX-FIX: ; int foo() {
----------------
tinti wrote:
> tinti wrote:
> > jhenderson wrote:
> > > Hmmm... this is interesting. With llvm-objdump, this test case fails for Windows. However, if I replace llvm-objdump with GNU objdump and run the same invocation, it works. I dug into this a little. The problem is with the meaning of "absolute paths" for Windows - it's ambiguous. C++17's `filesystem::is_absolute`, whose behaviour is matched by `sys::path::is_absolute` essentially says that the path must be unambiguous (i.e. that there must be no drive letter for Windows). However, other languages, e.g. Python, treat a path as absolute with or without the drive letter. GNU objdump appears to follow the Python etc approach, so '/Inputs\foo' is treated as an absolute path and is prefixed.
> > > 
> > > I think we need extra logic to get this to work on Windows, to ensure GNU compatibility. Basically, you need to  identify the case where a path is absolute within the current drive (i.e. starts with '/', though it might be better, if possible to rely on the underlying functions to do that), and treat that as absolute. What you can't do is modify the behaviour of `sys::path::is_absolute` itself.
> >  I did a small test.
> > 
> > ```
> > bool is_absolute2(const llvm::Twine &path, llvm::sys::path::Style style) {
> >   if (style == llvm::sys::path::Style::windows) {
> >     llvm::SmallString<128> path_storage;
> >     llvm::StringRef p = path.toStringRef(path_storage);
> >     
> >     if (p.front() == '/' || p.front() == '\\')
> >       return true;
> > 
> >     if (p.size() >= 2 && (p[0] && p[1] == ':'))
> >       return true;
> >   }
> > 
> >   return llvm::sys::path::is_absolute(path, style); 
> > }
> > 
> > bool is_separator2(char value, llvm::sys::path::Style style) {
> >   // not needed
> >   return llvm::sys::path::is_separator(value, style);
> > }
> > 
> > ...
> > 
> >   std::string test1("/input/foo.c");
> >   std::string test2("c:/input/foo.c");
> >   std::string test3("\\\\input\\foo.c");
> >   std::string test4("c:/");
> >   std::string test5("c:");
> >   std::string test6("//");
> >   std::string test7("\\");
> >   std::string test8(":");
> > 
> >   std::vector<std::string> tests;
> >   tests.push_back(test1);
> >   tests.push_back(test2);
> >   tests.push_back(test3);
> >   tests.push_back(test4);
> >   tests.push_back(test5);
> >   tests.push_back(test6);
> >   tests.push_back(test7);
> >   tests.push_back(test8);
> > 
> >   for (auto it = tests.begin(); it < tests.end(); ++it) {
> >     llvm::outs() << "is_absolute?(\"" << *it << "\")" << "\n";
> >     llvm::outs() << "  gnu::windows:   " << IS_DOS_ABSOLUTE_PATH(it->c_str()) << "\n";
> >     llvm::outs() << "  llvm::windows:  " << llvm::sys::path::is_absolute(*it, llvm::sys::path::Style::windows) << "\n";
> >     llvm::outs() << "  llvm2::windows: " << is_absolute2(*it, llvm::sys::path::Style::windows) << "\n";
> > 
> >     llvm::outs() << "  gnu::posix:     " << IS_UNIX_ABSOLUTE_PATH(it->c_str()) << "\n";
> >     llvm::outs() << "  llvm::posix:    " << llvm::sys::path::is_absolute(*it, llvm::sys::path::Style::posix) << "\n";
> >     llvm::outs() << "  llvm2::posix:   " << is_absolute2(*it, llvm::sys::path::Style::posix) << "\n";
> >     
> >     llvm::outs() << "is_separator?(\"" << *it << "\")" << "\n";
> >     llvm::outs() << "  gnu::windows:   " << IS_DOS_DIR_SEPARATOR(it->back()) << "\n";
> >     llvm::outs() << "  llvm::windows:  " << llvm::sys::path::is_separator(it->back(), llvm::sys::path::Style::windows) << "\n";
> >     llvm::outs() << "  llvm2::windows: " << is_separator2(it->back(), llvm::sys::path::Style::windows) << "\n";
> > 
> >     llvm::outs() << "  gnu::posix:     " << IS_UNIX_DIR_SEPARATOR(it->back()) << "\n";
> >     llvm::outs() << "  llvm::posix:    " << llvm::sys::path::is_separator(it->back(), llvm::sys::path::Style::posix) << "\n";
> >     llvm::outs() << "  llvm2::posix:   " << is_separator2(it->back(), llvm::sys::path::Style::posix) << "\n";
> >     llvm::outs() << "\n";
> >   }
> > ```
> > 
> > The output is:
> > 
> > ```
> > is_absolute?("/input/foo.c")
> >   gnu::windows:   1
> >   llvm::windows:  0
> >   llvm2::windows: 1
> >   gnu::posix:     1
> >   llvm::posix:    1
> >   llvm2::posix:   1
> > is_separator?("/input/foo.c")
> >   gnu::windows:   0
> >   llvm::windows:  0
> >   llvm2::windows: 0
> >   gnu::posix:     0
> >   llvm::posix:    0
> >   llvm2::posix:   0
> > 
> > is_absolute?("c:/input/foo.c")
> >   gnu::windows:   1
> >   llvm::windows:  1
> >   llvm2::windows: 1
> >   gnu::posix:     0
> >   llvm::posix:    0
> >   llvm2::posix:   0
> > is_separator?("c:/input/foo.c")
> >   gnu::windows:   0
> >   llvm::windows:  0
> >   llvm2::windows: 0
> >   gnu::posix:     0
> >   llvm::posix:    0
> >   llvm2::posix:   0
> > 
> > is_absolute?("\\input\foo.c")
> >   gnu::windows:   1
> >   llvm::windows:  1
> >   llvm2::windows: 1
> >   gnu::posix:     0
> >   llvm::posix:    0
> >   llvm2::posix:   0
> > is_separator?("\\input\foo.c")
> >   gnu::windows:   0
> >   llvm::windows:  0
> >   llvm2::windows: 0
> >   gnu::posix:     0
> >   llvm::posix:    0
> >   llvm2::posix:   0
> > 
> > is_absolute?("c:/")
> >   gnu::windows:   1
> >   llvm::windows:  1
> >   llvm2::windows: 1
> >   gnu::posix:     0
> >   llvm::posix:    0
> >   llvm2::posix:   0
> > is_separator?("c:/")
> >   gnu::windows:   1
> >   llvm::windows:  1
> >   llvm2::windows: 1
> >   gnu::posix:     1
> >   llvm::posix:    1
> >   llvm2::posix:   1
> > 
> > is_absolute?("c:")
> >   gnu::windows:   1
> >   llvm::windows:  0
> >   llvm2::windows: 1
> >   gnu::posix:     0
> >   llvm::posix:    0
> >   llvm2::posix:   0
> > is_separator?("c:")
> >   gnu::windows:   0
> >   llvm::windows:  0
> >   llvm2::windows: 0
> >   gnu::posix:     0
> >   llvm::posix:    0
> >   llvm2::posix:   0
> > 
> > is_absolute?("//")
> >   gnu::windows:   1
> >   llvm::windows:  0
> >   llvm2::windows: 1
> >   gnu::posix:     1
> >   llvm::posix:    1
> >   llvm2::posix:   1
> > is_separator?("//")
> >   gnu::windows:   1
> >   llvm::windows:  1
> >   llvm2::windows: 1
> >   gnu::posix:     1
> >   llvm::posix:    1
> >   llvm2::posix:   1
> > 
> > is_absolute?("\")
> >   gnu::windows:   1
> >   llvm::windows:  0
> >   llvm2::windows: 1
> >   gnu::posix:     0
> >   llvm::posix:    0
> >   llvm2::posix:   0
> > is_separator?("\")
> >   gnu::windows:   1
> >   llvm::windows:  1
> >   llvm2::windows: 1
> >   gnu::posix:     0
> >   llvm::posix:    0
> >   llvm2::posix:   0
> > 
> > is_absolute?(":")
> >   gnu::windows:   0
> >   llvm::windows:  0
> >   llvm2::windows: 0
> >   gnu::posix:     0
> >   llvm::posix:    0
> >   llvm2::posix:   0
> > is_separator?(":")
> >   gnu::windows:   0
> >   llvm::windows:  0
> >   llvm2::windows: 0
> >   gnu::posix:     0
> >   llvm::posix:    0
> >   llvm2::posix:   0
> > 
> > 
> > ```
> > 
> > GNU and LLVM don't agree with `/input/foo.c`, `c:`, `/` and `\`. The `is_absolute2` function agrees with GNU. I believe it is enough for this patch.
> > 
> > Working on it!
> I believe this time it will work. I have added a helper function `bool objdumpIsAbsolute(const Twine &path, sys::path::Style style = sys::path::Style::native);` at `llvm-objdump.cpp`. At first I have added it on `llvm-objdump.h` but I changed now. Should I keep it this way?
> 
> About Windows, would you mind to test again?
In general terms, put the function near to where it is actually used. Where you've got it is fine.


================
Comment at: llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths.test:7
+
+;; Generate %t2 with %p/Inputs/source-interleave-x86_64.c
+; RUN: sed -e "s,SRC_COMPDIR,%/p/Inputs,g" %p/Inputs/source-interleave.ll > %t2.ll
----------------
I'm not sure this or the above similar comment line are useful - this test doesn't have any control over how source-interleave.ll is generated, so the comment could become misleading. It's clear from hthe sed line that you are using the source-interleave.ll input too. No need to restate it.


================
Comment at: llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths.test:29
+; RUN: llvm-objdump --prefix myprefix/// --source %t.o 2>&1 | FileCheck %s --check-prefix CHECK-TRAILING -DFILE=%t.o
+; CHECK-TRAILING: warning: '[[FILE]]': failed to find source myprefix/Inputs/source-interleave-x86_64.c
----------------
You don't need `CHECK-TRAILING`. You can use `CHECK-BROKEN-PREFIX` with `-D ROOT=''`.


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:424
 
+/// GNU's objdump is_absolute() equivalent.
+///
----------------
We tend to refer to the GNU tools as "GNU objdump", "GNU objcopy" etc (without the "'s")


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:427-428
+/// LLVM's is_absolute() matches C++17 behavior that essentially says that the
+/// path must be unambiguous (i.e. that there must be no drive letter for
+/// Windows). However GNU's objdump treat a path as absolute with or without
+/// the drive letter. Therefore in Windows targets '/', '\' and 'c:' are
----------------



================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:429-430
+/// Windows). However GNU's objdump treat a path as absolute with or without
+/// the drive letter. Therefore in Windows targets '/', '\' and 'c:' are
+/// treated as absolute paths by GNU's objdump.
+static bool
----------------
I'm not sure this last sentence makes sense to me. Maybe you can just delete it.


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:432
+static bool
+objdumpIsAbsolute(const Twine &path,
+                  sys::path::Style style = sys::path::Style::native) {
----------------
Please change all your variable names in this function to use LLVM naming styles. See the Coding Standards for the details.


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:433
+objdumpIsAbsolute(const Twine &path,
+                  sys::path::Style style = sys::path::Style::native) {
+  if (style == sys::path::Style::windows) {
----------------
You never specify the `style` parameter so unsurprisingly, the check for `style == sys::path::Style::windows` doesn't work, and the test still fails on Windows. You can see this by looking at the test failures listed by the pre-merge checks near the top of this page. You might want a call to `real_style`.


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:438
+
+    if (p.front() == '/' || p.front() == '\\')
+      return true;
----------------
I think this will crash if `LineInfo.FileName` is empty. This might be caused by a malformed input, but we shouldn't crash still.

You also should consider crafting a test for this case.

Also, use `sys::path::is_separator` instead of directly querying the character.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85024/new/

https://reviews.llvm.org/D85024



More information about the llvm-commits mailing list