[PATCH] D85024: [llvm-objdump] Implement --prefix option
James Henderson via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 25 01:36:39 PDT 2020
jhenderson added inline comments.
================
Comment at: llvm/docs/CommandGuide/llvm-objdump.rst:172
+
+ When disassembling with the ``--source`` option, prepend ``prefix`` to absolute paths.
+
----------------
jhenderson wrote:
>
Ping this comment.
================
Comment at: llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths.test:15
+
+; RUN: llvm-objdump --prefix %p --source %t.o 2>&1 | FileCheck %s --check-prefix CHECK-MISS-PREFIX-FIX
+; CHECK-MISS-PREFIX-FIX: ; int foo() {
----------------
tinti wrote:
> tinti wrote:
> > jhenderson wrote:
> > > Hmmm... this is interesting. With llvm-objdump, this test case fails for Windows. However, if I replace llvm-objdump with GNU objdump and run the same invocation, it works. I dug into this a little. The problem is with the meaning of "absolute paths" for Windows - it's ambiguous. C++17's `filesystem::is_absolute`, whose behaviour is matched by `sys::path::is_absolute` essentially says that the path must be unambiguous (i.e. that there must be no drive letter for Windows). However, other languages, e.g. Python, treat a path as absolute with or without the drive letter. GNU objdump appears to follow the Python etc approach, so '/Inputs\foo' is treated as an absolute path and is prefixed.
> > >
> > > I think we need extra logic to get this to work on Windows, to ensure GNU compatibility. Basically, you need to identify the case where a path is absolute within the current drive (i.e. starts with '/', though it might be better, if possible to rely on the underlying functions to do that), and treat that as absolute. What you can't do is modify the behaviour of `sys::path::is_absolute` itself.
> > I did a small test.
> >
> > ```
> > bool is_absolute2(const llvm::Twine &path, llvm::sys::path::Style style) {
> > if (style == llvm::sys::path::Style::windows) {
> > llvm::SmallString<128> path_storage;
> > llvm::StringRef p = path.toStringRef(path_storage);
> >
> > if (p.front() == '/' || p.front() == '\\')
> > return true;
> >
> > if (p.size() >= 2 && (p[0] && p[1] == ':'))
> > return true;
> > }
> >
> > return llvm::sys::path::is_absolute(path, style);
> > }
> >
> > bool is_separator2(char value, llvm::sys::path::Style style) {
> > // not needed
> > return llvm::sys::path::is_separator(value, style);
> > }
> >
> > ...
> >
> > std::string test1("/input/foo.c");
> > std::string test2("c:/input/foo.c");
> > std::string test3("\\\\input\\foo.c");
> > std::string test4("c:/");
> > std::string test5("c:");
> > std::string test6("//");
> > std::string test7("\\");
> > std::string test8(":");
> >
> > std::vector<std::string> tests;
> > tests.push_back(test1);
> > tests.push_back(test2);
> > tests.push_back(test3);
> > tests.push_back(test4);
> > tests.push_back(test5);
> > tests.push_back(test6);
> > tests.push_back(test7);
> > tests.push_back(test8);
> >
> > for (auto it = tests.begin(); it < tests.end(); ++it) {
> > llvm::outs() << "is_absolute?(\"" << *it << "\")" << "\n";
> > llvm::outs() << " gnu::windows: " << IS_DOS_ABSOLUTE_PATH(it->c_str()) << "\n";
> > llvm::outs() << " llvm::windows: " << llvm::sys::path::is_absolute(*it, llvm::sys::path::Style::windows) << "\n";
> > llvm::outs() << " llvm2::windows: " << is_absolute2(*it, llvm::sys::path::Style::windows) << "\n";
> >
> > llvm::outs() << " gnu::posix: " << IS_UNIX_ABSOLUTE_PATH(it->c_str()) << "\n";
> > llvm::outs() << " llvm::posix: " << llvm::sys::path::is_absolute(*it, llvm::sys::path::Style::posix) << "\n";
> > llvm::outs() << " llvm2::posix: " << is_absolute2(*it, llvm::sys::path::Style::posix) << "\n";
> >
> > llvm::outs() << "is_separator?(\"" << *it << "\")" << "\n";
> > llvm::outs() << " gnu::windows: " << IS_DOS_DIR_SEPARATOR(it->back()) << "\n";
> > llvm::outs() << " llvm::windows: " << llvm::sys::path::is_separator(it->back(), llvm::sys::path::Style::windows) << "\n";
> > llvm::outs() << " llvm2::windows: " << is_separator2(it->back(), llvm::sys::path::Style::windows) << "\n";
> >
> > llvm::outs() << " gnu::posix: " << IS_UNIX_DIR_SEPARATOR(it->back()) << "\n";
> > llvm::outs() << " llvm::posix: " << llvm::sys::path::is_separator(it->back(), llvm::sys::path::Style::posix) << "\n";
> > llvm::outs() << " llvm2::posix: " << is_separator2(it->back(), llvm::sys::path::Style::posix) << "\n";
> > llvm::outs() << "\n";
> > }
> > ```
> >
> > The output is:
> >
> > ```
> > is_absolute?("/input/foo.c")
> > gnu::windows: 1
> > llvm::windows: 0
> > llvm2::windows: 1
> > gnu::posix: 1
> > llvm::posix: 1
> > llvm2::posix: 1
> > is_separator?("/input/foo.c")
> > gnu::windows: 0
> > llvm::windows: 0
> > llvm2::windows: 0
> > gnu::posix: 0
> > llvm::posix: 0
> > llvm2::posix: 0
> >
> > is_absolute?("c:/input/foo.c")
> > gnu::windows: 1
> > llvm::windows: 1
> > llvm2::windows: 1
> > gnu::posix: 0
> > llvm::posix: 0
> > llvm2::posix: 0
> > is_separator?("c:/input/foo.c")
> > gnu::windows: 0
> > llvm::windows: 0
> > llvm2::windows: 0
> > gnu::posix: 0
> > llvm::posix: 0
> > llvm2::posix: 0
> >
> > is_absolute?("\\input\foo.c")
> > gnu::windows: 1
> > llvm::windows: 1
> > llvm2::windows: 1
> > gnu::posix: 0
> > llvm::posix: 0
> > llvm2::posix: 0
> > is_separator?("\\input\foo.c")
> > gnu::windows: 0
> > llvm::windows: 0
> > llvm2::windows: 0
> > gnu::posix: 0
> > llvm::posix: 0
> > llvm2::posix: 0
> >
> > is_absolute?("c:/")
> > gnu::windows: 1
> > llvm::windows: 1
> > llvm2::windows: 1
> > gnu::posix: 0
> > llvm::posix: 0
> > llvm2::posix: 0
> > is_separator?("c:/")
> > gnu::windows: 1
> > llvm::windows: 1
> > llvm2::windows: 1
> > gnu::posix: 1
> > llvm::posix: 1
> > llvm2::posix: 1
> >
> > is_absolute?("c:")
> > gnu::windows: 1
> > llvm::windows: 0
> > llvm2::windows: 1
> > gnu::posix: 0
> > llvm::posix: 0
> > llvm2::posix: 0
> > is_separator?("c:")
> > gnu::windows: 0
> > llvm::windows: 0
> > llvm2::windows: 0
> > gnu::posix: 0
> > llvm::posix: 0
> > llvm2::posix: 0
> >
> > is_absolute?("//")
> > gnu::windows: 1
> > llvm::windows: 0
> > llvm2::windows: 1
> > gnu::posix: 1
> > llvm::posix: 1
> > llvm2::posix: 1
> > is_separator?("//")
> > gnu::windows: 1
> > llvm::windows: 1
> > llvm2::windows: 1
> > gnu::posix: 1
> > llvm::posix: 1
> > llvm2::posix: 1
> >
> > is_absolute?("\")
> > gnu::windows: 1
> > llvm::windows: 0
> > llvm2::windows: 1
> > gnu::posix: 0
> > llvm::posix: 0
> > llvm2::posix: 0
> > is_separator?("\")
> > gnu::windows: 1
> > llvm::windows: 1
> > llvm2::windows: 1
> > gnu::posix: 0
> > llvm::posix: 0
> > llvm2::posix: 0
> >
> > is_absolute?(":")
> > gnu::windows: 0
> > llvm::windows: 0
> > llvm2::windows: 0
> > gnu::posix: 0
> > llvm::posix: 0
> > llvm2::posix: 0
> > is_separator?(":")
> > gnu::windows: 0
> > llvm::windows: 0
> > llvm2::windows: 0
> > gnu::posix: 0
> > llvm::posix: 0
> > llvm2::posix: 0
> >
> >
> > ```
> >
> > GNU and LLVM don't agree with `/input/foo.c`, `c:`, `/` and `\`. The `is_absolute2` function agrees with GNU. I believe it is enough for this patch.
> >
> > Working on it!
> I believe this time it will work. I have added a helper function `bool objdumpIsAbsolute(const Twine &path, sys::path::Style style = sys::path::Style::native);` at `llvm-objdump.cpp`. At first I have added it on `llvm-objdump.h` but I changed now. Should I keep it this way?
>
> About Windows, would you mind to test again?
In general terms, put the function near to where it is actually used. Where you've got it is fine.
================
Comment at: llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths.test:7
+
+;; Generate %t2 with %p/Inputs/source-interleave-x86_64.c
+; RUN: sed -e "s,SRC_COMPDIR,%/p/Inputs,g" %p/Inputs/source-interleave.ll > %t2.ll
----------------
I'm not sure this or the above similar comment line are useful - this test doesn't have any control over how source-interleave.ll is generated, so the comment could become misleading. It's clear from hthe sed line that you are using the source-interleave.ll input too. No need to restate it.
================
Comment at: llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths.test:29
+; RUN: llvm-objdump --prefix myprefix/// --source %t.o 2>&1 | FileCheck %s --check-prefix CHECK-TRAILING -DFILE=%t.o
+; CHECK-TRAILING: warning: '[[FILE]]': failed to find source myprefix/Inputs/source-interleave-x86_64.c
----------------
You don't need `CHECK-TRAILING`. You can use `CHECK-BROKEN-PREFIX` with `-D ROOT=''`.
================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:424
+/// GNU's objdump is_absolute() equivalent.
+///
----------------
We tend to refer to the GNU tools as "GNU objdump", "GNU objcopy" etc (without the "'s")
================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:427-428
+/// LLVM's is_absolute() matches C++17 behavior that essentially says that the
+/// path must be unambiguous (i.e. that there must be no drive letter for
+/// Windows). However GNU's objdump treat a path as absolute with or without
+/// the drive letter. Therefore in Windows targets '/', '\' and 'c:' are
----------------
================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:429-430
+/// Windows). However GNU's objdump treat a path as absolute with or without
+/// the drive letter. Therefore in Windows targets '/', '\' and 'c:' are
+/// treated as absolute paths by GNU's objdump.
+static bool
----------------
I'm not sure this last sentence makes sense to me. Maybe you can just delete it.
================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:432
+static bool
+objdumpIsAbsolute(const Twine &path,
+ sys::path::Style style = sys::path::Style::native) {
----------------
Please change all your variable names in this function to use LLVM naming styles. See the Coding Standards for the details.
================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:433
+objdumpIsAbsolute(const Twine &path,
+ sys::path::Style style = sys::path::Style::native) {
+ if (style == sys::path::Style::windows) {
----------------
You never specify the `style` parameter so unsurprisingly, the check for `style == sys::path::Style::windows` doesn't work, and the test still fails on Windows. You can see this by looking at the test failures listed by the pre-merge checks near the top of this page. You might want a call to `real_style`.
================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:438
+
+ if (p.front() == '/' || p.front() == '\\')
+ return true;
----------------
I think this will crash if `LineInfo.FileName` is empty. This might be caused by a malformed input, but we shouldn't crash still.
You also should consider crafting a test for this case.
Also, use `sys::path::is_separator` instead of directly querying the character.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D85024/new/
https://reviews.llvm.org/D85024
More information about the llvm-commits
mailing list