[PATCH] D126177: [BOLT] Fix disassembly of data in text

Rafael Auler via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 24 18:13:48 PDT 2022


rafaelauler added a comment.

I'm logged in another account because phabricator won't let me submit the review I wrote in the original reviewer account. I'm going to paste my full review as a subscriber instead of reviewer to try to bypass the "you can't edit this diff" permission thing. Here it goes:

Hi @treapster, thanks for working on this and for submitting this patch!

It looks like the per-merge check is failing

Failed Tests (1):

  BOLT :: AArch64/unmarked-data.s

https://buildkite.com/llvm-project/premerge-checks/builds/94328#fa09bc94-6223-4f63-93d0-e5c7aae21e66

The check is also failing for me locally. See my comments below.

BinaryContext.h Line 85:

Add a comment

// AArch64-specific symbol markers used to delimit code/data in .text.

Lines 675-676:
We should avoid exporting these in the public interface of BinaryContext if they are not (currently) used outside of BinaryContext. We should leave them as internal helper functions inside BinaryContext.cpp

RewriteInstance.cpp  lines 893-895:

If both A and B are markers at the same address, then we cannot establish order according to this function. This comparison function returns:

A < B ? False
B < A ? False

Which breaks the requirement on std::sort to have the comparison function define a total order among elements.

Line 918:
I don't think we need to save memory in discoverFileObjects() (this is not a step in BOLT that achieves the peak of memory consumption), so feel free to drop attribute packed. This attribute is not used in the LLVM codebase as far as I can tell.

Line 1272:

In theory, you wouldn't need to call markAtDataAtOffset for consecutive objects. For example, if you have:

0x10 : code
0x14 : data
0x18: data
0x20: code

You would only need to call markDAtaAtOffset(0x14).

But after this patch, we will be calling markDataAtOffset(0x18) if we have any symbol at 0x18. So why was the change necessary? Am I right in my interpretation that this to cover the case when 0x18 would be located in the next BinaryFunction, for some reason? In that case, there is a side effect that AddressToConstantIslandMap will possibly be bloated with so many more entries. But I guess that's OK? I would like @yota9 approval here if possible.

In this case, I would rewrite the description of this patch as:
[BOLT][AArch64] Handle data markers spanning multiple functions
"Fix BOLT's constant island mapping when a data marker ($d) spans multiple functions. Currently, because BOLT only marks the constant island in the first function where $d is located, if the next function contains data at its start, BOLT will miss the data marker and try to disassemble it. This patch adds code to explicitly go through all symbols between $d and $x markers and mark their respective offsets as data, which stops BOLT from trying to disassemble data. It also adds MarkerType enum and refactors related functions.

unmarked data.s line 3:

I think I know why this test is failing in pre-merge. If you call the driver (clang) to link your code, it will break in X86 because we don't have an AArch64 toolchain in X86. You would need to move this test to runtime/.

But in an X86 machine, we do have an AArch64 assembler and linker, we just don't have AArch64 libs to link in your testcase. Because we're not running the executable, there's no need to call the driver (clang) to link in libs. So we can rewrite this by calling the assembler (llvm-mc) and the linker (ld.lld). As an example on how to do this, see https://github.com/llvm/llvm-project/blob/main/bolt/test/X86/gotpcrelx.s#L15


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126177/new/

https://reviews.llvm.org/D126177



More information about the llvm-commits mailing list