[PATCH] D69411: [MC] Calculate difference of symbols in two fragments when possible

Wed Nov 6 11:43:10 PST 2019

jcai19 marked an inline comment as done.
jcai19 added a comment.

In D69411#1735172 <https://reviews.llvm.org/D69411#1735172>, @peter.smith wrote:

> Thanks for the update. I'll wait a few days to see what comments we get. If there are none then I guess there aren't any strong objections.
>
> To clarify the remarks about resolving .if at layout time. I think that there are two major obstacles.
>
> - MC assembles instructions once, if the condition in the .if is not satisfied the contents of the block aren't even parsed. For example, the following will assemble if the .if condition fails, but will fail to parse if the .if condition passes. We'd need to change MC to either reparse (effectively making it a multipass assembler), or to parse and remember all parts. This is doable but it isn't a small change.
>
>   ```
>   .text
>   .if 0
>   You aren't parsing me are you?
>   .else
>   nop
>   .endif
>   ```
>
> - The second problem, and not one unique to llvm-mc, is that it is possible to write a program that doesn't converge, something like (not testable as it needs relaxation and late evaluation of .if):
>
>   ```
>   label:
>     beq after // In thumb 2 this 2-byte branch will be relaxed to a 4-byte branch if out of range.
>     .if . - label == 2
>     .space 1024 * 1024 // sufficient to make after out of range
>     .endif
>   after:
>     nop
>   ```
>
>   In pass 1, beq is 2-bytes in size, the .if passes, beq is then relaxed to 4-bytes, which would make the .if fail, which then makes beq 2-bytes ... The interaction with relaxation could be fixed by making all size increases permanent. However I think it is possible to write multiple .if conditions that conflict. In Arm's old 2-pass assembler (pass-1 find all sizes so layout is known, pass-2 encode instructions knowing layout), we had an error message when the equivalent of .if evaluated differently in subsequent passes. In summary, I think it would be a major change to MC to support something like late evaluation of .if in a reliable way.

Thanks for all the clarification and examples. I was wondering whether there is an easier way to evaluate .if conditions late, so that we could avoid the current implementation, which I totally agree is not straightforward and somewhat ad hoc. But based on what you and Nick said, a major overhaul of MC seems to be the only alternative, which is probably overkill for the particular issue this patch tries to solve. However, this patch will not solve the second case you brought up. How likely do you think we are to encounter such cases? Should we consider the multiple-pass solution to future-proof against them if they happen frequently enough, or could we rewrite the assembly instead to avoid such complexity?
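
To make the second case concrete, here is a minimal sketch of it as a toy fixed-point iteration. It is not MC code: `Layout`, `layoutOnce`, `converge`, and the constants are hypothetical names, the 254-byte reach of the short branch is an assumed value, and the distance to `after` is approximated by the size of the `.space` block alone.

```cpp
#include <algorithm>
#include <cstdint>

// Toy model of the quoted example. Every name and constant below is
// hypothetical; none of this is LLVM MC code.
constexpr std::uint64_t ShortBranchReach = 254;   // assumed reach of the 2-byte beq
constexpr std::uint64_t SpaceBytes = 1024 * 1024; // .space 1024 * 1024

struct Layout {
  unsigned BranchSize; // size of `beq after` in bytes: 2 or 4
  bool IfTaken;        // value of `.if . - label == 2`
};

// One pass: evaluate the .if against the current branch size, then choose the
// branch size that the resulting distance to `after` requires.
inline Layout layoutOnce(unsigned BranchSize, bool StickyRelaxation) {
  bool IfTaken = (BranchSize == 2);
  std::uint64_t Distance = IfTaken ? SpaceBytes : 0;
  unsigned Needed = Distance > ShortBranchReach ? 4u : 2u;
  // With sticky relaxation a branch that has grown never shrinks again.
  unsigned Next = StickyRelaxation ? std::max(BranchSize, Needed) : Needed;
  return Layout{Next, IfTaken};
}

// Repeats passes until the branch size stops changing. Returns false if no
// fixed point is reached within MaxPasses.
inline bool converge(bool StickyRelaxation, unsigned MaxPasses, Layout &Out) {
  unsigned Size = 2; // pass 1 starts with the short encoding
  for (unsigned Pass = 0; Pass < MaxPasses; ++Pass) {
    Layout L = layoutOnce(Size, StickyRelaxation);
    if (L.BranchSize == Size) {
      Out = L;
      return true;
    }
    Size = L.BranchSize;
  }
  return false;
}
```

Under these assumptions, `converge(false, 16, L)` returns false because the branch size alternates between 2 and 4, while `converge(true, 16, L)` settles on a 4-byte branch with the .if block skipped. This only models a single .if; as Peter notes, several conflicting .if conditions could still fail to converge even with permanent size increases.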

================
Comment at: llvm/lib/MC/MCExpr.cpp:584
+    // the two symbols belong to two fragments in the same section.
+    // FIXME: can we resolve .if conditions while finalizing layout?
+    if (IsCond && SecB.getFragmentList().getNextNode(*FragB) == FragA &&
----------------
nickdesaulniers wrote:
> jcai19 wrote:
> > peter.smith wrote:
> > > It is difficult to see how it would be possible to resolve .if conditions at layout time in a single pass assembler. In theory the assembler could evaluate all conditional blocks and select between them at layout time, if such a layout could be converged on.
> > Thanks for the clarification, although I am not sure I follow. The code looks iterative to me: https://llvm.org/doxygen/MCAssembler_8cpp_source.html#l00785. I was thinking that as the loop iterates and calls layoutOnce, we could relax (if needed) and calculate the sizes of the fragments before a .if statement, and resolve the condition there. But I have not completely convinced myself that it is doable. Also, cases like the one below will create more complexity.
> > 
> > foo: jump to bar
> > ...
> > .if . - foo = ${constant integer}
> > instr1
> > .else
> > instr2
> > .endif
> > ...
> > bar:
> > 
> > This creates a circular dependency: depending on which instruction is selected in the if-else block, the size of the jump instruction may change because of the number of bits it needs to encode the offset, which in turn affects which instruction should be chosen.
> > 
> See also b/132538429.
> 
> > In order to evaluate the if statement truthfully, it will have to be done after (well, during, since the result could change the relaxation) relaxation, and LLVM MC is just not set up for this currently. Even then, you could probably construct an if statement that could create a paradox, and never converge on a valid relaxation.
Yeah, thanks for the explanation. I wonder if GAS would be able to support all these paradoxes.
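
For readers following the diff context above, here is a rough sketch of the idea of folding a symbol difference across two neighbouring fragments before layout. It is not the patch's code: `Fragment`, `Symbol`, and `tryFoldDifference` are hypothetical stand-ins for the MC classes, and the requirement that the earlier fragment has a fixed size is an assumption of this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for MC concepts; not the real MCFragment/MCSymbol.
struct Fragment {
  std::uint64_t Size; // bytes currently in the fragment
  bool SizeIsFixed;   // false if the fragment may still grow (e.g. relaxable)
};

struct Symbol {
  std::size_t FragIndex; // index of the containing fragment in its section
  std::uint64_t Offset;  // offset from the start of that fragment
};

// Tries to compute A - B without a finished layout. Succeeds when both
// symbols share a fragment, or when A's fragment immediately follows B's
// fragment and B's fragment has a fixed size (an assumption of this sketch).
inline bool tryFoldDifference(const std::vector<Fragment> &Section,
                              const Symbol &A, const Symbol &B,
                              std::int64_t &Result) {
  if (A.FragIndex >= Section.size() || B.FragIndex >= Section.size())
    return false;

  if (A.FragIndex == B.FragIndex) {
    Result = static_cast<std::int64_t>(A.Offset) -
             static_cast<std::int64_t>(B.Offset);
    return true;
  }

  const Fragment &FragB = Section[B.FragIndex];
  if (A.FragIndex == B.FragIndex + 1 && FragB.SizeIsFixed &&
      B.Offset <= FragB.Size) {
    // Bytes left in B's fragment after B, plus A's offset in the next one.
    Result = static_cast<std::int64_t>(FragB.Size - B.Offset) +
             static_cast<std::int64_t>(A.Offset);
    return true;
  }

  return false; // would need a real layout to answer
}
```

The sketch deliberately gives up whenever the earlier fragment could still change size, since that is exactly the situation where the answer depends on relaxation and a .if condition cannot be resolved this early.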

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69411/new/

https://reviews.llvm.org/D69411