<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Fri, Oct 7, 2016 at 12:20 PM, Evgenii Stepanov <span dir="ltr">&lt;<a href="mailto:eugeni.stepanov@gmail.com" target="_blank">eugeni.stepanov@gmail.com</a>&gt;</span> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">I've tried implementing some of the alternatives mentioned in this<br>

thread, and so far I like this syntax the most:<br>

<br>

i32 reloc (29, void ()* @f, 3925868544)<br>

; 29 = 0x1d = R_ARM_JUMP24<br>

; 3925868544 = 0xea000000<br>

<br>

Note the zeroes in the relocated data instead of 0xfffffe in the<br>

original proposal. This is aligned with the way LLVM emits relocations<br>

in the backend, and avoids encoding the addend in a<br>

relocation-specific way in the IR.</blockquote><div><br></div><div>I am confused by this statement. If the zeros aren't what appears in the object file, this seems rather relocation-specific to me.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Instead, the addend can be<br>

specified in the second argument with the regular IR expressions, like<br>

the following:<br>

<br>

@w = internal global [3 x i32]<br>

   [i32 reloc (29, void ()* @f, 3925868544),<br>

    i32 reloc (29, [3 x i32]* @w, 3925868544),<br>

    i32 reloc (29, i32* getelementptr (i32, i32* bitcast ([3 x i32]*<br>

@w to i32*), i32 1), 3925868544)<br>

], align 4</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> </blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">we also get relocations for elements 1 and 2 of @w optimized out for</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">free. If the "addend" (i.e. the third arg of reloc) was specified as</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">0xeafffffe, the backend would have had to decode this value first.</blockquote><div><br></div><div>I think it may be OK to allow non-global constants as the second operand (the utility of this feature being the ability to freely RAUW a global without worrying about reloc constants).</div><div><br></div><div>This doesn't necessarily need to act as an alternative means of specifying an addend, though. Instead, the backend could synthesise local symbols to act as relocation targets. For example, your example would conceptually translate to:</div><div><br></div><div><div>@w = internal global [3 x i32]</div><div>   [i32 reloc (29, void ()* @f, 3925868544),</div><div>    i32 reloc (29, [3 x i32]* @w, 3925868544),</div><div>    i32 reloc (29, i32* @dummy, 3925868544)</div><div>   ], align 4</div></div><div><br></div><div>@dummy = internal alias i32* getelementptr (i32, i32* bitcast ([3 x i32]* @w to i32*), i32 1)</div><div><br></div><div>This way, you save yourself from needing to worry about manipulating addends in the backend; the linker will take care of it for you.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

On the other hand, it is possible for a constant expression in the IR<br>

to be lowered to something that is not a valid relocation target, and<br>

it is hard to detect this problem at the IR level.<br></blockquote><div><br></div><div>Right, this is of course a problem we already have for aliasees and constant initializers.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

Also, separating the addend from the section data allows the backend<br>

to choose between .rel and .rela representations.<br></blockquote><div><br></div><div>Do you have an example of a rela relocation which uses both r_addend and the underlying value in the object file?</div><div><br></div><div>Peter</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

On Wed, Aug 26, 2015 at 3:29 PM, Peter Collingbourne via llvm-dev<br>

&lt;<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>&gt; wrote:<br>

> On Wed, Aug 26, 2015 at 03:53:33PM -0400, Rafael Espíndola wrote:<br>

>> > I'm not sure if this would be sufficient. The R_ARM_JUMP24 relocation<br>

>> > on ARM has specific semantics to implement ARM/Thumb interworking; see<br>

>> > <a href="http://infocenter.arm.com/help/topic/com.arm.doc.ihi0044e/IHI0044E_aaelf.pdf" rel="noreferrer" target="_blank">http://infocenter.arm.com/help<wbr>/topic/com.arm.doc.ihi0044e/<wbr>IHI0044E_aaelf.pdf</a><br>

>> > Note that R_ARM_CALL has the same operation but different semantics.<br>

>> > I suppose that we could try looking at the addend to decide which relocation<br>

>> > to use, but this would mean adding more complexity to the assembler (along<br>

>> > with any pattern matching that would need to be done). It seems simpler,<br>

>> > both conceptually and in the implementation, for the client to directly say<br>

>> > what it wants in the object file.<br>

>> ><br>

>> > There's also the point that if @foo is defined outside the current linkage<br>

>> > unit, or refers to a Thumb function, the above expression in a constant<br>

>> > initializer would refer to the function's PLT entry or a shim, but in a<br>

>> > function it would refer to the function's actual address, so the evaluation<br>

>> > of this expression would depend on whether it was constant folded. (Although<br>

>> > on the other hand we might just declare that by using such a constant in a<br>

>> > global initializer that may be constant folded the client is asserting that<br>

>> > it doesn't care which address is used.)<br>

>><br>

>> I am pretty sure there is use for some target specific expressions, my<br>

>> concerns are<br>

>> * Using a target specific expression when it could be represented in a<br>

>> target independent way (possibly a bit more verbose).<br>

><br>

> Well I don't think there's a target independent way to write an R_ARM_JUMP24<br>

> relocation, as there's no way to represent the PLT entry or interworking<br>

> shim in IR.<br>

><br>

>> * Using the raw relocation values, instead of something like<br>

>> thumb_addr_delta. With this the semantics of each constant expression<br>

>> are still documented in the language reference.<br>

><br>

> I guess there are two ways we can go here:<br>

><br>

> 1) expose the raw relocation values<br>

> 2) introduce new specific ConstantExpr subtypes for the target-specific things we need<br>

><br>

> In this case I think we should do one or the other, I don't really think it's<br>

> worth adding a half measure of flexibility (e.g. providing a way to specify<br>

> the addend of a R_ARM_JUMP24 when it will pretty much always be the same).<br>

><br>

> I like option 1 because it's more general purpose and ultimately less of an<br>

> impedance mismatch between what the client wants and what appears in the<br>

> object file, and we can solve the documentation problem with reference to<br>

> the object file format documentation, but it would require our documentation<br>

> to depend on sometimes poorly documented object file formats.<br>

><br>

> Option 2 could look something like this (produces the same bytes as "b<br>

> some_label" in every object format when targeting ARM, or "b.w some_label"<br>

> when targeting Thumb):<br>

><br>

> i32 arm_b (void ()* @some_label)<br>

><br>

> and that would be easy to document on its own. The downside is that it's<br>

> pretty specific to my use case, but maybe that's ok.<br>

><br>

> 2 seems like it would be less implementation work, and doesn't require any<br>

> changes to the assembly format (and ultimately could be upgraded to 1 later<br>

> if needed), so maybe it's best to start with that.<br>

><br>

>> >> Why do you need to be able to avoid them showing up in function<br>

>> >> bodies? It would be unusual but valid to pass the above value as an<br>

>> >> argument to a function.<br>

>> ><br>

>> > This was part of the proposal mainly for the constant folding reasons mentioned<br>

>> > above, but if we did go with a reloc expression we'd need to encode the<br>

>> > original constant address in the reloc for PC-relative expressions, which<br>

>> > wouldn't be necessary if we disallow it.<br>

>><br>

>> Seems better to make it explicit IMHO.<br>

><br>

> Okay, but if we do introduce a new constant kind, there doesn't seem to be<br>

> much point in teaching the backend to lower it in a function, other than<br>

> for completeness. If we can avoid having to do that, that seems preferable.<br>

><br>

>> BTW, about the assembly change: Please check what the binutils guys<br>

>> think of it. We do have extensions, but it is nice to at least let<br>

>> them know so that we don't end up with two independent solutions in<br>

>> the future.<br>

><br>

> Yes if I ultimately go with 1.<br>

><br>

> Thanks,<br>

<span><font color="#888888">> --<br>

> Peter<br>


</font></span></blockquote></div>
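<div><br></div><div>For illustration, here is a minimal Python sketch of the addend question discussed at the top of this message: the same R_ARM_JUMP24 addend can live either in the instruction word (REL style) or in a separate r_addend field (RELA style). It assumes the standard ARM B encoding, where the low 24 bits hold a signed word offset, and the usual REL convention that the addend is read back from that field. The helper names (encode_in_place, decode_in_place, as_rel, as_rela) are hypothetical and are not LLVM APIs; the dictionaries only model what would end up in the section data and in the relocation record.</div>
<pre>
```python
# Sketch only: every helper here is hypothetical and is not part of LLVM.
R_ARM_JUMP24 = 29        # 0x1d, the relocation type used in the thread
B_ALWAYS = 0xEA000000    # ARM "b" with cond=AL and an all-zero imm24 field
IMM24_MASK = 0x00FFFFFF  # low 24 bits of the word hold the branch offset


def encode_in_place(opcode: int, addend: int) -> int:
    """REL style: fold a byte addend into the imm24 field of a branch word."""
    if addend % 4 != 0:
        raise ValueError("ARM branch addends must be multiples of 4")
    if not -(1 << 25) <= addend < (1 << 25):
        raise ValueError("addend does not fit in a signed 24-bit word offset")
    cleared = opcode & ~IMM24_MASK & 0xFFFFFFFF
    return cleared | ((addend >> 2) & IMM24_MASK)


def decode_in_place(word: int) -> int:
    """Recover the byte addend: sign-extend imm24, then scale words to bytes."""
    imm24 = word & IMM24_MASK
    if imm24 & 0x00800000:  # sign bit of the 24-bit field
        imm24 -= 1 << 24
    return imm24 * 4


def as_rel(opcode: int, addend: int) -> dict:
    """Model a .rel entry: the addend is hidden inside the section data."""
    return {"type": R_ARM_JUMP24, "data": encode_in_place(opcode, addend)}


def as_rela(opcode: int, addend: int) -> dict:
    """Model a .rela entry: section data keeps zeros, addend is explicit."""
    return {"type": R_ARM_JUMP24, "data": opcode, "r_addend": addend}


if __name__ == "__main__":
    # -8 cancels the ARM convention that PC reads as "this instruction + 8".
    print(hex(as_rel(B_ALWAYS, -8)["data"]))   # prints 0xeafffffe
    print(hex(as_rela(B_ALWAYS, -8)["data"]))  # prints 0xea000000
    print(decode_in_place(0xEAFFFFFE))         # prints -8
```
</pre>
<div>Under these assumptions, 0xeafffffe is just 0xea000000 with an addend of -8 folded into it, which is the decoding step a backend would need if the IR carried the combined value.</div>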

</div></div>