[llvm-dev] [RISCV][PIC] Lowering pseudo instructions in MCCodeEmitter vs AsmPrinter

Tue Jul 10 13:26:44 PDT 2018

On 10 July 2018 at 17:51, Roger Ferrer Ibáñez via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> Hi all,
>
> I'm looking at generating PIC code for RISC-V in the context of Linux. I'm
> not sure if anyone is working on this already; any input is very welcome.

Great, that would be a useful contribution.

> I'm now looking at function calls which in the RISCV backend are represented
> via two pseudoinstructions RISCV::TAIL and RISCV::CALL.
>
> Currently those pseudos are lowered in MCCodeEmitter. They are expanded into
> AUIPC and JALR instructions and the first one needs a relocation, which for
> a static reloc model is R_RISCV_CALL but for PIC code should be
> R_RISCV_CALL_PLT.
>
> The problem I find is that at this point it is too late to tell the exact
> relocation needed: as far as I can tell there is no way to determine the
> relocation model. Perhaps this is on purpose and the MCCodeEmitter should
> not have that knowledge. Or maybe not, and it is just a matter of "pushing"
> a TargetMachine into it, but given the way the class is constructed, this
> approach does not look workable.
>
> So I was considering lowering these pseudo-instructions in AsmPrinter
> instead. There I can tell the exact kind of the MCOperand I want thanks to
> the fact that the AsmPrinter is constructed with a TargetMachine.
>
> That said, perhaps there are extra constraints that require doing the
> lowering in MCCodeEmitter. Unfortunately, I can't tell exactly what the
> advantage of lowering that late is.

As there is no way of generating an R_RISCV_CALL relocation in
assembly other than using the call pseudoinstruction, the desire is
that you can produce an ELF with that relocation regardless of whether
you emit .s and then assemble it or emit the .o directly. This pushes
you towards lowering at rather a late stage. There may be better ways
of structuring the current logic to achieve that aim of course.
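
To make the late expansion concrete, here is a minimal standalone sketch
of what expanding the pseudo at encoding time amounts to. The type and
function names (PseudoOp, RelocKind, Fixup, expandCallPseudo) are
hypothetical stand-ins, not LLVM classes, and the register choice for
tail (t1) is an assumption about the usual expansion. It only shows that
one pseudo becomes an AUIPC+JALR pair with a single fixup attached to the
AUIPC.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration; these are not LLVM classes.
enum class PseudoOp { Call, Tail };
enum class RelocKind { Call, CallPLT }; // R_RISCV_CALL / R_RISCV_CALL_PLT

struct Fixup {
  uint32_t offset; // byte offset of the fixup within the expansion
  RelocKind kind;
};

struct Expansion {
  std::vector<uint32_t> words; // encoded instructions, immediates left as 0
  Fixup fixup;
};

// AUIPC rd, 0 (U-type, opcode 0x17).
uint32_t encodeAUIPC(unsigned rd) {
  assert(rd < 32);
  return (rd << 7) | 0x17;
}

// JALR rd, 0(rs1) (I-type, opcode 0x67, funct3 0).
uint32_t encodeJALR(unsigned rd, unsigned rs1) {
  assert(rd < 32 && rs1 < 32);
  return (rs1 << 15) | (rd << 7) | 0x67;
}

// Expand a call/tail pseudo into AUIPC+JALR. The relocation kind is an
// input here: the expansion itself does not know the relocation model.
Expansion expandCallPseudo(PseudoOp op, RelocKind kind) {
  // Assumption: call goes through ra (x1) and links in ra; tail goes
  // through t1 (x6) and discards the link by writing x0.
  const unsigned scratch = (op == PseudoOp::Call) ? 1 : 6;
  const unsigned link = (op == PseudoOp::Call) ? 1 : 0;
  Expansion e;
  e.words = {encodeAUIPC(scratch), encodeJALR(link, scratch)};
  e.fixup = {0, kind}; // the fixup sits on the AUIPC, at offset 0
  return e;
}
```

In this sketch the relocation kind arrives as a parameter, which mirrors
the question in this thread: the emitter can stay ignorant of the
relocation model as long as the operand already carries that information.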

> Alternatively I was considering adding two new pseudos like RISCV::CALL_PLT
> and RISCV::TAIL_PLT and also lower them at MCCodeEmitter. But this looks a
> bit too bulky to me and I think I would still have the issue that the "call"
> and "tail" pseudos in the assembler would need some extra magic (i.e. when
> assembling a "call" pseudoinstruction with -fPIC) so they don't end up being
> parsed as the non-PIC counterparts. I might be wrong here though.

As Eli suggests, using the same instruction with different VariantKind
and/or MachineOperand flags would be the right way to go. It seems
that `call foo` in binutils gas always produces an R_RISCV_CALL
relocation while `call foo@plt` will produce R_RISCV_CALL_PLT.
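
As a rough illustration of the "same instruction, different variant"
idea, here is a small standalone sketch. All names in it
(CallVariantKind, CallTarget, parseCallOperand, printCallOperand,
relocFor) are hypothetical and are not the LLVM API. The point is only
that the variant travels with the operand, so the .s path and the direct
.o path pick the same relocation. The numeric relocation values are the
ones I believe the RISC-V ELF psABI assigns.

```cpp
#include <cassert>
#include <string>

// Hypothetical names for illustration; none of these are LLVM types.
enum class CallVariantKind { None, PLT };

// Relocation numbers as assigned by the RISC-V ELF psABI.
enum class CallReloc : unsigned { R_RISCV_CALL = 18, R_RISCV_CALL_PLT = 19 };

struct CallTarget {
  std::string symbol;
  CallVariantKind kind;
};

// Assembler side: "foo" carries no variant, "foo@plt" carries PLT.
CallTarget parseCallOperand(const std::string &text) {
  assert(!text.empty());
  const std::string suffix = "@plt";
  if (text.size() > suffix.size() &&
      text.compare(text.size() - suffix.size(), suffix.size(), suffix) == 0)
    return {text.substr(0, text.size() - suffix.size()), CallVariantKind::PLT};
  return {text, CallVariantKind::None};
}

// Printer side: emit the suffix again so the .s output round-trips.
std::string printCallOperand(const CallTarget &target) {
  return target.kind == CallVariantKind::PLT ? target.symbol + "@plt"
                                             : target.symbol;
}

// Emitter side: the relocation depends only on the operand's variant,
// never on the relocation model.
CallReloc relocFor(const CallTarget &target) {
  return target.kind == CallVariantKind::PLT ? CallReloc::R_RISCV_CALL_PLT
                                             : CallReloc::R_RISCV_CALL;
}
```

With something like this, the decision about PIC is made once, wherever
the operand is created (codegen, which knows the relocation model, or
the assembly parser, which sees the explicit suffix), and everything
downstream just reads the variant.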

Best,

Alex