[llvm-dev] Limited use types in the back end
Eli Friedman via llvm-dev
llvm-dev at lists.llvm.org
Wed Jan 29 15:01:57 PST 2020
If you want to investigate extending inline asm capabilities, I’d start by looking at InstrEmitter::EmitSpecialNode. It already understands inline asm operands that don’t fit in a single register; it just isn’t handling them the way you want it to. Basically, the idea would be that you emit a REG_SEQUENCE pseudo-instruction to merge the register operands into one big register, and EXTRACT_SUBREG to split the register result into multiple registers.
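To make the merge/split idea concrete, here is a small standalone C++ model. It is not LLVM code and does not touch InstrEmitter. The names VecReg, WideReg, regSequence and extractSubReg are made up for illustration, and the two-lane register width is an assumption. It only shows the data flow: four register-sized values are packed into one wide value going into the asm, and the wide result is unpacked again coming out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-ins (assumptions): one legal vector register holds two 64-bit lanes,
// and a wide register is a tuple of four such registers.
using VecReg = std::array<std::uint64_t, 2>;
constexpr std::size_t kSubRegs = 4;
using WideReg = std::array<VecReg, kSubRegs>;

// Plays the role of REG_SEQUENCE: merge four register values into one tuple.
WideReg regSequence(const VecReg &S0, const VecReg &S1,
                    const VecReg &S2, const VecReg &S3) {
  return WideReg{{S0, S1, S2, S3}};
}

// Plays the role of EXTRACT_SUBREG: read one sub-register out of the tuple.
VecReg extractSubReg(const WideReg &W, std::size_t SubIdx) {
  assert(SubIdx < kSubRegs && "sub-register index out of range");
  return W[SubIdx];
}
```

In the real emitter the equivalent steps would operate on virtual registers and sub-register indices rather than on values, so treat this purely as a picture of the shape of the transformation.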
From: Nemanja Ivanovic <nemanja.i.ibm at gmail.com>
Sent: Wednesday, January 29, 2020 12:19 PM
To: Eli Friedman <efriedma at quicinc.com>
Cc: llvm-dev <llvm-dev at lists.llvm.org>
Subject: [EXT] Re: [llvm-dev] Limited use types in the back end
Well, frankly the issue is mainly the inline asm.
Say the instruction has the form
<opcode> RT, RA, RB
Where the register numbers of RT, RA and RB all have to be multiples of 4. The instruction does a binary operation: RA/RA+1/RA+2/RA+3 <op> RB/RB+1/RB+2/RB+3. That is, the operation is performed on 4 vector registers at a time, producing a 4 vector register result.
This can be modeled in a rather straightforward way with the right number of operands and results to the SDAG node for the operation.
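As a rough illustration of the register constraint described above, the following standalone C++ sketch checks that a base register number is a multiple of 4 and expands it to the four consecutive registers the instruction would touch. The register file size of 64 and the names isValidQuadBase and expandQuad are assumptions for the example, not details of the actual target.

```cpp
#include <array>
#include <cassert>

// Assumptions for illustration: 64 vector registers, groups of four.
constexpr unsigned kNumVecRegs = 64;
constexpr unsigned kGroupSize = 4;

// True if Reg can serve as RT, RA or RB: it must be a multiple of 4 and
// leave room for the three registers that follow it.
bool isValidQuadBase(unsigned Reg) {
  return Reg % kGroupSize == 0 && Reg + kGroupSize <= kNumVecRegs;
}

// Expand a base register into the group R, R+1, R+2, R+3.
std::array<unsigned, kGroupSize> expandQuad(unsigned Base) {
  assert(isValidQuadBase(Base) && "base register must be a multiple of 4");
  std::array<unsigned, kGroupSize> Regs{};
  for (unsigned I = 0; I < kGroupSize; ++I)
    Regs[I] = Base + I;
  return Regs;
}
```

For example, expandQuad(8) yields registers 8, 9, 10 and 11, while 6 is rejected as a base because it is not a multiple of 4.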
However, to let the register allocator select a register for an inline asm constraint, I need to say that variable X goes into register R. I have a constraint that says give me one of these registers that are composed of 4 other registers. And the variable has a type that is as wide as 4 vectors (say v8i64). Then when the DAG builder tries to build the INLINEASM node for that directive, it wants to split the illegal type into 4 vectors to create the CopyToReg nodes.
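The splitting step can be pictured with the standalone C++ sketch below. It is not the DAG builder's code. splitIntoParts is a hypothetical helper, and treating a two-element vector as the widest legal type is an assumption. It shows an eight-element value (standing in for v8i64) being cut into four register-sized pieces, which is the point where the single wide register I asked for no longer matches what the builder wants to copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Split a wide vector value into consecutive parts of PartElts elements each,
// mimicking how an illegal wide type is broken into legal-sized pieces.
std::vector<std::vector<std::uint64_t>>
splitIntoParts(const std::vector<std::uint64_t> &Wide, std::size_t PartElts) {
  assert(PartElts != 0 && Wide.size() % PartElts == 0 &&
         "wide value must split evenly into parts");
  std::vector<std::vector<std::uint64_t>> Parts;
  for (std::size_t I = 0; I < Wide.size(); I += PartElts) {
    auto First = Wide.begin() + static_cast<std::ptrdiff_t>(I);
    auto Last = First + static_cast<std::ptrdiff_t>(PartElts);
    Parts.emplace_back(First, Last);  // one register-sized piece
  }
  return Parts;
}
```

Calling splitIntoParts on an eight-element value with PartElts set to 2 gives four parts, one per underlying vector register.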
This is really an issue with any type that is wider than the widest register.
On Mon, Jan 27, 2020 at 3:26 PM Eli Friedman <efriedma at quicinc.com> wrote:
I’m not sure I understand the difficulty here. Normally, if you have an instruction which has multiple operand/result registers, you just make the SelectionDAG node have multiple operand/result values. If there are weird register allocation constraints, you can handle that in ISelDAGToDAG. (There are a few ARM instructions that expect multiple registers in ascending order, like vtbl and vld4/vst4.)
If you need inline asm operands/results with an illegal type, that’s sort of an independent issue. x86 uses a fake register class to handle the “A” constraint, which refers to the register pair RAX/RDX (see X86TargetLowering::getRegForInlineAsmConstraint). If that doesn’t work in your case, I’m not sure what I’d do off the top of my head; maybe the code for lowering inline asm could be extended.
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Nemanja Ivanovic via llvm-dev
Sent: Monday, January 27, 2020 8:59 AM
To: llvm-dev <llvm-dev at lists.llvm.org>
Subject: [EXT] [llvm-dev] Limited use types in the back end
I am hoping that someone can offer advice on a somewhat unusual issue that I am facing with the SDAG. Namely, I am trying to implement some custom operations that do very specific things on multiple registers at a time. The operations themselves will simply be intrinsics since there are no equivalent operations in IR/SDAG. However, handling the types seems rather tricky.
One approach I tried is to create a register class that has the wide registers with proper sub registers and then tell the SDAG that the correspondingly wide type can go into those registers. While this works, it has a very unfortunate side effect: the type legalizer leaves any node with such a type untouched, and I have to mark all operations as non-legal (mostly Expand).
For example, I could say that the type v8i64 can go into these registers and then I can use the type for my intrinsics. However, the type legalizer will leave all nodes with this result/operand type alone which is not at all what I want.
Then I tried the opposite approach - just custom lower only specific nodes that have this result type and let the type legalizer handle all the others normally. This works quite well except if I want to expose those custom instructions through inline asm. The DAG builder complains if I am trying to assign one of these wide registers to a value with the wide type because it assumes that the wide value will need to be broken up.
I suppose I could define a new type for the IR/SDAG and use it, but that seems like a super pervasive approach.
So either direction I go in seems to have a major drawback.