[LLVMdev] Symbol folding with MC

Tue Apr 26 15:02:47 PDT 2011

On Apr 26, 2011, at 1:27 PM, Borja Ferrer wrote:

> Hello Jim, thanks for the reply,
> 
> For normal additions with immediates I've done the same as ARM does, basically transforming add(x, imm) nodes to sub(x, -imm) with a pattern in the .td file like this:
> def : Pat<(add DLDREGS:$src1, imm:$src2),
>               (SUBIWRdK DLDREGS:$src1, (imm16_neg_XFORM imm:$src2))>;
> 

Cool. That's exactly the sort of thing I was referring to.
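
For reference, the arithmetic behind that rewrite can be checked with a small standalone C++ sketch. This only models 16-bit wraparound arithmetic; it is not target or LLVM code, and the names subImm16, negImm16 and addViaSub are made up for illustration.

```cpp
#include <cstdint>

// Hypothetical model of a 16-bit "subtract immediate" instruction:
// computes x - k with wraparound modulo 2^16.
constexpr std::uint16_t subImm16(std::uint16_t x, std::uint16_t k) {
  return static_cast<std::uint16_t>(x - k);
}

// Negate an immediate modulo 2^16, which is what a negating
// transform on the immediate operand is assumed to do.
constexpr std::uint16_t negImm16(std::uint16_t imm) {
  return static_cast<std::uint16_t>(0u - imm);
}

// add(x, imm) expressed as sub(x, -imm).
constexpr std::uint16_t addViaSub(std::uint16_t x, std::uint16_t imm) {
  return subImm16(x, negImm16(imm));
}

// Compile-time checks: the rewrite agrees with ordinary addition,
// including the case where the sum wraps around.
static_assert(addViaSub(10, 5) == 15, "plain addition");
static_assert(addViaSub(0xFFFF, 1) == 0, "wraps modulo 2^16");
```

Because both the negation and the subtraction wrap modulo 2^16, the rewritten form gives the same result as the original add for every pair of 16-bit inputs.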

> Now, the typical pattern concerning additions with global addresses looks like this: (taken from x86)
> def : Pat<(add GR32:$src1, (X86Wrapper tglobaladdr:$src2)),
>               (ADD32ri GR32:$src1, tglobaladdr:$src2)>;
> 
> But I can't write that since I don't have an add-with-immediate instruction, and doing:
> 
> def : Pat<(add DREGS:$src, (Wrapper tglobaladdr:$src2)),
>               (SUBIWRdK DREGS:$src, tglobaladdr:$src2)>;
> is wrong because the tglobaladdr has to be negated somehow. I don't understand how I should negate the symbol reference using patterns, if it's even possible. The obvious hack is adding a "-" character when lowering the symbol reference into text.
> 

You can probably do some of this with a complex pattern that has a transform function. Something like (completely untested, etc):

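// makeNegatedGlobalAddr is a hypothetical helper; N is the matched node.
def neg_tglobaladdr_XFORM : SDNodeXForm<tglobaladdr, [{
  return makeNegatedGlobalAddr(N, CurDAG);
}]>;
// Match only when the node really is a target global address.
def neg_tglobaladdr : PatLeaf<(tglobaladdr), [{
  return N->getOpcode() == ISD::TargetGlobalAddress;
}], neg_tglobaladdr_XFORM>;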

def : Pat<(add DREGS:$src, (Wrapper tglobaladdr:$src2)),
              (SUBIWRdK DREGS:$src, neg_tglobaladdr:$src2)>;

As you note below, however, that sort of thing only gets you partway there.

> Regarding my second question: as you mentioned, all symbols have static addresses, so no relocations are performed and it should be safe to fold immediate operations into the symbol reference. My problem is that I don't know how to fold an arbitrary expression on a global (initially in the form of a DAG) into something that can later be translated into an MC expression. It's awkward because the operations are performed inside the operand of an instruction, and since any arbitrary expression has to be supported, you can't cover every combination of operations with custom instructions. So how should I proceed here: with custom lowering or with target DAG combines?

Yeah, machine instruction operands aren't set up to handle that sort of thing. This is outside the scope of what LLVM ordinarily does.

I suspect that you'll need to modify the MachineOperand class to have a Kind that accepts MCExpr operands. The combiners and isel patterns would then have a place to hang the expressions they create. Your MC lowering pass would then have the information it needs.

I'm not completely thrilled with that idea, as it seems a bit heavyweight. Perhaps someone else has a better plan they can suggest.
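
To make the notion of an operand that carries an expression a bit more concrete, here is a small standalone C++ sketch. It is a toy expression tree, not LLVM's MCExpr API (whose real classes include MCConstantExpr, MCSymbolRefExpr, MCUnaryExpr and MCBinaryExpr), and it says nothing about how MachineOperand would actually store such a thing. The names Expr, constant, symbol, neg, add and print are all invented for illustration.

```cpp
#include <cstdint>
#include <memory>
#include <string>

// Toy expression node: a constant, a symbol reference, a negation,
// or an addition. Purely illustrative; not an LLVM type.
struct Expr {
  enum Kind { Constant, Symbol, Neg, Add } kind;
  std::int64_t value = 0;          // used when kind == Constant
  std::string name;                // used when kind == Symbol
  std::shared_ptr<Expr> lhs, rhs;  // Neg uses lhs only
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr constant(std::int64_t v) {
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Constant;
  e->value = v;
  return e;
}

ExprPtr symbol(const std::string &n) {
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Symbol;
  e->name = n;
  return e;
}

// Build -a, folding immediately when a is a constant.
ExprPtr neg(ExprPtr a) {
  if (a->kind == Expr::Constant)
    return constant(-a->value);
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Neg;
  e->lhs = a;
  return e;
}

// Build a + b, folding immediately when both sides are constants.
ExprPtr add(ExprPtr a, ExprPtr b) {
  if (a->kind == Expr::Constant && b->kind == Expr::Constant)
    return constant(a->value + b->value);
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Add;
  e->lhs = a;
  e->rhs = b;
  return e;
}

// Render the expression as assembler-style text.
std::string print(const ExprPtr &e) {
  switch (e->kind) {
  case Expr::Constant: return std::to_string(e->value);
  case Expr::Symbol:   return e->name;
  case Expr::Neg:      return "-" + print(e->lhs);
  case Expr::Add:      return "(" + print(e->lhs) + "+" + print(e->rhs) + ")";
  }
  return "";
}
```

With these helpers, print(neg(add(symbol("foo"), add(constant(2), constant(3))))) produces "-(foo+5)": the constant part is folded when the tree is built, and the symbolic part is left for the assembler to resolve. A combiner would build such a tree, and the lowering step would translate it into whatever expression form the streamer expects.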

Regards,
  Jim