[LLVMdev] Instruction descriptions question
Roman Levenstein
romixlev at yahoo.com
Mon Oct 2 12:37:42 PDT 2006
Hi Chris,
Thanks a lot for your answer!
Chris Lattner wrote:
>> 1. Why does X86 instruction set description provide different
>> descriptions for the same instructions, which differ only in the
size
>> of operands?
>> E.g.
>>
>> def MOV8rm : I<0x8A, MRMSrcMem, (ops GR8 :$dst, i8mem :$src),
>> "mov{b} {$src, $dst|$dst, $src}",
>> [(set GR8:$dst, (load addr:$src))]>;
>> def MOV16rm : I<0x8B, MRMSrcMem, (ops GR16:$dst, i16mem:$src),
>> "mov{w} {$src, $dst|$dst, $src}",
>> [(set GR16:$dst, (load addr:$src))]>, OpSize;
>> def MOV32rm : I<0x8B, MRMSrcMem, (ops GR32:$dst, i32mem:$src),
>> "mov{l} {$src, $dst|$dst, $src}",
>> [(set GR32:$dst, (load addr:$src))]>;
>
> We must do this, because they perform different operations.
Specifically,
> the loads all read a different number of bytes (1/2/4).
OK.
>> For my target processor, only the types of operands (e.g. rm) are
>> really important. The size is encoded into the opcode, but otherwise
it
>> does not affect anything except for constraints on the sizes of
>> operands. And constraint is basically that the size of both operands
>> should be the same.
>
> Ok.
>
>> Wouldn't it be possible and even more clean to have just one
>> description like (I use a pseudo-description here):
>>
>> def MOVrr : I<0x88, MRMDestReg, (ops (GR8|GR16|GR32) :$dst,
>> (i8mem|i16mem|i32mem):$src),
>> "mov{b} {$src, $dst|$dst, $src}", []>,
isSameSize($dst,
>> $src);
>
> We already have something like this, but it's a little more general.
The
> X86 backend hasn't been converted to use it.
Any plans to do it, to make things simpler?
> This is the 'multiclass'
> facility in tblgen:
> http://llvm.org/docs/TableGenFundamentals.html#multiclass
This is very interesting.
> Basically this lets you use one definition to implement multiple
different
> instructions. For example, most instructions in the sparc target
come in
> "reg,reg" and "reg,imm" forms. As such, it defines:
>
> multiclass F3_12<string OpcStr, bits<6> Op3Val, SDNode OpNode> {
> def rr : F3_1<2, Op3Val,
> (ops IntRegs:$dst, IntRegs:$b, IntRegs:$c),
> !strconcat(OpcStr, " $b, $c, $dst"),
> [(set IntRegs:$dst, (OpNode IntRegs:$b,
IntRegs:$c))]>;
> def ri : F3_2<2, Op3Val,
> (ops IntRegs:$dst, IntRegs:$b, i32imm:$c),
> !strconcat(OpcStr, " $b, $c, $dst"),
> [(set IntRegs:$dst, (OpNode IntRegs:$b,
simm13:$c))]>;
> }
>
> which allows it to use instructions like:
>
> defm AND : F3_12<"and" , 0b000001, and>;
> defm OR : F3_12<"or" , 0b000010, or>;
> defm XOR : F3_12<"xor" , 0b000011, xor>;
> defm SLL : F3_12<"sll" , 0b100101, shl>;
> defm SRL : F3_12<"srl" , 0b100110, srl>;
> defm SRA : F3_12<"sra" , 0b100111, sra>;
> defm ADD : F3_12<"add" , 0b000000, add>;
> defm ADDCC : F3_12<"addcc", 0b010000, addc>;
> defm ADDX : F3_12<"addx" , 0b001000, adde>;
> defm SUB : F3_12<"sub" , 0b000100, sub>;
> defm SUBX : F3_12<"subx" , 0b001100, sube>;
> defm SUBCC : F3_12<"subcc", 0b010100, SPcmpicc>;
> ...
>
> Each of these 'defm's expand into two instructions.
>
>> The semantic of such a description would mean that $dst should be
one
>> of GR8, GR16, GR32 and $dst is one of i8mem, i16mem, i32mem with the
>> additional constraint that the sizes of both operands are the same
>> (this is checked by isSameSize predicate).
>
> Sure. It would be straight-forward to define a multiclass for your
arch,
> in which each use of defm makes three instructions: one for each
width.
Beautiful! This saved my day! Multiclass is an extremely useful
construct. Thanks a lot for explanations. Multiclasses seem to provide
almost everything (if not all) of the features that I was asking for.
I'll try to use it for writing a short and concise definition of
instructions for my target.
>> 2. Another related question is about instruciton costs. In
BURG-based
>> selectors there is usually a possibility to describe costs for the
>> instructions so that a least-cost cover can be found during the
>> instruction selection process. I have not seen any such cost
>> descriptions in TableGen files. Does it mean that it is not
supported?
>
> LLVM currently uses assumes that all instructions have cost 1, unless
told
> otherwise ('Pat' patterns which produce multiple instructions have
cost
> equal to the sum of their result). Given this, it attempts to match
the
> largest patterns possible, which reduces cost.
>
> The 'largest' metric is somewhat intricate, but you can directly
affect it
> in your .td files with the 'AddedComplexity'. If you increase it, it
will
> make an instruction more likely to be matched. However, you really
> shouldn't need to do this, except in extreme cases. Are you seeing
cases
> where the selector is missing an optimal match, or are you just used
to
> other tools which aren't as smart at inferring cost as tblgen?
I guess, I'm just used to other more BURG-like tools. And yes, they are
not too intelligent compared to tblgen;)
So far, I haven't seen a real case where an optimal match is missed by
tblgen. But I have not completely converted the old BURG-spec into
tblgen yet. When I'm ready, I'll try to compare the results and check
some corner cases (for example, I was using tree grammar to
automatically select a best addressing mode, which tblgen doesn't
support yet as far as I understand from the documentation; there were
also interesting tree patterns for C operations of the form X OP= Y and
increment operators) - let's see how tblgen will perform.
>> Why? As far as I understand, LLVM uses BURG/IBURG for instruction
>> selection and both of these tools support costs, AFAIK.
>
> Nope, it uses neither. It uses custom code built into tblgen.
OK. Now I understand. Probably I thought that tblgen is BURG-based
because there is a Burg subdirectory in the llvm/utils. And in older
versions of llvm it was even non-empty, giving the impression that it
is used.
BTW, does it use the same theory as BURG/IBURG when it generates the
selector? I.e. dynamic programming + precomputed costs,etc? Does it
implicitly build a tree grammar or something like that? Does it try to
merge/share patterns as much as possible to speedup the recognition
phase as it is done by BURMs? I have the impression (by looking at the
produced C++ selector code) that tblgen-based selectors are not as
"precomputed" and optimal as BURG-based ones. Do you have an idea or
figures about how it compares to BURG/IBURG selectors?
Thanks again,
Roman
__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
More information about the llvm-dev
mailing list