[LLVMdev] Enhancing TableGen
Che-Liang Chiou
clchiou at gmail.com
Thu Oct 6 22:00:29 PDT 2011
My purpose is to eliminate the copy-paste style of programming in td
files as much as possible, but only to the point where the new language
constructs do not add too much overhead or hurt readability.
In other words, I am targeting the low-hanging fruit: copy-paste
programming in td files that can be eliminated by a simple for-loop
syntax. The repetitive patterns I observed in the PTX backend (and
probably some other backends) are:
* Members of a multiclass, such as those in PTXInstrInfo.td.
* Consecutive similar defs, such as register declarations.
[Why for-loop?]
* All I need is a simple iteration construct. I believe there is no
simpler construct (in terms of readability) than a for-loop, but I am
happy to be convinced otherwise.
* It is sufficient to eliminate the most common copy-paste programming
that I observed.
* It is simple enough to understand and maintain, at least I believe so.
[Why preprocessor?]
I admit that a preprocessor is probably not the best solution, and we
can implement a for-loop without one. The only reason I chose a
preprocessor is that (I believe) it requires the fewest changes to
TGParser.cpp.
TGParser.cpp in its current form parses and emits the results in one
pass. That means it would emit the for-loop body even before we are
done parsing the entire for-loop.
So I believe a non-preprocessor approach would require two passes. The
first pass parses the input and generates a simple syntax tree, and
the second pass evaluates the syntax tree and emits output records (in
fact, this is how I implemented the current preprocessor). I believe
that changing TGParser.cpp to accommodate two passes is quite a lot of
work, and so I chose a preprocessor.
But if you think we should really rewrite TGParser.cpp to parse and
evaluate for-loops correctly, I would be glad to do away with the
preprocessor.
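
To make the two-pass idea concrete, here is a minimal sketch in Python.
It is not the actual preprocessor code and not TGParser.cpp. The
directive syntax follows the #for/#end example later in this message;
the function names parse and evaluate, the node layout, and the
assumption that sequence(a, b) includes both endpoints are made up for
illustration only.

```python
import re

# Hypothetical directive syntax, modeled on "#for i = sequence(0, 127)".
FOR_RE = re.compile(r"#for\s+(\w+)\s*=\s*sequence\((\d+),\s*(\d+)\)\s*$")


def parse(lines):
    """Pass 1: build a tree of ("text", line) and ("for", var, lo, hi, body)."""
    root = []
    stack = [root]  # stack[-1] is the body of the innermost open #for
    for line in lines:
        stripped = line.strip()
        match = FOR_RE.match(stripped)
        if match:
            body = []
            node = ("for", match.group(1), int(match.group(2)),
                    int(match.group(3)), body)
            stack[-1].append(node)
            stack.append(body)
        elif stripped == "#end":
            if len(stack) == 1:
                raise SyntaxError("#end without matching #for")
            stack.pop()
        else:
            stack[-1].append(("text", line))
    if len(stack) != 1:
        raise SyntaxError("#for without matching #end")
    return root


def evaluate(nodes, env=None):
    """Pass 2: walk the tree and emit lines, replacing #var# by its value."""
    env = env or {}
    out = []
    for node in nodes:
        if node[0] == "text":
            line = node[1]
            for name, value in env.items():
                line = line.replace("#" + name + "#", str(value))
            out.append(line)
        else:
            _, var, lo, hi, body = node
            # Assumption: sequence(lo, hi) includes both endpoints.
            for value in range(lo, hi + 1):
                out.extend(evaluate(body, {**env, var: value}))
    return out


source = [
    "#for i = sequence(0, 2)",
    "def R#i# : Reg;",
    "#end",
]
for line in evaluate(parse(source)):
    print(line)
```

Nothing is emitted until the whole loop has been parsed, which is the
property the current one-pass TGParser.cpp does not have.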
[Why NOT while-loop?]
* A while-loop requires evaluating a loop-condition expression; this
is complexity that I would like to avoid.
[Why NO if-else?]
* It requires at least evaluating a Boolean expression, too.
* If a piece of td code is complicated enough that we need an if-else
to eliminate its duplication, I think the duplication is justified.
[Why NO abstractions (like `define foo(a, b, c)`)?]
* Abstractions are probably worthwhile, but I am not sure yet. I think
we could wait until it is clear that we really need them.
Hi Dave and Jakob,
Thanks for the comments. I have tried my best to respond to all of
them. If I missed any, as there are really a lot, please let me know.
[string vs token]
The preprocessor (in its current form) works on tokens by default, and
it only converts a series of tokens and white space into a string if
explicitly requested (by a special escape #"#, see example below).
----------------------------------------
#for i = sequence(0, 127)
def P#i# : PTXReg<#"#p#i##"#>;
#end
----------------------------------------
* Anything between a pair of #"# is quoted, including white space and
non-tokens. E.g., #"#hello world#"# --> "hello world"
* A macro variable needs a '#' character at both the front and the
back. This looks like the multiclass's #NAME# substitution, and so I
think it is more consistent than prepending a single '#' at the front.
* So my current idea is very similar to Dave's, except that I replace
strings with tokens (i.e., both iterators and the results of the paste
"operator" are tokens).
What do you think? Which one is more readable to you? !case<> or #"# or ... ?
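
As a rough illustration of the rules above, here is a small Python
sketch of how one loop-body line could be expanded. It is not the
actual preprocessor; the names substitute and expand_line are
hypothetical, and resolving the #"# escape before the variables is an
assumption made here, so that the '#' characters of the escape are not
mistaken for variable markers.

```python
import re

# Text between a pair of #"# escapes becomes one string literal.
QUOTE_RE = re.compile(r'#"#(.*?)#"#')


def substitute(text, env):
    """Replace each #name# in text by the value bound to name in env."""
    for name, value in env.items():
        text = text.replace("#" + name + "#", str(value))
    return text


def expand_line(line, env):
    """Expand the quoted spans first, then the remaining token pastes."""
    def quote(match):
        return '"' + substitute(match.group(1), env) + '"'

    line = QUOTE_RE.sub(quote, line)
    return substitute(line, env)


print(expand_line('def P#i# : PTXReg<#"#p#i##"#>;', {"i": 5}))
print(expand_line('#"#hello world#"#', {}))
```

With i bound to 5, the first call yields def P5 : PTXReg<"p5">; so P#i#
stays a token (an identifier) while the quoted span becomes a string.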
[Can the for-loop proposal be a preprocessing phase?]
I guess the example Dave gave (see below) cannot be handled in an (even
extended) preprocessor. I am not keen on implementing the for-loop in a
preprocessor. I chose a preprocessor because I think it would have the
least impact on the codebase and, to be honest, my design did not
address the pattern that Dave gave in his example. I was trying to
avoid variable-length lists because I think they are too complicated
for users. But I could be wrong.
----------------------------------------
multiclass blah<list<int> Values> {
for v = Values {
def DEF#v : base_class<v>;
}
}
----------------------------------------
Not using a preprocessor seems to have another syntactic benefit: we
could remove the extra '#' characters. To be honest, those '#' are not
very nice looking, and Dave's example looks cleaner than my excess-'#'
style.
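
One way to see why Dave's example is out of reach for a purely textual
preprocessor: Values is a template argument of the multiclass, so its
contents are only known at each instantiation, after preprocessing has
already run. Below is a rough Python model. The function
instantiate_blah is a hypothetical stand-in for instantiating the
multiclass, not anything in TableGen, and it ignores how a defm name is
combined with the def names.

```python
def instantiate_blah(values):
    """Hypothetical model of instantiating multiclass blah<list<int> Values>.

    The loop can only run here, once the caller has supplied the list.
    """
    return ["def DEF{0} : base_class<{0}>;".format(v) for v in values]


# Two instantiations with lists of different lengths.
print(instantiate_blah([1, 2]))
print(instantiate_blah([4, 8, 16]))
```

The loop body is the same in both calls, but the number of records
produced depends on an argument that does not appear in the text of the
multiclass itself.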
Hi Dave,
I am not sure what you want to play around with, but you are not
disrupting anything so far.
Regards,
Che-Liang
On Fri, Oct 7, 2011 at 5:37 AM, Jakob Stoklund Olesen <jolesen at apple.com> wrote:
>
> On Oct 6, 2011, at 12:42 PM, David A. Greene wrote:
>
>> Jakob Stoklund Olesen <jolesen at apple.com> writes:
>>
>>> On Oct 6, 2011, at 7:59 AM, David A. Greene wrote:
>>>
>>>> For example, I want to be able to do this:
>>>>
>>>> defm MOVH :
>>>> vs1x_fps_binary_vv_node_rmonly<
>>>> 0x16, "movh", undef, 0,
>>>> // rr
>>>> [(undef)],
>>>> // rm
>>>> [(set DSTREGCLASS:$dst,
>>>> (DSTTYPE (movlhps SRCREGCLASS:$src1,
>>>> (DSTTYPE (bitconvert
>>>> (v2f64 (scalar_to_vector
>>>> (loadf64 addr:$src2))))))))],
>>>> // rr Pat
>>>> [],
>>>> // rm Pat
>>>> [[(DSTTYPE (movlhps SRCREGCLASS:$src1, (load addr:$src2))),
>>>> (MNEMONIC SRCREGCLASS:$src1, addr:$src2)],
>>>> [(INTDSTTYPE (movlhps SRCREGCLASS:$src1, (load addr:$src2))),
>>>> (MNEMONIC SRCREGCLASS:$src1, addr:$src2)]]>;
>>>
>>> This kind of thing is very hard to read and understand.
>>
>> What's hard about it? I'm not trying to be agitational here. I'm truly
>> wondering what I can do to make this more understandable.
>
> If you didn't write these patterns yourself, or if you wrote them six months ago, it is nearly impossible to figure out where a specific pattern came from, or where a specific instruction is defined.
>
> It is hard enough mentally executing the current multiclasses. Injecting patterns into multi defms like this makes it much harder still.
>
>> I am certainly happy to make things more readable and welcome
>> lots of feedback in that area. But the ability to quickly and easily
>> extend the ISA for new vector lengths is critical to us.
>
> This is where our priorities differ.
>
> Readability and maintainability are key.
>
> After all, we need to fix isel and codegen bugs more often than Intel and AMD add ISA extensions.
>
> /jakob