[llvm-dev] Extending TableGen's 'foreach' to work with 'multiclass' and 'defm'

Wed Aug 23 10:44:24 PDT 2017

On 23 Aug 2017, at 18:21, Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> 
> On 08/23/2017 12:06 PM, Krzysztof Parzyszek via llvm-dev wrote:
>> On 8/23/2017 11:58 AM, Hal Finkel via llvm-dev wrote:
>>> If we want to go down that route, I can certainly imagine a feasible incremental-transitioning strategy. We could allow TableGen to use an embedded Python interpreter to generate records based on Python data structures, and then, combine records from the existing .td files with those generated by the Python code. We'd use the existing TableGen plugins (which we may need to continue to use regardless, compared to writing Python, for performance reasons), and so we could incrementally transition existing definitions from .td files to Python as appropriate.
>> 
>> Would we then eliminate TableGen completely in the long term?
> 
> That could also be two separate questions: Would we replace the .td input language with Python completely in the long term? Would we rewrite the backends (i.e., TableGen plugins) in Python? I don't yet have an opinion on either. I can see advantages to providing Python as an input language. What do you think?

Replacing TableGen with general purpose language X runs into the issue of bikeshedding what X should be.  I’d be very much opposed to Python because:

 - It’s a large external dependency for the build (there’s no chance of FreeBSD shipping Python in the base system, for example, so we’d have to import the Python-generated files on each import, which would be annoying)

 - The language has had one backwards-incompatible break that it’s taken over a decade to recover from, and I have little confidence that it will remain compatible going forward

 - It seems to encourage terrible code (I have yet to be presented with a piece of allegedly working Python software that I have not had to fix at least one bug in - git-imerge was almost an exception, but sadly not quite).

 - It intentionally doesn’t support tail recursion optimisation and imposes arbitrary stack depth limits, which forces some convoluted coding styles.
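
To make the last point concrete, here is a minimal sketch, assuming CPython with its default recursion limit. The names count_down and count_down_loop are made up for illustration. The function is tail-recursive in form, but each call still gets its own stack frame, so it fails once the depth passes the limit and has to be rewritten as a loop:

```python
import sys


def count_down(n, acc=0):
    # Tail-recursive in form: the recursive call is the last thing done.
    # CPython does not eliminate the call, so every step adds a frame.
    if n == 0:
        return acc
    return count_down(n - 1, acc + 1)


def count_down_loop(n):
    # The same computation rewritten as an explicit loop, which is the
    # style the recursion limit pushes you towards.
    acc = 0
    while n:
        n, acc = n - 1, acc + 1
    return acc


# Pick a depth just past whatever limit the interpreter is using.
depth = sys.getrecursionlimit() + 100

try:
    count_down(depth)
    recursion_failed = False
except RecursionError:
    recursion_failed = True

loop_result = count_down_loop(depth)
print(recursion_failed, loop_result == depth)
```

Raising the limit with sys.setrecursionlimit only moves the failure point, it does not remove it.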

More generally, I’m not sure about the underlying goal.  We already have one solid general-purpose language in LLVM: C++.  Many of the things for which we currently use more complex TableGen programming practices would be good uses for the proposed metaclass / reflection APIs in C++21, which seems a more palatable end goal than a scripting language.

There are also external tools that both produce and consume the TableGen sources.  Having a language that is *not* a general-purpose programming language is a feature for these tools, not an obstacle.

David