[llvm-dev] [X86] Memory folding tables in x86 backend

Tue Jul 25 14:02:17 PDT 2017

> On 24 Jul 2017, at 17:35, Chris Lattner <clattner at nondot.org> wrote:
> 
> 
>> On Jul 23, 2017, at 7:09 AM, Simon Pilgrim via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> 
>> 
>>> On 23 Jul 2017, at 12:19, Joerg Sonnenberger via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>> 
>>> On Sun, Jul 23, 2017 at 08:48:00AM +0000, Musa, Ayman via llvm-dev wrote:
>>>> 3-     Give up on the auto-generation idea and manually update the current tables iteratively with new chunks of instructions until full state is achieved.
>>>> 
>>>> P.s. The TableGen backend emits more than 5200 entries, while the known exception at this point are ~20.
>>> 
>>> Why can't you do a variation of this by tagging the instructions with a
>>> flag? I.e. for the rm variant, add a flag that says "this has an
>>> equivalent register operand version". Given that a lot of the instruction
>>> patterns are created via multi-classes, I would expect that to require a
>>> lot less than 5200 updates.
>> 
>> +1 An opt-in ‘GenerateMemoryFolding’ style flag (defaulting to false) seems safer to me than than trying to ensure you’ve correctly flagged all illegal instruction fold.
> 
> The counterargument to this is that it adds a lot of cruft to the td files.  It should be a lot more rare for an instruction to be “unfoldable” than to be “foldable”.
> 
> Validation should be straight-foward for this: generate the table automatically using the new tool and see what the differences are.  If the differences are due to incorrect additions, just add the flag to them.

I’m being pessimistic but the previous attempt at this failed because the testing missed so many cases, mainly due to the scale of the multiple problems that the patch was trying to fix in one go. Adding initial tablegen support and then enabling a few related instructions at a time gives us a decent chance of easily bisecting + fixing individual problems. Chandler went into some detail on this in D32684 and rL304762.

Once the majority of instructions are enabled, identifying the remaining problem cases should be more straightforward and inverting the flag to an opt-out ‘DoNotGenerateMemoryFolding’ a much easier task, removing much of the temporary cruft - although if the process stalls part way through we could be stuck with it longer than liked.

I’d add that many of these folding table entries are only tested by the stack-folding tests, and despite our efforts they don’t have close to full coverage (especially the endless AVX512 instructions and variations….)

Sorry this comment is coming over as so very glass half empty!