[PATCH] D32684: [X86] Adding new LLVM TableGen backend that generates the X86 backend memory folding tables.
Ayman Musa via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon May 8 05:18:04 PDT 2017
aymanmus added inline comments.
================
Comment at: test/CodeGen/X86/stack-folding-fp-avx1.ll:1654
-define double @stack_fold_sqrtsd(double %a0) {
- ;CHECK-LABEL: stack_fold_sqrtsd
----------------
filcab wrote:
> RKSimon wrote:
> > aymanmus wrote:
> > > RKSimon wrote:
> > > > aymanmus wrote:
> > > > > RKSimon wrote:
> > > > > > Please don't delete tests, especially if they've regressed - show the new codegen and add a todo comment
> > > > > The test checks that specific instructions are memory folded.
> > > > > In the removed test cases, the instructions were removed from the memory folding tables so testing that these instructions are NOT folded in this test, is not the best idea.
> > > > OK - so just leave TODO comments for the 4 cases:
> > > > ```
> > > > ; TODO stack_fold_sqrtsd
> > > > ; TODO stack_fold_sqrtsd_int
> > > > ; TODO stack_fold_sqrtss
> > > > ; TODO stack_fold_sqrtss_int
> > > > ```
> > > >
> > > >
> > > >
> > > Sorry if i'm being too picky about this, but I cannot understand why to add TODOs for these test cases. what should be done with them later on? what would it indicate?
> > The SSE scalar math memory folding has always been tricky - I had wondered if these cases could be handled by X86InstrInfo::foldMemoryOperandCustom? The _Int variants will probably have to be treated quite differently but it'd be great to get pass-through of the upper bits from the other operand being properly supported some day.
> >
> > I notice that you didn't have to alter the SSE versions, nor the rsqrt/rcp equivalents, so in the medium term I think we can get this fixed, which is why I say we keep the TODOs there to help keep track. You can strip the actual test cases, just leave the 4 TODO comments.
> >
> Adding `TODO` tests and comments allows anyone reading the source code/tests to notice that there's an optimization that should be done but isn't being done.
> It's especially helpful in cases like this. From your comments (and other tests in this diff), it looks like we're not folding some `sqrts?` instructions after this patch.
> We should keep these tests and add a `TODO` mentioning that folding this instructions is not easy to do with tables and they should be folded elsewhere.
Fair enough. Thanks for the detailed answer.
================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:118
+ // For manually mapping instructions that do not match by their encoding.
+ const std::vector<ManualMapEntry> ManualMapSet = {
+ { "ADD16ri_DB", "ADD16mi", NO_UNFOLD },
----------------
filcab wrote:
> aymanmus wrote:
> > craig.topper wrote:
> > > aymanmus wrote:
> > > > craig.topper wrote:
> > > > > Why does this need to be vector? Why not a regular array or std::array?
> > > > Defining an array as a class member requires stating it's size (ManualMapSet[18]), while std::vector doesn't. I thought this table might be dynamic and changed from time to time, so this way can be more friendly for future tuning.
> > > > What do you think?
> > > The number is required even if you put the array contents with it?
> > Yes, ManualMapSet[] =... would result in defining 0 sized array, and the compiler would give the following messages:
> > - warning: ISO C++ forbids zero-size array `ManualMapSet`
> > - error: too many initializers for `const {anonymous}::X86FoldTablesEmitter::ManualMapEntry [0]`
> Why? Not specifying the first (outermost) dimension is acceptable.
> Example: https://godbolt.org/g/HTnCIh
> Did I miss something?
In your example you defined the array to be global, in this case it works.
It doesn't work when the array is defined as a class member.
Thanks to your previous comment I moved this table outside the class and updated it accordingly.
================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:160
+ // Memory folding is enabled only when optimizing for size by DAG
+ // patterns only. (issue detailed in D28744 review)
+ "VCVTSS2SDrm", "VCVTSS2SDrr",
----------------
filcab wrote:
> D28744 was abandoned, and this comment made me think that we were doing something special when optimizing for size, which doesn't seem to be the case, after reading the patch.
> Is this a "this should be done" comment?
Yes, some of the listed instructions have DAG patterns for memory folding which are marked with OptForSize predicate. A more "sophisticated" dealing should be added to VEX and EVEX instructions (which is detailed in the abandoned review comments).
================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:310
+// Return the size of the register operand
+static inline unsigned int getRegOperandSize(const Record *RegRec) {
+ if (RegRec->isSubClassOf("RegisterClass"))
----------------
filcab wrote:
> You should probably comment that something ensures size == alignment.
Thanks for noticing, I planned to change that and somehow forgot. Updated with better approach.
================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:714
+ std::vector<const CodeGenInstruction *> &OpcRegInsts =
+ RegInsts.find(Opc)->second;
+
----------------
filcab wrote:
> Maybe this?
> ```auto FoundRegInst = RegInsts.find(Opc);
> if (FoundRegInst == RegInsts.end())
> continue;
> std::vector<const CodeGenInstruction *> &OpcRegInsts = FoundRegInst->second;```
The previous check:
```
if (RegInsts.count(Opc) == 0)
continue;
```
makes sure RegInsts has an element with the required key.
https://reviews.llvm.org/D32684
More information about the llvm-commits
mailing list