[PATCH] D32684: [X86] Adding new LLVM TableGen backend that generates the X86 backend memory folding tables.

Thu May 4 09:29:23 PDT 2017

filcab added a comment.

I'd prefer to have the tables, string tables, `IsMatch`, etc, outside of the `X86FoldTablesEmitter`.
Most (all?) of them don't really anything from there, and it would make the class smaller and easier to understand its workings (the tables and the small utility functions are all self-contained).

================
Comment at: test/CodeGen/X86/stack-folding-fp-avx1.ll:1654

-define double @stack_fold_sqrtsd(double %a0) {
-  ;CHECK-LABEL: stack_fold_sqrtsd
----------------
RKSimon wrote:
> aymanmus wrote:
> > RKSimon wrote:
> > > aymanmus wrote:
> > > > RKSimon wrote:
> > > > > Please don't delete tests, especially if they've regressed - show the new codegen and add a todo comment
> > > > The test checks that specific instructions are memory folded.
> > > > In the removed test cases, the instructions were removed from the memory folding tables so testing that these instructions are NOT folded in this test, is not the best idea.
> > > OK - so just leave TODO comments for the 4 cases:
> > > ```
> > > ; TODO stack_fold_sqrtsd
> > > ; TODO stack_fold_sqrtsd_int
> > > ; TODO stack_fold_sqrtss
> > > ; TODO stack_fold_sqrtss_int
> > > ```
> > > 
> > > 
> > > 
> > Sorry if i'm being too picky about this, but I cannot understand why to add TODOs for these test cases. what should be done with them later on? what would it indicate?
> The SSE scalar math memory folding has always been tricky - I had wondered if these cases could be handled by X86InstrInfo::foldMemoryOperandCustom? The _Int variants will probably have to be treated quite differently but it'd be great to get pass-through of the upper bits from the other operand being properly supported some day.
> 
> I notice that you didn't have to alter the SSE versions, nor the rsqrt/rcp equivalents, so in the medium term I think we can get this fixed, which is why I say we keep the TODOs there to help keep track. You can strip the actual test cases, just leave the 4 TODO comments.
> 
Adding `TODO` tests and comments allows anyone reading the source code/tests to notice that there's an optimization that should be done but isn't being done.
It's especially helpful in cases like this. From your comments (and other tests in this diff), it looks like we're not folding some `sqrts?` instructions after this patch.
We should keep these tests and add a `TODO` mentioning that folding this instructions is not easy to do with tables and they should be folded elsewhere.

================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:118
+  // For manually mapping instructions that do not match by their encoding.
+  const std::vector<ManualMapEntry> ManualMapSet = {
+      { "ADD16ri_DB",       "ADD16mi",         NO_UNFOLD  },
----------------
aymanmus wrote:
> craig.topper wrote:
> > aymanmus wrote:
> > > craig.topper wrote:
> > > > Why does this need to be vector? Why not a regular array or std::array?
> > > Defining an array as a class member requires stating it's size (ManualMapSet[18]), while std::vector doesn't. I thought this table might be dynamic and changed from time to time, so this way can be more friendly for future tuning.
> > > What do you think?
> > The number is required even if you put the array contents with it?
> Yes, ManualMapSet[] =...  would result in defining 0 sized array, and the compiler would give the following messages:
> - warning: ISO C++ forbids zero-size array `ManualMapSet`
> - error: too many initializers for `const {anonymous}::X86FoldTablesEmitter::ManualMapEntry [0]`
Why? Not specifying the first (outermost) dimension is acceptable.
Example: https://godbolt.org/g/HTnCIh
Did I miss something?

================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:48
+      IsAligned = false;
+      Alignment = 0;
+    }
----------------
Please specify the default initialization value in the declaration of the member variables.

================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:88
+  // manual added entries.
+  typedef enum {
+    UNFOLD,     // Allow unfolding
----------------
Lose the typedef, just `enum UnfoldStrategy { ... };`.

================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:105
+
+  class IsMatch;
+
----------------
This isn't needed if you pull the IsMatch class outside of this one.

================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:108
+  // List of instructions requiring explicitly aligned memory.
+  const StringRef ExplicitAlign[7] = {"MOVDQA",  "MOVAPS",  "MOVAPD",
+                                      "MOVNTPS", "MOVNTPD", "MOVNTDQ",
----------------
No need to specify the size of the array.

I know Craig mentioned `StringRef` (which is strictly better than `string`), but why not a simple `const char *`? For `ExplicitUnalign` too. These are only used as parameters to `StringRef::find()`.

================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:113
+  // List of instructions NOT requiring explicit memory alignment.
+  const StringRef ExplicitUnalign[3] = {"MOVDQU", "MOVUPS", "MOVUPD"};
+
----------------
Array size.

================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:160
+      // Memory folding is enabled only when optimizing for size by DAG
+      // patterns only. (issue detailed in D28744 review)
+      "VCVTSS2SDrm",            "VCVTSS2SDrr",
----------------
D28744 was abandoned, and this comment made me think that we were doing something special when optimizing for size, which doesn't seem to be the case, after reading the patch.
Is this a "this should be done" comment?

================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:204
+                    const CodeGenInstruction *MemInstr,
+                    const UnfoldStrategy Strgy = NO_STRATEGY);
+
----------------
Nit: I'd go with either just `S` or a full `Strategy`

================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:248
+  // X86MemoryFoldTableEntry.
+  inline void printTable(const FoldTable &Table, std::string TableName,
+                         raw_ostream &OS) {
----------------
Don't use `inline` when defining a function in a class definition.
http://llvm.org/docs/CodingStandards.html#don-t-use-inline-when-defining-a-function-in-a-class-definition

================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:268
+
+  // Return true if one of the instruction's operands is a RST register class
+  inline bool hasPtrTailcallRegClass(const CodeGenInstruction *Inst) {
----------------
Same comment as above?
(Remove `inline` from below, too)

================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:277
+};
+} // end anonymous namespace
+
----------------
Why end the anonymous namespace here if you end up making a bunch of the following functions static inline?

================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:281
+static inline uint64_t getValueFromBitsInit(const BitsInit *B) {
+  uint64_t Value = 0;
+  for (unsigned i = 0, e = B->getNumBits(); i != e; ++i) {
----------------
`assert(B->getNumBits() <= sizeof(uint64_t)*8);`

================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:297
+  for (unsigned i = 0, e = B1->getNumBits(); i != e; ++i) {
+    if (BitInit *Bit1 = dyn_cast<BitInit>(B1->getBit(i))) {
+      if (BitInit *Bit2 = dyn_cast<BitInit>(B2->getBit(i))) {
----------------
You should probably use `cast<BitInit>...`. It doesn't seem like we should expect `getBit` to return something that's not a `BitInit`.

================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:310
+// Return the size of the register operand
+static inline unsigned int getRegOperandSize(const Record *RegRec) {
+  if (RegRec->isSubClassOf("RegisterClass"))
----------------
You should probably comment that something ensures size == alignment.

================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:436
+
+  unsigned int FoldedInd = getIndexOfFoldedOperand(Table);
+  Record *RegOpRec = RegInstr->Operands[FoldedInd].Rec;
----------------
You already know the index before calling `addEntryWithFlags`. Maybe just pass the index along?

================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:634
+  bool areOppositeForms(const BitsInit *RegFormBits,
+                           const BitsInit *MemFormBits) {
+    uint64_t MemFormNum = getValueFromBitsInit(MemFormBits);
----------------
Please run `clang-format` on the file before committing.

================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:714
+    std::vector<const CodeGenInstruction *> &OpcRegInsts =
+        RegInsts.find(Opc)->second;
+
----------------
Maybe this?
```auto FoundRegInst = RegInsts.find(Opc);
if (FoundRegInst == RegInsts.end())
  continue;
std::vector<const CodeGenInstruction *> &OpcRegInsts = FoundRegInst->second;```

================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:728
+            getAltRegInst(RegInst, Records, Target);
+        updateTables(AltRegInst, MemInst);
+      }
----------------
Should we also do a `OpcRegInsts.erase(Match);` here?

================
Comment at: utils/TableGen/X86FoldTablesEmitter.cpp:735
+  for (const ManualMapEntry &Entry : ManualMapSet) {
+
+    Record *RegInstIter = Records.getDef(Entry.RegInstStr);
----------------
Nit: No need for this empty line

https://reviews.llvm.org/D32684