[PATCH] D45541: [globalisel] Update GlobalISel emitter to match new representation of extending loads

Sat Apr 28 17:37:33 PDT 2018

dsanders marked 11 inline comments as done.
dsanders added inline comments.

================
Comment at: include/llvm/CodeGen/GlobalISel/InstructionSelector.h:132
+  GIM_CheckMemorySizeSmallerThanLLT,
+  GIM_CheckMemorySizeLargerThanLLT,

----------------
rtereshin wrote:
> It appears to me that quite universally orderings are called "less than" and "greater than" with well known LT and GT abbreviations, maybe it makes sense to stick to the widely used naming scheme?
> 
> Also, looks like "Smaller Than" actually performs `<=` rather than `<`, same true for "Larger Than" (`>=` instead of `>`), so that makes the names even more confusing.
> It appears to me that quite universally orderings are called "less than" and "greater than" with well
> known LT and GT abbreviations, maybe it makes sense to stick to the widely used naming scheme?

I avoided LT/GT because GIM_CheckMemorySizeLTLLT/GIM_CheckMemorySizeGTLLT would be rather confusing :-). GIM_CheckMemorySizeLessThanLLT/GIM_CheckMemorySizeGreaterThanLLT makes sense though.

> Also, looks like "Smaller Than" actually performs <= rather than <, same true for "Larger Than" (>=
> instead of >), so that makes the names even more confusing.

They're inverted in the implementation because the code is checking for when to reject it rather than when to accept it.

================
Comment at: include/llvm/CodeGen/GlobalISel/InstructionSelectorImpl.h:291
+      int64_t MMOIdx = MatchTable[CurrentIdx++];
+      uint64_t Size = MatchTable[CurrentIdx++];
+
----------------
rtereshin wrote:
> Do we anticipate `MMOIdx` being not `0` here in practice soon? It could be an early generalization, not sure if it needs to be.
MMO's are stored in a vector in MachineInstr. As far as I know we don't have any cases where there's more than one but I'd prefer to match the representation so we don't have to hunt down all the cases where we made assumptions later.

================
Comment at: include/llvm/CodeGen/GlobalISel/InstructionSelectorImpl.h:320
+      int64_t MMOIdx = MatchTable[CurrentIdx++];
+      int64_t OpIdx = MatchTable[CurrentIdx++];
+
----------------
rtereshin wrote:
> ditto + the same question about `OpIdx` and the entire `GIM_CheckMemorySizeLargerThanLLT` opcode.
> the same question about OpIdx

We have no guarantees on the value of OpIdx. For G_LOAD/G_STORE and the generic atomics it will be 0 just because that's the relevant operand but targets can define pseudos that do something different.

> and the entire GIM_CheckMemorySizeLargerThanLLT opcode.

This one is mostly included for completeness at the moment. However, some targets have stores that do type conversion or vector packing/unpacking as part of the load/store and as a result write more memory than they had bits in the register.

================
Comment at: test/CodeGen/AArch64/GlobalISel/select-sextload.mir:2
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=aarch64-- -run-pass=instruction-select -verify-machineinstrs -global-isel %s -o - | FileCheck %s
+
----------------
rtereshin wrote:
> `-global-isel` is redundant
I'm surprised to find that's the case since addGlobalInstructionSelect() is conditional on having either -global-isel or TargetOptions::EnableGlobalISel. Looking at it again, I see that -run-pass builds a custom pipeline and skips that code entirely

We have -global-isel options everywhere in these tests. Let's clean that up in a separate patch.

================
Comment at: utils/TableGen/GlobalISelEmitter.cpp:180
+// Track all types that are used so we can emit the corresponding enum.
+std::set<LLTCodeGen> KnownTypes;
+
----------------
rtereshin wrote:
> Maybe keeping this a member of `GlobalISelEmitter` is a little better, or rather more consistent with the rest of of implementation.
I agree it's more consistent but moving it there is going to cause a lot of source churn for little practical benefit. LLTOperandMatcher would need to be able to see GlobalISelEmitter's declaration causing a dependency loop that would require us to separate the declarations and implementations for many of the classes in this file. I don't think it's worth doing that to change a global variable into a static class member.

================
Comment at: utils/TableGen/GlobalISelEmitter.cpp:1517
+  MemorySizePredicateMatcher(unsigned InsnVarID, unsigned MMOIdx, unsigned Size)
+      : InstructionPredicateMatcher(IPM_MemoryLLTSize, InsnVarID),
+        MMOIdx(MMOIdx), Size(Size) {}
----------------
rtereshin wrote:
> This class (`MemorySizePredicateMatcher`) having the exact same kind (`IPM_MemoryLLTSize`) as `MemoryVsLLTSizePredicateMatcher` doesn't seem right.
Well spotted. I thought I'd renamed it after duplicating the class.

================
Comment at: utils/TableGen/GlobalISelEmitter.cpp:1540
+public:
+  enum RelationshipKind {
+    LargerThan,
----------------
rtereshin wrote:
> I believe the math term for this sort of things is "relation", not "relationship" (it's also shorter). I've also seen name "ordering" used for such enums, like in Haskell. "Relationship" feels a little odd, or maybe just unnecessary long.
Both words have the same definition. I've switched it to Relation on the basis that it's shorter

================
Comment at: utils/TableGen/GlobalISelEmitter.cpp:1574
+                                          ? "GIM_CheckMemorySizeLargerThanLLT"
+                                          : "GIM_CheckMemorySizeSmallerThanLLT")
+          << MatchTable::Comment("MI") << MatchTable::IntValue(InsnVarID)
----------------
rtereshin wrote:
> The same is done to produce the debugging output in `executeMatchTable` and in a slightly different style. Maybe it makes sense to extract this as a separate `Relationship -> StringRef` function.
Sharing code between TableGen and CodeGen is tricky. The libraries they have in common are MC and Support.

These enums are private to CodeGen so it seems wrong to make them part of the public interface for either MC and Support. That said, we've historically put a couple CodeGen things in Support (e.g. LLT and MVT) at least partially to enable sharing with TableGen. In both cases, there would have been substantial code duplication without it.

I'm leaning towards accepting the duplication until the case for sharing gets stronger. Do you agree?

================
Comment at: utils/TableGen/GlobalISelEmitter.cpp:3506
+                                  llvm::to_string(*P.getDstPattern()) << "\n";
+  errs() << "Rule ID " + llvm::to_string(M.getRuleID()) << "\n";

----------------
rtereshin wrote:
> Maybe wrap this in a `DEBUG` (or a friend) or eliminate entirely. Also, do we really need `llvm::to_string` here, just streaming it with `<<` doesn't work? Especially for the RuleID, it's just a number IIRC.
That's some temporary debugging that I've forgotten to remove. They're the same as the DebugCommentActions above (where the to_string() calls are needed)

Repository:
  rL LLVM

https://reviews.llvm.org/D45541