[PATCH] D50701: [MI] Change the array of `MachineMemOperand` pointers to be a generically extensible collection of extra info attached to a `MachineInstr`.

Chandler Carruth via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 14 19:46:03 PDT 2018


chandlerc marked 2 inline comments as done.
chandlerc added inline comments.


================
Comment at: llvm/include/llvm/ADT/PointerSumType.h:76-77
+  // address of the zero-tag pointer instead of just the zero-tag pointer
+  // itself. This is especially useful when building `ArrayRef`s out of a single
+  // pointer.
+  union {
----------------
rnk wrote:
> This is... "clever". =P Isn't this going to upset ubsan's active union member sanitizer? Do we need to worry about that?
Yeah, I meant to revisit this. I've done so with Richard's help and think we have a fully correct version now.


================
Comment at: llvm/include/llvm/CodeGen/MachineInstr.h:25
 #include "llvm/Analysis/AliasAnalysis.h"
+#include "llvm/CodeGen/MachineMemOperand.h"
 #include "llvm/CodeGen/MachineOperand.h"
----------------
rnk wrote:
> Might not work, but can we forward declare this?
`PointerSumType` needs the complete type. See below for details.


================
Comment at: llvm/include/llvm/CodeGen/MachineInstr.h:31
 #include "llvm/MC/MCInstrDesc.h"
+#include "llvm/MC/MCSymbol.h"
 #include "llvm/Support/ArrayRecycler.h"
----------------
rnk wrote:
> We should be able to forward declare MCSymbol.
No, and this is also why I can't move `ExtraInfo`'s definition as I had hoped.

Fundamentally, `PointerSumType` wants to check that the type parameters to it can be unioned into a sum type. To check that, it needs to know whether all of the tags will fit into the empty bits of the pointers. This requires the alignment of the pointer. The alignment requires the type to be complete.

We could delay this checking with some amount of work so that it is only triggered when the `PointerSumType` is *used*, but after trying several times to do that, I don't think I like the result at all.

First and foremost, the type *should* tell you, as early and as reliably as it can, that the parameters are valid. IMO, this should point the programmer at the *parameters*, not at some unrelated code. The declaration of the member is the best place to do this.

Second, the obvious way to defer this checking would cause unused parameters to *never be checked*. That seems ... really risky and bug prone to me.

Third, it seems reasonable for types with similar contracts (but perhaps slightly different implementation goals) to actually arrange their *storage* based on the alignment and available bits. It seems somewhat awkward to take advantage of the fact that `PointerSumType` *happens* to not change its layout / storage strategy based on the alignment just to defer checking of the validity of the template parameters given to it.


I definitely don't think avoiding an include is worth the kind of usability hit this would incur.

I would be somewhat interested in avoiding this type cluttering up the public interface, but it at least needs to come before the *private* data member in which it is used. The correct way to clean this up, if folks want, is to actually the class contents holistically in a way more optimized for the reader of the API:
1. Private constants and type aliases used to make naming public types substantially simpler. (Optional and expected to be rare.)
2. Public constants and types, forward declared where the definition is large and unnecessary for public data members or member function declarations.
3. Public data members.
4. Public member functions, declaration only where the definition is large.
5. Private constants and types, and any public constant and type definitions declared (but not defined) in #1 and necessary for private data members or member function declarations.
6. Private data members.
7. Private member functions, declaration only where the definition is large.
8. (out-of-line from class body) Definitions of member functions declared above which are worth making inlinable and the definitions of constants and types only declared above necessary for those member function definitions. All of these ordered relative to each other based on the order in which they are declared within the class body.
9. (in the .cpp file) Remaining definitions, ordered relative to each other based on the order in which they are declared within the class body.

I much prefer this fairly strict ordering with very narrow and rare exceptions. I think it would also address some of your readability concerns here w/o forcing us to make declaration and definition able to be re-ordered.


That said, this should be a follow-up change if we want to do it. I have tried to clean up the  comments and code a bit to make this slightly less of a readibility hit.


================
Comment at: llvm/include/llvm/CodeGen/MachineInstr.h:1485-1486
+  ///
+  /// This is only valid to call once -- an existing symbol can't safely be
+  /// discarded as anyone could already rely on it.
+  ///
----------------
rnk wrote:
> Perhaps the API should be something like, `getOrCreatePreInstrLabel(const Twine &LabelName = "tmp")`
> 
> It's not like we need to insert two labels in front of the same instruction. In fact, it's typically better to avoid that.
I like this API a lot better. I'll mak ethat change and update with it shortly....


Repository:
  rL LLVM

https://reviews.llvm.org/D50701





More information about the llvm-commits mailing list