[PATCH] D72425: [OptRemark] RFC: Introduce a message table for OptRemarks

Tue Jun 9 13:14:27 PDT 2020

andrew.w.kaylor marked 3 inline comments as done.
andrew.w.kaylor added a comment.

Sorry for having let this drop for so long. Some other priorities came up, but I am still interested in seeing this through.

================
Comment at: llvm/include/llvm/Remarks/OptRemarkDiagBase.td:23
+
+def Kind_General : OptRemarkKind<"General">;
+def Kind_Missed : OptRemarkKind<"Missed">;
----------------
thegameg wrote:
> I think a few places call this "Passed". Would that be better than "General"?
I don't like "Passed" but "General" isn't very helpful either. I think this will apply to "optimizations that were performed" as opposed to missed opportunities and "analysis" (which seems to mean "information"), but I'm not sure there are no cases where the base OptimizationRemark class is used to mean something else.

The clang documentation describes the groups this way:

  # When the pass makes a transformation (-Rpass).
  # When the pass fails to make a transformation (-Rpass-missed).
  # When the pass determines whether or not to make a transformation (-Rpass-analysis).

That last description seems bad.

I don't really have strong feelings about what we call it. I guess what's important is to give it a good enough name that someone won't accidentally use if for a remark that doesn't align with the intended use. "General" doesn't do that. I guess "Passed" does.

================
Comment at: llvm/include/llvm/Remarks/OptRemarkDiagBase.td:33
+  string VerboseFormatStr = FormatVerbose;
+  OptRemarkGroup Group = ?;
+  OptRemarkKind  Kind = Kind_General;
----------------
thegameg wrote:
> In this scheme, it will be hard for a remark to be part of multiple groups, right?
> 
> I was imagining something where we can do things like:
> ```
> def gvn_load_elim : OptRemark<"LoadElim", "...">;
> def gvn_load_elim_missed : OptRemark<"LoadElimMissed", "...">, Kind<Kind_Missed>;
> def licm_hoisted : OptRemark<"InstHoisted", "...">;
> def loop_vectorized : OptRemark<"LoopVectorized", "...">;
> 
> def GroupGVN : OptRemarkGroup<"GVN", [ gvn_load_elim, gvn_load_elim_missed ]>;
> def GroupLICM : OptRemarkGroup<"LICM", [ licm_hoisted ]>;
> def GroupLV : OptRemarkGroup<"LV", [ loop_vectorized ]>;
> def GroupLoopStuff : OptRemarkGroup<"LoopStuff", [ licm_hoisted, loop_vectorized ]>;
> ```
> 
> or even groups of groups to avoid listing everything:
> 
> ```
> def GroupLoopStuff : OptRemarkGroup<"LoopStuff", [ GroupLICM, GroupLoopStuff ]>;
> ```
> 
> where we would have the liberty of putting remarks in different groups:
> 
> * for compiler devs: group remarks by pass
> * for users: group remarks by a higher level concept: loop optimizations, actionability, sanitizer-added code, etc.
> 
> I don't know how hard it is to get something like this, but let me know what you think.
Multiple groups seems useful. I'll see what I can do with that.

================
Comment at: llvm/test/TableGen/opt-remark-diag.td:20
+
+// CHECK-ENUMS: enum OptRemarkDiagGroup_Test1 {
+// CHECK-ENUMS:   RemarkName = 10000,
----------------
a.elovikov wrote:
> thegameg wrote:
> > What would be the use case of these enums? Can't the same be achieved by not quoting `RemarkName` and `Test1` in the `OPT_REMARK(` macros like in `clang/include/clang/Driver/Options.h`:
> > 
> > ```
> > enum ID {
> >     OPT_INVALID = 0, // This is not an option ID.
> > #define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
> >                HELPTEXT, METAVAR, VALUES)                                      \
> >   OPT_##ID,
> > #include "clang/Driver/Options.inc"
> >     LastOption
> > #undef OPTION
> >   };
> > ```
> I'm more interested in what other use-cases for OPT_REMARKs are. I believe C++ macros and tablegen serve the same purpose so I'm not sure if it's beneficial to use both at the same time. On the other hand, I want to extend the .td description of remarks to simple statistics so that each remark will be something like "number of <smth>: X" and auto-generate most of the code for handling it. I.e., given
> 
> 
> ```
>   let Group = StatisticsFoo in {
>     def remark_statistic_one : OptRemark<"StatisticOne", "Number of EventOne: %{Arg}>;
>     def remark_statistic_two : OptRemark<"StatisticTwo", "Number of EventTwo: %{Arg}>;
>   }
> 
> ```
> I'd like to be able to generate
> 
> 
> ```
>   struct StatisticsFooStorage {
>     NumStatisticOne = 0;
>     NumStatisticTwo = 0;
> 
>     void emit(RemarkEmitter Emitter) {
>       Emitter.emit(StatisticOneRemarkString.format(NumStatisticOne);
>       Emitter.emit(StatisticTwoRemarkString.format(NumStatisticTwo);
>     }
>   }
> ```
> 
> And actual optizmiation
> 
> ```
> StatisticsFooStorage StatStorage;
> // ...
> if (something)
>   ++StatStorage.NumStatisticOne;
> //
> RemarksEmitter Emitter;
> StatStorage.emit(Emitter);
> ```
> 
> I think writing direct tblgen emitter for this might be easier than using the OPT_REMARK macros (although that would probably be doable as well).
@thegameg The enum ID would be used in the optimization pass to emit the remark and somewhere else (possibly the diagnostic handler) to look up the string for the remark. The reason I was generating the enum explicitly is that I wanted to establish the base ID for each group based on other information in the .td file, but I guess that could be accomplished by moving the starting ID to the header file in the way that you suggest.

@a.elovikov I'm not sure I understand what you want to accomplish with the statistics. What does this accomplish that you can't do with existing LLVM statistics other than enabling the information to be reported in a release build?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72425/new/

https://reviews.llvm.org/D72425