[PATCH] D24587: [RFC] Output optimization remarks in YAML

Fri Sep 23 10:45:50 PDT 2016

hfinkel added a comment.

In https://reviews.llvm.org/D24587#550359, @anemet wrote:

> Thanks for taking a look!
>
> In https://reviews.llvm.org/D24587#550270, @hfinkel wrote:
>
> > Thanks for working on this! I think this direction makes sense. I think that we'd like the remark classes to be serializable as YAML, but we probably don't want the file writing code to be in the backend. The diagnostic classes get passed to the handler, which is often in the frontend, and we should probably let the handler decide what to do with them (e.g. where/if to write them to a file).
>
>
> Then maybe I misunderstood what was meant by "doing this in the backend" in earlier conversations.  I thought the idea was (which made sense to me) that all the backend user (e.g. clang, lto, opt) does it to instruct the backend to write the diagnostics into a file by invoking Context.setDiagnosticsOutputFile.  I agree that this only gives binary control to the backend users but I am not sure we want more control at this time.  Do you have something in mind?
>
> It's also has the benefit that we don't need to duplicate this across the backend users.

I agree; this is good. The yaml::Output object can be a memory-mapped buffer. This implementation is fairly generic. The diagnostic handler does not get to filter what to serialize, true, but perhaps that's okay.

> 

> 

> > > - string:  will not be inlined into

> 

> > 

> 

> > 

> 

> > We don't need to fix this all at once, but we need to think carefully about how remarks are identified. I think we want it to be very clear how each diagnostic is identified, and we don't want strings that compose the human-readable summary message to subtly also form an interface contract with external tools. So the string here is fine, but we'll want each diagnostic to have some name that is designed to be machine consumed, and probably key-value pairs which likewise have machine-consumable names.

> 

> 

> OK, so sounds like we want the interface to enforce a single string per message policy.  So probably a format like this would work better:

> 

>   ORE.emit(DiagnosticInfoOptimizationRemarkMissed(DEBUG_TYPE, &I)

>            << "% will not be inlined into %"

>            << Callee << CS.getCaller());

>    

> 

> What do you think?

I'd much rather we be forced to actually name these things. I don't want to have to change the tools that interpret the YAML files every time we update the messages to make them more helpful. How about this:

  ORE.emit(DiagnosticInfoOptimizationRemarkMissed(DEBUG_TYPE, &I, "InlinerNotInlined")
           << NV("Callee", Callee) << " will not be inlined into "
           << NV("Caller", CS.getCaller()));

Where NV is supposed to stand for "named value", and the idea being that the name is used as the key for the YAML output.

================
Comment at: lib/Analysis/OptimizationDiagnosticInfo.cpp:131
@@ +130,3 @@
+  else
+    F->getContext().diagnose(OptDiag);
+}
----------------
Why don't we call this regardless of whether we're saving the output? We might not want to change which messages the user sees while compiling just because we're also recording the output (I don't think that we do).

https://reviews.llvm.org/D24587