[PATCH] D115907: [misexpect] Re-implement MisExpect Diagnostics

Tue Mar 8 17:15:57 PST 2022

tejohnson added inline comments.
Herald added a project: All.

================
Comment at: clang/include/clang/Basic/CodeGenOptions.def:172
 CODEGENOPT(NoWarn            , 1, 0) ///< Set when -Wa,--no-warn is enabled.
+CODEGENOPT(MisExpect         , 1, 0) ///< Set -Wmisexpect is enabled
 CODEGENOPT(EnableSegmentedStacks , 1, 0) ///< Set when -fsplit-stack is enabled.
----------------
"Set when ..."

================
Comment at: clang/lib/CodeGen/CodeGenAction.cpp:1203

+  if (CodeGenOpts.MisExpect) {
+    Ctx.setMisExpectWarningRequested(true);
----------------
Out of curiosity, since I am less familiar with clang than llvm, when is this path taken vs where this is done at line 336 in HandleTranslationUnit?

Also, it might be nice to have this code placed either consistently before or after the OptRecordFile handling in these two locations, so it is easier to compare.

================
Comment at: llvm/docs/MisExpect.rst:2
+===================
+Misexpect
+===================
----------------
I don't see a mention of -Wmisexpect in the document. Should there be user-facing clang documentation, this seems more specific to the LLVM implementation?

================
Comment at: llvm/docs/MisExpect.rst:22
+The MisExpect checks in the LLVM backend follow a simple procedure: if the
+profiling counter associated with an instruction using the ``llvm.expect``
+intrinsic was too low along the expected path, then it emits a diagnostic
----------------
Suggest using "profile weight" not profiling counter since we don't really have profile counters associated with instructions (rather with edges in the MST, at least in the IR PGO).

Also, doesn't the profile weight being too low only handle the LIKELY case? What about something being marked UNLIKELY with a hot/high profile weight? edit: After looking through the implementation, I think the comparison only being done in this direction is based on the way it is implemented, but it is unclear from this documentation here how the comparison is handling things being off in either direction.

================
Comment at: llvm/docs/MisExpect.rst:27
+The most natural place to perform the verification is just prior to when
+branch weights being assigned to the target instruction in the form of
+branch weight metadata.
----------------
grammatical issue "prior to when ... being assigned" - suggest "are assigned".

================
Comment at: llvm/docs/MisExpect.rst:39
+``-unlikely-branch-weight`` LLVM options. During verification, if the
+profile count is less than the calculated threshold, then we will emit a
+remark or warning detailing a potential performance regression. The
----------------
s/count/weight/

Also, similar to my earlier comment, what about an expect UNLIKELY branch with a high profile weight?

================
Comment at: llvm/docs/MisExpect.rst:48
+
+.. option:: -pass-remarks=misexpect
+
----------------
As noted in my comment at the top, it would be good to have user-facing clang documentation, in which case the clang version of this option is -Rpass=misexpect. 

================
Comment at: llvm/docs/MisExpect.rst:54
+
+.. option:: -pgo-warn-misexpect
+
----------------
Looking at the code, the -pgo-warn-misexpect seems to be useful for internal LLVM testing via 'opt', to mimic -Wmisexpect, so it probably should be documented as such.

================
Comment at: llvm/include/llvm/Transforms/Utils/MisExpect.h:39
+
+/// checkBackendInstrumentation - compares PGO counters to the thresholds used
+/// for llvm.expect and warns if the PGO counters are outside of the expected
----------------
Update function name.

================
Comment at: llvm/lib/Transforms/Scalar/LowerExpectIntrinsic.cpp:109

+  SI.setCondition(ArgValue);
+
----------------
Is there a reason why this line has changed ordering w.r.t. setMetadata?

================
Comment at: llvm/lib/Transforms/Utils/MisExpect.cpp:157
+  const uint64_t UnlikelyBranchWeight = ExpectedWeights[MinId];
+  const uint64_t ProfileCount = RealWeights[Index];
+  const uint64_t CaseTotal =
----------------
This is the only use of Index - can you just use MaxId directly? I guess the assumption is that the ExpectedWeights array has the same ordering as the RealWeights array, so we can check if the corresponding real weight from profile is colder than the most likely weight from the expect. Some comments to describe how the comparison is being done in this code more intuitively would help make it easier to understand.

================
Comment at: llvm/lib/Transforms/Utils/MisExpect.cpp:166
+
+  const llvm::BranchProbability LikelyThreshold(LikelyBranchWeight,
+                                                TotalBranchWeight);
----------------
Why is this called a "Threshold" and not just a "Probability"?

================
Comment at: llvm/lib/Transforms/Utils/MisExpect.cpp:170
+
+  if (ProfileCount < ScaledThreshold)
+    emitMisexpectDiagnostic(I, Ctx, ProfileCount, CaseTotal);
----------------
Is this too strict - i.e. what if they are off by only a small amount?

================
Comment at: llvm/lib/Transforms/Utils/MisExpect.cpp:198
+
+void checkFrontendInstrumentation(Instruction &I,
+                                  ArrayRef<uint32_t> ExpectedWeights) {
----------------
A lot of the code in this and the backend case above are similar - could you refactor the common handling into a helper?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115907/new/

https://reviews.llvm.org/D115907