[PATCH] D146654: [clang] replaces numeric SARIF ids with heirarchical names

Vlad Serebrennikov via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Apr 28 11:40:28 PDT 2023


Endill added inline comments.


================
Comment at: clang/lib/Frontend/SARIFDiagnostic.cpp:51-52
+      Diag->getDiags()->getDiagnosticIDs()->getStableName(Diag->getID()).str();
+  std::replace(StableName.begin(), StableName.end(), '_', '.');
+  SarifRule Rule = SarifRule::create().setRuleId(StableName);
 
----------------
vaibhav.y wrote:
> cjdb wrote:
> > denik wrote:
> > > ยง3.5.4 says that the hierarchical strings are separated by "/".
> > > 
> > > But more generally, are diagnostic names really fall under "hierarchical" definition?
> > > Words between underscores should be treated as components. And $3.27.5 says that the leading components have to be the identifier of the rule.
> > > In some cases they look like valid components, e.g. `err_access`, `err_access_dtor`, `err_access_dtor_exception`.
> > > But in cases like `err_cannot_open_file` neither of the leading components exists.
> > > 
> > > Theoretically we could use groups as the leading component for warnings for example. For errors the leading components are probably not even necessary, since if I understood correctly they are needed to suppress subsets of violations on the SARIF consumer side.
> > > Or we just could keep the names with underscores as is. WDYT?
> > I think in light of what you've said, changing back to underscores is probably best.
> > But more generally, are diagnostic names really fall under "hierarchical" definition?
> 
> I have the same concern, but this is okay for a first pass as a "flat hierarchy" :)
> 
> If we want a deeper structure, we'll need some extra metadata in `DiagnosticSemaKinds.td`'s `Error<...>`  to add cluster names as a follow up to this.
> 
> WDYT about something like `clang/visibility/err_access` or `clang/syntax/err_stmtexpr_file_scope`.
> 
> Alternatively: we could draw from the c++ standard structure: https://eel.is/c++draft/basic.def.odr#term.odr.use and say the error code for an ODR violation could be `clang/basic/def/odr`, again I'm unsure how well this meshes with clang's diagnostic model.
> Alternatively: we could draw from the c++ standard structure: https://eel.is/c++draft/basic.def.odr#term.odr.use and say the error code for an ODR violation could be clang/basic/def/odr, again I'm unsure how well this meshes with clang's diagnostic model.

The only reliable thing there are stable clause names like `[namespace.udecl]`, because there are precedents of them being rearranged in the table of contents ([[ https://github.com/cplusplus/draft/commit/982a456f176ca00409c6e514af932051dce2485f | 1 ]], [[ https://github.com/cplusplus/draft/commit/3c580cd204fde95a21de1830ace75d14d429f845 | 2 ]]). We've got bitten by that in C++ conformance tests, which use table of contents for directory hierarchy.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146654/new/

https://reviews.llvm.org/D146654



More information about the cfe-commits mailing list