[PATCH] D105167: [analyzer] Fix HTML report deduplication.

Artem Dergachev via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Aug 26 09:59:43 PDT 2021


NoQ added a comment.

In D105167#2949521 <https://reviews.llvm.org/D105167#2949521>, @ASDenysPetrov wrote:

> Nice work! Unfortunately I'm not able to run tests on my Windows env, but I've run you tests files manually. It works for me.
>
> P.S. BTW, is there any workarounds to make current tests supported on Windows? I know there is //REQUIRES// instruction (https://llvm.org/docs/TestingGuide.html#constraining-test-execution) but I didn't find any sufficient description of it.

Yes, you can always remove the `REQUIRES:` directive. The problem with generally enabling these tests on Windows is that scan-build is written in Perl and the Perl interpreter isn't shipped with Windows by default. Nothing else in LLVM is written in Perl so requiring a Perl installation to run LLVM tests just for this single use case is counter-productive.



================
Comment at: clang/lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp:286-287
+  // but the stable report filename is still more verbose.
+  // We should rename the option ("verbose" filename?) but it will break
+  // people's workflows.
+  if (DiagOpts.ShouldWriteStableReportFilename) {
----------------
ASDenysPetrov wrote:
> vsavchenko wrote:
> > Can we make a mirror for this option and mark the other one as deprecated?
> Nice idea. I'm in favor of a mirorring.
The new option is `-analyzer-config verbose-report-filename=true` and the old option is preserved and acts exactly the same; the help text says that it's deprecated.


================
Comment at: clang/lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp:300
+
+  filename << StringRef(getIssueHash(D, PP)).substr(0, 6).str() << ".html";
+  llvm::sys::path::append(ResultPath, Directory, filename.str());
----------------
ASDenysPetrov wrote:
> Do you think 6 trimmed characters are enough to avoid collisions?
It's basically same as before. Previously these 6 digits were completely random, now they're chunks of MD5 hashes of the real thing which is probably as random.

That's 16.7 million variants. The probability of collision on 6 digits with 100 warnings is around 0.0001%, with 1000 warnings it goes up to 3% ([[ https://en.wikipedia.org/wiki/Birthday_problem | Birthday Problem]]). Even though 1000 warnings are a real possibility on a huge project (eg., LLVM or WebKit), this is way above the point where you care about collisions.

I think there's nothing that prevents us from bumping it (at least in the non-verbose filename mode; in the verbose filename mode it may cause slightly more breakages in the case where it's close to hitting the file system's filename length limit). Bumping from 6 digits to 8 digits would reduce the collision probability on 1000 warnings to 0.0001%. I'll think about it a bit more :)


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105167/new/

https://reviews.llvm.org/D105167



More information about the cfe-commits mailing list