[all-commits] [llvm/llvm-project] 9873ef: [analyzer] Ignore flex generated files

Balazs Benics via All-commits all-commits at lists.llvm.org
Mon Dec 6 01:20:50 PST 2021


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 9873ef409c4a937c566e20e7a88af049d212ce03
      https://github.com/llvm/llvm-project/commit/9873ef409c4a937c566e20e7a88af049d212ce03
  Author: Balazs Benics <balazs.benics at sigmatechnology.se>
  Date:   2021-12-06 (Mon, 06 Dec 2021)

  Changed paths:
    M clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
    M clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
    M clang/test/Analysis/analyzer-config.c
    A clang/test/Analysis/flexignore.c
    M clang/test/Analysis/yaccignore.c

  Log Message:
  -----------
  [analyzer] Ignore flex generated files

Some projects [1,2,3] have flex-generated files besides bison-generated
ones.
Unfortunately, the comment `"/* A lexical scanner generated by flex */"`
generated by the tools is not necessarily at the beginning of the file,
thus we need to quickly skim through the file for this needle string.

Luckily, StringRef can do this operation in an efficient way.

That being said, now the bison comment is not required to be at the very
beginning of the file. This allows us to detect a couple more cases
[4,5,6].

Alternatively, we could say that we only allow whitespace characters
before matching the bison/flex header comment. That would prevent the
(probably) unnecessary string search in the buffer. However, I could not
verify that these tools would actually respect this assumption.

Additionally to this, e.g. the Twin project [1] has other non-whitespace
characters (some preprocessor directives) before the flex-generated
header comment. So the heuristic in the previous paragraph won't work
with that.
Thus, I would advocate the current implementation.

According to my measurement, this patch won't introduce measurable
performance degradation, even though we will do 2 linear scans.

I introduce the ignore-bison-generated-files and
ignore-flex-generated-files to disable skipping these files.
Both of these options are true by default.

[1]: https://github.com/cosmos72/twin/blob/master/server/rcparse_lex.cpp#L7
[2]: https://github.com/marcauresoar/make-examples/blob/22362cdcf9dd7c597b5049ce7f176621e2e9ac7a/sandbox/count-words/lexer.c#L6
[3]: https://github.com/vladcmanea/2nd-faculty-year-Formal-Languages---Automata-assignments/blob/11abdf64629d9eb741438ba69f04636769d5a374/lab1/lex.yy.c#L6

[4]: https://github.com/KritikaChoudhary/System-Software-Lab/blob/47f5b2cfe2a2738fd54eae9f8439817f6a22034e/B_yacc/1/y1.tab.h#L2
[5]: https://github.com/VirtualMonitor/VirtualMonitor/blob/71d1bf9b1e7b392a7bd0c73dc217138dc5865651/src/VBox/Additions/x11/x11include/xorg-server-1.8.0/parser.h#L2
[6]: https://github.com/bspaulding/DrawTest/blob/3f773ceb13de14275429036b9cbc5aa19e29bab9/Framework/OpenEars.framework/Versions/A/Headers/jsgf_parser.h#L2

Reviewed By: xazax.hun

Differential Revision: https://reviews.llvm.org/D114510




More information about the All-commits mailing list