[PATCH] D52390: [analyzer] StackSizeChecker

Mon Sep 24 13:26:55 PDT 2018

Szelethus added a comment.

> Thanks for all your detailed and helpful input, I will make sure to go over all the comments and answer them, but it will take some time.

Cheers! I can't emphasize enough however that I might be wrong on what I've said, or say in this comment.

> It was my introduction to the clang tool-chain

I always struggled getting enough literature about the static analyzer, if you didn't come across these works already, it might be worth taking a look :)

1. http://lcs.ios.ac.cn/~xuzb/canalyze/memmodel.pdf
2. https://github.com/llvm-mirror/clang/blob/master/lib/StaticAnalyzer/README.txt
3. https://clang-analyzer.llvm.org/checker_dev_manual.html
4. https://github.com/llvm-mirror/clang/blob/master/docs/analyzer/RegionStore.txt
5. https://github.com/haoNoQ/clang-analyzer-guide/releases/download/v0.1/clang-analyzer-guide-v0.1.pdf <--- by far the best guide for the static analyzer at the moment.
6. https://github.com/llvm-mirror/clang/blob/master/docs/analyzer/IPA.txt

> A bit more background information on this checker and how it came to be might help you and others to understand some of the choices I made, and also address some of your general questions and worries.
>  I was hesitant about putting too much in the summary but it seems I should have added more.

Summaries are one thing, that most likely won't be (and shouldn't have to be) read by a future maintainer, the code should speak for itself.

> This was a university assignment and I was encouraged to put it up here. This code has seen quite a few iterations and criticism.

It's great that you put it up here! New checkers (which from what I've seen are mostly written by beginners) tend to go through in some cases a many month long review process, so I guess be prepared for some back and forth, but I personally really enjoyed the process, I hope you will too.

I have read the rest of your comment, thanks for the details! I can't say that I understand every aspect of your algorithm, but I have a couple question for the grand picture (but feel free to correct me if I misunderstood anything):

I can see that you use `check::PreCall, check::PostCall, check::EndFunction`, and you also modifiy the program state to have a fairly good idea about what execution path did the analyzer take, but a function's stack usage it calculated very roughly. Let's imagine that in the next example, `f` is only called after heavy stack usage:

  // Let's just pretend that this function actually
  // has a meaningful implementation that the analyzer
  // knows will return false in this case.
  bool hasStackEnoughSpace() { return false; }

  void f() {
    if (hasStackEnoughSpace())
      // use the stack like an absolute madman
    else
      chill();
  }

This is silly, but what it wants to demonstrate, that you could report a false positive here, despite the analyzer potentially knowing that that path will  never be taken. To me it seems like you don't utilize the actual `ProgramState`, but I think you should.

To me it also seems like that you pretty much reinvent the compiler in this patch, by modeling almost everything the compiler does by itself. I'm sadly nowhere near an expert on this topic, but it begs the question whether there's an already existing solution to this. Let's wait for others to weigh in on this, maybe I'm wrong and this is what has to be done.

A couple tips for easier development and review process:

- I think before deep diving into the fixes, you really should split up this patch at least into an empty callback (essentially an announcement), and add each feature as a separate patch. Ideally, each time you implement a new feature, such as handling branches, you should make a neat patch with a small implementation and related test cases. As a beginner (and I suffered from this myself) this is exceptionally hard to do, because you can often find yourself deleting everything and starting all over, but I've grown to appreciate this principle as I often saved myself a whole lot of effort due to some feedback. However, making a rough proof of concept is in this case IMO a good idea, because it's hard to see at the start where the code will end up, but later it should be split up.
- Comment everything. I mean, `bool isSuccessful() const { return IsSuccessful; }` and similar functions speak for themselves, but ideally every non-trivial function and structure should be documented, as well as any non-trivial hackery inside a function.

Repository:
  rC Clang

https://reviews.llvm.org/D52390